Initial commit: YouTube Summarizer project setup

- Created project structure following new standardized layout
- Added FastAPI-based main application
- Configured requirements with YouTube and AI integrations
- Added comprehensive README documentation
- Set up environment configuration template
This commit is contained in:
enias 2025-08-24 22:15:38 -04:00
commit 1bd82158fc
5 changed files with 416 additions and 0 deletions

40
.env.example Normal file
View File

@ -0,0 +1,40 @@
# YouTube Summarizer Configuration
# Server Configuration
APP_HOST=0.0.0.0
APP_PORT=8082
DEBUG=False
# YouTube API (optional but recommended for metadata)
YOUTUBE_API_KEY=your_youtube_api_key_here
# AI Service Configuration (choose one or multiple)
# OpenAI
OPENAI_API_KEY=your_openai_api_key_here
OPENAI_MODEL=gpt-4o-mini
# Anthropic Claude
ANTHROPIC_API_KEY=your_anthropic_api_key_here
ANTHROPIC_MODEL=claude-3-haiku-20240307
# DeepSeek (cost-effective option)
DEEPSEEK_API_KEY=your_deepseek_api_key_here
DEEPSEEK_MODEL=deepseek-chat
# Database
DATABASE_URL=sqlite:///./data/youtube_summarizer.db
# Session Configuration
SECRET_KEY=your-secret-key-here-change-in-production
# Rate Limiting
RATE_LIMIT_PER_MINUTE=30
MAX_VIDEO_LENGTH_MINUTES=180
# Cache Configuration
ENABLE_CACHE=True
CACHE_TTL_HOURS=24
# Logging
LOG_LEVEL=INFO
LOG_FILE=logs/app.log

53
.gitignore vendored Normal file
View File

@ -0,0 +1,53 @@
# Python
__pycache__/
*.py[cod]
*$py.class
*.so
.Python
env/
venv/
ENV/
.venv
# IDE
.vscode/
.idea/
*.swp
*.swo
*~
.DS_Store
# Environment
.env
.env.local
.env.*.local
# Testing
.coverage
htmlcov/
.pytest_cache/
.tox/
# Build
build/
dist/
*.egg-info/
.eggs/
# Logs
*.log
logs/
# Database
*.db
*.sqlite
*.sqlite3
# Media/Data
data/
downloads/
cache/
# Task Master
.taskmaster/reports/
.taskmaster/backups/

173
README.md Normal file
View File

@ -0,0 +1,173 @@
# YouTube Summarizer Web Application
An AI-powered web application that automatically extracts, transcribes, and summarizes YouTube videos, providing intelligent insights and key takeaways.
## 🎯 Features
- **Video Transcript Extraction**: Automatically fetch transcripts from YouTube videos
- **AI-Powered Summarization**: Generate concise summaries using multiple AI models
- **Multi-Model Support**: Choose between OpenAI GPT, Anthropic Claude, or DeepSeek
- **Key Points Extraction**: Identify and highlight main topics and insights
- **Chapter Generation**: Automatically create timestamped chapters
- **Export Options**: Save summaries as Markdown, PDF, or plain text
- **Caching System**: Reduce API calls with intelligent caching
- **Rate Limiting**: Built-in protection against API overuse
## 🏗️ Architecture
```
[Web Interface] → [FastAPI Backend] → [YouTube API/Transcript API]
[AI Service] ← [Summary Generation] ← [Transcript Processing]
[Database Cache] → [Summary Storage] → [Export Service]
```
## 🚀 Quick Start
### Prerequisites
- Python 3.11+
- YouTube API Key (optional but recommended)
- At least one AI service API key (OpenAI, Anthropic, or DeepSeek)
### Installation
1. **Clone the repository**
```bash
git clone https://eniasgit.zeabur.app/demo/youtube-summarizer.git
cd youtube-summarizer
```
2. **Set up virtual environment**
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. **Install dependencies**
```bash
pip install -r requirements.txt
```
4. **Configure environment**
```bash
cp .env.example .env
# Edit .env with your API keys and configuration
```
5. **Initialize database**
```bash
alembic init alembic
alembic revision --autogenerate -m "Initial migration"
alembic upgrade head
```
6. **Run the application**
```bash
python src/main.py
```
The application will be available at `http://localhost:8082`
## 📁 Project Structure
```
youtube-summarizer/
├── src/
│ ├── api/ # API endpoints
│ │ ├── routes.py # Main API routes
│ │ └── websocket.py # Real-time updates
│ ├── services/ # Business logic
│ │ ├── youtube.py # YouTube integration
│ │ ├── summarizer.py # AI summarization
│ │ └── cache.py # Caching service
│ ├── utils/ # Utility functions
│ │ ├── validators.py # Input validation
│ │ └── formatters.py # Output formatting
│ └── main.py # Application entry point
├── tests/ # Test suite
├── docs/ # Documentation
├── alembic/ # Database migrations
├── static/ # Frontend assets
├── templates/ # HTML templates
├── requirements.txt # Python dependencies
├── .env.example # Environment template
└── README.md # This file
```
## 🔧 Configuration
### Essential Environment Variables
| Variable | Description | Required |
|----------|-------------|----------|
| `YOUTUBE_API_KEY` | YouTube Data API v3 key | Optional* |
| `OPENAI_API_KEY` | OpenAI API key | One of these |
| `ANTHROPIC_API_KEY` | Anthropic Claude API key | is required |
| `DEEPSEEK_API_KEY` | DeepSeek API key | for AI |
| `DATABASE_URL` | Database connection string | Yes |
| `SECRET_KEY` | Session secret key | Yes |
*YouTube API key improves metadata fetching but transcript extraction works without it.
## 🧪 Testing
Run the test suite:
```bash
pytest tests/ -v
pytest tests/ --cov=src --cov-report=html # With coverage
```
## 📝 API Documentation
Once running, visit:
- Interactive API docs: `http://localhost:8082/docs`
- Alternative docs: `http://localhost:8082/redoc`
### Key Endpoints
- `POST /api/summarize` - Submit a YouTube URL for summarization
- `GET /api/summary/{id}` - Retrieve a summary
- `GET /api/summaries` - List all summaries
- `POST /api/export/{id}` - Export summary in different formats
## 🚢 Deployment
### Docker
```bash
docker build -t youtube-summarizer .
docker run -p 8082:8082 --env-file .env youtube-summarizer
```
### Production Considerations
1. Use PostgreSQL instead of SQLite for production
2. Configure proper CORS settings
3. Set up SSL/TLS certificates
4. Implement user authentication
5. Configure rate limiting per user
6. Set up monitoring and logging
## 🤝 Contributing
1. Fork the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Create a Pull Request
## 📄 License
This project is part of the Personal AI Assistant ecosystem.
## 🔗 Related Projects
- [Personal AI Assistant](https://eniasgit.zeabur.app/demo/my-ai-projects)
- [YouTube Automation Service](https://eniasgit.zeabur.app/demo/youtube-automation)
- [PDF Translator](https://eniasgit.zeabur.app/demo/pdf-translator)
## 📞 Support
For issues and questions, please create an issue in the repository.

44
requirements.txt Normal file
View File

@ -0,0 +1,44 @@
# YouTube Summarizer Web Application Requirements
# Web framework
fastapi==0.104.1
uvicorn[standard]==0.24.0
python-multipart==0.0.6
# YouTube integration
youtube-transcript-api==0.6.1
pytube==15.0.0
yt-dlp==2024.8.6
# AI and NLP
openai==1.35.0
anthropic==0.28.0
langchain==0.2.5
tiktoken==0.7.0
# Database
sqlalchemy==2.0.23
alembic==1.13.0
# HTTP client
httpx==0.25.2
requests==2.31.0
# Environment and configuration
python-dotenv==1.0.0
pydantic==2.5.2
pydantic-settings==2.1.0
# Utilities
python-dateutil==2.8.2
pytz==2024.1
# Testing
pytest==7.4.3
pytest-asyncio==0.21.1
pytest-cov==4.1.0
# Development
black==23.12.0
ruff==0.1.8
mypy==1.7.1

106
src/main.py Normal file
View File

@ -0,0 +1,106 @@
#!/usr/bin/env python3
"""
YouTube Summarizer Web Application
Main entry point for the FastAPI application
"""
import os
import sys
from pathlib import Path
from contextlib import asynccontextmanager
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware
from fastapi.staticfiles import StaticFiles
from dotenv import load_dotenv
# Add project root to path
sys.path.insert(0, str(Path(__file__).parent.parent))
# Load environment variables
load_dotenv()
# Import routers (to be created)
# from api.routes import api_router
# from api.websocket import websocket_router
@asynccontextmanager
async def lifespan(app: FastAPI):
"""
Manage application lifecycle
"""
# Startup
print("🚀 Starting YouTube Summarizer Application...")
# Initialize services here
# await initialize_database()
# await initialize_cache()
yield
# Shutdown
print("🛑 Shutting down YouTube Summarizer Application...")
# await cleanup_resources()
# Create FastAPI application
app = FastAPI(
title="YouTube Summarizer",
description="AI-powered YouTube video summarization service",
version="1.0.0",
lifespan=lifespan
)
# Configure CORS
app.add_middleware(
CORSMiddleware,
allow_origins=["*"], # Configure appropriately for production
allow_credentials=True,
allow_methods=["*"],
allow_headers=["*"],
)
# Mount static files (if needed)
# app.mount("/static", StaticFiles(directory="static"), name="static")
# Include routers
# app.include_router(api_router, prefix="/api")
# app.include_router(websocket_router)
@app.get("/")
async def root():
"""
Root endpoint
"""
return {
"message": "YouTube Summarizer API",
"version": "1.0.0",
"docs": "/docs",
"health": "/health"
}
@app.get("/health")
async def health_check():
"""
Health check endpoint
"""
return {
"status": "healthy",
"service": "youtube-summarizer"
}
if __name__ == "__main__":
import uvicorn
host = os.getenv("APP_HOST", "0.0.0.0")
port = int(os.getenv("APP_PORT", 8082))
debug = os.getenv("DEBUG", "False").lower() == "true"
print(f"🎬 YouTube Summarizer starting on http://{host}:{port}")
uvicorn.run(
"main:app",
host=host,
port=port,
reload=debug,
log_level="debug" if debug else "info"
)