# YouTube Summarizer API & Web Application
A comprehensive AI-powered API ecosystem and web application that automatically extracts, transcribes, and summarizes YouTube videos. Features enterprise-grade developer tools, SDKs, agent framework integrations, and autonomous operations.
## 🚀 What's New: Advanced API Ecosystem
### Developer API Platform
- **🔌 MCP Server**: Model Context Protocol integration for AI development tools
- **📦 Native SDKs**: Python and JavaScript/TypeScript SDKs with full async support
- **🤖 Agent Frameworks**: LangChain, CrewAI, and AutoGen integrations
- **🔄 Webhooks**: Real-time event notifications with HMAC authentication
- **🤖 Autonomous Operations**: Self-managing system with intelligent automation
- **🔑 API Authentication**: Enterprise-grade API key management and rate limiting
- **📊 OpenAPI 3.0**: Comprehensive API documentation and client generation
## 🎯 Features
### Core Features
- **Dual Transcript Options** ✅ **UPGRADED**: Choose between YouTube captions, AI Whisper transcription, or compare both
  - **YouTube Captions**: Fast extraction (~3s) with standard quality
  - **Faster-Whisper AI** ⚡ **NEW**: **20-32x speed improvement** with large-v3-turbo model
    - **Performance**: 2.3x faster than realtime processing (3.6 min video in 94 seconds)
    - **Quality**: Perfect transcription accuracy (1.000 quality score, 0.962 confidence)
    - **Technology**: CTranslate2 optimization engine with GPU acceleration
    - **Intelligence**: Voice Activity Detection, int8 quantization, native MP3 support
  - **Smart Comparison**: Side-by-side analysis with quality metrics and recommendations
  - **Processing Time Estimates**: Real-time speed ratios and performance metrics
  - **Quality Scoring**: Advanced confidence levels and improvement analysis
- **Video Transcript Extraction**: Automatically fetch transcripts from YouTube videos
- **AI-Powered Summarization**: Generate concise summaries using multiple AI models
- **Multi-Model Support**: Choose between OpenAI GPT, Anthropic Claude, or DeepSeek
- **Key Points Extraction**: Identify and highlight main topics and insights
- **Chapter Generation**: Automatically create timestamped chapters
- **Export Options**: Save summaries as Markdown, PDF, HTML, JSON, or plain text ✅
- **Template System**: Customizable export templates with Jinja2 support ✅
- **Bulk Export**: Export multiple summaries as organized ZIP archives ✅
- **Caching System**: Reduce API calls with intelligent caching
- **Rate Limiting**: Built-in protection against API overuse
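
The realtime figure quoted in the Faster-Whisper bullet follows directly from the two durations: a 3.6-minute video is 216 seconds of audio, and 216 / 94 is roughly 2.3. A minimal sketch of that arithmetic (the function name `realtime_factor` is illustrative and not part of the project):

```python
def realtime_factor(audio_seconds: float, processing_seconds: float) -> float:
    """Return how many seconds of audio are processed per second of wall time."""
    if processing_seconds <= 0:
        raise ValueError("processing_seconds must be positive")
    return audio_seconds / processing_seconds


# 3.6 minutes of audio transcribed in 94 seconds
factor = realtime_factor(3.6 * 60, 94)
print(f"{factor:.1f}x realtime")  # prints "2.3x realtime"
```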
### Authentication & Security ✅
- **Flexible Authentication**: Configurable auth system for development and production
  - **Development Mode**: No authentication required by default - perfect for testing
  - **Production Mode**: Automatic JWT-based authentication with user sessions
  - **Environment Controls**: `VITE_FORCE_AUTH_MODE`, `VITE_AUTH_DISABLED` for fine control
- **User Registration & Login**: Secure email/password authentication with JWT tokens
- **Email Verification**: Required email verification for new accounts
- **Password Reset**: Secure password recovery via email
- **Session Management**: JWT access tokens with refresh token rotation
- **Protected Routes**: User-specific summaries and history (when auth enabled)
- **API Key Management**: Generate and manage personal API keys
- **Security Features**: bcrypt password hashing, token expiration, CORS protection
### Summary Management & History ✅
- **Persistent Job History**: Comprehensive history system that discovers all processed jobs from storage
  - **High-Density Views**: See 12+ jobs in grid view, 15+ jobs in list view
  - **Smart Discovery**: Automatically indexes existing files from `video_storage/` directories
  - **Rich Metadata**: File status, processing times, word counts, storage usage
  - **Enhanced Detail Modal**: Tabbed interface with transcript viewer, files, and metadata
  - **Search & Filtering**: Real-time search with status, date, and tag filtering
- **History Tracking**: View all your processed summaries with search and filtering
- **Favorites**: Star important summaries for quick access
- **Tags & Notes**: Organize summaries with custom tags and personal notes
- **Sharing**: Generate shareable links for public summaries
- **Bulk Operations**: Select and manage multiple summaries at once
### Batch Processing ✅
- **Multiple URL Processing**: Process up to 100 YouTube videos in a single batch
- **File Upload Support**: Upload .txt or .csv files with YouTube URLs
- **Sequential Processing**: Smart queue management to control API costs
- **Real-time Progress**: WebSocket-powered live progress updates
- **Individual Item Tracking**: See status, errors, and processing time per video
- **Retry Failed Items**: Automatically retry videos that failed processing
- **Batch Export**: Download all summaries as an organized ZIP archive
- **Cost Tracking**: Monitor API usage costs in real-time ($0.0025/1k tokens)
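
As a rough illustration of the cost figure above, the sketch below multiplies a token count by the quoted rate of $0.0025 per 1,000 tokens. The function name and the per-video token average are hypothetical; how the application actually counts tokens is not described here.

```python
from decimal import Decimal

# Rate quoted above: $0.0025 per 1,000 tokens
PRICE_PER_1K_TOKENS = Decimal("0.0025")


def estimate_cost(total_tokens: int) -> Decimal:
    """Estimate the USD cost of processing `total_tokens` tokens."""
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return PRICE_PER_1K_TOKENS * total_tokens / 1000


# A full batch of 100 videos averaging 8,000 tokens each (hypothetical figures)
cost = estimate_cost(100 * 8_000)
print(f"${cost:.2f}")  # prints "$2.00"
```

`Decimal` is used instead of `float` so that small per-request amounts add up without binary rounding drift.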
### Real-time Updates ✅
- **WebSocket Progress Tracking**: Live updates for all processing stages
- **Granular Progress**: Detailed percentage and sub-task progress
- **Time Estimation**: Intelligent time remaining based on historical data
- **Connection Recovery**: Automatic reconnection with message queuing
- **Job Cancellation**: Cancel any processing job with immediate termination
- **Visual Progress UI**: Beautiful progress component with stage indicators
- **Heartbeat Monitoring**: Connection health checks and status indicators
- **Offline Recovery**: Queued updates delivered when reconnected
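
The connection-recovery behavior described above (updates queued while offline, then delivered on reconnect) can be sketched as a small buffer in front of the transport. This is an illustration of the pattern only; the class name, the `send` callback, and the queue size limit are assumptions rather than the project's WebSocket code.

```python
from collections import deque
from typing import Callable


class BufferedChannel:
    """Buffers messages while disconnected and flushes them in order on reconnect."""

    def __init__(self, send: Callable[[str], None], max_queued: int = 1000) -> None:
        self._send = send
        # With maxlen set, the oldest queued message is dropped once the buffer is full
        self._queue: deque[str] = deque(maxlen=max_queued)
        self.connected = False

    def publish(self, message: str) -> None:
        """Send immediately when connected, otherwise hold the message."""
        if self.connected:
            self._send(message)
        else:
            self._queue.append(message)

    def on_disconnect(self) -> None:
        self.connected = False

    def on_reconnect(self) -> None:
        """Mark the channel as live and deliver everything queued, oldest first."""
        self.connected = True
        while self._queue:
            self._send(self._queue.popleft())
```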
### Enhanced Export System (NEW) ✅
- **Professional Document Generation**: Business-grade markdown with AI intelligence
- **Executive Summaries**: C-suite ready summaries with ROI analysis and strategic insights
- **Timestamped Navigation**: Clickable `[HH:MM:SS]` YouTube links for easy video navigation
- **6 Domain-Specific Templates**: Optimized for Educational, Business, Technical, Content Creation, Research, and General content
- **AI-Powered Recommendations**: Intelligent content analysis suggests best template for your video
- **Custom Template Creation**: Build and manage your own AI prompt templates with A/B testing
- **Quality Scoring**: Automated quality assessment for generated exports
- **Template Analytics**: Usage statistics and performance metrics for template optimization
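
A clickable `[HH:MM:SS]` link of the kind mentioned above can be produced by formatting the offset and appending YouTube's `t=` start-time parameter. The sketch below is illustrative; the function name is hypothetical and the export templates may build their links differently.

```python
def timestamp_link(video_id: str, seconds: int) -> str:
    """Return a Markdown link whose label is HH:MM:SS and whose target starts playback there."""
    if seconds < 0:
        raise ValueError("seconds must be non-negative")
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    label = f"{hours:02d}:{minutes:02d}:{secs:02d}"
    return f"[{label}](https://www.youtube.com/watch?v={video_id}&t={seconds}s)"


print(timestamp_link("abc123", 3725))
# prints "[01:02:05](https://www.youtube.com/watch?v=abc123&t=3725s)"
```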
## 🏗️ Architecture
```
[Web Interface] → [Authentication Layer] → [FastAPI Backend]
↓ ↓
[User Management] ← [JWT Auth] → [Dual Transcript Service] ← [YouTube API]
↓ ↓ ↓
[AI Service] ← [Summary Generation] ← [YouTube Captions] | [Whisper AI]
↓ ↓ ↓
[Database] → [User Summaries] → [Quality Comparison] → [Export Service]
```
### Enhanced Transcript Extraction (v5.1) ✅
- **9-Tier Fallback Chain**: Guaranteed transcript extraction with multiple methods
  - YouTube Transcript API (primary)
  - Auto-generated captions
  - Whisper AI transcription
  - PyTubeFix, YT-DLP, Playwright fallbacks
  - External tools and web services
- **Audio Retention System**: Save audio files for re-transcription
  - MP3 format (192kbps) for storage efficiency
  - Metadata tracking (duration, quality, download date)
  - Re-transcription without re-downloading
- **Dual Transcript Architecture**:
  - **TranscriptSelector Component**: Choose between YouTube captions, Whisper AI, or both
  - **DualTranscriptService**: Orchestrates parallel extraction and quality comparison
  - **WhisperTranscriptService**: High-quality AI transcription with chunking support
  - **Quality Comparison Engine**: Analyzes differences and provides recommendations
  - **Real-time Progress**: WebSocket updates for long-running Whisper jobs
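
The fallback chain above amounts to trying extraction methods in priority order and stopping at the first one that returns a transcript. A minimal sketch of that pattern follows; the function name and the extractor callables are hypothetical stand-ins, not the project's actual services.

```python
from typing import Callable, Iterable, Optional

# An extractor takes a video ID and returns a transcript, or None if it has nothing
Extractor = Callable[[str], Optional[str]]


def extract_with_fallback(
    video_id: str, extractors: Iterable[tuple[str, Extractor]]
) -> tuple[str, str]:
    """Try each extractor in order; return (method_name, transcript) from the first success."""
    errors: list[str] = []
    for name, extractor in extractors:
        try:
            transcript = extractor(video_id)
        except Exception as exc:  # a failing tier must not stop the chain
            errors.append(f"{name}: {exc}")
            continue
        if transcript:
            return name, transcript
        errors.append(f"{name}: empty result")
    raise RuntimeError("all extractors failed: " + "; ".join(errors))
```

Returning the method name alongside the transcript lets the caller record which tier succeeded, which is useful when comparing transcript quality later.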
## 🚀 Quick Start
### Prerequisites
- Python 3.11+
- YouTube API Key (optional but recommended)
- At least one AI service API key (OpenAI, Anthropic, or DeepSeek)
### 🎯 Quick Testing (No Authentication Required)
**For immediate testing and development with our flexible authentication system:**
```bash
# Easy server management with restart scripts
./scripts/restart-backend.sh # Starts backend on port 8000
./scripts/restart-frontend.sh # Starts frontend on port 3002
./scripts/restart-both.sh # Starts both servers
# Visit main app (no login required by default)
open http://localhost:3002/
```
**Development Mode Features:**
- 🔓 **No authentication required** by default - perfect for development
- 🛡️ **Admin mode indicators** show you're in development mode
- 🔄 **Server restart scripts** handle backend changes seamlessly
- 🌐 **Full functionality** available without login barriers
**Production Authentication:**
```bash
# Enable authentication for production-like testing
VITE_FORCE_AUTH_MODE=true npm run dev
```
### Installation
1. **Clone the repository**
```bash
git clone https://eniasgit.zeabur.app/demo/youtube-summarizer.git
cd youtube-summarizer
```
2. **Set up virtual environment**
```bash
python3 -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```
3. **Install dependencies**
```bash
# Backend dependencies (run from the repository root)
cd backend
pip install -r requirements.txt
# Frontend dependencies (if applicable)
cd ../frontend
npm install
# Return to the repository root for the remaining steps
cd ..
```
4. **Configure environment**
```bash
cp .env.example .env
# Edit .env with your API keys and configuration
```
5. **Initialize database**
```bash
cd backend
python3 -m alembic upgrade head # Apply existing migrations
```
6. **Run the application**
```bash
# Recommended: Use restart scripts for easy development (run from the repository root)
./scripts/restart-backend.sh # Backend on http://localhost:8000
./scripts/restart-frontend.sh # Frontend on http://localhost:3002
# Or run manually, each in its own terminal starting from the repository root
cd backend && python3 main.py # Backend
cd frontend && npm run dev # Frontend
# Full stack restart after major changes
./scripts/restart-both.sh
```
## 📁 Project Structure
```
youtube-summarizer/
├── scripts/ # Development tools ✅ NEW
│ ├── restart-backend.sh # Backend restart script
│ ├── restart-frontend.sh # Frontend restart script
│ └── restart-both.sh # Full stack restart
├── logs/ # Server logs (auto-created)
├── backend/
│ ├── api/ # API endpoints
│ │ ├── auth.py # Authentication endpoints
│ │ ├── history.py # Job history API ✅ NEW
│ │ ├── pipeline.py # Pipeline management
│ │ ├── export.py # Export functionality
│ │ └── videos.py # Video operations
│ ├── services/ # Business logic
│ │ ├── job_history_service.py # History management ✅ NEW
│ │ ├── auth_service.py # JWT authentication
│ │ ├── email_service.py # Email notifications
│ │ ├── youtube_service.py # YouTube integration
│ │ └── ai_service.py # AI summarization
│ ├── models/ # Database models
│ │ ├── job_history.py # Job history models ✅ NEW
│ │ ├── user.py # User & auth models
│ │ ├── summary.py # Summary models
│ │ ├── batch_job.py # Batch processing models
│ │ └── video.py # Video models
│ ├── core/ # Core utilities
│ │ ├── config.py # Configuration
│ │ ├── database.py # Database setup
│ │ └── exceptions.py # Custom exceptions
│ ├── alembic/ # Database migrations
│ ├── tests/ # Test suite
│ │ ├── unit/ # Unit tests
│ │ └── integration/ # Integration tests
│ ├── main.py # Application entry point
│ └── requirements.txt # Python dependencies
├── frontend/ # React frontend
│ ├── src/ # Source code
│ │ ├── components/ # React components
│ │ │ ├── history/ # History components ✅ NEW
│ │ │ ├── auth/ # Auth components
│ │ │ └── forms/ # Form components
│ │ ├── pages/ # Page components
│ │ │ ├── MainPage.tsx # Unified main page ✅ NEW
│ │ │ ├── HistoryPage.tsx # Job history page ✅ NEW
│ │ │ └── auth/ # Auth pages
│ │ ├── config/ # Configuration ✅ NEW
│ │ │ └── app.config.ts # App & auth config ✅ NEW
│ │ ├── api/ # API clients
│ │ │ └── historyAPI.ts # History API client ✅ NEW
│ │ └── hooks/ # React hooks
│ ├── public/ # Static assets
│ ├── .env.example # Environment variables ✅ NEW
│ └── package.json # Node dependencies
├── docs/ # Documentation
│ ├── stories/ # BMad story files
│ └── architecture.md # System design
└── README.md # This file
```
## 🔧 Configuration
### Essential Environment Variables
| Variable | Description | Required |
|----------|-------------|----------|
| **Authentication** | | |
| `JWT_SECRET_KEY` | Secret key for JWT tokens | Production |
| `JWT_ALGORITHM` | JWT algorithm (default: HS256) | No |
| `ACCESS_TOKEN_EXPIRE_MINUTES` | Access token expiry (default: 15) | No |
| `REFRESH_TOKEN_EXPIRE_DAYS` | Refresh token expiry (default: 7) | No |
| **Frontend Authentication** **NEW** | | |
| `VITE_FORCE_AUTH_MODE` | Enable auth in development (`true`) | No |
| `VITE_AUTH_REQUIRED` | Force authentication requirement | No |
| `VITE_AUTH_DISABLED` | Disable auth even in production | No |
| `VITE_SHOW_AUTH_UI` | Show login/register buttons | No |
| **Email Service** | | |
| `SMTP_HOST` | SMTP server host | For production |
| `SMTP_PORT` | SMTP server port | For production |
| `SMTP_USER` | SMTP username | For production |
| `SMTP_PASSWORD` | SMTP password | For production |
| `SMTP_FROM_EMAIL` | Sender email address | For production |
| **AI Services** | | |
| `YOUTUBE_API_KEY` | YouTube Data API v3 key | Optional* |
| `OPENAI_API_KEY` | OpenAI API key | At least one AI key |
| `ANTHROPIC_API_KEY` | Anthropic Claude API key | At least one AI key |
| `DEEPSEEK_API_KEY` | DeepSeek API key | At least one AI key |
| **Database** | | |
| `DATABASE_URL` | Database connection string | Yes |
| **Application** | | |
| `SECRET_KEY` | Application secret key | Yes |
| `ENVIRONMENT` | dev/staging/production | Yes |
| `APP_NAME` | Application name (default: YouTube Summarizer) | No |
*YouTube API key improves metadata fetching but transcript extraction works without it.
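
The requirements in the table can be checked at startup before the server begins accepting requests. The following is a sketch under assumptions: the function name is hypothetical, and the rule that `JWT_SECRET_KEY` is mandatory only when `ENVIRONMENT` is `production` is one reading of the "Production" entry in the table.

```python
import os
from typing import Mapping

AI_KEYS = ("OPENAI_API_KEY", "ANTHROPIC_API_KEY", "DEEPSEEK_API_KEY")
ALWAYS_REQUIRED = ("DATABASE_URL", "SECRET_KEY", "ENVIRONMENT")


def missing_settings(env: Mapping[str, str]) -> list[str]:
    """Return human-readable problems; an empty list means the environment looks complete."""
    problems = [f"{name} is not set" for name in ALWAYS_REQUIRED if not env.get(name)]
    # Any one of the three AI provider keys is enough
    if not any(env.get(name) for name in AI_KEYS):
        problems.append("set at least one of " + ", ".join(AI_KEYS))
    if env.get("ENVIRONMENT") == "production" and not env.get("JWT_SECRET_KEY"):
        problems.append("JWT_SECRET_KEY is required in production")
    return problems


if __name__ == "__main__":
    for problem in missing_settings(os.environ):
        print(f"config error: {problem}")
```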
## 🧪 Testing
Run the test suite:
```bash
cd backend
# Run all tests
python3 -m pytest tests/ -v
# Run unit tests only
python3 -m pytest tests/unit/ -v
# Run integration tests
python3 -m pytest tests/integration/ -v
# With coverage report
python3 -m pytest tests/ --cov=backend --cov-report=html
```
## 📝 API Documentation
Once running, visit:
- Interactive API docs: `http://localhost:8000/docs`
- Alternative docs: `http://localhost:8000/redoc`
### Authentication Endpoints
- `POST /api/auth/register` - Register a new user
- `POST /api/auth/login` - Login and receive JWT tokens
- `POST /api/auth/refresh` - Refresh access token
- `POST /api/auth/logout` - Logout and revoke tokens
- `GET /api/auth/me` - Get current user info
- `POST /api/auth/verify-email` - Verify email address
- `POST /api/auth/reset-password` - Request password reset
- `POST /api/auth/reset-password/confirm` - Confirm password reset
### Core Endpoints
- `POST /api/pipeline/process` - Submit a YouTube URL for summarization
- `GET /api/pipeline/status/{job_id}` - Get processing status
- `GET /api/pipeline/result/{job_id}` - Retrieve summary result
- `GET /api/summaries` - List user's summaries (requires auth)
- `POST /api/export/{id}` - Export summary in different formats
- `POST /api/export/bulk` - Export multiple summaries as ZIP
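
A client typically submits a URL, polls the status endpoint until the job finishes, and then fetches the result. The polling step can be sketched as below. The `get_status` callable stands in for a request to `GET /api/pipeline/status/{job_id}`; the `status` field name and the terminal values `completed`, `failed`, and `cancelled` are assumptions, since the response schema is not documented here.

```python
import time
from typing import Callable


def wait_for_job(
    get_status: Callable[[str], dict],
    job_id: str,
    poll_seconds: float = 2.0,
    timeout_seconds: float = 600.0,
    sleep: Callable[[float], None] = time.sleep,
) -> dict:
    """Poll until the job reports a terminal status, then return the last status payload."""
    deadline = time.monotonic() + timeout_seconds
    while True:
        status = get_status(job_id)
        # Assumed terminal states; adjust to the values the API actually returns
        if status.get("status") in {"completed", "failed", "cancelled"}:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"job {job_id} did not finish within {timeout_seconds}s")
        sleep(poll_seconds)
```

For long-running jobs, the WebSocket progress updates described earlier avoid polling altogether; this loop is a fallback for simple scripts.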
### Batch Processing Endpoints
- `POST /api/batch/create` - Create new batch processing job
- `GET /api/batch/{job_id}` - Get batch job status and progress
- `GET /api/batch/` - List all batch jobs for user
- `POST /api/batch/{job_id}/cancel` - Cancel running batch job
- `POST /api/batch/{job_id}/retry` - Retry failed items in batch
- `GET /api/batch/{job_id}/download` - Download batch results as ZIP
- `DELETE /api/batch/{job_id}` - Delete batch job and results
## 🔧 Developer API Ecosystem
### 🔌 MCP Server Integration
The YouTube Summarizer includes a FastMCP server providing Model Context Protocol tools:
```python
# Use with Claude Code or other MCP-compatible tools
mcp_tools = [
    "extract_transcript",  # Extract video transcripts
    "generate_summary",    # Create AI summaries
    "batch_process",       # Process multiple videos
    "search_summaries",    # Search processed content
    "analyze_video",       # Deep video analysis
]
# MCP Resources for monitoring
mcp_resources = [
    "yt-summarizer://video-metadata/{video_id}",
    "yt-summarizer://processing-queue",
    "yt-summarizer://analytics",
]
```
### 📦 Native SDKs
#### Python SDK
```python
import asyncio

from youtube_summarizer import YouTubeSummarizerClient


async def main() -> None:
    async with YouTubeSummarizerClient(api_key="your-api-key") as client:
        # Extract transcript
        transcript = await client.extract_transcript("https://youtube.com/watch?v=...")
        # Generate summary
        summary = await client.generate_summary(
            video_url="https://youtube.com/watch?v=...",
            summary_type="comprehensive",
        )
        # Batch processing
        batch = await client.batch_process(["url1", "url2", "url3"])


# `async with` and `await` must run inside a coroutine
asyncio.run(main())
```
#### JavaScript/TypeScript SDK
```typescript
import { YouTubeSummarizerClient } from '@youtube-summarizer/sdk';
const client = new YouTubeSummarizerClient({ apiKey: 'your-api-key' });
// Extract transcript with progress tracking
const transcript = await client.extractTranscript('https://youtube.com/watch?v=...', {
  onProgress: (progress) => console.log(`Progress: ${progress.percentage}%`)
});
// Generate summary with streaming
const summary = await client.generateSummary({
  videoUrl: 'https://youtube.com/watch?v=...',
  stream: true,
  onChunk: (chunk) => process.stdout.write(chunk)
});
```
### 🤖 Agent Framework Integration
#### LangChain Tools
```python
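from backend.integrations.langchain_tools import get_youtube_langchain_tools
from langchain import hub
from langchain.agents import AgentExecutor, create_react_agent

tools = get_youtube_langchain_tools()
# create_react_agent needs a ReAct prompt; this pulls the standard one from LangChain Hub
prompt = hub.pull("hwchase17/react")
agent = create_react_agent(llm=your_llm, tools=tools, prompt=prompt)
executor = AgentExecutor(agent=agent, tools=tools)
# Use the async entry point (ainvoke) from inside an async function
result = await executor.ainvoke({
    "input": "Summarize this YouTube video: https://youtube.com/watch?v=..."
})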
```
#### Multi-Framework Support
```python
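# FrameworkType is assumed to be exported by the same module as the orchestrator factory
from backend.integrations.agent_framework import (
    FrameworkType,
    create_youtube_agent_orchestrator,
)
orchestrator = create_youtube_agent_orchestrator()
# Works with LangChain, CrewAI, AutoGen
# (call from inside an async function)
result = await orchestrator.process_video(
    "https://youtube.com/watch?v=...",
    framework=FrameworkType.LANGCHAIN
)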
```
### 🔄 Webhooks & Autonomous Operations
#### Webhook Events
```javascript
// Register webhook endpoint
POST /api/autonomous/webhooks/my-app
{
  "url": "https://myapp.com/webhooks",
  "events": [
    "transcription.completed",
    "summarization.completed",
    "batch.completed",
    "error.occurred"
  ],
  "security_type": "hmac_sha256"
}
// Webhook payload example
{
  "event": "transcription.completed",
  "timestamp": "2024-01-20T10:30:00Z",
  "data": {
    "video_id": "abc123",
    "transcript": "...",
    "quality_score": 0.92,
    "processing_time": 45.2
  }
}
```
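
On the receiving side, a webhook registered with `hmac_sha256` should have its signature checked before the payload is trusted. The sketch below shows the usual approach: recompute the HMAC-SHA256 of the raw request body with the shared secret and compare in constant time. How the service transmits the signature (header name, hex versus base64 encoding) is not documented here, so the hex encoding and the function names are assumptions.

```python
import hashlib
import hmac


def sign_payload(secret: str, body: bytes) -> str:
    """Return the hex HMAC-SHA256 digest of the raw request body."""
    return hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()


def is_valid_signature(secret: str, body: bytes, received_signature: str) -> bool:
    """Compare the expected and received signatures in constant time."""
    expected = sign_payload(secret, body)
    return hmac.compare_digest(expected.encode("utf-8"), received_signature.encode("utf-8"))
```

The signature must be computed over the exact bytes received, before any JSON parsing or re-serialization, because even a whitespace change alters the digest.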
#### Autonomous Rules
```python
# Configure autonomous operations
POST /api/autonomous/automation/rules
{
  "name": "Auto-Process Queue",
  "trigger": "queue_based",
  "action": "batch_process",
  "parameters": {
    "queue_threshold": 10,
    "batch_size": 5
  }
}
```
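
One plausible reading of the rule above is that nothing happens until at least `queue_threshold` jobs are waiting, at which point `batch_size` of them are handed to `batch_process`. The sketch below encodes that interpretation; the class and function names are hypothetical, and the server's actual trigger semantics may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class QueueRule:
    queue_threshold: int  # fire once this many jobs are waiting
    batch_size: int       # how many jobs to hand off per firing


def next_batch(rule: QueueRule, queue: list[str]) -> list[str]:
    """Return the URLs to process now, or an empty list if the rule does not fire."""
    if len(queue) < rule.queue_threshold:
        return []
    return queue[: rule.batch_size]


rule = QueueRule(queue_threshold=10, batch_size=5)
pending = [f"url{i}" for i in range(12)]
print(next_batch(rule, pending))  # prints the first five URLs: url0 through url4
```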
### 🔑 API Authentication
```bash
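# Generate an API key (requires a JWT access token from /api/auth/login)
curl -X POST https://api.yoursummarizer.com/api/auth/api-keys \
  -H "Authorization: Bearer $JWT_TOKEN"
# Use the API key in requests
curl -H "X-API-Key: your-api-key" \
  https://api.yoursummarizer.com/v1/extract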
```
### 📊 Rate Limiting
- **Free Tier**: 100 requests/hour, 1000 requests/day
- **Pro Tier**: 1000 requests/hour, 10000 requests/day
- **Enterprise**: Unlimited with custom limits
### 🌐 API Endpoints
#### Developer API v1
- `POST /api/v1/extract` - Extract transcript with options
- `POST /api/v1/summarize` - Generate summary
- `POST /api/v1/batch` - Batch processing
- `GET /api/v1/status/{job_id}` - Check job status
- `POST /api/v1/search` - Search processed content
- `POST /api/v1/analyze` - Deep video analysis
- `GET /api/v1/webhooks` - Manage webhooks
- `POST /api/v1/automation` - Configure automation
## 🚢 Deployment
### Docker
```bash
docker build -t youtube-summarizer .
docker run -p 8082:8082 --env-file .env youtube-summarizer
```
### Production Considerations
1. **Database**: Use PostgreSQL instead of SQLite for production
2. **Security**:
   - Configure proper CORS settings
   - Set up SSL/TLS certificates
   - Use strong JWT secret keys
   - Enable HTTPS-only cookies
3. **Email Service**: Configure production SMTP server (SendGrid, AWS SES, etc.)
4. **Rate Limiting**: Configure per-user rate limits
5. **Monitoring**:
   - Set up application monitoring (Sentry, New Relic)
   - Configure structured logging
   - Monitor JWT token usage
6. **Scaling**:
   - Use Redis for session storage and caching
   - Implement horizontal scaling with load balancer
   - Use CDN for static assets
## 🤝 Contributing
1. Fork the repository
2. Create a feature branch
3. Commit your changes
4. Push to the branch
5. Create a Pull Request
## 📄 License
This project is part of the Personal AI Assistant ecosystem.
## 🔗 Related Projects
- [Personal AI Assistant](https://eniasgit.zeabur.app/demo/my-ai-projects)
- [YouTube Automation Service](https://eniasgit.zeabur.app/demo/youtube-automation)
- [PDF Translator](https://eniasgit.zeabur.app/demo/pdf-translator)
## 📞 Support
For issues and questions, please create an issue in the repository.