# YouTube Summarizer - Technical Architecture
## Architecture Overview
This document defines the comprehensive technical architecture for the YouTube Summarizer application, designed as a self-hosted, hobby-scale system with professional code quality.
### Design Principles
1. **Self-Hosted Priority**: All components run locally without external cloud dependencies (except AI API calls)
2. **Hobby Scale Optimization**: Simple deployment with Docker Compose, cost-effective (~$0.10/month)
3. **Professional Code Quality**: Modern technologies, type safety, comprehensive testing
4. **Background Processing**: User-requested priority for reliable video processing
5. **Learning-Friendly**: Technologies that provide quick feedback loops and satisfying development experience
## Technology Stack
### Backend Stack
| Component | Technology | Version | Purpose |
|-----------|------------|---------|---------|
| **Runtime** | Python | 3.11+ | AI library compatibility |
| **Framework** | FastAPI | Latest | High-performance async API |
| **Database** | SQLite → PostgreSQL | Latest | Development → Production |
| **ORM** | SQLAlchemy | 2.0+ | Async database operations |
| **Validation** | Pydantic | V2 | Request/response validation |
| **ASGI Server** | Uvicorn | Latest | Production ASGI server |
| **Testing** | pytest | Latest | Unit and integration testing |
### Frontend Stack
| Component | Technology | Version | Purpose |
|-----------|------------|---------|---------|
| **Framework** | React | 18+ | Modern UI framework |
| **Language** | TypeScript | Latest | Type-safe development |
| **Build Tool** | Vite | Latest | Fast development and building |
| **UI Library** | shadcn/ui | Latest | Component design system |
| **Styling** | Tailwind CSS | Latest | Utility-first CSS |
| **State Management** | Zustand | Latest | Global state management |
| **Server State** | React Query | Latest | API calls and caching |
| **Testing** | Vitest + RTL | Latest | Component and unit testing |
### AI & External Services
| Service | Provider | Model | Purpose |
|---------|----------|-------|---------|
| **Primary AI** | OpenAI | GPT-4o-mini | Cost-effective summarization |
| **Fallback AI** | Anthropic | Claude 3 Haiku | Backup model |
| **Alternative** | DeepSeek | DeepSeek Chat | Budget option |
| **Video APIs** | YouTube | youtube-transcript-api | Transcript extraction |
| **Metadata** | YouTube | yt-dlp | Video metadata |
### Development & Deployment
| Component | Technology | Purpose |
|-----------|------------|---------|
| **Containerization** | Docker + Docker Compose | Self-hosted deployment |
| **Code Quality** | Black + Ruff + mypy | Python formatting and linting |
| **Frontend Quality** | ESLint + Prettier | TypeScript/React standards |
| **Pre-commit** | pre-commit hooks | Automated quality checks |
| **Documentation** | FastAPI Auto Docs | API documentation |
## System Architecture
### High-Level Architecture
```
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ React Frontend  │    │ FastAPI Backend │    │  AI Services    │
│                 │    │                 │    │                 │
│ • shadcn/ui     │◄──►│ • REST API      │◄──►│ • OpenAI        │
│ • TypeScript    │    │ • Background    │    │ • Anthropic     │
│ • Zustand       │    │   Tasks         │    │ • DeepSeek      │
│ • React Query   │    │ • SQLAlchemy    │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │
         │                      ▼
         │             ┌─────────────────┐
         │             │   SQLite DB     │
         └────────────►│                 │
                       │ • Summaries     │
                       │ • Jobs          │
                       │ • Cache         │
                       └─────────────────┘
```
### Project Structure
```
youtube-summarizer/
├── frontend/ # React TypeScript frontend
│ ├── src/
│ │ ├── components/ # UI components
│ │ │ ├── ui/ # shadcn/ui base components
│ │ │ ├── forms/ # Form components
│ │ │ ├── summary/ # Summary display components
│ │ │ ├── history/ # History management
│ │ │ ├── processing/ # Status and progress
│ │ │ ├── layout/ # Layout components
│ │ │ └── error/ # Error handling components
│ │ ├── hooks/ # Custom React hooks
│ │ │ ├── api/ # API-specific hooks
│ │ │ └── ui/ # UI utility hooks
│ │ ├── api/ # API client layer
│ │ ├── stores/ # Zustand stores
│ │ ├── types/ # TypeScript definitions
│ │ └── test/ # Test utilities
│ ├── public/ # Static assets
│ ├── package.json # Dependencies and scripts
│ ├── vite.config.ts # Build configuration
│ ├── vitest.config.ts # Test configuration
│ └── tailwind.config.js # Styling configuration
├── backend/ # FastAPI Python backend
│ ├── api/ # API endpoints
│ │ ├── __init__.py
│ │ ├── summarize.py # Main summarization endpoints
│ │ ├── summaries.py # Summary retrieval endpoints
│ │ └── health.py # Health check endpoints
│ ├── services/ # Business logic
│ │ ├── __init__.py
│ │ ├── video_service.py # YouTube integration
│ │ ├── ai_service.py # AI model integration
│ │ └── cache_service.py # Caching logic
│ ├── models/ # Database models
│ │ ├── __init__.py
│ │ ├── summary.py # Summary data model
│ │ └── job.py # Processing job model
│ ├── repositories/ # Data access layer
│ │ ├── __init__.py
│ │ ├── summary_repository.py
│ │ └── job_repository.py
│ ├── core/ # Core utilities
│ │ ├── __init__.py
│ │ ├── config.py # Configuration management
│ │ ├── database.py # Database connection
│ │ ├── exceptions.py # Custom exception classes
│ │ ├── security.py # Rate limiting and validation
│ │ └── cache.py # Caching implementation
│ ├── tests/ # Test suite
│ │ ├── unit/ # Unit tests
│ │ ├── integration/ # Integration tests
│ │ └── conftest.py # Test configuration
│ ├── main.py # FastAPI application entry
│ ├── requirements.txt # Python dependencies
│ └── Dockerfile # Container configuration
├── docker-compose.yml # Self-hosted deployment
├── .env.example # Environment template
├── .pre-commit-config.yaml # Code quality hooks
├── .gitignore # Git ignore patterns
└── README.md # Setup and usage guide
```
## Data Models
### Summary Model
```python
import uuid
from datetime import datetime

from sqlalchemy import JSON, Column, DateTime, Float, Integer, String, Text, Uuid

from core.database import Base  # assumed location of the declarative base


class Summary(Base):
    __tablename__ = "summaries"

    # Primary key (generic Uuid type works on both SQLite and PostgreSQL)
    id = Column(Uuid(as_uuid=True), primary_key=True, default=uuid.uuid4)

    # Video information
    video_id = Column(String(20), nullable=False, index=True)
    video_title = Column(Text)
    video_url = Column(Text, nullable=False)
    video_duration = Column(Integer)  # Duration in seconds
    video_channel = Column(String(255))
    video_upload_date = Column(String(20))  # YYYY-MM-DD format
    video_thumbnail_url = Column(Text)
    video_view_count = Column(Integer)

    # Transcript data
    transcript_text = Column(Text)
    transcript_language = Column(String(10), default='en')
    transcript_type = Column(String(20))  # 'manual' or 'auto-generated'

    # Summary data
    summary_text = Column(Text)
    key_points = Column(JSON)  # Array of strings
    chapters = Column(JSON)  # Array of chapter objects

    # Processing metadata
    model_used = Column(String(50), nullable=False)
    processing_time = Column(Float)  # Processing time in seconds
    token_count = Column(Integer)  # Total tokens used
    cost_estimate = Column(Float)  # Estimated cost in USD

    # Timestamps
    created_at = Column(DateTime, default=datetime.utcnow, nullable=False)
    updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)

    # Cache keys for invalidation
    cache_key = Column(String(255), index=True)  # Hash of video_id + model + options
```
### Processing Job Model
```python
import uuid
from datetime import datetime

from sqlalchemy import JSON, Column, DateTime, Enum, ForeignKey, Integer, String, Text, Uuid

from core.database import Base  # assumed location of the declarative base

# JobStatus is an enum.Enum of job states, expected to be defined in this module.


class ProcessingJob(Base):
    __tablename__ = "processing_jobs"

    id = Column(Uuid(as_uuid=True), primary_key=True, default=uuid.uuid4)
    video_url = Column(Text, nullable=False)
    video_id = Column(String(20), nullable=False)

    # Job configuration
    model_name = Column(String(50), nullable=False)
    options = Column(JSON)  # Summary options (length, focus, etc.)

    # Job status
    status = Column(Enum(JobStatus), default=JobStatus.PENDING, nullable=False)
    progress_percentage = Column(Integer, default=0)
    current_step = Column(String(50))  # "validating", "extracting", "summarizing"

    # Results
    summary_id = Column(Uuid(as_uuid=True), ForeignKey("summaries.id"))  # Foreign key to Summary
    error_message = Column(Text)
    error_code = Column(String(50))

    # Timing
    created_at = Column(DateTime, default=datetime.utcnow)
    started_at = Column(DateTime)
    completed_at = Column(DateTime)
```
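The `ProcessingJob` model refers to a `JobStatus` enum that this document does not define. A minimal sketch of what it might look like follows. Only `PENDING` appears in the model above; the other members, the `STEP_PROGRESS` mapping, and the `is_terminal` helper are assumptions made for illustration.

```python
import enum


class JobStatus(str, enum.Enum):
    """Lifecycle states for a processing job (member set is assumed)."""

    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Hypothetical mapping from current_step to progress_percentage.
STEP_PROGRESS = {"validating": 10, "extracting": 40, "summarizing": 80}


def is_terminal(status: JobStatus) -> bool:
    """Return True once a job can no longer change state."""
    return status in (JobStatus.COMPLETED, JobStatus.FAILED)
```

Subclassing `str` keeps the values JSON-serializable, which lines up with the `"completed" | "processing"` strings used in the API responses.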
## API Specification
### Core Endpoints
#### POST /api/summarize
**Purpose**: Submit a YouTube URL for summarization
**Request**:
```typescript
interface SummarizeRequest {
  url: string;      // YouTube URL
  model?: string;   // AI model selection (default: "openai")
  options?: {
    length?: "brief" | "standard" | "detailed";
    focus?: string;
  };
}
```
**Response**:
```typescript
interface SummarizeResponse {
  id: string;             // Summary ID
  video: VideoMetadata;   // Video information
  summary: SummaryData;   // Generated summary
  status: "completed" | "processing";
  processing_time: number;
}
```
#### GET /api/summary/{id}
**Purpose**: Retrieve a specific summary
**Response**:
```typescript
interface SummaryResponse {
  id: string;
  video: VideoMetadata;
  summary: SummaryData;
  created_at: string;
  metadata: ProcessingMetadata;
}
```
#### GET /api/summaries
**Purpose**: List recent summaries with optional filtering
**Query Parameters**:
- `limit`: Number of results (default: 20)
- `search`: Search term for title/content
- `model`: Filter by AI model used
### Error Handling
#### Error Response Format
```typescript
interface APIErrorResponse {
  error: {
    code: string;          // Error code (e.g., "INVALID_URL")
    message: string;       // Human-readable message
    details: object;       // Additional error context
    recoverable: boolean;  // Whether retry might succeed
    timestamp: string;     // ISO timestamp
    path: string;          // Request path
  }
}
```
#### Error Codes
- `INVALID_URL`: Invalid YouTube URL format
- `VIDEO_NOT_FOUND`: Video is unavailable or private
- `TRANSCRIPT_UNAVAILABLE`: No transcript available for video
- `AI_SERVICE_ERROR`: AI service temporarily unavailable
- `RATE_LIMITED`: Too many requests from this IP
- `TOKEN_LIMIT_EXCEEDED`: Video transcript too long for model
- `UNKNOWN_ERROR`: Unexpected server error
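
On the backend, the error format above could be produced by a small helper. The following is a sketch rather than the project's actual exception classes: the `APIError` dataclass and the choice of which codes count as recoverable (`RECOVERABLE_CODES`) are assumptions.

```python
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict

# Assumed classification: codes for which a retry might succeed.
RECOVERABLE_CODES = {"AI_SERVICE_ERROR", "RATE_LIMITED"}


@dataclass
class APIError:
    code: str
    message: str
    path: str
    details: Dict[str, Any] = field(default_factory=dict)

    def to_response(self) -> Dict[str, Any]:
        """Serialize to the APIErrorResponse shape."""
        body = asdict(self)
        body["recoverable"] = self.code in RECOVERABLE_CODES
        body["timestamp"] = datetime.now(timezone.utc).isoformat()
        return {"error": body}
```

For example, `APIError("RATE_LIMITED", "Too many requests", "/api/summarize").to_response()` yields a payload with `recoverable` set to `True`.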
## Frontend Architecture
### Component Architecture
#### Core Components
- **SummarizeForm**: Main URL input form with validation
- **SummaryDisplay**: Comprehensive summary viewer with export options
- **ProcessingStatus**: Real-time progress updates
- **SummaryHistory**: Searchable list of previous summaries
- **ErrorBoundary**: React error boundaries with recovery options
#### State Management
**Zustand Stores**:
```typescript
interface AppStore {
  // UI state
  theme: 'light' | 'dark';
  sidebarOpen: boolean;

  // Processing state
  currentJob: ProcessingJob | null;
  processingHistory: ProcessingJob[];

  // Settings
  defaultModel: string;
  summaryLength: string;
}

interface SummaryStore {
  summaries: Summary[];
  currentSummary: Summary | null;
  searchResults: Summary[];

  // Actions
  addSummary: (summary: Summary) => void;
  updateSummary: (id: string, updates: Partial<Summary>) => void;
  searchSummaries: (query: string) => void;
}
```
#### API Client Architecture
**TypeScript API Client**:
```typescript
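import axios, { AxiosInstance } from 'axios';

class APIClient {
  private readonly baseURL: string;
  private httpClient: AxiosInstance;

  constructor(baseURL: string) {
    this.baseURL = baseURL;
    this.httpClient = axios.create({ baseURL, timeout: 30000 });
    this.setupInterceptors();
  }

  // Hook for automatic retries and error handling; this sketch only passes errors through
  private setupInterceptors(): void {
    this.httpClient.interceptors.response.use((res) => res, (err) => Promise.reject(err));
  }

  // Type-safe API methods
  async summarizeVideo(request: SummarizeRequest): Promise<SummarizeResponse> {
    return (await this.httpClient.post<SummarizeResponse>('/api/summarize', request)).data;
  }
  async getSummary(id: string): Promise<SummaryResponse> {
    return (await this.httpClient.get<SummaryResponse>(`/api/summary/${id}`)).data;
  }
  async getSummaries(params?: SummaryListParams): Promise<SummaryListResponse> {
    return (await this.httpClient.get<SummaryListResponse>('/api/summaries', { params })).data;
  }
  // The export route is an assumption; the API specification above does not define one
  async exportSummary(id: string, format: ExportFormat): Promise<Blob> {
    const res = await this.httpClient.get<Blob>(`/api/summary/${id}/export`, { params: { format }, responseType: 'blob' });
    return res.data;
  }
}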
```
## Backend Services
### Video Service
**Purpose**: Handle YouTube URL processing and transcript extraction
**Key Methods**:
```python
from typing import Any, Dict


class VideoService:
    async def extract_video_id(self, url: str) -> str:
        """Extract video ID with comprehensive URL format support"""

    async def get_transcript(self, video_id: str) -> Dict[str, Any]:
        """Get transcript with fallback chain:
        1. Manual captions (preferred)
        2. Auto-generated captions
        3. Error with helpful message
        """

    async def get_video_metadata(self, video_id: str) -> Dict[str, Any]:
        """Extract metadata using yt-dlp for rich video information"""
```
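To make "comprehensive URL format support" concrete, here is a standalone sketch of ID extraction using only the standard library. The accepted hosts and path prefixes, and the 11-character ID pattern, are assumptions based on commonly seen YouTube URLs, not a specification.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumption: video IDs are 11 characters drawn from [A-Za-z0-9_-].
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")
_HOSTS = {"youtube.com", "m.youtube.com", "music.youtube.com"}
_PATH_PREFIXES = {"embed", "shorts", "live", "v"}


def extract_video_id(url: str) -> str:
    """Return the video ID from a YouTube URL or raise ValueError."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower().removeprefix("www.")
    candidate = ""
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in _HOSTS:
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        else:
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in _PATH_PREFIXES:
                candidate = parts[1]
    if not _VIDEO_ID.fullmatch(candidate):
        raise ValueError(f"INVALID_URL: {url!r}")
    return candidate
```

Raising `ValueError` keeps the helper framework-agnostic; the API layer can translate it into the `INVALID_URL` error code.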
### AI Service
**Purpose**: Manage AI model integration with provider abstraction
**Key Methods**:
```python
from typing import Any, Dict, Optional


class AIService:
    def __init__(self, provider: str, api_key: str):
        self.provider = provider
        # _get_client (not shown) returns the provider-specific SDK client
        self.client = self._get_client(provider, api_key)

    async def generate_summary(
        self,
        transcript: str,
        video_metadata: Dict[str, Any],
        options: Optional[Dict[str, Any]] = None,
    ) -> Dict[str, Any]:
        """Generate structured summary with:
        - Overview paragraph
        - Key points list
        - Chapter breakdown (if applicable)
        - Cost tracking
        """
```
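The technology stack lists a primary model plus fallbacks. One way to express that ordering is sketched below; it is illustrative only. Providers are modeled as plain async callables, and `summarize_with_fallback`, `failing_provider`, and `echo_provider` are hypothetical names, not part of any vendor SDK.

```python
import asyncio
from typing import Awaitable, Callable, Dict, List, Tuple

# A provider is modeled as an async callable: transcript -> summary text.
Provider = Callable[[str], Awaitable[str]]


async def summarize_with_fallback(
    transcript: str, providers: List[Tuple[str, Provider]]
) -> Dict[str, str]:
    """Try each provider in order and return the first successful result."""
    errors: List[str] = []
    for name, call in providers:
        try:
            text = await call(transcript)
            return {"model_used": name, "summary_text": text}
        except Exception as exc:  # broad on purpose: any failure triggers fallback
            errors.append(f"{name}: {exc}")
    raise RuntimeError("AI_SERVICE_ERROR: " + "; ".join(errors))


# Stand-in providers for demonstration; real ones would call the vendor SDKs.
async def failing_provider(transcript: str) -> str:
    raise TimeoutError("upstream timed out")


async def echo_provider(transcript: str) -> str:
    return transcript[:20]


result = asyncio.run(
    summarize_with_fallback(
        "hello world",
        [("openai", failing_provider), ("anthropic", echo_provider)],
    )
)
```

Here the first provider fails, so `result["model_used"]` is `"anthropic"`. Recording which provider answered matches the `model_used` column on the `Summary` model.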
### Cache Service
**Purpose**: Intelligent caching to minimize API costs
**Caching Strategy**:
```python
from typing import Dict, Optional

from models.summary import Summary  # path follows the project structure above


class CacheService:
    def get_cache_key(self, video_id: str, model: str, options: Dict) -> str:
        """Generate cache key from video_id + model + options hash"""

    async def get_cached_summary(self, cache_key: str) -> Optional[Summary]:
        """Retrieve cached summary if within TTL"""

    async def cache_summary(self, cache_key: str, summary: Summary, ttl: int = 86400):
        """Store summary with 24-hour default TTL"""
```
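The TTL behavior described above can be illustrated with a small in-memory cache. This is a sketch under stated assumptions, not the project's `CacheService`: the `InMemoryTTLCache` class and its injectable `clock` parameter are hypothetical, and a real implementation would also persist entries to the database.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class InMemoryTTLCache:
    """Dictionary-backed cache whose entries expire after a TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl: int = 86400) -> None:
        """Store a value that expires ttl seconds from now."""
        self._entries[key] = (self._clock() + ttl, value)

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None if missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop stale entries lazily on read
            return None
        return value
```

Expiring lazily on read avoids a background sweeper, which seems a reasonable trade-off at hobby scale.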
## Testing Strategy
### Backend Testing
**Test Structure**:
```
backend/tests/
├── unit/
│ ├── test_video_service.py # URL parsing, transcript extraction
│ ├── test_ai_service.py # AI integration, prompt engineering
│ ├── test_cache_service.py # Cache logic, key generation
│ └── test_repositories.py # Database operations
├── integration/
│ ├── test_api.py # End-to-end API testing
│ ├── test_background_jobs.py # Background processing
│ └── test_error_handling.py # Error scenarios
└── conftest.py # Test configuration and fixtures
```
**Testing Patterns**:
- **Repository Pattern Testing**: Mock database, test data operations
- **Service Layer Testing**: Mock external APIs, test business logic
- **API Endpoint Testing**: FastAPI TestClient for request/response testing
- **Error Scenario Testing**: Comprehensive error condition coverage
### Frontend Testing
**Test Structure**:
```
frontend/src/
├── components/
│ ├── SummarizeForm.test.tsx # Form validation, submission
│ ├── SummaryDisplay.test.tsx # Summary rendering, export
│ └── ErrorBoundary.test.tsx # Error handling components
├── hooks/
│ ├── api/
│ │ └── useSummarization.test.ts # API hook testing
│ └── ui/
├── test/
│ ├── setup.ts # Global test configuration
│ ├── mocks/ # API and component mocks
│ └── utils.tsx # Test utilities and wrappers
└── api/
└── client.test.ts # API client testing
```
**Testing Patterns**:
- **Component Testing**: Render, interaction, and state testing
- **Custom Hook Testing**: Logic testing with renderHook
- **API Client Testing**: Mock HTTP responses, error handling
- **Integration Testing**: Full user flow testing
### Test Configuration
**pytest Configuration** (`backend/pytest.ini`):
```ini
[pytest]
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*
addopts =
    --verbose
    --cov=.
    --cov-report=html
    --cov-report=term-missing
    --asyncio-mode=auto
```
**Vitest Configuration** (`frontend/vitest.config.ts`):
```typescript
import { defineConfig } from 'vitest/config';
import react from '@vitejs/plugin-react';

export default defineConfig({
  plugins: [react()],
  test: {
    environment: 'jsdom',
    setupFiles: ['./src/test/setup.ts'],
    globals: true,
    css: true,
    coverage: {
      reporter: ['text', 'html', 'json'],
      exclude: ['node_modules/', 'src/test/']
    }
  }
});
```
## Deployment Architecture
### Self-Hosted Docker Deployment
**Docker Compose Configuration**:
```yaml
version: '3.8'
services:
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=sqlite:///./data/youtube_summarizer.db
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    volumes:
      - ./data:/app/data
      - ./logs:/app/logs
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    restart: unless-stopped
  frontend:
    build: ./frontend
    ports:
      - "3000:3000"
    environment:
      # Vite only exposes variables prefixed with VITE_ to client code
      - VITE_API_URL=http://localhost:8000
    depends_on:
      - backend
    restart: unless-stopped
```
### Environment Configuration
**Required Environment Variables**:
```bash
# API Keys (at least one required)
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
DEEPSEEK_API_KEY=sk-your-deepseek-key
# Database
DATABASE_URL=sqlite:///./data/youtube_summarizer.db
# Security
SECRET_KEY=your-secret-key-here
CORS_ORIGINS=http://localhost:3000,http://localhost:5173
# Optional: YouTube API for metadata
YOUTUBE_API_KEY=your-youtube-api-key
# Application Settings
MAX_VIDEO_LENGTH_MINUTES=180
RATE_LIMIT_PER_MINUTE=30
CACHE_TTL_HOURS=24
# Frontend Environment Variables (Vite exposes only VITE_-prefixed variables)
VITE_API_URL=http://localhost:8000
VITE_ENVIRONMENT=development
```
## Security Considerations
### Input Validation
- **URL Validation**: Comprehensive YouTube URL format checking
- **Input Sanitization**: HTML escaping and XSS prevention
- **Request Size Limits**: Prevent oversized requests
### Rate Limiting
```python
class RateLimiter:
    def __init__(self, max_requests: int = 30, window_seconds: int = 60):
        self.max_requests = max_requests
        self.window_seconds = window_seconds

    def is_allowed(self, client_ip: str) -> bool:
        """Check if request is allowed for this IP"""
```
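The interface above leaves the algorithm open. One common choice is a sliding window over recent request timestamps, sketched here. The class name `SlidingWindowRateLimiter` and the injectable `clock` are assumptions for illustration; state lives in process memory, so it resets on restart and is not shared across workers.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowRateLimiter:
    """Allow at most max_requests per client within the trailing window."""

    def __init__(
        self,
        max_requests: int = 30,
        window_seconds: int = 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def is_allowed(self, client_ip: str) -> bool:
        now = self._clock()
        hits = self._hits[client_ip]
        # Discard timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A request handler would call `is_allowed(client_ip)` and respond with the `RATE_LIMITED` error code when it returns `False`.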
### API Key Management
- Environment variable storage (never commit to repository)
- Rotation capability for production deployments
- Separate keys for different environments
### CORS Configuration
```python
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000", "http://localhost:5173"],
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "DELETE"],
    allow_headers=["*"],
)
```
## Performance Optimization
### Backend Optimization
- **Async Everything**: All I/O operations use async/await
- **Background Processing**: Long-running tasks don't block requests
- **Intelligent Caching**: Memory and database caching layers
- **Connection Pooling**: Database connection reuse
### Frontend Optimization
- **Virtual Scrolling**: Handle large summary lists efficiently
- **Debounced Search**: Reduce API calls during user input
- **Code Splitting**: Load components only when needed
- **React Query Caching**: Automatic request deduplication and caching
### Caching Strategy
```python
import hashlib
import json

# Multi-layer caching approach
# 1. Memory cache for hot data (current session)
# 2. Database cache for persistence (24-hour TTL)
# 3. Smart cache keys: hash(video_id + model + options)
def get_cache_key(video_id: str, model: str, options: dict) -> str:
    # sort_keys makes the key independent of option ordering
    key_data = f"{video_id}:{model}:{json.dumps(options, sort_keys=True)}"
    return hashlib.sha256(key_data.encode()).hexdigest()
```
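
A short usage sketch shows why the options are serialized with `sort_keys=True`. The function is repeated so the example runs on its own; the video ID, model name, and option values are placeholders.

```python
import hashlib
import json


def get_cache_key(video_id: str, model: str, options: dict) -> str:
    key_data = f"{video_id}:{model}:{json.dumps(options, sort_keys=True)}"
    return hashlib.sha256(key_data.encode()).hexdigest()


# Same options in a different order produce the same key.
a = get_cache_key("dQw4w9WgXcQ", "gpt-4o-mini", {"length": "brief", "focus": "tech"})
b = get_cache_key("dQw4w9WgXcQ", "gpt-4o-mini", {"focus": "tech", "length": "brief"})

# Changing any option value produces a different key.
c = get_cache_key("dQw4w9WgXcQ", "gpt-4o-mini", {"length": "detailed", "focus": "tech"})

# None and {} serialize differently ("null" vs "{}"), so they do not share a key.
d = get_cache_key("dQw4w9WgXcQ", "gpt-4o-mini", None)
e = get_cache_key("dQw4w9WgXcQ", "gpt-4o-mini", {})
```

Because `None` and an empty dict hash differently, callers may want to normalize missing options to `{}` before computing the key.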
## Cost Optimization
### AI API Cost Management
- **Model Selection**: Default to GPT-4o-mini (roughly $0.15 per 1M input tokens and $0.60 per 1M output tokens at launch pricing; verify current rates)
- **Token Optimization**: Efficient prompts and transcript chunking
- **Caching Strategy**: 24-hour cache reduces repeat API calls
- **Usage Tracking**: Monitor and alert on cost thresholds
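
Transcript chunking can be sketched as follows. This is one possible approach, not the project's implementation: `chunk_transcript` is a hypothetical helper, and the four-characters-per-token ratio is a rough heuristic for English text, not a real tokenizer.

```python
from typing import List

CHARS_PER_TOKEN = 4  # rough heuristic for English text, not a tokenizer


def chunk_transcript(text: str, max_tokens: int = 3000) -> List[str]:
    """Split text on word boundaries into chunks under an approximate token budget."""
    max_chars = max_tokens * CHARS_PER_TOKEN
    chunks: List[str] = []
    current: List[str] = []
    length = 0
    for word in text.split():
        # Account for the joining space when the chunk already has words.
        added = len(word) + (1 if current else 0)
        if current and length + added > max_chars:
            chunks.append(" ".join(current))
            current, length = [], 0
            added = len(word)
        current.append(word)
        length += added
    if current:
        chunks.append(" ".join(current))
    return chunks
```

Each chunk can then be summarized separately and the partial summaries combined, which keeps long transcripts below the model's context limit and avoids `TOKEN_LIMIT_EXCEEDED`. A single word longer than the budget still becomes its own oversized chunk.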
### Target Cost Structure (Hobby Scale)
- **Base Cost**: ~$0.10/month for typical usage
- **Video Processing**: ~$0.001-0.005 per 30-minute video
- **Caching Benefit**: ~80% reduction in repeat processing costs
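
The per-video figure can be sanity-checked with simple arithmetic. Every input below is an assumption: the prices are GPT-4o-mini launch rates, and the speaking rate, tokens-per-word ratio, and 500-token summary length are rough estimates.

```python
# Assumed prices in USD per 1M tokens; check the provider's current price list.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for one summarization call."""
    return (
        input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M
    ) / 1_000_000


# Assumption: 30 minutes at ~150 spoken words/minute and ~1.33 tokens/word.
transcript_tokens = round(30 * 150 * 1.33)
per_video = estimate_cost(transcript_tokens, output_tokens=500)

# Number of uncached videos that fit in the $0.10/month target.
videos_per_month = int(0.10 // per_video)
```

Under these assumptions a 30-minute video costs about $0.0012, which sits at the low end of the range above, and the monthly target covers roughly 80 uncached videos.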
## Development Workflow
### Quick Start Commands
```bash
# Development setup
git clone <repository>
cd youtube-summarizer
cp .env.example .env
# Edit .env with your API keys
# Single command startup
docker-compose up
# Access points
# Frontend: http://localhost:3000
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs
```
### Development Scripts
```json
{
  "scripts": {
    "dev": "docker-compose up",
    "dev:backend": "cd backend && uvicorn main:app --reload",
    "dev:frontend": "cd frontend && npm run dev",
    "test": "npm run test:backend && npm run test:frontend",
    "test:backend": "cd backend && pytest",
    "test:frontend": "cd frontend && npm test",
    "build": "docker-compose build",
    "lint": "npm run lint:backend && npm run lint:frontend",
    "lint:backend": "cd backend && ruff check . && black --check . && mypy .",
    "lint:frontend": "cd frontend && eslint src && prettier --check src"
  }
}
```
### Git Hooks
```yaml
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/psf/black
    rev: 23.3.0
    hooks:
      - id: black
        files: ^backend/
  - repo: https://github.com/charliermarsh/ruff-pre-commit
    rev: v0.0.270
    hooks:
      - id: ruff
        files: ^backend/
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.3.0
    hooks:
      - id: mypy
        files: ^backend/
        additional_dependencies: [types-all]
  - repo: https://github.com/pre-commit/mirrors-eslint
    rev: v8.42.0
    hooks:
      - id: eslint
        files: ^frontend/src/
        types: [file]
        types_or: [ts, tsx]
```
---
## Architecture Decision Records
### ADR-001: Self-Hosted Architecture Choice
**Status**: Accepted
**Context**: The user explicitly requested self-hosting and a hobby-scale deployment
**Decision**: Docker Compose deployment with local database storage
**Consequences**: Simplified deployment, reduced costs, requires local resource management
### ADR-002: AI Model Strategy
**Status**: Accepted
**Context**: Cost optimization for hobby use while maintaining quality
**Decision**: Primary OpenAI GPT-4o-mini, fallback to other models
**Consequences**: ~$0.10/month costs, good quality summaries, multiple provider support
### ADR-003: Database Evolution Path
**Status**: Accepted
**Context**: Start simple but allow growth to production scale
**Decision**: SQLite for development/hobby, PostgreSQL migration path for production
**Consequences**: Zero-config development start, clear upgrade path when needed
---
*This architecture document serves as the definitive technical guide for implementing the YouTube Summarizer application.*