YouTube Summarizer - Technical Architecture
Architecture Overview
This document defines the comprehensive technical architecture for the YouTube Summarizer application, designed as a self-hosted, hobby-scale system with professional code quality.
Design Principles
- Self-Hosted Priority: All components run locally without external cloud dependencies (except AI API calls)
- Hobby Scale Optimization: Simple deployment with Docker Compose, cost-effective (~$0.10/month)
- Professional Code Quality: Modern technologies, type safety, comprehensive testing
- Background Processing: User-requested priority for reliable video processing
- Learning-Friendly: Technologies that provide quick feedback loops and satisfying development experience
Technology Stack
Backend Stack
| Component | Technology | Version | Purpose |
|---|---|---|---|
| Runtime | Python | 3.11+ | AI library compatibility |
| Framework | FastAPI | Latest | High-performance async API |
| Database | SQLite → PostgreSQL | Latest | Development → Production |
| ORM | SQLAlchemy | 2.0+ | Async database operations |
| Validation | Pydantic | V2 | Request/response validation |
| ASGI Server | Uvicorn | Latest | Production ASGI server |
| Testing | pytest | Latest | Unit and integration testing |
Frontend Stack
| Component | Technology | Version | Purpose |
|---|---|---|---|
| Framework | React | 18+ | Modern UI framework |
| Language | TypeScript | Latest | Type-safe development |
| Build Tool | Vite | Latest | Fast development and building |
| UI Library | shadcn/ui | Latest | Component design system |
| Styling | Tailwind CSS | Latest | Utility-first CSS |
| State Management | Zustand | Latest | Global state management |
| Server State | React Query | Latest | API calls and caching |
| Testing | Vitest + RTL | Latest | Component and unit testing |
AI & External Services
| Service | Provider | Model | Purpose |
|---|---|---|---|
| Primary AI | OpenAI | GPT-4o-mini | Cost-effective summarization |
| Fallback AI | Anthropic | Claude 3 Haiku | Backup model |
| Alternative | DeepSeek | DeepSeek Chat | Budget option |
| Video APIs | YouTube | youtube-transcript-api | Transcript extraction |
| Metadata | YouTube | yt-dlp | Video metadata |
Development & Deployment
| Component | Technology | Purpose |
|---|---|---|
| Containerization | Docker + Docker Compose | Self-hosted deployment |
| Code Quality | Black + Ruff + mypy | Python formatting and linting |
| Frontend Quality | ESLint + Prettier | TypeScript/React standards |
| Pre-commit | pre-commit hooks | Automated quality checks |
| Documentation | FastAPI Auto Docs | API documentation |
System Architecture
High-Level Architecture
┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│ React Frontend  │    │ FastAPI Backend │    │   AI Services   │
│                 │    │                 │    │                 │
│ • shadcn/ui     │◄──►│ • REST API      │◄──►│ • OpenAI        │
│ • TypeScript    │    │ • Background    │    │ • Anthropic     │
│ • Zustand       │    │   Tasks         │    │ • DeepSeek      │
│ • React Query   │    │ • SQLAlchemy    │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
         │                      │
         │                      ▼
         │             ┌─────────────────┐
         │             │    SQLite DB    │
         └────────────►│                 │
                       │ • Summaries     │
                       │ • Jobs          │
                       │ • Cache         │
                       └─────────────────┘
Project Structure
youtube-summarizer/
├── frontend/ # React TypeScript frontend
│ ├── src/
│ │ ├── components/ # UI components
│ │ │ ├── ui/ # shadcn/ui base components
│ │ │ ├── forms/ # Form components
│ │ │ ├── summary/ # Summary display components
│ │ │ ├── history/ # History management
│ │ │ ├── processing/ # Status and progress
│ │ │ ├── layout/ # Layout components
│ │ │ └── error/ # Error handling components
│ │ ├── hooks/ # Custom React hooks
│ │ │ ├── api/ # API-specific hooks
│ │ │ └── ui/ # UI utility hooks
│ │ ├── api/ # API client layer
│ │ ├── stores/ # Zustand stores
│ │ ├── types/ # TypeScript definitions
│ │ └── test/ # Test utilities
│ ├── public/ # Static assets
│ ├── package.json # Dependencies and scripts
│ ├── vite.config.ts # Build configuration
│ ├── vitest.config.ts # Test configuration
│ └── tailwind.config.js # Styling configuration
├── backend/ # FastAPI Python backend
│ ├── api/ # API endpoints
│ │ ├── __init__.py
│ │ ├── summarize.py # Main summarization endpoints
│ │ ├── summaries.py # Summary retrieval endpoints
│ │ └── health.py # Health check endpoints
│ ├── services/ # Business logic
│ │ ├── __init__.py
│ │ ├── video_service.py # YouTube integration
│ │ ├── ai_service.py # AI model integration
│ │ └── cache_service.py # Caching logic
│ ├── models/ # Database models
│ │ ├── __init__.py
│ │ ├── summary.py # Summary data model
│ │ └── job.py # Processing job model
│ ├── repositories/ # Data access layer
│ │ ├── __init__.py
│ │ ├── summary_repository.py
│ │ └── job_repository.py
│ ├── core/ # Core utilities
│ │ ├── __init__.py
│ │ ├── config.py # Configuration management
│ │ ├── database.py # Database connection
│ │ ├── exceptions.py # Custom exception classes
│ │ ├── security.py # Rate limiting and validation
│ │ └── cache.py # Caching implementation
│ ├── tests/ # Test suite
│ │ ├── unit/ # Unit tests
│ │ ├── integration/ # Integration tests
│ │ └── conftest.py # Test configuration
│ ├── main.py # FastAPI application entry
│ ├── requirements.txt # Python dependencies
│ └── Dockerfile # Container configuration
├── docker-compose.yml # Self-hosted deployment
├── .env.example # Environment template
├── .pre-commit-config.yaml # Code quality hooks
├── .gitignore # Git ignore patterns
└── README.md # Setup and usage guide
Data Models
Summary Model
class Summary(Base):
    __tablename__ = "summaries"

    # Primary key
    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)

    # Video information
    video_id = Column(String(20), nullable=False, index=True)
    video_title = Column(Text)
    video_url = Column(Text, nullable=False)
    video_duration = Column(Integer)  # Duration in seconds
    video_channel = Column(String(255))
    video_upload_date = Column(String(20))  # YYYY-MM-DD format
    video_thumbnail_url = Column(Text)
    video_view_count = Column(Integer)

    # Transcript data
    transcript_text = Column(Text)
    transcript_language = Column(String(10), default='en')
    transcript_type = Column(String(20))  # 'manual' or 'auto-generated'

    # Summary data
    summary_text = Column(Text)
    key_points = Column(JSON)  # Array of strings
    chapters = Column(JSON)  # Array of chapter objects

    # Processing metadata
    model_used = Column(String(50), nullable=False)
    processing_time = Column(Float)  # Processing time in seconds
    token_count = Column(Integer)  # Total tokens used
    cost_estimate = Column(Float)  # Estimated cost in USD

    # Timestamps
    created_at = Column(DateTime, default=datetime.utcnow, nullable=False)
    updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)

    # Cache key for lookup and invalidation
    cache_key = Column(String(255), index=True)  # Hash of video_id + model + options
Processing Job Model
class ProcessingJob(Base):
    __tablename__ = "processing_jobs"

    id = Column(UUID(as_uuid=True), primary_key=True, default=uuid.uuid4)
    video_url = Column(Text, nullable=False)
    video_id = Column(String(20), nullable=False)

    # Job configuration
    model_name = Column(String(50), nullable=False)
    options = Column(JSON)  # Summary options (length, focus, etc.)

    # Job status
    status = Column(Enum(JobStatus), default=JobStatus.PENDING, nullable=False)
    progress_percentage = Column(Integer, default=0)
    current_step = Column(String(50))  # "validating", "extracting", "summarizing"

    # Results
    summary_id = Column(UUID(as_uuid=True), ForeignKey("summaries.id"))  # Set once the summary exists
    error_message = Column(Text)
    error_code = Column(String(50))

    # Timing
    created_at = Column(DateTime, default=datetime.utcnow)
    started_at = Column(DateTime)
    completed_at = Column(DateTime)
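The `status` column references a `JobStatus` enum that this document does not define. A minimal sketch of what it could look like follows; every member other than `PENDING` is an assumption, and `is_terminal` is a hypothetical helper added for illustration.

```python
import enum


class JobStatus(str, enum.Enum):
    """Lifecycle states for a processing job (member set is an assumption)."""

    PENDING = "pending"        # Queued, not yet started
    PROCESSING = "processing"  # The pipeline is currently running
    COMPLETED = "completed"    # Summary stored; summary_id should be set
    FAILED = "failed"          # error_code and error_message should be set


def is_terminal(status: JobStatus) -> bool:
    """Return True once a job can no longer change state."""
    return status in (JobStatus.COMPLETED, JobStatus.FAILED)
```

Subclassing `str` keeps the values JSON-serializable, which is convenient if the status is returned directly from the API.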
API Specification
Core Endpoints
POST /api/summarize
Purpose: Submit a YouTube URL for summarization
Request:
interface SummarizeRequest {
  url: string;     // YouTube URL
  model?: string;  // AI model selection (default: "openai")
  options?: {
    length?: "brief" | "standard" | "detailed";
    focus?: string;
  };
}
Response:
interface SummarizeResponse {
  id: string;            // Summary ID
  video: VideoMetadata;  // Video information
  summary: SummaryData;  // Generated summary
  status: "completed" | "processing";
  processing_time: number;
}
GET /api/summary/{id}
Purpose: Retrieve a specific summary
Response:
interface SummaryResponse {
  id: string;
  video: VideoMetadata;
  summary: SummaryData;
  created_at: string;
  metadata: ProcessingMetadata;
}
GET /api/summaries
Purpose: List recent summaries with optional filtering
Query Parameters:
- limit: Number of results (default: 20)
- search: Search term for title/content
- model: Filter by AI model used
Error Handling
Error Response Format
interface APIErrorResponse {
  error: {
    code: string;          // Error code (e.g., "INVALID_URL")
    message: string;       // Human-readable message
    details: object;       // Additional error context
    recoverable: boolean;  // Whether retry might succeed
    timestamp: string;     // ISO timestamp
    path: string;          // Request path
  }
}
Error Codes
- INVALID_URL: Invalid YouTube URL format
- VIDEO_NOT_FOUND: Video is unavailable or private
- TRANSCRIPT_UNAVAILABLE: No transcript available for video
- AI_SERVICE_ERROR: AI service temporarily unavailable
- RATE_LIMITED: Too many requests from this IP
- TOKEN_LIMIT_EXCEEDED: Video transcript too long for model
- UNKNOWN_ERROR: Unexpected server error
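On the backend, a payload in the error response format could be assembled by a small helper like the one below. This is a sketch only: `build_error_response` is a hypothetical name, and the choice of which codes count as recoverable is an assumption the document does not spell out.

```python
from datetime import datetime, timezone
from typing import Any, Optional

# Assumption: only transient failures are worth retrying.
RECOVERABLE_CODES = {"AI_SERVICE_ERROR", "RATE_LIMITED"}


def build_error_response(
    code: str,
    message: str,
    path: str,
    details: Optional[dict[str, Any]] = None,
) -> dict[str, Any]:
    """Build a dict matching the APIErrorResponse shape."""
    return {
        "error": {
            "code": code,
            "message": message,
            "details": details or {},
            "recoverable": code in RECOVERABLE_CODES,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "path": path,
        }
    }
```

A handler would call, for example, `build_error_response("RATE_LIMITED", "Too many requests from this IP", "/api/summarize")` and return the result with the matching HTTP status.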
Frontend Architecture
Component Architecture
Core Components
- SummarizeForm: Main URL input form with validation
- SummaryDisplay: Comprehensive summary viewer with export options
- ProcessingStatus: Real-time progress updates
- SummaryHistory: Searchable list of previous summaries
- ErrorBoundary: React error boundaries with recovery options
State Management
Zustand Stores:
interface AppStore {
  // UI state
  theme: 'light' | 'dark';
  sidebarOpen: boolean;

  // Processing state
  currentJob: ProcessingJob | null;
  processingHistory: ProcessingJob[];

  // Settings
  defaultModel: string;
  summaryLength: string;
}

interface SummaryStore {
  summaries: Summary[];
  currentSummary: Summary | null;
  searchResults: Summary[];

  // Actions
  addSummary: (summary: Summary) => void;
  updateSummary: (id: string, updates: Partial<Summary>) => void;
  searchSummaries: (query: string) => void;
}
API Client Architecture
TypeScript API Client:
import axios, { AxiosInstance } from 'axios';

class APIClient {
  private baseURL: string;
  private httpClient: AxiosInstance;

  // Configure automatic retries and error handling
  constructor(baseURL: string) {
    this.baseURL = baseURL;
    this.httpClient = axios.create({
      baseURL,
      timeout: 30000,
    });
    this.setupInterceptors();
  }

  // Type-safe API methods (signatures only; bodies and setupInterceptors are omitted here)
  async summarizeVideo(request: SummarizeRequest): Promise<SummarizeResponse>;
  async getSummary(id: string): Promise<SummaryResponse>;
  async getSummaries(params?: SummaryListParams): Promise<SummaryListResponse>;
  async exportSummary(id: string, format: ExportFormat): Promise<Blob>;
}
Backend Services
Video Service
Purpose: Handle YouTube URL processing and transcript extraction
Key Methods:
class VideoService:
    async def extract_video_id(self, url: str) -> str:
        """Extract video ID with comprehensive URL format support"""

    async def get_transcript(self, video_id: str) -> Dict[str, Any]:
        """Get transcript with fallback chain:
        1. Manual captions (preferred)
        2. Auto-generated captions
        3. Error with helpful message
        """

    async def get_video_metadata(self, video_id: str) -> Dict[str, Any]:
        """Extract metadata using yt-dlp for rich video information"""
AI Service
Purpose: Manage AI model integration with provider abstraction
Key Methods:
class AIService:
    def __init__(self, provider: str, api_key: str):
        self.provider = provider
        self.client = self._get_client(provider, api_key)

    async def generate_summary(
        self,
        transcript: str,
        video_metadata: Dict[str, Any],
        options: Optional[Dict[str, Any]] = None,
    ) -> Dict[str, Any]:
        """Generate structured summary with:
        - Overview paragraph
        - Key points list
        - Chapter breakdown (if applicable)
        - Cost tracking
        """
Cache Service
Purpose: Intelligent caching to minimize API costs
Caching Strategy:
class CacheService:
    def get_cache_key(self, video_id: str, model: str, options: Dict) -> str:
        """Generate cache key from video_id + model + options hash"""

    async def get_cached_summary(self, cache_key: str) -> Optional[Summary]:
        """Retrieve cached summary if within TTL"""

    async def cache_summary(self, cache_key: str, summary: Summary, ttl: int = 86400):
        """Store summary with 24-hour default TTL"""
Testing Strategy
Backend Testing
Test Structure:
backend/tests/
├── unit/
│ ├── test_video_service.py # URL parsing, transcript extraction
│ ├── test_ai_service.py # AI integration, prompt engineering
│ ├── test_cache_service.py # Cache logic, key generation
│ └── test_repositories.py # Database operations
├── integration/
│ ├── test_api.py # End-to-end API testing
│ ├── test_background_jobs.py # Background processing
│ └── test_error_handling.py # Error scenarios
└── conftest.py # Test configuration and fixtures
Testing Patterns:
- Repository Pattern Testing: Mock database, test data operations
- Service Layer Testing: Mock external APIs, test business logic
- API Endpoint Testing: FastAPI TestClient for request/response testing
- Error Scenario Testing: Comprehensive error condition coverage
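The service-layer pattern (mock external APIs, test business logic) might look like the following in practice. `SummaryPipeline` is a hypothetical orchestrator invented for this example; only the `get_transcript` and `generate_summary` method names come from this document, and the `"text"` key in the transcript dict is an assumption.

```python
import asyncio
from unittest.mock import AsyncMock


class SummaryPipeline:
    """Hypothetical orchestrator that wires the two services together."""

    def __init__(self, video_service, ai_service):
        self.video_service = video_service
        self.ai_service = ai_service

    async def run(self, video_id: str) -> dict:
        transcript = await self.video_service.get_transcript(video_id)
        return await self.ai_service.generate_summary(transcript["text"], {})


def test_pipeline_passes_transcript_to_ai():
    # Both external dependencies are replaced with async mocks.
    video = AsyncMock()
    video.get_transcript.return_value = {"text": "hello world"}
    ai = AsyncMock()
    ai.generate_summary.return_value = {"summary": "greeting"}

    result = asyncio.run(SummaryPipeline(video, ai).run("abc123"))

    assert result == {"summary": "greeting"}
    video.get_transcript.assert_awaited_once_with("abc123")
    ai.generate_summary.assert_awaited_once_with("hello world", {})
```

No network access or API key is needed, so tests written this way stay fast and free to run.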
Frontend Testing
Test Structure:
frontend/src/
├── components/
│ ├── SummarizeForm.test.tsx # Form validation, submission
│ ├── SummaryDisplay.test.tsx # Summary rendering, export
│ └── ErrorBoundary.test.tsx # Error handling components
├── hooks/
│ ├── api/
│ │ └── useSummarization.test.ts # API hook testing
│ └── ui/
├── test/
│ ├── setup.ts # Global test configuration
│ ├── mocks/ # API and component mocks
│ └── utils.tsx # Test utilities and wrappers
└── api/
└── client.test.ts # API client testing
Testing Patterns:
- Component Testing: Render, interaction, and state testing
- Custom Hook Testing: Logic testing with renderHook
- API Client Testing: Mock HTTP responses, error handling
- Integration Testing: Full user flow testing
Test Configuration
pytest Configuration (backend/pytest.ini):
[pytest]
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*
addopts =
    --verbose
    --cov=.
    --cov-report=html
    --cov-report=term-missing
    --asyncio-mode=auto
Vitest Configuration (frontend/vitest.config.ts):
import { defineConfig } from 'vitest/config';
import react from '@vitejs/plugin-react';

export default defineConfig({
  plugins: [react()],
  test: {
    environment: 'jsdom',
    setupFiles: ['./src/test/setup.ts'],
    globals: true,
    css: true,
    coverage: {
      reporter: ['text', 'html', 'json'],
      exclude: ['node_modules/', 'src/test/']
    }
  }
});
Deployment Architecture
Self-Hosted Docker Deployment
Docker Compose Configuration:
version: '3.8'

services:
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=sqlite:///./data/youtube_summarizer.db
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    volumes:
      - ./data:/app/data
      - ./logs:/app/logs
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    restart: unless-stopped

  frontend:
    build: ./frontend
    ports:
      - "3000:3000"
    environment:
      # Vite only exposes variables prefixed with VITE_ to client code by default
      - VITE_API_URL=http://localhost:8000
    depends_on:
      - backend
    restart: unless-stopped
Environment Configuration
Required Environment Variables:
# API Keys (at least one required)
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
DEEPSEEK_API_KEY=sk-your-deepseek-key
# Database
DATABASE_URL=sqlite:///./data/youtube_summarizer.db
# Security
SECRET_KEY=your-secret-key-here
CORS_ORIGINS=http://localhost:3000,http://localhost:5173
# Optional: YouTube API for metadata
YOUTUBE_API_KEY=your-youtube-api-key
# Application Settings
MAX_VIDEO_LENGTH_MINUTES=180
RATE_LIMIT_PER_MINUTE=30
CACHE_TTL_HOURS=24
# Frontend Environment Variables (Vite exposes only VITE_-prefixed variables by default)
VITE_API_URL=http://localhost:8000
VITE_ENVIRONMENT=development
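A backend settings loader for these variables could be sketched as follows. The `Settings` dataclass and `load_settings` function are hypothetical (the real `core/config.py` is not shown in this document), and the defaults simply mirror the values listed above.

```python
import os
from dataclasses import dataclass
from typing import Mapping

_PROVIDER_KEY_VARS = {
    "openai": "OPENAI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "deepseek": "DEEPSEEK_API_KEY",
}


@dataclass(frozen=True)
class Settings:
    api_keys: dict[str, str]
    database_url: str
    cors_origins: list[str]
    max_video_length_minutes: int
    rate_limit_per_minute: int
    cache_ttl_hours: int


def load_settings(env: Mapping[str, str] = os.environ) -> Settings:
    """Read settings from the environment, enforcing 'at least one API key'."""
    api_keys = {name: env[var] for name, var in _PROVIDER_KEY_VARS.items() if env.get(var)}
    if not api_keys:
        raise RuntimeError("Set at least one of: " + ", ".join(_PROVIDER_KEY_VARS.values()))
    # CORS_ORIGINS is a comma-separated list; tolerate stray whitespace.
    origins = [o.strip() for o in env.get("CORS_ORIGINS", "").split(",") if o.strip()]
    return Settings(
        api_keys=api_keys,
        database_url=env.get("DATABASE_URL", "sqlite:///./data/youtube_summarizer.db"),
        cors_origins=origins,
        max_video_length_minutes=int(env.get("MAX_VIDEO_LENGTH_MINUTES", "180")),
        rate_limit_per_minute=int(env.get("RATE_LIMIT_PER_MINUTE", "30")),
        cache_ttl_hours=int(env.get("CACHE_TTL_HOURS", "24")),
    )
```

Accepting the environment as a parameter makes the loader easy to test with a plain dict instead of patching `os.environ`.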
Security Considerations
Input Validation
- URL Validation: Comprehensive YouTube URL format checking
- Input Sanitization: HTML escaping and XSS prevention
- Request Size Limits: Prevent oversized requests
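The URL validation item above can be sketched with the standard library alone. The accepted hosts and path prefixes below are assumptions rather than a complete list of YouTube URL forms, the function requires an explicit scheme, and it assumes the 11-character video ID format currently in use.

```python
import re
from urllib.parse import parse_qs, urlparse

_VIDEO_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")
_YOUTUBE_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com", "music.youtube.com")
_PATH_PREFIXES = ("/embed/", "/shorts/", "/live/", "/v/")


def extract_video_id(url: str) -> str:
    """Return the video ID from a YouTube URL, or raise ValueError."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    candidate = ""

    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in _YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        else:
            for prefix in _PATH_PREFIXES:
                if parsed.path.startswith(prefix):
                    candidate = parsed.path[len(prefix):].split("/")[0]
                    break

    if not _VIDEO_ID_RE.match(candidate):
        raise ValueError(f"INVALID_URL: cannot extract a video ID from {url!r}")
    return candidate
```

Checking the host against an allow-list before reading the `v` parameter keeps look-alike domains from passing validation.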
Rate Limiting
class RateLimiter:
    def __init__(self, max_requests: int = 30, window_seconds: int = 60):
        self.max_requests = max_requests
        self.window_seconds = window_seconds

    def is_allowed(self, client_ip: str) -> bool:
        """Check if request is allowed for this IP"""
API Key Management
- Environment variable storage (never commit to repository)
- Rotation capability for production deployments
- Separate keys for different environments
CORS Configuration
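from fastapi.middleware.cors import CORSMiddleware

# app is the FastAPI instance created in main.py
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://localhost:3000", "http://localhost:5173"],
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "DELETE"],
    allow_headers=["*"],
)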
Performance Optimization
Backend Optimization
- Async Everything: All I/O operations use async/await
- Background Processing: Long-running tasks don't block requests
- Intelligent Caching: Memory and database caching layers
- Connection Pooling: Database connection reuse
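The background-processing point can be illustrated with plain `asyncio`: the request handler registers a job and returns immediately while the pipeline runs as a task. This is a simplified sketch, not the project's implementation. `Job`, `run_pipeline`, and `submit` are hypothetical names, the progress percentages are arbitrary, and `asyncio.sleep(0)` stands in for the real I/O-bound work.

```python
import asyncio
from dataclasses import dataclass, field
from uuid import uuid4


@dataclass
class Job:
    video_id: str
    id: str = field(default_factory=lambda: str(uuid4()))
    status: str = "pending"
    progress_percentage: int = 0
    current_step: str = ""


async def run_pipeline(job: Job) -> None:
    """Walk through the processing steps, updating progress as each finishes."""
    steps = [("validating", 10), ("extracting", 40), ("summarizing", 100)]
    job.status = "processing"
    for step, progress in steps:
        job.current_step = step
        await asyncio.sleep(0)  # Placeholder for transcript fetch, AI call, etc.
        job.progress_percentage = progress
    job.status = "completed"


async def submit(video_id: str, jobs: dict[str, Job]) -> tuple[Job, asyncio.Task]:
    """Register a job and start it in the background without awaiting it."""
    job = Job(video_id=video_id)
    jobs[job.id] = job
    task = asyncio.create_task(run_pipeline(job))
    return job, task
```

The caller should keep a reference to the returned task, since the event loop holds only a weak reference to it. A status endpoint can then read the job from the `jobs` registry to report progress.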
Frontend Optimization
- Virtual Scrolling: Handle large summary lists efficiently
- Debounced Search: Reduce API calls during user input
- Code Splitting: Load components only when needed
- React Query Caching: Automatic request deduplication and caching
Caching Strategy
import hashlib
import json

# Multi-layer caching approach
# 1. Memory cache for hot data (current session)
# 2. Database cache for persistence (24-hour TTL)
# 3. Smart cache keys: hash(video_id + model + options)
def get_cache_key(video_id: str, model: str, options: dict) -> str:
    # sort_keys makes the key independent of option ordering
    key_data = f"{video_id}:{model}:{json.dumps(options, sort_keys=True)}"
    return hashlib.sha256(key_data.encode()).hexdigest()
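Layer 1 of this approach, the in-memory cache for hot data, could be as small as the sketch below. `InMemoryTTLCache` is a hypothetical class, not the project's `CacheService`. The injectable clock is an assumption made so that expiry can be tested without waiting.

```python
import time
from typing import Any, Callable, Optional


class InMemoryTTLCache:
    """Dictionary-backed cache whose entries expire after a per-entry TTL."""

    def __init__(self, clock: Callable[[], float] = time.monotonic):
        self._clock = clock
        self._entries: dict[str, tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl: float = 86400) -> None:
        # Store the absolute expiry time next to the value.
        self._entries[key] = (self._clock() + ttl, value)

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # Evict lazily on read
            return None
        return value
```

A lookup would check this cache first and fall through to the database layer only on a miss.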
Cost Optimization
AI API Cost Management
- Model Selection: Default to GPT-4o-mini (launch pricing was about $0.15 per 1M input tokens and $0.60 per 1M output tokens; check current rates)
- Token Optimization: Efficient prompts and transcript chunking
- Caching Strategy: 24-hour cache reduces repeat API calls
- Usage Tracking: Monitor and alert on cost thresholds
Target Cost Structure (Hobby Scale)
- Base Cost: ~$0.10/month for typical usage
- Video Processing: ~$0.001-0.005 per 30-minute video
- Caching Benefit: ~80% reduction in repeat processing costs
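The per-video figure can be sanity-checked with a little arithmetic. Everything numeric in the sketch below is an assumption: a speaking rate of 150 words per minute, about 4/3 tokens per English word, and prices of $0.15 and $0.60 per million input and output tokens, which should be verified against the provider's current price list.

```python
def estimate_transcript_tokens(
    minutes: float,
    words_per_minute: float = 150.0,  # Assumed speaking rate
    tokens_per_word: float = 4 / 3,   # Assumed average for English text
) -> int:
    """Rough token count for a spoken transcript of the given length."""
    return round(minutes * words_per_minute * tokens_per_word)


def estimate_cost_usd(
    input_tokens: int,
    output_tokens: int,
    input_price_per_million: float = 0.15,   # Assumed USD rate
    output_price_per_million: float = 0.60,  # Assumed USD rate
) -> float:
    """Cost of one summarization call under the assumed per-token prices."""
    total = input_tokens * input_price_per_million + output_tokens * output_price_per_million
    return total / 1_000_000
```

Under these assumptions a 30-minute video is about 6,000 input tokens, and with a 500-token summary the call costs roughly $0.0012, which falls inside the per-video range quoted above.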
Development Workflow
Quick Start Commands
# Development setup
git clone <repository>
cd youtube-summarizer
cp .env.example .env
# Edit .env with your API keys
# Single command startup
docker-compose up
# Access points
# Frontend: http://localhost:3000
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs
Development Scripts
{
  "scripts": {
    "dev": "docker-compose up",
    "dev:backend": "cd backend && uvicorn main:app --reload",
    "dev:frontend": "cd frontend && npm run dev",
    "test": "npm run test:backend && npm run test:frontend",
    "test:backend": "cd backend && pytest",
    "test:frontend": "cd frontend && npm test",
    "build": "docker-compose build",
    "lint": "npm run lint:backend && npm run lint:frontend",
    "lint:backend": "cd backend && ruff check . && black --check . && mypy .",
    "lint:frontend": "cd frontend && eslint src && prettier --check src"
  }
}
Git Hooks
# .pre-commit-config.yaml
repos:
  - repo: https://github.com/psf/black
    rev: 23.3.0
    hooks:
      - id: black
        files: ^backend/
  - repo: https://github.com/charliermarsh/ruff-pre-commit
    rev: v0.0.270
    hooks:
      - id: ruff
        files: ^backend/
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.3.0
    hooks:
      - id: mypy
        files: ^backend/
        additional_dependencies: [types-all]
  - repo: https://github.com/pre-commit/mirrors-eslint
    rev: v8.42.0
    hooks:
      - id: eslint
        files: ^frontend/src/
        types: [file]
        types_or: [ts, tsx]
Architecture Decision Records
ADR-001: Self-Hosted Architecture Choice
Status: Accepted
Context: User explicitly requested self-hosting and a hobby-scale deployment
Decision: Docker Compose deployment with local database storage
Consequences: Simplified deployment, reduced costs, requires local resource management
ADR-002: AI Model Strategy
Status: Accepted
Context: Cost optimization for hobby use while maintaining quality
Decision: Primary OpenAI GPT-4o-mini, fallback to other models
Consequences: ~$0.10/month costs, good quality summaries, multiple provider support
ADR-003: Database Evolution Path
Status: Accepted
Context: Start simple but allow growth to production scale
Decision: SQLite for development/hobby, PostgreSQL migration path for production
Consequences: Zero-config development start, clear upgrade path when needed
This architecture document serves as the definitive technical guide for implementing the YouTube Summarizer application.