YouTube Summarizer - Technical Architecture

Architecture Overview

This document defines the comprehensive technical architecture for the YouTube Summarizer application, designed as a self-hosted, hobby-scale system with professional code quality.

Design Principles

  1. Self-Hosted Priority: All components run locally without external cloud dependencies (except AI API calls and YouTube data retrieval)
  2. Hobby Scale Optimization: Simple deployment with Docker Compose, cost-effective (~$0.10/month)
  3. Professional Code Quality: Modern technologies, type safety, comprehensive testing
  4. Background Processing: User-requested priority for reliable video processing
  5. Learning-Friendly: Technologies that provide quick feedback loops and satisfying development experience

Technology Stack

Backend Stack

| Component | Technology | Version | Purpose |
|---|---|---|---|
| Runtime | Python | 3.11+ | AI library compatibility |
| Framework | FastAPI | Latest | High-performance async API |
| Database | SQLite → PostgreSQL | Latest | Development → Production |
| ORM | SQLAlchemy | 2.0+ | Async database operations |
| Validation | Pydantic | V2 | Request/response validation |
| ASGI Server | Uvicorn | Latest | Production ASGI server |
| Testing | pytest | Latest | Unit and integration testing |

Frontend Stack

| Component | Technology | Version | Purpose |
|---|---|---|---|
| Framework | React | 18+ | Modern UI framework |
| Language | TypeScript | Latest | Type-safe development |
| Build Tool | Vite | Latest | Fast development and building |
| UI Library | shadcn/ui | Latest | Component design system |
| Styling | Tailwind CSS | Latest | Utility-first CSS |
| State Management | Zustand | Latest | Global state management |
| Server State | React Query | Latest | API calls and caching |
| Testing | Vitest + RTL | Latest | Component and unit testing |

AI & External Services

| Service | Provider | Model | Purpose |
|---|---|---|---|
| Primary AI | OpenAI | GPT-4o-mini | Cost-effective summarization |
| Fallback AI | Anthropic | Claude 3 Haiku | Backup model |
| Alternative | DeepSeek | DeepSeek Chat | Budget option |
| Video APIs | YouTube | youtube-transcript-api | Transcript extraction |
| Metadata | YouTube | yt-dlp | Video metadata |

Development & Deployment

| Component | Technology | Purpose |
|---|---|---|
| Containerization | Docker + Docker Compose | Self-hosted deployment |
| Code Quality | Black + Ruff + mypy | Python formatting and linting |
| Frontend Quality | ESLint + Prettier | TypeScript/React standards |
| Pre-commit | pre-commit hooks | Automated quality checks |
| Documentation | FastAPI Auto Docs | API documentation |

System Architecture

High-Level Architecture

┌─────────────────┐    ┌─────────────────┐    ┌─────────────────┐
│  React Frontend │    │ FastAPI Backend │    │   AI Services   │
│                 │    │                 │    │                 │
│  • shadcn/ui    │◄──►│  • REST API     │◄──►│  • OpenAI       │
│  • TypeScript   │    │  • Background   │    │  • Anthropic    │
│  • Zustand      │    │    Tasks        │    │  • DeepSeek     │
│  • React Query  │    │  • SQLAlchemy   │    │                 │
└─────────────────┘    └─────────────────┘    └─────────────────┘
        │                       │
        │                       ▼
        │               ┌─────────────────┐
        │               │   SQLite DB     │
        └──────────────►│                 │
                        │  • Summaries    │
                        │  • Jobs         │
                        │  • Cache        │
                        └─────────────────┘

Project Structure

youtube-summarizer/
├── frontend/                 # React TypeScript frontend
│   ├── src/
│   │   ├── components/      # UI components
│   │   │   ├── ui/         # shadcn/ui base components
│   │   │   ├── forms/      # Form components
│   │   │   ├── summary/    # Summary display components
│   │   │   ├── history/    # History management
│   │   │   ├── processing/ # Status and progress
│   │   │   ├── layout/     # Layout components
│   │   │   └── error/      # Error handling components
│   │   ├── hooks/          # Custom React hooks
│   │   │   ├── api/        # API-specific hooks
│   │   │   └── ui/         # UI utility hooks
│   │   ├── api/            # API client layer
│   │   ├── stores/         # Zustand stores
│   │   ├── types/          # TypeScript definitions
│   │   └── test/           # Test utilities
│   ├── public/             # Static assets
│   ├── package.json        # Dependencies and scripts
│   ├── vite.config.ts      # Build configuration
│   ├── vitest.config.ts    # Test configuration
│   └── tailwind.config.js  # Styling configuration
├── backend/                  # FastAPI Python backend
│   ├── api/                # API endpoints
│   │   ├── __init__.py
│   │   ├── summarize.py    # Main summarization endpoints
│   │   ├── summaries.py    # Summary retrieval endpoints
│   │   └── health.py       # Health check endpoints
│   ├── services/           # Business logic
│   │   ├── __init__.py
│   │   ├── video_service.py # YouTube integration
│   │   ├── ai_service.py   # AI model integration
│   │   └── cache_service.py # Caching logic
│   ├── models/             # Database models
│   │   ├── __init__.py
│   │   ├── summary.py      # Summary data model
│   │   └── job.py          # Processing job model
│   ├── repositories/       # Data access layer
│   │   ├── __init__.py
│   │   ├── summary_repository.py
│   │   └── job_repository.py
│   ├── core/               # Core utilities
│   │   ├── __init__.py
│   │   ├── config.py       # Configuration management
│   │   ├── database.py     # Database connection
│   │   ├── exceptions.py   # Custom exception classes
│   │   ├── security.py     # Rate limiting and validation
│   │   └── cache.py        # Caching implementation
│   ├── tests/              # Test suite
│   │   ├── unit/          # Unit tests
│   │   ├── integration/   # Integration tests
│   │   └── conftest.py    # Test configuration
│   ├── main.py             # FastAPI application entry
│   ├── requirements.txt    # Python dependencies
│   └── Dockerfile          # Container configuration
├── docker-compose.yml        # Self-hosted deployment
├── .env.example             # Environment template
├── .pre-commit-config.yaml  # Code quality hooks
├── .gitignore              # Git ignore patterns
└── README.md               # Setup and usage guide

Data Models

Summary Model

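import uuid
from datetime import datetime

from sqlalchemy import JSON, Column, DateTime, Float, Integer, String, Text, Uuid

from core.database import Base  # assumed location of the declarative base

class Summary(Base):
    __tablename__ = "summaries"
    
    # Primary key (Uuid is SQLAlchemy 2.0's backend-agnostic UUID type)
    id = Column(Uuid(as_uuid=True), primary_key=True, default=uuid.uuid4)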
    
    # Video information
    video_id = Column(String(20), nullable=False, index=True)
    video_title = Column(Text)
    video_url = Column(Text, nullable=False)
    video_duration = Column(Integer)  # Duration in seconds
    video_channel = Column(String(255))
    video_upload_date = Column(String(20))  # YYYY-MM-DD format
    video_thumbnail_url = Column(Text)
    video_view_count = Column(Integer)
    
    # Transcript data
    transcript_text = Column(Text)
    transcript_language = Column(String(10), default='en')
    transcript_type = Column(String(20))  # 'manual' or 'auto-generated'
    
    # Summary data
    summary_text = Column(Text)
    key_points = Column(JSON)  # Array of strings
    chapters = Column(JSON)    # Array of chapter objects
    
    # Processing metadata
    model_used = Column(String(50), nullable=False)
    processing_time = Column(Float)  # Processing time in seconds
    token_count = Column(Integer)    # Total tokens used
    cost_estimate = Column(Float)    # Estimated cost in USD
    
    # Timestamps
    created_at = Column(DateTime, default=datetime.utcnow, nullable=False)
    updated_at = Column(DateTime, default=datetime.utcnow, onupdate=datetime.utcnow)
    
    # Cache keys for invalidation
    cache_key = Column(String(255), index=True)  # Hash of video_id + model + options

Processing Job Model

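import uuid
from datetime import datetime

from sqlalchemy import JSON, Column, DateTime, Enum, ForeignKey, Integer, String, Text, Uuid

from core.database import Base  # assumed location of the declarative base
# JobStatus is an enum.Enum of job states (PENDING and so on), defined elsewhere

class ProcessingJob(Base):
    __tablename__ = "processing_jobs"
    
    id = Column(Uuid(as_uuid=True), primary_key=True, default=uuid.uuid4)
    video_url = Column(Text, nullable=False)
    video_id = Column(String(20), nullable=False)
    
    # Job configuration
    model_name = Column(String(50), nullable=False)
    options = Column(JSON)  # Summary options (length, focus, etc.)
    
    # Job status
    status = Column(Enum(JobStatus), default=JobStatus.PENDING, nullable=False)
    progress_percentage = Column(Integer, default=0)
    current_step = Column(String(50))  # "validating", "extracting", "summarizing"
    
    # Results
    summary_id = Column(Uuid(as_uuid=True), ForeignKey("summaries.id"))  # Set once the summary exists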
    error_message = Column(Text)
    error_code = Column(String(50))
    
    # Timing
    created_at = Column(DateTime, default=datetime.utcnow)
    started_at = Column(DateTime)
    completed_at = Column(DateTime)
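
The job model refers to a JobStatus enum without showing it. The following is a minimal sketch of what it could look like, not the project's actual definition. Only PENDING appears in the model, and "processing" and "completed" appear in the API response type; the FAILED state, the transition table, and the STEP_PROGRESS percentages are assumptions made for illustration.

```python
import enum


class JobStatus(str, enum.Enum):
    """Lifecycle states for a processing job (FAILED is an assumed name)."""

    PENDING = "pending"
    PROCESSING = "processing"
    COMPLETED = "completed"
    FAILED = "failed"


# Hypothetical progress checkpoints for the steps named in current_step.
STEP_PROGRESS = {"validating": 10, "extracting": 40, "summarizing": 80}

# Allowed transitions; terminal states have no successors.
TRANSITIONS = {
    JobStatus.PENDING: {JobStatus.PROCESSING, JobStatus.FAILED},
    JobStatus.PROCESSING: {JobStatus.COMPLETED, JobStatus.FAILED},
    JobStatus.COMPLETED: set(),
    JobStatus.FAILED: set(),
}


def can_transition(current: JobStatus, target: JobStatus) -> bool:
    """Return True if a job may move from `current` to `target`."""
    return target in TRANSITIONS[current]
```

Subclassing str keeps the values JSON-serializable, so the same strings can be stored in the database and returned by the API.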

API Specification

Core Endpoints

POST /api/summarize

Purpose: Submit a YouTube URL for summarization

Request:

interface SummarizeRequest {
  url: string;           // YouTube URL
  model?: string;        // AI model selection (default: "openai")
  options?: {
    length?: "brief" | "standard" | "detailed";
    focus?: string;
  };
}

Response:

interface SummarizeResponse {
  id: string;            // Summary ID
  video: VideoMetadata;  // Video information
  summary: SummaryData;  // Generated summary
  status: "completed" | "processing";
  processing_time: number;
}

GET /api/summary/{id}

Purpose: Retrieve a specific summary

Response:

interface SummaryResponse {
  id: string;
  video: VideoMetadata;
  summary: SummaryData;
  created_at: string;
  metadata: ProcessingMetadata;
}

GET /api/summaries

Purpose: List recent summaries with optional filtering

Query Parameters:

  • limit: Number of results (default: 20)
  • search: Search term for title/content
  • model: Filter by AI model used

Error Handling

Error Response Format

interface APIErrorResponse {
  error: {
    code: string;        // Error code (e.g., "INVALID_URL")
    message: string;     // Human-readable message
    details: object;     // Additional error context
    recoverable: boolean; // Whether retry might succeed
    timestamp: string;   // ISO timestamp
    path: string;        // Request path
  }
}

Error Codes

  • INVALID_URL: Invalid YouTube URL format
  • VIDEO_NOT_FOUND: Video is unavailable or private
  • TRANSCRIPT_UNAVAILABLE: No transcript available for video
  • AI_SERVICE_ERROR: AI service temporarily unavailable
  • RATE_LIMITED: Too many requests from this IP
  • TOKEN_LIMIT_EXCEEDED: Video transcript too long for model
  • UNKNOWN_ERROR: Unexpected server error
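
On the backend, a small helper can produce payloads in the error format above. This is a sketch under assumptions: the function name build_error_response is hypothetical, and the choice of which codes count as recoverable is a guess, since the document does not classify them.

```python
from datetime import datetime, timezone
from typing import Any, Dict, Optional

# Assumed classification: error codes where a retry might succeed.
RECOVERABLE_CODES = {"AI_SERVICE_ERROR", "RATE_LIMITED", "UNKNOWN_ERROR"}


def build_error_response(
    code: str,
    message: str,
    path: str,
    details: Optional[Dict[str, Any]] = None,
) -> Dict[str, Any]:
    """Build a payload matching the APIErrorResponse shape."""
    return {
        "error": {
            "code": code,
            "message": message,
            "details": details or {},
            "recoverable": code in RECOVERABLE_CODES,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "path": path,
        }
    }
```

For example, `build_error_response("INVALID_URL", "Invalid YouTube URL format", "/api/summarize")` yields a payload with `recoverable` set to false, which tells the frontend not to offer a retry.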

Frontend Architecture

Component Architecture

Core Components

  • SummarizeForm: Main URL input form with validation
  • SummaryDisplay: Comprehensive summary viewer with export options
  • ProcessingStatus: Real-time progress updates
  • SummaryHistory: Searchable list of previous summaries
  • ErrorBoundary: React error boundaries with recovery options

State Management

Zustand Stores:

interface AppStore {
  // UI state
  theme: 'light' | 'dark';
  sidebarOpen: boolean;
  
  // Processing state
  currentJob: ProcessingJob | null;
  processingHistory: ProcessingJob[];
  
  // Settings
  defaultModel: string;
  summaryLength: string;
}

interface SummaryStore {
  summaries: Summary[];
  currentSummary: Summary | null;
  searchResults: Summary[];
  
  // Actions
  addSummary: (summary: Summary) => void;
  updateSummary: (id: string, updates: Partial<Summary>) => void;
  searchSummaries: (query: string) => void;
}

API Client Architecture

TypeScript API Client:

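import axios, { AxiosInstance } from "axios";

class APIClient {
  private baseURL: string;
  private httpClient: AxiosInstance;

  constructor(baseURL: string) {
    this.baseURL = baseURL;
    this.httpClient = axios.create({ baseURL, timeout: 30000 });
    this.setupInterceptors();
  }

  // Pass-through interceptors; retry and error mapping would be added here
  private setupInterceptors(): void {
    this.httpClient.interceptors.response.use(
      (response) => response,
      (error) => Promise.reject(error),
    );
  }

  // Type-safe API methods
  async summarizeVideo(request: SummarizeRequest): Promise<SummarizeResponse> {
    return (await this.httpClient.post<SummarizeResponse>("/api/summarize", request)).data;
  }
  async getSummary(id: string): Promise<SummaryResponse> {
    return (await this.httpClient.get<SummaryResponse>(`/api/summary/${id}`)).data;
  }
  async getSummaries(params?: SummaryListParams): Promise<SummaryListResponse> {
    return (await this.httpClient.get<SummaryListResponse>("/api/summaries", { params })).data;
  }
  // The export route is not specified in this document; the path below is an assumption
  async exportSummary(id: string, format: ExportFormat): Promise<Blob> {
    const config = { params: { format }, responseType: "blob" as const };
    return (await this.httpClient.get<Blob>(`/api/summary/${id}/export`, config)).data;
  }
}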

Backend Services

Video Service

Purpose: Handle YouTube URL processing and transcript extraction

Key Methods:

from typing import Any, Dict

class VideoService:
    async def extract_video_id(self, url: str) -> str:
        """Extract video ID with comprehensive URL format support"""
        
    async def get_transcript(self, video_id: str) -> Dict[str, Any]:
        """Get transcript with fallback chain:
        1. Manual captions (preferred)
        2. Auto-generated captions
        3. Error with helpful message
        """
        
    async def get_video_metadata(self, video_id: str) -> Dict[str, Any]:
        """Extract metadata using yt-dlp for rich video information"""
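
As an illustration of the URL handling in extract_video_id, the sketch below parses the common YouTube URL shapes using only the standard library. It is written as a plain function because no I/O is involved, and it assumes the 11-character ID format in use today; the list of accepted hosts and path prefixes is an assumption and may not cover every format the real service supports.

```python
import re
from urllib.parse import parse_qs, urlparse

# Assumes the 11-character video ID format currently used by YouTube.
_VIDEO_ID_RE = re.compile(r"^[A-Za-z0-9_-]{11}$")
_PATH_PREFIXES = ("/embed/", "/shorts/", "/live/", "/v/")
_YOUTUBE_HOSTS = {"youtube.com", "m.youtube.com", "music.youtube.com"}


def extract_video_id(url: str) -> str:
    """Return the video ID from a YouTube URL, or raise ValueError."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower().removeprefix("www.")

    candidate = ""
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in _YOUTUBE_HOSTS:
        if parsed.path == "/watch":
            candidate = parse_qs(parsed.query).get("v", [""])[0]
        else:
            for prefix in _PATH_PREFIXES:
                if parsed.path.startswith(prefix):
                    candidate = parsed.path[len(prefix):].split("/")[0]
                    break

    if not _VIDEO_ID_RE.match(candidate):
        raise ValueError(f"Invalid YouTube URL: {url!r}")
    return candidate
```

A ValueError raised here would map naturally onto the INVALID_URL error code.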

AI Service

Purpose: Manage AI model integration with provider abstraction

Key Methods:

from typing import Any, Dict, Optional

class AIService:
    def __init__(self, provider: str, api_key: str):
        self.provider = provider
        self.client = self._get_client(provider, api_key)
    
    async def generate_summary(
        self, 
        transcript: str, 
        video_metadata: Dict[str, Any],
        options: Optional[Dict[str, Any]] = None
    ) -> Dict[str, Any]:
        """Generate structured summary with:
        - Overview paragraph
        - Key points list
        - Chapter breakdown (if applicable)
        - Cost tracking
        """

Cache Service

Purpose: Intelligent caching to minimize API costs

Caching Strategy:

from typing import Dict, Optional

class CacheService:
    def get_cache_key(self, video_id: str, model: str, options: Dict) -> str:
        """Generate cache key from video_id + model + options hash"""
        
    async def get_cached_summary(self, cache_key: str) -> Optional[Summary]:
        """Retrieve cached summary if within TTL"""
        
    async def cache_summary(self, cache_key: str, summary: Summary, ttl: int = 86400):
        """Store summary with 24-hour default TTL"""
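
The TTL behaviour can be sketched with an in-memory store. This is an illustration only: the class name InMemoryTTLCache and the injectable clock are assumptions, and the real service would presumably persist entries in the database.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class InMemoryTTLCache:
    """Dict-backed cache whose entries expire after `ttl` seconds."""

    def __init__(self, clock: Callable[[], float] = time.monotonic) -> None:
        self._clock = clock  # injectable so tests can control time
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def set(self, key: str, value: Any, ttl: int = 86400) -> None:
        """Store `value` under `key` until `ttl` seconds from now."""
        self._entries[key] = (self._clock() + ttl, value)

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None if it is missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop expired entries lazily on read
            return None
        return value
```

Expiring entries lazily on read avoids a background sweeper, which suits a hobby-scale deployment.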

Testing Strategy

Backend Testing

Test Structure:

backend/tests/
├── unit/
│   ├── test_video_service.py      # URL parsing, transcript extraction
│   ├── test_ai_service.py         # AI integration, prompt engineering
│   ├── test_cache_service.py      # Cache logic, key generation
│   └── test_repositories.py      # Database operations
├── integration/
│   ├── test_api.py               # End-to-end API testing
│   ├── test_background_jobs.py   # Background processing
│   └── test_error_handling.py    # Error scenarios
└── conftest.py                   # Test configuration and fixtures

Testing Patterns:

  • Repository Pattern Testing: Mock database, test data operations
  • Service Layer Testing: Mock external APIs, test business logic
  • API Endpoint Testing: FastAPI TestClient for request/response testing
  • Error Scenario Testing: Comprehensive error condition coverage
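
The service-layer pattern (mock external APIs, test business logic) might look like the sketch below. The summarize_video function is a hypothetical orchestrator written only for this example, and the "text" key in the transcript dictionary is an assumption. It uses the standard library's AsyncMock and drives the coroutine with asyncio.run so that it does not depend on a pytest plugin.

```python
import asyncio
from unittest.mock import AsyncMock


async def summarize_video(video_id: str, video_service, ai_service) -> dict:
    """Hypothetical orchestration function used only to illustrate the pattern."""
    transcript = await video_service.get_transcript(video_id)
    return await ai_service.generate_summary(transcript["text"], {"video_id": video_id})


def test_summarize_video_passes_transcript_to_ai() -> None:
    # Both external dependencies are replaced with mocks: no network calls.
    video_service = AsyncMock()
    video_service.get_transcript.return_value = {"text": "hello world"}
    ai_service = AsyncMock()
    ai_service.generate_summary.return_value = {"summary_text": "greeting"}

    result = asyncio.run(summarize_video("abc123", video_service, ai_service))

    assert result == {"summary_text": "greeting"}
    video_service.get_transcript.assert_awaited_once_with("abc123")
    ai_service.generate_summary.assert_awaited_once_with("hello world", {"video_id": "abc123"})
```

With `--asyncio-mode=auto` and pytest-asyncio installed, the test could instead be declared `async def` and await the function directly.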

Frontend Testing

Test Structure:

frontend/src/
├── components/
│   ├── SummarizeForm.test.tsx    # Form validation, submission
│   ├── SummaryDisplay.test.tsx   # Summary rendering, export
│   └── ErrorBoundary.test.tsx    # Error handling components
├── hooks/
│   ├── api/
│   │   └── useSummarization.test.ts # API hook testing
│   └── ui/
├── test/
│   ├── setup.ts                  # Global test configuration
│   ├── mocks/                    # API and component mocks
│   └── utils.tsx                 # Test utilities and wrappers
└── api/
    └── client.test.ts            # API client testing

Testing Patterns:

  • Component Testing: Render, interaction, and state testing
  • Custom Hook Testing: Logic testing with renderHook
  • API Client Testing: Mock HTTP responses, error handling
  • Integration Testing: Full user flow testing

Test Configuration

pytest Configuration (backend/pytest.ini):

[pytest]
testpaths = tests
python_files = test_*.py
python_classes = Test*
python_functions = test_*
addopts = 
    --verbose
    --cov=.
    --cov-report=html
    --cov-report=term-missing
    --asyncio-mode=auto

Vitest Configuration (frontend/vitest.config.ts):

import { defineConfig } from 'vitest/config';
import react from '@vitejs/plugin-react';

export default defineConfig({
  plugins: [react()],
  test: {
    environment: 'jsdom',
    setupFiles: ['./src/test/setup.ts'],
    globals: true,
    css: true,
    coverage: {
      reporter: ['text', 'html', 'json'],
      exclude: ['node_modules/', 'src/test/']
    }
  }
});

Deployment Architecture

Self-Hosted Docker Deployment

Docker Compose Configuration:

version: '3.8'

services:
  backend:
    build: ./backend
    ports:
      - "8000:8000"
    environment:
      - DATABASE_URL=sqlite:///./data/youtube_summarizer.db
      - OPENAI_API_KEY=${OPENAI_API_KEY}
    volumes:
      - ./data:/app/data
      - ./logs:/app/logs
    healthcheck:
      test: ["CMD", "curl", "-f", "http://localhost:8000/health"]
      interval: 30s
      timeout: 10s
      retries: 3
    restart: unless-stopped

  frontend:
    build: ./frontend
    ports:
      - "3000:3000"
    environment:
      - VITE_API_URL=http://localhost:8000
    depends_on:
      - backend
    restart: unless-stopped

Environment Configuration

Required Environment Variables:

# API Keys (at least one required)
OPENAI_API_KEY=sk-your-openai-key
ANTHROPIC_API_KEY=sk-ant-your-anthropic-key
DEEPSEEK_API_KEY=sk-your-deepseek-key

# Database
DATABASE_URL=sqlite:///./data/youtube_summarizer.db

# Security
SECRET_KEY=your-secret-key-here
CORS_ORIGINS=http://localhost:3000,http://localhost:5173

# Optional: YouTube API for metadata
YOUTUBE_API_KEY=your-youtube-api-key

# Application Settings
MAX_VIDEO_LENGTH_MINUTES=180
RATE_LIMIT_PER_MINUTE=30
CACHE_TTL_HOURS=24

# Frontend Environment Variables (by default Vite only exposes variables prefixed with VITE_)
VITE_API_URL=http://localhost:8000
VITE_ENVIRONMENT=development

Security Considerations

Input Validation

  • URL Validation: Comprehensive YouTube URL format checking
  • Input Sanitization: HTML escaping and XSS prevention
  • Request Size Limits: Prevent oversized requests

Rate Limiting

class RateLimiter:
    def __init__(self, max_requests: int = 30, window_seconds: int = 60):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
    
    def is_allowed(self, client_ip: str) -> bool:
        """Check if request is allowed for this IP"""
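
One possible body for is_allowed is a sliding-window counter, sketched below. The class name, the per-IP deque of timestamps, and the injectable clock are choices made for this example and are not taken from the project's code. State lives in process memory, so it resets on restart and is not shared across workers.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowRateLimiter:
    """Allow at most `max_requests` per client within `window_seconds`."""

    def __init__(
        self,
        max_requests: int = 30,
        window_seconds: int = 60,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._clock = clock  # injectable so tests can control time
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def is_allowed(self, client_ip: str) -> bool:
        """Record the request and return True if it is within the limit."""
        now = self._clock()
        hits = self._hits[client_ip]
        # Discard timestamps that have left the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True
```

A rejected request would be answered with the RATE_LIMITED error code.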

API Key Management

  • Environment variable storage (never commit to repository)
  • Rotation capability for production deployments
  • Separate keys for different environments

CORS Configuration

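from fastapi.middleware.cors import CORSMiddleware

# `app` is the FastAPI instance created in main.py
app.add_middleware(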
    CORSMiddleware,
    allow_origins=["http://localhost:3000", "http://localhost:5173"],
    allow_credentials=True,
    allow_methods=["GET", "POST", "PUT", "DELETE"],
    allow_headers=["*"],
)

Performance Optimization

Backend Optimization

  • Async Everything: All I/O operations use async/await
  • Background Processing: Long-running tasks don't block requests
  • Intelligent Caching: Memory and database caching layers
  • Connection Pooling: Database connection reuse
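
To show what non-blocking background processing means in practice, here is a minimal sketch using plain asyncio rather than FastAPI's BackgroundTasks or a task queue. The in-memory jobs dictionary and the function names are assumptions; the real system would persist ProcessingJob rows and perform actual work at each step.

```python
import asyncio
import uuid
from typing import Dict

# In-memory job table; a stand-in for persisted ProcessingJob rows.
jobs: Dict[str, Dict[str, object]] = {}


async def _run_job(job_id: str) -> None:
    """Walk through the processing steps, updating status as it goes."""
    job = jobs[job_id]
    job["status"] = "processing"
    for step in ("validating", "extracting", "summarizing"):
        job["current_step"] = step
        await asyncio.sleep(0)  # placeholder for the real awaitable work
    job["status"] = "completed"


async def submit_job(video_url: str) -> str:
    """Register a job, start it in the background, and return its ID at once."""
    job_id = str(uuid.uuid4())
    jobs[job_id] = {"video_url": video_url, "status": "pending", "current_step": None}
    # Keep a reference so the task is not garbage-collected while running.
    jobs[job_id]["task"] = asyncio.create_task(_run_job(job_id))
    return job_id


async def main() -> str:
    job_id = await submit_job("https://youtu.be/dQw4w9WgXcQ")
    status_at_submit = jobs[job_id]["status"]  # the worker has not run yet
    await jobs[job_id]["task"]
    return f"{status_at_submit} -> {jobs[job_id]['status']}"


outcome = asyncio.run(main())
```

The caller gets a job ID while the status is still "pending", and can poll for progress until the job reaches "completed".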

Frontend Optimization

  • Virtual Scrolling: Handle large summary lists efficiently
  • Debounced Search: Reduce API calls during user input
  • Code Splitting: Load components only when needed
  • React Query Caching: Automatic request deduplication and caching

Caching Strategy

import hashlib
import json

# Multi-layer caching approach
# 1. Memory cache for hot data (current session)
# 2. Database cache for persistence (24-hour TTL)
# 3. Smart cache keys: hash(video_id + model + options)

def get_cache_key(video_id: str, model: str, options: dict) -> str:
    # sort_keys makes the key independent of option ordering
    key_data = f"{video_id}:{model}:{json.dumps(options, sort_keys=True)}"
    return hashlib.sha256(key_data.encode()).hexdigest()
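
The layered lookup can be sketched as follows. The two dictionaries stand in for the memory cache and the database cache, and the lookup function is hypothetical; get_cache_key is repeated so the example runs on its own.

```python
import hashlib
import json
from typing import Dict, Optional


def get_cache_key(video_id: str, model: str, options: dict) -> str:
    key_data = f"{video_id}:{model}:{json.dumps(options, sort_keys=True)}"
    return hashlib.sha256(key_data.encode()).hexdigest()


memory_cache: Dict[str, str] = {}    # layer 1: hot data for the current process
database_cache: Dict[str, str] = {}  # layer 2: stand-in for the persistent cache


def lookup(key: str) -> Optional[str]:
    """Check memory first, then the persistent layer, promoting hits to memory."""
    if key in memory_cache:
        return memory_cache[key]
    value = database_cache.get(key)
    if value is not None:
        memory_cache[key] = value
    return value


# The same options in a different order produce the same key.
key_a = get_cache_key("dQw4w9WgXcQ", "openai", {"length": "brief", "focus": "tech"})
key_b = get_cache_key("dQw4w9WgXcQ", "openai", {"focus": "tech", "length": "brief"})
database_cache[key_a] = "cached summary"
```

Because the keys match, `lookup(key_b)` finds the entry stored under `key_a` and copies it into the memory layer for subsequent reads.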

Cost Optimization

AI API Cost Management

  • Model Selection: Default to GPT-4o-mini (roughly $0.15 per 1M input tokens and $0.60 per 1M output tokens at list price)
  • Token Optimization: Efficient prompts and transcript chunking
  • Caching Strategy: 24-hour cache reduces repeat API calls
  • Usage Tracking: Monitor and alert on cost thresholds

Target Cost Structure (Hobby Scale)

  • Base Cost: ~$0.10/month for typical usage
  • Video Processing: ~$0.001-0.005 per 30-minute video
  • Caching Benefit: ~80% reduction in repeat processing costs
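
The per-video figure can be checked with a back-of-the-envelope calculation. Every constant below is an assumption: the prices are GPT-4o-mini list prices and should be verified against current pricing, and the speaking rate, tokens-per-word ratio, output length, and monthly volume are rough guesses.

```python
# Assumed list prices for GPT-4o-mini in USD per million tokens; verify before use.
INPUT_PRICE_PER_M = 0.15
OUTPUT_PRICE_PER_M = 0.60


def estimate_cost(input_tokens: int, output_tokens: int) -> float:
    """Return the estimated USD cost of one summarization call."""
    return (input_tokens * INPUT_PRICE_PER_M + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# Rough transcript size for a 30-minute video (both rates are assumptions).
WORDS_PER_MINUTE = 150
TOKENS_PER_WORD = 1.3

input_tokens = round(30 * WORDS_PER_MINUTE * TOKENS_PER_WORD)  # 5850 tokens
cost_per_video = estimate_cost(input_tokens, output_tokens=500)

videos_per_month = 50  # hypothetical usage level
monthly_cost = cost_per_video * videos_per_month
```

Under these assumptions one video costs about $0.0012, inside the quoted $0.001-0.005 range, and 50 uncached videos a month come to about $0.06, in line with the ~$0.10/month target.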

Development Workflow

Quick Start Commands

# Development setup
git clone <repository>
cd youtube-summarizer
cp .env.example .env
# Edit .env with your API keys

# Single command startup
docker-compose up

# Access points
# Frontend: http://localhost:3000
# Backend API: http://localhost:8000
# API Docs: http://localhost:8000/docs

Development Scripts

{
  "scripts": {
    "dev": "docker-compose up",
    "dev:backend": "cd backend && uvicorn main:app --reload",
    "dev:frontend": "cd frontend && npm run dev",
    "test": "npm run test:backend && npm run test:frontend",
    "test:backend": "cd backend && pytest",
    "test:frontend": "cd frontend && npm test",
    "build": "docker-compose build",
    "lint": "npm run lint:backend && npm run lint:frontend",
    "lint:backend": "cd backend && ruff check . && black --check . && mypy .",
    "lint:frontend": "cd frontend && eslint src && prettier --check src"
  }
}

Git Hooks

# .pre-commit-config.yaml
repos:
  - repo: https://github.com/psf/black
    rev: 23.3.0
    hooks:
      - id: black
        files: ^backend/
  
  - repo: https://github.com/astral-sh/ruff-pre-commit
    rev: v0.0.270
    hooks:
      - id: ruff
        files: ^backend/
  
  - repo: https://github.com/pre-commit/mirrors-mypy
    rev: v1.3.0
    hooks:
      - id: mypy
        files: ^backend/
        additional_dependencies: [types-all]
  
  - repo: https://github.com/pre-commit/mirrors-eslint
    rev: v8.42.0
    hooks:
      - id: eslint
        files: ^frontend/src/
        types: [file]
        types_or: [typescript, tsx]

Architecture Decision Records

ADR-001: Self-Hosted Architecture Choice

Status: Accepted
Context: User explicitly requested self-hosting and hobby-scale deployment
Decision: Docker Compose deployment with local database storage
Consequences: Simplified deployment, reduced costs, requires local resource management

ADR-002: AI Model Strategy

Status: Accepted
Context: Cost optimization for hobby use while maintaining quality
Decision: Primary OpenAI GPT-4o-mini, fallback to other models
Consequences: ~$0.10/month costs, good quality summaries, multiple provider support

ADR-003: Database Evolution Path

Status: Accepted
Context: Start simple but allow growth to production scale
Decision: SQLite for development/hobby, PostgreSQL migration path for production
Consequences: Zero-config development start, clear upgrade path when needed


This architecture document serves as the definitive technical guide for implementing the YouTube Summarizer application.