# YouTube Summarizer - File Structure
## Project Overview
The YouTube Summarizer is a web application that extracts, transcribes, and summarizes YouTube videos with AI. It uses a 9-tier fallback chain for reliable transcript extraction and retains downloaded audio so that videos can be re-transcribed later.
## Directory Structure
```
youtube-summarizer/
├── scripts/ # Development and deployment tools ✅ NEW
│ ├── restart-backend.sh # Backend server restart script
│ ├── restart-frontend.sh # Frontend server restart script
│ └── restart-both.sh # Full stack restart script
├── logs/ # Server logs (auto-created by scripts)
├── backend/ # FastAPI backend application
│ ├── api/ # API endpoints and routers
│ │ ├── auth.py # Authentication endpoints (register, login, logout)
│ │ ├── batch.py # Batch processing endpoints
│ │ ├── enhanced_export.py # Enhanced export with AI intelligence ✅ Story 4.4
│ │ ├── export.py # Export functionality endpoints
│ │ ├── history.py # Job history API endpoints ✅ NEW
│ │ ├── pipeline.py # Main summarization pipeline
│ │ ├── summarization.py # AI summarization endpoints
│ │ ├── templates.py # Template management
│ │ └── transcripts.py # Dual transcript extraction (YouTube/Whisper)
│ ├── config/ # Configuration modules
│ │ ├── settings.py # Application settings
│ │ └── video_download_config.py # Video download & storage config
│ ├── core/ # Core utilities and foundations
│ │ ├── database_registry.py # SQLAlchemy singleton registry pattern
│ │ ├── exceptions.py # Custom exception classes
│ │ └── websocket_manager.py # WebSocket connection management
│ ├── models/ # Database models
│ │ ├── base.py # Base model with registry integration
│ │ ├── batch.py # Batch processing models
│ │ ├── enhanced_export.py # Enhanced export database models ✅ Story 4.4
│ │ ├── job_history.py # Job history models and schemas ✅ NEW
│ │ ├── summary.py # Summary and transcript models
│ │ ├── user.py # User authentication models
│ │ └── video_download.py # Video download enums and configs
│ ├── services/ # Business logic services
│ │ ├── anthropic_summarizer.py # Claude AI integration
│ │ ├── auth_service.py # Authentication service
│ │ ├── batch_processing_service.py # Batch job management
│ │ ├── cache_manager.py # Multi-level caching
│ │ ├── dual_transcript_service.py # Orchestrates YouTube/Whisper
│ │ ├── enhanced_markdown_formatter.py # Professional document templates ✅ Story 4.4
│ │ ├── enhanced_template_manager.py # Domain-specific AI templates ✅ Story 4.4
│ │ ├── executive_summary_generator.py # Business-focused AI summaries ✅ Story 4.4
│ │ ├── export_service.py # Multi-format export
│ │ ├── intelligent_video_downloader.py # 9-tier fallback chain
│ │ ├── job_history_service.py # Job history management ✅ NEW
│ │ ├── notification_service.py # Real-time notifications
│ │ ├── summary_pipeline.py # Main processing pipeline
│ │ ├── timestamp_processor.py # Semantic section detection ✅ Story 4.4
│ │ ├── transcript_service.py # Core transcript extraction
│ │ ├── video_service.py # YouTube metadata extraction
│ │ ├── whisper_transcript_service.py # Legacy OpenAI Whisper (deprecated)
│ │ └── faster_whisper_transcript_service.py # ⚡ Faster-Whisper (20-32x speed) ✅ NEW
│ ├── tests/ # Test suites
│ │ ├── unit/ # Unit tests (229+ tests)
│ │ └── integration/ # Integration tests
│ ├── .env # Environment configuration
│ ├── CLAUDE.md # Backend-specific AI guidance
│ └── main.py # FastAPI application entry point
├── frontend/ # React TypeScript frontend
│ ├── src/
│ │ ├── api/ # API client and endpoints
│ │ │ ├── apiClient.ts # Axios-based API client
│ │ │ └── historyAPI.ts # Job history API client ✅ NEW
│ │ ├── components/ # Reusable React components
│ │ │ ├── auth/ # Authentication components
│ │ │ │ ├── ConditionalProtectedRoute.tsx # Smart auth wrapper ✅ NEW
│ │ │ │ └── ProtectedRoute.tsx # Standard auth protection
│ │ │ ├── history/ # History system components ✅ NEW
│ │ │ │ └── JobDetailModal.tsx # Enhanced history detail modal
│ │ │ ├── Batch/ # Batch processing UI
│ │ │ ├── Export/ # Export dialog components
│ │ │ ├── ProcessingProgress.tsx # Real-time progress
│ │ │ ├── SummarizeForm.tsx # Main form with transcript selector
│ │ │ ├── SummaryDisplay.tsx # Summary viewer
│ │ │ ├── TranscriptComparison.tsx # Side-by-side comparison
│ │ │ ├── TranscriptSelector.tsx # YouTube/Whisper selector
│ │ │ └── TranscriptViewer.tsx # Transcript display
│ │ ├── config/ # Configuration and settings ✅ NEW
│ │ │ └── app.config.ts # App-wide configuration including auth
│ │ ├── contexts/ # React contexts
│ │ │ └── AuthContext.tsx # Global authentication state
│ │ ├── hooks/ # Custom React hooks
│ │ │ ├── useBatchProcessing.ts # Batch operations
│ │ │ ├── useTranscriptSelector.ts # Transcript source logic
│ │ │ └── useWebSocket.ts # WebSocket connection
│ │ ├── pages/ # Page components
│ │ │ ├── MainPage.tsx # Unified main page (replaces Admin/Dashboard) ✅ NEW
│ │ │ ├── HistoryPage.tsx # Persistent job history page ✅ NEW
│ │ │ ├── BatchProcessingPage.tsx # Batch UI
│ │ │ └── auth/ # Authentication pages
│ │ │   ├── LoginPage.tsx # Login form
│ │ │   └── RegisterPage.tsx # Registration form
│ │ ├── types/ # TypeScript definitions
│ │ │ └── index.ts # Shared type definitions
│ │ ├── utils/ # Utility functions
│ │ ├── App.tsx # Main app component
│ │ └── main.tsx # React entry point
│ ├── public/ # Static assets
│ ├── .env.example # Environment variables template ✅ NEW
│ ├── package.json # Frontend dependencies
│ └── vite.config.ts # Vite configuration
├── video_storage/ # Media storage directories (auto-created)
│ ├── audio/ # Audio files for re-transcription
│ │ ├── *.mp3 # MP3 audio files (192kbps)
│ │ └── *_metadata.json # Audio metadata and settings
│ ├── cache/ # API response caching
│ ├── summaries/ # Generated AI summaries
│ ├── temp/ # Temporary processing files
│ ├── transcripts/ # Extracted transcripts
│ │ ├── *.txt # Plain text transcripts
│ │ └── *.json # Structured transcript data
│ └── videos/ # Downloaded video files
├── data/ # Database and application data
│ ├── app.db # SQLite database
│ └── cache/ # Local cache storage
├── scripts/ # Utility scripts
│ ├── setup_test_env.sh # Test environment setup
│ └── validate_test_setup.py # Test configuration validator
├── migrations/ # Alembic database migrations
│ └── versions/ # Migration version files
├── docs/ # Project documentation
│ ├── architecture.md # System architecture
│ ├── prd.md # Product requirements
│ ├── stories/ # Development stories
│ └── TESTING-INSTRUCTIONS.md # Test guidelines
├── .env.example # Environment template
├── .gitignore # Git exclusions
├── CHANGELOG.md # Version history
├── CLAUDE.md # AI development guidance
├── docker-compose.yml # Docker services
├── Dockerfile # Container configuration
├── README.md # Project documentation
├── requirements.txt # Python dependencies
└── run_tests.sh # Test runner script
```
## Key Directories
### Backend Services (`backend/services/`)
Core business logic implementing the 9-tier transcript extraction fallback chain:
1. **YouTube Transcript API** - Primary method using official API
2. **Auto-generated Captions** - YouTube's automatic captions
3. **Whisper AI Transcription** - OpenAI Whisper for audio
4. **PyTubeFix Downloader** - Alternative YouTube library
5. **YT-DLP Downloader** - Robust video/audio extraction
6. **Playwright Browser** - Browser automation fallback
7. **External Tools** - 4K Video Downloader integration
8. **Web Services** - Third-party transcript APIs
9. **Transcript-Only** - Metadata without full transcript
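The list above reads as "try each method in order and stop at the first one that succeeds". A minimal sketch of that control flow, assuming each tier is a callable that takes a video ID and either returns transcript text or raises. The names `Tier`, `run_fallback_chain`, and the stub extractors are hypothetical and are not taken from `intelligent_video_downloader.py`:

```python
from typing import Callable, List, Optional, Sequence, Tuple

# Hypothetical shape of a tier: (name, extractor). The real service may differ.
Tier = Tuple[str, Callable[[str], str]]


def run_fallback_chain(
    video_id: str, tiers: Sequence[Tier]
) -> Tuple[Optional[str], Optional[str], List[Tuple[str, str]]]:
    """Try each tier in order; return (tier_name, transcript, errors)."""
    errors: List[Tuple[str, str]] = []
    for name, extract in tiers:
        try:
            transcript = extract(video_id)
        except Exception as exc:  # a failing tier must not stop the chain
            errors.append((name, str(exc)))
            continue
        if transcript:  # an empty result counts as a miss
            return name, transcript, errors
        errors.append((name, "empty transcript"))
    return None, None, errors


# Stub extractors standing in for real tiers (illustration only).
def _transcript_api(video_id: str) -> str:
    raise RuntimeError("captions disabled")


def _auto_captions(video_id: str) -> str:
    return ""


def _whisper(video_id: str) -> str:
    return f"transcript for {video_id}"


EXAMPLE_TIERS: List[Tier] = [
    ("youtube_transcript_api", _transcript_api),
    ("auto_captions", _auto_captions),
    ("whisper", _whisper),
]

if __name__ == "__main__":
    print(run_fallback_chain("abc123", EXAMPLE_TIERS))
```

Collecting the per-tier errors instead of discarding them makes it possible to report why earlier tiers were skipped once a later tier succeeds.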
### Storage Structure (`video_storage/`)
Organized media storage with audio retention for re-transcription:
- **audio/** - MP3 files (192kbps) with metadata for future enhanced transcription
- **transcripts/** - Text and JSON transcripts from all sources
- **summaries/** - AI-generated summaries in multiple formats
- **cache/** - Cached API responses for performance
- **temp/** - Temporary files during processing
- **videos/** - Optional video file storage
### Frontend Components (`frontend/src/components/`)
- **TranscriptSelector** - Radio button UI for choosing YouTube/Whisper/Both
- **TranscriptComparison** - Side-by-side quality analysis
- **ProcessingProgress** - Real-time WebSocket progress updates
- **SummarizeForm** - Main interface with source selection
### Database Models (`backend/models/`)
- **User** - Authentication and user management
- **Summary** - Video summaries with transcripts
- **BatchJob** - Batch processing management
- **RefreshToken** - JWT refresh token storage
## Configuration Files
### Environment Variables (`.env`)
```bash
# Core Configuration
USE_MOCK_SERVICES=false
ENABLE_REAL_TRANSCRIPT_EXTRACTION=true
# API Keys
YOUTUBE_API_KEY=your_key
GOOGLE_API_KEY=your_gemini_key
ANTHROPIC_API_KEY=your_claude_key
# Storage Configuration
VIDEO_DOWNLOAD_STORAGE_PATH=./video_storage
VIDEO_DOWNLOAD_KEEP_AUDIO_FILES=true
VIDEO_DOWNLOAD_AUDIO_CLEANUP_DAYS=30
```
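As an illustration of how the storage variables above could be consumed, the following sketch parses them into a typed settings object. `StorageSettings` and `load_storage_settings` are hypothetical names, and the defaults simply mirror the example values; the actual loading code in `backend/config/` may work differently:

```python
import os
from dataclasses import dataclass
from pathlib import Path
from typing import Mapping


@dataclass(frozen=True)
class StorageSettings:
    storage_path: Path
    keep_audio_files: bool
    audio_cleanup_days: int


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings used in .env files."""
    return value.strip().lower() in {"1", "true", "yes", "on"}


def load_storage_settings(env: Mapping[str, str] = os.environ) -> StorageSettings:
    # Defaults are assumptions copied from the example .env above.
    return StorageSettings(
        storage_path=Path(env.get("VIDEO_DOWNLOAD_STORAGE_PATH", "./video_storage")),
        keep_audio_files=_as_bool(env.get("VIDEO_DOWNLOAD_KEEP_AUDIO_FILES", "true")),
        audio_cleanup_days=int(env.get("VIDEO_DOWNLOAD_AUDIO_CLEANUP_DAYS", "30")),
    )


if __name__ == "__main__":
    print(load_storage_settings())
```

Passing the environment in as a mapping keeps the function easy to test without touching the real process environment.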
### Video Download Config (`backend/config/video_download_config.py`)
- Storage paths and limits
- Download method priorities
- Audio retention settings
- Fallback chain configuration
## Testing Infrastructure
### Test Runner (`run_tests.sh`)
Comprehensive test execution with 229+ unit tests:
- Fast unit tests (~0.2s)
- Integration tests
- Coverage reporting
- Parallel execution
### Test Categories
- **unit/** - Isolated service tests
- **integration/** - API endpoint tests
- **auth/** - Authentication tests
- **pipeline/** - End-to-end tests
## Development Workflows
### Quick Start
```bash
# Backend (from the project root)
cd backend
source venv/bin/activate
python main.py
# Frontend (in a second terminal, from the project root)
cd frontend
npm install
npm run dev
# Testing (from the project root)
./run_tests.sh run-unit --fail-fast
```
### Admin Testing
Direct access without authentication:
```
http://localhost:3002/admin
```
### Protected App
Full application with authentication:
```
http://localhost:3002/dashboard
```
## Key Features
### Transcript Extraction
- 9-tier fallback chain for reliability
- YouTube captions and Whisper AI options
- Quality comparison and analysis
- Processing time estimation
### Audio Retention
- Automatic audio saving as MP3
- Metadata tracking for re-transcription
- Configurable retention period
- WAV to MP3 conversion
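The configurable retention period can be pictured as a periodic sweep over `video_storage/audio/`. The sketch below is one possible approach, not the project's actual cleanup code: it assumes file age is judged by modification time and that each `name.mp3` has a sidecar called `name_metadata.json`. The function names are hypothetical.

```python
import time
from pathlib import Path
from typing import List, Optional

SECONDS_PER_DAY = 86400


def find_expired_audio(
    audio_dir: Path, retention_days: int, now: Optional[float] = None
) -> List[Path]:
    """Return MP3 files last modified before the retention window."""
    now = time.time() if now is None else now
    cutoff = now - retention_days * SECONDS_PER_DAY
    return sorted(p for p in audio_dir.glob("*.mp3") if p.stat().st_mtime < cutoff)


def cleanup_audio(
    audio_dir: Path, retention_days: int, now: Optional[float] = None
) -> List[Path]:
    """Delete expired MP3 files and their metadata sidecars; return removed MP3s."""
    removed: List[Path] = []
    for mp3 in find_expired_audio(audio_dir, retention_days, now):
        sidecar = mp3.with_name(f"{mp3.stem}_metadata.json")  # assumed naming
        mp3.unlink()
        if sidecar.exists():
            sidecar.unlink()
        removed.append(mp3)
    return removed
```

With the example configuration, a scheduled job would call `cleanup_audio(Path("./video_storage/audio"), 30)`.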
### Real-time Updates
- WebSocket progress tracking
- Stage-based pipeline monitoring
- Job cancellation support
- Connection recovery
### Batch Processing
- Process up to 100 videos
- Sequential queue management
- Progress tracking per item
- ZIP export with organization
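A sequential batch with per-item progress can be sketched as a plain loop. This is an illustration under stated assumptions: `process_batch`, `BatchResult`, and the `summarize` callable are hypothetical, and the real `batch_processing_service.py` may persist state and report progress differently.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional, Sequence

MAX_BATCH_SIZE = 100  # limit stated in the feature list above


@dataclass
class BatchResult:
    completed: Dict[str, str] = field(default_factory=dict)  # url -> summary
    failed: Dict[str, str] = field(default_factory=dict)     # url -> error message


def process_batch(
    urls: Sequence[str],
    summarize: Callable[[str], str],
    on_progress: Optional[Callable[[int, int, str], None]] = None,
) -> BatchResult:
    """Process URLs one at a time; one failure does not abort the batch."""
    if len(urls) > MAX_BATCH_SIZE:
        raise ValueError(f"batch of {len(urls)} exceeds limit of {MAX_BATCH_SIZE}")
    result = BatchResult()
    for index, url in enumerate(urls, start=1):
        try:
            result.completed[url] = summarize(url)
        except Exception as exc:
            result.failed[url] = str(exc)
        if on_progress is not None:
            on_progress(index, len(urls), url)  # per-item progress callback
    return result
```

Recording failures per item rather than raising keeps the remaining videos in the queue moving.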
## API Endpoints
### Core Pipeline
- `POST /api/pipeline/process` - Start video processing
- `GET /api/pipeline/status/{job_id}` - Check job status
- `GET /api/pipeline/result/{job_id}` - Get results
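The three endpoints above suggest a submit, poll, fetch workflow. The following client sketch uses only the standard library. The request field `video_url`, the `status` response key, and its values `completed` and `failed` are assumptions, since this document does not specify the payloads; the base URL uses the backend port from `docker-compose.yml`.

```python
import json
import time
import urllib.request
from typing import Callable

BASE_URL = "http://localhost:8000"


def build_process_request(video_url: str, base_url: str = BASE_URL) -> urllib.request.Request:
    """Build the POST request that starts processing (payload shape is assumed)."""
    body = json.dumps({"video_url": video_url}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/pipeline/process",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def fetch_json(request_or_url) -> dict:
    """Send a request (or GET a URL) and decode the JSON response."""
    with urllib.request.urlopen(request_or_url, timeout=30) as response:
        return json.load(response)


def wait_for_result(
    job_id: str,
    fetch: Callable[[str], dict] = fetch_json,
    base_url: str = BASE_URL,
    interval: float = 2.0,
    max_polls: int = 150,
) -> dict:
    """Poll the status endpoint, then fetch the result once the job completes."""
    for _ in range(max_polls):
        status = fetch(f"{base_url}/api/pipeline/status/{job_id}")
        state = status.get("status")  # key and values are assumptions
        if state == "completed":
            return fetch(f"{base_url}/api/pipeline/result/{job_id}")
        if state == "failed":
            raise RuntimeError(f"job {job_id} failed: {status}")
        time.sleep(interval)
    raise TimeoutError(f"job {job_id} did not finish after {max_polls} polls")
```

Injecting `fetch` lets the polling logic be exercised without a running backend.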
### Dual Transcripts
- `POST /api/transcripts/dual/extract` - Extract with options
- `GET /api/transcripts/dual/compare/{video_id}` - Compare sources
### Authentication
- `POST /api/auth/register` - User registration
- `POST /api/auth/login` - User login
- `POST /api/auth/refresh` - Token refresh
### Batch Operations
- `POST /api/batch/jobs` - Create batch job
- `GET /api/batch/jobs/{job_id}` - Job status
- `GET /api/batch/export/{job_id}` - Export results
### Enhanced Export System ✅ Story 4.4
- `POST /api/export/enhanced` - Generate professional export with AI intelligence
- `GET /api/export/config` - Available export configuration options
- `POST /api/export/templates` - Create custom prompt templates
- `GET /api/export/templates` - List and filter domain templates
- `POST /api/export/recommendations` - Get domain-specific template recommendations
- `GET /api/export/templates/{id}/analytics` - Template performance metrics
- `GET /api/export/system/stats` - Overall system statistics
## Database Schema
### Core Tables
- `users` - User accounts and profiles
- `summaries` - Video summaries and metadata
- `refresh_tokens` - JWT refresh tokens
- `batch_jobs` - Batch processing jobs
- `batch_job_items` - Individual batch items
## Docker Services
### docker-compose.yml
```yaml
services:
  backend:
    build: .
    ports: ["8000:8000"]
    volumes: ["./video_storage:/app/video_storage"]
  frontend:
    build: ./frontend
    ports: ["3002:3002"]
  redis:
    image: redis:alpine
    ports: ["6379:6379"]
```
## Version History
- **v5.1.0** - 9-tier fallback chain, audio retention
- **v5.0.0** - MCP server, SDKs, agent frameworks
- **v4.1.0** - Dual transcript options
- **v3.5.0** - Real-time WebSocket updates
- **v3.4.0** - Batch processing
- **v3.3.0** - Summary history
- **v3.2.0** - Frontend authentication
- **v3.1.0** - Backend authentication
---
*Last updated: 2025-08-27 - Added transcript fallback chain and audio retention features*