# CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

**📌 PRIMARY WORKFLOW**: @.cursor/rules/agent_workflow.mdc - Single source of truth for all development patterns

## Project Context

Trax is a production-ready media transcription platform within the my-ai-projects ecosystem. It uses Whisper for transcription with domain-specific AI enhancement, optimized for M3 MacBook performance.

**Core Architecture**: Download-first media processing → Whisper transcription → DeepSeek enhancement → Multi-format export

## Core Development Principles

From @.cursor/rules/agent_workflow.mdc:

- **Keep It Simple**: One workflow, clear patterns, no complex hierarchies
- **Context First**: Always understand what you're building before coding
- **Test First**: Write tests before implementation
- **Quality Built-In**: Enforce standards as you go, not as separate phases
- **Progressive Enhancement**: Start simple, add complexity only when needed

## Quick Decision Tree

### Request Type → Action

- **Question/How-to**: Answer directly with code examples
- **Implementation Request**: Follow TDD workflow below
- **Server/Command**: Execute appropriate command
- **Analysis/Review**: Examine code and provide feedback

## Enhanced TDD Workflow with Planning

From @.cursor/rules/agent_workflow.mdc with spec-driven development:

```
1. Plan (Spec-First) → 2. Understand Requirements → 3. Write Tests → 4. Implement → 5. Validate → 6. Done
```

### MANDATORY: Plan Mode First

- **Always enter plan mode** before implementing any feature
- Create detailed plan in `.claude/tasks/.md`
- Break down into phases with clear deliverables
- Update plan as you progress
- Plan should include: requirements, architecture, test strategy, implementation phases

## Essential Commands

### Environment Setup

```bash
# Navigate to project and activate environment
cd /Users/enias/projects/my-ai-projects/apps/trax
source .venv/bin/activate

# Install/update dependencies with uv (10-100x faster than pip)
uv pip install -e ".[dev]"
```

### Step 1: Plan Mode (Spec-First)

```bash
# Enter plan mode and create detailed spec
# In Claude Code: Shift+Tab twice to enter plan mode
# Create plan at: .claude/tasks/.md
# Include: requirements, phases, architecture, test strategy
```

### Step 2: Understand Requirements

```bash
# Get task details and context
task-master show             # Get task details
./scripts/tm_context.sh get  # Get cached context
```

### Step 3: Write Tests First

```bash
# Run tests with coverage
uv run pytest                                          # All tests
uv run pytest tests/test_transcription_service.py -v  # Specific test file
uv run pytest -k "test_multi_pass" -v                 # Tests matching pattern
uv run pytest -m unit                                  # Unit tests only
```

### Step 4: Implement Minimal Code

```bash
# Development server
uv run python src/main.py  # Start development server
```

### Step 5: Validate Quality

```bash
# Code quality
uv run black src/ tests/             # Format code
uv run ruff check --fix src/ tests/  # Lint and auto-fix
uv run mypy src/                     # Type checking
./scripts/validate_loc.sh            # Check file sizes

# Database operations
uv run alembic upgrade head                              # Apply migrations
uv run alembic revision --autogenerate -m "description"  # Create migration
```

### Step 6: Complete Task & Update Plan

```bash
# Update plan with results
# Document in .claude/tasks/.md what was completed
task-master set-status --id= --status=done
./scripts/tm_cache.sh update
./scripts/update_changelog.sh --type=task
```

### CLI Commands

```bash
# Standard transcription
uv run python -m src.cli.main transcribe audio.mp3       # Basic transcription
uv run python -m src.cli.main transcribe audio.mp3 --v2  # AI-enhanced (99% accuracy)

# Enhanced CLI (recommended for production)
uv run python -m src.cli.enhanced_cli transcribe audio.mp3 --multi-pass --confidence-threshold 0.9
uv run python -m src.cli.enhanced_cli transcribe lecture.mp3 --domain academic --diarize
uv run python -m src.cli.enhanced_cli batch /path/to/files --parallel 8

# YouTube processing
uv run python -m src.cli.main youtube https://youtube.com/watch?v=VIDEO_ID
uv run python -m src.cli.main batch-urls urls.txt --output-dir transcripts/
```
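These commands are thin wrappers around the download-first pipeline described under Project Context. As a rough orientation, the end-to-end flow looks like the sketch below; the class names and method signatures are assumptions modelled on the Project Structure and Key Implementation Patterns sections, not a verified API.

```python
# Sketch only — names and signatures are assumptions, not the project's verified API.
async def process_media(url: str) -> None:
    downloader = MediaDownloadService()    # download first (see Key Implementation Patterns)
    transcriber = TranscriptionService()   # Whisper, distil-large-v3
    enhancer = DomainEnhancementService()  # hypothetical name for domain_enhancement.py
    exporter = ExportService()             # hypothetical name for export_service.py

    local_path = await downloader.download(url)             # 1. download media locally
    transcript = await transcriber.transcribe(local_path)   # 2. Whisper transcription
    enhanced = await enhancer.enhance(transcript)           # 3. DeepSeek enhancement
    exporter.export(enhanced, formats=["txt", "srt", "vtt", "json"])  # 4. multi-format export
```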
## Project Structure

```
trax/
├── src/                     # Main application code
│   ├── services/            # Core business logic (protocol-based)
│   │   ├── protocols.py     # Service interfaces
│   │   ├── transcription_service.py
│   │   ├── multi_pass_transcription.py
│   │   ├── domain_enhancement.py
│   │   ├── batch_processor.py
│   │   └── export_service.py
│   ├── database/            # Data layer
│   │   ├── models.py        # Core SQLAlchemy models
│   │   ├── v2_models.py     # Extended v2 features
│   │   └── repositories/    # Data access patterns
│   ├── cli/                 # Command-line interfaces
│   │   ├── main.py          # Standard CLI
│   │   └── enhanced_cli.py  # Advanced CLI with progress
│   ├── api/                 # REST API endpoints (future)
│   ├── utils/               # Shared utilities
│   └── config.py            # Configuration (inherits from ../../.env)
├── tests/                   # Test suite
│   ├── fixtures/            # Real test media files
│   │   ├── audio/           # Sample audio files
│   │   └── video/           # Sample video files
│   ├── conftest.py          # Pytest configuration
│   └── test_*.py            # Test files
├── scripts/                 # Utility scripts
│   ├── validate_loc.sh      # File size validation
│   ├── tm_context.sh        # Task context caching
│   └── update_changelog.sh
├── .cursor/rules/           # Cursor AI rules
│   ├── agent_workflow.mdc   # Main workflow (single source)
│   └── *.mdc                # Supporting rules
├── .taskmaster/             # Task Master configuration
│   ├── tasks/               # Task files
│   ├── docs/                # PRD and documentation
│   └── config.json          # AI model configuration
├── .venv/                   # Virtual environment (gitignored)
├── pyproject.toml           # Package configuration (uv)
├── CLAUDE.md                # This file
└── AGENTS.md                # Development rules
```

## High-Level Architecture

### Service Layer (`src/services/`)

The core processing logic uses **protocol-based design** for modularity:

```python
# All services implement protocols for clean interfaces
from src.services.protocols import TranscriptionProtocol, EnhancementProtocol

# Key services:
# - transcription_service.py     # Whisper integration (20-70x faster on M3)
# - multi_pass_transcription.py  # Iterative refinement for 99.5% accuracy
# - domain_enhancement.py        # AI enhancement with domain adaptation
# - batch_processor.py           # Parallel processing (8 workers optimal)
# - export_service.py            # Multi-format export (TXT, SRT, VTT, JSON)
```

### Performance Optimizations

- **Memory Management**: `memory_optimization.py` - Automatic cleanup, chunked processing
- **Speed**: `speed_optimization.py` - M3-specific optimizations, distil-large-v3 model
- **Domain Adaptation**: `domain_adaptation.py` - Technical/academic/medical terminology
- **Caching**: Multi-layer caching with different TTLs per data type

### Database Layer (`src/database/`)

PostgreSQL with SQLAlchemy ORM:

- `models.py` - Core models (MediaFile, Transcript, Enhancement)
- `v2_models.py` - Extended models for v2 features
- `repositories/` - Data access patterns with protocol compliance
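To make "data access patterns with protocol compliance" concrete, a repository might look like the sketch below. The protocol, method set, and session handling are illustrative assumptions; the actual implementations under `src/database/repositories/` are authoritative.

```python
# Hypothetical repository sketch — illustrates the pattern, not a real class in this repo.
from typing import Optional, Protocol

from sqlalchemy.orm import Session

from src.database.models import Transcript  # model name taken from models.py above


class TranscriptRepositoryProtocol(Protocol):
    def get(self, transcript_id: int) -> Optional[Transcript]: ...
    def add(self, transcript: Transcript) -> Transcript: ...


class TranscriptRepository:
    """SQLAlchemy-backed implementation that satisfies the protocol above."""

    def __init__(self, session: Session) -> None:
        self.session = session

    def get(self, transcript_id: int) -> Optional[Transcript]:
        # Primary-key lookup via the ORM session
        return self.session.get(Transcript, transcript_id)

    def add(self, transcript: Transcript) -> Transcript:
        self.session.add(transcript)
        self.session.commit()
        return transcript
```

A service would typically receive the repository (or the protocol it satisfies) through its constructor, keeping database access out of transcription logic.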
### Testing Strategy (`tests/`)

**Real-file testing** - No mocks, actual media files:

```python
# tests/conftest.py provides real test fixtures
@pytest.fixture
def sample_audio_5s():
    return Path("tests/fixtures/audio/sample_5s.wav")
```

## Configuration System

Inherits from root project `.env` at `../../.env`:

```python
from src.config import config

# All API keys available as attributes
api_key = config.DEEPSEEK_API_KEY
services = config.get_available_ai_services()
```

## File Organization Rules

### File Size Limits

- **Code Files** (.py, .ts, .js):
  - Target: Under 300 lines
  - Maximum: 350 lines (only with clear justification)
  - Exceptions: Complex algorithms, comprehensive test suites
- **Documentation** (.md, .txt):
  - Target: Under 550 lines
  - Maximum: 600 lines for essential docs (CLAUDE.md, README.md)
- **Single Responsibility**: One service/component per file
- **Protocol-Based**: Use `typing.Protocol` for service interfaces (see the sketch below)
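To illustrate the protocol-based rule, a service interface defined with `typing.Protocol` might look like the following. The protocol names match the import shown under Service Layer, but the exact signatures are assumptions; the real definitions live in `src/services/protocols.py`.

```python
# Illustrative only — the authoritative interfaces are in src/services/protocols.py.
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable
class TranscriptionProtocol(Protocol):
    """Anything that can turn a media file into a transcript."""

    async def transcribe(self, file_path: Path) -> "TranscriptResult": ...


@runtime_checkable
class EnhancementProtocol(Protocol):
    """Anything that can refine an existing transcript."""

    async def enhance(self, transcript: "TranscriptResult") -> "TranscriptResult": ...
```

Because `typing.Protocol` uses structural subtyping, a concrete service satisfies the interface simply by matching the method signatures; inheriting from the protocol (as in the example below) is optional but makes the intent explicit.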
### Example Structure

```python
# transcription_service.py - Only transcription logic (50-100 lines)
class TranscriptionService(TranscriptionProtocol):
    async def transcribe(self, file_path: Path) -> TranscriptResult:
        # Focused implementation
        pass


# audio_processor.py - Only audio processing logic (50-100 lines)
class AudioProcessor(AudioProtocol):
    def process_audio(self, audio_data) -> ProcessedAudio:
        # Focused implementation
        pass
```

## Key Implementation Patterns

### 1. Download-First Architecture

```python
# Always download media before processing
downloader = MediaDownloadService()
local_path = await downloader.download(url)
result = await transcriber.transcribe(local_path)
```

### 2. Test-First Development

```python
# Write a test that defines the interface
def test_transcription_service():
    service = TranscriptionService()
    result = service.transcribe_audio("test.wav")
    assert result.text is not None
    assert result.confidence > 0.8

# THEN implement to make the test pass
```

### 3. Multi-Pass Refinement (v2)

```python
# Iterative improvement for 99.5% accuracy
service = MultiPassTranscriptionService()
result = await service.transcribe_with_passes(
    file_path,
    min_confidence=0.9,
    max_passes=3
)
```

### 4. Batch Processing

```python
# Optimized for M3 with 8 parallel workers
processor = BatchProcessor(max_workers=8)
results = await processor.process_batch(file_paths)
```

## Performance Targets

- 5-minute audio: <30 seconds processing
- 95% accuracy (v1), 99% accuracy (v2)
- <1 second CLI response time
- Support files up to 500MB
- 8 parallel workers on M3

## Current Implementation Status

### ✅ Completed

- Whisper transcription with distil-large-v3
- DeepSeek AI enhancement
- Multi-pass refinement system
- Domain adaptation (technical/academic/medical)
- Speaker diarization
- Batch processing with parallel workers
- Export to TXT/SRT/VTT/JSON
- PostgreSQL database with migrations
- Comprehensive test suite
- Enhanced CLI with progress tracking

### 🚧 In Progress

- Research agent UI (Streamlit)
- Vector search integration (ChromaDB/FAISS)
- Advanced speaker profiles

## Task Master Integration

Task Master commands for project management:

```bash
# View current tasks
task-master list
task-master next
task-master show

# Update task status
task-master set-status --id= --status=done
task-master update-subtask --id= --prompt="implementation notes"
```

See `.taskmaster/CLAUDE.md` for full Task Master workflow integration.

## Common Workflows

### Adding New Feature

```bash
# 1. Get task details
task-master show

# 2. Write tests first
# Create a test file with comprehensive test cases

# 3. Implement minimal code
# Write code to pass the tests

# 4. Validate quality
uv run pytest && uv run black src/ tests/ && uv run ruff check --fix

# 5. Complete
task-master set-status --id= --status=done
```

### Fixing Bug

```bash
# 1. Reproduce the bug with a failing test
# 2. Fix the code to make the test pass
# 3. Validate
uv run pytest   # plus the Step 5 quality checks (black, ruff, mypy)
# 4. Update status
task-master set-status --id= --status=done
```
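Step 1 ("reproduce the bug with a failing test") usually means writing a small regression test against a real fixture before touching any code. A sketch, with a hypothetical bug and an assumed import path:

```python
# tests/test_regression_example.py — hypothetical regression-test sketch.
from pathlib import Path

from src.services.transcription_service import TranscriptionService  # assumed import path


def test_short_clip_is_not_empty(sample_audio_5s: Path):
    """Reproduces a (hypothetical) bug where very short clips returned empty text."""
    service = TranscriptionService()
    # Method name mirrors the Test-First Development example above
    result = service.transcribe_audio(str(sample_audio_5s))
    assert result.text.strip(), "short clips should still produce a transcript"
```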
## Common Issues & Solutions

### Database Connection

```bash
# Check PostgreSQL status
pg_ctl status -D /usr/local/var/postgres

# Start if needed
pg_ctl start -D /usr/local/var/postgres
```

### FFmpeg Missing

```bash
# Install via Homebrew
brew install ffmpeg
```

### API Key Issues

```bash
# Verify keys loaded
uv run python -c "from src.config import config; config.display_config_status()"
```

### Missing .env

Check that `../../.env` exists in the root project.

### Import Errors

Run `uv pip install -e ".[dev]"`.

### Type Errors

Run `uv run mypy src/`.

### Formatting Issues

Run `uv run black src/ tests/`.

## Anti-Patterns to Avoid

### ❌ DON'T: Skip Understanding

- Jumping straight to coding without requirements
- Not reading task details or context
- Ignoring existing code patterns

### ❌ DON'T: Skip Testing

- Writing code before tests
- Incomplete test coverage
- Not testing edge cases

### ❌ DON'T: Ignore Quality

- Large, monolithic code files (>350 lines without justification)
- Documentation files exceeding 600 lines
- Poor formatting or linting errors
- Not following project patterns

### ❌ DON'T: Over-Engineer

- Complex abstractions when simple works
- Multiple layers when one suffices
- Premature optimization

## Success Metrics

### Code Quality

- All tests pass
- Code files under LOC limits (300 lines target, 350 max)
- Documentation under 550 lines (600 max for essentials)
- No linting errors
- Consistent formatting

### Development Speed

- Clear understanding of requirements
- Tests written first
- Minimal viable implementation
- Quick validation cycles

### Maintainability

- Small, focused files
- Clear separation of concerns
- Consistent patterns
- Good test coverage

## Memory Management

### Claude Code Memory (`#` shortcut)

Use `#` to save important context:

```
#remember Using distil-large-v3 for M3 optimization
#remember PostgreSQL 15+ with JSONB for flexible storage
#remember 8 parallel workers optimal for batch processing
```

### Memory Levels

- **Project-level**: Saved to `.claude.md` in project root
- **User-level**: Saved globally across all projects
- **Session-level**: Saved to `.claude/context/session.md`

### What to Remember

- **Architecture decisions**: Model choices, database patterns
- **Performance targets**: Processing times, accuracy goals
- **Configuration**: API keys, service endpoints
- **Conventions**: Naming patterns, file organization
- **Dependencies**: Required packages, versions

## Cursor Rules

Key rules from `.cursor/rules/`:

- **agent_workflow.mdc** - Simplified TDD workflow (single source of truth)
- **progressive-enhancement.mdc** - Iterative refinement approach
- **utc-timestamps.mdc** - Timestamp handling standards
- **low-loc.mdc** - Low Line of Code patterns (300-line target for code, 550 for docs)

## Parallel Development

Git worktrees enable parallel development across features:

- **Setup**: Run `.claude/scripts/setup_worktrees.sh`
- **5 Default Worktrees**: features, testing, docs, performance, bugfix
- **Switch**: Use `/Users/enias/projects/my-ai-projects/apps/trax-worktrees/switch.sh`
- **Status**: Check all with `trax-worktrees/status.sh`
- **Full Guide**: [Parallel Development Workflow](.claude/docs/parallel-development-workflow.md)

---

*Architecture Version: 2.0 | Python 3.11+ | PostgreSQL 15+ | FFmpeg 6.0+*

**Remember**: Keep it simple. Follow @.cursor/rules/agent_workflow.mdc: Understand → Test → Implement → Validate → Complete.