CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
📌 PRIMARY WORKFLOW: @.cursor/rules/agent_workflow.mdc - Single source of truth for all development patterns
Project Context
Trax is a production-ready media transcription platform within the my-ai-projects ecosystem. It uses Whisper for transcription with domain-specific AI enhancement, optimized for M3 MacBook performance.
Core Architecture: Download-first media processing → Whisper transcription → DeepSeek enhancement → Multi-format export
Core Development Principles
From @.cursor/rules/agent_workflow.mdc:
- Keep It Simple: One workflow, clear patterns, no complex hierarchies
- Context First: Always understand what you're building before coding
- Test First: Write tests before implementation
- Quality Built-In: Enforce standards as you go, not as separate phases
- Progressive Enhancement: Start simple, add complexity only when needed
Quick Decision Tree
Request Type → Action
- Question/How-to: Answer directly with code examples
- Implementation Request: Follow TDD workflow below
- Server/Command: Execute appropriate command
- Analysis/Review: Examine code and provide feedback
Enhanced TDD Workflow with Planning
From @.cursor/rules/agent_workflow.mdc with spec-driven development:
1. Plan (Spec-First) → 2. Understand Requirements → 3. Write Tests → 4. Implement → 5. Validate → 6. Done
MANDATORY: Plan Mode First
- Always enter plan mode before implementing any feature
- Create detailed plan in .claude/tasks/<feature-name>.md
- Break down into phases with clear deliverables
- Update plan as you progress
- Plan should include: requirements, architecture, test strategy, implementation phases
Essential Commands
Environment Setup
# Navigate to project and activate environment
cd /Users/enias/projects/my-ai-projects/apps/trax
source .venv/bin/activate
# Install/update dependencies with uv (10-100x faster than pip)
uv pip install -e ".[dev]"
Step 1: Plan Mode (Spec-First)
# Enter plan mode and create detailed spec
# In Claude Code: Shift+Tab twice to enter plan mode
# Create plan at: .claude/tasks/<feature-name>.md
# Include: requirements, phases, architecture, test strategy
Step 2: Understand Requirements
# Get task details and context
task-master show <task-id> # Get task details
./scripts/tm_context.sh get <task-id> # Get cached context
Step 3: Write Tests First
# Run tests with coverage
uv run pytest # All tests
uv run pytest tests/test_transcription_service.py -v # Specific test file
uv run pytest -k "test_multi_pass" -v # Tests matching pattern
uv run pytest -m unit # Unit tests only
Step 4: Implement Minimal Code
# Development server
uv run python src/main.py # Start development server
Step 5: Validate Quality
# Code quality
uv run black src/ tests/ # Format code
uv run ruff check --fix src/ tests/ # Lint and auto-fix
uv run mypy src/ # Type checking
./scripts/validate_loc.sh # Check file sizes
# Database operations
uv run alembic upgrade head # Apply migrations
uv run alembic revision --autogenerate -m "description" # Create migration
Step 6: Complete Task & Update Plan
# Update plan with results
# Document in .claude/tasks/<feature-name>.md what was completed
task-master set-status --id=<task-id> --status=done
./scripts/tm_cache.sh update <task-id>
./scripts/update_changelog.sh <task-id> --type=task
CLI Commands
# Standard transcription
uv run python -m src.cli.main transcribe audio.mp3 # Basic transcription
uv run python -m src.cli.main transcribe audio.mp3 --v2 # AI-enhanced (99% accuracy)
# Enhanced CLI (recommended for production)
uv run python -m src.cli.enhanced_cli transcribe audio.mp3 --multi-pass --confidence-threshold 0.9
uv run python -m src.cli.enhanced_cli transcribe lecture.mp3 --domain academic --diarize
uv run python -m src.cli.enhanced_cli batch /path/to/files --parallel 8
# YouTube processing
uv run python -m src.cli.main youtube https://youtube.com/watch?v=VIDEO_ID
uv run python -m src.cli.main batch-urls urls.txt --output-dir transcripts/
Project Structure
trax/
├── src/                      # Main application code
│   ├── services/             # Core business logic (protocol-based)
│   │   ├── protocols.py      # Service interfaces
│   │   ├── transcription_service.py
│   │   ├── multi_pass_transcription.py
│   │   ├── domain_enhancement.py
│   │   ├── batch_processor.py
│   │   └── export_service.py
│   ├── database/             # Data layer
│   │   ├── models.py         # Core SQLAlchemy models
│   │   ├── v2_models.py      # Extended v2 features
│   │   └── repositories/     # Data access patterns
│   ├── cli/                  # Command-line interfaces
│   │   ├── main.py           # Standard CLI
│   │   └── enhanced_cli.py   # Advanced CLI with progress
│   ├── api/                  # REST API endpoints (future)
│   ├── utils/                # Shared utilities
│   └── config.py             # Configuration (inherits from ../../.env)
├── tests/                    # Test suite
│   ├── fixtures/             # Real test media files
│   │   ├── audio/            # Sample audio files
│   │   └── video/            # Sample video files
│   ├── conftest.py           # Pytest configuration
│   └── test_*.py             # Test files
├── scripts/                  # Utility scripts
│   ├── validate_loc.sh       # File size validation
│   ├── tm_context.sh         # Task context caching
│   └── update_changelog.sh
├── .cursor/rules/            # Cursor AI rules
│   ├── agent_workflow.mdc    # Main workflow (single source)
│   └── *.mdc                 # Supporting rules
├── .taskmaster/              # Task Master configuration
│   ├── tasks/                # Task files
│   ├── docs/                 # PRD and documentation
│   └── config.json           # AI model configuration
├── .venv/                    # Virtual environment (gitignored)
├── pyproject.toml            # Package configuration (uv)
├── CLAUDE.md                 # This file
└── AGENTS.md                 # Development rules
High-Level Architecture
Service Layer (src/services/)
The core processing logic uses protocol-based design for modularity:
# All services implement protocols for clean interfaces
from src.services.protocols import TranscriptionProtocol, EnhancementProtocol
# Key services:
- transcription_service.py # Whisper integration (20-70x faster on M3)
- multi_pass_transcription.py # Iterative refinement for 99.5% accuracy
- domain_enhancement.py # AI enhancement with domain adaptation
- batch_processor.py # Parallel processing (8 workers optimal)
- export_service.py # Multi-format export (TXT, SRT, VTT, JSON)
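The contents of protocols.py are not shown here, but a typing.Protocol service interface could look roughly like this (a minimal sketch; only TranscriptionProtocol and TranscriptResult are names from this document, the fields and FakeTranscriber are illustrative):

```python
# Hypothetical sketch of protocols.py; the real interfaces may differ.
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol, runtime_checkable


@dataclass
class TranscriptResult:
    text: str
    confidence: float


@runtime_checkable
class TranscriptionProtocol(Protocol):
    async def transcribe(self, file_path: Path) -> TranscriptResult:
        """Transcribe a local media file and return text with confidence."""
        ...


# Any class with a matching transcribe() satisfies the protocol structurally,
# with no inheritance required.
class FakeTranscriber:
    async def transcribe(self, file_path: Path) -> TranscriptResult:
        return TranscriptResult(text="hello", confidence=0.99)


print(isinstance(FakeTranscriber(), TranscriptionProtocol))
```

Structural typing keeps services swappable in tests and keeps the service layer decoupled from concrete implementations.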
Performance Optimizations
- Memory Management: memory_optimization.py - automatic cleanup, chunked processing
- Speed: speed_optimization.py - M3-specific optimizations, distil-large-v3 model
- Domain Adaptation: domain_adaptation.py - technical/academic/medical terminology
- Caching: Multi-layer caching with different TTLs per data type
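The caching layer itself is not shown here; as a rough sketch of per-data-type TTLs (the type names and durations below are made up for illustration):

```python
# Illustrative TTL cache with per-data-type expiry; NOT the actual Trax cache.
import time

# Hypothetical TTLs in seconds per data type
TTLS = {"transcript": 3600, "context": 300, "model_info": 86400}


class TTLCache:
    def __init__(self):
        self._store = {}  # key -> (value, expires_at)

    def set(self, key, value, data_type="transcript"):
        self._store[key] = (value, time.monotonic() + TTLS[data_type])

    def get(self, key):
        entry = self._store.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if time.monotonic() >= expires_at:
            del self._store[key]  # expired: evict lazily on read
            return None
        return value


cache = TTLCache()
cache.set("job:1", "transcript text", data_type="transcript")
print(cache.get("job:1"))
```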
Database Layer (src/database/)
PostgreSQL with SQLAlchemy ORM:
- models.py - Core models (MediaFile, Transcript, Enhancement)
- v2_models.py - Extended models for v2 features
- repositories/ - Data access patterns with protocol compliance
Testing Strategy (tests/)
Real-file testing - No mocks, actual media files:
# tests/conftest.py provides real test fixtures
@pytest.fixture
def sample_audio_5s():
    return Path("tests/fixtures/audio/sample_5s.wav")
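A test consuming that fixture might look like the following (the test body is illustrative, not taken from the actual suite):

```python
# Hypothetical test module using the real-file fixture pattern.
from pathlib import Path

import pytest


@pytest.fixture
def sample_audio_5s():
    # conftest.py serves real media files instead of mocks
    return Path("tests/fixtures/audio/sample_5s.wav")


def test_fixture_points_at_real_wav(sample_audio_5s):
    # Assertions run against the actual fixture path on disk
    assert sample_audio_5s.suffix == ".wav"
    assert sample_audio_5s.parts[-2] == "audio"
```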
Configuration System
Inherits from root project .env at ../../.env:
from src.config import config
# All API keys available as attributes
api_key = config.DEEPSEEK_API_KEY
services = config.get_available_ai_services()
File Organization Rules
File Size Limits
- Code Files (.py, .ts, .js):
- Target: Under 300 lines
- Maximum: 350 lines (only with clear justification)
- Exceptions: Complex algorithms, comprehensive test suites
- Documentation (.md, .txt):
- Target: Under 550 lines
- Maximum: 600 lines for essential docs (CLAUDE.md, README.md)
- Single Responsibility: One service/component per file
- Protocol-Based: Use typing.Protocol for service interfaces
Example Structure
# transcription_service.py - Only transcription logic (50-100 lines)
class TranscriptionService(TranscriptionProtocol):
    async def transcribe(self, file_path: Path) -> TranscriptResult:
        # Focused implementation
        pass

# audio_processor.py - Only audio processing logic (50-100 lines)
class AudioProcessor(AudioProtocol):
    def process_audio(self, audio_data) -> ProcessedAudio:
        # Focused implementation
        pass
Key Implementation Patterns
1. Download-First Architecture
# Always download media before processing
downloader = MediaDownloadService()
local_path = await downloader.download(url)
result = await transcriber.transcribe(local_path)
2. Test-First Development
# Write test that defines the interface
def test_transcription_service():
    service = TranscriptionService()
    result = service.transcribe_audio("test.wav")
    assert result.text is not None
    assert result.confidence > 0.8

# THEN implement to make the test pass
3. Multi-Pass Refinement (v2)
# Iterative improvement for 99.5% accuracy
service = MultiPassTranscriptionService()
result = await service.transcribe_with_passes(
    file_path,
    min_confidence=0.9,
    max_passes=3,
)
4. Batch Processing
# Optimized for M3 with 8 parallel workers
processor = BatchProcessor(max_workers=8)
results = await processor.process_batch(file_paths)
Performance Targets
- 5-minute audio: <30 seconds processing
- 95% accuracy (v1), 99% accuracy (v2)
- <1 second CLI response time
- Support files up to 500MB
- 8 parallel workers on M3
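The first target implies a minimum throughput, which is easy to sanity-check:

```python
# Throughput implied by "5-minute audio in under 30 seconds"
audio_seconds = 5 * 60        # 5-minute input
max_processing_seconds = 30   # target upper bound
realtime_factor = audio_seconds / max_processing_seconds
print(realtime_factor)  # 10.0 -> at least 10x real-time throughput
```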
Current Implementation Status
✅ Completed
- Whisper transcription with distil-large-v3
- DeepSeek AI enhancement
- Multi-pass refinement system
- Domain adaptation (technical/academic/medical)
- Speaker diarization
- Batch processing with parallel workers
- Export to TXT/SRT/VTT/JSON
- PostgreSQL database with migrations
- Comprehensive test suite
- Enhanced CLI with progress tracking
🚧 In Progress
- Research agent UI (Streamlit)
- Vector search integration (ChromaDB/FAISS)
- Advanced speaker profiles
Task Master Integration
Task Master commands for project management:
# View current tasks
task-master list
task-master next
task-master show <id>
# Update task status
task-master set-status --id=<id> --status=done
task-master update-subtask --id=<id> --prompt="implementation notes"
See .taskmaster/CLAUDE.md for full Task Master workflow integration.
Common Workflows
Adding New Feature
# 1. Get task details
task-master show <task-id>
# 2. Write tests first
# Create test file with comprehensive test cases
# 3. Implement minimal code
# Write code to pass tests
# 4. Validate quality
uv run pytest && uv run black src/ tests/ && uv run ruff check --fix
# 5. Complete
task-master set-status --id=<task-id> --status=done
Fixing Bug
# 1. Reproduce the bug with a failing test
# 2. Fix the code to make test pass
# 3. Validate
uv run pytest && uv run black src/ tests/ && uv run ruff check --fix src/ tests/
# 4. Update status
task-master set-status --id=<task-id> --status=done
Common Issues & Solutions
Database Connection
# Check PostgreSQL status
pg_ctl status -D /usr/local/var/postgres
# Start if needed
pg_ctl start -D /usr/local/var/postgres
FFmpeg Missing
# Install via Homebrew
brew install ffmpeg
API Key Issues
# Verify keys loaded
uv run python -c "from src.config import config; config.display_config_status()"
Missing .env
Check that ../../.env exists in the root project
Import Errors
Run uv pip install -e ".[dev]"
Type Errors
Run uv run mypy src/
Formatting Issues
Run uv run black src/ tests/
Anti-Patterns to Avoid
❌ DON'T: Skip Understanding
- Jumping straight to coding without requirements
- Not reading task details or context
- Ignoring existing code patterns
❌ DON'T: Skip Testing
- Writing code before tests
- Incomplete test coverage
- Not testing edge cases
❌ DON'T: Ignore Quality
- Large, monolithic code files (>350 lines without justification)
- Documentation files exceeding 600 lines
- Poor formatting or linting errors
- Not following project patterns
❌ DON'T: Over-Engineer
- Complex abstractions when simple works
- Multiple layers when one suffices
- Premature optimization
Success Metrics
Code Quality
- All tests pass
- Code files under LOC limits (300 lines target, 350 max)
- Documentation under 550 lines (600 max for essentials)
- No linting errors
- Consistent formatting
Development Speed
- Clear understanding of requirements
- Tests written first
- Minimal viable implementation
- Quick validation cycles
Maintainability
- Small, focused files
- Clear separation of concerns
- Consistent patterns
- Good test coverage
Memory Management
Claude Code Memory (# shortcut)
Use # to save important context:
#remember Using distil-large-v3 for M3 optimization
#remember PostgreSQL 15+ with JSONB for flexible storage
#remember 8 parallel workers optimal for batch processing
Memory Levels
- Project-level: Saved to CLAUDE.md in the project root
- User-level: Saved globally across all projects
- Session-level: Saved to .claude/context/session.md
What to Remember
- Architecture decisions: Model choices, database patterns
- Performance targets: Processing times, accuracy goals
- Configuration: API keys, service endpoints
- Conventions: Naming patterns, file organization
- Dependencies: Required packages, versions
Cursor Rules
Key rules from .cursor/rules/:
- agent_workflow.mdc - Simplified TDD workflow (single source of truth)
- progressive-enhancement.mdc - Iterative refinement approach
- utc-timestamps.mdc - Timestamp handling standards
- low-loc.mdc - Low Line of Code patterns (300 line target for code, 550 for docs)
Parallel Development
Git worktrees enable parallel development across features:
- Setup: Run .claude/scripts/setup_worktrees.sh
- 5 Default Worktrees: features, testing, docs, performance, bugfix
- Switch: Use /Users/enias/projects/my-ai-projects/apps/trax-worktrees/switch.sh
- Status: Check all with trax-worktrees/status.sh
- Full Guide: Parallel Development Workflow
Architecture Version: 2.0 | Python 3.11+ | PostgreSQL 15+ | FFmpeg 6.0+
Remember: Keep it simple. Follow @.cursor/rules/agent_workflow.mdc: Understand → Test → Implement → Validate → Complete.