# CLAUDE.md
This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.
**📌 PRIMARY WORKFLOW**: @.cursor/rules/agent_workflow.mdc - Single source of truth for all development patterns
## Project Context
Trax is a production-ready media transcription platform within the my-ai-projects ecosystem. It uses Whisper for transcription with domain-specific AI enhancement, optimized for M3 MacBook performance.
**Core Architecture**: Download-first media processing → Whisper transcription → DeepSeek enhancement → Multi-format export
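That pipeline can be sketched as four composable stages. This is a minimal synchronous sketch with stand-in names and stubbed bodies; the real services are async and live in `src/services/`:

```python
from dataclasses import dataclass

@dataclass
class Transcript:
    text: str
    enhanced: bool = False

def download(url: str) -> str:
    """Download-first: fetch media to a local path before any processing."""
    return f"/tmp/{url.rsplit('/', 1)[-1]}.mp3"  # placeholder local path

def transcribe(local_path: str) -> Transcript:
    """Whisper transcription stage (stubbed)."""
    return Transcript(text=f"transcript of {local_path}")

def enhance(transcript: Transcript) -> Transcript:
    """DeepSeek enhancement stage (stubbed)."""
    transcript.enhanced = True
    return transcript

def export(transcript: Transcript, fmt: str = "txt") -> str:
    """Multi-format export stage."""
    return f"{transcript.text} [{fmt}]"

# The four stages compose linearly: download -> transcribe -> enhance -> export
result = export(enhance(transcribe(download("https://example.com/talk"))))
```

The point of the linear composition is that each stage only depends on the previous stage's output, so stages can be swapped or tested in isolation.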
## Core Development Principles
From @.cursor/rules/agent_workflow.mdc:
- **Keep It Simple**: One workflow, clear patterns, no complex hierarchies
- **Context First**: Always understand what you're building before coding
- **Test First**: Write tests before implementation
- **Quality Built-In**: Enforce standards as you go, not as separate phases
- **Progressive Enhancement**: Start simple, add complexity only when needed
## Quick Decision Tree
### Request Type → Action
- **Question/How-to**: Answer directly with code examples
- **Implementation Request**: Follow TDD workflow below
- **Server/Command**: Execute appropriate command
- **Analysis/Review**: Examine code and provide feedback
## Enhanced TDD Workflow with Planning
From @.cursor/rules/agent_workflow.mdc with spec-driven development:
```
1. Plan (Spec-First) → 2. Understand Requirements → 3. Write Tests → 4. Implement → 5. Validate → 6. Done
```
### MANDATORY: Plan Mode First
- **Always enter plan mode** before implementing any feature
- Create detailed plan in `.claude/tasks/<feature-name>.md`
- Break down into phases with clear deliverables
- Update plan as you progress
- Plan should include: requirements, architecture, test strategy, implementation phases
## Essential Commands
### Environment Setup
```bash
# Navigate to project and activate environment
cd /Users/enias/projects/my-ai-projects/apps/trax
source .venv/bin/activate
# Install/update dependencies with uv (10-100x faster than pip)
uv pip install -e ".[dev]"
```
### Step 1: Plan Mode (Spec-First)
```bash
# Enter plan mode and create detailed spec
# In Claude Code: Shift+Tab twice to enter plan mode
# Create plan at: .claude/tasks/<feature-name>.md
# Include: requirements, phases, architecture, test strategy
```
### Step 2: Understand Requirements
```bash
# Get task details and context
task-master show <task-id> # Get task details
./scripts/tm_context.sh get <task-id> # Get cached context
```
### Step 3: Write Tests First
```bash
# Run tests with coverage
uv run pytest # All tests
uv run pytest tests/test_transcription_service.py -v # Specific test file
uv run pytest -k "test_multi_pass" -v # Tests matching pattern
uv run pytest -m unit # Unit tests only
```
### Step 4: Implement Minimal Code
```bash
# Development server
uv run python src/main.py # Start development server
```
### Step 5: Validate Quality
```bash
# Code quality
uv run black src/ tests/ # Format code
uv run ruff check --fix src/ tests/ # Lint and auto-fix
uv run mypy src/ # Type checking
./scripts/validate_loc.sh # Check file sizes
# Database operations
uv run alembic upgrade head # Apply migrations
uv run alembic revision --autogenerate -m "description" # Create migration
```
### Step 6: Complete Task & Update Plan
```bash
# Update plan with results
# Document in .claude/tasks/<feature-name>.md what was completed
task-master set-status --id=<task-id> --status=done
./scripts/tm_cache.sh update <task-id>
./scripts/update_changelog.sh <task-id> --type=task
```
### CLI Commands
```bash
# Standard transcription
uv run python -m src.cli.main transcribe audio.mp3 # Basic transcription
uv run python -m src.cli.main transcribe audio.mp3 --v2 # AI-enhanced (99% accuracy)
# Enhanced CLI (recommended for production)
uv run python -m src.cli.enhanced_cli transcribe audio.mp3 --multi-pass --confidence-threshold 0.9
uv run python -m src.cli.enhanced_cli transcribe lecture.mp3 --domain academic --diarize
uv run python -m src.cli.enhanced_cli batch /path/to/files --parallel 8
# YouTube processing
uv run python -m src.cli.main youtube https://youtube.com/watch?v=VIDEO_ID
uv run python -m src.cli.main batch-urls urls.txt --output-dir transcripts/
```
## Project Structure
```
trax/
├── src/                          # Main application code
│   ├── services/                 # Core business logic (protocol-based)
│   │   ├── protocols.py          # Service interfaces
│   │   ├── transcription_service.py
│   │   ├── multi_pass_transcription.py
│   │   ├── domain_enhancement.py
│   │   ├── batch_processor.py
│   │   └── export_service.py
│   ├── database/                 # Data layer
│   │   ├── models.py             # Core SQLAlchemy models
│   │   ├── v2_models.py          # Extended v2 features
│   │   └── repositories/         # Data access patterns
│   ├── cli/                      # Command-line interfaces
│   │   ├── main.py               # Standard CLI
│   │   └── enhanced_cli.py       # Advanced CLI with progress
│   ├── api/                      # REST API endpoints (future)
│   ├── utils/                    # Shared utilities
│   └── config.py                 # Configuration (inherits from ../../.env)
├── tests/                        # Test suite
│   ├── fixtures/                 # Real test media files
│   │   ├── audio/                # Sample audio files
│   │   └── video/                # Sample video files
│   ├── conftest.py               # Pytest configuration
│   └── test_*.py                 # Test files
├── scripts/                      # Utility scripts
│   ├── validate_loc.sh           # File size validation
│   ├── tm_context.sh             # Task context caching
│   └── update_changelog.sh
├── .cursor/rules/                # Cursor AI rules
│   ├── agent_workflow.mdc        # Main workflow (single source)
│   └── *.mdc                     # Supporting rules
├── .taskmaster/                  # Task Master configuration
│   ├── tasks/                    # Task files
│   ├── docs/                     # PRD and documentation
│   └── config.json               # AI model configuration
├── .venv/                        # Virtual environment (gitignored)
├── pyproject.toml                # Package configuration (uv)
├── CLAUDE.md                     # This file
└── AGENTS.md                     # Development rules
```
## High-Level Architecture
### Service Layer (`src/services/`)
The core processing logic uses **protocol-based design** for modularity:
```python
# All services implement protocols for clean interfaces
from src.services.protocols import TranscriptionProtocol, EnhancementProtocol

# Key services:
#   transcription_service.py    - Whisper integration (20-70x faster on M3)
#   multi_pass_transcription.py - Iterative refinement for 99.5% accuracy
#   domain_enhancement.py       - AI enhancement with domain adaptation
#   batch_processor.py          - Parallel processing (8 workers optimal)
#   export_service.py           - Multi-format export (TXT, SRT, VTT, JSON)
```
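A minimal sketch of how such a protocol can be defined and satisfied. The names and fields here are illustrative; the real interfaces live in `src/services/protocols.py`:

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol, runtime_checkable

@dataclass
class TranscriptResult:
    text: str
    confidence: float

@runtime_checkable
class TranscriptionProtocol(Protocol):
    async def transcribe(self, file_path: Path) -> TranscriptResult: ...

# Structural typing: any class with a matching transcribe() conforms,
# no inheritance required -- handy for swapping in test doubles.
class FakeTranscriber:
    async def transcribe(self, file_path: Path) -> TranscriptResult:
        return TranscriptResult(text=file_path.stem, confidence=1.0)

assert isinstance(FakeTranscriber(), TranscriptionProtocol)
```

Because conformance is structural, tests can substitute lightweight fakes without touching production classes.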
### Performance Optimizations
- **Memory Management**: `memory_optimization.py` - Automatic cleanup, chunked processing
- **Speed**: `speed_optimization.py` - M3-specific optimizations, distil-large-v3 model
- **Domain Adaptation**: `domain_adaptation.py` - Technical/academic/medical terminology
- **Caching**: Multi-layer caching with different TTLs per data type
### Database Layer (`src/database/`)
PostgreSQL with SQLAlchemy ORM:
- `models.py` - Core models (MediaFile, Transcript, Enhancement)
- `v2_models.py` - Extended models for v2 features
- `repositories/` - Data access patterns with protocol compliance
### Testing Strategy (`tests/`)
**Real-file testing** - No mocks, actual media files:
```python
# tests/conftest.py provides real test fixtures
from pathlib import Path

import pytest

@pytest.fixture
def sample_audio_5s():
    return Path("tests/fixtures/audio/sample_5s.wav")
```
## Configuration System
Inherits from root project `.env` at `../../.env`:
```python
from src.config import config
# All API keys available as attributes
api_key = config.DEEPSEEK_API_KEY
services = config.get_available_ai_services()
```
## File Organization Rules
### File Size Limits
- **Code Files** (.py, .ts, .js):
- Target: Under 300 lines
- Maximum: 350 lines (only with clear justification)
- Exceptions: Complex algorithms, comprehensive test suites
- **Documentation** (.md, .txt):
- Target: Under 550 lines
- Maximum: 600 lines for essential docs (CLAUDE.md, README.md)
- **Single Responsibility**: One service/component per file
- **Protocol-Based**: Use typing.Protocol for service interfaces
### Example Structure
```python
# transcription_service.py - Only transcription logic (50-100 lines)
class TranscriptionService(TranscriptionProtocol):
    async def transcribe(self, file_path: Path) -> TranscriptResult:
        # Focused implementation
        pass

# audio_processor.py - Only audio processing logic (50-100 lines)
class AudioProcessor(AudioProtocol):
    def process_audio(self, audio_data) -> ProcessedAudio:
        # Focused implementation
        pass
```
## Key Implementation Patterns
### 1. Download-First Architecture
```python
# Always download media before processing
downloader = MediaDownloadService()
local_path = await downloader.download(url)
result = await transcriber.transcribe(local_path)
```
### 2. Test-First Development
```python
# Write a test that defines the interface
def test_transcription_service():
    service = TranscriptionService()
    result = service.transcribe_audio("test.wav")
    assert result.text is not None
    assert result.confidence > 0.8

# THEN implement to make the test pass
```
### 3. Multi-Pass Refinement (v2)
```python
# Iterative improvement for 99.5% accuracy
service = MultiPassTranscriptionService()
result = await service.transcribe_with_passes(
    file_path,
    min_confidence=0.9,
    max_passes=3,
)
```
### 4. Batch Processing
```python
# Optimized for M3 with 8 parallel workers
processor = BatchProcessor(max_workers=8)
results = await processor.process_batch(file_paths)
```
## Performance Targets
- 5-minute audio: <30 seconds processing
- 95% accuracy (v1), 99% accuracy (v2)
- <1 second CLI response time
- Support files up to 500MB
- 8 parallel workers on M3
## Current Implementation Status
### ✅ Completed
- Whisper transcription with distil-large-v3
- DeepSeek AI enhancement
- Multi-pass refinement system
- Domain adaptation (technical/academic/medical)
- Speaker diarization
- Batch processing with parallel workers
- Export to TXT/SRT/VTT/JSON
- PostgreSQL database with migrations
- Comprehensive test suite
- Enhanced CLI with progress tracking
### 🚧 In Progress
- Research agent UI (Streamlit)
- Vector search integration (ChromaDB/FAISS)
- Advanced speaker profiles
## Task Master Integration
Task Master commands for project management:
```bash
# View current tasks
task-master list
task-master next
task-master show <id>
# Update task status
task-master set-status --id=<id> --status=done
task-master update-subtask --id=<id> --prompt="implementation notes"
```
See `.taskmaster/CLAUDE.md` for full Task Master workflow integration.
## Common Workflows
### Adding New Feature
```bash
# 1. Get task details
task-master show <task-id>
# 2. Write tests first
# Create test file with comprehensive test cases
# 3. Implement minimal code
# Write code to pass tests
# 4. Validate quality
uv run pytest && uv run black src/ tests/ && uv run ruff check --fix
# 5. Complete
task-master set-status --id=<task-id> --status=done
```
### Fixing Bug
```bash
# 1. Reproduce the bug with a failing test
# 2. Fix the code to make test pass
# 3. Validate
uv run pytest && uv run black src/ tests/ && uv run ruff check --fix src/ tests/
# 4. Update status
task-master set-status --id=<task-id> --status=done
```
## Common Issues & Solutions
### Database Connection
```bash
# Check PostgreSQL status
pg_ctl status -D /usr/local/var/postgres
# Start if needed
pg_ctl start -D /usr/local/var/postgres
```
### FFmpeg Missing
```bash
# Install via Homebrew
brew install ffmpeg
```
### API Key Issues
```bash
# Verify keys loaded
uv run python -c "from src.config import config; config.display_config_status()"
```
### Missing .env
Check that `../../.env` exists in the root project
### Import Errors
Run `uv pip install -e ".[dev]"`
### Type Errors
Run `uv run mypy src/`
### Formatting Issues
Run `uv run black src/ tests/`
## Anti-Patterns to Avoid
### ❌ DON'T: Skip Understanding
- Jumping straight to coding without requirements
- Not reading task details or context
- Ignoring existing code patterns
### ❌ DON'T: Skip Testing
- Writing code before tests
- Incomplete test coverage
- Not testing edge cases
### ❌ DON'T: Ignore Quality
- Large, monolithic code files (>350 lines without justification)
- Documentation files exceeding 600 lines
- Poor formatting or linting errors
- Not following project patterns
### ❌ DON'T: Over-Engineer
- Complex abstractions when simple works
- Multiple layers when one suffices
- Premature optimization
## Success Metrics
### Code Quality
- All tests pass
- Code files under LOC limits (300 lines target, 350 max)
- Documentation under 550 lines (600 max for essentials)
- No linting errors
- Consistent formatting
### Development Speed
- Clear understanding of requirements
- Tests written first
- Minimal viable implementation
- Quick validation cycles
### Maintainability
- Small, focused files
- Clear separation of concerns
- Consistent patterns
- Good test coverage
## Memory Management
### Claude Code Memory (# shortcut)
Use `#` to save important context:
```
#remember Using distil-large-v3 for M3 optimization
#remember PostgreSQL 15+ with JSONB for flexible storage
#remember 8 parallel workers optimal for batch processing
```
### Memory Levels
- **Project-level**: Saved to `.claude.md` in project root
- **User-level**: Saved globally across all projects
- **Session-level**: Saved to `.claude/context/session.md`
### What to Remember
- **Architecture decisions**: Model choices, database patterns
- **Performance targets**: Processing times, accuracy goals
- **Configuration**: API keys, service endpoints
- **Conventions**: Naming patterns, file organization
- **Dependencies**: Required packages, versions
## Cursor Rules
Key rules from `.cursor/rules/`:
- **agent_workflow.mdc** - Simplified TDD workflow (single source of truth)
- **progressive-enhancement.mdc** - Iterative refinement approach
- **utc-timestamps.mdc** - Timestamp handling standards
- **low-loc.mdc** - Low Line of Code patterns (300 line target for code, 550 for docs)
## Parallel Development
Git worktrees enable parallel development across features:
- **Setup**: Run `.claude/scripts/setup_worktrees.sh`
- **5 Default Worktrees**: features, testing, docs, performance, bugfix
- **Switch**: Use `/Users/enias/projects/my-ai-projects/apps/trax-worktrees/switch.sh`
- **Status**: Check all with `trax-worktrees/status.sh`
- **Full Guide**: [Parallel Development Workflow](.claude/docs/parallel-development-workflow.md)
---
*Architecture Version: 2.0 | Python 3.11+ | PostgreSQL 15+ | FFmpeg 6.0+*
**Remember**: Keep it simple. Follow @.cursor/rules/agent_workflow.mdc: Understand → Test → Implement → Validate → Complete.