CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

📌 PRIMARY WORKFLOW: @.cursor/rules/agent_workflow.mdc - Single source of truth for all development patterns

Project Context

Trax is a production-ready media transcription platform within the my-ai-projects ecosystem. It uses Whisper for transcription with domain-specific AI enhancement, optimized for M3 MacBook performance.

Core Architecture: Download-first media processing → Whisper transcription → DeepSeek enhancement → Multi-format export
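
A minimal sketch of how those stages chain together (illustrative glue code only; each stage's service object is injected as a parameter, so no concrete class names or signatures from src/services/ are assumed):

# Hypothetical pipeline glue -- each stage is an injected service object.
async def process_media(url, downloader, transcriber, enhancer, exporter):
    local_path = await downloader.download(url)             # 1. download first
    transcript = await transcriber.transcribe(local_path)   # 2. Whisper pass
    enhanced = await enhancer.enhance(transcript)           # 3. DeepSeek enhancement
    return exporter.export(enhanced, formats=("txt", "srt", "vtt", "json"))  # 4. export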

Core Development Principles

From @.cursor/rules/agent_workflow.mdc:

  • Keep It Simple: One workflow, clear patterns, no complex hierarchies
  • Context First: Always understand what you're building before coding
  • Test First: Write tests before implementation
  • Quality Built-In: Enforce standards as you go, not as separate phases
  • Progressive Enhancement: Start simple, add complexity only when needed

Quick Decision Tree

Request Type → Action

  • Question/How-to: Answer directly with code examples
  • Implementation Request: Follow TDD workflow below
  • Server/Command: Execute appropriate command
  • Analysis/Review: Examine code and provide feedback

Enhanced TDD Workflow with Planning

From @.cursor/rules/agent_workflow.mdc with spec-driven development:

1. Plan (Spec-First) → 2. Understand Requirements → 3. Write Tests → 4. Implement → 5. Validate → 6. Done

MANDATORY: Plan Mode First

  • Always enter plan mode before implementing any feature
  • Create detailed plan in .claude/tasks/<feature-name>.md
  • Break down into phases with clear deliverables
  • Update plan as you progress
  • Plan should include: requirements, architecture, test strategy, implementation phases
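
A plan file might be laid out like this (an illustrative template, not a mandated format):

# .claude/tasks/<feature-name>.md -- illustrative layout

## Requirements
- User-visible behavior and acceptance criteria

## Architecture
- Services and files touched, protocols affected

## Test Strategy
- Fixtures needed, unit vs. integration split

## Implementation Phases
1. Phase 1 deliverable
2. Phase 2 deliverable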

Essential Commands

Environment Setup

# Navigate to project and activate environment
cd /Users/enias/projects/my-ai-projects/apps/trax
source .venv/bin/activate

# Install/update dependencies with uv (10-100x faster than pip)
uv pip install -e ".[dev]"

Step 1: Plan Mode (Spec-First)

# Enter plan mode and create detailed spec
# In Claude Code: Shift+Tab twice to enter plan mode
# Create plan at: .claude/tasks/<feature-name>.md
# Include: requirements, phases, architecture, test strategy

Step 2: Understand Requirements

# Get task details and context
task-master show <task-id>                      # Get task details
./scripts/tm_context.sh get <task-id>          # Get cached context

Step 3: Write Tests First

# Run tests with coverage
uv run pytest                                    # All tests
uv run pytest tests/test_transcription_service.py -v  # Specific test file
uv run pytest -k "test_multi_pass" -v           # Tests matching pattern
uv run pytest -m unit                           # Unit tests only

Step 4: Implement Minimal Code

# Development server
uv run python src/main.py                       # Start development server

Step 5: Validate Quality

# Code quality
uv run black src/ tests/                        # Format code
uv run ruff check --fix src/ tests/            # Lint and auto-fix
uv run mypy src/                               # Type checking
./scripts/validate_loc.sh                       # Check file sizes

# Database operations
uv run alembic upgrade head                     # Apply migrations
uv run alembic revision --autogenerate -m "description"  # Create migration

Step 6: Complete Task & Update Plan

# Update plan with results
# Document in .claude/tasks/<feature-name>.md what was completed
task-master set-status --id=<task-id> --status=done
./scripts/tm_cache.sh update <task-id>
./scripts/update_changelog.sh <task-id> --type=task

CLI Commands

# Standard transcription
uv run python -m src.cli.main transcribe audio.mp3         # Basic transcription
uv run python -m src.cli.main transcribe audio.mp3 --v2    # AI-enhanced (99% accuracy)

# Enhanced CLI (recommended for production)
uv run python -m src.cli.enhanced_cli transcribe audio.mp3 --multi-pass --confidence-threshold 0.9
uv run python -m src.cli.enhanced_cli transcribe lecture.mp3 --domain academic --diarize
uv run python -m src.cli.enhanced_cli batch /path/to/files --parallel 8

# YouTube processing
uv run python -m src.cli.main youtube https://youtube.com/watch?v=VIDEO_ID
uv run python -m src.cli.main batch-urls urls.txt --output-dir transcripts/

Project Structure

trax/
├── src/                    # Main application code
│   ├── services/          # Core business logic (protocol-based)
│   │   ├── protocols.py  # Service interfaces
│   │   ├── transcription_service.py
│   │   ├── multi_pass_transcription.py
│   │   ├── domain_enhancement.py
│   │   ├── batch_processor.py
│   │   └── export_service.py
│   ├── database/          # Data layer
│   │   ├── models.py     # Core SQLAlchemy models
│   │   ├── v2_models.py  # Extended v2 features
│   │   └── repositories/ # Data access patterns
│   ├── cli/              # Command-line interfaces
│   │   ├── main.py       # Standard CLI
│   │   └── enhanced_cli.py # Advanced CLI with progress
│   ├── api/              # REST API endpoints (future)
│   ├── utils/            # Shared utilities
│   └── config.py         # Configuration (inherits from ../../.env)
├── tests/                 # Test suite
│   ├── fixtures/         # Real test media files
│   │   ├── audio/       # Sample audio files
│   │   └── video/       # Sample video files
│   ├── conftest.py      # Pytest configuration
│   └── test_*.py        # Test files
├── scripts/              # Utility scripts
│   ├── validate_loc.sh  # File size validation
│   ├── tm_context.sh    # Task context caching
│   └── update_changelog.sh
├── .cursor/rules/        # Cursor AI rules
│   ├── agent_workflow.mdc # Main workflow (single source)
│   └── *.mdc            # Supporting rules
├── .taskmaster/          # Task Master configuration
│   ├── tasks/           # Task files
│   ├── docs/           # PRD and documentation
│   └── config.json     # AI model configuration
├── .venv/               # Virtual environment (gitignored)
├── pyproject.toml       # Package configuration (uv)
├── CLAUDE.md           # This file
└── AGENTS.md           # Development rules

High-Level Architecture

Service Layer (src/services/)

The core processing logic uses protocol-based design for modularity:

# All services implement protocols for clean interfaces
from src.services.protocols import TranscriptionProtocol, EnhancementProtocol

Key services:

  • transcription_service.py - Whisper integration (20-70x faster on M3)
  • multi_pass_transcription.py - Iterative refinement for 99.5% accuracy
  • domain_enhancement.py - AI enhancement with domain adaptation
  • batch_processor.py - Parallel processing (8 workers optimal)
  • export_service.py - Multi-format export (TXT, SRT, VTT, JSON)
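
A protocol in protocols.py might be declared like this (a sketch: the method name and result type follow the examples later in this file, but the real signatures may differ):

from pathlib import Path
from typing import Protocol

class TranscriptionProtocol(Protocol):
    """Structural interface every transcription backend must satisfy."""
    async def transcribe(self, file_path: Path) -> "TranscriptResult": ...

Any class with a matching transcribe() method satisfies the protocol structurally, no inheritance required, which is what keeps the services swappable.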

Performance Optimizations

  • Memory Management: memory_optimization.py - Automatic cleanup, chunked processing
  • Speed: speed_optimization.py - M3-specific optimizations, distil-large-v3 model
  • Domain Adaptation: domain_adaptation.py - Technical/academic/medical terminology
  • Caching: Multi-layer caching with different TTLs per data type
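
The per-type TTL idea can be sketched with a tiny in-memory cache (purely illustrative; the real cache layers and TTL values are assumptions):

import time

# Illustrative TTLs per data type, in seconds -- the real values are assumptions.
TTLS = {"transcript": 24 * 3600, "enhancement": 3600, "metadata": 300}
_cache: dict[str, tuple[float, object]] = {}

def cache_set(kind: str, key: str, value: object) -> None:
    _cache[f"{kind}:{key}"] = (time.monotonic() + TTLS[kind], value)

def cache_get(kind: str, key: str):
    entry = _cache.get(f"{kind}:{key}")
    if entry and time.monotonic() < entry[0]:
        return entry[1]
    return None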

Database Layer (src/database/)

PostgreSQL with SQLAlchemy ORM:

  • models.py - Core models (MediaFile, Transcript, Enhancement)
  • v2_models.py - Extended models for v2 features
  • repositories/ - Data access patterns with protocol compliance
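
A repository in this layer might look like the following sketch (Transcript is the core model from models.py; the foreign-key column name is an assumption):

from sqlalchemy.orm import Session

from src.database.models import Transcript

class TranscriptRepository:
    """Keeps ORM queries out of the service layer."""

    def __init__(self, session: Session) -> None:
        self._session = session

    def get_for_media_file(self, media_file_id: int) -> list[Transcript]:
        # media_file_id is a hypothetical foreign-key column.
        return (
            self._session.query(Transcript)
            .filter_by(media_file_id=media_file_id)
            .all()
        )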

Testing Strategy (tests/)

Real-file testing - No mocks, actual media files:

# tests/conftest.py provides real test fixtures
@pytest.fixture
def sample_audio_5s():
    return Path("tests/fixtures/audio/sample_5s.wav")
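
A test consuming that fixture might look like this (illustrative; the import path and service API mirror the test-first example later in this file):

from src.services.transcription_service import TranscriptionService

def test_transcribe_real_audio(sample_audio_5s):
    # Exercises Whisper against a real 5-second WAV -- no mocks.
    service = TranscriptionService()
    result = service.transcribe_audio(str(sample_audio_5s))
    assert result.text  # a real file should yield a non-empty transcript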

Configuration System

Inherits from root project .env at ../../.env:

from src.config import config

# All API keys available as attributes
api_key = config.DEEPSEEK_API_KEY
services = config.get_available_ai_services()

File Organization Rules

File Size Limits

  • Code Files (.py, .ts, .js):
    • Target: Under 300 lines
    • Maximum: 350 lines (only with clear justification)
    • Exceptions: Complex algorithms, comprehensive test suites
  • Documentation (.md, .txt):
    • Target: Under 550 lines
    • Maximum: 600 lines for essential docs (CLAUDE.md, README.md)
  • Single Responsibility: One service/component per file
  • Protocol-Based: Use typing.Protocol for service interfaces

Example Structure

# transcription_service.py - Only transcription logic (50-100 lines)
class TranscriptionService(TranscriptionProtocol):
    async def transcribe(self, file_path: Path) -> TranscriptResult:
        # Focused implementation
        pass

# audio_processor.py - Only audio processing logic (50-100 lines)
class AudioProcessor(AudioProtocol):
    def process_audio(self, audio_data) -> ProcessedAudio:
        # Focused implementation
        pass

Key Implementation Patterns

1. Download-First Architecture

# Always download media before processing
downloader = MediaDownloadService()
local_path = await downloader.download(url)
result = await transcriber.transcribe(local_path)

2. Test-First Development

# Write test that defines the interface
def test_transcription_service():
    service = TranscriptionService()
    result = service.transcribe_audio("test.wav")
    assert result.text is not None
    assert result.confidence > 0.8
# THEN implement to make test pass

3. Multi-Pass Refinement (v2)

# Iterative improvement for 99.5% accuracy
service = MultiPassTranscriptionService()
result = await service.transcribe_with_passes(
    file_path, 
    min_confidence=0.9,
    max_passes=3
)

4. Batch Processing

# Optimized for M3 with 8 parallel workers
processor = BatchProcessor(max_workers=8)
results = await processor.process_batch(file_paths)

Performance Targets

  • 5-minute audio: <30 seconds processing
  • 95% accuracy (v1), 99% accuracy (v2)
  • <1 second CLI response time
  • Support files up to 500MB
  • 8 parallel workers on M3
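
One way to keep the first target honest is to encode it as a marked test (illustrative only: the fixture, marker, and service wiring here are hypothetical):

import time

import pytest

@pytest.mark.performance  # hypothetical marker -- register it in pyproject.toml to silence warnings
def test_five_minute_audio_under_30s(sample_audio_5m, transcription_service):
    start = time.perf_counter()
    transcription_service.transcribe_audio(str(sample_audio_5m))
    assert time.perf_counter() - start < 30.0  # 5-minute audio: <30 s target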

Current Implementation Status

✅ Completed

  • Whisper transcription with distil-large-v3
  • DeepSeek AI enhancement
  • Multi-pass refinement system
  • Domain adaptation (technical/academic/medical)
  • Speaker diarization
  • Batch processing with parallel workers
  • Export to TXT/SRT/VTT/JSON
  • PostgreSQL database with migrations
  • Comprehensive test suite
  • Enhanced CLI with progress tracking

🚧 In Progress

  • Research agent UI (Streamlit)
  • Vector search integration (ChromaDB/FAISS)
  • Advanced speaker profiles

Task Master Integration

Task Master commands for project management:

# View current tasks
task-master list
task-master next
task-master show <id>

# Update task status
task-master set-status --id=<id> --status=done
task-master update-subtask --id=<id> --prompt="implementation notes"

See .taskmaster/CLAUDE.md for full Task Master workflow integration.

Common Workflows

Adding New Feature

# 1. Get task details
task-master show <task-id>

# 2. Write tests first
# Create test file with comprehensive test cases

# 3. Implement minimal code
# Write code to pass tests

# 4. Validate quality
uv run pytest && uv run black src/ tests/ && uv run ruff check --fix

# 5. Complete
task-master set-status --id=<task-id> --status=done

Fixing Bug

# 1. Reproduce the bug with a failing test

# 2. Fix the code to make test pass

# 3. Validate
uv run pytest && uv run black src/ tests/ && uv run ruff check --fix src/ tests/

# 4. Update status
task-master set-status --id=<task-id> --status=done

Common Issues & Solutions

Database Connection

# Check PostgreSQL status
pg_ctl status -D /usr/local/var/postgres
# Start if needed
pg_ctl start -D /usr/local/var/postgres

FFmpeg Missing

# Install via Homebrew
brew install ffmpeg

API Key Issues

# Verify keys loaded
uv run python -c "from src.config import config; config.display_config_status()"

Missing .env

Check that ../../.env exists in the root project

Import Errors

Run uv pip install -e ".[dev]"

Type Errors

Run uv run mypy src/

Formatting Issues

Run uv run black src/ tests/

Anti-Patterns to Avoid

DON'T: Skip Understanding

  • Jumping straight to coding without requirements
  • Not reading task details or context
  • Ignoring existing code patterns

DON'T: Skip Testing

  • Writing code before tests
  • Incomplete test coverage
  • Not testing edge cases

DON'T: Ignore Quality

  • Large, monolithic code files (>350 lines without justification)
  • Documentation files exceeding 600 lines
  • Poor formatting or linting errors
  • Not following project patterns

DON'T: Over-Engineer

  • Complex abstractions when simple works
  • Multiple layers when one suffices
  • Premature optimization

Success Metrics

Code Quality

  • All tests pass
  • Code files under LOC limits (300 lines target, 350 max)
  • Documentation under 550 lines (600 max for essentials)
  • No linting errors
  • Consistent formatting

Development Speed

  • Clear understanding of requirements
  • Tests written first
  • Minimal viable implementation
  • Quick validation cycles

Maintainability

  • Small, focused files
  • Clear separation of concerns
  • Consistent patterns
  • Good test coverage

Memory Management

Claude Code Memory (# shortcut)

Use # to save important context:

#remember Using distil-large-v3 for M3 optimization
#remember PostgreSQL 15+ with JSONB for flexible storage
#remember 8 parallel workers optimal for batch processing

Memory Levels

  • Project-level: Saved to CLAUDE.md in the project root
  • User-level: Saved globally across all projects
  • Session-level: Saved to .claude/context/session.md

What to Remember

  • Architecture decisions: Model choices, database patterns
  • Performance targets: Processing times, accuracy goals
  • Configuration: API keys, service endpoints
  • Conventions: Naming patterns, file organization
  • Dependencies: Required packages, versions

Cursor Rules

Key rules from .cursor/rules/:

  • agent_workflow.mdc - Simplified TDD workflow (single source of truth)
  • progressive-enhancement.mdc - Iterative refinement approach
  • utc-timestamps.mdc - Timestamp handling standards
  • low-loc.mdc - Low lines-of-code patterns (300-line target for code, 550 for docs)

Parallel Development

Git worktrees enable parallel development across features:

  • Setup: Run .claude/scripts/setup_worktrees.sh
  • 5 Default Worktrees: features, testing, docs, performance, bugfix
  • Switch: Use /Users/enias/projects/my-ai-projects/apps/trax-worktrees/switch.sh
  • Status: Check all with trax-worktrees/status.sh
  • Full Guide: Parallel Development Workflow

Architecture Version: 2.0 | Python 3.11+ | PostgreSQL 15+ | FFmpeg 6.0+

Remember: Keep it simple. Follow @.cursor/rules/agent_workflow.mdc: Understand → Test → Implement → Validate → Complete.