CLAUDE.md

This file provides guidance to Claude Code (claude.ai/code) when working with code in this repository.

📌 PRIMARY WORKFLOW: @.cursor/rules/agent_workflow.mdc - Single source of truth for all development patterns

Project Context

Trax is a production-ready media transcription platform within the my-ai-projects ecosystem. It uses Whisper for transcription with domain-specific AI enhancement, optimized for M3 MacBook performance.

Core Architecture: Download-first media processing → Whisper transcription → DeepSeek enhancement → Multi-format export
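
A minimal sketch of how those stages chain together (illustrative glue code only; each stage's service object is injected as a parameter, so no concrete class names or signatures from src/services/ are assumed):

# Hypothetical pipeline glue -- each stage is an injected service object.
async def process_media(url, downloader, transcriber, enhancer, exporter):
    local_path = await downloader.download(url)             # 1. download first
    transcript = await transcriber.transcribe(local_path)   # 2. Whisper pass
    enhanced = await enhancer.enhance(transcript)           # 3. DeepSeek enhancement
    return exporter.export(enhanced, formats=("txt", "srt", "vtt", "json"))  # 4. export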

Core Development Principles

From @.cursor/rules/agent_workflow.mdc:

  • Keep It Simple: One workflow, clear patterns, no complex hierarchies
  • Context First: Always understand what you're building before coding
  • Test First: Write tests before implementation
  • Quality Built-In: Enforce standards as you go, not as separate phases
  • Progressive Enhancement: Start simple, add complexity only when needed

Quick Decision Tree

Request Type → Action

  • Question/How-to: Answer directly with code examples
  • Implementation Request: Follow TDD workflow below
  • Server/Command: Execute appropriate command
  • Analysis/Review: Examine code and provide feedback

Enhanced TDD Workflow with Planning

From @.cursor/rules/agent_workflow.mdc with spec-driven development:

1. Plan (Spec-First) → 2. Understand Requirements → 3. Write Tests → 4. Implement → 5. Validate → 6. Done

MANDATORY: Plan Mode First

  • Always enter plan mode before implementing any feature
  • Create detailed plan in .claude/tasks/<feature-name>.md
  • Break down into phases with clear deliverables
  • Update plan as you progress
  • Plan should include: requirements, architecture, test strategy, implementation phases
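
A plan file might be laid out like this (an illustrative template, not a mandated format):

# .claude/tasks/<feature-name>.md -- illustrative layout

## Requirements
- User-visible behavior and acceptance criteria

## Architecture
- Services and files touched, protocols affected

## Test Strategy
- Fixtures needed, unit vs. integration split

## Implementation Phases
1. Phase 1 deliverable
2. Phase 2 deliverable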

Essential Commands

Environment Setup

# Navigate to project and activate environment
cd /Users/enias/projects/my-ai-projects/apps/trax
source .venv/bin/activate

# Install/update dependencies with uv (10-100x faster than pip)
uv pip install -e ".[dev]"

Step 1: Plan Mode (Spec-First)

# Enter plan mode and create detailed spec
# In Claude Code: Shift+Tab twice to enter plan mode
# Create plan at: .claude/tasks/<feature-name>.md
# Include: requirements, phases, architecture, test strategy

Step 2: Understand Requirements

# Get task details and context
task-master show <task-id>                      # Get task details
./scripts/tm_context.sh get <task-id>          # Get cached context

Step 3: Write Tests First

# Run tests with coverage
uv run pytest                                    # All tests
uv run pytest tests/test_transcription_service.py -v  # Specific test file
uv run pytest -k "test_multi_pass" -v           # Tests matching pattern
uv run pytest -m unit                           # Unit tests only

Step 4: Implement Minimal Code

# Development server
uv run python src/main.py                       # Start development server

Step 5: Validate Quality

# Code quality
uv run black src/ tests/                        # Format code
uv run ruff check --fix src/ tests/            # Lint and auto-fix
uv run mypy src/                               # Type checking
./scripts/validate_loc.sh                       # Check file sizes

# Database operations
uv run alembic upgrade head                     # Apply migrations
uv run alembic revision --autogenerate -m "description"  # Create migration

Step 6: Complete Task & Update Plan

# Update plan with results
# Document in .claude/tasks/<feature-name>.md what was completed
task-master set-status --id=<task-id> --status=done
./scripts/tm_cache.sh update <task-id>
./scripts/update_changelog.sh <task-id> --type=task

CLI Commands

# Standard transcription
uv run python -m src.cli.main transcribe audio.mp3         # Basic transcription
uv run python -m src.cli.main transcribe audio.mp3 --v2    # AI-enhanced (99% accuracy)

# Enhanced CLI (recommended for production)
uv run python -m src.cli.enhanced_cli transcribe audio.mp3 --multi-pass --confidence-threshold 0.9
uv run python -m src.cli.enhanced_cli transcribe lecture.mp3 --domain academic --diarize
uv run python -m src.cli.enhanced_cli batch /path/to/files --parallel 8

# YouTube processing
uv run python -m src.cli.main youtube https://youtube.com/watch?v=VIDEO_ID
uv run python -m src.cli.main batch-urls urls.txt --output-dir transcripts/

Project Structure

trax/
├── src/                    # Main application code
│   ├── services/          # Core business logic (protocol-based)
│   │   ├── protocols.py  # Service interfaces
│   │   ├── transcription_service.py
│   │   ├── multi_pass_transcription.py
│   │   ├── domain_enhancement.py
│   │   ├── batch_processor.py
│   │   └── export_service.py
│   ├── database/          # Data layer
│   │   ├── models.py     # Core SQLAlchemy models
│   │   ├── v2_models.py  # Extended v2 features
│   │   └── repositories/ # Data access patterns
│   ├── cli/              # Command-line interfaces
│   │   ├── main.py       # Standard CLI
│   │   └── enhanced_cli.py # Advanced CLI with progress
│   ├── api/              # REST API endpoints (future)
│   ├── utils/            # Shared utilities
│   └── config.py         # Configuration (inherits from ../../.env)
├── tests/                 # Test suite
│   ├── fixtures/         # Real test media files
│   │   ├── audio/       # Sample audio files
│   │   └── video/       # Sample video files
│   ├── conftest.py      # Pytest configuration
│   └── test_*.py        # Test files
├── scripts/              # Utility scripts
│   ├── validate_loc.sh  # File size validation
│   ├── tm_context.sh    # Task context caching
│   └── update_changelog.sh
├── .cursor/rules/        # Cursor AI rules
│   ├── agent_workflow.mdc # Main workflow (single source)
│   └── *.mdc            # Supporting rules
├── .taskmaster/          # Task Master configuration
│   ├── tasks/           # Task files
│   ├── docs/           # PRD and documentation
│   └── config.json     # AI model configuration
├── .venv/               # Virtual environment (gitignored)
├── pyproject.toml       # Package configuration (uv)
├── CLAUDE.md           # This file
└── AGENTS.md           # Development rules

High-Level Architecture

Service Layer (src/services/)

The core processing logic uses protocol-based design for modularity:

# All services implement protocols for clean interfaces
from src.services.protocols import TranscriptionProtocol, EnhancementProtocol

Key services:

  • transcription_service.py - Whisper integration (20-70x faster on M3)
  • multi_pass_transcription.py - Iterative refinement for 99.5% accuracy
  • domain_enhancement.py - AI enhancement with domain adaptation
  • batch_processor.py - Parallel processing (8 workers optimal)
  • export_service.py - Multi-format export (TXT, SRT, VTT, JSON)
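
A protocol in protocols.py might be declared like this (a sketch: the method name and result type follow the examples later in this file, but the real signatures may differ):

from pathlib import Path
from typing import Protocol

class TranscriptionProtocol(Protocol):
    """Structural interface every transcription backend must satisfy."""
    async def transcribe(self, file_path: Path) -> "TranscriptResult": ...

Any class with a matching transcribe() method satisfies the protocol structurally, no inheritance required, which is what keeps the services swappable.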

Performance Optimizations

  • Memory Management: memory_optimization.py - Automatic cleanup, chunked processing
  • Speed: speed_optimization.py - M3-specific optimizations, distil-large-v3 model
  • Domain Adaptation: domain_adaptation.py - Technical/academic/medical terminology
  • Caching: Multi-layer caching with different TTLs per data type
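
The per-type TTL idea can be sketched with a tiny in-memory cache (purely illustrative; the real cache layers and TTL values are assumptions):

import time

# Illustrative TTLs per data type, in seconds -- the real values are assumptions.
TTLS = {"transcript": 24 * 3600, "enhancement": 3600, "metadata": 300}
_cache: dict[str, tuple[float, object]] = {}

def cache_set(kind: str, key: str, value: object) -> None:
    _cache[f"{kind}:{key}"] = (time.monotonic() + TTLS[kind], value)

def cache_get(kind: str, key: str):
    entry = _cache.get(f"{kind}:{key}")
    if entry and time.monotonic() < entry[0]:
        return entry[1]
    return None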

Database Layer (src/database/)

PostgreSQL with SQLAlchemy ORM:

  • models.py - Core models (MediaFile, Transcript, Enhancement)
  • v2_models.py - Extended models for v2 features
  • repositories/ - Data access patterns with protocol compliance
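
A repository in this layer might look like the following sketch (Transcript is the core model from models.py; the foreign-key column name is an assumption):

from sqlalchemy.orm import Session

from src.database.models import Transcript

class TranscriptRepository:
    """Keeps ORM queries out of the service layer."""

    def __init__(self, session: Session) -> None:
        self._session = session

    def get_for_media_file(self, media_file_id: int) -> list[Transcript]:
        # media_file_id is a hypothetical foreign-key column.
        return (
            self._session.query(Transcript)
            .filter_by(media_file_id=media_file_id)
            .all()
        )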

Testing Strategy (tests/)

Real-file testing - No mocks, actual media files:

# tests/conftest.py provides real test fixtures
@pytest.fixture
def sample_audio_5s():
    return Path("tests/fixtures/audio/sample_5s.wav")
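
A test consuming that fixture might look like this (illustrative; the import path and service API mirror the test-first example later in this file):

from src.services.transcription_service import TranscriptionService

def test_transcribe_real_audio(sample_audio_5s):
    # Exercises Whisper against a real 5-second WAV -- no mocks.
    service = TranscriptionService()
    result = service.transcribe_audio(str(sample_audio_5s))
    assert result.text  # a real file should yield a non-empty transcript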

Configuration System

Inherits from root project .env at ../../.env:

from src.config import config

# All API keys available as attributes
api_key = config.DEEPSEEK_API_KEY
services = config.get_available_ai_services()

File Organization Rules

File Size Limits

  • Code Files (.py, .ts, .js):
    • Target: Under 300 lines
    • Maximum: 350 lines (only with clear justification)
    • Exceptions: Complex algorithms, comprehensive test suites
  • Documentation (.md, .txt):
    • Target: Under 550 lines
    • Maximum: 600 lines for essential docs (CLAUDE.md, README.md)
  • Single Responsibility: One service/component per file
  • Protocol-Based: Use typing.Protocol for service interfaces

Example Structure

# transcription_service.py - Only transcription logic (50-100 lines)
class TranscriptionService(TranscriptionProtocol):
    async def transcribe(self, file_path: Path) -> TranscriptResult:
        # Focused implementation
        pass

# audio_processor.py - Only audio processing logic (50-100 lines)
class AudioProcessor(AudioProtocol):
    def process_audio(self, audio_data) -> ProcessedAudio:
        # Focused implementation
        pass

Key Implementation Patterns

1. Download-First Architecture

# Always download media before processing
downloader = MediaDownloadService()
local_path = await downloader.download(url)
result = await transcriber.transcribe(local_path)

2. Test-First Development

# Write test that defines the interface
def test_transcription_service():
    service = TranscriptionService()
    result = service.transcribe_audio("test.wav")
    assert result.text is not None
    assert result.confidence > 0.8
# THEN implement to make test pass

3. Multi-Pass Refinement (v2)

# Iterative improvement for 99.5% accuracy
service = MultiPassTranscriptionService()
result = await service.transcribe_with_passes(
    file_path, 
    min_confidence=0.9,
    max_passes=3
)

4. Batch Processing

# Optimized for M3 with 8 parallel workers
processor = BatchProcessor(max_workers=8)
results = await processor.process_batch(file_paths)

Performance Targets

  • 5-minute audio: <30 seconds processing
  • 95% accuracy (v1), 99% accuracy (v2)
  • <1 second CLI response time
  • Support files up to 500MB
  • 8 parallel workers on M3
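
One way to keep the first target honest is to encode it as a marked test (illustrative only: the fixture, marker, and service wiring here are hypothetical):

import time

import pytest

@pytest.mark.performance  # hypothetical marker -- register it in pyproject.toml to silence warnings
def test_five_minute_audio_under_30s(sample_audio_5m, transcription_service):
    start = time.perf_counter()
    transcription_service.transcribe_audio(str(sample_audio_5m))
    assert time.perf_counter() - start < 30.0  # 5-minute audio: <30 s target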

Current Implementation Status

✅ Completed

  • Whisper transcription with distil-large-v3
  • DeepSeek AI enhancement
  • Multi-pass refinement system
  • Domain adaptation (technical/academic/medical)
  • Speaker diarization
  • Batch processing with parallel workers
  • Export to TXT/SRT/VTT/JSON
  • PostgreSQL database with migrations
  • Comprehensive test suite
  • Enhanced CLI with progress tracking

🚧 In Progress

  • Research agent UI (Streamlit)
  • Vector search integration (ChromaDB/FAISS)
  • Advanced speaker profiles

Task Master Integration

Task Master commands for project management:

# View current tasks
task-master list
task-master next
task-master show <id>

# Update task status
task-master set-status --id=<id> --status=done
task-master update-subtask --id=<id> --prompt="implementation notes"

See .taskmaster/CLAUDE.md for full Task Master workflow integration.

Common Workflows

Adding New Feature

# 1. Get task details
task-master show <task-id>

# 2. Write tests first
# Create test file with comprehensive test cases

# 3. Implement minimal code
# Write code to pass tests

# 4. Validate quality
uv run pytest && uv run black src/ tests/ && uv run ruff check --fix

# 5. Complete
task-master set-status --id=<task-id> --status=done

Fixing Bug

# 1. Reproduce the bug with a failing test

# 2. Fix the code to make test pass

# 3. Validate
uv run pytest && uv run black src/ tests/ && uv run ruff check --fix src/ tests/

# 4. Update status
task-master set-status --id=<task-id> --status=done

Common Issues & Solutions

Database Connection

# Check PostgreSQL status
pg_ctl status -D /usr/local/var/postgres
# Start if needed
pg_ctl start -D /usr/local/var/postgres

FFmpeg Missing

# Install via Homebrew
brew install ffmpeg

API Key Issues

# Verify keys loaded
uv run python -c "from src.config import config; config.display_config_status()"

Missing .env

Check that ../../.env exists in the root project

Import Errors

Run uv pip install -e ".[dev]"

Type Errors

Run uv run mypy src/

Formatting Issues

Run uv run black src/ tests/

Anti-Patterns to Avoid

DON'T: Skip Understanding

  • Jumping straight to coding without requirements
  • Not reading task details or context
  • Ignoring existing code patterns

DON'T: Skip Testing

  • Writing code before tests
  • Incomplete test coverage
  • Not testing edge cases

DON'T: Ignore Quality

  • Large, monolithic code files (>350 lines without justification)
  • Documentation files exceeding 600 lines
  • Poor formatting or linting errors
  • Not following project patterns

DON'T: Over-Engineer

  • Complex abstractions when simple works
  • Multiple layers when one suffices
  • Premature optimization

Success Metrics

Code Quality

  • All tests pass
  • Code files under LOC limits (300 lines target, 350 max)
  • Documentation under 550 lines (600 max for essentials)
  • No linting errors
  • Consistent formatting

Development Speed

  • Clear understanding of requirements
  • Tests written first
  • Minimal viable implementation
  • Quick validation cycles

Maintainability

  • Small, focused files
  • Clear separation of concerns
  • Consistent patterns
  • Good test coverage

Memory Management

Claude Code Memory (# shortcut)

Use # to save important context:

#remember Using distil-large-v3 for M3 optimization
#remember PostgreSQL 15+ with JSONB for flexible storage
#remember 8 parallel workers optimal for batch processing

Memory Levels

  • Project-level: Saved to CLAUDE.md in the project root
  • User-level: Saved globally across all projects
  • Session-level: Saved to .claude/context/session.md

What to Remember

  • Architecture decisions: Model choices, database patterns
  • Performance targets: Processing times, accuracy goals
  • Configuration: API keys, service endpoints
  • Conventions: Naming patterns, file organization
  • Dependencies: Required packages, versions

Cursor Rules

Key rules from .cursor/rules/:

  • agent_workflow.mdc - Simplified TDD workflow (single source of truth)
  • progressive-enhancement.mdc - Iterative refinement approach
  • utc-timestamps.mdc - Timestamp handling standards
  • low-loc.mdc - Low lines-of-code patterns (300-line target for code, 550 for docs)

Parallel Development

Git worktrees enable parallel development across features:

  • Setup: Run .claude/scripts/setup_worktrees.sh
  • 5 Default Worktrees: features, testing, docs, performance, bugfix
  • Switch: Use /Users/enias/projects/my-ai-projects/apps/trax-worktrees/switch.sh
  • Status: Check all with trax-worktrees/status.sh
  • Full Guide: Parallel Development Workflow

Architecture Version: 2.0 | Python 3.11+ | PostgreSQL 15+ | FFmpeg 6.0+

Remember: Keep it simple. Follow @.cursor/rules/agent_workflow.mdc: Understand → Test → Implement → Validate → Complete.