# Project Directory Structure

This document provides an overview of the Trax Media Processing Platform directory structure and the purpose of each component.

## Root Directory

```
trax/
├── CLAUDE.md              # Project context for Claude Code
├── AGENTS.md              # Development rules for AI agents
├── EXECUTIVE-SUMMARY.md   # High-level project overview
├── CHANGELOG.md           # Version history and changes
├── PROJECT-DIRECTORY.md   # This file - directory structure
├── README.md              # Project introduction and quick start
├── pyproject.toml         # Project configuration and dependencies
├── requirements.txt       # Locked dependencies (generated)
├── scratchpad.md          # Temporary notes and ideas
└── test_config.py         # Configuration testing utilities
```

## Source Code (`src/`)

```
src/
├── __init__.py                  # Python package initialization
├── config.py                    # Centralized configuration system
├── main.py                      # Application entry point
├── cli/                         # Command-line interface
│   ├── __init__.py
│   └── main.py                  # Click-based CLI implementation
├── services/                    # Business logic services
│   ├── __init__.py
│   ├── transcription/           # Transcription services
│   │   ├── __init__.py
│   │   ├── protocols.py         # Service interfaces
│   │   ├── whisper_service.py   # Whisper implementation
│   │   └── enhancement.py       # AI enhancement service
│   ├── caching/                 # Caching layer
│   │   ├── __init__.py
│   │   ├── protocols.py         # Cache interfaces
│   │   └── sqlite_cache.py      # SQLite cache implementation
│   ├── batch/                   # Batch processing
│   │   ├── __init__.py
│   │   ├── processor.py         # Batch job processor
│   │   └── queue.py             # Job queue management
│   └── export/                  # Export functionality
│       ├── __init__.py
│       ├── protocols.py         # Export interfaces
│       ├── json_exporter.py     # JSON export
│       └── txt_exporter.py      # Text export
├── models/                      # Database models
│   ├── __init__.py
│   ├── base.py                  # Base model class
│   ├── media.py                 # Media file models
│   ├── transcript.py            # Transcript models
│   └── batch.py                 # Batch job models
├── database/                    # Database layer
│   ├── __init__.py
│   ├── registry.py              # Database registry pattern
│   ├── connection.py            # Connection management
│   └── migrations/              # Alembic migrations
├── utils/                       # Utility functions
│   ├── __init__.py
│   ├── audio.py                 # Audio processing utilities
│   ├── validation.py            # Input validation
│   └── logging.py               # Logging configuration
└── agents/                      # AI agent components
    ├── __init__.py
    └── rules/                   # Agent rule files
        ├── TRANSCRIPTION_RULES.md
        ├── BATCH_PROCESSING_RULES.md
        ├── DATABASE_RULES.md
        ├── CACHING_RULES.md
        └── EXPORT_RULES.md
```

## Documentation (`docs/`)

```
docs/
├── architecture/                  # Architecture documentation
│   ├── development-patterns.md    # Historical learnings and patterns
│   ├── audio-processing.md        # Audio pipeline architecture
│   └── iterative-pipeline.md      # Version progression details
├── reports/                       # Analysis reports
│   ├── 01-repository-inventory.md
│   ├── 02-historical-context.md
│   ├── 03-architecture-design.md
│   ├── 04-team-structure.md
│   ├── 05-technical-migration.md
│   └── 06-product-vision.md
└── team/                          # Team documentation
    └── job-descriptions.md        # Role definitions
```

## Tests (`tests/`)

```
tests/
├── __init__.py                    # Test package initialization
├── conftest.py                    # Pytest configuration and fixtures
├── factories/                     # Test data factories
│   ├── __init__.py
│   ├── media_factory.py           # Media file factories
│   ├── transcript_factory.py      # Transcript factories
│   └── batch_factory.py           # Batch job factories
├── fixtures/                      # Test fixtures and data
│   ├── audio/                     # Test audio files
│   │   ├── sample_5s.wav          # 5-second test file
│   │   ├── sample_30s.mp3         # 30-second test file
│   │   └── sample_2m.mp4          # 2-minute test file
│   └── transcripts/               # Expected transcript outputs
│       └── expected_outputs.json
├── unit/                          # Unit tests
│   ├── test_protocols.py          # Protocol interface tests
│   ├── test_models.py             # Database model tests
│   └── services/                  # Service unit tests
│       ├── test_batch.py          # Batch service tests
│       └── test_whisper.py        # Whisper service tests
└── integration/                   # Integration tests
    ├── test_pipeline_v1.py        # v1 pipeline tests
    ├── test_batch_processing.py   # Batch processing tests
    └── test_cli.py                # CLI integration tests
```

## Data (`data/`)

```
data/
├── media/              # Media file storage
│   ├── downloads/      # Downloaded media files
│   └── processed/      # Processed audio files
├── exports/            # Export output files
│   ├── json/           # JSON export files
│   └── txt/            # Text export files
└── cache/              # Cache storage
    ├── embeddings/     # Embedding cache
    ├── transcripts/    # Transcript cache
    └── analysis/       # Analysis cache
```

## Scripts (`scripts/`)

```
scripts/
├── setup_dev.sh    # Development environment setup
├── setup_db.sh     # Database initialization
├── run_tests.sh    # Test execution script
└── deploy.sh       # Deployment script
```

## Configuration Files

### `pyproject.toml`

- Project metadata and dependencies
- uv package manager configuration
- Development tools configuration (Black, Ruff, MyPy)
- Build system settings

### `.env` (inherited from root)

- API keys and secrets
- Database connection strings
- Service configuration
- Environment-specific settings

### `alembic.ini`

- Database migration configuration
- Alembic settings and paths

## Key File Purposes

### Core Documentation

- **CLAUDE.md**: Context for Claude Code to understand current state
- **AGENTS.md**: Development rules and workflows for AI agents
- **EXECUTIVE-SUMMARY.md**: High-level project overview and strategy
- **CHANGELOG.md**: Version history and change tracking
- **PROJECT-DIRECTORY.md**: This file - directory structure overview

### Configuration

- **src/config.py**: Centralized configuration with root .env inheritance
- **pyproject.toml**: Project dependencies and tooling configuration
- **requirements.txt**: Locked dependency versions (generated)

### Architecture

- **docs/architecture/**: Detailed architecture patterns and decisions
- **docs/reports/**: Analysis reports from YouTube Summarizer project
- **src/agents/rules/**: Agent rule files for consistency

### Testing

- **tests/fixtures/audio/**: Real audio files for testing (no mocks)
- **tests/conftest.py**: Pytest configuration and shared fixtures
- **tests/factories/**: Test data generation utilities

## Development Workflow

### File Organization Principles

1. **Separation of Concerns**: Each directory has a specific purpose
2. **Protocol-Based Design**: Interfaces defined in protocols.py files
3. **Real Files Testing**: Actual media files in test fixtures
4. **Documentation Limits**: Keep files under 600 LOC for AI comprehension
5. **Clear Naming**: Descriptive file and directory names

### Adding New Components

1. **Services**: Add to `src/services/` with protocol interface
2. **Models**: Add to `src/models/` with database registry
3. **Tests**: Add to `tests/` with real file fixtures
4. **Documentation**: Add to `docs/` with clear structure
5. **Rules**: Add to `src/agents/rules/` for consistency

### Migration Strategy

- **Database Changes**: Use Alembic migrations in `src/database/migrations/`
- **Schema Updates**: Update models and create migration
- **Data Migration**: Scripts in `scripts/` directory
- **Version Tracking**: Update CHANGELOG.md with changes

---

*Last Updated: 2024-12-19*

*Project Structure Version: 1.0*
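
To illustrate the protocol-based design principle listed under File Organization Principles, here is a minimal sketch of what an interface in a `protocols.py` file and a conforming service might look like. The names `TranscriptionService`, `transcribe`, `FakeWhisperService`, and `run` are hypothetical and chosen for illustration only; the actual interfaces in the Trax codebase may differ.

```python
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable
class TranscriptionService(Protocol):
    """Hypothetical interface; the real protocols.py may define something else."""

    def transcribe(self, audio_path: Path) -> str:
        """Return the transcript text for one audio file."""
        ...


class FakeWhisperService:
    """Stand-in implementation used only for this sketch (no real transcription)."""

    def transcribe(self, audio_path: Path) -> str:
        return f"transcript of {audio_path.name}"


def run(service: TranscriptionService, audio_path: Path) -> str:
    # Callers depend on the protocol, not on a concrete service class.
    return service.transcribe(audio_path)


result = run(FakeWhisperService(), Path("tests/fixtures/audio/sample_5s.wav"))
```

Because `typing.Protocol` uses structural typing, `FakeWhisperService` satisfies the interface without inheriting from it, so a new service can be added under `src/services/` without touching existing implementations.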