Project Directory Structure

This document provides an overview of the Trax Media Processing Platform directory structure and the purpose of each component.

Root Directory

trax/
├── CLAUDE.md                    # Project context for Claude Code
├── AGENTS.md                    # Development rules for AI agents
├── EXECUTIVE-SUMMARY.md         # High-level project overview
├── CHANGELOG.md                 # Version history and changes
├── PROJECT-DIRECTORY.md         # This file - directory structure
├── README.md                    # Project introduction and quick start
├── pyproject.toml               # Project configuration and dependencies
├── requirements.txt             # Locked dependencies (generated)
├── scratchpad.md                # Temporary notes and ideas
└── test_config.py               # Configuration testing utilities

Source Code (src/)

src/
├── __init__.py                 # Python package initialization
├── config.py                   # Centralized configuration system
├── main.py                     # Application entry point
├── cli/                        # Command-line interface
│   ├── __init__.py
│   └── main.py                 # Click-based CLI implementation
├── services/                   # Business logic services
│   ├── __init__.py
│   ├── transcription/          # Transcription services
│   │   ├── __init__.py
│   │   ├── protocols.py        # Service interfaces
│   │   ├── whisper_service.py  # Whisper implementation
│   │   └── enhancement.py      # AI enhancement service
│   ├── caching/                # Caching layer
│   │   ├── __init__.py
│   │   ├── protocols.py        # Cache interfaces
│   │   └── sqlite_cache.py     # SQLite cache implementation
│   ├── batch/                  # Batch processing
│   │   ├── __init__.py
│   │   ├── processor.py        # Batch job processor
│   │   └── queue.py            # Job queue management
│   └── export/                 # Export functionality
│       ├── __init__.py
│       ├── protocols.py        # Export interfaces
│       ├── json_exporter.py    # JSON export
│       └── txt_exporter.py     # Text export
├── models/                     # Database models
│   ├── __init__.py
│   ├── base.py                 # Base model class
│   ├── media.py                # Media file models
│   ├── transcript.py           # Transcript models
│   └── batch.py                # Batch job models
├── database/                   # Database layer
│   ├── __init__.py
│   ├── registry.py             # Database registry pattern
│   ├── connection.py           # Connection management
│   └── migrations/             # Alembic migrations
├── utils/                      # Utility functions
│   ├── __init__.py
│   ├── audio.py                # Audio processing utilities
│   ├── validation.py           # Input validation
│   └── logging.py              # Logging configuration
└── agents/                     # AI agent components
    ├── __init__.py
    └── rules/                  # Agent rule files
        ├── TRANSCRIPTION_RULES.md
        ├── BATCH_PROCESSING_RULES.md
        ├── DATABASE_RULES.md
        ├── CACHING_RULES.md
        └── EXPORT_RULES.md

Documentation (docs/)

docs/
├── architecture/               # Architecture documentation
│   ├── development-patterns.md # Historical learnings and patterns
│   ├── audio-processing.md     # Audio pipeline architecture
│   └── iterative-pipeline.md   # Version progression details
├── reports/                    # Analysis reports
│   ├── 01-repository-inventory.md
│   ├── 02-historical-context.md
│   ├── 03-architecture-design.md
│   ├── 04-team-structure.md
│   ├── 05-technical-migration.md
│   └── 06-product-vision.md
└── team/                       # Team documentation
    └── job-descriptions.md     # Role definitions

Tests (tests/)

tests/
├── __init__.py                 # Test package initialization
├── conftest.py                 # Pytest configuration and fixtures
├── factories/                  # Test data factories
│   ├── __init__.py
│   ├── media_factory.py        # Media file factories
│   ├── transcript_factory.py   # Transcript factories
│   └── batch_factory.py        # Batch job factories
├── fixtures/                   # Test fixtures and data
│   ├── audio/                  # Test audio files
│   │   ├── sample_5s.wav       # 5-second test file
│   │   ├── sample_30s.mp3      # 30-second test file
│   │   └── sample_2m.mp4       # 2-minute test file
│   └── transcripts/            # Expected transcript outputs
│       └── expected_outputs.json
├── unit/                       # Unit tests
│   ├── test_protocols.py       # Protocol interface tests
│   ├── test_models.py          # Database model tests
│   └── services/               # Service unit tests
│       ├── test_batch.py       # Batch service tests
│       └── test_whisper.py     # Whisper service tests
└── integration/                # Integration tests
    ├── test_pipeline_v1.py     # v1 pipeline tests
    ├── test_batch_processing.py # Batch processing tests
    └── test_cli.py             # CLI integration tests

Data (data/)

data/
├── media/                      # Media file storage
│   ├── downloads/              # Downloaded media files
│   └── processed/              # Processed audio files
├── exports/                    # Export output files
│   ├── json/                   # JSON export files
│   └── txt/                    # Text export files
└── cache/                      # Cache storage
    ├── embeddings/             # Embedding cache
    ├── transcripts/            # Transcript cache
    └── analysis/               # Analysis cache

Scripts (scripts/)

scripts/
├── setup_dev.sh               # Development environment setup
├── setup_db.sh                # Database initialization
├── run_tests.sh               # Test execution script
└── deploy.sh                  # Deployment script

Configuration Files

pyproject.toml

  • Project metadata and dependencies
  • uv package manager configuration
  • Development tools configuration (Black, Ruff, MyPy)
  • Build system settings

.env (inherited from root)

  • API keys and secrets
  • Database connection strings
  • Service configuration
  • Environment-specific settings

alembic.ini

  • Database migration configuration
  • Alembic settings and paths

Key File Purposes

Core Documentation

  • CLAUDE.md: Context for Claude Code to understand current state
  • AGENTS.md: Development rules and workflows for AI agents
  • EXECUTIVE-SUMMARY.md: High-level project overview and strategy
  • CHANGELOG.md: Version history and change tracking
  • PROJECT-DIRECTORY.md: This file - directory structure overview

Configuration

  • src/config.py: Centralized configuration with root .env inheritance
  • pyproject.toml: Project dependencies and tooling configuration
  • requirements.txt: Locked dependency versions (generated)
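
The ".env inheritance" mentioned for src/config.py could work roughly as in the sketch below. This is an illustration only, not the actual contents of src/config.py: the function names (find_env_file, parse_env_file, load_settings), the simple KEY=VALUE parsing, and the rule that real environment variables override file values are all assumptions.

```python
import os
from pathlib import Path
from typing import Optional


def find_env_file(start: Path) -> Optional[Path]:
    """Walk from `start` up through its parents and return the first .env found."""
    for directory in [start, *start.parents]:
        candidate = directory / ".env"
        if candidate.is_file():
            return candidate
    return None


def parse_env_file(path: Path) -> dict:
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for raw in path.read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def load_settings(start: Path, environ=None) -> dict:
    """Merge .env values with the environment; real environment variables win (assumed rule)."""
    environ = os.environ if environ is None else environ
    env_file = find_env_file(start)
    settings = parse_env_file(env_file) if env_file else {}
    # Only keys already declared in the .env file are overridden.
    settings.update({k: v for k, v in environ.items() if k in settings})
    return settings
```

Calling load_settings(Path("trax/src")) would pick up the nearest .env at or above that directory, which is one way a project nested inside a larger workspace can share the root's secrets without copying them.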

Architecture

  • docs/architecture/: Detailed architecture patterns and decisions
  • docs/reports/: Analysis reports from the YouTube Summarizer project
  • src/agents/rules/: Agent rule files for consistency

Testing

  • tests/fixtures/audio/: Real audio files for testing (no mocks)
  • tests/conftest.py: Pytest configuration and shared fixtures
  • tests/factories/: Test data generation utilities
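
A test data factory of the kind tests/factories/ holds might look like the sketch below. The MediaFile dataclass is a hypothetical stand-in for whatever src/models/media.py defines, and make_media_file is an assumed helper name. Only the fixture directory and the sample file names come from the tree above.

```python
from dataclasses import dataclass
from itertools import count
from pathlib import Path

FIXTURE_AUDIO_DIR = Path("tests/fixtures/audio")  # directory listed in the tests/ tree
_ids = count(1)  # simple increasing id source for generated records


@dataclass
class MediaFile:
    """Hypothetical stand-in for a model from src/models/media.py."""
    id: int
    path: Path
    duration_seconds: int


def make_media_file(**overrides) -> MediaFile:
    """Build a MediaFile with defaults pointing at a real fixture; keywords override fields."""
    defaults = {
        "id": next(_ids),
        "path": FIXTURE_AUDIO_DIR / "sample_5s.wav",
        "duration_seconds": 5,
    }
    defaults.update(overrides)
    return MediaFile(**defaults)
```

A test that needs a longer clip would call make_media_file(path=FIXTURE_AUDIO_DIR / "sample_30s.mp3", duration_seconds=30) and leave every other field at its default.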

Development Workflow

File Organization Principles

  1. Separation of Concerns: Each directory has a specific purpose
  2. Protocol-Based Design: Interfaces defined in protocols.py files
  3. Real-File Testing: Test fixtures use actual media files, not mocks
  4. Documentation Limits: Keep files under 600 lines of code (LOC) for AI comprehension
  5. Clear Naming: Descriptive file and directory names
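
The 600-line limit in the list above can be checked mechanically. The script below is a minimal sketch, not a tool from this repository: the function names and the choice to scan .py and .md files are assumptions.

```python
from pathlib import Path

MAX_LINES = 600  # limit stated in the principles above


def count_lines(path: Path) -> int:
    """Return the number of lines in a text file."""
    with path.open(encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


def find_oversized_files(root: Path, suffixes=(".py", ".md"), max_lines: int = MAX_LINES):
    """Return (path, line_count) pairs for files at or over the limit, sorted by path."""
    oversized = []
    for path in sorted(root.rglob("*")):
        if path.is_file() and path.suffix in suffixes:
            lines = count_lines(path)
            if lines >= max_lines:  # "under 600" means 600 itself is already too long
                oversized.append((path, lines))
    return oversized


if __name__ == "__main__":
    for path, lines in find_oversized_files(Path(".")):
        print(f"{path}: {lines} lines")
```

Run from the project root, it prints each file that has reached the limit, which makes it a candidate for a pre-commit or CI check.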

Adding New Components

  1. Services: Add to src/services/ with protocol interface
  2. Models: Add to src/models/ and register with the database registry
  3. Tests: Add to tests/ with real file fixtures
  4. Documentation: Add to docs/ with clear structure
  5. Rules: Add to src/agents/rules/ for consistency
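
Step 1 above, adding a service behind a protocol interface, might look like the sketch below. The Exporter protocol and the MarkdownExporter class are hypothetical: the real definitions in src/services/export/protocols.py are not shown in this document, so the method name and signature are assumptions.

```python
from pathlib import Path
from typing import Protocol, runtime_checkable


@runtime_checkable
class Exporter(Protocol):
    """Hypothetical interface, as might live in src/services/export/protocols.py."""

    def export(self, transcript: dict, destination: Path) -> Path:
        ...


class MarkdownExporter:
    """Hypothetical new exporter; it satisfies Exporter structurally, without inheriting."""

    def export(self, transcript: dict, destination: Path) -> Path:
        # Render a title heading followed by the transcript text.
        lines = [f"# {transcript.get('title', 'Untitled')}", "", transcript.get("text", "")]
        destination.write_text("\n".join(lines) + "\n", encoding="utf-8")
        return destination
```

Because typing.Protocol uses structural typing, a new service only has to provide the matching method. Callers that depend on the protocol do not need to change when an implementation is added.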

Migration Strategy

  • Database Changes: Use Alembic migrations in src/database/migrations/
  • Schema Updates: Update the models, then create a matching migration
  • Data Migration: Scripts in scripts/ directory
  • Version Tracking: Update CHANGELOG.md with changes

Last Updated: 2024-12-19
Project Structure Version: 1.0