Project Directory Structure
This document provides an overview of the Trax Media Processing Platform directory structure and the purpose of each component.
Root Directory
trax/
├── CLAUDE.md # Project context for Claude Code
├── AGENTS.md # Development rules for AI agents
├── EXECUTIVE-SUMMARY.md # High-level project overview
├── CHANGELOG.md # Version history and changes
├── PROJECT-DIRECTORY.md # This file - directory structure
├── README.md # Project introduction and quick start
├── pyproject.toml # Project configuration and dependencies
├── requirements.txt # Locked dependencies (generated)
├── scratchpad.md # Temporary notes and ideas
└── test_config.py # Configuration testing utilities
Source Code (src/)
src/
├── __init__.py # Python package initialization
├── config.py # Centralized configuration system
├── main.py # Application entry point
├── cli/ # Command-line interface
│ ├── __init__.py
│ └── main.py # Click-based CLI implementation
├── services/ # Business logic services
│ ├── __init__.py
│ ├── transcription/ # Transcription services
│ │ ├── __init__.py
│ │ ├── protocols.py # Service interfaces
│ │ ├── whisper_service.py # Whisper implementation
│ │ └── enhancement.py # AI enhancement service
│ ├── caching/ # Caching layer
│ │ ├── __init__.py
│ │ ├── protocols.py # Cache interfaces
│ │ └── sqlite_cache.py # SQLite cache implementation
│ ├── batch/ # Batch processing
│ │ ├── __init__.py
│ │ ├── processor.py # Batch job processor
│ │ └── queue.py # Job queue management
│ └── export/ # Export functionality
│   ├── __init__.py
│   ├── protocols.py # Export interfaces
│   ├── json_exporter.py # JSON export
│   └── txt_exporter.py # Text export
├── models/ # Database models
│ ├── __init__.py
│ ├── base.py # Base model class
│ ├── media.py # Media file models
│ ├── transcript.py # Transcript models
│ └── batch.py # Batch job models
├── database/ # Database layer
│ ├── __init__.py
│ ├── registry.py # Database registry pattern
│ ├── connection.py # Connection management
│ └── migrations/ # Alembic migrations
├── utils/ # Utility functions
│ ├── __init__.py
│ ├── audio.py # Audio processing utilities
│ ├── validation.py # Input validation
│ └── logging.py # Logging configuration
└── agents/ # AI agent components
  ├── __init__.py
  └── rules/ # Agent rule files
    ├── TRANSCRIPTION_RULES.md
    ├── BATCH_PROCESSING_RULES.md
    ├── DATABASE_RULES.md
    ├── CACHING_RULES.md
    └── EXPORT_RULES.md
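As an illustration of what a module such as services/caching/sqlite_cache.py could contain, here is a minimal sketch using only the standard library. The class name `SQLiteCache`, the table layout, and the `ttl_seconds` parameter are assumptions made for this example, not the project's actual implementation.

```python
import json
import sqlite3
import time
from typing import Any, Optional


class SQLiteCache:
    """Hypothetical key-value cache backed by SQLite, with optional expiry."""

    def __init__(self, path: str = ":memory:") -> None:
        # ":memory:" keeps the example self-contained; a real cache would
        # more likely point at a file somewhere under data/cache/.
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS cache ("
            "key TEXT PRIMARY KEY, value TEXT NOT NULL, expires_at REAL)"
        )

    def set(self, key: str, value: Any, ttl_seconds: Optional[float] = None) -> None:
        """Store a JSON-serializable value, optionally with a time-to-live."""
        expires_at = time.time() + ttl_seconds if ttl_seconds is not None else None
        with self._conn:  # commits on success, rolls back on error
            self._conn.execute(
                "INSERT OR REPLACE INTO cache (key, value, expires_at) "
                "VALUES (?, ?, ?)",
                (key, json.dumps(value), expires_at),
            )

    def get(self, key: str) -> Optional[Any]:
        """Return the cached value, or None on a miss or an expired entry."""
        row = self._conn.execute(
            "SELECT value, expires_at FROM cache WHERE key = ?", (key,)
        ).fetchone()
        if row is None:
            return None
        value, expires_at = row
        if expires_at is not None and expires_at <= time.time():
            # Entry has expired: drop it and report a miss.
            with self._conn:
                self._conn.execute("DELETE FROM cache WHERE key = ?", (key,))
            return None
        return json.loads(value)
```

Storing values as JSON text keeps the schema to a single table; a cache for binary data such as embeddings would probably need a BLOB column instead.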
Documentation (docs/)
docs/
├── architecture/ # Architecture documentation
│ ├── development-patterns.md # Historical learnings and patterns
│ ├── audio-processing.md # Audio pipeline architecture
│ └── iterative-pipeline.md # Version progression details
├── reports/ # Analysis reports
│ ├── 01-repository-inventory.md
│ ├── 02-historical-context.md
│ ├── 03-architecture-design.md
│ ├── 04-team-structure.md
│ ├── 05-technical-migration.md
│ └── 06-product-vision.md
└── team/ # Team documentation
  └── job-descriptions.md # Role definitions
Tests (tests/)
tests/
├── __init__.py # Test package initialization
├── conftest.py # Pytest configuration and fixtures
├── factories/ # Test data factories
│ ├── __init__.py
│ ├── media_factory.py # Media file factories
│ ├── transcript_factory.py # Transcript factories
│ └── batch_factory.py # Batch job factories
├── fixtures/ # Test fixtures and data
│ ├── audio/ # Test audio files
│ │ ├── sample_5s.wav # 5-second test file
│ │ ├── sample_30s.mp3 # 30-second test file
│ │ └── sample_2m.mp4 # 2-minute test file
│ └── transcripts/ # Expected transcript outputs
│   └── expected_outputs.json
├── unit/ # Unit tests
│ ├── test_protocols.py # Protocol interface tests
│ ├── test_models.py # Database model tests
│ └── services/ # Service unit tests
│   ├── test_batch.py # Batch service tests
│   └── test_whisper.py # Whisper service tests
└── integration/ # Integration tests
  ├── test_pipeline_v1.py # v1 pipeline tests
  ├── test_batch_processing.py # Batch processing tests
  └── test_cli.py # CLI integration tests
Data (data/)
data/
├── media/ # Media file storage
│ ├── downloads/ # Downloaded media files
│ └── processed/ # Processed audio files
├── exports/ # Export output files
│ ├── json/ # JSON export files
│ └── txt/ # Text export files
└── cache/ # Cache storage
  ├── embeddings/ # Embedding cache
  ├── transcripts/ # Transcript cache
  └── analysis/ # Analysis cache
Scripts (scripts/)
scripts/
├── setup_dev.sh # Development environment setup
├── setup_db.sh # Database initialization
├── run_tests.sh # Test execution script
└── deploy.sh # Deployment script
Configuration Files
pyproject.toml
- Project metadata and dependencies
- uv package manager configuration
- Development tools configuration (Black, Ruff, MyPy)
- Build system settings
.env (inherited from root)
- API keys and secrets
- Database connection strings
- Service configuration
- Environment-specific settings
alembic.ini
- Database migration configuration
- Alembic settings and paths
Key File Purposes
Core Documentation
- CLAUDE.md: Context for Claude Code to understand current state
- AGENTS.md: Development rules and workflows for AI agents
- EXECUTIVE-SUMMARY.md: High-level project overview and strategy
- CHANGELOG.md: Version history and change tracking
- PROJECT-DIRECTORY.md: This file - directory structure overview
Configuration
- src/config.py: Centralized configuration with root .env inheritance
- pyproject.toml: Project dependencies and tooling configuration
- requirements.txt: Locked dependency versions (generated)
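The ".env inheritance" idea can be sketched roughly as follows. This is a standard-library illustration, not the contents of src/config.py: the function names `find_env_file`, `parse_env`, and `load_settings` are hypothetical, and the sketch assumes that "inheritance" means using the nearest .env found while walking up from a starting directory, with variables already set in the process environment taking precedence.

```python
import os
from pathlib import Path
from typing import Dict, Optional


def find_env_file(start: Path) -> Optional[Path]:
    """Return the nearest .env file in `start` or any of its parents."""
    start = start.resolve()
    for directory in [start, *start.parents]:
        candidate = directory / ".env"
        if candidate.is_file():
            return candidate
    return None


def parse_env(text: str) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blanks and # comments."""
    values: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Strip surrounding whitespace and one layer of quote characters.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_settings(start: Path) -> Dict[str, str]:
    """Load settings from the inherited .env, letting os.environ override."""
    env_file = find_env_file(start)
    settings = parse_env(env_file.read_text()) if env_file else {}
    # Variables already set in the process environment win over the file.
    settings.update({k: os.environ[k] for k in settings if k in os.environ})
    return settings
```

A call such as `load_settings(Path.cwd())` from inside the project would then pick up a .env kept in a parent directory. The parser here handles only plain `KEY=VALUE` lines; a dedicated library would be a safer choice for multi-line or escaped values.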
Architecture
- docs/architecture/: Detailed architecture patterns and decisions
- docs/reports/: Analysis reports from the YouTube Summarizer project
- src/agents/rules/: Agent rule files for consistency
Testing
- tests/fixtures/audio/: Real audio files for testing (no mocks)
- tests/conftest.py: Pytest configuration and shared fixtures
- tests/factories/: Test data generation utilities
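To make the factory idea concrete, the sketch below shows one way a helper in the spirit of tests/factories/media_factory.py could produce a real file on disk rather than a mock. The `MediaFile` dataclass and the `make_wav_media` and `wav_duration` functions are hypothetical names for this example; the committed files under tests/fixtures/audio/ remain the primary test data, and silent audio like this is only useful for exercising file handling, not transcription quality.

```python
import wave
from dataclasses import dataclass
from pathlib import Path


@dataclass
class MediaFile:
    """Hypothetical stand-in for a record defined in src/models/media.py."""
    path: Path
    duration_seconds: float
    sample_rate: int


def make_wav_media(directory: Path, name: str = "sample.wav",
                   seconds: float = 1.0, sample_rate: int = 16000) -> MediaFile:
    """Write a real (silent) mono 16-bit WAV file and describe it."""
    path = directory / name
    frame_count = int(seconds * sample_rate)
    with wave.open(str(path), "wb") as wav:
        wav.setnchannels(1)           # mono
        wav.setsampwidth(2)           # 2 bytes per sample = 16-bit audio
        wav.setframerate(sample_rate)
        wav.writeframes(b"\x00\x00" * frame_count)
    return MediaFile(path=path,
                     duration_seconds=frame_count / sample_rate,
                     sample_rate=sample_rate)


def wav_duration(path: Path) -> float:
    """Read the duration back from the file header, in seconds."""
    with wave.open(str(path), "rb") as wav:
        return wav.getnframes() / wav.getframerate()
```

In a pytest suite, a helper like this would typically be called with the built-in `tmp_path` fixture as `directory`, so each test gets its own throwaway file.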
Development Workflow
File Organization Principles
- Separation of Concerns: Each directory has a specific purpose
- Protocol-Based Design: Interfaces defined in protocols.py files
- Real Files Testing: Actual media files in test fixtures
- Documentation Limits: Keep files under 600 LOC for AI comprehension
- Clear Naming: Descriptive file and directory names
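The protocol-based design principle can be illustrated with `typing.Protocol`. The sketch below is an assumption about what an interface in a file like src/services/export/protocols.py might look like; the `Exporter` protocol, the `extension` attribute, the segment dictionary shape, and the `render` helper are all invented for this example.

```python
import json
from typing import Dict, List, Protocol, runtime_checkable

# Hypothetical transcript segment shape: {"start": ..., "end": ..., "text": ...}
Segment = Dict[str, object]


@runtime_checkable
class Exporter(Protocol):
    """Hypothetical interface that every export format would satisfy."""

    extension: str

    def export(self, segments: List[Segment]) -> str:
        """Render transcript segments into the target format."""
        ...


class JsonExporter:
    extension = "json"

    def export(self, segments: List[Segment]) -> str:
        return json.dumps({"segments": segments}, indent=2)


class TxtExporter:
    extension = "txt"

    def export(self, segments: List[Segment]) -> str:
        # One line of text per segment, timestamps dropped.
        return "\n".join(str(segment["text"]) for segment in segments)


def render(exporter: Exporter, segments: List[Segment]) -> str:
    """Callers depend only on the protocol, not on a concrete class."""
    return exporter.export(segments)
```

Neither exporter inherits from `Exporter`; they satisfy it structurally, which is what lets a new format be added as a new file without touching existing callers.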
Adding New Components
- Services: Add to src/services/ with protocol interface
- Models: Add to src/models/ with database registry
- Tests: Add to tests/ with real file fixtures
- Documentation: Add to docs/ with clear structure
- Rules: Add to src/agents/rules/ for consistency
Migration Strategy
- Database Changes: Use Alembic migrations in src/database/migrations/
- Schema Updates: Update models and create migration
- Data Migration: Scripts in scripts/ directory
- Version Tracking: Update CHANGELOG.md with changes
Last Updated: 2024-12-19
Project Structure Version: 1.0