# Project Directory Structure

This document provides an overview of the Trax Media Processing Platform directory structure and the purpose of each component.

## Root Directory

```
trax/
├── CLAUDE.md              # Project context for Claude Code
├── AGENTS.md              # Development rules for AI agents
├── EXECUTIVE-SUMMARY.md   # High-level project overview
├── CHANGELOG.md           # Version history and changes
├── PROJECT-DIRECTORY.md   # This file - directory structure
├── README.md              # Project introduction and quick start
├── pyproject.toml         # Project configuration and dependencies
├── requirements.txt       # Locked dependencies (generated)
├── scratchpad.md          # Temporary notes and ideas
└── test_config.py         # Configuration testing utilities
```

## Source Code (`src/`)

```
src/
├── __init__.py                  # Python package initialization
├── config.py                    # Centralized configuration system
├── main.py                      # Application entry point
├── cli/                         # Command-line interface
│   ├── __init__.py
│   └── main.py                  # Click-based CLI implementation
├── services/                    # Business logic services
│   ├── __init__.py
│   ├── transcription/           # Transcription services
│   │   ├── __init__.py
│   │   ├── protocols.py         # Service interfaces
│   │   ├── whisper_service.py   # Whisper implementation
│   │   └── enhancement.py       # AI enhancement service
│   ├── caching/                 # Caching layer
│   │   ├── __init__.py
│   │   ├── protocols.py         # Cache interfaces
│   │   └── sqlite_cache.py      # SQLite cache implementation
│   ├── batch/                   # Batch processing
│   │   ├── __init__.py
│   │   ├── processor.py         # Batch job processor
│   │   └── queue.py             # Job queue management
│   └── export/                  # Export functionality
│       ├── __init__.py
│       ├── protocols.py         # Export interfaces
│       ├── json_exporter.py     # JSON export
│       └── txt_exporter.py      # Text export
├── models/                      # Database models
│   ├── __init__.py
│   ├── base.py                  # Base model class
│   ├── media.py                 # Media file models
│   ├── transcript.py            # Transcript models
│   └── batch.py                 # Batch job models
├── database/                    # Database layer
│   ├── __init__.py
│   ├── registry.py              # Database registry pattern
│   ├── connection.py            # Connection management
│   └── migrations/              # Alembic migrations
├── utils/                       # Utility functions
│   ├── __init__.py
│   ├── audio.py                 # Audio processing utilities
│   ├── validation.py            # Input validation
│   └── logging.py               # Logging configuration
└── agents/                      # AI agent components
    ├── __init__.py
    └── rules/                   # Agent rule files
        ├── TRANSCRIPTION_RULES.md
        ├── BATCH_PROCESSING_RULES.md
        ├── DATABASE_RULES.md
        ├── CACHING_RULES.md
        └── EXPORT_RULES.md
```

## Documentation (`docs/`)

```
docs/
├── architecture/                   # Architecture documentation
│   ├── development-patterns.md     # Historical learnings and patterns
│   ├── audio-processing.md         # Audio pipeline architecture
│   └── iterative-pipeline.md       # Version progression details
├── reports/                        # Analysis reports
│   ├── 01-repository-inventory.md
│   ├── 02-historical-context.md
│   ├── 03-architecture-design.md
│   ├── 04-team-structure.md
│   ├── 05-technical-migration.md
│   └── 06-product-vision.md
└── team/                           # Team documentation
    └── job-descriptions.md         # Role definitions
```

## Tests (`tests/`)

```
tests/
├── __init__.py                   # Test package initialization
├── conftest.py                   # Pytest configuration and fixtures
├── factories/                    # Test data factories
│   ├── __init__.py
│   ├── media_factory.py          # Media file factories
│   ├── transcript_factory.py     # Transcript factories
│   └── batch_factory.py          # Batch job factories
├── fixtures/                     # Test fixtures and data
│   ├── audio/                    # Test audio files
│   │   ├── sample_5s.wav         # 5-second test file
│   │   ├── sample_30s.mp3        # 30-second test file
│   │   └── sample_2m.mp4         # 2-minute test file
│   └── transcripts/              # Expected transcript outputs
│       └── expected_outputs.json
├── unit/                         # Unit tests
│   ├── test_protocols.py         # Protocol interface tests
│   ├── test_models.py            # Database model tests
│   └── services/                 # Service unit tests
│       ├── test_batch.py         # Batch service tests
│       └── test_whisper.py       # Whisper service tests
└── integration/                  # Integration tests
    ├── test_pipeline_v1.py       # v1 pipeline tests
    ├── test_batch_processing.py  # Batch processing tests
    └── test_cli.py               # CLI integration tests
```

## Data (`data/`)

```
data/
├── media/                # Media file storage
│   ├── downloads/        # Downloaded media files
│   └── processed/        # Processed audio files
├── exports/              # Export output files
│   ├── json/             # JSON export files
│   └── txt/              # Text export files
└── cache/                # Cache storage
    ├── embeddings/       # Embedding cache
    ├── transcripts/      # Transcript cache
    └── analysis/         # Analysis cache
```

## Scripts (`scripts/`)

```
scripts/
├── setup_dev.sh    # Development environment setup
├── setup_db.sh     # Database initialization
├── run_tests.sh    # Test execution script
└── deploy.sh       # Deployment script
```

## Configuration Files

### `pyproject.toml`
- Project metadata and dependencies
- uv package manager configuration
- Development tools configuration (Black, Ruff, MyPy)
- Build system settings

### `.env` (inherited from root)
- API keys and secrets
- Database connection strings
- Service configuration
- Environment-specific settings

### `alembic.ini`
- Database migration configuration
- Alembic settings and paths

## Key File Purposes

### Core Documentation
- **CLAUDE.md**: Context that lets Claude Code understand the current project state
- **AGENTS.md**: Development rules and workflows for AI agents
- **EXECUTIVE-SUMMARY.md**: High-level project overview and strategy
- **CHANGELOG.md**: Version history and change tracking
- **PROJECT-DIRECTORY.md**: This file - directory structure overview

### Configuration
- **src/config.py**: Centralized configuration with root `.env` inheritance
- **pyproject.toml**: Project dependencies and tooling configuration
- **requirements.txt**: Locked dependency versions (generated)

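
One plausible reading of "root `.env` inheritance" is that configuration is loaded from the nearest `.env` file found in the current directory or one of its parents. The sketch below illustrates that idea with the standard library only. The function names `find_root_env` and `parse_env_file` are hypothetical and are not taken from `src/config.py`, whose actual behavior may differ.

```python
from pathlib import Path
from typing import Dict, Optional


def find_root_env(start: Path, filename: str = ".env") -> Optional[Path]:
    """Return the nearest `filename` in `start` or any of its parents.

    Hypothetical helper: assumes "inheritance" means walking up the tree.
    """
    for directory in [start, *start.parents]:
        candidate = directory / filename
        if candidate.is_file():
            return candidate
    return None


def parse_env_file(path: Path) -> Dict[str, str]:
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw_line in path.read_text(encoding="utf-8").splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        # Split on the first "=" only, so values may themselves contain "=".
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values
```

Calling `find_root_env(Path.cwd())` from inside the project would then return the first `.env` found on the way up to the filesystem root, or `None` if there is none.
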
### Architecture
- **docs/architecture/**: Detailed architecture patterns and decisions
- **docs/reports/**: Analysis reports from the YouTube Summarizer project
- **src/agents/rules/**: Agent rule files for consistency

### Testing
- **tests/fixtures/audio/**: Real audio files for testing (no mocks)
- **tests/conftest.py**: Pytest configuration and shared fixtures
- **tests/factories/**: Test data generation utilities

## Development Workflow

### File Organization Principles
1. **Separation of Concerns**: Each directory has a specific purpose
2. **Protocol-Based Design**: Interfaces are defined in `protocols.py` files
3. **Real-File Testing**: Test fixtures contain actual media files
4. **Documentation Limits**: Keep files under 600 LOC for AI comprehension
5. **Clear Naming**: Use descriptive file and directory names

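
As an illustration of the protocol-based design principle, here is a minimal sketch using `typing.Protocol`. The names `TranscriptionService`, `EchoTranscriptionService`, and `run_transcription`, as well as the method signature, are assumptions made for this example; the real contents of the `protocols.py` files are not shown in this document.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class TranscriptionService(Protocol):
    """Hypothetical service interface, of the kind a protocols.py might hold."""

    def transcribe(self, audio_path: str) -> str:
        """Return the transcript text for the given audio file."""
        ...


class EchoTranscriptionService:
    """Illustrative implementation; it does not inherit from the protocol.

    It satisfies TranscriptionService structurally because it defines a
    matching transcribe() method.
    """

    def transcribe(self, audio_path: str) -> str:
        return f"transcript of {audio_path}"


def run_transcription(service: TranscriptionService, audio_path: str) -> str:
    """Depend on the interface only, so implementations can be swapped."""
    return service.transcribe(audio_path)
```

Because callers depend only on the protocol, an implementation such as the Whisper service could be replaced without changing the calling code.
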
### Adding New Components
1. **Services**: Add to `src/services/` with a protocol interface
2. **Models**: Add to `src/models/` and register with the database registry
3. **Tests**: Add to `tests/` with real file fixtures
4. **Documentation**: Add to `docs/` with a clear structure
5. **Rules**: Add to `src/agents/rules/` for consistency

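
The "database registry" step for new models could look roughly like the sketch below: a decorator that records each model class under its table name. This is only one common shape for a registry pattern. The names `register_model`, `get_model`, and `MediaFile`, and the table name `media_files`, are hypothetical and not taken from `src/database/registry.py`.

```python
from typing import Callable, Dict

# Hypothetical module-level registry mapping table names to model classes.
_MODEL_REGISTRY: Dict[str, type] = {}


def register_model(table_name: str) -> Callable[[type], type]:
    """Class decorator that records a model class under `table_name`."""

    def decorator(cls: type) -> type:
        if table_name in _MODEL_REGISTRY:
            raise ValueError(f"table {table_name!r} is already registered")
        _MODEL_REGISTRY[table_name] = cls
        return cls

    return decorator


def get_model(table_name: str) -> type:
    """Look up a previously registered model class (KeyError if unknown)."""
    return _MODEL_REGISTRY[table_name]


@register_model("media_files")
class MediaFile:
    """Illustrative model; a real model would define its columns here."""
```

With this shape, adding a model means defining the class and decorating it; rejecting duplicate table names surfaces accidental double registration early.
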
### Migration Strategy
- **Database Changes**: Use Alembic migrations in `src/database/migrations/`
- **Schema Updates**: Update models and create migration
- **Data Migration**: Scripts in `scripts/` directory
- **Version Tracking**: Update CHANGELOG.md with changes

---

*Last Updated: 2024-12-19*

*Project Structure Version: 1.0*