# Trax Project Rewrite - Context Engineering Brief

> **Note**: This is a prompt/context engineering document rather than a final specification. The content here provides the framework and requirements for the next developer to work from.

## Executive Summary

**Objective**: Complete rewrite of `../app/trax` with focus on deterministic, reliable AI coding agents and robust context management.

**Key Problem**: Previous project failed due to insufficient context engineering, unclear rules, and non-systematic development.

**Solution**: Implement systematic, permission-based development process with comprehensive reporting.

**Core Concept**: Trax (and the YouTube summarizer it evolved from) is a media content processing system that:

- Starts with YouTube transcripts as the foundation
- Runs content through various AI workflows
- Produces summaries, glossaries, study guides, and other educational content
- Uses AI agents to transform raw media into structured, educational outputs

---

## 1. Project Context & Requirements

### 1.1 Core Trax Project Goals

- **Package Manager**: Migrate from `pip` to `uv`
- **Documentation**: Consolidate `CLAUDE.md` and `AGENTS.md` (600 LOC limit)
- **Context Engineering**: Establish robust AI agent context management for media processing workflows
- **Backend-First**: Start with CLI transcription service, then Directus, then frontend
- **Database Reliability**: Excellent testing for migrations
- **Media Processing**: Build AI workflows for transforming YouTube transcripts into educational content

### 1.2 Development Philosophy

- **Context Engineering**: Previous project failed due to insufficient context engineering
- **Rule Setting**: Poor rule establishment led to project chaos
- **Systematic Approach**: Need to shift from ad-hoc to systematic development
- **Sequential Development**: Avoid simultaneous backend/frontend development
- **Modular Design**: Ensure workflows and pipelines are modular

### 1.3 Documentation Constraints

1. **Strict LOC Limits**: All documentation must remain under 600 lines of code
2. **Approval Process**: Changes to `CLAUDE.md` or `AGENTS.md` require explicit approval
3. **Changelog Requirements**: Comprehensive changelog for all modifications
4. **Update Protocol**: Request approval before starting new work or after completing tasks

---

## 2. Comprehensive Report Methodology (General Context Engineering)

### 2.1 Interactive Report Process

**Expectation**: A comprehensive report on how to start over, including deep repo search and breaking up reports for manageable review.

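One way to keep reports and agent documentation reviewable is to check the 600-line limit from section 1.3 mechanically rather than by eye. The following is a minimal sketch, not part of the existing project: the function names and the reading of "under 600" as "at most 599 lines per file" are assumptions, and only the file names `CLAUDE.md` and `AGENTS.md` come from this brief.

```python
from pathlib import Path

MAX_LINES = 600  # limit from section 1.3; "under 600" is read here as at most 599 lines
GUARDED_DOCS = ("CLAUDE.md", "AGENTS.md")  # files named in this brief


def line_count(text: str) -> int:
    """Count lines; a trailing newline does not add an extra line."""
    return len(text.splitlines())


def is_within_limit(text: str, max_lines: int = MAX_LINES) -> bool:
    """True when the text stays strictly under the line limit."""
    return line_count(text) < max_lines


def find_violations(root: Path, names=GUARDED_DOCS, max_lines: int = MAX_LINES):
    """Return (name, line_count) for each existing doc at or over the limit."""
    violations = []
    for name in names:
        path = root / name
        if not path.is_file():
            continue  # a missing file is not a length violation
        text = path.read_text(encoding="utf-8")
        if not is_within_limit(text, max_lines):
            violations.append((name, line_count(text)))
    return violations


if __name__ == "__main__":
    # Hypothetical usage: run from the repository root.
    for name, count in find_violations(Path(".")):
        print(f"{name}: {count} lines (limit is under {MAX_LINES})")
```

A check like this could run before any approval request, so that a reviewer never receives a document that already breaks the limit.
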
### 2.2 Six-Checkpoint Process (Permission Required at Each Stage)

#### Phase 1: Current State Analysis

**CHECKPOINT 1**: Repository Inventory Report

- Complete file structure analysis, codebase assessment, documentation review
- Configuration system analysis, dependencies and technical debt
- Media processing pipeline analysis, YouTube API integration assessment
- **REQUIRES APPROVAL** before proceeding

**CHECKPOINT 2**: Historical Context Report

- Analysis of built/discarded media processing features, development patterns
- Failed approaches to content generation, lessons learned, success patterns to preserve
- YouTube summarizer evolution analysis, educational content generation experiments
- **REQUIRES APPROVAL** before proceeding

#### Phase 2: Strategic Planning

**CHECKPOINT 3**: Architecture Design Report

- Modular backend architecture for media processing, database migration strategy
- Testing framework design, context engineering system design for AI workflows
- Content generation pipeline architecture, educational output formatting system
- **REQUIRES APPROVAL** before proceeding

**CHECKPOINT 4**: Team Structure Report

- Role definitions for media processing team, skill requirements, collaboration workflows
- Communication protocols and decision-making distribution for content generation pipeline
- Educational content specialist roles, AI workflow coordination protocols
- **REQUIRES APPROVAL** before proceeding

#### Phase 3: Implementation Roadmap

**CHECKPOINT 5**: Technical Migration Report

- `uv` package manager migration, documentation consolidation
- Code quality standards, development environment setup for media processing
- YouTube API integration migration, content generation workflow setup
- **REQUIRES APPROVAL** before proceeding

**CHECKPOINT 6**: Product Vision Report

- Feature prioritization matrix for educational content types, development phases and milestones
- Success metrics for content generation quality, KPIs for user engagement with educational outputs
- Risk mitigation strategies for AI content generation, media processing reliability
- **FINAL APPROVAL** required before implementation

---

## 3. Trax-Specific Implementation Plan

### 3.1 Development Roadmap

**Phase 1**: CLI Transcription Service

- **Goal**: Iterate back to CLI enhanced transcription service for YouTube content
- **Focus**: Backend-first approach with modular workflows for transcript processing
- **Requirements**: Robust testing, clean architecture, YouTube API integration

**Phase 2**: Directus Integration

- **Goal**: Add connection to Directus CMS for content management
- **Focus**: Database reliability and migration testing for media content storage
- **Requirements**: Excellent migration testing suite, content metadata management

**Phase 3**: Frontend Development

- **Goal**: Develop frontend interface for content viewing and management
- **Focus**: Tailwind + Vanilla JS approach for educational content display
- **Requirements**: Separate from backend development, responsive design for study materials

**Phase 4**: AI Content Generation

- **Goal**: Add AI-powered content generation (summaries, glossaries, study guides)
- **Focus**: Context engineering and AI agent integration for educational content creation
- **Requirements**: Strong context management system, multiple AI workflow pipelines

### 3.2 Required Team Structure

- Backend Python Developer (and separate researcher) - for transcript processing and AI workflows
- Audio Engineer Specialist - for media processing and quality assurance
- Tailwind + Vanilla JS Researcher (and separate frontend developer) - for educational content display
- AI/Machine Learning Deep Researcher (and separate developer) - for content generation algorithms

### 3.3 Deliverables Required

1. **PRODUCT-VISION.md Report**: Historical analysis of media processing features, lessons learned, clear product vision for educational content generation
2. **Team Structure Recommendation**: Role definitions and collaboration protocols for media processing team
3. **Development Roadmap**: Phased implementation plan with milestones for transcript-to-educational-content pipeline

---

## 4. Interactive Development Process

### 4.1 Permission-Based Workflow

**CRITICAL**: Each checkpoint requires explicit approval before proceeding. Ask clarifying questions and wait for confirmation.

### 4.2 Phase 1: Analysis & Discovery

**CHECKPOINT 1**: Repository Inventory

- **Task**: Deep dive into current codebase and documentation, especially media processing components
- **Deliverable**: Comprehensive technical analysis report including YouTube integration assessment
- **Questions to Ask**:
  - What aspects of current media processing architecture should be preserved?
  - Which dependencies are critical for content generation vs. replaceable?
  - What technical debt in the transcript processing pipeline should be prioritized?

**CHECKPOINT 2**: Historical Context

- **Task**: Research project evolution and lessons learned, especially YouTube summarizer development
- **Deliverable**: Historical analysis and pattern recognition for media processing workflows
- **Questions to Ask**:
  - Which failed approaches to content generation should be avoided?
  - What successful patterns in educational content creation should be replicated?
  - What media processing features are still desired but need better implementation?

### 4.3 Phase 2: Strategic Planning

**CHECKPOINT 3**: Architecture Design

- **Task**: Design modular backend architecture for media processing and content generation
- **Deliverable**: Technical architecture proposal for transcript-to-educational-content pipeline
- **Questions to Ask**:
  - What level of modularity is desired for content generation workflows?
  - Which architectural patterns align with your vision for educational content processing?
  - What are the critical non-functional requirements for media processing reliability?

**CHECKPOINT 4**: Team Structure

- **Task**: Define roles, responsibilities, and workflows for media processing team
- **Deliverable**: Team structure and collaboration plan for content generation pipeline
- **Questions to Ask**:
  - Are the proposed roles sufficient for your vision of educational content creation?
  - What collaboration patterns work best for media processing workflows?
  - How should decision-making be distributed across the content generation pipeline?

### 4.4 Phase 3: Implementation Planning

**CHECKPOINT 5**: Technical Migration

- **Task**: Plan technical implementation and migration for media processing system
- **Deliverable**: Detailed implementation roadmap for transcript processing and content generation
- **Questions to Ask**:
  - What migration approach minimizes risk for the YouTube API integration?
  - Which technical decisions need your input for content generation workflows?
  - What rollback strategies should be planned for the media processing pipeline?

**CHECKPOINT 6**: Product Vision

- **Task**: Define product roadmap and success metrics for educational content generation
- **Deliverable**: Comprehensive product vision document for media processing platform
- **Questions to Ask**:
  - Does this vision align with your long-term goals for educational content creation?
  - Are the success metrics meaningful for content quality and user engagement?
  - What risks or concerns need additional attention for AI content generation reliability?

---

## 5. Quality Assurance & Success Criteria

### 5.1 Quality Assurance Requirements

1. **Documentation Standards**: Ensure all docs meet 600 LOC limit
2. **Approval Workflows**: Implement change management processes
3. **Testing Framework**: Create comprehensive test suite
4. **Development Processes**: Establish systematic workflows

### 5.2 Success Criteria

- **Clean Architecture**: Maintainable, modular codebase with clear context for media processing
- **Reliable Agents**: Robust AI agent system with strong context management for content generation
- **Systematic Development**: Prevent future chaos through proper processes for educational content creation
- **Scalable Team**: Clear documentation and structure for team growth in media processing
- **Comprehensive Reports**: Detailed analysis and planning documents for content generation pipeline
- **Interactive Process**: Maintain your control and input throughout development
- **Permission-Based Workflow**: No major decisions made without your approval
- **Content Quality**: High-quality educational outputs (summaries, glossaries, study guides)
- **Media Processing Reliability**: Robust YouTube transcript processing and content transformation

### 5.3 Communication Protocol

- **Checkpoint Reviews**: Each checkpoint requires your review and approval
- **Clarifying Questions**: Developer must ask specific questions at each stage
- **Decision Points**: All architectural and strategic decisions need your input
- **Progress Updates**: Regular status updates between checkpoints
- **Risk Escalation**: Immediate notification of any blockers or concerns
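
The checkpoint review rule above (no checkpoint proceeds without approval of the one before it) can be made explicit in a small state tracker. This is an illustrative sketch only: the `ApprovalGate` class and its method names are hypothetical and not part of any existing Trax code. The six checkpoint names are taken from this brief.

```python
from dataclasses import dataclass, field

# Checkpoint names in the order this brief defines them.
CHECKPOINTS = (
    "Repository Inventory",
    "Historical Context",
    "Architecture Design",
    "Team Structure",
    "Technical Migration",
    "Product Vision",
)


@dataclass
class ApprovalGate:
    """Tracks which checkpoints the reviewer has explicitly approved."""

    approved: list = field(default_factory=list)

    def current(self):
        """Name of the checkpoint awaiting review, or None once all are approved."""
        if len(self.approved) < len(CHECKPOINTS):
            return CHECKPOINTS[len(self.approved)]
        return None

    def approve(self, name: str) -> None:
        """Record approval; only the checkpoint currently under review qualifies."""
        if name != self.current():
            raise ValueError(f"cannot approve {name!r}; awaiting {self.current()!r}")
        self.approved.append(name)

    def may_start(self, name: str) -> bool:
        """Work on a checkpoint may start only after all earlier ones are approved."""
        return CHECKPOINTS.index(name) <= len(self.approved)

    def implementation_unlocked(self) -> bool:
        """True only after the final approval at checkpoint 6."""
        return len(self.approved) == len(CHECKPOINTS)
```

With a fresh `ApprovalGate()`, `may_start("Historical Context")` is false until `approve("Repository Inventory")` has been called, and approving checkpoints out of order raises an error instead of silently skipping a review.
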