Trax Project Rewrite - Context Engineering Brief
Note: This is a prompt/context engineering document, not a final specification. It provides the framework and requirements for the next developer to work from.
Executive Summary
Objective: Complete rewrite of ../app/trax with a focus on deterministic, reliable AI coding agents and robust context management.
Key Problem: Previous project failed due to insufficient context engineering, unclear rules, and non-systematic development.
Solution: Implement systematic, permission-based development process with comprehensive reporting.
Core Concept: Trax (and the YouTube summarizer it evolved from) is a media content processing system that:
- Starts with YouTube transcripts as the foundation
- Runs content through various AI workflows
- Produces summaries, glossaries, study guides, and other educational content
- Uses AI agents to transform raw media into structured, educational outputs
1. Project Context & Requirements
1.1 Core Trax Project Goals
- Package Manager: Migrate from pip to uv
- Documentation: Consolidate CLAUDE.md and AGENTS.md (600 LOC limit)
- Context Engineering: Establish robust AI agent context management for media processing workflows
- Backend-First: Start with CLI transcription service, then Directus, then frontend
- Database Reliability: Excellent testing for migrations
- Media Processing: Build AI workflows for transforming YouTube transcripts into educational content
1.2 Development Philosophy
- Context Engineering: Previous project failed due to insufficient context engineering
- Rule Setting: Poor rule establishment led to project chaos
- Systematic Approach: Need to shift from ad-hoc to systematic development
- Sequential Development: Avoid simultaneous backend/frontend development
- Modular Design: Ensure workflows and pipelines are modular
1.3 Documentation Constraints
- Strict LOC Limits: All documentation must remain under 600 lines of code
- Approval Process: Changes to CLAUDE.md or AGENTS.md require explicit approval
- Changelog Requirements: Comprehensive changelog for all modifications
- Update Protocol: Request approval before starting new work or after completing tasks
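The 600-line limit can be checked mechanically rather than by eye. The following is a minimal sketch, not part of the existing project: the function names are hypothetical, and it assumes the limit applies per file and that a file of exactly 600 lines is acceptable.

```python
from pathlib import Path

MAX_LINES = 600  # limit stated in this brief; per-file reading is an assumption


def count_lines(text: str) -> int:
    """Count lines as an editor would: a trailing newline adds no extra line."""
    return len(text.splitlines())


def over_limit(docs: dict[str, str], max_lines: int = MAX_LINES) -> dict[str, int]:
    """Return {name: line_count} for every document longer than max_lines."""
    counts = {name: count_lines(text) for name, text in docs.items()}
    return {name: n for name, n in counts.items() if n > max_lines}


def check_files(paths: list[Path], max_lines: int = MAX_LINES) -> dict[str, int]:
    """Read the files that exist and report the ones over the limit."""
    docs = {str(p): p.read_text(encoding="utf-8") for p in paths if p.is_file()}
    return over_limit(docs, max_lines)
```

Calling check_files([Path("CLAUDE.md"), Path("AGENTS.md")]) from a pre-commit hook or CI step would return an empty dict when both files comply, which makes the constraint enforceable before an approval request is even raised.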
2. Comprehensive Report Methodology (General Context Engineering)
2.1 Interactive Report Process
Expectation: A comprehensive report on how to start over, including deep repo search and breaking up reports for manageable review.
2.2 Six-Checkpoint Process (Permission Required at Each Stage)
Phase 1: Current State Analysis
CHECKPOINT 1: Repository Inventory Report
- Complete file structure analysis, codebase assessment, documentation review
- Configuration system analysis, dependencies and technical debt
- Media processing pipeline analysis, YouTube API integration assessment
- REQUIRES APPROVAL before proceeding
CHECKPOINT 2: Historical Context Report
- Analysis of built/discarded media processing features, development patterns
- Failed approaches to content generation, lessons learned, success patterns to preserve
- YouTube summarizer evolution analysis, educational content generation experiments
- REQUIRES APPROVAL before proceeding
Phase 2: Strategic Planning
CHECKPOINT 3: Architecture Design Report
- Modular backend architecture for media processing, database migration strategy
- Testing framework design, context engineering system design for AI workflows
- Content generation pipeline architecture, educational output formatting system
- REQUIRES APPROVAL before proceeding
CHECKPOINT 4: Team Structure Report
- Role definitions for media processing team, skill requirements, collaboration workflows
- Communication protocols and decision-making distribution for content generation pipeline
- Educational content specialist roles, AI workflow coordination protocols
- REQUIRES APPROVAL before proceeding
Phase 3: Implementation Roadmap
CHECKPOINT 5: Technical Migration Report
- uv package manager migration, documentation consolidation
- Code quality standards, development environment setup for media processing
- YouTube API integration migration, content generation workflow setup
- REQUIRES APPROVAL before proceeding
CHECKPOINT 6: Product Vision Report
- Feature prioritization matrix for educational content types, development phases and milestones
- Success metrics for content generation quality, KPIs for user engagement with educational outputs
- Risk mitigation strategies for AI content generation, media processing reliability
- FINAL APPROVAL required before implementation
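The six checkpoints above form a strict sequence in which nothing proceeds without approval. One way to make that rule explicit for an agent or a script is a small gate object. This is an illustrative sketch only: the class and method names are hypothetical, and how approvals are actually recorded (chat confirmation, a file, a ticket) is left open.

```python
from dataclasses import dataclass, field
from typing import Optional

# Checkpoint names taken from the six reports listed in this brief.
CHECKPOINTS = [
    "Repository Inventory Report",
    "Historical Context Report",
    "Architecture Design Report",
    "Team Structure Report",
    "Technical Migration Report",
    "Product Vision Report",
]


class ApprovalRequired(Exception):
    """Raised when work is attempted out of order or before approval."""


@dataclass
class CheckpointGate:
    approved: list[str] = field(default_factory=list)

    def current(self) -> Optional[str]:
        """Next checkpoint awaiting approval, or None once all are approved."""
        if len(self.approved) < len(CHECKPOINTS):
            return CHECKPOINTS[len(self.approved)]
        return None

    def start(self, name: str) -> None:
        """Permit work only on the checkpoint that is currently open."""
        if name != self.current():
            raise ApprovalRequired(
                f"cannot start {name!r}; waiting on {self.current()!r}"
            )

    def approve(self, name: str) -> None:
        """Record approval for the open checkpoint, unlocking the next one."""
        self.start(name)
        self.approved.append(name)

    def implementation_allowed(self) -> bool:
        """True only after the final checkpoint has been approved."""
        return self.current() is None
```

With a fresh gate, start("Architecture Design Report") raises ApprovalRequired because the first two checkpoints are still unapproved, which mirrors the REQUIRES APPROVAL lines above.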
3. Trax-Specific Implementation Plan
3.1 Development Roadmap
Phase 1: CLI Transcription Service
- Goal: Iterate back to an enhanced CLI transcription service for YouTube content
- Focus: Backend-first approach with modular workflows for transcript processing
- Requirements: Robust testing, clean architecture, YouTube API integration
Phase 2: Directus Integration
- Goal: Add connection to Directus CMS for content management
- Focus: Database reliability and migration testing for media content storage
- Requirements: Excellent migration testing suite, content metadata management
Phase 3: Frontend Development
- Goal: Develop frontend interface for content viewing and management
- Focus: Tailwind + Vanilla JS approach for educational content display
- Requirements: Separate from backend development, responsive design for study materials
Phase 4: AI Content Generation
- Goal: Add AI-powered content generation (summaries, glossaries, study guides)
- Focus: Context engineering and AI agent integration for educational content creation
- Requirements: Strong context management system, multiple AI workflow pipelines
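The roadmap calls for modular workflows and multiple pipelines built on a transcript. A minimal sketch of what "modular" could mean in practice is shown here: each step is a plain function that takes and returns a document, so steps can be added, removed, or reordered independently. All names are hypothetical, and the step bodies are trivial placeholders standing in for real AI calls.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Document:
    transcript: str
    outputs: dict[str, str] = field(default_factory=dict)


# A step is any function that takes a Document and returns a Document.
Step = Callable[[Document], Document]


def summarize(doc: Document) -> Document:
    # Placeholder: the first sentence stands in for an AI-generated summary.
    first = doc.transcript.split(".")[0].strip()
    doc.outputs["summary"] = first + "." if first else ""
    return doc


def glossary(doc: Document) -> Document:
    # Placeholder: capitalized words stand in for extracted glossary terms.
    terms = sorted(
        {w.strip(".,") for w in doc.transcript.split() if w[:1].isupper()}
    )
    doc.outputs["glossary"] = ", ".join(terms)
    return doc


def run_pipeline(doc: Document, steps: list[Step]) -> Document:
    """Apply each step in order; the pipeline itself knows nothing about steps."""
    for step in steps:
        doc = step(doc)
    return doc
```

A call such as run_pipeline(Document(text), [summarize, glossary]) fills both outputs, and a study-guide step could later be appended to the list without touching the existing ones. Because each step can be tested in isolation, this shape also suits the backend-first, testing-heavy approach described in Phase 1.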
3.2 Required Team Structure
- Backend Python Developer (and separate researcher) - for transcript processing and AI workflows
- Audio Engineer Specialist - for media processing and quality assurance
- Tailwind + Vanilla JS Researcher (and separate frontend developer) - for educational content display
- AI/Machine Learning Deep Researcher (and separate developer) - for content generation algorithms
3.3 Deliverables Required
- PRODUCT-VISION.md Report: Historical analysis of media processing features, lessons learned, clear product vision for educational content generation
- Team Structure Recommendation: Role definitions and collaboration protocols for media processing team
- Development Roadmap: Phased implementation plan with milestones for transcript-to-educational-content pipeline
4. Interactive Development Process
4.1 Permission-Based Workflow
CRITICAL: Each checkpoint requires explicit approval before proceeding. Ask clarifying questions and wait for confirmation.
4.2 Phase 1: Analysis & Discovery
CHECKPOINT 1: Repository Inventory
- Task: Deep dive into current codebase and documentation, especially media processing components
- Deliverable: Comprehensive technical analysis report including YouTube integration assessment
- Questions to Ask:
- What aspects of current media processing architecture should be preserved?
- Which dependencies are critical for content generation vs. replaceable?
- What technical debt in the transcript processing pipeline should be prioritized?
CHECKPOINT 2: Historical Context
- Task: Research project evolution and lessons learned, especially YouTube summarizer development
- Deliverable: Historical analysis and pattern recognition for media processing workflows
- Questions to Ask:
- Which failed approaches to content generation should be avoided?
- What successful patterns in educational content creation should be replicated?
- What media processing features are still desired but need better implementation?
4.3 Phase 2: Strategic Planning
CHECKPOINT 3: Architecture Design
- Task: Design modular backend architecture for media processing and content generation
- Deliverable: Technical architecture proposal for transcript-to-educational-content pipeline
- Questions to Ask:
- What level of modularity is desired for content generation workflows?
- Which architectural patterns align with your vision for educational content processing?
- What are the critical non-functional requirements for media processing reliability?
CHECKPOINT 4: Team Structure
- Task: Define roles, responsibilities, and workflows for media processing team
- Deliverable: Team structure and collaboration plan for content generation pipeline
- Questions to Ask:
- Are the proposed roles sufficient for your vision of educational content creation?
- What collaboration patterns work best for media processing workflows?
- How should decision-making be distributed across the content generation pipeline?
4.4 Phase 3: Implementation Planning
CHECKPOINT 5: Technical Migration
- Task: Plan technical implementation and migration for media processing system
- Deliverable: Detailed implementation roadmap for transcript processing and content generation
- Questions to Ask:
- What migration approach minimizes risk for YouTube API integration?
- Which technical decisions need your input for content generation workflows?
- What rollback strategies should be planned for the media processing pipeline?
CHECKPOINT 6: Product Vision
- Task: Define product roadmap and success metrics for educational content generation
- Deliverable: Comprehensive product vision document for media processing platform
- Questions to Ask:
- Does this vision align with your long-term goals for educational content creation?
- Are the success metrics meaningful for content quality and user engagement?
- What risks or concerns need additional attention for AI content generation reliability?
5. Quality Assurance & Success Criteria
5.1 Quality Assurance Requirements
- Documentation Standards: Ensure all docs meet 600 LOC limit
- Approval Workflows: Implement change management processes
- Testing Framework: Create comprehensive test suite
- Development Processes: Establish systematic workflows
5.2 Success Criteria
- Clean Architecture: Maintainable, modular codebase with clear context for media processing
- Reliable Agents: Robust AI agent system with strong context management for content generation
- Systematic Development: Prevent future chaos through proper processes for educational content creation
- Scalable Team: Clear documentation and structure for team growth in media processing
- Comprehensive Reports: Detailed analysis and planning documents for content generation pipeline
- Interactive Process: Maintain your control and input throughout development
- Permission-Based Workflow: No major decisions made without your approval
- Content Quality: High-quality educational outputs (summaries, glossaries, study guides)
- Media Processing Reliability: Robust YouTube transcript processing and content transformation
5.3 Communication Protocol
- Checkpoint Reviews: Each checkpoint requires your review and approval
- Clarifying Questions: Developer must ask specific questions at each stage
- Decision Points: All architectural and strategic decisions need your input
- Progress Updates: Regular status updates between checkpoints
- Risk Escalation: Immediate notification of any blockers or concerns