
Trax Project Rewrite - Context Engineering Brief

Note: This is a prompt/context engineering document, not a final specification. It provides the framework and requirements for the next developer to work from.

Executive Summary

Objective: Complete rewrite of ../app/trax with a focus on deterministic, reliable AI coding agents and robust context management.

Key Problem: Previous project failed due to insufficient context engineering, unclear rules, and non-systematic development.

Solution: Implement a systematic, permission-based development process with comprehensive reporting.

Core Concept: Trax (and the YouTube summarizer it evolved from) is a media content processing system that:

Starts with YouTube transcripts as the foundation
Runs content through various AI workflows
Produces summaries, glossaries, study guides, and other educational content
Uses AI agents to transform raw media into structured, educational outputs

1. Project Context & Requirements

1.1 Core Trax Project Goals

Package Manager: Migrate from pip to uv
Documentation: Consolidate CLAUDE.md and AGENTS.md (600 LOC limit)
Context Engineering: Establish robust AI agent context management for media processing workflows
Backend-First: Start with CLI transcription service, then Directus, then frontend
Database Reliability: Excellent testing for migrations
Media Processing: Build AI workflows for transforming YouTube transcripts into educational content

1.2 Development Philosophy

Context Engineering: Previous project failed due to insufficient context engineering
Rule Setting: Poor rule establishment led to project chaos
Systematic Approach: Need to shift from ad-hoc to systematic development
Sequential Development: Avoid simultaneous backend/frontend development
Modular Design: Ensure workflows and pipelines are modular

1.3 Documentation Constraints

Strict LOC Limits: All documentation must remain under 600 lines (the "600 LOC" limit counts lines of documentation text, not code)
Approval Process: Changes to CLAUDE.md or AGENTS.md require explicit approval
Changelog Requirements: Comprehensive changelog for all modifications
Update Protocol: Request approval before starting new work or after completing tasks
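
The line limit above lends itself to an automated check. The following is a minimal sketch, assuming the limit applies per file and that CLAUDE.md and AGENTS.md live in the repository root; the function names and the per-file interpretation are illustrative assumptions, not part of this brief.

```python
from pathlib import Path

MAX_DOC_LINES = 600  # limit stated in this brief
GOVERNED_DOCS = ("CLAUDE.md", "AGENTS.md")  # files named in this brief


def count_lines(text: str) -> int:
    """Count lines as an editor would (a trailing newline adds no extra line)."""
    return len(text.splitlines())


def over_limit(root: Path, names=GOVERNED_DOCS, limit: int = MAX_DOC_LINES) -> dict:
    """Return {filename: line_count} for each governed doc at or above the limit."""
    offenders = {}
    for name in names:
        path = root / name
        if not path.is_file():
            continue  # a missing file is not treated as a violation here
        lines = count_lines(path.read_text(encoding="utf-8"))
        if lines >= limit:  # "under 600" means 600 itself is already too long
            offenders[name] = lines
    return offenders
```

A check like this could run before any approval request, so that a reviewer never has to count lines by hand.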

2. Comprehensive Report Methodology (General Context Engineering)

2.1 Interactive Report Process

Expectation: A comprehensive report on how to start over, based on a deep search of the repository and split into smaller reports that can each be reviewed in one sitting.

2.2 Six-Checkpoint Process (Permission Required at Each Stage)

Phase 1: Current State Analysis

CHECKPOINT 1: Repository Inventory Report

Complete file structure analysis, codebase assessment, documentation review
Configuration system analysis, dependencies and technical debt
Media processing pipeline analysis, YouTube API integration assessment
REQUIRES APPROVAL before proceeding

CHECKPOINT 2: Historical Context Report

Analysis of built/discarded media processing features, development patterns
Failed approaches to content generation, lessons learned, success patterns to preserve
YouTube summarizer evolution analysis, educational content generation experiments
REQUIRES APPROVAL before proceeding

Phase 2: Strategic Planning

CHECKPOINT 3: Architecture Design Report

Modular backend architecture for media processing, database migration strategy
Testing framework design, context engineering system design for AI workflows
Content generation pipeline architecture, educational output formatting system
REQUIRES APPROVAL before proceeding

CHECKPOINT 4: Team Structure Report

Role definitions for media processing team, skill requirements, collaboration workflows
Communication protocols and decision-making distribution for content generation pipeline
Educational content specialist roles, AI workflow coordination protocols
REQUIRES APPROVAL before proceeding

Phase 3: Implementation Roadmap

CHECKPOINT 5: Technical Migration Report

uv package manager migration, documentation consolidation
Code quality standards, development environment setup for media processing
YouTube API integration migration, content generation workflow setup
REQUIRES APPROVAL before proceeding

CHECKPOINT 6: Product Vision Report

Feature prioritization matrix for educational content types, development phases and milestones
Success metrics for content generation quality, KPIs for user engagement with educational outputs
Risk mitigation strategies for AI content generation, media processing reliability
FINAL APPROVAL required before implementation
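
The approval gating described above can be modelled as a small state machine. This is a sketch only, assuming approvals are recorded strictly in order; the class and method names are hypothetical, and only the checkpoint titles come from this brief.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Checkpoint titles as listed in this brief, in required order.
CHECKPOINTS = (
    "Repository Inventory Report",
    "Historical Context Report",
    "Architecture Design Report",
    "Team Structure Report",
    "Technical Migration Report",
    "Product Vision Report",
)


@dataclass
class CheckpointGate:
    """Tracks approved checkpoints and refuses out-of-order approvals."""

    approved: List[str] = field(default_factory=list)

    @property
    def current(self) -> Optional[str]:
        """The checkpoint awaiting approval, or None once all are approved."""
        if len(self.approved) < len(CHECKPOINTS):
            return CHECKPOINTS[len(self.approved)]
        return None

    def approve(self, name: str) -> None:
        """Record an approval; only the current checkpoint may be approved."""
        if name != self.current:
            raise PermissionError(
                f"cannot approve {name!r}; awaiting {self.current!r}"
            )
        self.approved.append(name)

    def may_implement(self) -> bool:
        """Implementation may begin only after the final approval."""
        return self.current is None
```

Raising an error on an out-of-order approval, rather than silently ignoring it, keeps a skipped checkpoint visible to both the agent and the reviewer.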

3. Trax-Specific Implementation Plan

3.1 Development Roadmap

Phase 1: CLI Transcription Service

Goal: Return to a CLI-based enhanced transcription service for YouTube content and iterate from there
Focus: Backend-first approach with modular workflows for transcript processing
Requirements: Robust testing, clean architecture, YouTube API integration
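
As a starting point for Phase 1, a CLI skeleton might look like the sketch below. The program name `trax`, the `transcribe` subcommand, and the `--output` flag are assumptions made for illustration. Transcript fetching is deliberately left out, and only standard watch and youtu.be URL forms are handled.

```python
import argparse
from urllib.parse import parse_qs, urlparse

YOUTUBE_HOSTS = ("youtube.com", "www.youtube.com", "m.youtube.com")


def extract_video_id(url: str) -> str:
    """Pull the video ID out of a watch URL or a youtu.be short link."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        video_id = parsed.path.lstrip("/")
    elif host in YOUTUBE_HOSTS:
        video_id = parse_qs(parsed.query).get("v", [""])[0]
    else:
        video_id = ""
    if not video_id:
        raise ValueError(f"not a recognised YouTube URL: {url!r}")
    return video_id


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical command layout: trax transcribe URL [--output PATH]."""
    parser = argparse.ArgumentParser(prog="trax", description="Transcript CLI (sketch)")
    sub = parser.add_subparsers(dest="command", required=True)
    transcribe = sub.add_parser("transcribe", help="process one YouTube video")
    transcribe.add_argument("url")
    transcribe.add_argument("--output", default="-", help="output path, '-' for stdout")
    return parser


def main(argv=None) -> int:
    args = build_parser().parse_args(argv)
    video_id = extract_video_id(args.url)
    # Fetching and processing the transcript is out of scope for this sketch.
    print(f"would transcribe video {video_id} -> {args.output}")
    return 0
```

Keeping URL parsing in a pure function separate from argument handling makes it straightforward to unit-test without invoking the CLI.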

Phase 2: Directus Integration

Goal: Add connection to Directus CMS for content management
Focus: Database reliability and migration testing for media content storage
Requirements: Excellent migration testing suite, content metadata management
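
One common shape for a migration test is an upgrade/downgrade round trip. The sketch below assumes an in-memory SQLite database as a stand-in for whatever database Directus is eventually connected to; the table names and the migration tuple format are hypothetical.

```python
import sqlite3

# Hypothetical migrations: (version, upgrade SQL, downgrade SQL).
MIGRATIONS = [
    (
        1,
        "CREATE TABLE media (id INTEGER PRIMARY KEY, youtube_id TEXT NOT NULL)",
        "DROP TABLE media",
    ),
    (
        2,
        "CREATE TABLE transcript ("
        "id INTEGER PRIMARY KEY, "
        "media_id INTEGER NOT NULL REFERENCES media(id), "
        "body TEXT NOT NULL)",
        "DROP TABLE transcript",
    ),
]


def table_names(conn):
    """List user tables in name order."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]


def upgrade(conn):
    """Apply every migration in ascending version order."""
    for _version, up_sql, _down_sql in MIGRATIONS:
        conn.execute(up_sql)


def downgrade(conn):
    """Revert every migration in descending version order."""
    for _version, _up_sql, down_sql in reversed(MIGRATIONS):
        conn.execute(down_sql)


def test_round_trip():
    """Upgrading then downgrading should leave an empty schema."""
    conn = sqlite3.connect(":memory:")
    upgrade(conn)
    assert table_names(conn) == ["media", "transcript"]
    downgrade(conn)
    assert table_names(conn) == []
    conn.close()
```

A round-trip test of this kind catches downgrade scripts that were never exercised, which is a frequent source of failed rollbacks.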

Phase 3: Frontend Development

Goal: Develop frontend interface for content viewing and management
Focus: Tailwind + Vanilla JS approach for educational content display
Requirements: Separate from backend development, responsive design for study materials

Phase 4: AI Content Generation

Goal: Add AI-powered content generation (summaries, glossaries, study guides)
Focus: Context engineering and AI agent integration for educational content creation
Requirements: Strong context management system, multiple AI workflow pipelines
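
To illustrate what modular workflow pipelines could mean in practice, the sketch below composes independent steps that each take and return a plain dictionary. The step names are hypothetical, and the summary step is a non-AI placeholder standing in for a model call.

```python
from typing import Callable

# A step receives the working document and returns an updated copy.
Step = Callable[[dict], dict]


def make_pipeline(*steps: Step) -> Step:
    """Compose steps into one callable that runs them left to right."""

    def run(doc: dict) -> dict:
        for step in steps:
            doc = step(doc)
        return doc

    return run


def normalize_transcript(doc: dict) -> dict:
    """Collapse runs of whitespace in the transcript to single spaces."""
    return {**doc, "transcript": " ".join(doc["transcript"].split())}


def placeholder_summary(doc: dict) -> dict:
    """Placeholder for an AI step: uses the first sentence as the summary."""
    first = doc["transcript"].split(". ")[0].rstrip(".")
    return {**doc, "summary": first + "."}


summarize = make_pipeline(normalize_transcript, placeholder_summary)
result = summarize({"transcript": "Cells  divide.\n They grow."})
print(result["summary"])  # prints "Cells divide."
```

Because each step has the same signature, a glossary or study-guide step can be added or swapped without touching the others.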

3.2 Required Team Structure

Backend Python Developer (and separate researcher) - for transcript processing and AI workflows
Audio Engineer Specialist - for media processing and quality assurance
Tailwind + Vanilla JS Researcher (and separate frontend developer) - for educational content display
AI/Machine Learning Deep Researcher (and separate developer) - for content generation algorithms

3.3 Deliverables Required

PRODUCT-VISION.md Report: Historical analysis of media processing features, lessons learned, clear product vision for educational content generation
Team Structure Recommendation: Role definitions and collaboration protocols for media processing team
Development Roadmap: Phased implementation plan with milestones for transcript-to-educational-content pipeline

4. Interactive Development Process

4.1 Permission-Based Workflow

CRITICAL: Each checkpoint requires explicit approval before proceeding. Ask clarifying questions and wait for confirmation.

4.2 Phase 1: Analysis & Discovery

CHECKPOINT 1: Repository Inventory

Task: Deep dive into current codebase and documentation, especially media processing components
Deliverable: Comprehensive technical analysis report including YouTube integration assessment
Questions to Ask:
- What aspects of current media processing architecture should be preserved?
- Which dependencies are critical for content generation vs. replaceable?
- What technical debt in the transcript processing pipeline should be prioritized?
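
A first pass at the repository inventory can be automated before any manual review. This is a rough sketch, assuming a file count per extension is a useful starting signal; the skipped directory names are assumptions about the repository layout.

```python
from collections import Counter
from pathlib import Path

# Directories assumed to hold tooling output rather than project source.
SKIP_DIRS = {".git", ".venv", "node_modules", "__pycache__"}


def inventory(root: Path) -> Counter:
    """Count files per suffix under root, skipping tooling directories."""
    counts = Counter()
    for path in root.rglob("*"):
        if not path.is_file():
            continue
        if SKIP_DIRS.intersection(path.relative_to(root).parts):
            continue
        counts[path.suffix or "(none)"] += 1
    return counts
```

Calling `inventory(Path("."))` from the repository root returns a `Counter` that can be pasted into the Checkpoint 1 report as a summary table.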

CHECKPOINT 2: Historical Context

Task: Research project evolution and lessons learned, especially YouTube summarizer development
Deliverable: Historical analysis and pattern recognition for media processing workflows
Questions to Ask:
- Which failed approaches to content generation should be avoided?
- What successful patterns in educational content creation should be replicated?
- What media processing features are still desired but need better implementation?

4.3 Phase 2: Strategic Planning

CHECKPOINT 3: Architecture Design

Task: Design modular backend architecture for media processing and content generation
Deliverable: Technical architecture proposal for transcript-to-educational-content pipeline
Questions to Ask:
- What level of modularity is desired for content generation workflows?
- Which architectural patterns align with your vision for educational content processing?
- What are the critical non-functional requirements for media processing reliability?

CHECKPOINT 4: Team Structure

Task: Define roles, responsibilities, and workflows for media processing team
Deliverable: Team structure and collaboration plan for content generation pipeline
Questions to Ask:
- Are the proposed roles sufficient for your vision of educational content creation?
- What collaboration patterns work best for media processing workflows?
- How should decision-making be distributed across the content generation pipeline?

4.4 Phase 3: Implementation Planning

CHECKPOINT 5: Technical Migration

Task: Plan technical implementation and migration for media processing system
Deliverable: Detailed implementation roadmap for transcript processing and content generation
Questions to Ask:
- What migration approach minimizes risk for YouTube API integration?
- Which technical decisions need your input for content generation workflows?
- What rollback strategies should be planned for the media processing pipeline?

CHECKPOINT 6: Product Vision

Task: Define product roadmap and success metrics for educational content generation
Deliverable: Comprehensive product vision document for media processing platform
Questions to Ask:
- Does this vision align with your long-term goals for educational content creation?
- Are the success metrics meaningful for content quality and user engagement?
- What risks or concerns need additional attention for AI content generation reliability?

5. Quality Assurance & Success Criteria

5.1 Quality Assurance Requirements

Documentation Standards: Ensure all docs meet the 600 LOC limit
Approval Workflows: Implement change management processes
Testing Framework: Create comprehensive test suite
Development Processes: Establish systematic workflows

5.2 Success Criteria

Clean Architecture: Maintainable, modular codebase with clear context for media processing
Reliable Agents: Robust AI agent system with strong context management for content generation
Systematic Development: Prevent future chaos through proper processes for educational content creation
Scalable Team: Clear documentation and structure for team growth in media processing
Comprehensive Reports: Detailed analysis and planning documents for content generation pipeline
Interactive Process: Maintain your control and input throughout development
Permission-Based Workflow: No major decisions made without your approval
Content Quality: High-quality educational outputs (summaries, glossaries, study guides)
Media Processing Reliability: Robust YouTube transcript processing and content transformation

5.3 Communication Protocol

Checkpoint Reviews: Each checkpoint requires your review and approval
Clarifying Questions: Developer must ask specific questions at each stage
Decision Points: All architectural and strategic decisions need your input
Progress Updates: Regular status updates between checkpoints
Risk Escalation: Immediate notification of any blockers or concerns
