trax/.cursor/rules/agents.mdc

254 lines
7.5 KiB
Plaintext

---
description: Project context and navigation hub - loaded by agent_workflow.mdc
globs: AGENTS.md, .cursor/rules/*.mdc
alwaysApply: false
---
# AGENTS.md Usage Guidelines
**⚠️ IMPORTANT: This rule is loaded by agent_workflow.mdc. Do not read directly.**
## Purpose and Scope
**AGENTS.md** serves as the **project onboarding document** and **navigation hub** for the Trax project. It provides high-level context, quick start information, and links to detailed documentation.
## Project Context
Trax is a subproject within the my-ai-projects ecosystem that uses the ultra-fast `uv` package manager for Python dependency management. The project inherits all API tokens from the root project's `.env` file located at `../../.env`.
**Core Mission**: Deterministic, iterative media transcription platform that transforms raw audio/video into structured, enhanced, and searchable text content through progressive AI-powered processing.
## When to Reference AGENTS.md
- **Project Overview**: Understanding what Trax is and its core mission
- **Quick Start**: Essential commands and development setup
- **Current Status**: Project phase, completed work, and next milestones
- **Navigation**: Finding detailed documentation and resources
- **Team Context**: Understanding the project's place in the my-ai-projects ecosystem
## Quick Start
### Essential Commands
```sh
# Install dependencies in development mode
uv pip install -e ".[dev]"
# Start the development server
uv run python src/main.py
# Run all tests with coverage
uv run pytest
# Format and lint code
uv run black src/ tests/
uv run ruff check --fix src/ tests/
```
### Development Workflow
```sh
# Get next task to work on
./scripts/tm_master.sh next
# Start working on a task
./scripts/tm_master.sh start 15
# Complete a task
./scripts/tm_master.sh done 15
# Search for tasks
./scripts/tm_master.sh search whisper
```
## Project Status
### Current Phase: Foundation (Weeks 1-2)
**Goal**: Working CLI transcription tool
**✅ Completed**:
- PostgreSQL database setup with JSONB
- YouTube metadata extraction and download pipeline
- CLI implementation with Click
**🚧 Ready for Implementation**:
- Basic Whisper transcription service (v1)
- JSON/TXT export functionality
**🎯 Next Milestones**:
- Process 5-minute audio in <30 seconds
- 95% transcription accuracy on clear audio
### Version Progression
- **v1**: Basic transcription (95% accuracy, <30s for 5min audio)
- **v2**: AI enhancement (99% accuracy, <35s processing)
- **v3**: Multi-pass accuracy (99.5% accuracy, <25s processing)
- **v4**: Speaker diarization (90% speaker accuracy)
## Key Tools & Features
### Research Agent
Powerful Streamlit Research Agent with Perplexity AI for real-time web search:
```sh
# Launch the web interface
python launch_research_agent.py
# Quick CLI research
python -m src.cli.main research "your research question"
```
### Taskmaster Integration
Fast task management using CLI directly:
```sh
# Get project overview
task-master list
# Find next task
task-master next
# Show task details
task-master show <id>
# Start working on a task
./scripts/tm_workflow_simple.sh start <id>
# Update progress
./scripts/tm_workflow_simple.sh update <id> <message>
# Complete a task
./scripts/tm_workflow_simple.sh complete <id>
```
### Cursor Rules System
Advanced development rules for consistent code patterns:
```sh
# Analyze current rules
./scripts/generate_rules.sh --analyze
# Generate rules for new features
./scripts/generate_rules.sh --generate src/services --type python
```
## Common Workflows
### Adding New Dependencies
```sh
# Add production dependency
uv pip install package-name
# Add development dependency
uv pip install package-name --dev
# Update requirements.txt
uv pip compile pyproject.toml -o requirements.txt
```
### Database Changes
```sh
# Create new migration
alembic revision -m "description"
# Apply migrations
alembic upgrade head
# Check current version
alembic current
```
### Debugging
```sh
# Start interactive Python shell
uv run ipython
# Run with debug logging
uv run python -m src.main --debug
```
## Performance Targets
### Audio Processing
- **Model**: distil-large-v3 for M3 optimization (20-70x speed improvement)
- **Preprocessing**: Convert to 16kHz mono WAV (3x data reduction)
- **Memory**: <2GB for v1 pipeline
### Caching Strategy
- **Embeddings**: 24h TTL
- **Analysis**: 7d TTL
- **Queries**: 6h TTL
- **Compression**: LZ4 for storage efficiency
## Documentation Separation
- **AGENTS.md**: Project context, quick start, navigation
- **Cursor Rules**: Technical implementation patterns and constraints
- **Detailed Docs**: In-depth technical documentation in `docs/`
## Rule Loading Information
**This rule is automatically loaded by [agent_workflow.mdc](mdc:.cursor/rules/agent_workflow.mdc) for all operations.**
**For decision-making and workflow guidance, see [agent_workflow.mdc](mdc:.cursor/rules/agent_workflow.mdc).**
## Reference Documentation
### Development Rules & Patterns
- **[Cursor Rules](./.cursor/rules/)** - Detailed development rules and patterns
- **[Implementation Guide](./docs/CURSOR_RULES_IMPLEMENTATION.md)** - Setup and maintenance
- **[Rule Templates](./.cursor/rules/templates/rule-templates.mdc)** - Rule creation templates
### Architecture & Design
- **[Development Patterns](./docs/architecture/development-patterns.md)** - Historical learnings
- **[Audio Processing](./docs/architecture/audio-processing.md)** - Audio pipeline architecture
- **[Iterative Pipeline](./docs/architecture/iterative-pipeline.md)** - Version progression
### Project Reports
- **[Product Vision](./docs/reports/06-product-vision.md)** - Product goals and roadmap
- **[Technical Migration](./docs/reports/05-technical-migration.md)** - Migration strategy
- **[Executive Summary](./EXECUTIVE-SUMMARY.md)** - High-level project overview
### Development Tools
- **[Taskmaster Helper Scripts](./scripts/README_taskmaster_helpers.md)** - CLI helper scripts
- **[Research Agent](./docs/RESEARCH_AGENT.md)** - Research agent documentation
- **[CLI Reference](./docs/CLI.md)** - Command-line interface documentation
### Test Data
- **[Test Videos](./videos.csv)** - Collection of YouTube URLs for testing
## Quick Reference
### File Organization
- Keep each file under 300 LOC (350 max if justified)
- Use meaningful file and function names
- Group related functionality in modules
### Code Style
- **Python Version**: 3.11+ with strict type checking
- **Formatting**: Black with line length 100
- **Linting**: Ruff with auto-fix enabled
- **Type Checking**: MyPy strict mode
### Critical Patterns
- **Backend-First Development**: Get data layer right before UI
- **Test-First**: Write test, then implementation
- **Download-First**: Never stream media, always download first
- **Real Files Testing**: Use actual audio files, no mocks
- **Protocol-Based Services**: Use typing.Protocol for all service interfaces
## Troubleshooting
### Common Issues
- **Missing .env file**: Ensure `../../.env` exists in the root project
- **Import errors**: Check that dependencies are installed with `uv pip install -e ".[dev]"`
- **Type errors**: Run `uv run mypy src/` to identify issues
- **Formatting issues**: Run `uv run black src/ tests/` to auto-format
### Getting Help
- Review existing code patterns in `src/` directory
- Consult the project maintainers for architecture decisions
## Best Practices
- **Use Cursor rules** for technical implementation guidance
- **Maintain consistency** between documentation and implementation
- **Update documentation** when project status or workflows change
- **Link to specific rules** when referencing technical patterns