# AGENTS.md - Project Onboarding

AGENTS.md is for defining agent instructions. It ONLY works in the project's root directory.

It's perfect for projects that need simple, readable instructions without the overhead of structured rules.

---
## Project Context

Trax is a subproject within the my-ai-projects ecosystem that uses the ultra-fast `uv` package manager for Python dependency management. The project inherits all API tokens from the root project's `.env` file located at `../../.env`.
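
The token inheritance described above could be sketched as follows. This is a minimal illustration, assuming the root `.env` holds plain `KEY=value` lines and that `../../.env` is resolved relative to the subproject directory. The helpers `parse_env_text` and `load_root_env` are hypothetical names, not part of Trax, and use only the standard library. Variables already present in the environment are left untouched.

```python
import os
from pathlib import Path


def parse_env_text(text: str) -> dict[str, str]:
    """Parse simple KEY=value lines, skipping blanks, # comments, and malformed lines."""
    values: dict[str, str] = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("'\"")
    return values


def load_root_env(project_dir: Path) -> dict[str, str]:
    """Read ../../.env relative to project_dir and export any variables not yet set."""
    env_path = (project_dir / ".." / ".." / ".env").resolve()
    if not env_path.is_file():
        raise FileNotFoundError(f"Expected root .env at {env_path}")
    values = parse_env_text(env_path.read_text(encoding="utf-8"))
    for key, value in values.items():
        # Variables already set in the environment take precedence.
        os.environ.setdefault(key, value)
    return values
```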

**Core Mission**: Deterministic, iterative media transcription platform that transforms raw audio/video into structured, enhanced, and searchable text content through progressive AI-powered processing.

---
## Quick Start

### Essential Commands
```sh
# Install dependencies in development mode
uv pip install -e ".[dev]"

# Start the development server
uv run python src/main.py

# Run all tests with coverage
uv run pytest

# Format and lint code
uv run black src/ tests/
uv run ruff check --fix src/ tests/
```

### Development Workflow
```sh
# Get next task to work on
./scripts/tm_master.sh next

# Start working on a task
./scripts/tm_master.sh start 15

# Complete a task
./scripts/tm_master.sh done 15

# Search for tasks
./scripts/tm_master.sh search whisper
```

---
## Project Status

### Current Phase: Foundation (Weeks 1-2)
**Goal**: Working CLI transcription tool

**✅ Completed**:
- PostgreSQL database setup with JSONB
- YouTube metadata extraction and download pipeline
- CLI implementation with Click

**🚧 Ready for Implementation**:
- Basic Whisper transcription service (v1)
- JSON/TXT export functionality

**🎯 Next Milestones**:
- Process 5-minute audio in <30 seconds
- 95% transcription accuracy on clear audio
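
The speed milestone above can be restated as a real-time factor: processing time divided by audio duration. The short sketch below works through that arithmetic. The function name `real_time_factor` is illustrative only and does not refer to existing Trax code.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration (lower is faster)."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# Milestone from the list above: a 5-minute file processed in under 30 seconds.
target_rtf = real_time_factor(30, 5 * 60)  # 30 / 300
print(f"Target real-time factor: below {target_rtf:.2f}")
print(f"Equivalent speed-up: above {1 / target_rtf:.0f}x real time")
```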

### Version Progression
- **v1**: Basic transcription (95% accuracy, <30s for 5min audio)
- **v2**: AI enhancement (99% accuracy, <35s processing)
- **v3**: Multi-pass accuracy (99.5% accuracy, <25s processing)
- **v4**: Speaker diarization (90% speaker accuracy)

---
## Key Tools & Features

### Research Agent
A Streamlit-based research agent that uses Perplexity AI for real-time web search:

```sh
# Launch the web interface
python launch_research_agent.py

# Quick CLI research
python -m src.cli.main research "your research question"
```

### Taskmaster Integration
Manage tasks directly from the `task-master` CLI and the helper scripts:

```sh
# Get project overview
task-master list

# Find next task
task-master next

# Show task details
task-master show <id>

# Start working on a task
./scripts/tm_workflow_simple.sh start <id>

# Update progress
./scripts/tm_workflow_simple.sh update <id> <message>

# Complete a task
./scripts/tm_workflow_simple.sh complete <id>
```

### Cursor Rules System
Development rules that keep code patterns consistent:

```sh
# Analyze current rules
./scripts/generate_rules.sh --analyze

# Generate rules for new features
./scripts/generate_rules.sh --generate src/services --type python
```

---
## Common Workflows

### Adding New Dependencies
```sh
# Add production dependency (recorded in pyproject.toml)
uv add package-name

# Add development dependency to the "dev" extra installed by ".[dev]"
uv add --optional dev package-name

# Update requirements.txt
uv pip compile pyproject.toml -o requirements.txt
```

### Database Changes
```sh
# Create new migration
alembic revision -m "description"

# Apply migrations
alembic upgrade head

# Check current version
alembic current
```

### Debugging
```sh
# Start interactive Python shell
uv run ipython

# Run with debug logging
uv run python -m src.main --debug
```

---
## Performance Targets

### Audio Processing
- **Model**: distil-large-v3 for M3 optimization (20-70x speed improvement)
- **Preprocessing**: Convert to 16kHz mono WAV (3x data reduction)
- **Memory**: <2GB for v1 pipeline
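
The preprocessing step above could look roughly like the sketch below. It assumes `ffmpeg` is installed and is the conversion tool, which this document does not confirm. `build_wav_command` and `convert_to_wav` are hypothetical names, and the 16-bit PCM sample format is likewise an assumption. Keeping the command builder separate from the runner makes the arguments easy to inspect without invoking `ffmpeg`.

```python
import subprocess
from pathlib import Path


def build_wav_command(source: Path, target: Path) -> list[str]:
    """Build an ffmpeg command that writes 16 kHz mono 16-bit PCM WAV."""
    return [
        "ffmpeg",
        "-y",                 # overwrite the target if it already exists
        "-i", str(source),    # input media file (audio or video)
        "-vn",                # drop any video stream
        "-ac", "1",           # one channel (mono)
        "-ar", "16000",       # 16 kHz sample rate
        "-c:a", "pcm_s16le",  # uncompressed 16-bit PCM (assumed format)
        str(target),
    ]


def convert_to_wav(source: Path, target: Path) -> None:
    """Run the conversion; raises CalledProcessError if ffmpeg exits with an error."""
    subprocess.run(build_wav_command(source, target), check=True)
```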

### Caching Strategy
- **Embeddings**: 24h TTL
- **Analysis**: 7d TTL
- **Queries**: 6h TTL
- **Compression**: LZ4 for storage efficiency

---
## Troubleshooting

### Common Issues
- **Missing .env file**: Ensure `../../.env` exists in the root project
- **Import errors**: Check that dependencies are installed with `uv pip install -e ".[dev]"`
- **Type errors**: Run `uv run mypy src/` to identify issues
- **Formatting issues**: Run `uv run black src/ tests/` to auto-format

### Getting Help
- Check the `CLAUDE.md` file for detailed project context
- Review existing code patterns in `src/` directory
- Consult the project maintainers for architecture decisions

---
## Reference Documentation

### Development Rules & Patterns
- **[Cursor Rules](./.cursor/rules/)** - Detailed development rules and patterns
- **[Implementation Guide](./docs/CURSOR_RULES_IMPLEMENTATION.md)** - Setup and maintenance
- **[Rule Templates](./.cursor/rules/templates/rule-templates.mdc)** - Rule creation templates

### Architecture & Design
- **[Development Patterns](./docs/architecture/development-patterns.md)** - Historical learnings
- **[Audio Processing](./docs/architecture/audio-processing.md)** - Audio pipeline architecture
- **[Iterative Pipeline](./docs/architecture/iterative-pipeline.md)** - Version progression

### Project Reports
- **[Product Vision](./docs/reports/06-product-vision.md)** - Product goals and roadmap
- **[Technical Migration](./docs/reports/05-technical-migration.md)** - Migration strategy
- **[Executive Summary](./EXECUTIVE-SUMMARY.md)** - High-level project overview

### Development Tools
- **[Taskmaster Helper Scripts](./scripts/README_taskmaster_helpers.md)** - CLI helper scripts
- **[Research Agent](./docs/RESEARCH_AGENT.md)** - Research agent documentation
- **[CLI Reference](./docs/CLI.md)** - Command-line interface documentation

### Test Data
- **[Test Videos](./videos.csv)** - Collection of YouTube URLs for testing

---
## Quick Reference

### File Organization
- Keep each file under 300 LOC (350 max if justified)
- Use meaningful file and function names
- Group related functionality in modules
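
The size limits above could be checked with a short script such as the following sketch. It counts physical lines as a stand-in for LOC, which is a simplification. The helper names and the status labels are hypothetical. Calling `oversized_files(Path("src"))` would list the Python files that need attention.

```python
from pathlib import Path

SOFT_LIMIT = 300  # files should stay under this many lines
HARD_LIMIT = 350  # absolute maximum, allowed only with justification


def count_lines(path: Path) -> int:
    """Count physical lines in a text file."""
    with path.open(encoding="utf-8", errors="replace") as handle:
        return sum(1 for _ in handle)


def classify(line_count: int) -> str:
    """Map a line count to 'ok', 'needs-justification', or 'too-long'."""
    if line_count < SOFT_LIMIT:
        return "ok"
    if line_count <= HARD_LIMIT:
        return "needs-justification"
    return "too-long"


def oversized_files(root: Path) -> list[tuple[Path, int, str]]:
    """List Python files under root that are not within the soft limit."""
    report = []
    for path in sorted(root.rglob("*.py")):
        lines = count_lines(path)
        status = classify(lines)
        if status != "ok":
            report.append((path, lines, status))
    return report
```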

### Code Style
- **Python Version**: 3.11+ with strict type checking
- **Formatting**: Black with line length 100
- **Linting**: Ruff with auto-fix enabled
- **Type Checking**: MyPy strict mode

### Critical Patterns
- **Backend-First Development**: Get data layer right before UI
- **Test-First**: Write test, then implementation
- **Download-First**: Never stream media, always download first
- **Real Files Testing**: Use actual audio files, no mocks
- **Protocol-Based Services**: Use typing.Protocol for all service interfaces
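
The protocol-based pattern from the list above can be illustrated with `typing.Protocol`. In this minimal sketch the names `TranscriptionService`, `EchoTranscriber`, and `run_pipeline` are hypothetical and do not refer to existing Trax classes. The point is that an implementation satisfies the interface structurally, without inheriting from it, so callers depend only on the protocol.

```python
from pathlib import Path
from typing import Protocol


class TranscriptionService(Protocol):
    """Structural interface: any class with a matching transcribe() satisfies it."""

    def transcribe(self, audio_path: Path) -> str:
        ...


class EchoTranscriber:
    """Placeholder implementation; note that it does not inherit from the protocol."""

    def transcribe(self, audio_path: Path) -> str:
        return f"transcript of {audio_path.name}"


def run_pipeline(service: TranscriptionService, audio_path: Path) -> str:
    """Depend on the protocol rather than on a concrete class."""
    return service.transcribe(audio_path)
```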

---
*This document provides quick access to essential project information. For detailed development rules and patterns, see the [Cursor Rules](./.cursor/rules/) directory.*