Backend Developer Agent - Capabilities & Tools
🎯 Agent Overview
The Backend Python Developer Agent models the first backend developer hire for the Trax media processing platform. It has access to the tools and capabilities needed to build the protocol-based transcription pipeline from v1 through v4.
Agent Profile
- Name: Backend Python Developer
- Role: Senior Backend Developer
- Experience Level: Senior
- Salary Range: $150,000 - $200,000
- Current Focus: Phase 1: Foundation (Weeks 1-2)
🛠️ Available Tools by Category
1. Core Development Tools
Tools: 3 | Skills: 8
Python 3.11+ Development
- Async Programming: Write async/await code for concurrent operations
- Protocol Design: Create protocol-based service interfaces
- Type Hints: Use comprehensive type hints throughout
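As a minimal sketch of the async and type-hint skills listed above, the snippet below runs an I/O-bound step over several files concurrently while capping how many are in flight. The names `probe_duration` and `probe_all` are hypothetical, and the "duration" is a stand-in value rather than a real measurement.

```python
import asyncio
from pathlib import Path


async def probe_duration(path: Path) -> float:
    """Stand-in for an I/O-bound step (hypothetical; a real version would inspect the file)."""
    await asyncio.sleep(0.01)  # simulate waiting on I/O
    return float(len(path.name))  # fake value, for illustration only


async def probe_all(paths: list[Path], limit: int = 4) -> dict[Path, float]:
    """Probe files concurrently, with at most `limit` probes in flight at once."""
    gate = asyncio.Semaphore(limit)

    async def guarded(path: Path) -> tuple[Path, float]:
        async with gate:
            return path, await probe_duration(path)

    pairs = await asyncio.gather(*(guarded(p) for p in paths))
    return dict(pairs)


durations = asyncio.run(probe_all([Path("a.wav"), Path("talk.mp3")]))
```

The semaphore keeps concurrency bounded, which matters once the per-file step is something heavier than a sleep.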
uv Package Manager
- Install Dependencies: Install project dependencies
- Compile Requirements: Generate requirements.txt from pyproject.toml
- Run Commands: Execute Python commands with uv
Click CLI Framework
- Create transcription commands
- Build batch processing interface
- Implement export functionality
2. Database Tools
Tools: 2 | Skills: 4
PostgreSQL + SQLAlchemy
- Model Definition: Define SQLAlchemy models with JSONB
- Database Migrations: Create and apply Alembic migrations
- JSONB Operations: Perform JSONB queries and operations
Database Registry Pattern
- Implement centralized model registry
- Handle multiple database connections
- Manage model relationships
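One way to picture a centralized model registry is a small name-to-class map filled in by a decorator. This is a sketch only: `ModelRegistry` and the two placeholder classes are hypothetical, and plain classes stand in for SQLAlchemy models so the example runs without a database.

```python
from typing import Callable, TypeVar

T = TypeVar("T", bound=type)


class ModelRegistry:
    """Maps a short name to a model class (hypothetical helper, not the project's code)."""

    def __init__(self) -> None:
        self._models: dict[str, type] = {}

    def register(self, name: str) -> Callable[[T], T]:
        """Return a class decorator that records the class under `name`."""
        def decorator(cls: T) -> T:
            if name in self._models:
                raise ValueError(f"model {name!r} is already registered")
            self._models[name] = cls
            return cls
        return decorator

    def get(self, name: str) -> type:
        return self._models[name]

    def names(self) -> list[str]:
        return sorted(self._models)


registry = ModelRegistry()


@registry.register("transcript")
class Transcript:  # placeholder standing in for a SQLAlchemy model
    pass


@registry.register("media_file")
class MediaFile:  # placeholder standing in for a SQLAlchemy model
    pass
```

Rejecting duplicate names at registration time surfaces naming collisions at import rather than at query time.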
3. ML Integration Tools
Tools: 3 | Skills: 6
Whisper Integration
- Model Loading: Load Whisper models with faster-whisper
- Audio Transcription: Transcribe audio files with Whisper
- Chunking Strategy: Handle large audio files with chunking
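The chunking idea can be sketched as a function that splits a recording's duration into overlapping windows. The helper name `chunk_spans` is hypothetical, and the 30-second chunk with a 2-second overlap is an illustrative default, not a measured optimum.

```python
def chunk_spans(
    duration_s: float, chunk_s: float = 30.0, overlap_s: float = 2.0
) -> list[tuple[float, float]]:
    """Return (start, end) windows covering [0, duration_s] with the given overlap."""
    if chunk_s <= overlap_s:
        raise ValueError("chunk_s must be greater than overlap_s")
    spans: list[tuple[float, float]] = []
    start = 0.0
    step = chunk_s - overlap_s  # each new chunk starts this far after the previous one
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        spans.append((start, end))
        if end >= duration_s:
            break  # the last window already reaches the end of the audio
        start += step
    return spans
```

The overlap gives each boundary a little shared context, so words cut at the edge of one chunk appear whole in the next; de-duplicating the overlapping text is a separate step not shown here.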
Protocol-Based Services
- Design service interfaces
- Implement version compatibility
- Create swappable components
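A minimal sketch of what "swappable components" could look like with `typing.Protocol`: two stand-in services satisfy the same interface without inheriting from it. `FakeV1Service`, `FakeV2Service`, and the `Transcript` fields are assumptions made for illustration.

```python
import asyncio
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol, runtime_checkable


@dataclass
class Transcript:
    text: str
    pipeline_version: str  # assumed field, used here to show which service ran


@runtime_checkable
class TranscriptionService(Protocol):
    async def transcribe(self, audio: Path) -> Transcript: ...


class FakeV1Service:
    """Stand-in for a basic pipeline; no real transcription happens."""

    async def transcribe(self, audio: Path) -> Transcript:
        return Transcript(text=f"[v1] {audio.name}", pipeline_version="v1")


class FakeV2Service:
    """Stand-in for an enhanced pipeline with the same interface."""

    async def transcribe(self, audio: Path) -> Transcript:
        return Transcript(text=f"[v2] {audio.name}", pipeline_version="v2")


async def run(service: TranscriptionService, audio: Path) -> Transcript:
    # The caller depends only on the protocol, so either service can be passed in.
    return await service.transcribe(audio)


result = asyncio.run(run(FakeV2Service(), Path("demo.wav")))
```

Note that `runtime_checkable` only verifies that a `transcribe` attribute exists; signature checking is left to a static type checker.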
DeepSeek API Integration
- Enhance transcript quality
- Implement structured outputs
- Handle API rate limits
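Rate-limit handling is often implemented as retry with exponential backoff. The sketch below assumes a hypothetical `RateLimitError` raised by whatever client wrapper the project uses; it does not call the DeepSeek API or depend on any client library.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class RateLimitError(Exception):
    """Hypothetical error a client wrapper would raise on a rate-limit response."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 5,
    base_delay_s: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `fn`, retrying on RateLimitError with delays of base, 2x base, 4x base, ..."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: let the caller see the error
            sleep(base_delay_s * 2 ** attempt)
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function testable without real waiting; production code would likely also add jitter and honor any retry-after hint the API returns.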
4. Testing Tools
Tools: 2 | Skills: 4
pytest with Real Files
- Real File Testing: Test with actual audio files instead of mocks
- Test Fixtures: Create reusable test fixtures with real files
- Performance Testing: Benchmark transcription performance
Coverage Reporting
- Achieve >80% code coverage
- Identify untested code
- Track test quality
5. Architecture Tools
Tools: 3 | Skills: 3
Iterative Pipeline Design
- Version Management: Manage different pipeline versions
- Backward Compatibility: Ensure new versions work with old data
- Feature Flags: Enable/disable features by version
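Version-based feature flags can be as simple as a lookup table. In this sketch the flag names are assumptions derived from the phase descriptions in this document (basic Whisper, AI enhancement, multi-pass, diarization); the real pipeline may name or group them differently.

```python
# Features enabled per pipeline version (flag names are illustrative assumptions).
PIPELINE_FEATURES: dict[str, frozenset[str]] = {
    "v1": frozenset({"whisper"}),
    "v2": frozenset({"whisper", "ai_enhancement"}),
    "v3": frozenset({"whisper", "ai_enhancement", "multi_pass"}),
    "v4": frozenset({"whisper", "ai_enhancement", "multi_pass", "diarization"}),
}


def is_enabled(feature: str, version: str) -> bool:
    """Return True if `feature` is switched on for the given pipeline version."""
    try:
        return feature in PIPELINE_FEATURES[version]
    except KeyError:
        raise ValueError(f"unknown pipeline version: {version!r}") from None
```

Because each version's set is a superset of the previous one, data produced by an older version never relies on a feature that a newer version lacks.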
Batch Processing System
- Process multiple files
- Handle independent failures
- Track progress
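The three batch requirements above can be sketched together: each file is processed in its own task, a failure is recorded instead of aborting the batch, and a callback reports progress. `BatchResult`, `process_batch`, and `fake_worker` are hypothetical names, and the worker is a stub.

```python
import asyncio
from dataclasses import dataclass, field
from pathlib import Path
from typing import Awaitable, Callable, Optional


@dataclass
class BatchResult:
    succeeded: dict[Path, str] = field(default_factory=dict)
    failed: dict[Path, Exception] = field(default_factory=dict)


async def process_batch(
    paths: list[Path],
    worker: Callable[[Path], Awaitable[str]],
    on_progress: Optional[Callable[[int, int], None]] = None,
) -> BatchResult:
    """Run `worker` on every path; one file failing does not stop the others."""
    result = BatchResult()
    done = 0

    async def one(path: Path) -> None:
        nonlocal done
        try:
            result.succeeded[path] = await worker(path)
        except Exception as exc:  # record the failure and keep going
            result.failed[path] = exc
        done += 1
        if on_progress is not None:
            on_progress(done, len(paths))

    await asyncio.gather(*(one(p) for p in paths))
    return result


async def fake_worker(path: Path) -> str:
    """Stub worker: fails for one specific file name to show failure isolation."""
    if path.name == "bad.wav":
        raise ValueError("unreadable audio")
    return f"transcript of {path.name}"


progress: list[tuple[int, int]] = []
batch = asyncio.run(
    process_batch(
        [Path("a.wav"), Path("bad.wav"), Path("c.wav")],
        fake_worker,
        on_progress=lambda done, total: progress.append((done, total)),
    )
)
```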
Caching Strategy
- Cache expensive operations
- Implement different TTLs
- Handle cache invalidation
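A minimal in-memory sketch of a cache with per-instance TTL and explicit invalidation follows. `TTLCache` is a hypothetical class; the document does not say which cache backend the project uses, so nothing here should be read as its actual design.

```python
import time
from typing import Any, Callable, Hashable


class TTLCache:
    """Tiny in-memory cache whose entries expire `ttl_s` seconds after being set."""

    def __init__(self, ttl_s: float, clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl_s = ttl_s
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._entries: dict[Hashable, tuple[float, Any]] = {}

    def set(self, key: Hashable, value: Any) -> None:
        self._entries[key] = (self._clock() + self._ttl_s, value)

    def get(self, key: Hashable, default: Any = None) -> Any:
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # drop the stale entry on read
            return default
        return value

    def invalidate(self, key: Hashable) -> None:
        """Remove a key immediately, for example after the source file changes."""
        self._entries.pop(key, None)
```

Different TTLs then amount to separate instances, for example a long-lived cache for transcripts and a short-lived one for API responses.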
6. Performance Tools
Tools: 2 | Skills: 3
Performance Profiling
- Profile transcription speed
- Optimize memory usage
- Benchmark improvements
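Profiling a single call can be done in-process with the standard library's `cProfile` and `pstats`. In this sketch `profile_call` and `busy_work` are hypothetical names, and `busy_work` stands in for a transcription function.

```python
import cProfile
import io
import pstats
from typing import Any, Callable


def profile_call(fn: Callable[..., Any], *args: Any, top: int = 5, **kwargs: Any) -> tuple[Any, str]:
    """Run `fn` under cProfile and return (result, text report of the top entries)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)
    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    stats.sort_stats("cumulative").print_stats(top)  # slowest call paths first
    return result, buffer.getvalue()


def busy_work(n: int) -> int:
    """Stand-in workload; a real run would profile the transcription call instead."""
    return sum(i * i for i in range(n))


value, report = profile_call(busy_work, 1000)
```

Sorting by cumulative time highlights which top-level step dominates, which is usually the first question when chasing a processing-speed target.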
M3 Hardware Optimization
- Metal Performance Shaders: Use the M3 GPU for Whisper inference where the backend supports it (faster-whisper itself runs on CPU or CUDA)
- Memory Optimization: Optimize memory usage for large files
- Performance Profiling: Profile and optimize performance
7. Deployment Tools
Tools: 2 | Skills: 2
Docker Containerization
- Create production images
- Handle dependencies
- Optimize image size
CI/CD Pipeline
- Automate testing
- Deploy to staging
- Monitor deployments
📊 Agent Statistics
- Total Tools Available: 17
- Required Skills: 30
- Categories: 7
- Development Phases: 4 (v1, v2, v3, v4)
🎯 Phase-Specific Tool Availability
Phase 1 (v1): Foundation
Focus: Basic Whisper transcription (95% accuracy, <30s for 5min audio)
Tools: Core Development, Database, Testing
Phase 2 (v2): Enhancement
Focus: AI enhancement (99% accuracy, <35s processing)
Tools: + ML Integration
Phase 3 (v3): Optimization
Focus: Multi-pass accuracy (99.5% accuracy, <25s processing)
Tools: + Performance
Phase 4 (v4): Advanced Features
Focus: Speaker diarization (90% speaker accuracy)
Tools: + Deployment
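The "+" notation above reads as cumulative: each phase keeps the earlier categories and adds its own. A sketch of that rule is below; `categories_for_phase` is a hypothetical helper, not the project's `get_tools_by_phase`. The Architecture category is left out because the phase list does not assign it to a phase.

```python
PHASE_ORDER = ["v1", "v2", "v3", "v4"]

# Categories introduced in each phase, as listed in this section.
NEW_CATEGORIES: dict[str, list[str]] = {
    "v1": ["Core Development", "Database", "Testing"],
    "v2": ["ML Integration"],
    "v3": ["Performance"],
    "v4": ["Deployment"],
}


def categories_for_phase(phase: str) -> list[str]:
    """Return every tool category available by `phase`, earliest first."""
    if phase not in NEW_CATEGORIES:
        raise ValueError(f"unknown phase: {phase!r}")
    available: list[str] = []
    for current in PHASE_ORDER:
        available.extend(NEW_CATEGORIES[current])
        if current == phase:
            break
    return available
```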
🚀 Success Metrics
The agent must achieve these targets:
| Metric | Target |
|---|---|
| Processing Speed | 5-minute audio in <30 seconds |
| Accuracy | 99.5% transcription accuracy with multi-pass |
| Batch Capacity | Process 100+ files efficiently |
| Memory Usage | <4GB peak memory usage |
| Cost | <$0.01 per transcript |
| Code Coverage | >80% with real file testing |
| CLI Response | <1 second CLI response time |
| File Size | Handle files up to 500MB |
| Data Loss | Zero data loss on errors |
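The processing-speed target can be restated as a real-time factor: 5 minutes is 300 seconds, and 300 / 30 = 10, so processing must run more than 10 times faster than playback. The sketch below encodes that arithmetic; the function names are hypothetical, and applying the same factor to other audio lengths is an assumption, since the table states the target only for 5-minute audio.

```python
def real_time_factor(audio_s: float, processing_s: float) -> float:
    """Seconds of audio handled per second of processing."""
    if processing_s <= 0:
        raise ValueError("processing_s must be positive")
    return audio_s / processing_s


def meets_speed_target(audio_s: float, processing_s: float, min_factor: float = 10.0) -> bool:
    """True if processing beat the target; the table says '<30 seconds', so the bound is strict."""
    return real_time_factor(audio_s, processing_s) > min_factor
```

For example, `meets_speed_target(300, 25)` passes with a factor of 12, while exactly 30 seconds does not, because the target is a strict inequality.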
💻 Development Workflow
1. Environment Setup
uv venv
source .venv/bin/activate
uv pip install -e ".[dev]"
2. Database Setup
alembic revision -m 'Initial schema'
alembic upgrade head
3. Core Development
from pathlib import Path
from typing import Protocol

class TranscriptionService(Protocol):
    async def transcribe(self, audio: Path) -> "Transcript": ...  # Transcript is the project's result model
4. ML Integration
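from faster_whisper import WhisperModel
# faster-whisper (CTranslate2) accepts 'cpu', 'cuda', or 'auto'; 'mps' is not a supported device
model = WhisperModel('distil-large-v3', device='cpu', compute_type='int8')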
5. Testing
uv run pytest tests/
uv run pytest --cov=src
6. Performance Optimization
# chunk_length is in seconds; faster-whisper's transcribe() takes no overlap argument
segments, info = model.transcribe(audio_path, chunk_length=30)
python -m cProfile src/main.py
🔧 Key Capabilities
Protocol-Based Architecture
- Design clean service interfaces
- Implement dependency injection
- Create swappable components
- Maintain version compatibility
Real File Testing
- Test with actual audio files
- No mocks in test suite
- Benchmark real performance
- Handle edge cases
Performance Optimization
- M3 hardware acceleration
- Memory usage optimization
- Chunking for large files
- Profiling and benchmarking
Batch Processing
- Handle 100+ files efficiently
- Independent failure handling
- Progress tracking
- Queue management
📁 File Structure
src/agents/
├── backend_developer_agent.py # Main agent definition
├── tools/
│ └── backend_developer_tools.py # Detailed tool definitions
└── demo_backend_developer.py # Demo script
🎮 Usage Examples
Running the Demo
cd src/agents
python demo_backend_developer.py
Checking Tool Availability
from agents.backend_developer_agent import check_tool_availability
# Check if agent can use a specific tool
can_use_whisper = check_tool_availability("Whisper Integration")
print(f"Can use Whisper: {can_use_whisper}")
Getting Tools by Category
from agents.tools.backend_developer_tools import get_tools_by_category
# Get all database tools
db_tools = get_tools_by_category("database")
for tool in db_tools:
    print(f"Database tool: {tool.name}")
Getting Phase-Specific Tools
from agents.tools.backend_developer_tools import get_tools_by_phase
# Get tools available in v1
v1_tools = get_tools_by_phase("v1")
for tool in v1_tools:
    print(f"v1 tool: {tool.name}")
🎯 Next Steps
- Run the demo script to see all capabilities
- Review the job posting for hiring
- Set up development environment for the agent
- Begin Phase 1 development with core tools
- Implement protocol-based architecture from day one
The Backend Developer Agent is ready to build the future of media processing with clean, scalable, and reliable architecture! 🚀