# Backend Developer Agent - Capabilities & Tools

## 🎯 Agent Overview

The **Backend Python Developer Agent** models the first backend developer hire for the Trax media processing platform. It has access to the tools and capabilities needed to build the protocol-based transcription pipeline from v1 through v4.

### Agent Profile
- **Name**: Backend Python Developer
- **Role**: Senior Backend Developer
- **Experience Level**: Senior
- **Salary Range**: $150,000 - $200,000
- **Current Focus**: Phase 1: Foundation (Weeks 1-2)

## 🛠️ Available Tools by Category

### 1. Core Development Tools
**Tools**: 3 | **Skills**: 8

#### Python 3.11+ Development
- **Async Programming**: Write async/await code for concurrent operations
- **Protocol Design**: Create protocol-based service interfaces
- **Type Hints**: Use comprehensive type hints throughout

#### uv Package Manager
- **Install Dependencies**: Install project dependencies
- **Compile Requirements**: Generate requirements.txt from pyproject.toml
- **Run Commands**: Execute Python commands with uv

#### Click CLI Framework
- **Create transcription commands**
- **Build batch processing interface**
- **Implement export functionality**

### 2. Database Tools
**Tools**: 2 | **Skills**: 4

#### PostgreSQL + SQLAlchemy
- **Model Definition**: Define SQLAlchemy models with JSONB
- **Database Migrations**: Create and apply Alembic migrations
- **JSONB Operations**: Perform JSONB queries and operations

#### Database Registry Pattern
- **Implement centralized model registry**
- **Handle multiple database connections**
- **Manage model relationships**

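The registry pattern listed above can be sketched roughly as follows. This is a minimal illustration using only the standard library, not the project's actual registry: `ModelRegistry`, `MediaFile`, and this stand-in `Transcript` class are hypothetical names, and real SQLAlchemy models would replace the empty classes.

```python
from typing import Dict, List, Type


class ModelRegistry:
    """Hypothetical central registry mapping model names to model classes."""

    def __init__(self) -> None:
        self._models: Dict[str, Type] = {}

    def register(self, cls: Type) -> Type:
        """Class decorator: record cls under its class name."""
        if cls.__name__ in self._models:
            raise ValueError(f"{cls.__name__} is already registered")
        self._models[cls.__name__] = cls
        return cls

    def get(self, name: str) -> Type:
        """Look up a registered model class by name."""
        return self._models[name]

    def names(self) -> List[str]:
        """Return all registered model names in sorted order."""
        return sorted(self._models)


registry = ModelRegistry()


@registry.register
class MediaFile:  # placeholder for a real model class
    pass


@registry.register
class Transcript:  # placeholder for a real model class
    pass
```

Keeping registration in one place gives a single lookup point for every model and makes duplicate definitions fail loudly at import time.
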
### 3. ML Integration Tools
**Tools**: 3 | **Skills**: 6

#### Whisper Integration
- **Model Loading**: Load Whisper models with faster-whisper
- **Audio Transcription**: Transcribe audio files with Whisper
- **Chunking Strategy**: Handle large audio files with chunking

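One way to picture the chunking strategy is as a list of overlapping time windows. The sketch below is an assumption about how the windows might be computed, not the project's implementation; `chunk_windows` is a hypothetical helper, and the 30-second chunk and 2-second overlap defaults are taken from the values used later in this document.

```python
from typing import List, Tuple


def chunk_windows(
    duration_s: float, chunk_s: float = 30.0, overlap_s: float = 2.0
) -> List[Tuple[float, float]]:
    """Return (start, end) windows in seconds that cover [0, duration_s]."""
    if chunk_s <= overlap_s:
        raise ValueError("chunk_s must be larger than overlap_s")
    windows = []
    start = 0.0
    step = chunk_s - overlap_s  # each window starts overlap_s before the previous one ends
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        windows.append((start, end))
        if end >= duration_s:
            break
        start += step
    return windows
```

The overlap gives neighbouring chunks some shared audio, so words cut at a boundary appear whole in at least one chunk.
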
#### Protocol-Based Services
- **Design service interfaces**
- **Implement version compatibility**
- **Create swappable components**

#### DeepSeek API Integration
- **Enhance transcript quality**
- **Implement structured outputs**
- **Handle API rate limits**

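Rate-limit handling is commonly done with exponential backoff. The following is a generic sketch under that assumption and does not use any DeepSeek client API: `RateLimitError` and `call_with_backoff` are hypothetical names, and a real client would raise its own exception type.

```python
import random
import time


class RateLimitError(Exception):
    """Hypothetical error a client raises when the API answers HTTP 429."""


def call_with_backoff(func, max_retries=5, base_delay=1.0, sleep=time.sleep):
    """Call func(); on RateLimitError wait base_delay * 2**attempt plus jitter, then retry."""
    for attempt in range(max_retries):
        try:
            return func()
        except RateLimitError:
            if attempt == max_retries - 1:
                raise  # out of retries: surface the error to the caller
            sleep(base_delay * 2 ** attempt + random.uniform(0, 0.1))
```

Passing `sleep` as a parameter keeps the helper testable without real waiting.
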
### 4. Testing Tools
**Tools**: 2 | **Skills**: 4

#### pytest with Real Files
- **Real File Testing**: Test with actual audio files instead of mocks
- **Test Fixtures**: Create reusable test fixtures with real files
- **Performance Testing**: Benchmark transcription performance

#### Coverage Reporting
- **Achieve >80% code coverage**
- **Identify untested code**
- **Track test quality**

### 5. Architecture Tools
**Tools**: 3 | **Skills**: 3

#### Iterative Pipeline Design
- **Version Management**: Manage different pipeline versions
- **Backward Compatibility**: Ensure new versions work with old data
- **Feature Flags**: Enable/disable features by version

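Version-gated feature flags could look something like the sketch below. The names (`PipelineVersion`, `FEATURE_INTRODUCED_IN`, `is_enabled`) and the feature keys are hypothetical; the mapping simply assumes each feature switches on in the phase that this document associates it with and stays on afterwards.

```python
from enum import IntEnum


class PipelineVersion(IntEnum):
    V1 = 1
    V2 = 2
    V3 = 3
    V4 = 4


# Assumed mapping: the version in which each feature first appears.
FEATURE_INTRODUCED_IN = {
    "whisper_transcription": PipelineVersion.V1,
    "ai_enhancement": PipelineVersion.V2,
    "multi_pass": PipelineVersion.V3,
    "speaker_diarization": PipelineVersion.V4,
}


def is_enabled(feature: str, version: PipelineVersion) -> bool:
    """A feature is on for the version that introduced it and every later one."""
    return version >= FEATURE_INTRODUCED_IN[feature]
```
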
#### Batch Processing System
- **Process multiple files**
- **Handle independent failures**
- **Track progress**

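The three batch requirements above can be combined in a small asyncio sketch. This is one possible shape, not the project's batch system: `process_batch` and its `worker` argument are hypothetical, and the concurrency limit of 4 is an arbitrary assumption.

```python
import asyncio


async def process_batch(paths, worker, max_concurrent=4):
    """Run worker(path) for each path; return (results, failures) dicts keyed by path."""
    semaphore = asyncio.Semaphore(max_concurrent)
    done = 0

    async def run_one(path):
        nonlocal done
        async with semaphore:
            try:
                return path, await worker(path), None
            except Exception as exc:  # one bad file must not stop the batch
                return path, None, exc
            finally:
                done += 1
                print(f"progress: {done}/{len(paths)}")

    outcomes = await asyncio.gather(*(run_one(p) for p in paths))
    results = {p: r for p, r, e in outcomes if e is None}
    failures = {p: e for p, r, e in outcomes if e is not None}
    return results, failures
```

Because each task catches its own exception, a failure is recorded per file while the remaining files continue to run.
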
#### Caching Strategy
- **Cache expensive operations**
- **Implement different TTLs**
- **Handle cache invalidation**

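A cache with per-entry TTLs and explicit invalidation might be sketched as below. `TTLCache` is a hypothetical in-memory class for illustration only; the document does not say which cache backend the project uses.

```python
import time


class TTLCache:
    """Tiny in-memory cache where every entry carries its own time-to-live."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # key -> (expires_at, value)

    def set(self, key, value, ttl_s):
        """Store value under key for ttl_s seconds."""
        self._entries[key] = (self._clock() + ttl_s, value)

    def get(self, key, default=None):
        """Return the cached value, or default if missing or expired."""
        entry = self._entries.get(key)
        if entry is None:
            return default
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._entries[key]  # lazy expiry on read
            return default
        return value

    def invalidate(self, key):
        """Drop key immediately, whether or not it has expired."""
        self._entries.pop(key, None)
```

Different TTLs per call let expensive results (for example, a finished transcript) live longer than cheap or volatile ones.
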
### 6. Performance Tools
**Tools**: 2 | **Skills**: 3

#### Performance Profiling
- **Profile transcription speed**
- **Optimize memory usage**
- **Benchmark improvements**

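For speed and memory profiling, the standard library's `cProfile` and `tracemalloc` are a reasonable starting point. The helper below is a sketch, with `profile_call` as a hypothetical name; note that `tracemalloc` only sees allocations made through Python's allocator, so memory held by native libraries (such as model weights) will not show up in the peak figure.

```python
import cProfile
import io
import pstats
import tracemalloc


def profile_call(func, *args, **kwargs):
    """Run func once; return (result, peak_bytes, stats_text)."""
    profiler = cProfile.Profile()
    tracemalloc.start()
    try:
        result = profiler.runcall(func, *args, **kwargs)
        _, peak_bytes = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    # Render the five most expensive entries by cumulative time.
    buffer = io.StringIO()
    pstats.Stats(profiler, stream=buffer).sort_stats("cumulative").print_stats(5)
    return result, peak_bytes, buffer.getvalue()
```
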
#### M3 Hardware Optimization
- **Metal Performance Shaders**: Use M3 GPU for Whisper inference
- **Memory Optimization**: Optimize memory usage for large files
- **Performance Profiling**: Profile and optimize performance

### 7. Deployment Tools
**Tools**: 2 | **Skills**: 2

#### Docker Containerization
- **Create production images**
- **Handle dependencies**
- **Optimize image size**

#### CI/CD Pipeline
- **Automate testing**
- **Deploy to staging**
- **Monitor deployments**

## 📊 Agent Statistics

- **Total Tools Available**: 17
- **Required Skills**: 30
- **Categories**: 7
- **Development Phases**: 4 (v1, v2, v3, v4)

## 🎯 Phase-Specific Tool Availability

### Phase 1 (v1): Foundation
**Focus**: Basic Whisper transcription (95% accuracy, <30s for 5min audio)
**Tools**: Core Development, Database, Testing

### Phase 2 (v2): Enhancement
**Focus**: AI enhancement (99% accuracy, <35s processing)
**Tools**: + ML Integration

### Phase 3 (v3): Optimization
**Focus**: Multi-pass accuracy (99.5% accuracy, <25s processing)
**Tools**: + Performance

### Phase 4 (v4): Advanced Features
**Focus**: Speaker diarization (90% speaker accuracy)
**Tools**: + Deployment

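The "+" notation above reads as cumulative: each phase keeps the earlier categories and adds its own. A small sketch of that interpretation follows; `PHASE_ADDITIONS` and `categories_for_phase` are hypothetical names, and the Architecture category is left out because the phase list does not assign it to a phase.

```python
from typing import List

# Categories each phase adds, in phase order (dicts preserve insertion order).
PHASE_ADDITIONS = {
    "v1": ["Core Development", "Database", "Testing"],
    "v2": ["ML Integration"],
    "v3": ["Performance"],
    "v4": ["Deployment"],
}


def categories_for_phase(phase: str) -> List[str]:
    """Return every tool category unlocked up to and including `phase`."""
    if phase not in PHASE_ADDITIONS:
        raise KeyError(f"unknown phase: {phase}")
    phases = list(PHASE_ADDITIONS)
    unlocked: List[str] = []
    for name in phases[: phases.index(phase) + 1]:
        unlocked.extend(PHASE_ADDITIONS[name])
    return unlocked
```
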
## 🚀 Success Metrics

The agent must achieve these targets:

| Metric | Target |
|--------|--------|
| Processing Speed | 5-minute audio in <30 seconds |
| Accuracy | 99.5% transcription accuracy with multi-pass |
| Batch Capacity | Process 100+ files efficiently |
| Memory Usage | <4GB peak memory usage |
| Cost | <$0.01 per transcript |
| Code Coverage | >80% with real file testing |
| CLI Response | <1 second CLI response time |
| File Size | Handle files up to 500MB |
| Data Loss | Zero data loss on errors |

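The processing-speed target can be restated as a real-time factor: 5 minutes is 300 seconds of audio, so finishing in under 30 seconds means running faster than 300 / 30 = 10x real time. The helper names below are hypothetical, and applying the same ratio to other audio lengths is an assumption, since the table only states the 5-minute case.

```python
def realtime_factor(audio_s: float, processing_s: float) -> float:
    """Seconds of audio handled per second of processing."""
    return audio_s / processing_s


def meets_speed_target(
    audio_s: float,
    processing_s: float,
    target_audio_s: float = 300.0,
    target_processing_s: float = 30.0,
) -> bool:
    """True if the run is strictly faster than the target ratio (10x by default)."""
    return realtime_factor(audio_s, processing_s) > target_audio_s / target_processing_s
```
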
## 💻 Development Workflow

### 1. Environment Setup
```bash
uv venv
source .venv/bin/activate
# Quote the extras so zsh does not treat the brackets as a glob pattern
uv pip install -e ".[dev]"
```

### 2. Database Setup
```bash
alembic revision -m 'Initial schema'
alembic upgrade head
```

### 3. Core Development
```python
from pathlib import Path
from typing import Protocol

class TranscriptionService(Protocol):
    async def transcribe(self, audio: Path) -> "Transcript": ...
```

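To show how such a protocol makes backends swappable, here is a self-contained sketch. The `Transcript` dataclass, `EchoTranscriber`, and `run` are hypothetical stand-ins introduced only for illustration; the project's real `Transcript` type and Whisper-backed service are not shown in this document.

```python
import asyncio
from dataclasses import dataclass
from pathlib import Path
from typing import Protocol, runtime_checkable


@dataclass
class Transcript:
    """Hypothetical result type standing in for the real Transcript model."""
    text: str


@runtime_checkable
class TranscriptionService(Protocol):
    async def transcribe(self, audio: Path) -> Transcript: ...


class EchoTranscriber:
    """Stand-in backend that satisfies the protocol without inheriting from it."""

    async def transcribe(self, audio: Path) -> Transcript:
        return Transcript(text=f"transcript of {audio.name}")


async def run(service: TranscriptionService, audio: Path) -> Transcript:
    # Callers depend only on the protocol, so backends can be swapped freely.
    return await service.transcribe(audio)
```

Any class with a matching `transcribe` method type-checks as a `TranscriptionService`, which is what lets a v1 backend be replaced by a later one without touching callers.
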
### 4. ML Integration
```python
from faster_whisper import WhisperModel

# faster-whisper (CTranslate2) accepts device 'cpu', 'cuda', or 'auto'; 'mps' is not supported
model = WhisperModel('distil-large-v3', device='cpu', compute_type='int8')
```

### 5. Testing
```bash
uv run pytest tests/
uv run pytest --cov=src
```

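### 6. Performance Optimization
```python
# faster-whisper's transcribe() takes chunk_length but has no `overlap` parameter
segments, info = model.transcribe(str(audio_path), chunk_length=30)

# Profile from the shell (not from Python): python -m cProfile src/main.py
```
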
## 🔧 Key Capabilities

### Protocol-Based Architecture
- Design clean service interfaces
- Implement dependency injection
- Create swappable components
- Maintain version compatibility

### Real File Testing
- Test with actual audio files
- No mocks in test suite
- Benchmark real performance
- Handle edge cases

### Performance Optimization
- M3 hardware acceleration
- Memory usage optimization
- Chunking for large files
- Profiling and benchmarking

### Batch Processing
- Handle 100+ files efficiently
- Independent failure handling
- Progress tracking
- Queue management

## 📁 File Structure

```
src/agents/
├── backend_developer_agent.py      # Main agent definition
├── tools/
│   └── backend_developer_tools.py  # Detailed tool definitions
└── demo_backend_developer.py       # Demo script
```

## 🎮 Usage Examples

### Running the Demo
```bash
cd src/agents
python demo_backend_developer.py
```

### Checking Tool Availability
```python
from agents.backend_developer_agent import check_tool_availability

# Check if agent can use a specific tool
can_use_whisper = check_tool_availability("Whisper Integration")
print(f"Can use Whisper: {can_use_whisper}")
```

### Getting Tools by Category
```python
from agents.tools.backend_developer_tools import get_tools_by_category

# Get all database tools
db_tools = get_tools_by_category("database")
for tool in db_tools:
    print(f"Database tool: {tool.name}")
```

### Getting Phase-Specific Tools
```python
from agents.tools.backend_developer_tools import get_tools_by_phase

# Get tools available in v1
v1_tools = get_tools_by_phase("v1")
for tool in v1_tools:
    print(f"v1 tool: {tool.name}")
```

## 🎯 Next Steps

1. **Run the demo script** to see all capabilities
2. **Review the job posting** for hiring
3. **Set up development environment** for the agent
4. **Begin Phase 1 development** with core tools
5. **Implement protocol-based architecture** from day one

---

**The Backend Developer Agent is ready to build the future of media processing with clean, scalable, and reliable architecture!** 🚀