7.6 KiB

Raw Blame History

Backend Developer Agent - Capabilities & Tools

🎯 Agent Overview

The Backend Python Developer Agent is a comprehensive representation of the first backend developer hire for the Trax media processing platform. This agent has access to specific tools and capabilities needed to build the protocol-based transcription pipeline from v1 to v4.

Agent Profile

Name: Backend Python Developer
Role: Senior Backend Developer
Experience Level: Senior
Salary Range: $150,000 - $200,000
Current Focus: Phase 1: Foundation (Weeks 1-2)

🛠️ Available Tools by Category

1. Core Development Tools

Tools: 3 | Skills: 8

Python 3.11+ Development

Async Programming: Write async/await code for concurrent operations
Protocol Design: Create protocol-based service interfaces
Type Hints: Use comprehensive type hints throughout

uv Package Manager

Install Dependencies: Install project dependencies
Compile Requirements: Generate requirements.txt from pyproject.toml
Run Commands: Execute Python commands with uv

Click CLI Framework

Create transcription commands
Build batch processing interface
Implement export functionality

2. Database Tools

Tools: 2 | Skills: 4

PostgreSQL + SQLAlchemy

Model Definition: Define SQLAlchemy models with JSONB
Database Migrations: Create and apply Alembic migrations
JSONB Operations: Perform JSONB queries and operations

Database Registry Pattern

Implement centralized model registry
Handle multiple database connections
Manage model relationships

3. ML Integration Tools

Tools: 3 | Skills: 6

Whisper Integration

Model Loading: Load Whisper models with faster-whisper
Audio Transcription: Transcribe audio files with Whisper
Chunking Strategy: Handle large audio files with chunking

Protocol-Based Services

Design service interfaces
Implement version compatibility
Create swappable components

DeepSeek API Integration

Enhance transcript quality
Implement structured outputs
Handle API rate limits

4. Testing Tools

Tools: 2 | Skills: 4

pytest with Real Files

Real File Testing: Test with actual audio files instead of mocks
Test Fixtures: Create reusable test fixtures with real files
Performance Testing: Benchmark transcription performance

Coverage Reporting

Achieve >80% code coverage
Identify untested code
Track test quality

5. Architecture Tools

Tools: 3 | Skills: 3

Iterative Pipeline Design

Version Management: Manage different pipeline versions
Backward Compatibility: Ensure new versions work with old data
Feature Flags: Enable/disable features by version

Batch Processing System

Process multiple files
Handle independent failures
Track progress

Caching Strategy

Cache expensive operations
Implement different TTLs
Handle cache invalidation

6. Performance Tools

Tools: 2 | Skills: 3

Performance Profiling

Profile transcription speed
Optimize memory usage
Benchmark improvements

M3 Hardware Optimization

Metal Performance Shaders: Use M3 GPU for Whisper inference
Memory Optimization: Optimize memory usage for large files
Performance Profiling: Profile and optimize performance

7. Deployment Tools

Tools: 2 | Skills: 2

Docker Containerization

Create production images
Handle dependencies
Optimize image size

CI/CD Pipeline

Automate testing
Deploy to staging
Monitor deployments

📊 Agent Statistics

Total Tools Available: 17
Required Skills: 30
Categories: 7
Development Phases: 4 (v1, v2, v3, v4)

🎯 Phase-Specific Tool Availability

Phase 1 (v1): Foundation

Focus: Basic Whisper transcription (95% accuracy, <30s for 5min audio) Tools: Core Development, Database, Testing

Phase 2 (v2): Enhancement

Focus: AI enhancement (99% accuracy, <35s processing) Tools: + ML Integration

Phase 3 (v3): Optimization

Focus: Multi-pass accuracy (99.5% accuracy, <25s processing) Tools: + Performance

Phase 4 (v4): Advanced Features

Focus: Speaker diarization (90% speaker accuracy) Tools: + Deployment

🚀 Success Metrics

The agent must achieve these targets:

Metric	Target
Processing Speed	5-minute audio in <30 seconds
Accuracy	99.5% transcription accuracy with multi-pass
Batch Capacity	Process 100+ files efficiently
Memory Usage	<4GB peak memory usage
Cost	<$0.01 per transcript
Code Coverage	>80% with real file testing
CLI Response	<1 second CLI response time
File Size	Handle files up to 500MB
Data Loss	Zero data loss on errors

💻 Development Workflow

1. Environment Setup

uv venv
source .venv/bin/activate
uv pip install -e .[dev]

2. Database Setup

alembic revision -m 'Initial schema'
alembic upgrade head

3. Core Development

class TranscriptionService(Protocol):
    async def transcribe(self, audio: Path) -> Transcript: ...

4. ML Integration

from faster_whisper import WhisperModel
model = WhisperModel('distil-large-v3', device='mps')

5. Testing

uv run pytest tests/
uv run pytest --cov=src

6. Performance Optimization

model.transcribe(audio_path, chunk_length=30, overlap=2)
python -m cProfile src/main.py

🔧 Key Capabilities

Protocol-Based Architecture

Design clean service interfaces
Implement dependency injection
Create swappable components
Maintain version compatibility

Real File Testing

Test with actual audio files
No mocks in test suite
Benchmark real performance
Handle edge cases

Performance Optimization

M3 hardware acceleration
Memory usage optimization
Chunking for large files
Profiling and benchmarking

Batch Processing

Handle 100+ files efficiently
Independent failure handling
Progress tracking
Queue management

📁 File Structure

src/agents/
├── backend_developer_agent.py          # Main agent definition
├── tools/
│   └── backend_developer_tools.py      # Detailed tool definitions
└── demo_backend_developer.py           # Demo script

🎮 Usage Examples

Running the Demo

cd src/agents
python demo_backend_developer.py

Checking Tool Availability

from agents.backend_developer_agent import check_tool_availability

# Check if agent can use a specific tool
can_use_whisper = check_tool_availability("Whisper Integration")
print(f"Can use Whisper: {can_use_whisper}")

Getting Tools by Category

from agents.tools.backend_developer_tools import get_tools_by_category

# Get all database tools
db_tools = get_tools_by_category("database")
for tool in db_tools:
    print(f"Database tool: {tool.name}")

Getting Phase-Specific Tools

from agents.tools.backend_developer_tools import get_tools_by_phase

# Get tools available in v1
v1_tools = get_tools_by_phase("v1")
for tool in v1_tools:
    print(f"v1 tool: {tool.name}")

🎯 Next Steps

Run the demo script to see all capabilities
Review the job posting for hiring
Set up development environment for the agent
Begin Phase 1 development with core tools
Implement protocol-based architecture from day one

The Backend Developer Agent is ready to build the future of media processing with clean, scalable, and reliable architecture! 🚀

7.6 KiB Raw Blame History