# AI Assistant Library Integration Guide

## Overview
The Trax project builds on the AI Assistant Class Library, a production-tested library that provides common functionality for AI-powered applications. This guide explains how Trax uses the library and how to extend it for your own needs.
## Library Components Used by Trax

### 1. Core Base Classes

#### BaseService

All Trax services extend BaseService for consistent service lifecycle management:

```python
from ai_assistant_lib import BaseService

class TraxService(BaseService):
    async def _initialize_impl(self):
        # Service-specific initialization
        pass
```
Benefits:
- Standardized initialization/shutdown
- Health checking
- Status tracking
- Error counting
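
To make the lifecycle concrete, here is a minimal sketch. The public entry points `initialize()` and `shutdown()` are assumptions implied by "standardized initialization/shutdown"; only `_initialize_impl` and `get_health_status` appear elsewhere in this guide.

```python
import asyncio

async def main():
    service = TraxService("TraxService")  # name argument as in the BaseService examples below
    await service.initialize()            # assumed public hook that runs _initialize_impl()
    print(service.get_health_status())    # status, uptime, and error-count snapshot
    await service.shutdown()              # assumed teardown counterpart

asyncio.run(main())
```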
#### BaseRepository

Database operations use BaseRepository for CRUD operations:

```python
from ai_assistant_lib import BaseRepository, TimestampedRepository

class MediaFileRepository(TimestampedRepository):
    # Inherits create, find_by_id, find_all, update, delete,
    # plus automatic timestamp management
    ...
```
Benefits:
- Type-safe CRUD operations
- Automatic timestamp handling
- Built-in pagination
- Error handling
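
A usage sketch of the inherited CRUD methods. The method names come from the comment above, but the constructor argument, call signatures, and the `media.id` attribute are assumptions made for illustration.

```python
async def crud_example(session):
    # Method names come from the inherited CRUD set listed above;
    # signatures and the `id` attribute are assumed for illustration.
    repo = MediaFileRepository(session)
    media = await repo.create({"path": "audio/sample.wav"})       # created_at set automatically
    media = await repo.find_by_id(media.id)
    await repo.update(media.id, {"path": "audio/sample_v2.wav"})  # updated_at refreshed
    await repo.delete(media.id)
```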
### 2. Retry and Resilience Patterns

#### RetryHandler

Automatic retry with exponential backoff:

```python
from ai_assistant_lib import async_retry, RetryConfig

@async_retry(max_attempts=3, backoff_factor=2.0)
async def transcribe_with_retry(audio_path):
    return await transcribe(audio_path)
```

#### CircuitBreaker

Prevent cascading failures:

```python
from ai_assistant_lib import CircuitBreaker

breaker = CircuitBreaker(
    failure_threshold=5,
    recovery_timeout=60
)

async with breaker:
    result = await risky_operation()
```
### 3. Caching Infrastructure

#### Multi-Layer Caching

```python
from ai_assistant_lib import MemoryCache, CacheManager, cached

# Memory cache for hot data
memory_cache = MemoryCache(default_ttl=3600)

# Decorator for automatic caching
@cached(ttl=7200)
async def expensive_operation(param):
    return await compute_result(param)
```
Cache Layers:

- **Memory Cache**: fast, limited size
- **Database Cache**: persistent, searchable
- **Filesystem Cache**: large files
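
`CacheManager` is imported above but never demonstrated. The sketch below assumes it composes layers that are consulted fastest-first through a `get`/`set` interface; the constructor and method names are assumptions, not the documented API.

```python
from ai_assistant_lib import MemoryCache, CacheManager

# Assumed composition API: layers consulted fastest-first.
manager = CacheManager(layers=[
    MemoryCache(default_ttl=3600),  # hot data
    # database- or filesystem-backed layers could follow here
])

async def get_result_cached(key):
    value = await manager.get(key)        # assumed get/set interface
    if value is None:
        value = await compute_result(key)
        await manager.set(key, value, ttl=7200)
    return value
```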
### 4. AI Service Integration

#### BaseAIService

Standardized AI service integration:

```python
from ai_assistant_lib import BaseAIService, AIModelConfig

class EnhancementService(BaseAIService):
    def __init__(self):
        config = AIModelConfig(
            model_name="deepseek-chat",
            temperature=0.0,
            max_tokens=4096
        )
        super().__init__("EnhancementService", config)
```
Features:
- Unified API interface
- Automatic retry logic
- Cost tracking
- Model versioning
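
The unified call interface is not spelled out in this guide; the sketch below uses hypothetical names (`complete`, `total_cost_usd`) purely to illustrate the features listed above.

```python
async def enhance(text: str) -> str:
    service = EnhancementService()
    await service.initialize()  # assumed lifecycle entry point, as with BaseService
    # `complete` and `total_cost_usd` are hypothetical stand-ins for the
    # unified API and cost-tracking features; check the library for real names.
    response = await service.complete(f"Improve this transcript:\n{text}")
    print(f"Spent so far: {service.total_cost_usd}")
    return response
```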
## Trax-Specific Extensions

### 1. Protocol-Based Services

Trax extends the library with protocol definitions for maximum flexibility:

```python
from pathlib import Path
from typing import Any, Dict, Protocol

class TranscriptionProtocol(Protocol):
    async def transcribe(self, audio_path: Path) -> Dict[str, Any]:
        ...

    def can_handle(self, audio_path: Path) -> bool:
        ...
```
### 2. Pipeline Versioning

Trax adds pipeline version tracking to services:

```python
class TraxService(BaseService):
    def __init__(self, name, config=None):
        super().__init__(name, config)
        config = config or {}  # guard against config=None before calling .get()
        self.pipeline_version = config.get("pipeline_version", "v1")
```
### 3. JSONB Support

PostgreSQL JSONB columns for flexible data:

```python
from sqlalchemy import Column
from sqlalchemy.dialects.postgresql import JSONB

class Transcript(TimestampedModel):
    raw_content = Column(JSONB, nullable=False)
    enhanced_content = Column(JSONB)
```
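
JSONB columns can be filtered with SQLAlchemy's native operators. In the sketch below, the `"language"` key is a hypothetical field of `raw_content`, used only for illustration.

```python
# Filter on a key inside the JSONB payload.
english_transcripts = (
    session.query(Transcript)
    .filter(Transcript.raw_content["language"].astext == "en")  # hypothetical key
    .all()
)
```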
## Usage Examples

### Example 1: Creating a New Service

```python
import logging
from pathlib import Path

from ai_assistant_lib import ServiceStatus, ServiceUnavailableError
from src.base.services import TraxService

logger = logging.getLogger(__name__)

class WhisperService(TraxService):
    """Whisper transcription service."""

    async def _initialize_impl(self):
        """Load the Whisper model."""
        self.model = await load_whisper_model()
        logger.info("Loaded Whisper model")

    async def transcribe(self, audio_path: Path):
        """Transcribe an audio file."""
        if self.status != ServiceStatus.HEALTHY:
            raise ServiceUnavailableError("Service not ready")
        return await self.model.transcribe(audio_path)
```
### Example 2: Repository with Caching

```python
from ai_assistant_lib import TimestampedRepository, cached

class TranscriptRepository(TimestampedRepository):
    @cached(ttl=3600)
    async def find_by_media_file(self, media_file_id):
        """Find a transcript, with caching."""
        return self.session.query(Transcript).filter(
            Transcript.media_file_id == media_file_id
        ).first()
```
### Example 3: Batch Processing with Circuit Breaker

```python
import logging

# CircuitBreakerOpen is assumed to be exported alongside CircuitBreaker.
from ai_assistant_lib import AsyncProcessor, CircuitBreaker, CircuitBreakerOpen

logger = logging.getLogger(__name__)

class BatchProcessor(AsyncProcessor):
    def __init__(self):
        super().__init__("BatchProcessor")
        self.breaker = CircuitBreaker(failure_threshold=5)

    async def process_batch(self, files):
        results = []
        for file in files:
            try:
                async with self.breaker:
                    result = await self.process_file(file)
                    results.append(result)
            except CircuitBreakerOpen:
                logger.error("Circuit breaker open, stopping batch")
                break
        return results
```
## Configuration

### Library Configuration

Configure the library globally:

```python
from ai_assistant_lib import LibraryConfig

LibraryConfig.configure(
    log_level="INFO",
    default_timeout_seconds=30,
    default_retry_attempts=3,
    enable_metrics=True,
    enable_tracing=False
)
```

### Service Configuration

Each service can have custom configuration:

```python
config = {
    "pipeline_version": "v2",
    "max_retries": 5,
    "timeout": 60,
    "cache_ttl": 7200
}

service = TranscriptionService(config=config)
```
## Testing with the Library

### Test Utilities

The library provides test utilities:

```python
from pathlib import Path

from ai_assistant_lib.testing import AsyncTestCase, mock_service

class TestTranscription(AsyncTestCase):
    async def setUp(self):
        self.mock_ai = mock_service(BaseAIService)
        self.service = TranscriptionService()
        self.test_file = Path("tests/fixtures/audio/sample.wav")

    async def test_transcribe(self):
        result = await self.service.transcribe(self.test_file)
        self.assertIsNotNone(result)
```
### Mock Implementations

Create mock services for testing:

```python
class MockTranscriptionService(TranscriptionProtocol):
    async def transcribe(self, audio_path):
        return {"text": "Mock transcript", "duration": 10.0}

    def can_handle(self, audio_path):
        return True
```
## Performance Optimization

### 1. Connection Pooling

The library provides connection pooling:

```python
from ai_assistant_lib import ConnectionPool

pool = ConnectionPool(
    max_connections=100,
    min_connections=10,
    timeout=30
)
```
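
Checking a connection out as an async context manager is a common shape for pools; the `acquire()` call below is an assumption about this library's interface, shown only to illustrate the pattern.

```python
async def fetch_one(query):
    # `acquire()` as an async context manager is assumed, not documented here.
    async with pool.acquire() as conn:
        return await conn.execute(query)
```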
### 2. Batch Operations

Optimize database operations:

```python
from ai_assistant_lib import bulk_insert, bulk_update

# Insert many records efficiently
await bulk_insert(session, records)

# Update many records in one query
await bulk_update(session, updates)
```

### 3. Async Patterns

Use async throughout:

```python
import asyncio

# Process multiple files concurrently
results = await asyncio.gather(*[
    process_file(f) for f in files
])
```
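
For large batches an unbounded `gather` can overwhelm the backend. A standard asyncio pattern (no library API involved) is to cap concurrency with a semaphore:

```python
import asyncio

async def process_all(files, limit: int = 8):
    # Bound concurrency so large batches don't flood the transcription backend.
    sem = asyncio.Semaphore(limit)

    async def worker(f):
        async with sem:
            return await process_file(f)

    return await asyncio.gather(*(worker(f) for f in files))
```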
## Error Handling

### Exception Hierarchy

The library provides a comprehensive exception hierarchy:

```python
from ai_assistant_lib import (
    AIAssistantError,        # Base exception
    RetryableError,          # Can be retried
    NonRetryableError,       # Should not be retried
    ServiceUnavailableError,
    RateLimitError,
    ValidationError,
)
```

### Error Recovery

Built-in error recovery patterns:

```python
try:
    result = await service.process()
except RetryableError as e:
    # Retried automatically when the call is wrapped in @async_retry
    logger.warning(f"Retryable error: {e}")
except NonRetryableError as e:
    # Fatal error, don't retry
    logger.error(f"Fatal error: {e}")
    raise
```
## Monitoring and Metrics

### Health Checks

All services provide health status:

```python
health = service.get_health_status()
# {
#     "status": "healthy",
#     "is_healthy": True,
#     "uptime_seconds": 3600,
#     "error_count": 0
# }
```

### Performance Metrics

Track performance automatically:

```python
from ai_assistant_lib import MetricsCollector

metrics = MetricsCollector()
metrics.track("transcription_time", elapsed)
metrics.track("cache_hit_rate", hit_rate)

report = metrics.get_report()
```
## Migration from YouTube Summarizer

### Pattern Mapping

| YouTube Summarizer Pattern | AI Assistant Library Equivalent |
|---|---|
| Custom retry logic | `@async_retry` decorator |
| Manual cache management | `CacheManager` class |
| Database operations | `BaseRepository` |
| Service initialization | `BaseService` |
| Error handling | Exception hierarchy |
### Code Migration Example

Before (YouTube Summarizer):

```python
class TranscriptService:
    def __init__(self):
        self.cache = {}

    async def get_transcript(self, video_id):
        if video_id in self.cache:
            return self.cache[video_id]

        # Retry logic
        for attempt in range(3):
            try:
                result = await self.fetch_transcript(video_id)
                self.cache[video_id] = result
                return result
            except Exception:
                if attempt == 2:
                    raise
                await asyncio.sleep(2 ** attempt)
```

After (with the library):

```python
from ai_assistant_lib import BaseService, cached, async_retry

class TranscriptService(BaseService):
    @cached(ttl=3600)
    @async_retry(max_attempts=3)
    async def get_transcript(self, video_id):
        return await self.fetch_transcript(video_id)
```
## Best Practices

### 1. Always Use Protocols

Define protocols for all services to enable easy swapping:

```python
class ProcessorProtocol(Protocol):
    async def process(self, data: Any) -> Any: ...
```

### 2. Leverage Type Hints

Use type hints for better IDE support:

```python
async def process_batch(
    self,
    files: List[Path],
    processor: ProcessorProtocol
) -> Dict[str, Any]:
    ...
```

### 3. Configuration Over Code

Use configuration files instead of hardcoding values:

```python
config = load_config("config.yaml")
service = MyService(config=config)
```
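
`load_config` is not defined anywhere in this guide; a minimal sketch using PyYAML might look like the following (the helper name and file keys are illustrative, not the project's actual loader).

```python
import yaml  # PyYAML

def load_config(path: str) -> dict:
    """Illustrative loader; substitute whatever Trax actually uses."""
    with open(path, "r", encoding="utf-8") as f:
        return yaml.safe_load(f) or {}

# config.yaml could carry the service configuration shown earlier, e.g.:
#   pipeline_version: v2
#   max_retries: 5
```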
### 4. Test with Real Data

Use the library's support for real-file testing:

```python
test_file = Path("tests/fixtures/audio/sample.wav")
result = await service.transcribe(test_file)
```
## Troubleshooting

### Common Issues

1. **Import errors**
   - Ensure the symlink is created: `ln -s ../../lib lib`
   - Check that the Python path includes the library
2. **Type errors**
   - The library requires Python 3.11+
   - Use proper type hints
3. **Async errors**
   - Always use `async`/`await`
   - Don't mix sync and async code
### Debug Mode

Enable debug logging:

```python
import logging

logging.getLogger("ai_assistant_lib").setLevel(logging.DEBUG)
```
## Summary

The AI Assistant Library provides Trax with:

- ✅ **Production-tested components**: used across multiple projects
- ✅ **Consistent patterns**: the same patterns everywhere
- ✅ **Built-in resilience**: retry, circuit breaker, and caching
- ✅ **Type safety**: full typing support
- ✅ **Performance optimization**: connection pooling and batch operations
- ✅ **Comprehensive testing**: test utilities and fixtures
By leveraging this library, Trax can focus on its unique media processing capabilities while relying on proven infrastructure components.
For more information about the library, see: