# AI Assistant Library Integration Guide

## Overview

The Trax project leverages the **AI Assistant Class Library** - a comprehensive, production-tested library that provides common functionality for AI-powered applications. This guide explains how Trax uses the library and how to extend it for your needs.

## Library Components Used by Trax

### 1. Core Base Classes

#### BaseService

All Trax services extend `BaseService` for consistent service lifecycle management:

```python
from ai_assistant_lib import BaseService

class TraxService(BaseService):
    async def _initialize_impl(self):
        # Service-specific initialization
        pass
```

**Benefits:**

- Standardized initialization/shutdown
- Health checking
- Status tracking
- Error counting

#### BaseRepository

Database operations use `BaseRepository` for CRUD operations:

```python
from ai_assistant_lib import BaseRepository, TimestampedRepository

class MediaFileRepository(TimestampedRepository):
    # Inherits create, find_by_id, find_all, update, and delete,
    # plus automatic timestamp management.
    pass
```

**Benefits:**

- Type-safe CRUD operations
- Automatic timestamp handling
- Built-in pagination
- Error handling

### 2. Retry and Resilience Patterns

#### RetryHandler

Automatic retry with exponential backoff:

```python
from ai_assistant_lib import async_retry, RetryConfig

@async_retry(max_attempts=3, backoff_factor=2.0)
async def transcribe_with_retry(audio_path):
    return await transcribe(audio_path)
```

#### CircuitBreaker

Prevent cascading failures:

```python
from ai_assistant_lib import CircuitBreaker

breaker = CircuitBreaker(
    failure_threshold=5,
    recovery_timeout=60
)

async with breaker:
    result = await risky_operation()
```

### 3. Caching Infrastructure

#### Multi-Layer Caching

```python
from ai_assistant_lib import MemoryCache, CacheManager, cached

# Memory cache for hot data
memory_cache = MemoryCache(default_ttl=3600)

# Decorator for automatic caching
@cached(ttl=7200)
async def expensive_operation(param):
    return await compute_result(param)
```

**Cache Layers:**

1. **Memory Cache** - Fast, limited size
2. **Database Cache** - Persistent, searchable
3. **Filesystem Cache** - Large files

### 4. AI Service Integration

#### BaseAIService

Standardized AI service integration:

```python
from ai_assistant_lib import BaseAIService, AIModelConfig

class EnhancementService(BaseAIService):
    def __init__(self):
        config = AIModelConfig(
            model_name="deepseek-chat",
            temperature=0.0,
            max_tokens=4096
        )
        super().__init__("EnhancementService", config)
```

**Features:**

- Unified API interface
- Automatic retry logic
- Cost tracking
- Model versioning

## Trax-Specific Extensions

### 1. Protocol-Based Services

Trax extends the library with protocol definitions for maximum flexibility:

```python
from pathlib import Path
from typing import Any, Dict, Protocol

class TranscriptionProtocol(Protocol):
    async def transcribe(self, audio_path: Path) -> Dict[str, Any]: ...
    def can_handle(self, audio_path: Path) -> bool: ...
```

### 2. Pipeline Versioning

Trax adds pipeline version tracking to services:

```python
class TraxService(BaseService):
    def __init__(self, name, config=None):
        super().__init__(name, config)
        # Guard against config=None before reading the version
        self.pipeline_version = (config or {}).get("pipeline_version", "v1")
```
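Building on the snippet above, here is a minimal sketch of how a service might dispatch on the tracked version; the `_transcribe_v1`/`_transcribe_v2` helpers are hypothetical stand-ins for real pipeline stages, not part of the library:

```python
from pathlib import Path

class VersionedTranscriptionService(TraxService):
    """Sketch only: routes each call to a per-version implementation."""

    async def transcribe(self, audio_path: Path):
        # Dispatch on the version captured in __init__; the helpers
        # below are hypothetical placeholders.
        if self.pipeline_version == "v2":
            return await self._transcribe_v2(audio_path)
        return await self._transcribe_v1(audio_path)

    async def _transcribe_v1(self, audio_path: Path):
        ...  # existing v1 pipeline stage

    async def _transcribe_v2(self, audio_path: Path):
        ...  # new v2 pipeline stage
```

Keeping the dispatch in one place makes per-version results reproducible while a new pipeline rolls out.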
### 3. JSONB Support

PostgreSQL JSONB columns for flexible data:

```python
from sqlalchemy import Column
from sqlalchemy.dialects.postgresql import JSONB

class Transcript(TimestampedModel):
    raw_content = Column(JSONB, nullable=False)
    enhanced_content = Column(JSONB)
```

## Usage Examples

### Example 1: Creating a New Service

```python
from pathlib import Path

from ai_assistant_lib import ServiceStatus, ServiceUnavailableError
from src.base.services import TraxService

class WhisperService(TraxService):
    """Whisper transcription service."""

    async def _initialize_impl(self):
        """Load Whisper model."""
        self.model = await load_whisper_model()
        logger.info("Loaded Whisper model")

    async def transcribe(self, audio_path: Path):
        """Transcribe audio file."""
        if self.status != ServiceStatus.HEALTHY:
            raise ServiceUnavailableError("Service not ready")
        return await self.model.transcribe(audio_path)
```

### Example 2: Repository with Caching

```python
from ai_assistant_lib import TimestampedRepository, cached

class TranscriptRepository(TimestampedRepository):
    @cached(ttl=3600)
    async def find_by_media_file(self, media_file_id):
        """Find transcript with caching."""
        return self.session.query(Transcript).filter(
            Transcript.media_file_id == media_file_id
        ).first()
```

### Example 3: Batch Processing with Circuit Breaker

```python
from ai_assistant_lib import AsyncProcessor, CircuitBreaker, CircuitBreakerOpen

class BatchProcessor(AsyncProcessor):
    def __init__(self):
        super().__init__("BatchProcessor")
        self.breaker = CircuitBreaker(failure_threshold=5)

    async def process_batch(self, files):
        results = []
        for file in files:
            try:
                async with self.breaker:
                    result = await self.process_file(file)
                    results.append(result)
            except CircuitBreakerOpen:
                logger.error("Circuit breaker open, stopping batch")
                break
        return results
```

## Configuration

### Library Configuration

Configure the library globally:

```python
from ai_assistant_lib import LibraryConfig

LibraryConfig.configure(
    log_level="INFO",
    default_timeout_seconds=30,
    default_retry_attempts=3,
    enable_metrics=True,
    enable_tracing=False
)
```

### Service Configuration

Each service can have custom configuration:

```python
config = {
    "pipeline_version": "v2",
    "max_retries": 5,
    "timeout": 60,
    "cache_ttl": 7200
}

service = TranscriptionService(config=config)
```

## Testing with the Library

### Test Utilities

The library provides test utilities:

```python
from ai_assistant_lib.testing import AsyncTestCase, mock_service

class TestTranscription(AsyncTestCase):
    async def setUp(self):
        self.mock_ai = mock_service(BaseAIService)
        self.service = TranscriptionService()

    async def test_transcribe(self):
        result = await self.service.transcribe(test_file)
        self.assertIsNotNone(result)
```

### Mock Implementations

Create mock services for testing:

```python
class MockTranscriptionService(TranscriptionProtocol):
    async def transcribe(self, audio_path):
        return {"text": "Mock transcript", "duration": 10.0}

    def can_handle(self, audio_path):
        return True
```

## Performance Optimization

### 1. Connection Pooling

The library provides connection pooling:

```python
from ai_assistant_lib import ConnectionPool

pool = ConnectionPool(
    max_connections=100,
    min_connections=10,
    timeout=30
)
```

### 2. Batch Operations

Optimize database operations:

```python
from ai_assistant_lib import bulk_insert, bulk_update

# Insert many records efficiently
await bulk_insert(session, records)

# Update many records in one query
await bulk_update(session, updates)
```
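For very large record sets, a chunked wrapper can bound memory use and statement size. A minimal sketch, assuming `bulk_insert(session, records)` behaves as in the snippet above; the `bulk_insert_chunked` helper and the default chunk size are illustrative, not part of the library:

```python
from ai_assistant_lib import bulk_insert

async def bulk_insert_chunked(session, records, chunk_size=500):
    """Hypothetical helper: insert records in fixed-size chunks.

    The chunk size should be tuned to row size and database limits.
    """
    for start in range(0, len(records), chunk_size):
        await bulk_insert(session, records[start:start + chunk_size])
```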
### 3. Async Patterns

Use async throughout:

```python
import asyncio

# Process multiple files concurrently
results = await asyncio.gather(*[
    process_file(f) for f in files
])
```

## Error Handling

### Exception Hierarchy

The library provides a comprehensive exception hierarchy:

```python
from ai_assistant_lib import (
    AIAssistantError,        # Base exception
    RetryableError,          # Can be retried
    NonRetryableError,       # Should not retry
    ServiceUnavailableError,
    RateLimitError,
    ValidationError
)
```

### Error Recovery

Built-in error recovery patterns:

```python
try:
    result = await service.process()
except RetryableError as e:
    # Will be automatically retried by decorator
    logger.warning(f"Retryable error: {e}")
except NonRetryableError as e:
    # Fatal error, don't retry
    logger.error(f"Fatal error: {e}")
    raise
```

## Monitoring and Metrics

### Health Checks

All services provide health status:

```python
health = service.get_health_status()
# {
#     "status": "healthy",
#     "is_healthy": true,
#     "uptime_seconds": 3600,
#     "error_count": 0
# }
```

### Performance Metrics

Track performance automatically:

```python
from ai_assistant_lib import MetricsCollector

metrics = MetricsCollector()
metrics.track("transcription_time", elapsed)
metrics.track("cache_hit_rate", hit_rate)

report = metrics.get_report()
```

## Migration from YouTube Summarizer

### Pattern Mapping

| YouTube Summarizer Pattern | AI Assistant Library Equivalent |
|----------------------------|---------------------------------|
| Custom retry logic         | `@async_retry` decorator        |
| Manual cache management    | `CacheManager` class            |
| Database operations        | `BaseRepository`                |
| Service initialization     | `BaseService`                   |
| Error handling             | Exception hierarchy             |

### Code Migration Example

**Before (YouTube Summarizer):**

```python
class TranscriptService:
    def __init__(self):
        self.cache = {}

    async def get_transcript(self, video_id):
        if video_id in self.cache:
            return self.cache[video_id]

        # Retry logic
        for attempt in range(3):
            try:
                result = await self.fetch_transcript(video_id)
                self.cache[video_id] = result
                return result
            except Exception as e:
                if attempt == 2:
                    raise
                await asyncio.sleep(2 ** attempt)
```

**After (With Library):**

```python
from ai_assistant_lib import BaseService, cached, async_retry

class TranscriptService(BaseService):
    @cached(ttl=3600)
    @async_retry(max_attempts=3)
    async def get_transcript(self, video_id):
        return await self.fetch_transcript(video_id)
```

## Best Practices

### 1. Always Use Protocols

Define protocols for all services to enable easy swapping:

```python
class ProcessorProtocol(Protocol):
    async def process(self, data: Any) -> Any: ...
```

### 2. Leverage Type Hints

Use type hints for better IDE support:

```python
async def process_batch(
    self,
    files: List[Path],
    processor: ProcessorProtocol
) -> Dict[str, Any]:
    ...
```

### 3. Configuration Over Code

Use configuration files instead of hardcoding:

```python
config = load_config("config.yaml")
service = MyService(config=config)
```

### 4. Test with Real Data

Use the library's support for real file testing:

```python
test_file = Path("tests/fixtures/audio/sample.wav")
result = await service.transcribe(test_file)
```

## Troubleshooting

### Common Issues

1. **Import Errors**
   - Ensure symlink is created: `ln -s ../../lib lib`
   - Check Python path includes library
2. **Type Errors**
   - Library requires Python 3.11+
   - Use proper type hints
3. **Async Errors**
   - Always use `async`/`await`
   - Don't mix sync and async code

### Debug Mode

Enable debug logging:

```python
import logging

logging.getLogger("ai_assistant_lib").setLevel(logging.DEBUG)
```

## Summary

The AI Assistant Library provides Trax with:

- ✅ **Production-tested components** - Used across multiple projects
- ✅ **Consistent patterns** - Same patterns everywhere
- ✅ **Built-in resilience** - Retry, circuit breaker, caching
- ✅ **Type safety** - Full typing support
- ✅ **Performance optimization** - Connection pooling, batch operations
- ✅ **Comprehensive testing** - Test utilities and fixtures

By leveraging this library, Trax can focus on its unique media processing capabilities while relying on proven infrastructure components.

---

For more information about the library, see:

- [Library Source](../../lib/)
- [Library Tests](../../lib/tests/)
- [Usage Examples](../examples/)