# Story 2.4: Multi-Model Support
## Status
Draft
## Story
**As a** user
**I want** the system to support multiple AI models (OpenAI, Anthropic, DeepSeek) with intelligent selection
**so that** I can choose the best model for my content type and optimize for cost or quality preferences
## Acceptance Criteria
1. Support for multiple AI providers: OpenAI GPT-4o-mini, Anthropic Claude, DeepSeek V2
2. Intelligent model selection based on content type, length, and user preferences
3. Automatic fallback to alternative models when primary model fails or is unavailable
4. Cost comparison and optimization recommendations for different model choices
5. Model performance tracking and quality comparison across different content types
6. User preference management for model selection and fallback strategies
## Tasks / Subtasks
- [ ] **Task 1: Multi-Model Service Architecture** (AC: 1, 3)
  - [ ] Create `AIModelRegistry` for managing multiple model providers
  - [ ] Implement provider-specific adapters (OpenAI, Anthropic, DeepSeek)
  - [ ] Create unified interface for model switching and fallback logic
  - [ ] Add model availability monitoring and health checks
- [ ] **Task 2: Model-Specific Implementations** (AC: 1)
  - [ ] Implement `AnthropicSummarizer` for Claude 3.5 (Haiku or Sonnet) integration
  - [ ] Create `DeepSeekSummarizer` for DeepSeek V2 integration
  - [ ] Standardize prompt optimization for each model's strengths
  - [ ] Add model-specific parameter tuning and optimization
- [ ] **Task 3: Intelligent Model Selection** (AC: 2, 4)
  - [ ] Create content analysis for optimal model matching
  - [ ] Implement cost-quality optimization algorithms
  - [ ] Add model recommendation engine based on content characteristics
  - [ ] Create user preference learning system
- [ ] **Task 4: Fallback and Reliability** (AC: 3)
  - [ ] Implement automatic failover logic with error classification
  - [ ] Create model health monitoring and status tracking
  - [ ] Add graceful degradation with quality maintenance
  - [ ] Implement retry logic with model rotation
- [ ] **Task 5: Performance and Cost Analytics** (AC: 4, 5)
  - [ ] Create model performance comparison dashboard
  - [ ] Implement cost tracking and optimization recommendations
  - [ ] Add quality scoring across different models and content types
  - [ ] Create model usage analytics and insights
- [ ] **Task 6: User Experience and Configuration** (AC: 6)
  - [ ] Add model selection options in frontend interface
  - [ ] Create user preference management for model choices
  - [ ] Implement model comparison tools for users
  - [ ] Add real-time cost estimates and recommendations
- [ ] **Task 7: Integration and Testing** (AC: 1, 2, 3, 4, 5, 6)
  - [ ] Update SummaryPipeline to use multi-model system
  - [ ] Test model switching and fallback scenarios
  - [ ] Validate cost calculations and performance metrics
  - [ ] Create comprehensive model comparison testing
## Dev Notes
### Architecture Context
This story transforms the single-model AI service into a sophisticated multi-model system that can intelligently choose and switch between AI providers. The system must maintain consistency while optimizing for user preferences, content requirements, and cost efficiency.
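Provider credentials and defaults need to be in place before the registry can route anything. Below is a minimal configuration sketch, assuming environment-variable-based settings; the module path, variable names, and `ModelSettings` helper are illustrative, not existing code:
```python
# backend/core/model_settings.py (hypothetical module; adjust to the project's config layout)
import os
from dataclasses import dataclass, field


@dataclass
class ModelSettings:
    """Per-provider credentials and defaults, read from the environment."""
    openai_api_key: str = field(default_factory=lambda: os.getenv("OPENAI_API_KEY", ""))
    anthropic_api_key: str = field(default_factory=lambda: os.getenv("ANTHROPIC_API_KEY", ""))
    deepseek_api_key: str = field(default_factory=lambda: os.getenv("DEEPSEEK_API_KEY", ""))
    default_priority: str = "balanced"  # cost | quality | speed | balanced

    def configured_providers(self) -> list[str]:
        """Names of providers that have an API key set; only these get registered."""
        keys = {
            "openai": self.openai_api_key,
            "anthropic": self.anthropic_api_key,
            "deepseek": self.deepseek_api_key,
        }
        return [name for name, key in keys.items() if key]
```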
### Multi-Model Architecture Design
[Source: docs/architecture.md#multi-model-ai-architecture]
```python
# backend/services/ai_model_registry.py
from abc import ABC, abstractmethod
from enum import Enum
from typing import Dict, List, Optional, Any, Union
from dataclasses import dataclass
import asyncio
import time
from ..services.ai_service import AIService, SummaryRequest, SummaryResult
class ModelProvider(Enum):
OPENAI = "openai"
ANTHROPIC = "anthropic"
DEEPSEEK = "deepseek"
class ModelCapability(Enum):
GENERAL_SUMMARIZATION = "general_summarization"
TECHNICAL_CONTENT = "technical_content"
CREATIVE_CONTENT = "creative_content"
LONG_FORM_CONTENT = "long_form_content"
MULTILINGUAL = "multilingual"
COST_OPTIMIZED = "cost_optimized"
HIGH_QUALITY = "high_quality"
@dataclass
class ModelSpecs:
provider: ModelProvider
model_name: str
max_input_tokens: int
max_output_tokens: int
cost_per_1k_input_tokens: float
cost_per_1k_output_tokens: float
capabilities: List[ModelCapability]
quality_score: float # 0.0 to 1.0
speed_score: float # 0.0 to 1.0 (relative)
reliability_score: float # 0.0 to 1.0
@dataclass
class ModelSelection:
primary_model: ModelProvider
fallback_models: List[ModelProvider]
reasoning: str
estimated_cost: float
estimated_quality: float
class AIModelRegistry:
"""Registry and orchestrator for multiple AI models"""
def __init__(self):
self.models: Dict[ModelProvider, AIService] = {}
self.model_specs: Dict[ModelProvider, ModelSpecs] = {}
self.model_health: Dict[ModelProvider, Dict[str, Any]] = {}
self._initialize_model_specs()
def _initialize_model_specs(self):
"""Initialize model specifications and capabilities"""
self.model_specs[ModelProvider.OPENAI] = ModelSpecs(
provider=ModelProvider.OPENAI,
model_name="gpt-4o-mini",
max_input_tokens=128000,
max_output_tokens=16384,
cost_per_1k_input_tokens=0.00015,
cost_per_1k_output_tokens=0.0006,
capabilities=[
ModelCapability.GENERAL_SUMMARIZATION,
ModelCapability.TECHNICAL_CONTENT,
ModelCapability.CREATIVE_CONTENT,
ModelCapability.COST_OPTIMIZED
],
quality_score=0.85,
speed_score=0.90,
reliability_score=0.95
)
self.model_specs[ModelProvider.ANTHROPIC] = ModelSpecs(
provider=ModelProvider.ANTHROPIC,
model_name="claude-3-5-haiku-20241022",
max_input_tokens=200000,
max_output_tokens=8192,
cost_per_1k_input_tokens=0.001,
cost_per_1k_output_tokens=0.005,
capabilities=[
ModelCapability.GENERAL_SUMMARIZATION,
ModelCapability.TECHNICAL_CONTENT,
ModelCapability.LONG_FORM_CONTENT,
ModelCapability.HIGH_QUALITY
],
quality_score=0.95,
speed_score=0.80,
reliability_score=0.92
)
self.model_specs[ModelProvider.DEEPSEEK] = ModelSpecs(
provider=ModelProvider.DEEPSEEK,
model_name="deepseek-chat",
max_input_tokens=64000,
max_output_tokens=4096,
cost_per_1k_input_tokens=0.00014,
cost_per_1k_output_tokens=0.00028,
capabilities=[
ModelCapability.GENERAL_SUMMARIZATION,
ModelCapability.TECHNICAL_CONTENT,
ModelCapability.COST_OPTIMIZED
],
quality_score=0.80,
speed_score=0.85,
reliability_score=0.88
)
def register_model(self, provider: ModelProvider, model_service: AIService):
"""Register a model service with the registry"""
self.models[provider] = model_service
self.model_health[provider] = {
"status": "healthy",
"last_check": time.time(),
"error_count": 0,
"success_rate": 1.0
}
async def select_optimal_model(
self,
request: SummaryRequest,
user_preferences: Optional[Dict[str, Any]] = None
) -> ModelSelection:
"""Select optimal model based on content and preferences"""
# Analyze content characteristics
content_analysis = await self._analyze_content_for_model_selection(request)
# Get user preferences
preferences = user_preferences or {}
priority = preferences.get("priority", "balanced") # cost, quality, speed, balanced
# Score models based on requirements
model_scores = {}
for provider, specs in self.model_specs.items():
if provider not in self.models:
continue # Skip unavailable models
score = self._calculate_model_score(specs, content_analysis, priority)
model_scores[provider] = score
# Sort by score and filter healthy models
healthy_models = [
provider for provider, health in self.model_health.items()
if health["status"] == "healthy" and provider in model_scores
]
if not healthy_models:
raise Exception("No healthy AI models available")
# Select primary and fallback models
sorted_models = sorted(healthy_models, key=lambda p: model_scores[p], reverse=True)
primary_model = sorted_models[0]
fallback_models = sorted_models[1:3] # Top 2 fallbacks
# Calculate estimates
primary_specs = self.model_specs[primary_model]
estimated_cost = self._estimate_cost(request, primary_specs)
estimated_quality = primary_specs.quality_score
# Generate reasoning
reasoning = self._generate_selection_reasoning(
primary_model, content_analysis, priority, model_scores[primary_model]
)
return ModelSelection(
primary_model=primary_model,
fallback_models=fallback_models,
reasoning=reasoning,
estimated_cost=estimated_cost,
estimated_quality=estimated_quality
)
async def generate_summary_with_fallback(
self,
request: SummaryRequest,
model_selection: ModelSelection
) -> SummaryResult:
"""Generate summary with automatic fallback"""
models_to_try = [model_selection.primary_model] + model_selection.fallback_models
last_error = None
for model_provider in models_to_try:
try:
model_service = self.models[model_provider]
# Update health monitoring
start_time = time.time()
result = await model_service.generate_summary(request)
# Record success
await self._record_model_success(model_provider, time.time() - start_time)
# Add model info to result
result.processing_metadata["model_provider"] = model_provider.value
result.processing_metadata["model_name"] = self.model_specs[model_provider].model_name
result.processing_metadata["fallback_used"] = model_provider != model_selection.primary_model
return result
except Exception as e:
last_error = e
await self._record_model_error(model_provider, str(e))
# If this was the last model to try, raise the error
if model_provider == models_to_try[-1]:
raise Exception(f"All AI models failed. Last error: {str(e)}")
# Continue to next model
continue
raise Exception("No AI models available for processing")
async def _analyze_content_for_model_selection(self, request: SummaryRequest) -> Dict[str, Any]:
"""Analyze content to determine optimal model characteristics"""
transcript = request.transcript
analysis = {
"length": len(transcript),
"word_count": len(transcript.split()),
"token_estimate": len(transcript) // 4, # Rough estimate
"complexity": "medium",
"content_type": "general",
"technical_density": 0.0,
"required_capabilities": [ModelCapability.GENERAL_SUMMARIZATION]
}
# Analyze content type and complexity
lower_transcript = transcript.lower()
# Technical content detection
technical_indicators = [
"algorithm", "function", "variable", "database", "api", "code",
"programming", "software", "technical", "implementation", "architecture"
]
technical_count = sum(1 for word in technical_indicators if word in lower_transcript)
if technical_count >= 5:
analysis["content_type"] = "technical"
analysis["technical_density"] = min(1.0, technical_count / 20)
analysis["required_capabilities"].append(ModelCapability.TECHNICAL_CONTENT)
# Long-form content detection
if analysis["word_count"] > 5000:
analysis["required_capabilities"].append(ModelCapability.LONG_FORM_CONTENT)
# Creative content detection
creative_indicators = ["story", "creative", "art", "design", "narrative", "experience"]
if sum(1 for word in creative_indicators if word in lower_transcript) >= 3:
analysis["content_type"] = "creative"
analysis["required_capabilities"].append(ModelCapability.CREATIVE_CONTENT)
# Complexity assessment
avg_sentence_length = analysis["word_count"] / len(transcript.split('.'))
if avg_sentence_length > 25:
analysis["complexity"] = "high"
elif avg_sentence_length < 15:
analysis["complexity"] = "low"
return analysis
def _calculate_model_score(
self,
specs: ModelSpecs,
content_analysis: Dict[str, Any],
priority: str
) -> float:
"""Calculate score for model based on requirements and preferences"""
score = 0.0
# Base capability matching
required_capabilities = content_analysis["required_capabilities"]
capability_match = len([cap for cap in required_capabilities if cap in specs.capabilities])
capability_score = capability_match / len(required_capabilities) if required_capabilities else 1.0
# Token limit checking
token_estimate = content_analysis["token_estimate"]
if token_estimate > specs.max_input_tokens:
return 0.0 # Cannot handle this content
# Priority-based scoring
if priority == "cost":
cost_score = 1.0 - (specs.cost_per_1k_input_tokens / 0.002) # Normalize against max expected cost
score = 0.4 * capability_score + 0.5 * cost_score + 0.1 * specs.reliability_score
elif priority == "quality":
score = 0.3 * capability_score + 0.6 * specs.quality_score + 0.1 * specs.reliability_score
elif priority == "speed":
score = 0.3 * capability_score + 0.5 * specs.speed_score + 0.2 * specs.reliability_score
else: # balanced
score = (0.3 * capability_score + 0.25 * specs.quality_score +
0.2 * specs.speed_score + 0.15 * specs.reliability_score +
0.1 * (1.0 - specs.cost_per_1k_input_tokens / 0.002))
# Bonus for specific content type alignment
if content_analysis["content_type"] == "technical" and ModelCapability.TECHNICAL_CONTENT in specs.capabilities:
score += 0.1
return min(1.0, max(0.0, score))
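# Worked example of the "balanced" weights above (illustrative numbers): with a full
# capability match (1.0) for both models, GPT-4o-mini scores roughly
#   0.3*1.0 + 0.25*0.85 + 0.2*0.90 + 0.15*0.95 + 0.1*(1 - 0.00015/0.002) ~= 0.93
# while Claude 3.5 Haiku scores
#   0.3*1.0 + 0.25*0.95 + 0.2*0.80 + 0.15*0.92 + 0.1*(1 - 0.001/0.002) ~= 0.89
# so the cheaper model is preferred under "balanced"; choosing the "quality"
# priority flips the ranking (0.905 vs 0.962).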
def _estimate_cost(self, request: SummaryRequest, specs: ModelSpecs) -> float:
"""Estimate cost for processing with specific model"""
input_tokens = len(request.transcript) // 4 # Rough estimate
output_tokens = 500 # Average summary length
input_cost = (input_tokens / 1000) * specs.cost_per_1k_input_tokens
output_cost = (output_tokens / 1000) * specs.cost_per_1k_output_tokens
return input_cost + output_cost
def _generate_selection_reasoning(
self,
selected_model: ModelProvider,
content_analysis: Dict[str, Any],
priority: str,
score: float
) -> str:
"""Generate human-readable reasoning for model selection"""
specs = self.model_specs[selected_model]
reasons = [f"Selected {specs.model_name} (score: {score:.2f})"]
if priority == "cost":
reasons.append(f"Cost-optimized choice at ${specs.cost_per_1k_input_tokens:.5f} per 1K tokens")
elif priority == "quality":
reasons.append(f"High quality option (quality score: {specs.quality_score:.2f})")
elif priority == "speed":
reasons.append(f"Fast processing (speed score: {specs.speed_score:.2f})")
if content_analysis["content_type"] == "technical":
reasons.append("Optimized for technical content")
if content_analysis["word_count"] > 3000:
reasons.append("Suitable for long-form content")
return ". ".join(reasons)
async def _record_model_success(self, provider: ModelProvider, processing_time: float):
"""Record successful model usage"""
health = self.model_health[provider]
health["status"] = "healthy"
health["last_check"] = time.time()
health["success_rate"] = min(1.0, health["success_rate"] + 0.01)
health["avg_processing_time"] = processing_time
async def _record_model_error(self, provider: ModelProvider, error: str):
"""Record model error for health monitoring"""
health = self.model_health[provider]
health["error_count"] += 1
health["last_error"] = error
health["last_check"] = time.time()
health["success_rate"] = max(0.0, health["success_rate"] - 0.05)
# Mark as unhealthy if too many errors
if health["error_count"] > 5 and health["success_rate"] < 0.3:
health["status"] = "unhealthy"
async def get_model_comparison(self, request: SummaryRequest) -> Dict[str, Any]:
"""Get comparison of all available models for the request"""
content_analysis = await self._analyze_content_for_model_selection(request)
comparisons = {}
for provider, specs in self.model_specs.items():
if provider not in self.models:
continue
comparisons[provider.value] = {
"model_name": specs.model_name,
"estimated_cost": self._estimate_cost(request, specs),
"quality_score": specs.quality_score,
"speed_score": specs.speed_score,
"capabilities": [cap.value for cap in specs.capabilities],
"health_status": self.model_health[provider]["status"],
"suitability_scores": {
"cost_optimized": self._calculate_model_score(specs, content_analysis, "cost"),
"quality_focused": self._calculate_model_score(specs, content_analysis, "quality"),
"speed_focused": self._calculate_model_score(specs, content_analysis, "speed"),
"balanced": self._calculate_model_score(specs, content_analysis, "balanced")
}
}
return {
"content_analysis": content_analysis,
"model_comparisons": comparisons,
"recommendation": await self.select_optimal_model(request)
}
def get_health_status(self) -> Dict[str, Any]:
"""Get health status of all registered models"""
return {
"models": {
provider.value: {
"status": health["status"],
"success_rate": health["success_rate"],
"error_count": health["error_count"],
"last_check": health["last_check"],
"model_name": self.model_specs[provider].model_name
}
for provider, health in self.model_health.items()
},
"total_healthy": sum(1 for h in self.model_health.values() if h["status"] == "healthy"),
"total_models": len(self.model_health)
}
```
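To show how the registry might plug into the existing pipeline, here is a hedged wiring sketch. `OpenAISummarizer` and the `settings` object are assumed from earlier stories, so treat the names as placeholders rather than the final integration:
```python
# Hedged sketch: registering adapters and routing one request through the registry.
# `settings` (API keys, default priority) and OpenAISummarizer are assumed from earlier stories.
from backend.services.ai_model_registry import AIModelRegistry, ModelProvider
from backend.services.anthropic_summarizer import AnthropicSummarizer
from backend.services.deepseek_summarizer import DeepSeekSummarizer


async def summarize_with_registry(request, settings):
    registry = AIModelRegistry()
    if settings.anthropic_api_key:
        registry.register_model(ModelProvider.ANTHROPIC, AnthropicSummarizer(settings.anthropic_api_key))
    if settings.deepseek_api_key:
        registry.register_model(ModelProvider.DEEPSEEK, DeepSeekSummarizer(settings.deepseek_api_key))
    # registry.register_model(ModelProvider.OPENAI, OpenAISummarizer(settings.openai_api_key))

    selection = await registry.select_optimal_model(
        request, user_preferences={"priority": settings.default_priority}
    )
    result = await registry.generate_summary_with_fallback(request, selection)
    return selection, result
```
The SummaryPipeline update in Task 7 would replace its direct single-model call with something along these lines.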
### Model-Specific Implementations
[Source: docs/architecture.md#model-adapters]
```python
# backend/services/anthropic_summarizer.py
import time

import anthropic
from .ai_service import (  # AIServiceError assumed to be defined alongside AIService
    AIService,
    AIServiceError,
    SummaryLength,
    SummaryRequest,
    SummaryResult,
)
class AnthropicSummarizer(AIService):
def __init__(self, api_key: str, model: str = "claude-3-5-haiku-20241022"):
self.client = anthropic.AsyncAnthropic(api_key=api_key)
self.model = model
# Cost per 1K tokens (as of 2025)
self.input_cost_per_1k = 0.001 # $1.00 per 1M input tokens
self.output_cost_per_1k = 0.005 # $5.00 per 1M output tokens
async def generate_summary(self, request: SummaryRequest) -> SummaryResult:
"""Generate summary using Anthropic Claude"""
prompt = self._build_anthropic_prompt(request)
try:
start_time = time.time()
message = await self.client.messages.create(
model=self.model,
max_tokens=self._get_max_tokens(request.length),
temperature=0.3,
messages=[
{"role": "user", "content": prompt}
]
)
processing_time = time.time() - start_time
# Parse response (Anthropic returns structured text)
result_data = self._parse_anthropic_response(message.content[0].text)
# Calculate costs
input_tokens = message.usage.input_tokens
output_tokens = message.usage.output_tokens
input_cost = (input_tokens / 1000) * self.input_cost_per_1k
output_cost = (output_tokens / 1000) * self.output_cost_per_1k
return SummaryResult(
summary=result_data["summary"],
key_points=result_data["key_points"],
main_themes=result_data["main_themes"],
actionable_insights=result_data["actionable_insights"],
confidence_score=result_data["confidence_score"],
processing_metadata={
"model": self.model,
"processing_time_seconds": processing_time,
"input_tokens": input_tokens,
"output_tokens": output_tokens,
"provider": "anthropic"
},
cost_data={
"input_cost_usd": input_cost,
"output_cost_usd": output_cost,
"total_cost_usd": input_cost + output_cost
}
)
except Exception as e:
raise AIServiceError(f"Anthropic summarization failed: {str(e)}")
def _build_anthropic_prompt(self, request: SummaryRequest) -> str:
"""Build prompt optimized for Claude's instruction-following"""
length_words = {
SummaryLength.BRIEF: "100-200 words",
SummaryLength.STANDARD: "300-500 words",
SummaryLength.DETAILED: "500-800 words"
}
return f"""Please analyze this YouTube video transcript and provide a comprehensive summary.
Summary Requirements:
- Length: {length_words[request.length]}
- Focus areas: {', '.join(request.focus_areas) if request.focus_areas else 'general content'}
- Language: {request.language}
Please structure your response as follows:
## Summary
[Main summary text here - {length_words[request.length]}]
## Key Points
- [Point 1]
- [Point 2]
- [Point 3-7 as appropriate]
## Main Themes
- [Theme 1]
- [Theme 2]
- [Theme 3-4 as appropriate]
## Actionable Insights
- [Insight 1]
- [Insight 2]
- [Insight 3-5 as appropriate]
## Confidence Score
[Rate your confidence in this summary from 0.0 to 1.0]
Transcript:
{request.transcript}"""
# backend/services/deepseek_summarizer.py
import json
import time

import httpx
from .ai_service import (  # AIServiceError assumed to be defined alongside AIService
    AIService,
    AIServiceError,
    SummaryLength,
    SummaryRequest,
    SummaryResult,
)
class DeepSeekSummarizer(AIService):
def __init__(self, api_key: str, model: str = "deepseek-chat"):
self.api_key = api_key
self.model = model
self.base_url = "https://api.deepseek.com/v1"
# Cost per 1K tokens (DeepSeek pricing)
self.input_cost_per_1k = 0.00014 # $0.14 per 1M input tokens
self.output_cost_per_1k = 0.00028 # $0.28 per 1M output tokens
async def generate_summary(self, request: SummaryRequest) -> SummaryResult:
"""Generate summary using DeepSeek API"""
prompt = self._build_deepseek_prompt(request)
async with httpx.AsyncClient() as client:
try:
start_time = time.time()
response = await client.post(
f"{self.base_url}/chat/completions",
headers={
"Authorization": f"Bearer {self.api_key}",
"Content-Type": "application/json"
},
json={
"model": self.model,
"messages": [
{"role": "system", "content": "You are an expert content summarizer."},
{"role": "user", "content": prompt}
],
"temperature": 0.3,
"max_tokens": self._get_max_tokens(request.length),
"response_format": {"type": "json_object"}
},
timeout=60.0
)
response.raise_for_status()
data = response.json()
processing_time = time.time() - start_time
usage = data["usage"]
# Parse JSON response
result_data = json.loads(data["choices"][0]["message"]["content"])
# Calculate costs
input_cost = (usage["prompt_tokens"] / 1000) * self.input_cost_per_1k
output_cost = (usage["completion_tokens"] / 1000) * self.output_cost_per_1k
return SummaryResult(
summary=result_data.get("summary", ""),
key_points=result_data.get("key_points", []),
main_themes=result_data.get("main_themes", []),
actionable_insights=result_data.get("actionable_insights", []),
confidence_score=result_data.get("confidence_score", 0.8),
processing_metadata={
"model": self.model,
"processing_time_seconds": processing_time,
"prompt_tokens": usage["prompt_tokens"],
"completion_tokens": usage["completion_tokens"],
"provider": "deepseek"
},
cost_data={
"input_cost_usd": input_cost,
"output_cost_usd": output_cost,
"total_cost_usd": input_cost + output_cost
}
)
except Exception as e:
raise AIServiceError(f"DeepSeek summarization failed: {str(e)}")
```
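`_build_deepseek_prompt` is referenced by `DeepSeekSummarizer` above but not shown. Because the request sets `response_format: {"type": "json_object"}`, the prompt has to spell out the expected JSON shape explicitly. One possible sketch (a method of `DeepSeekSummarizer`; field names mirror `SummaryResult`, the wording is illustrative):
```python
# Hedged sketch of the prompt builder referenced by DeepSeekSummarizer above.
def _build_deepseek_prompt(self, request: SummaryRequest) -> str:
    length_words = {
        SummaryLength.BRIEF: "100-200 words",
        SummaryLength.STANDARD: "300-500 words",
        SummaryLength.DETAILED: "500-800 words",
    }
    focus = ", ".join(request.focus_areas) if request.focus_areas else "general content"
    return f"""Summarize the following YouTube transcript and respond with a single JSON object
containing exactly these keys:
- "summary": string ({length_words[request.length]})
- "key_points": array of 3-7 strings
- "main_themes": array of 2-4 strings
- "actionable_insights": array of 2-5 strings
- "confidence_score": number between 0.0 and 1.0

Focus areas: {focus}
Language: {request.language}

Transcript:
{request.transcript}"""
```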
### Frontend Model Selection Interface
[Source: docs/architecture.md#frontend-integration]
```typescript
// frontend/src/components/forms/ModelSelector.tsx
import { useState } from 'react';
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card';
import { Button } from '@/components/ui/button';
import { Badge } from '@/components/ui/badge';
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from '@/components/ui/select';
import { Tabs, TabsContent, TabsList, TabsTrigger } from '@/components/ui/tabs';
interface ModelComparison {
model_name: string;
estimated_cost: number;
quality_score: number;
speed_score: number;
capabilities: string[];
health_status: string;
suitability_scores: {
cost_optimized: number;
quality_focused: number;
speed_focused: number;
balanced: number;
};
}
interface ModelSelectorProps {
comparisons: Record<string, ModelComparison>;
selectedModel?: string;
onModelSelect: (model: string, priority: string) => void;
}
export function ModelSelector({ comparisons, selectedModel, onModelSelect }: ModelSelectorProps) {
const [priority, setPriority] = useState<string>('balanced');
const [showComparison, setShowComparison] = useState(false);
const getBestModelForPriority = (priority: string) => {
const scores = Object.entries(comparisons).map(([provider, data]) => ({
provider,
score: data.suitability_scores[priority as keyof typeof data.suitability_scores]
}));
return scores.sort((a, b) => b.score - a.score)[0]?.provider;
};
const formatCost = (cost: number) => `$${cost.toFixed(4)}`;
const getQualityBadgeColor = (score: number) => {
if (score >= 0.9) return 'bg-green-100 text-green-800';
if (score >= 0.8) return 'bg-blue-100 text-blue-800';
return 'bg-yellow-100 text-yellow-800';
};
return (
<Card className="w-full">
<CardHeader>
<CardTitle className="flex items-center justify-between">
<span>AI Model Selection</span>
<Button
variant="outline"
size="sm"
onClick={() => setShowComparison(!showComparison)}
>
{showComparison ? 'Hide' : 'Show'} Comparison
</Button>
</CardTitle>
</CardHeader>
<CardContent>
<div className="space-y-4">
<div className="flex items-center space-x-4">
<label className="text-sm font-medium">Priority:</label>
<Select value={priority} onValueChange={setPriority}>
<SelectTrigger className="w-40">
<SelectValue />
</SelectTrigger>
<SelectContent>
<SelectItem value="cost">Cost Optimized</SelectItem>
<SelectItem value="quality">High Quality</SelectItem>
<SelectItem value="speed">Fast Processing</SelectItem>
<SelectItem value="balanced">Balanced</SelectItem>
</SelectContent>
</Select>
<Button
onClick={() => onModelSelect(getBestModelForPriority(priority), priority)}
variant="default"
>
Use Recommended ({getBestModelForPriority(priority)})
</Button>
</div>
{showComparison && (
<Tabs defaultValue="overview" className="w-full">
<TabsList className="grid w-full grid-cols-2">
<TabsTrigger value="overview">Overview</TabsTrigger>
<TabsTrigger value="detailed">Detailed Comparison</TabsTrigger>
</TabsList>
<TabsContent value="overview" className="space-y-4">
<div className="grid grid-cols-1 md:grid-cols-3 gap-4">
{Object.entries(comparisons).map(([provider, data]) => (
<Card
key={provider}
className={`cursor-pointer transition-colors ${
selectedModel === provider ? 'ring-2 ring-blue-500' : 'hover:bg-gray-50'
}`}
onClick={() => onModelSelect(provider, priority)}
>
<CardHeader className="pb-2">
<CardTitle className="text-sm flex items-center justify-between">
<span>{data.model_name}</span>
<Badge
className={
data.health_status === 'healthy'
? 'bg-green-100 text-green-800'
: 'bg-red-100 text-red-800'
}
>
{data.health_status}
</Badge>
</CardTitle>
</CardHeader>
<CardContent className="pt-0 space-y-2">
<div className="flex justify-between text-sm">
<span>Cost:</span>
<span className="font-mono">{formatCost(data.estimated_cost)}</span>
</div>
<div className="flex justify-between text-sm">
<span>Quality:</span>
<Badge className={getQualityBadgeColor(data.quality_score)}>
{(data.quality_score * 100).toFixed(0)}%
</Badge>
</div>
<div className="flex justify-between text-sm">
<span>Speed:</span>
<Badge className={getQualityBadgeColor(data.speed_score)}>
{(data.speed_score * 100).toFixed(0)}%
</Badge>
</div>
<div className="text-xs text-gray-600">
Suitability: {(data.suitability_scores[priority as keyof typeof data.suitability_scores] * 100).toFixed(0)}%
</div>
</CardContent>
</Card>
))}
</div>
</TabsContent>
<TabsContent value="detailed" className="space-y-4">
<div className="overflow-x-auto">
<table className="min-w-full divide-y divide-gray-200">
<thead className="bg-gray-50">
<tr>
<th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase">Model</th>
<th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase">Cost</th>
<th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase">Quality</th>
<th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase">Speed</th>
<th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase">Capabilities</th>
<th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase">Status</th>
</tr>
</thead>
<tbody className="bg-white divide-y divide-gray-200">
{Object.entries(comparisons).map(([provider, data]) => (
<tr key={provider} className="hover:bg-gray-50">
<td className="px-6 py-4 whitespace-nowrap text-sm font-medium text-gray-900">
{data.model_name}
</td>
<td className="px-6 py-4 whitespace-nowrap text-sm text-gray-500 font-mono">
{formatCost(data.estimated_cost)}
</td>
<td className="px-6 py-4 whitespace-nowrap text-sm text-gray-500">
{(data.quality_score * 100).toFixed(0)}%
</td>
<td className="px-6 py-4 whitespace-nowrap text-sm text-gray-500">
{(data.speed_score * 100).toFixed(0)}%
</td>
<td className="px-6 py-4 text-sm text-gray-500">
<div className="flex flex-wrap gap-1">
{data.capabilities.slice(0, 3).map(cap => (
<Badge key={cap} variant="secondary" className="text-xs">
{cap.replace('_', ' ')}
</Badge>
))}
{data.capabilities.length > 3 && (
<Badge variant="secondary" className="text-xs">
+{data.capabilities.length - 3} more
</Badge>
)}
</div>
</td>
<td className="px-6 py-4 whitespace-nowrap">
<Badge
className={
data.health_status === 'healthy'
? 'bg-green-100 text-green-800'
: 'bg-red-100 text-red-800'
}
>
{data.health_status}
</Badge>
</td>
</tr>
))}
</tbody>
</table>
</div>
</TabsContent>
</Tabs>
)}
</div>
</CardContent>
</Card>
);
}
```
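On the backend, `ModelSelector` needs the `comparisons` map keyed by provider. A minimal endpoint sketch that could supply it, assuming a FastAPI backend and a module-level `registry` wired as in the earlier sketch (route path, request model, and field names are assumptions):
```python
# Hedged sketch: API route feeding the ModelSelector component above.
# Assumes a FastAPI backend, a shared wired `registry`, and SummaryRequest fields as used in this story.
from fastapi import APIRouter
from pydantic import BaseModel

from backend.services.ai_service import SummaryLength, SummaryRequest  # field names assumed
# from backend.api.deps import registry  # however the app exposes the wired AIModelRegistry

router = APIRouter(prefix="/api/models")


class CompareRequest(BaseModel):
    transcript: str
    length: str = "standard"
    focus_areas: list[str] = []
    language: str = "en"


@router.post("/compare")
async def compare_models(payload: CompareRequest):
    request = SummaryRequest(
        transcript=payload.transcript,
        length=SummaryLength(payload.length),
        focus_areas=payload.focus_areas,
        language=payload.language,
    )
    # `model_comparisons` in the response matches the Record<string, ModelComparison>
    # shape consumed by ModelSelector; `recommendation` can seed the default selection.
    return await registry.get_model_comparison(request)
```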
### Performance Benefits
- **Intelligent Model Selection**: Automatically chooses a suitable model based on content characteristics, length, and user preferences
- **Cost Optimization**: Routing content to the cheapest capable model (e.g., DeepSeek or GPT-4o-mini instead of Claude) can substantially reduce per-summary cost (see the worked example after this list)
- **Quality Assurance**: Fallback mechanisms keep summaries flowing at acceptable quality when the primary model fails or is unavailable
- **Flexibility**: Users can prioritize cost, quality, or speed based on their needs
- **Reliability**: Multi-model redundancy reduces the impact of any single provider's outage on summarization availability
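To make the cost trade-off concrete, here is a back-of-the-envelope check using the per-1K-token rates from the model specs above, against an assumed 10,000-token transcript and 500-token summary (both figures are illustrative):
```python
# Hedged sketch: cost comparison for an assumed 10,000-token input / 500-token output.
RATES_PER_1K = {                       # (input, output) USD per 1K tokens, from the specs above
    "gpt-4o-mini": (0.00015, 0.0006),
    "claude-3-5-haiku": (0.001, 0.005),
    "deepseek-chat": (0.00014, 0.00028),
}

input_tokens, output_tokens = 10_000, 500
for model, (in_rate, out_rate) in RATES_PER_1K.items():
    cost = (input_tokens / 1000) * in_rate + (output_tokens / 1000) * out_rate
    print(f"{model}: ${cost:.4f}")
# gpt-4o-mini: $0.0018, claude-3-5-haiku: $0.0125, deepseek-chat: $0.0015
```
Under these assumptions, routing a typical transcript to DeepSeek or GPT-4o-mini instead of Claude is roughly 7-8x cheaper per summary, which is the basis for the cost-optimization claim above.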
## Change Log
| Date | Version | Description | Author |
|------|---------|-------------|--------|
| 2025-01-25 | 1.0 | Initial story creation | Bob (Scrum Master) |
## Dev Agent Record
*This section will be populated by the development agent during implementation*
## QA Results
*Results from QA Agent review of the completed story implementation will be added here*