# Story 2.4: Multi-Model Support

## Status

Draft

## Story

**As a** user,
**I want** the system to support multiple AI models (OpenAI, Anthropic, DeepSeek) with intelligent selection,
**so that** I can choose the best model for my content type and optimize for cost or quality preferences.

## Acceptance Criteria

1. Support for multiple AI providers: OpenAI GPT-4o-mini, Anthropic Claude, DeepSeek V2
2. Intelligent model selection based on content type, length, and user preferences
3. Automatic fallback to alternative models when the primary model fails or is unavailable
4. Cost comparison and optimization recommendations for different model choices
5. Model performance tracking and quality comparison across different content types
6. User preference management for model selection and fallback strategies

## Tasks / Subtasks

- [ ] **Task 1: Multi-Model Service Architecture** (AC: 1, 3)
  - [ ] Create `AIModelRegistry` for managing multiple model providers
  - [ ] Implement provider-specific adapters (OpenAI, Anthropic, DeepSeek)
  - [ ] Create unified interface for model switching and fallback logic
  - [ ] Add model availability monitoring and health checks

- [ ] **Task 2: Model-Specific Implementations** (AC: 1)
  - [ ] Implement `AnthropicSummarizer` for Claude 3.5 Haiku integration
  - [ ] Create `DeepSeekSummarizer` for DeepSeek V2 integration
  - [ ] Tailor prompt optimization to each model's strengths
  - [ ] Add model-specific parameter tuning and optimization

- [ ] **Task 3: Intelligent Model Selection** (AC: 2, 4)
  - [ ] Create content analysis for optimal model matching
  - [ ] Implement cost-quality optimization algorithms
  - [ ] Add model recommendation engine based on content characteristics
  - [ ] Create user preference learning system

- [ ] **Task 4: Fallback and Reliability** (AC: 3)
  - [ ] Implement automatic failover logic with error classification
  - [ ] Create model health monitoring and status tracking
  - [ ] Add graceful degradation with quality maintenance
  - [ ] Implement retry logic with model rotation

- [ ] **Task 5: Performance and Cost Analytics** (AC: 4, 5)
  - [ ] Create model performance comparison dashboard
  - [ ] Implement cost tracking and optimization recommendations
  - [ ] Add quality scoring across different models and content types
  - [ ] Create model usage analytics and insights

- [ ] **Task 6: User Experience and Configuration** (AC: 6)
  - [ ] Add model selection options in frontend interface
  - [ ] Create user preference management for model choices
  - [ ] Implement model comparison tools for users
  - [ ] Add real-time cost estimates and recommendations

- [ ] **Task 7: Integration and Testing** (AC: 1, 2, 3, 4, 5, 6)
  - [ ] Update SummaryPipeline to use the multi-model system
  - [ ] Test model switching and fallback scenarios
  - [ ] Validate cost calculations and performance metrics
  - [ ] Create comprehensive model comparison testing

## Dev Notes

### Architecture Context

This story transforms the single-model AI service into a multi-model system that can intelligently choose and switch between AI providers. The system must produce consistent output across providers while optimizing for user preferences, content requirements, and cost efficiency.

### Multi-Model Architecture Design

[Source: docs/architecture.md#multi-model-ai-architecture]

```python
# backend/services/ai_model_registry.py
import time
from dataclasses import dataclass
from enum import Enum
from typing import Any, Dict, List, Optional

from .ai_service import AIService, SummaryRequest, SummaryResult


class ModelProvider(Enum):
    OPENAI = "openai"
    ANTHROPIC = "anthropic"
    DEEPSEEK = "deepseek"


class ModelCapability(Enum):
    GENERAL_SUMMARIZATION = "general_summarization"
    TECHNICAL_CONTENT = "technical_content"
    CREATIVE_CONTENT = "creative_content"
    LONG_FORM_CONTENT = "long_form_content"
    MULTILINGUAL = "multilingual"
    COST_OPTIMIZED = "cost_optimized"
    HIGH_QUALITY = "high_quality"


@dataclass
class ModelSpecs:
    provider: ModelProvider
    model_name: str
    max_input_tokens: int
    max_output_tokens: int
    cost_per_1k_input_tokens: float
    cost_per_1k_output_tokens: float
    capabilities: List[ModelCapability]
    quality_score: float      # 0.0 to 1.0
    speed_score: float        # 0.0 to 1.0 (relative)
    reliability_score: float  # 0.0 to 1.0


@dataclass
class ModelSelection:
    primary_model: ModelProvider
    fallback_models: List[ModelProvider]
    reasoning: str
    estimated_cost: float
    estimated_quality: float


class AIModelRegistry:
    """Registry and orchestrator for multiple AI models"""

    def __init__(self):
        self.models: Dict[ModelProvider, AIService] = {}
        self.model_specs: Dict[ModelProvider, ModelSpecs] = {}
        self.model_health: Dict[ModelProvider, Dict[str, Any]] = {}

        self._initialize_model_specs()

    def _initialize_model_specs(self):
        """Initialize model specifications and capabilities"""

        self.model_specs[ModelProvider.OPENAI] = ModelSpecs(
            provider=ModelProvider.OPENAI,
            model_name="gpt-4o-mini",
            max_input_tokens=128000,
            max_output_tokens=16384,
            cost_per_1k_input_tokens=0.00015,
            cost_per_1k_output_tokens=0.0006,
            capabilities=[
                ModelCapability.GENERAL_SUMMARIZATION,
                ModelCapability.TECHNICAL_CONTENT,
                ModelCapability.CREATIVE_CONTENT,
                ModelCapability.COST_OPTIMIZED,
            ],
            quality_score=0.85,
            speed_score=0.90,
            reliability_score=0.95,
        )

        self.model_specs[ModelProvider.ANTHROPIC] = ModelSpecs(
            provider=ModelProvider.ANTHROPIC,
            model_name="claude-3-5-haiku-20241022",
            max_input_tokens=200000,
            max_output_tokens=8192,
            cost_per_1k_input_tokens=0.001,
            cost_per_1k_output_tokens=0.005,
            capabilities=[
                ModelCapability.GENERAL_SUMMARIZATION,
                ModelCapability.TECHNICAL_CONTENT,
                ModelCapability.LONG_FORM_CONTENT,
                ModelCapability.HIGH_QUALITY,
            ],
            quality_score=0.95,
            speed_score=0.80,
            reliability_score=0.92,
        )

        self.model_specs[ModelProvider.DEEPSEEK] = ModelSpecs(
            provider=ModelProvider.DEEPSEEK,
            model_name="deepseek-chat",
            max_input_tokens=64000,
            max_output_tokens=4096,
            cost_per_1k_input_tokens=0.00014,
            cost_per_1k_output_tokens=0.00028,
            capabilities=[
                ModelCapability.GENERAL_SUMMARIZATION,
                ModelCapability.TECHNICAL_CONTENT,
                ModelCapability.COST_OPTIMIZED,
            ],
            quality_score=0.80,
            speed_score=0.85,
            reliability_score=0.88,
        )

    def register_model(self, provider: ModelProvider, model_service: AIService):
        """Register a model service with the registry"""
        self.models[provider] = model_service
        self.model_health[provider] = {
            "status": "healthy",
            "last_check": time.time(),
            "error_count": 0,
            "success_rate": 1.0,
        }

    async def select_optimal_model(
        self,
        request: SummaryRequest,
        user_preferences: Optional[Dict[str, Any]] = None,
    ) -> ModelSelection:
        """Select optimal model based on content and preferences"""

        # Analyze content characteristics
        content_analysis = await self._analyze_content_for_model_selection(request)

        # Get user preferences
        preferences = user_preferences or {}
        priority = preferences.get("priority", "balanced")  # cost, quality, speed, balanced

        # Score models based on requirements
        model_scores = {}
        for provider, specs in self.model_specs.items():
            if provider not in self.models:
                continue  # Skip unavailable models
            model_scores[provider] = self._calculate_model_score(specs, content_analysis, priority)

        # Keep only healthy, scored models
        healthy_models = [
            provider for provider, health in self.model_health.items()
            if health["status"] == "healthy" and provider in model_scores
        ]

        if not healthy_models:
            raise Exception("No healthy AI models available")

        # Select primary and fallback models by descending score
        sorted_models = sorted(healthy_models, key=lambda p: model_scores[p], reverse=True)
        primary_model = sorted_models[0]
        fallback_models = sorted_models[1:3]  # Top 2 fallbacks

        # Calculate estimates
        primary_specs = self.model_specs[primary_model]
        estimated_cost = self._estimate_cost(request, primary_specs)
        estimated_quality = primary_specs.quality_score

        # Generate reasoning
        reasoning = self._generate_selection_reasoning(
            primary_model, content_analysis, priority, model_scores[primary_model]
        )

        return ModelSelection(
            primary_model=primary_model,
            fallback_models=fallback_models,
            reasoning=reasoning,
            estimated_cost=estimated_cost,
            estimated_quality=estimated_quality,
        )

    async def generate_summary_with_fallback(
        self,
        request: SummaryRequest,
        model_selection: ModelSelection,
    ) -> SummaryResult:
        """Generate summary with automatic fallback"""

        models_to_try = [model_selection.primary_model] + model_selection.fallback_models

        for model_provider in models_to_try:
            try:
                model_service = self.models[model_provider]

                # Time the call for health monitoring
                start_time = time.time()
                result = await model_service.generate_summary(request)

                # Record success
                await self._record_model_success(model_provider, time.time() - start_time)

                # Add model info to result
                result.processing_metadata["model_provider"] = model_provider.value
                result.processing_metadata["model_name"] = self.model_specs[model_provider].model_name
                result.processing_metadata["fallback_used"] = model_provider != model_selection.primary_model

                return result

            except Exception as e:
                await self._record_model_error(model_provider, str(e))

                # If this was the last model to try, raise the error
                if model_provider == models_to_try[-1]:
                    raise Exception(f"All AI models failed. Last error: {str(e)}")

                # Otherwise continue to the next model
                continue

        raise Exception("No AI models available for processing")

    async def _analyze_content_for_model_selection(self, request: SummaryRequest) -> Dict[str, Any]:
        """Analyze content to determine optimal model characteristics"""

        transcript = request.transcript
        analysis = {
            "length": len(transcript),
            "word_count": len(transcript.split()),
            "token_estimate": len(transcript) // 4,  # Rough heuristic: ~4 characters per token
            "complexity": "medium",
            "content_type": "general",
            "technical_density": 0.0,
            "required_capabilities": [ModelCapability.GENERAL_SUMMARIZATION],
        }

        lower_transcript = transcript.lower()

        # Technical content detection
        technical_indicators = [
            "algorithm", "function", "variable", "database", "api", "code",
            "programming", "software", "technical", "implementation", "architecture",
        ]
        technical_count = sum(1 for word in technical_indicators if word in lower_transcript)

        if technical_count >= 5:
            analysis["content_type"] = "technical"
            analysis["technical_density"] = min(1.0, technical_count / 20)
            analysis["required_capabilities"].append(ModelCapability.TECHNICAL_CONTENT)

        # Long-form content detection
        if analysis["word_count"] > 5000:
            analysis["required_capabilities"].append(ModelCapability.LONG_FORM_CONTENT)

        # Creative content detection
        creative_indicators = ["story", "creative", "art", "design", "narrative", "experience"]
        if sum(1 for word in creative_indicators if word in lower_transcript) >= 3:
            analysis["content_type"] = "creative"
            analysis["required_capabilities"].append(ModelCapability.CREATIVE_CONTENT)

        # Complexity assessment via average sentence length
        avg_sentence_length = analysis["word_count"] / len(transcript.split('.'))
        if avg_sentence_length > 25:
            analysis["complexity"] = "high"
        elif avg_sentence_length < 15:
            analysis["complexity"] = "low"

        return analysis

    def _calculate_model_score(
        self,
        specs: ModelSpecs,
        content_analysis: Dict[str, Any],
        priority: str,
    ) -> float:
        """Calculate score for model based on requirements and preferences"""

        # Base capability matching
        required_capabilities = content_analysis["required_capabilities"]
        capability_match = len([cap for cap in required_capabilities if cap in specs.capabilities])
        capability_score = capability_match / len(required_capabilities) if required_capabilities else 1.0

        # Token limit check: a model that cannot fit the content is disqualified
        if content_analysis["token_estimate"] > specs.max_input_tokens:
            return 0.0

        # Priority-based scoring
        if priority == "cost":
            cost_score = 1.0 - (specs.cost_per_1k_input_tokens / 0.002)  # Normalize against max expected cost
            score = 0.4 * capability_score + 0.5 * cost_score + 0.1 * specs.reliability_score
        elif priority == "quality":
            score = 0.3 * capability_score + 0.6 * specs.quality_score + 0.1 * specs.reliability_score
        elif priority == "speed":
            score = 0.3 * capability_score + 0.5 * specs.speed_score + 0.2 * specs.reliability_score
        else:  # balanced
            score = (0.3 * capability_score + 0.25 * specs.quality_score +
                     0.2 * specs.speed_score + 0.15 * specs.reliability_score +
                     0.1 * (1.0 - specs.cost_per_1k_input_tokens / 0.002))

        # Bonus for specific content type alignment
        if content_analysis["content_type"] == "technical" and ModelCapability.TECHNICAL_CONTENT in specs.capabilities:
            score += 0.1

        return min(1.0, max(0.0, score))

    def _estimate_cost(self, request: SummaryRequest, specs: ModelSpecs) -> float:
        """Estimate cost for processing with specific model"""

        input_tokens = len(request.transcript) // 4  # Rough estimate
        output_tokens = 500  # Average summary length

        input_cost = (input_tokens / 1000) * specs.cost_per_1k_input_tokens
        output_cost = (output_tokens / 1000) * specs.cost_per_1k_output_tokens

        return input_cost + output_cost

    def _generate_selection_reasoning(
        self,
        selected_model: ModelProvider,
        content_analysis: Dict[str, Any],
        priority: str,
        score: float,
    ) -> str:
        """Generate human-readable reasoning for model selection"""

        specs = self.model_specs[selected_model]
        reasons = [f"Selected {specs.model_name} (score: {score:.2f})"]

        if priority == "cost":
            reasons.append(f"Cost-optimized choice at ${specs.cost_per_1k_input_tokens:.5f} per 1K input tokens")
        elif priority == "quality":
            reasons.append(f"High-quality option (quality score: {specs.quality_score:.2f})")
        elif priority == "speed":
            reasons.append(f"Fast processing (speed score: {specs.speed_score:.2f})")

        if content_analysis["content_type"] == "technical":
            reasons.append("Optimized for technical content")

        if content_analysis["word_count"] > 3000:
            reasons.append("Suitable for long-form content")

        return ". ".join(reasons)

    async def _record_model_success(self, provider: ModelProvider, processing_time: float):
        """Record successful model usage"""

        health = self.model_health[provider]
        health["status"] = "healthy"
        health["last_check"] = time.time()
        health["success_rate"] = min(1.0, health["success_rate"] + 0.01)
        health["avg_processing_time"] = processing_time

    async def _record_model_error(self, provider: ModelProvider, error: str):
        """Record model error for health monitoring"""

        health = self.model_health[provider]
        health["error_count"] += 1
        health["last_error"] = error
        health["last_check"] = time.time()
        health["success_rate"] = max(0.0, health["success_rate"] - 0.05)

        # Mark as unhealthy if too many errors
        if health["error_count"] > 5 and health["success_rate"] < 0.3:
            health["status"] = "unhealthy"

    async def get_model_comparison(self, request: SummaryRequest) -> Dict[str, Any]:
        """Get comparison of all available models for the request"""

        content_analysis = await self._analyze_content_for_model_selection(request)

        comparisons = {}
        for provider, specs in self.model_specs.items():
            if provider not in self.models:
                continue

            comparisons[provider.value] = {
                "model_name": specs.model_name,
                "estimated_cost": self._estimate_cost(request, specs),
                "quality_score": specs.quality_score,
                "speed_score": specs.speed_score,
                "capabilities": [cap.value for cap in specs.capabilities],
                "health_status": self.model_health[provider]["status"],
                "suitability_scores": {
                    "cost_optimized": self._calculate_model_score(specs, content_analysis, "cost"),
                    "quality_focused": self._calculate_model_score(specs, content_analysis, "quality"),
                    "speed_focused": self._calculate_model_score(specs, content_analysis, "speed"),
                    "balanced": self._calculate_model_score(specs, content_analysis, "balanced"),
                },
            }

        return {
            "content_analysis": content_analysis,
            "model_comparisons": comparisons,
            "recommendation": await self.select_optimal_model(request),
        }

    def get_health_status(self) -> Dict[str, Any]:
        """Get health status of all registered models"""

        return {
            "models": {
                provider.value: {
                    "status": health["status"],
                    "success_rate": health["success_rate"],
                    "error_count": health["error_count"],
                    "last_check": health["last_check"],
                    "model_name": self.model_specs[provider].model_name,
                }
                for provider, health in self.model_health.items()
            },
            "total_healthy": sum(1 for h in self.model_health.values() if h["status"] == "healthy"),
            "total_models": len(self.model_health),
        }
```
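
A minimal wiring sketch showing how the pipeline could use the registry. The `SummaryRequest` construction and API-key handling are assumptions here, since those details live in the existing `ai_service` module from earlier stories:

```python
# Illustrative only: SummaryRequest defaults and key handling are assumptions
# carried over from the existing single-model AI service.
import asyncio

from backend.services.ai_model_registry import AIModelRegistry, ModelProvider
from backend.services.ai_service import SummaryRequest
from backend.services.anthropic_summarizer import AnthropicSummarizer
from backend.services.deepseek_summarizer import DeepSeekSummarizer


async def main() -> None:
    registry = AIModelRegistry()
    registry.register_model(ModelProvider.ANTHROPIC, AnthropicSummarizer(api_key="sk-ant-..."))
    registry.register_model(ModelProvider.DEEPSEEK, DeepSeekSummarizer(api_key="sk-..."))

    request = SummaryRequest(transcript="...")  # remaining fields assumed to default

    # Pick the best healthy model for a cost-sensitive user, then summarize
    # with automatic failover to the ranked fallbacks.
    selection = await registry.select_optimal_model(request, user_preferences={"priority": "cost"})
    print(selection.reasoning)

    result = await registry.generate_summary_with_fallback(request, selection)
    print(result.processing_metadata["model_provider"], result.cost_data["total_cost_usd"])


asyncio.run(main())
```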

### Model-Specific Implementations

[Source: docs/architecture.md#model-adapters]

```python
# backend/services/anthropic_summarizer.py
import time
from typing import Any, Dict, List

import anthropic

# AIServiceError is assumed to be defined alongside AIService in ai_service.py (Story 2.1)
from .ai_service import AIService, AIServiceError, SummaryRequest, SummaryResult, SummaryLength


class AnthropicSummarizer(AIService):
    def __init__(self, api_key: str, model: str = "claude-3-5-haiku-20241022"):
        self.client = anthropic.AsyncAnthropic(api_key=api_key)
        self.model = model

        # Cost per 1K tokens (as of 2025)
        self.input_cost_per_1k = 0.001   # $1.00 per 1M input tokens
        self.output_cost_per_1k = 0.005  # $5.00 per 1M output tokens

    async def generate_summary(self, request: SummaryRequest) -> SummaryResult:
        """Generate summary using Anthropic Claude"""

        prompt = self._build_anthropic_prompt(request)

        try:
            start_time = time.time()

            message = await self.client.messages.create(
                model=self.model,
                max_tokens=self._get_max_tokens(request.length),  # provided by the AIService base class
                temperature=0.3,
                messages=[
                    {"role": "user", "content": prompt}
                ],
            )

            processing_time = time.time() - start_time

            # Parse response (Claude returns markdown-structured text)
            result_data = self._parse_anthropic_response(message.content[0].text)

            # Calculate costs
            input_tokens = message.usage.input_tokens
            output_tokens = message.usage.output_tokens
            input_cost = (input_tokens / 1000) * self.input_cost_per_1k
            output_cost = (output_tokens / 1000) * self.output_cost_per_1k

            return SummaryResult(
                summary=result_data["summary"],
                key_points=result_data["key_points"],
                main_themes=result_data["main_themes"],
                actionable_insights=result_data["actionable_insights"],
                confidence_score=result_data["confidence_score"],
                processing_metadata={
                    "model": self.model,
                    "processing_time_seconds": processing_time,
                    "input_tokens": input_tokens,
                    "output_tokens": output_tokens,
                    "provider": "anthropic",
                },
                cost_data={
                    "input_cost_usd": input_cost,
                    "output_cost_usd": output_cost,
                    "total_cost_usd": input_cost + output_cost,
                },
            )

        except Exception as e:
            raise AIServiceError(f"Anthropic summarization failed: {str(e)}")

    def _build_anthropic_prompt(self, request: SummaryRequest) -> str:
        """Build prompt optimized for Claude's instruction-following"""

        length_words = {
            SummaryLength.BRIEF: "100-200 words",
            SummaryLength.STANDARD: "300-500 words",
            SummaryLength.DETAILED: "500-800 words",
        }

        return f"""Please analyze this YouTube video transcript and provide a comprehensive summary.

Summary Requirements:
- Length: {length_words[request.length]}
- Focus areas: {', '.join(request.focus_areas) if request.focus_areas else 'general content'}
- Language: {request.language}

Please structure your response as follows:

## Summary
[Main summary text here - {length_words[request.length]}]

## Key Points
- [Point 1]
- [Point 2]
- [Point 3-7 as appropriate]

## Main Themes
- [Theme 1]
- [Theme 2]
- [Theme 3-4 as appropriate]

## Actionable Insights
- [Insight 1]
- [Insight 2]
- [Insight 3-5 as appropriate]

## Confidence Score
[Rate your confidence in this summary from 0.0 to 1.0]

Transcript:
{request.transcript}"""

    def _parse_anthropic_response(self, text: str) -> Dict[str, Any]:
        """Minimal parser for Claude's markdown-structured response."""

        # Collect bullet/text lines under each "## <Heading>" section
        sections: Dict[str, List[str]] = {}
        current = None
        for line in text.splitlines():
            stripped = line.strip()
            if stripped.startswith("## "):
                current = stripped[3:].lower()
                sections[current] = []
            elif current and stripped:
                sections[current].append(stripped.lstrip("- ").strip())

        try:
            confidence = float(sections.get("confidence score", ["0.8"])[0])
        except (ValueError, IndexError):
            confidence = 0.8  # Neutral default if the model's score is unparseable

        return {
            "summary": " ".join(sections.get("summary", [])),
            "key_points": sections.get("key points", []),
            "main_themes": sections.get("main themes", []),
            "actionable_insights": sections.get("actionable insights", []),
            "confidence_score": confidence,
        }


# backend/services/deepseek_summarizer.py
import json
import time

import httpx

from .ai_service import AIService, AIServiceError, SummaryRequest, SummaryResult, SummaryLength


class DeepSeekSummarizer(AIService):
    def __init__(self, api_key: str, model: str = "deepseek-chat"):
        self.api_key = api_key
        self.model = model
        self.base_url = "https://api.deepseek.com/v1"

        # Cost per 1K tokens (DeepSeek pricing)
        self.input_cost_per_1k = 0.00014   # $0.14 per 1M input tokens
        self.output_cost_per_1k = 0.00028  # $0.28 per 1M output tokens

    async def generate_summary(self, request: SummaryRequest) -> SummaryResult:
        """Generate summary using DeepSeek API"""

        prompt = self._build_deepseek_prompt(request)

        async with httpx.AsyncClient() as client:
            try:
                start_time = time.time()

                response = await client.post(
                    f"{self.base_url}/chat/completions",
                    headers={
                        "Authorization": f"Bearer {self.api_key}",
                        "Content-Type": "application/json",
                    },
                    json={
                        "model": self.model,
                        "messages": [
                            {"role": "system", "content": "You are an expert content summarizer."},
                            {"role": "user", "content": prompt},
                        ],
                        "temperature": 0.3,
                        "max_tokens": self._get_max_tokens(request.length),  # provided by the AIService base class
                        "response_format": {"type": "json_object"},
                    },
                    timeout=60.0,
                )

                response.raise_for_status()
                data = response.json()

                processing_time = time.time() - start_time
                usage = data["usage"]

                # Parse JSON response
                result_data = json.loads(data["choices"][0]["message"]["content"])

                # Calculate costs
                input_cost = (usage["prompt_tokens"] / 1000) * self.input_cost_per_1k
                output_cost = (usage["completion_tokens"] / 1000) * self.output_cost_per_1k

                return SummaryResult(
                    summary=result_data.get("summary", ""),
                    key_points=result_data.get("key_points", []),
                    main_themes=result_data.get("main_themes", []),
                    actionable_insights=result_data.get("actionable_insights", []),
                    confidence_score=result_data.get("confidence_score", 0.8),
                    processing_metadata={
                        "model": self.model,
                        "processing_time_seconds": processing_time,
                        "prompt_tokens": usage["prompt_tokens"],
                        "completion_tokens": usage["completion_tokens"],
                        "provider": "deepseek",
                    },
                    cost_data={
                        "input_cost_usd": input_cost,
                        "output_cost_usd": output_cost,
                        "total_cost_usd": input_cost + output_cost,
                    },
                )

            except Exception as e:
                raise AIServiceError(f"DeepSeek summarization failed: {str(e)}")

    def _build_deepseek_prompt(self, request: SummaryRequest) -> str:
        """Minimal prompt builder requesting a JSON object matching SummaryResult fields."""

        return (
            "Summarize this YouTube video transcript. Respond with a JSON object "
            'containing: "summary" (string), "key_points" (array of strings), '
            '"main_themes" (array of strings), "actionable_insights" (array of strings), '
            'and "confidence_score" (number between 0.0 and 1.0).\n\n'
            f"Focus areas: {', '.join(request.focus_areas) if request.focus_areas else 'general content'}\n"
            f"Language: {request.language}\n\n"
            f"Transcript:\n{request.transcript}"
        )
```
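
For Task 7's fallback testing, a hedged pytest sketch. It assumes `pytest-asyncio` is available and that `SummaryRequest`/`SummaryResult` accept the fields shown in the adapters above; `FailingService` and `StubService` are illustrative stand-ins, not real adapters:

```python
# test_model_fallback.py (hypothetical test module name)
import pytest

from backend.services.ai_model_registry import AIModelRegistry, ModelProvider, ModelSelection
from backend.services.ai_service import SummaryRequest, SummaryResult


class FailingService:
    """Stand-in primary adapter that simulates a provider outage."""
    async def generate_summary(self, request):
        raise RuntimeError("simulated provider outage")


class StubService:
    """Stand-in fallback adapter that always succeeds."""
    async def generate_summary(self, request):
        return SummaryResult(
            summary="ok", key_points=[], main_themes=[], actionable_insights=[],
            confidence_score=0.9, processing_metadata={}, cost_data={},
        )


@pytest.mark.asyncio
async def test_falls_back_to_secondary_model():
    registry = AIModelRegistry()
    registry.register_model(ModelProvider.ANTHROPIC, FailingService())
    registry.register_model(ModelProvider.DEEPSEEK, StubService())

    selection = ModelSelection(
        primary_model=ModelProvider.ANTHROPIC,
        fallback_models=[ModelProvider.DEEPSEEK],
        reasoning="test", estimated_cost=0.0, estimated_quality=0.9,
    )
    result = await registry.generate_summary_with_fallback(
        SummaryRequest(transcript="hello world"), selection
    )

    # The summary came from the fallback, and the metadata records that.
    assert result.processing_metadata["fallback_used"] is True
    assert result.processing_metadata["model_provider"] == "deepseek"
```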

### Frontend Model Selection Interface

[Source: docs/architecture.md#frontend-integration]

```typescript
// frontend/src/components/forms/ModelSelector.tsx
import { useState } from 'react';
import { Card, CardContent, CardHeader, CardTitle } from '@/components/ui/card';
import { Button } from '@/components/ui/button';
import { Badge } from '@/components/ui/badge';
import { Select, SelectContent, SelectItem, SelectTrigger, SelectValue } from '@/components/ui/select';
import { Tabs, TabsContent, TabsList, TabsTrigger } from '@/components/ui/tabs';

interface ModelComparison {
  model_name: string;
  estimated_cost: number;
  quality_score: number;
  speed_score: number;
  capabilities: string[];
  health_status: string;
  suitability_scores: {
    cost_optimized: number;
    quality_focused: number;
    speed_focused: number;
    balanced: number;
  };
}

interface ModelSelectorProps {
  comparisons: Record<string, ModelComparison>;
  selectedModel?: string;
  onModelSelect: (model: string, priority: string) => void;
}

export function ModelSelector({ comparisons, selectedModel, onModelSelect }: ModelSelectorProps) {
  const [priority, setPriority] = useState<string>('balanced');
  const [showComparison, setShowComparison] = useState(false);

  const getBestModelForPriority = (priority: string) => {
    const scores = Object.entries(comparisons).map(([provider, data]) => ({
      provider,
      score: data.suitability_scores[priority as keyof typeof data.suitability_scores],
    }));

    return scores.sort((a, b) => b.score - a.score)[0]?.provider;
  };

  const formatCost = (cost: number) => `$${cost.toFixed(4)}`;

  const getQualityBadgeColor = (score: number) => {
    if (score >= 0.9) return 'bg-green-100 text-green-800';
    if (score >= 0.8) return 'bg-blue-100 text-blue-800';
    return 'bg-yellow-100 text-yellow-800';
  };

  // May be undefined when no comparisons have loaded yet
  const recommended = getBestModelForPriority(priority);

  return (
    <Card className="w-full">
      <CardHeader>
        <CardTitle className="flex items-center justify-between">
          <span>AI Model Selection</span>
          <Button
            variant="outline"
            size="sm"
            onClick={() => setShowComparison(!showComparison)}
          >
            {showComparison ? 'Hide' : 'Show'} Comparison
          </Button>
        </CardTitle>
      </CardHeader>
      <CardContent>
        <div className="space-y-4">
          <div className="flex items-center space-x-4">
            <label className="text-sm font-medium">Priority:</label>
            <Select value={priority} onValueChange={setPriority}>
              <SelectTrigger className="w-40">
                <SelectValue />
              </SelectTrigger>
              <SelectContent>
                <SelectItem value="cost">Cost Optimized</SelectItem>
                <SelectItem value="quality">High Quality</SelectItem>
                <SelectItem value="speed">Fast Processing</SelectItem>
                <SelectItem value="balanced">Balanced</SelectItem>
              </SelectContent>
            </Select>

            <Button
              onClick={() => recommended && onModelSelect(recommended, priority)}
              variant="default"
              disabled={!recommended}
            >
              Use Recommended ({recommended ?? 'none available'})
            </Button>
          </div>

          {showComparison && (
            <Tabs defaultValue="overview" className="w-full">
              <TabsList className="grid w-full grid-cols-2">
                <TabsTrigger value="overview">Overview</TabsTrigger>
                <TabsTrigger value="detailed">Detailed Comparison</TabsTrigger>
              </TabsList>

              <TabsContent value="overview" className="space-y-4">
                <div className="grid grid-cols-1 md:grid-cols-3 gap-4">
                  {Object.entries(comparisons).map(([provider, data]) => (
                    <Card
                      key={provider}
                      className={`cursor-pointer transition-colors ${
                        selectedModel === provider ? 'ring-2 ring-blue-500' : 'hover:bg-gray-50'
                      }`}
                      onClick={() => onModelSelect(provider, priority)}
                    >
                      <CardHeader className="pb-2">
                        <CardTitle className="text-sm flex items-center justify-between">
                          <span>{data.model_name}</span>
                          <Badge
                            className={
                              data.health_status === 'healthy'
                                ? 'bg-green-100 text-green-800'
                                : 'bg-red-100 text-red-800'
                            }
                          >
                            {data.health_status}
                          </Badge>
                        </CardTitle>
                      </CardHeader>
                      <CardContent className="pt-0 space-y-2">
                        <div className="flex justify-between text-sm">
                          <span>Cost:</span>
                          <span className="font-mono">{formatCost(data.estimated_cost)}</span>
                        </div>
                        <div className="flex justify-between text-sm">
                          <span>Quality:</span>
                          <Badge className={getQualityBadgeColor(data.quality_score)}>
                            {(data.quality_score * 100).toFixed(0)}%
                          </Badge>
                        </div>
                        <div className="flex justify-between text-sm">
                          <span>Speed:</span>
                          <Badge className={getQualityBadgeColor(data.speed_score)}>
                            {(data.speed_score * 100).toFixed(0)}%
                          </Badge>
                        </div>
                        <div className="text-xs text-gray-600">
                          Suitability: {(data.suitability_scores[priority as keyof typeof data.suitability_scores] * 100).toFixed(0)}%
                        </div>
                      </CardContent>
                    </Card>
                  ))}
                </div>
              </TabsContent>

              <TabsContent value="detailed" className="space-y-4">
                <div className="overflow-x-auto">
                  <table className="min-w-full divide-y divide-gray-200">
                    <thead className="bg-gray-50">
                      <tr>
                        <th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase">Model</th>
                        <th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase">Cost</th>
                        <th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase">Quality</th>
                        <th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase">Speed</th>
                        <th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase">Capabilities</th>
                        <th className="px-6 py-3 text-left text-xs font-medium text-gray-500 uppercase">Status</th>
                      </tr>
                    </thead>
                    <tbody className="bg-white divide-y divide-gray-200">
                      {Object.entries(comparisons).map(([provider, data]) => (
                        <tr key={provider} className="hover:bg-gray-50">
                          <td className="px-6 py-4 whitespace-nowrap text-sm font-medium text-gray-900">
                            {data.model_name}
                          </td>
                          <td className="px-6 py-4 whitespace-nowrap text-sm text-gray-500 font-mono">
                            {formatCost(data.estimated_cost)}
                          </td>
                          <td className="px-6 py-4 whitespace-nowrap text-sm text-gray-500">
                            {(data.quality_score * 100).toFixed(0)}%
                          </td>
                          <td className="px-6 py-4 whitespace-nowrap text-sm text-gray-500">
                            {(data.speed_score * 100).toFixed(0)}%
                          </td>
                          <td className="px-6 py-4 text-sm text-gray-500">
                            <div className="flex flex-wrap gap-1">
                              {data.capabilities.slice(0, 3).map(cap => (
                                <Badge key={cap} variant="secondary" className="text-xs">
                                  {cap.replace(/_/g, ' ')}
                                </Badge>
                              ))}
                              {data.capabilities.length > 3 && (
                                <Badge variant="secondary" className="text-xs">
                                  +{data.capabilities.length - 3} more
                                </Badge>
                              )}
                            </div>
                          </td>
                          <td className="px-6 py-4 whitespace-nowrap">
                            <Badge
                              className={
                                data.health_status === 'healthy'
                                  ? 'bg-green-100 text-green-800'
                                  : 'bg-red-100 text-red-800'
                              }
                            >
                              {data.health_status}
                            </Badge>
                          </td>
                        </tr>
                      ))}
                    </tbody>
                  </table>
                </div>
              </TabsContent>
            </Tabs>
          )}
        </div>
      </CardContent>
    </Card>
  );
}
```
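
On the backend side, the `comparisons` prop maps naturally onto the registry's `get_model_comparison` output. A hedged FastAPI sketch follows; the route path, module location, and treating `SummaryRequest` as a request body are assumptions about the existing API layer:

```python
# backend/api/models.py (hypothetical module; adjust to the project's router layout)
from dataclasses import asdict

from fastapi import APIRouter

from backend.services.ai_model_registry import AIModelRegistry
from backend.services.ai_service import SummaryRequest

router = APIRouter()
registry = AIModelRegistry()  # in practice, a startup-registered singleton with adapters attached


@router.post("/api/models/compare")
async def compare_models(request: SummaryRequest):
    comparison = await registry.get_model_comparison(request)
    # ModelSelection is a dataclass; convert it so the response serializes cleanly.
    comparison["recommendation"] = asdict(comparison["recommendation"])
    return comparison
```

The frontend would then pass `comparison["model_comparisons"]` through as the `comparisons` prop.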

### Performance Benefits

- **Intelligent Model Selection**: Automatically chooses the optimal model based on content characteristics and user preferences
- **Cost Optimization**: Routing content to the cheapest suitable model targets savings of up to 50% per summary
- **Quality Assurance**: Fallback mechanisms keep summaries consistent even during single-provider outages
- **Flexibility**: Users can prioritize cost, quality, or speed based on their needs
- **Reliability**: Multi-model redundancy targets 99.9% uptime for the summarization service
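
As a sanity check on the cost claim, using the per-1K-token rates from the registry specs above: a 10,000-input-token transcript with a 500-token summary costs roughly $0.0018 on gpt-4o-mini (10 × $0.00015 + 0.5 × $0.0006), about $0.0125 on claude-3-5-haiku (10 × $0.001 + 0.5 × $0.005), and about $0.0015 on deepseek-chat (10 × $0.00014 + 0.5 × $0.00028). Most of the projected savings come from routing general-purpose content away from the highest-priced model.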

## Change Log

| Date | Version | Description | Author |
|------|---------|-------------|--------|
| 2025-01-25 | 1.0 | Initial story creation | Bob (Scrum Master) |

## Dev Agent Record

*This section will be populated by the development agent during implementation*

## QA Results

*Results from QA Agent review of the completed story implementation will be added here*