# Story 4.3: Multi-video Analysis with Multi-Agent System
## Story Overview
**Story ID**: 4.3
**Epic**: 4 - Advanced Intelligence & Developer Platform
**Title**: Multi-video Analysis with Multi-Agent System
**Status**: 📋 READY FOR IMPLEMENTATION
**Priority**: High
**Goal**: Implement playlist and multi-video analysis using a multi-agent AI system: three perspective agents (Technical, Business, User Experience) plus a synthesis agent that merges their output into one unified summary.
**Value Proposition**: Provide comprehensive multi-faceted analysis of YouTube playlists and channels, leveraging existing AI ecosystem infrastructure to generate insights from multiple AI perspectives.
**Dependencies**:
- ✅ Story 4.2 (API Endpoints & Developer SDK) - Complete
- ✅ Existing AI ecosystem at `/src/agents/ecosystem/`
- ✅ ChromaDB integration patterns from `/tests/framework-comparison/`
**Estimated Effort**: 40 hours (enhanced with multi-agent system)
## Technical Requirements
### Core Features
#### 1. Multi-Agent Summarization System
- **Three Perspective Agents**: Technical Analyst, Business Analyst, User Experience Analyst
- **Synthesis Agent**: Combines perspectives into unified comprehensive summary
- **AI Ecosystem Integration**: Leverages existing `/src/agents/ecosystem/` infrastructure
- **Agent Orchestration**: Uses BaseAgent and AgentOrchestrator patterns
- **State Management**: LangGraph-based workflow coordination
#### 2. Playlist Processing
- **Playlist URL Parser**: Extract video IDs from YouTube playlist URLs
- **Batch Video Processing**: Process all videos in playlist sequentially
- **Cross-Video Analysis**: Identify themes, patterns, and progression across videos
- **Series Tracking**: Detect content evolution and learning paths
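
As a rough illustration of the playlist URL parser, the sketch below pulls the `list` query parameter out of a URL using only the standard library. The function name `extract_playlist_id` and the accepted host set are assumptions made for this example, not part of the existing codebase.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Hosts accepted by this sketch; the real parser may need more (assumption).
YOUTUBE_HOSTS = {
    "youtube.com",
    "www.youtube.com",
    "m.youtube.com",
    "music.youtube.com",
    "youtu.be",
}


def extract_playlist_id(url: str) -> Optional[str]:
    """Return the value of the `list` query parameter, or None if absent."""
    parsed = urlparse(url.strip())
    if parsed.scheme not in ("http", "https"):
        return None
    if parsed.hostname not in YOUTUBE_HOSTS:
        return None
    values = parse_qs(parsed.query).get("list")
    return values[0] if values else None
```

This only covers the string-parsing step. Checking that the playlist exists and is accessible would still need a call to the YouTube Data API.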
#### 3. Multi-Video Intelligence
- **Theme Analysis**: Identify common topics across multiple videos
- **Content Evolution**: Track how topics develop across series
- **Channel Analysis**: Comprehensive channel content analysis
- **Trend Detection**: Identify patterns in content creation and topics
#### 4. Enhanced Export Features
- **Multi-Agent Summaries**: Export includes all three perspective summaries plus synthesis
- **Cross-Video Insights**: Aggregated insights from playlist analysis
- **Comparison Reports**: Side-by-side analysis of videos in series
- **Bulk Export**: Export all playlist summaries with agent analysis
### Technical Architecture
#### Multi-Agent System Components
```python
# Agent perspective definitions
class PerspectiveAgent:
    TECHNICAL = "technical"  # Focus on technical concepts, implementation, tools
    BUSINESS = "business"  # Focus on business value, ROI, market implications
    USER_EXPERIENCE = "user"  # Focus on user journey, usability, accessibility
    SYNTHESIS = "synthesis"  # Combines all perspectives into unified view


# Agent orchestration
# BaseAgent, the four agent classes, and MultiAgentAnalysisResult are defined elsewhere in the project.
class PlaylistAnalysisOrchestrator(BaseAgent):
    def __init__(self):
        self.technical_agent = TechnicalAnalysisAgent()
        self.business_agent = BusinessAnalysisAgent()
        self.ux_agent = UserExperienceAgent()
        self.synthesis_agent = SynthesisAgent()

    async def analyze_playlist(self, playlist_url: str) -> "MultiAgentAnalysisResult":
        # Extract videos and process with multiple agents
        # Orchestrate agent communication and synthesis
        raise NotImplementedError
```
#### Database Schema Extensions
```sql
-- Multi-agent analysis storage
CREATE TABLE agent_summaries (
    id UUID PRIMARY KEY,
    summary_id UUID REFERENCES summaries(id),
    agent_type VARCHAR(20), -- 'technical', 'business', 'user', 'synthesis'
    agent_summary TEXT,
    key_insights JSONB,
    confidence_score FLOAT,
    processing_time_seconds FLOAT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Playlist analysis
CREATE TABLE playlists (
    id UUID PRIMARY KEY,
    playlist_id VARCHAR(50),
    playlist_url TEXT,
    title VARCHAR(500),
    channel_name VARCHAR(200),
    video_count INTEGER,
    total_duration INTEGER,
    analyzed_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Cross-video analysis
CREATE TABLE playlist_analysis (
    id UUID PRIMARY KEY,
    playlist_id UUID REFERENCES playlists(id),
    themes JSONB,
    content_progression JSONB,
    key_insights JSONB,
    agent_perspectives JSONB,
    synthesis_summary TEXT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```
## Implementation Tasks
### Task 4.3.1: Multi-Agent System Integration (12 hours)
#### Subtasks:
1. **Agent Interface Setup** (3 hours)
   - Import and configure BaseAgent from `/src/agents/ecosystem/`
   - Set up AgentOrchestrator for workflow management
   - Configure DeepSeek integration for all agents (no Anthropic)
   - Test agent communication and state management
2. **Perspective Agent Implementation** (6 hours)
   - Create TechnicalAnalysisAgent with technical focus prompts
   - Create BusinessAnalysisAgent with business value prompts
   - Create UserExperienceAgent with UX/accessibility prompts
   - Implement agent-specific prompt templates and analysis patterns
3. **Synthesis Agent Development** (3 hours)
   - Create SynthesisAgent to combine multiple perspectives
   - Implement perspective weighting and conflict resolution
   - Design unified summary generation logic
   - Test multi-agent coordination and output quality
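
One possible reading of "perspective weighting" is sketched below: each insight is scored by the sum of weight times confidence over the agents that raised it. The `Perspective` dataclass, `rank_insights`, and the weighting scheme itself are illustrative assumptions. The story does not prescribe a specific algorithm, and matching insights by normalized text is a crude stand-in for real deduplication.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Perspective:
    agent_type: str
    key_insights: List[str]
    confidence_score: float  # assumed to lie in [0, 1]


def rank_insights(perspectives: List[Perspective], weights: Dict[str, float]) -> List[str]:
    """Score each insight by the summed weight * confidence of agents that raised it."""
    scores: Dict[str, float] = defaultdict(float)
    for p in perspectives:
        weight = weights.get(p.agent_type, 1.0)  # unlisted agents get weight 1.0
        for insight in p.key_insights:
            # Normalize text so identical insights from different agents merge.
            scores[insight.strip().lower()] += weight * p.confidence_score
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores, key=lambda k: (-scores[k], k))
```

An insight raised by several agents rises to the top, which gives the synthesis agent a simple ordering to start from.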
### Task 4.3.2: Playlist Processing Engine (16 hours)
#### Subtasks:
1. **Playlist URL Parser** (4 hours)
   - Extract playlist ID from various YouTube playlist URL formats
   - Validate playlist accessibility and permissions
   - Handle private/unlisted playlist edge cases
   - Integrate with YouTube Data API for metadata
2. **Video Discovery Service** (4 hours)
   - Fetch all video IDs from playlist using YouTube Data API
   - Extract video metadata (title, duration, upload date)
   - Handle pagination for large playlists (100+ videos)
   - Implement error handling for deleted/private videos
3. **Batch Processing Pipeline** (6 hours)
   - Process videos sequentially with progress tracking
   - Integrate with existing SummaryPipeline for individual videos
   - Implement multi-agent analysis for each video
   - Handle failures and retry logic for individual videos
4. **Cross-Video Analysis Engine** (2 hours)
   - Identify common themes across multiple videos
   - Track content progression and evolution
   - Generate playlist-level insights and patterns
   - Create aggregated analysis from agent perspectives
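
The pagination step of the video discovery service could look roughly like the sketch below. To keep it runnable without credentials, the API call is injected as a `fetch_page` callable (a hypothetical stand-in). The `items`, `contentDetails.videoId`, and `nextPageToken` keys mirror the shape of a YouTube Data API `playlistItems.list` response as we understand it, and should be checked against the API reference before use.

```python
from typing import Callable, Dict, Iterator, Optional

# A page fetcher takes a page token (None for the first page) and returns
# one response dict. It stands in for the real API call (hypothetical).
PageFetcher = Callable[[Optional[str]], Dict]


def iter_video_ids(fetch_page: PageFetcher) -> Iterator[str]:
    """Yield video IDs from every page, skipping entries without one."""
    token: Optional[str] = None
    while True:
        page = fetch_page(token)
        for item in page.get("items", []):
            video_id = item.get("contentDetails", {}).get("videoId")
            if video_id:
                yield video_id
        token = page.get("nextPageToken")
        if not token:
            break  # no further pages
```

Because the function is a generator, the batch pipeline can start processing the first videos before the whole playlist has been listed.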
### Task 4.3.3: API Endpoints and Integration (8 hours)
#### Subtasks:
1. **Multi-Agent API Endpoints** (4 hours)
   - `POST /api/analysis/multi-agent/{video_id}` - Analyze single video with all agents
   - `GET /api/analysis/agent-perspectives/{summary_id}` - Get all agent perspectives
   - `POST /api/analysis/playlist` - Start playlist analysis
   - `GET /api/analysis/playlist/{job_id}/status` - Monitor playlist processing
2. **Frontend Integration** (3 hours)
   - Create MultiAgentAnalysisView component
   - Add playlist URL input and validation
   - Display agent perspectives in tabbed interface
   - Show synthesis summary with highlighting from each perspective
3. **Background Job Management** (1 hour)
   - Integrate playlist processing with existing job system
   - Add WebSocket progress updates for multi-video processing
   - Implement job cancellation for long-running playlist analysis
   - Add monitoring and statistics endpoints
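
The job lifecycle behind the status endpoint and cancellation could be modeled along these lines. This is a minimal in-memory sketch, not the existing job system. `PlaylistJob` and its methods are hypothetical names, and the `completed` and `cancelled` states are assumptions added to the `pending` and `processing` values listed in the API specification.

```python
import uuid
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class PlaylistJob:
    total_videos: int
    completed_videos: int = 0
    status: str = "pending"  # pending -> processing -> completed | cancelled
    job_id: str = field(default_factory=lambda: uuid.uuid4().hex)

    def record_video_done(self) -> None:
        if self.status == "cancelled":
            return  # ignore late results after cancellation
        self.completed_videos += 1
        done = self.completed_videos >= self.total_videos
        self.status = "completed" if done else "processing"

    def cancel(self) -> bool:
        """Cancel unless the job already finished; return whether it was cancelled."""
        if self.status == "completed":
            return False
        self.status = "cancelled"
        return True

    def progress_message(self) -> Dict[str, object]:
        """Payload a status endpoint or WebSocket update could send."""
        percent = round(100 * self.completed_videos / self.total_videos, 1) if self.total_videos else 0.0
        return {
            "job_id": self.job_id,
            "status": self.status,
            "completed": self.completed_videos,
            "total": self.total_videos,
            "percent": percent,
        }
```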
### Task 4.3.4: Enhanced Export and Reporting (4 hours)
#### Subtasks:
1. **Multi-Agent Export Formats** (2 hours)
   - Enhanced markdown with agent perspective sections
   - JSON export with structured agent analysis data
   - CSV export for playlist comparison and analysis
   - Add agent analysis to existing PDF export
2. **Playlist Reports** (2 hours)
   - Generate comprehensive playlist analysis reports
   - Include cross-video insights and theme analysis
   - Create comparison tables for video progression
   - Add executive summary with key playlist insights
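
For the enhanced markdown export, a renderer might emit one section per agent in a fixed order, as sketched below. `render_markdown`, `SECTION_ORDER`, and the heading wording are assumptions for illustration. The existing export code may structure this differently.

```python
from typing import Dict, List

# Section order for the export; keys match the agent_type values in this story.
SECTION_ORDER = ["technical", "business", "user", "synthesis"]


def render_markdown(title: str, summaries: Dict[str, str]) -> str:
    """Render one markdown section per available agent summary."""
    lines: List[str] = [f"# {title}", ""]
    for agent_type in SECTION_ORDER:
        text = summaries.get(agent_type)
        if text is None:
            continue  # agent was not requested or produced no output
        lines.append(f"## {agent_type.capitalize()} Perspective")
        lines.append("")
        lines.append(text.strip())
        lines.append("")
    # Collapse trailing blank lines into a single final newline.
    return "\n".join(lines).rstrip() + "\n"
```

Skipping missing agents keeps the export usable when a request selected only a subset of `agent_types`.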
## Data Models
### Multi-Agent Analysis Models
```python
from enum import Enum
from typing import Any, Dict, List, Optional

from pydantic import BaseModel


class AgentType(str, Enum):
    TECHNICAL = "technical"
    BUSINESS = "business"
    USER_EXPERIENCE = "user"
    SYNTHESIS = "synthesis"


class AgentPerspective(BaseModel):
    agent_type: AgentType
    summary: str
    key_insights: List[str]
    confidence_score: float
    focus_areas: List[str]
    recommendations: List[str]


class MultiAgentAnalysisResult(BaseModel):
    video_id: str
    perspectives: List[AgentPerspective]
    synthesis_summary: str
    unified_insights: List[str]
    processing_time_seconds: float


class PlaylistAnalysisRequest(BaseModel):
    playlist_url: str
    include_cross_video_analysis: bool = True
    agent_types: List[AgentType] = [AgentType.TECHNICAL, AgentType.BUSINESS, AgentType.USER_EXPERIENCE]


class PlaylistAnalysisResult(BaseModel):
    playlist_id: str
    video_summaries: List[MultiAgentAnalysisResult]
    cross_video_analysis: Dict[str, Any]
    playlist_insights: List[str]
    themes: List[str]
    content_progression: Dict[str, Any]
```
## Testing Strategy
### Unit Tests
- **Agent Integration Tests**: Test BaseAgent and orchestrator integration
- **Perspective Analysis Tests**: Validate each agent's analysis quality
- **Synthesis Logic Tests**: Test perspective combination and conflict resolution
- **Playlist Parser Tests**: Validate URL parsing and video discovery
- **Cross-Video Analysis Tests**: Test theme detection and progression analysis
### Integration Tests
- **Multi-Agent API Tests**: Test all new API endpoints
- **Playlist Processing Tests**: End-to-end playlist analysis workflow
- **Database Integration Tests**: Validate agent_summaries and playlist schema
- **Export Format Tests**: Test enhanced export with agent perspectives
### Performance Tests
- **Large Playlist Tests**: Test with 50+ video playlists
- **Concurrent Analysis Tests**: Multiple playlists processing simultaneously
- **Agent Response Time Tests**: Measure individual agent processing times
- **Memory Usage Tests**: Monitor resource usage during large batch processing
## API Specification
### Multi-Agent Analysis Endpoints
```yaml
/api/analysis/multi-agent/{video_id}:
  post:
    summary: Analyze video with multiple agent perspectives
    parameters:
      - name: video_id
        in: path
        required: true
        schema:
          type: string
    requestBody:
      required: true
      content:
        application/json:
          schema:
            type: object
            properties:
              agent_types:
                type: array
                items:
                  type: string
                  enum: [technical, business, user]
              include_synthesis:
                type: boolean
                default: true
    responses:
      '200':
        description: Multi-agent analysis result
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/MultiAgentAnalysisResult'
/api/analysis/playlist:
  post:
    summary: Start playlist analysis with multi-agent system
    requestBody:
      required: true
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/PlaylistAnalysisRequest'
    responses:
      '202':
        description: Playlist analysis job accepted
        content:
          application/json:
            schema:
              type: object
              properties:
                job_id:
                  type: string
                status:
                  type: string
                  enum: [pending, processing]
                estimated_completion_time:
                  type: string
                  format: date-time
```
## Success Criteria
### Functional Requirements ✅
- [ ] Multi-agent system analyzes videos with 3 different perspectives (Technical, Business, UX)
- [ ] Synthesis agent combines perspectives into unified comprehensive summary
- [ ] Playlist URL parser extracts and validates playlist information
- [ ] Batch processing handles playlists of 20+ videos
- [ ] Cross-video analysis identifies themes and content progression
- [ ] Enhanced export includes all agent perspectives and synthesis
- [ ] API endpoints support multi-agent analysis and playlist processing
### Quality Requirements ✅
- [ ] Agent perspectives show distinct focus areas and insights
- [ ] Synthesis summary effectively combines perspectives without redundancy
- [ ] Processing time under 45 seconds per video for multi-agent analysis
- [ ] Cross-video analysis provides meaningful insights for playlists
- [ ] Export formats maintain readability with multi-agent content
- [ ] Error handling gracefully manages failed videos in playlists
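
Graceful handling of failed videos could follow the pattern sketched below: retry each video a bounded number of times, then record the failure and move on so one bad video does not abort the playlist. `process_batch` and the `process_one` callable are hypothetical names. In the real pipeline `process_one` would wrap the per-video multi-agent analysis, and a backoff delay between attempts would likely be added.

```python
from typing import Callable, Dict, Iterable, Tuple


def process_batch(
    video_ids: Iterable[str],
    process_one: Callable[[str], str],
    max_attempts: int = 3,
) -> Tuple[Dict[str, str], Dict[str, str]]:
    """Process videos in order; return (results, failures) keyed by video ID."""
    results: Dict[str, str] = {}
    failures: Dict[str, str] = {}
    for video_id in video_ids:
        for attempt in range(1, max_attempts + 1):
            try:
                results[video_id] = process_one(video_id)
                break  # success, move to the next video
            except Exception as exc:
                if attempt == max_attempts:
                    # Out of retries: record the error and continue the batch.
                    failures[video_id] = str(exc)
    return results, failures
```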
### Performance Requirements ✅
- [ ] Handle playlists up to 100 videos
- [ ] Process 3 agent perspectives + synthesis in under 60 seconds per video
- [ ] Support concurrent playlist processing (2-3 simultaneous jobs)
- [ ] Memory usage remains stable during large playlist processing
- [ ] WebSocket progress updates every 10 seconds during processing
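
These limits imply a worst-case runtime worth keeping in mind: a 100-video playlist processed sequentially at the 60-second per-video budget takes 100 × 60 s = 6000 s, or 100 minutes. The sketch below turns that arithmetic into a value for the `estimated_completion_time` field. The function name and default are assumptions for illustration.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def estimate_completion(
    video_count: int,
    seconds_per_video: float = 60.0,  # per-video budget from the requirement above
    now: Optional[datetime] = None,
) -> datetime:
    """Worst-case finish time for sequential processing of a playlist."""
    start = now or datetime.now(timezone.utc)
    return start + timedelta(seconds=video_count * seconds_per_video)
```

A real estimate would probably be refined from observed per-video times once a job is running.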
## Implementation Notes
### AI Ecosystem Integration
- Use existing BaseAgent class from `/src/agents/ecosystem/core/base_agent.py`
- Leverage AgentOrchestrator from `/src/agents/ecosystem/orchestration/orchestrator.py`
- Follow LangGraph patterns from `/tests/framework-comparison/test_langgraph_chromadb.py`
- Implement DeepSeek integration for all agents (avoid Anthropic per user requirements)
### Agent Perspective Design
- **Technical Agent**: Focus on implementation details, tools, technical concepts, architecture
- **Business Agent**: Analyze ROI, market implications, business value, competitive analysis
- **UX Agent**: Evaluate user journey, accessibility, usability, user experience patterns
- **Synthesis Agent**: Combine insights, resolve conflicts, create unified narrative
### Cross-Video Analysis Patterns
- Theme extraction using semantic similarity across video summaries
- Content progression tracking through timestamp and topic analysis
- Series detection through title patterns and content similarity
- Channel analysis through upload patterns and topic evolution
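
To make the theme-extraction idea concrete, the sketch below uses a deliberately simpler stand-in for semantic similarity: it counts in how many video summaries each keyword appears and keeps the ones shared by at least `min_videos` summaries. The stopword list, `common_themes`, and the keyword approach are assumptions. An embedding-based comparison (for example via the ChromaDB patterns referenced earlier) would replace the word-overlap step.

```python
import re
from collections import Counter
from typing import Dict, List

# Tiny illustrative stopword list; a real one would be much larger.
STOPWORDS = {"the", "a", "an", "and", "of", "to", "in", "for", "with", "is", "on"}


def common_themes(summaries: Dict[str, str], min_videos: int = 2) -> List[str]:
    """Return keywords that occur in at least `min_videos` summaries."""
    doc_freq: Counter = Counter()
    for text in summaries.values():
        # Use a set so each word counts once per video.
        words = set(re.findall(r"[a-z]+", text.lower())) - STOPWORDS
        doc_freq.update(words)
    themes = [w for w, n in doc_freq.items() if n >= min_videos]
    # Most widely shared first; ties broken alphabetically.
    return sorted(themes, key=lambda w: (-doc_freq[w], w))
```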
### Performance Considerations
- Parallel agent processing where possible (Technical, Business, UX can run concurrently)
- Intelligent caching of agent analyses for repeated playlist processing
- Batch database operations for playlist video processing
- WebSocket connection management for long-running playlist jobs
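
Running the three perspective agents concurrently could be done with `asyncio.gather`, as in the sketch below. Each agent is represented by a plain async callable (a hypothetical stand-in for the real agent classes), and a failing agent is dropped from the result rather than failing the whole video.

```python
import asyncio
from typing import Awaitable, Callable, Dict

# An agent here is any async function mapping a transcript to a summary (assumption).
AgentFn = Callable[[str], Awaitable[str]]


async def run_perspectives(transcript: str, agents: Dict[str, AgentFn]) -> Dict[str, str]:
    """Run all agents concurrently and return the summaries that succeeded."""
    names = list(agents)
    results = await asyncio.gather(
        *(agents[name](transcript) for name in names),
        return_exceptions=True,  # collect errors instead of raising the first one
    )
    return {
        name: result
        for name, result in zip(names, results)
        if not isinstance(result, BaseException)
    }
```

The synthesis agent would then run after this step, since it depends on the perspective outputs.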
## Risk Mitigation
### High Risk: Agent Analysis Quality
- **Risk**: Agents provide similar or redundant perspectives
- **Mitigation**: Distinct prompt engineering with clear focus areas, quality scoring, and validation
### Medium Risk: Processing Time
- **Risk**: Multi-agent analysis is too slow for an acceptable user experience
- **Mitigation**: Parallel agent processing, progress indicators, background processing
### Medium Risk: Large Playlist Performance
- **Risk**: Memory/resource exhaustion with 50+ video playlists
- **Mitigation**: Batch processing limits, resource monitoring, job queuing system
---
**Story Owner**: Development Team
**Architecture Reference**: BMad Method Epic-Story Structure
**Implementation Status**: Ready for Development
**Last Updated**: 2025-08-27