YouTube Summarizer Python SDK

Official Python client library for the YouTube Summarizer Developer Platform. Extract transcripts, generate summaries, and integrate AI-powered video analysis into your applications.

Features

Async/Await Support - Built for modern Python applications
Dual Transcript Sources - YouTube captions, Whisper AI, or both
Real-time Updates - WebSocket support for progress tracking
Batch Processing - Process multiple videos simultaneously
Quality Analysis - Transcript quality scoring and comparison
MCP Integration - Model Context Protocol support for AI development
Cost Estimation - Processing time and cost predictions
Export Options - JSON, CSV, Markdown, and PDF formats

Installation

pip install youtube-summarizer-sdk

Optional Dependencies

# For MCP (Model Context Protocol) support
pip install "youtube-summarizer-sdk[mcp]"

# For development
pip install "youtube-summarizer-sdk[dev]"

# Install all extras (quotes keep shells such as zsh from expanding the brackets)
pip install "youtube-summarizer-sdk[all]"

Quick Start

import asyncio
from youtube_summarizer_sdk import create_client, TranscriptRequest

async def main():
    # Initialize client with your API key
    client = create_client(api_key="ys_pro_your_api_key_here")
    
    async with client:
        # Extract transcript from YouTube video
        request = TranscriptRequest(
            video_url="https://youtube.com/watch?v=dQw4w9WgXcQ",
            transcript_source="youtube",
            include_quality_analysis=True
        )
        
        # Submit the job and wait until it completes
        result = await client.extract_and_wait(request)
        
        print(f"Transcript: {result.transcript[:200]}...")
        print(f"Quality Score: {result.quality_score}")
        print(f"Processing Time: {result.processing_time_seconds}s")

asyncio.run(main())
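
The Quick Start passes the API key as a string literal. A minimal sketch of reading it from the environment instead, so the key stays out of source control. The variable name YOUTUBE_SUMMARIZER_API_KEY and the helper load_api_key are assumptions made for this example, not part of the SDK.

```python
import os


def load_api_key(env_var: str = "YOUTUBE_SUMMARIZER_API_KEY") -> str:
    """Return the API key stored in `env_var`, or raise if it is unset or blank.

    The default variable name is a hypothetical choice for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key
```

The result can then be handed to the client factory shown above, for example create_client(api_key=load_api_key()).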

Core Features

Transcript Extraction

from youtube_summarizer_sdk import TranscriptRequest, TranscriptSource

# YouTube captions
request = TranscriptRequest(
    video_url="https://youtube.com/watch?v=VIDEO_ID",
    transcript_source=TranscriptSource.YOUTUBE
)

# Whisper AI transcription  
request = TranscriptRequest(
    video_url="https://youtube.com/watch?v=VIDEO_ID",
    transcript_source=TranscriptSource.WHISPER,
    whisper_model_size="small"  # tiny, base, small, medium, large
)

# Both sources with comparison
request = TranscriptRequest(
    video_url="https://youtube.com/watch?v=VIDEO_ID", 
    transcript_source=TranscriptSource.BOTH,
    include_quality_analysis=True
)

# Submit and wait for result (run inside an async function, with a client created as in Quick Start)
result = await client.extract_and_wait(request, timeout=300)
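
Every request above carries a video_url. The README does not say how the service treats malformed URLs, so it may be worth checking them locally before submitting a job. The sketch below uses only the standard library; extract_video_id is a hypothetical helper, and it covers just the watch and youtu.be URL shapes.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a YouTube watch or youtu.be URL, else None."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/<id>
        return parsed.path.lstrip("/") or None
    if host in ("youtube.com", "m.youtube.com") and parsed.path == "/watch":
        # Watch links carry the ID in the "v" query parameter
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```

A caller could skip or report any URL for which this returns None rather than spending a request on it.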

Batch Processing

from youtube_summarizer_sdk import BatchProcessingRequest

# Process multiple videos
batch_request = BatchProcessingRequest(
    video_urls=[
        "https://youtube.com/watch?v=VIDEO1",
        "https://youtube.com/watch?v=VIDEO2",
        "https://youtube.com/watch?v=VIDEO3"
    ],
    batch_name="My Video Collection",
    transcript_source="youtube",
    parallel_processing=True,
    max_concurrent_jobs=3
)

batch_job = await client.batch_process(batch_request)
print(f"Batch ID: {batch_job.batch_id}")
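
The max_concurrent_jobs setting caps how many videos are processed at once. How the platform enforces that cap is not documented here. As a client-side analogy only, the same idea can be sketched with asyncio.Semaphore; run_limited and worker are hypothetical names.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_limited(
    urls: List[str],
    worker: Callable[[str], Awaitable[str]],
    max_concurrent_jobs: int = 3,
) -> List[str]:
    """Run `worker` on every URL with at most `max_concurrent_jobs` in flight."""
    semaphore = asyncio.Semaphore(max_concurrent_jobs)

    async def guarded(url: str) -> str:
        # Each task waits here until one of the slots is free
        async with semaphore:
            return await worker(url)

    # gather returns results in the same order as the input URLs
    return await asyncio.gather(*(guarded(u) for u in urls))
```

This pattern can be useful when submitting many single-video jobs instead of one batch, since it keeps the number of open requests bounded.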

Real-time Progress Tracking

# Connect WebSocket for real-time updates
await client.connect_websocket()

# Submit job
job = await client.extract_transcript(request)

# Listen for updates
async for update in client.listen_for_updates():
    if update.data.get("job_id") == job.job_id:
        print(f"Progress: {update.data.get('progress', 0)}%")
        
        if update.event == "job.completed":
            result = await client.get_job_result(job.job_id)
            break
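
Where a WebSocket connection is not available, polling is the usual fallback, and the SDK's wait_for_job already covers that case. To illustrate the idea only, here is a generic polling loop. The terminal statuses come from the JobStatus values listed in the API Reference; poll_until_done and fetch_status are hypothetical names, and this is not the SDK's implementation.

```python
import asyncio
import time
from typing import Awaitable, Callable

# Statuses after which a job will not change again (from the JobStatus enum)
TERMINAL_STATUSES = {"completed", "failed", "cancelled"}


async def poll_until_done(
    fetch_status: Callable[[], Awaitable[str]],
    interval: float = 2.0,
    timeout: float = 300.0,
) -> str:
    """Call `fetch_status` every `interval` seconds until a terminal status or timeout."""
    deadline = time.monotonic() + timeout
    while True:
        status = await fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        # Give up if the next check would land past the deadline
        if time.monotonic() + interval > deadline:
            raise TimeoutError(f"job still {status!r} after {timeout}s")
        await asyncio.sleep(interval)
```

With the SDK, fetch_status would presumably wrap client.get_job_status(job.job_id) and return the status from the JobResponse; the exact field name is not shown in this README.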

Processing Estimates

# Get time and cost estimate
estimate = await client.get_processing_estimate(
    video_url="https://youtube.com/watch?v=VIDEO_ID",
    transcript_source="whisper"
)

print(f"Estimated time: {estimate.estimated_time_seconds}s")
print(f"Estimated cost: ${estimate.estimated_cost:.4f}")

Data Export

# Export data in various formats
export_data = await client.export_data(
    format="json",  # json, csv, markdown, pdf
    date_from="2024-01-01",
    date_to="2024-12-31"
)

print(export_data)
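
Printing a large export is rarely useful. Since the API Reference lists export_data as returning a dictionary, a short sketch of writing it to disk follows. save_export is a hypothetical helper, and default=str is a defensive assumption for values such as dates that json cannot serialize directly.

```python
import json
from pathlib import Path
from typing import Any, Dict


def save_export(export_data: Dict[str, Any], path: str) -> Path:
    """Write an export dictionary to `path` as indented UTF-8 JSON."""
    target = Path(path)
    # Create missing parent folders so the caller can pass a fresh path
    target.parent.mkdir(parents=True, exist_ok=True)
    with target.open("w", encoding="utf-8") as handle:
        json.dump(export_data, handle, indent=2, ensure_ascii=False, default=str)
    return target
```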

MCP (Model Context Protocol) Integration

The SDK includes MCP support for AI development environments like Claude Code:

from youtube_summarizer_sdk import MCPToolRequest, create_mcp_interface

# Create MCP interface
mcp = create_mcp_interface(api_key="your_api_key")

# List available tools
tools = await mcp.list_tools()

# Execute MCP tool
request = MCPToolRequest(
    name="extract_transcript",
    arguments={
        "video_url": "https://youtube.com/watch?v=VIDEO_ID",
        "transcript_source": "youtube",
        "wait_for_completion": True
    }
)

result = await mcp.call_tool(request)

Configuration

Client Configuration

from youtube_summarizer_sdk import SDKConfig, YouTubeSummarizerClient

config = SDKConfig(
    api_key="your_api_key",
    base_url="https://api.youtube-summarizer.com",
    timeout=60.0,
    max_retries=3,
    retry_delay=1.0,
    verify_ssl=True
)

client = YouTubeSummarizerClient(config)
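
The README does not describe how max_retries and retry_delay are applied internally. One common reading is "retry failed calls up to max_retries times, waiting longer each time". The sketch below shows that pattern with exponential backoff; with_retries is a hypothetical helper, the backoff schedule is an assumption, and the SDK may behave differently.

```python
import asyncio
from typing import Awaitable, Callable, TypeVar

T = TypeVar("T")


async def with_retries(
    operation: Callable[[], Awaitable[T]],
    max_retries: int = 3,
    retry_delay: float = 1.0,
) -> T:
    """Try `operation` once, then up to `max_retries` more times on ConnectionError."""
    if max_retries < 0:
        raise ValueError("max_retries must be >= 0")
    for attempt in range(max_retries + 1):
        try:
            return await operation()
        except ConnectionError:
            if attempt == max_retries:
                raise  # out of attempts: surface the last error
            # Assumed schedule: retry_delay, then 2x, 4x, ...
            await asyncio.sleep(retry_delay * (2 ** attempt))
```

Only ConnectionError is retried here because retrying validation or authentication failures would not help; which errors the SDK itself retries is not stated.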

WebSocket Configuration

from youtube_summarizer_sdk import WebSocketConfig

ws_config = WebSocketConfig(
    url="wss://api.youtube-summarizer.com/ws", 
    auto_reconnect=True,
    max_reconnect_attempts=5,
    heartbeat_interval=30.0
)

await client.connect_websocket(ws_config)

API Reference

Models

TranscriptRequest - Video transcript extraction request
BatchProcessingRequest - Batch video processing request
JobResponse - Job creation and status response
TranscriptResult - Single transcript extraction result
DualTranscriptResult - Dual transcript comparison result
APIUsageStats - Usage statistics and limits
ProcessingTimeEstimate - Time and cost estimates

Enums

TranscriptSource - youtube, whisper, both
WhisperModelSize - tiny, base, small, medium, large
ProcessingPriority - low, normal, high, urgent
JobStatus - queued, processing, completed, failed, cancelled
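
The examples in this README pass transcript_source both as an enum member (TranscriptSource.YOUTUBE) and as a plain string ("youtube"). That would work naturally if the enums mix in str. The class below is a stand-in written for this illustration, using the values listed above; the SDK's actual definition may differ.

```python
from enum import Enum


class TranscriptSource(str, Enum):
    """Stand-in for the SDK enum; values taken from the list above."""

    YOUTUBE = "youtube"
    WHISPER = "whisper"
    BOTH = "both"


def parse_source(value: str) -> TranscriptSource:
    """Convert user input to an enum member, raising ValueError on unknown values."""
    return TranscriptSource(value.strip().lower())
```

Because the members are also strings, TranscriptSource.YOUTUBE == "youtube" holds, and parse_source rejects typos such as "youtub" before any request is built.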

Main Client Methods

# Core API methods
await client.extract_transcript(request: TranscriptRequest) -> JobResponse
await client.batch_process(request: BatchProcessingRequest) -> BatchJobResponse
await client.get_job_status(job_id: str) -> JobResponse
await client.get_job_result(job_id: str) -> Union[TranscriptResult, DualTranscriptResult]
await client.cancel_job(job_id: str) -> Dict[str, Any]

# Utility methods
await client.get_processing_estimate(video_url: str) -> ProcessingTimeEstimate
await client.get_usage_stats() -> APIUsageStats
await client.search_summaries(query: str) -> Dict[str, Any]
await client.export_data(format: str = 'json') -> Dict[str, Any]

# Convenience methods
await client.extract_and_wait(request: TranscriptRequest, timeout: float = 300) -> Union[TranscriptResult, DualTranscriptResult]
await client.wait_for_job(job_id: str, timeout: float = 300) -> Union[TranscriptResult, DualTranscriptResult]

# WebSocket methods
await client.connect_websocket(config: Optional[WebSocketConfig] = None) -> bool
async for update in client.listen_for_updates(): # -> AsyncGenerator[WebhookPayload, None]
await client.disconnect_websocket()

Error Handling

from youtube_summarizer_sdk import (
    YouTubeSummarizerError, AuthenticationError, RateLimitError,
    ValidationError, APIError, JobTimeoutError
)

try:
    # extract_and_wait waits for the job to finish, so JobTimeoutError can be raised here
    result = await client.extract_and_wait(request, timeout=300)
except AuthenticationError:
    print("Invalid API key")
except RateLimitError as e:
    print(f"Rate limited. Remaining: {e.remaining}, Reset: {e.reset_time}")
except ValidationError as e:
    print(f"Validation failed: {e.validation_errors}")
except JobTimeoutError as e:
    print(f"Job {e.job_id} timed out after {e.timeout_seconds}s")
except YouTubeSummarizerError as e:
    print(f"SDK error: {e.message}")

Examples

Basic Usage

import asyncio
from youtube_summarizer_sdk import create_client, TranscriptRequest

async def extract_transcript():
    client = create_client(api_key="your_api_key")
    
    async with client:
        request = TranscriptRequest(
            video_url="https://youtube.com/watch?v=dQw4w9WgXcQ"
        )
        
        result = await client.extract_and_wait(request)
        print(f"Transcript: {result.transcript}")
        return result

asyncio.run(extract_transcript())

Dual Transcript Comparison

import asyncio
from youtube_summarizer_sdk import create_client, TranscriptRequest

async def compare_transcripts():
    client = create_client(api_key="your_api_key")
    
    async with client:
        request = TranscriptRequest(
            video_url="https://youtube.com/watch?v=VIDEO_ID",
            transcript_source="both",  # Extract both YouTube and Whisper
            include_quality_analysis=True
        )
        
        result = await client.extract_and_wait(request)
        
        if hasattr(result, 'quality_comparison'):
            comparison = result.quality_comparison
            print(f"Similarity Score: {comparison.similarity_score}")
            print(f"Recommended Source: {comparison.recommendation}")
            print(f"YouTube Transcript: {result.youtube_transcript[:200]}...")
            print(f"Whisper Transcript: {result.whisper_transcript[:200]}...")

asyncio.run(compare_transcripts())

Batch Processing with Progress

import asyncio
from youtube_summarizer_sdk import create_client, BatchProcessingRequest

async def batch_process_with_progress():
    client = create_client(api_key="your_api_key")
    
    async with client:
        # Connect WebSocket for real-time updates
        await client.connect_websocket()
        
        # Submit batch job
        batch_request = BatchProcessingRequest(
            video_urls=[
                "https://youtube.com/watch?v=VIDEO1",
                "https://youtube.com/watch?v=VIDEO2"
            ],
            batch_name="Tutorial Series",
            parallel_processing=True
        )
        
        batch_job = await client.batch_process(batch_request)
        
        # Listen for progress updates
        async for update in client.listen_for_updates():
            if update.event == "batch.completed":
                print("Batch processing completed!")
                break
            elif update.event == "job.progress":
                print(f"Progress: {update.data}")

asyncio.run(batch_process_with_progress())

Contributing

1. Clone the repository
2. Install development dependencies: pip install -e ".[dev]"
3. Run tests: pytest
4. Format code: black youtube_summarizer_sdk/
5. Type check: mypy youtube_summarizer_sdk/

API Tiers & Rate Limits

Tier	Requests/Minute	Requests/Day	Requests/Month
Free	10	1,000	10,000
Pro	100	25,000	500,000
Enterprise	1,000	100,000	2,000,000
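
The three limits interact: sending at the full per-minute rate exhausts the daily allowance well before the day ends. A small worked calculation using the figures from the table follows. The helper names are hypothetical, and the arithmetic assumes evenly spaced requests.

```python
# Limits copied from the table above
TIER_LIMITS = {
    "free": {"per_minute": 10, "per_day": 1_000, "per_month": 10_000},
    "pro": {"per_minute": 100, "per_day": 25_000, "per_month": 500_000},
    "enterprise": {"per_minute": 1_000, "per_day": 100_000, "per_month": 2_000_000},
}


def min_interval_seconds(tier: str) -> float:
    """Spacing between requests that keeps a steady stream at the per-minute cap."""
    return 60.0 / TIER_LIMITS[tier]["per_minute"]


def minutes_to_daily_cap(tier: str) -> float:
    """Minutes of traffic at the per-minute maximum before the daily cap is reached."""
    limits = TIER_LIMITS[tier]
    return limits["per_day"] / limits["per_minute"]
```

On the Free tier this gives one request every 6 seconds and 100 minutes of sustained traffic before the daily limit is reached; on Pro the daily limit is reached after 250 minutes.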

Support

Documentation: https://docs.youtube-summarizer.com/python-sdk
API Reference: https://api.youtube-summarizer.com/docs
Issues: https://github.com/youtube-summarizer/python-sdk/issues
Email: support@youtube-summarizer.com

License

This SDK is licensed under the MIT License. See the LICENSE file for details.