# Story 4.4: Custom AI Models & Enhanced Markdown Export
## Story Overview
**Story ID**: 4.4

**Epic**: 4 - Advanced Intelligence & Developer Platform

**Title**: Custom AI Models & Enhanced Markdown Export

**Status**: 📋 READY FOR IMPLEMENTATION

**Priority**: High

**Goal**: Implement custom AI model configurations with enhanced markdown export featuring executive summaries, timestamped sections with clickable navigation, and professional formatting.

**Value Proposition**: Provide professional export format with executive-level insights and improved navigation, while offering custom prompt templates and model parameter configurations for domain-specific summarization needs.

**Dependencies**:

- ✅ Story 4.2 (API Endpoints & Developer SDK) - Complete
- ✅ Existing export system foundation
- ✅ AI service infrastructure

**Estimated Effort**: 32 hours (enhanced with export features)

## Technical Requirements
### Core Features
#### 1. Executive Summary Generation
- **2-3 Paragraph Overview**: Concise executive summary at top of all exports
- **Key Metrics**: Video duration, word count, main topics, sentiment analysis
- **Decision-Maker Focus**: Business value, ROI implications, action items
- **Executive Language**: Professional tone suitable for leadership consumption
#### 2. Timestamped Sections Enhancement
- **Format**: `[HH:MM:SS] Section Title` with clickable navigation
- **Semantic Segmentation**: Intelligent topic-based section detection
- **Jump-to-Video**: Links that open YouTube video at specific timestamps
- **Section Summaries**: Brief summary for each timestamped section
- **Progress Indicators**: Visual progress through video content
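
The timestamp format and jump-to-video links described above could be sketched as follows. This is a minimal illustration, not the project's implementation: the function names and the `abc123` video ID are hypothetical, and the link relies on the `t=<seconds>s` query parameter of YouTube watch URLs. The brackets around the timestamp are backslash-escaped so they stay visible inside the Markdown link text.

```python
def format_hms(seconds: int) -> str:
    """Render a non-negative offset in seconds as HH:MM:SS."""
    if seconds < 0:
        raise ValueError("timestamp must be non-negative")
    hours, remainder = divmod(seconds, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def youtube_link(video_id: str, seconds: int) -> str:
    """Build a watch URL that starts playback at the given offset."""
    return f"https://www.youtube.com/watch?v={video_id}&t={seconds}s"


def section_heading(video_id: str, seconds: int, title: str) -> str:
    """Markdown heading whose [HH:MM:SS] label links to the video position."""
    stamp = format_hms(seconds)
    url = youtube_link(video_id, seconds)
    # Escaped brackets keep the literal "[...]" in the rendered link text.
    return f"### [\\[{stamp}\\]]({url}) {title}"


if __name__ == "__main__":
    print(section_heading("abc123", 90, "Introduction"))
```
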
#### 3. Enhanced Markdown Structure
- **Table of Contents**: Auto-generated with timestamp links
- **Hierarchical Sections**: Nested structure following video content flow
- **Improved Formatting**: Professional typography with consistent styling
- **Metadata Header**: Video info, analysis date, processing details
- **Footer**: Analysis metadata and quality indicators
#### 4. Custom Prompt Template Management
- **Template Library**: Predefined templates for different use cases
- **Template Editor**: Web interface for creating custom prompt templates
- **Parameter Configuration**: Temperature, token limits, model selection
- **Template Versioning**: Track changes and performance metrics
- **Sharing System**: Public/private template sharing
#### 5. Domain-Specific Presets
- **Educational**: Focus on learning objectives, key concepts, exercises
- **Business**: Emphasize ROI, market implications, strategic insights
- **Technical**: Highlight implementation details, tools, architecture
- **Content Creation**: Analyze engagement patterns, audience insights
- **Research**: Academic focus with citations and methodology
#### 6. A/B Testing Framework
- **Prompt Optimization**: Test different prompt variations
- **Model Comparison**: Compare outputs across different AI models
- **Quality Metrics**: Automated quality scoring and user feedback
- **Performance Analytics**: Processing time, cost, accuracy tracking
- **Statistical Analysis**: Confidence intervals and significance testing
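
As one possible reading of the significance-testing requirement, the sketch below compares two prompt variants with a two-proportion z-test using only the standard library. It assumes the success metric is binary (for example, a summary receiving a positive user rating); that framing and the function name are assumptions for illustration, and continuous metrics such as quality scores would need a different test.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(success_a: int, total_a: int,
                          success_b: int, total_b: int):
    """Return (z, two-sided p-value) for the difference in success rates.

    A = baseline template, B = variant template.
    """
    if total_a <= 0 or total_b <= 0:
        raise ValueError("both groups need at least one observation")
    rate_a = success_a / total_a
    rate_b = success_b / total_b
    # Pooled rate under the null hypothesis that both variants perform equally.
    pooled = (success_a + success_b) / (total_a + total_b)
    std_err = sqrt(pooled * (1 - pooled) * (1 / total_a + 1 / total_b))
    if std_err == 0:
        raise ValueError("no variance in outcomes; test is undefined")
    z = (rate_b - rate_a) / std_err
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


if __name__ == "__main__":
    z, p = two_proportion_z_test(40, 100, 60, 100)
    print(f"z={z:.3f}, p={p:.4f}")
```

The resulting p-value is the kind of figure that could be stored in the `statistical_significance` column of `prompt_experiments`.
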
### Technical Architecture
#### Enhanced Export Pipeline
```python
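from typing import Optional

# ExecutiveSummaryGenerator, TimestampProcessor, MarkdownFormatter, and
# TemplateManager are the services built in the tasks below; ExportConfig and
# EnhancedMarkdownExport are defined under "Data Models".


class EnhancedExportService:
    """Coordinates the stages of the enhanced export pipeline."""

    def __init__(self):
        self.executive_generator = ExecutiveSummaryGenerator()
        self.timestamp_processor = TimestampProcessor()
        self.markdown_formatter = MarkdownFormatter()
        self.template_manager = TemplateManager()

    async def generate_enhanced_export(
        self,
        summary_id: str,
        template_id: Optional[str] = None,
        export_config: Optional[ExportConfig] = None,
    ) -> EnhancedMarkdownExport:
        # 1. Generate executive summary
        # 2. Process timestamps and create navigation
        # 3. Apply custom formatting and template
        # 4. Return structured export with metadata
        raise NotImplementedError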
```
#### Database Schema Extensions
```sql
-- Custom prompt templates
CREATE TABLE prompt_templates (
    id UUID PRIMARY KEY,
    user_id UUID REFERENCES users(id),
    name VARCHAR(200),
    description TEXT,
    prompt_text TEXT,
    domain_category VARCHAR(50), -- 'educational', 'business', 'technical', etc.
    model_config JSONB, -- temperature, max_tokens, etc.
    is_public BOOLEAN DEFAULT FALSE,
    usage_count INTEGER DEFAULT 0,
    rating FLOAT DEFAULT 0.0,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
    updated_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- A/B testing experiments
CREATE TABLE prompt_experiments (
    id UUID PRIMARY KEY,
    name VARCHAR(200),
    description TEXT,
    baseline_template_id UUID REFERENCES prompt_templates(id),
    variant_template_id UUID REFERENCES prompt_templates(id),
    status VARCHAR(20) DEFAULT 'active', -- 'active', 'completed', 'paused'
    success_metric VARCHAR(50), -- 'quality_score', 'user_rating', 'processing_time'
    statistical_significance FLOAT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Enhanced export metadata
CREATE TABLE export_metadata (
    id UUID PRIMARY KEY,
    summary_id UUID REFERENCES summaries(id),
    template_id UUID REFERENCES prompt_templates(id),
    export_type VARCHAR(20), -- 'markdown', 'pdf', 'json'
    executive_summary TEXT,
    section_count INTEGER,
    timestamp_count INTEGER,
    processing_time_seconds FLOAT,
    quality_score FLOAT,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

-- Section detection and timestamps
CREATE TABLE summary_sections (
    id UUID PRIMARY KEY,
    summary_id UUID REFERENCES summaries(id),
    section_index INTEGER,
    title VARCHAR(300),
    start_timestamp INTEGER, -- seconds
    end_timestamp INTEGER, -- seconds
    content TEXT,
    summary TEXT,
    key_points JSONB,
    created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);
```
## Implementation Tasks
### Task 4.4.1: Executive Summary Generation (8 hours)
#### Subtasks:
1. **Executive Summary Generator** (4 hours)
   - Create ExecutiveSummaryGenerator service class
   - Implement professional executive summary prompts
   - Extract key metrics (duration, topics, sentiment)
   - Generate business value and action items
   - Test summary quality and executive readability
2. **Metadata Collection Service** (2 hours)
   - Extract video metadata (duration, views, publish date)
   - Calculate transcript statistics (word count, reading time)
   - Perform sentiment analysis on content
   - Generate quality and confidence scores
   - Create metadata header for exports
3. **Executive Template System** (2 hours)
   - Create executive summary template variations
   - Implement length controls (brief, standard, detailed)
   - Add business context detection and emphasis
   - Create executive language style guide
   - Test with various video content types
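
The transcript statistics mentioned in the Metadata Collection Service subtask could be computed along these lines. This is a rough sketch: the function name is hypothetical, words are split on whitespace only, and the default of 200 words per minute is an assumed reading speed rather than a figure from this story.

```python
import math


def transcript_stats(transcript: str, words_per_minute: int = 200) -> dict:
    """Return word count and estimated reading time for a transcript.

    words_per_minute is an assumed average reading speed; tune as needed.
    """
    if words_per_minute <= 0:
        raise ValueError("words_per_minute must be positive")
    word_count = len(transcript.split())
    # Round up so that any non-empty transcript reports at least one minute.
    reading_minutes = math.ceil(word_count / words_per_minute)
    return {
        "word_count": word_count,
        "reading_time_minutes": reading_minutes,
    }


if __name__ == "__main__":
    print(transcript_stats("Welcome to the channel. Today we cover exports."))
```
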
### Task 4.4.2: Timestamped Sections & Navigation (12 hours)
#### Subtasks:
1. **Semantic Section Detection** (5 hours)
   - Implement topic segmentation algorithm using transcript
   - Identify natural section breaks in video content
   - Extract meaningful section titles from content
   - Handle various video formats (lectures, tutorials, discussions)
   - Validate section quality and coherence
2. **Timestamp Processing Engine** (4 hours)
   - Convert transcript timestamps to `[HH:MM:SS]` format
   - Generate clickable YouTube links with timestamp parameters
   - Create section navigation structure
   - Implement deep-linking to video positions
   - Add timestamp validation and error handling
3. **Table of Contents Generator** (3 hours)
   - Auto-generate markdown table of contents
   - Create hierarchical section structure
   - Add timestamp links for each section
   - Implement progress indicators and visual elements
   - Test navigation functionality across different markdown renderers
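
A table of contents with timestamp links, as outlined in the Table of Contents Generator subtask, might be generated as in the sketch below. The `Section` dataclass, its `level` field, and the `abc123` video ID are hypothetical names used only for illustration; nesting is expressed with two-space indentation per level.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Section:
    title: str
    start_seconds: int
    level: int = 1  # 1 = top-level section, 2 = nested subsection, ...


def build_toc(video_id: str, sections: List[Section]) -> str:
    """Return a nested Markdown list where each entry links to its timestamp."""
    lines = ["## Table of Contents", ""]
    for section in sections:
        hours, remainder = divmod(section.start_seconds, 3600)
        minutes, seconds = divmod(remainder, 60)
        stamp = f"{hours:02d}:{minutes:02d}:{seconds:02d}"
        url = (
            f"https://www.youtube.com/watch?v={video_id}"
            f"&t={section.start_seconds}s"
        )
        indent = "  " * (section.level - 1)
        lines.append(f"{indent}- [{stamp}]({url}) {section.title}")
    return "\n".join(lines)


if __name__ == "__main__":
    demo = [Section("Intro", 0), Section("Setup", 95, level=2)]
    print(build_toc("abc123", demo))
```
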
### Task 4.4.3: Enhanced Markdown Formatting (6 hours)
#### Subtasks:
1. **Professional Markdown Formatter** (3 hours)
   - Create enhanced markdown template system
   - Implement consistent typography and styling
   - Add metadata headers and footers
   - Create professional document structure
   - Test with various markdown parsers and renderers
2. **Section Content Enhancement** (2 hours)
   - Generate brief summaries for each timestamped section
   - Add key points and takeaways per section
   - Implement content hierarchy and flow
   - Create section transitions and connections
   - Validate content quality and readability
3. **Export Quality Control** (1 hour)
   - Implement markdown validation and quality checks
   - Add automated formatting consistency verification
   - Create export preview and validation system
   - Test with different export destinations (GitHub, Notion, etc.)
   - Add quality scoring for generated exports
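
One way to approach the automated consistency checks in the Export Quality Control subtask is a small linter over the generated Markdown. The sketch below is an assumption about what such a check could cover, not a specified rule set: it flags heading levels that skip a level and `[HH:MM:SS]` labels whose minutes or seconds exceed 59, and it ignores lines inside fenced code blocks.

```python
import re
from typing import List

HEADING = re.compile(r"^(#{1,6})\s+\S")
# Matches [HH:MM:SS], also when the closing bracket is backslash-escaped.
TIMESTAMP = re.compile(r"\[(\d{2,}):(\d{2}):(\d{2})\\?\]")


def check_markdown(markdown: str) -> List[str]:
    """Return human-readable problems; an empty list means none were found."""
    problems = []
    previous_level = 0
    in_fence = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        if line.lstrip().startswith("```"):
            in_fence = not in_fence
            continue
        if in_fence:
            continue
        match = HEADING.match(line)
        if match:
            level = len(match.group(1))
            if previous_level and level > previous_level + 1:
                problems.append(
                    f"line {number}: heading jumps from "
                    f"H{previous_level} to H{level}"
                )
            previous_level = level
        for _, minutes, seconds in TIMESTAMP.findall(line):
            if int(minutes) > 59 or int(seconds) > 59:
                problems.append(f"line {number}: invalid timestamp")
    return problems


if __name__ == "__main__":
    print(check_markdown("# Title\n### [00:75:00] Skipped level\n"))
```

The number of problems found could feed into the export quality score, though how that score is weighted is left open here.
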
### Task 4.4.4: Custom Prompt Template System (6 hours)
#### Subtasks:
1. **Template Management Backend** (3 hours)
   - Create TemplateManager service with CRUD operations
   - Implement template versioning and history tracking
   - Add template validation and security checks
   - Create template sharing and permissions system
   - Build template performance analytics
2. **Template Editor Frontend** (2 hours)
   - Create template creation/editing interface
   - Add preview functionality for prompt testing
   - Implement parameter configuration (temperature, tokens)
   - Create template library browser with categories
   - Add template rating and feedback system
3. **Domain-Specific Presets** (1 hour)
   - Create predefined templates for each domain
   - Educational preset: learning objectives, key concepts
   - Business preset: ROI analysis, strategic insights
   - Technical preset: implementation details, architecture
   - Test preset effectiveness across different content types
## Data Models
### Enhanced Export Models
```python
from pydantic import BaseModel
from typing import List, Dict, Optional, Any
from datetime import datetime
from enum import Enum


class ExportFormat(str, Enum):
    MARKDOWN = "markdown"
    PDF = "pdf"
    JSON = "json"
    HTML = "html"


class DomainCategory(str, Enum):
    EDUCATIONAL = "educational"
    BUSINESS = "business"
    TECHNICAL = "technical"
    CONTENT_CREATION = "content_creation"
    RESEARCH = "research"
    GENERAL = "general"


class ExecutiveSummary(BaseModel):
    overview: str
    key_metrics: Dict[str, Any]
    main_topics: List[str]
    business_value: Optional[str] = None
    action_items: List[str]
    sentiment_analysis: Dict[str, float]


class TimestampedSection(BaseModel):
    index: int
    title: str
    start_timestamp: int  # seconds
    end_timestamp: int  # seconds
    youtube_link: str
    content: str
    summary: str
    key_points: List[str]


class PromptTemplate(BaseModel):
    id: str
    name: str
    description: str
    prompt_text: str
    domain_category: DomainCategory
    # Note: Pydantic v2 reserves the attribute name `model_config` for model
    # configuration, so this field needs a different name (with an alias)
    # if the project targets v2.
    model_config: Dict[str, Any]
    is_public: bool
    usage_count: int
    rating: float
    created_at: datetime
    updated_at: datetime


class ExportConfig(BaseModel):
    format: ExportFormat
    include_executive_summary: bool = True
    include_timestamps: bool = True
    include_toc: bool = True
    section_detail_level: str = "standard"  # brief, standard, detailed
    custom_template_id: Optional[str] = None


class EnhancedMarkdownExport(BaseModel):
    summary_id: str
    executive_summary: ExecutiveSummary
    table_of_contents: List[str]
    sections: List[TimestampedSection]
    markdown_content: str
    metadata: Dict[str, Any]
    quality_score: float
    processing_time_seconds: float
    template_used: Optional[PromptTemplate] = None
```
## Testing Strategy
### Unit Tests
- **Executive Summary Tests**: Quality and business value extraction
- **Timestamp Processing Tests**: Format validation and link generation
- **Section Detection Tests**: Topic segmentation accuracy
- **Template System Tests**: CRUD operations and validation
- **Markdown Formatter Tests**: Output quality and consistency
### Integration Tests
- **Export Pipeline Tests**: End-to-end enhanced export generation
- **Template Usage Tests**: Custom template application to summaries
- **API Integration Tests**: All new export and template endpoints
- **Database Tests**: Template storage and retrieval operations
### Quality Assurance Tests
- **Executive Summary Quality**: Business relevance and actionability
- **Navigation Functionality**: Timestamp links and table of contents
- **Template Effectiveness**: Domain-specific preset performance
- **Export Consistency**: Output quality across different content types
## API Specification
### Enhanced Export Endpoints
```yaml
/api/export/{summary_id}/enhanced:
  post:
    summary: Generate enhanced markdown export with executive summary
    parameters:
      - name: summary_id
        in: path
        required: true
        schema:
          type: string
    requestBody:
      required: true
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/ExportConfig'
    responses:
      '200':
        description: Enhanced markdown export
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/EnhancedMarkdownExport'

/api/templates:
  get:
    summary: List available prompt templates
    parameters:
      - name: domain
        in: query
        schema:
          type: string
          enum: [educational, business, technical, content_creation, research]
      - name: public_only
        in: query
        schema:
          type: boolean
          default: false
    responses:
      '200':
        description: List of prompt templates
        content:
          application/json:
            schema:
              type: array
              items:
                $ref: '#/components/schemas/PromptTemplate'
  post:
    summary: Create new prompt template
    requestBody:
      required: true
      content:
        application/json:
          schema:
            $ref: '#/components/schemas/PromptTemplate'
    responses:
      '201':
        description: Created prompt template
        content:
          application/json:
            schema:
              $ref: '#/components/schemas/PromptTemplate'
```
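
A client call to the enhanced export endpoint could be prepared as in the following sketch, which builds the request without sending it. The host `api.example.com` and the function name are hypothetical, and authentication is omitted because this specification excerpt does not define it; the body fields mirror `ExportConfig`.

```python
import json
from urllib.parse import quote
from urllib.request import Request

BASE_URL = "https://api.example.com"  # hypothetical host, replace as needed


def build_enhanced_export_request(summary_id: str, config: dict) -> Request:
    """Prepare a POST to /api/export/{summary_id}/enhanced with a JSON body."""
    url = f"{BASE_URL}/api/export/{quote(summary_id, safe='')}/enhanced"
    body = json.dumps(config).encode("utf-8")
    return Request(
        url,
        data=body,
        method="POST",
        headers={"Content-Type": "application/json"},
    )


if __name__ == "__main__":
    request = build_enhanced_export_request(
        "sum-123",
        {
            "format": "markdown",
            "include_executive_summary": True,
            "include_timestamps": True,
            "include_toc": True,
            "section_detail_level": "standard",
        },
    )
    # Sending would be: urllib.request.urlopen(request)
    print(request.get_method(), request.full_url)
```
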
## Success Criteria
### Functional Requirements ✅
- [ ] Executive summary generation provides 2-3 paragraph professional overview
- [ ] Timestamped sections use `[HH:MM:SS]` format with clickable YouTube links
- [ ] Table of contents auto-generates with hierarchical navigation structure
- [ ] Custom prompt templates support creation, editing, and sharing
- [ ] Domain-specific presets available for 5 different categories
- [ ] Enhanced markdown exports maintain professional formatting and consistency
### Quality Requirements ✅
- [ ] Executive summaries focus on business value and actionable insights
- [ ] Section detection creates logical content segments (accuracy >85%)
- [ ] Timestamp links correctly jump to specific video positions
- [ ] Custom templates produce measurably different output characteristics
- [ ] Export formatting renders correctly across major markdown platforms
- [ ] Template performance tracking shows usage analytics and effectiveness
### Performance Requirements ✅
- [ ] Executive summary generation completes in under 15 seconds
- [ ] Enhanced export processing completes in under 30 seconds for standard videos
- [ ] Template application adds less than 10 seconds to summary generation
- [ ] Section detection handles videos up to 3 hours in length
- [ ] Template editor provides real-time preview with <2 second latency
## Implementation Notes
### Executive Summary Best Practices
- Focus on business value and strategic implications
- Include quantitative metrics where possible (duration, key statistics)
- Provide clear action items and next steps
- Use executive language appropriate for decision-makers
- Keep length between 150 and 300 words for optimal readability
### Timestamp Processing Guidelines
- Detect natural section breaks using topic modeling
- Create meaningful section titles that summarize content
- Ensure timestamp accuracy within 5-second tolerance
- Generate YouTube deep links with proper timestamp parameters
- Handle edge cases (missing timestamps, very short sections)
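
The break-detection and short-section guidelines above can be illustrated with a deliberately simple stand-in. The sketch below does not use topic modeling: it compares the vocabulary of adjacent transcript segments with Jaccard similarity and starts a new section where the overlap drops below a threshold, then folds sections that are too short into their predecessor. The thresholds, function names, and sample captions are all assumptions for illustration.

```python
from typing import List, Tuple

Segment = Tuple[int, str]  # (start time in seconds, caption text)


def _vocab(text: str) -> set:
    """Lowercased words longer than three characters, punctuation stripped."""
    return {w.strip(".,!?;:").lower() for w in text.split() if len(w) > 3}


def detect_breaks(segments: List[Segment], threshold: float = 0.1) -> List[int]:
    """Return start times where vocabulary overlap with the previous segment is low."""
    if not segments:
        return []
    breaks = [segments[0][0]]
    for (_, prev_text), (start, text) in zip(segments, segments[1:]):
        a, b = _vocab(prev_text), _vocab(text)
        union = a | b
        similarity = len(a & b) / len(union) if union else 1.0
        if similarity < threshold:
            breaks.append(start)
    return breaks


def merge_short_sections(breaks: List[int], min_length: int = 30) -> List[int]:
    """Drop breaks that would create a section shorter than min_length seconds."""
    merged: List[int] = []
    for start in breaks:
        if merged and start - merged[-1] < min_length:
            continue  # too close to the previous break: fold into it
        merged.append(start)
    return merged


if __name__ == "__main__":
    demo = [
        (0, "Install Python and create virtual environment"),
        (40, "Activate the virtual environment then install packages"),
        (90, "Deploying containers requires registry credentials"),
        (100, "Registry credentials live inside deployment secrets"),
    ]
    print(merge_short_sections(detect_breaks(demo)))
```

A production version would likely swap the Jaccard comparison for embeddings or a topic model while keeping the same interface.
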
### Template System Design
- Implement template versioning for tracking changes
- Add validation to prevent malicious prompt injection
- Create template effectiveness scoring based on user feedback
- Support template inheritance and composition
- Enable collaborative template development
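
For the validation point above, one structural safeguard is to restrict which placeholders a template may use. The sketch below assumes templates use `str.format`-style placeholders and that `transcript`, `video_title`, and `domain` are the permitted fields; both are assumptions, as is the length limit. It rejects attribute or index access such as `{transcript.__class__}`, but it does not address instructions injected through the transcript content itself.

```python
from string import Formatter
from typing import List

ALLOWED_FIELDS = {"transcript", "video_title", "domain"}  # hypothetical set
MAX_TEMPLATE_CHARS = 4000  # illustrative limit


def validate_template(prompt_text: str) -> List[str]:
    """Return validation errors; an empty list means the template is accepted."""
    errors = []
    if len(prompt_text) > MAX_TEMPLATE_CHARS:
        errors.append("template exceeds maximum length")
    try:
        parsed = list(Formatter().parse(prompt_text))
    except ValueError as exc:
        return errors + [f"malformed placeholder syntax: {exc}"]
    fields = []
    for _, name, spec, conversion in parsed:
        if name is None:
            continue  # literal text only, no placeholder in this chunk
        fields.append(name)
        # Plain allowed names only: no attribute access, format spec, or !r.
        if name not in ALLOWED_FIELDS or spec or conversion:
            errors.append(f"unknown or unsafe placeholder: {name!r}")
    if "transcript" not in fields:
        errors.append("template must include the {transcript} placeholder")
    return errors


def render_template(prompt_text: str, **values: str) -> str:
    """Validate, then substitute values into the template."""
    errors = validate_template(prompt_text)
    if errors:
        raise ValueError("; ".join(errors))
    return prompt_text.format(**values)


if __name__ == "__main__":
    print(render_template("Summarize for {domain}: {transcript}",
                          domain="business", transcript="Q3 results..."))
```
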
### Markdown Enhancement Patterns
- Use consistent heading hierarchy (H1 for title, H2 for sections)
- Add metadata blocks at beginning and end of documents
- Implement responsive table of contents for different viewers
- Create professional typography with proper spacing and formatting
- Test compatibility with GitHub, Notion, Obsidian, and other platforms
## Risk Mitigation
### High Risk: Executive Summary Quality
- **Risk**: Generic summaries that don't provide executive value
- **Mitigation**: Business-focused prompts, quality scoring, executive review feedback
### Medium Risk: Section Detection Accuracy
- **Risk**: Poor section breaks or meaningless titles
- **Mitigation**: Topic modeling validation, manual override options, quality thresholds
### Medium Risk: Template Complexity
- **Risk**: Template system too complex for users to adopt
- **Mitigation**: Simple editor interface, comprehensive presets, guided template creation
---

**Story Owner**: Development Team

**Architecture Reference**: BMad Method Epic-Story Structure

**Implementation Status**: Ready for Development

**Last Updated**: 2025-08-27