youtube-summarizer/docs/stories/3.5.real-time-updates.md

# Story 3.5: Real-time Updates

## Story Overview

**As a** user
**I want** live progress updates during processing
**So that** I know the system is working and how long to wait

**Status**: ✅ COMPLETED (2025-08-27)
**Epic**: Epic 3 - Enhanced User Experience
**Dependencies**: Story 3.4 (Batch Processing) ✅ Complete
**Actual Effort**: 6 hours
**Priority**: High

## Implementation Summary

Real-time updates are implemented on top of the existing WebSocket infrastructure, with automatic reconnection, message queuing, time estimation, and job cancellation. Beyond the original requirements, the implementation also adds heartbeat monitoring and offline recovery.

### Key Achievements
- ✅ Enhanced WebSocket manager with recovery and queuing
- ✅ Granular pipeline progress tracking with sub-tasks
- ✅ Real-time progress UI component with multiple views
- ✅ Time estimation based on historical data
- ✅ Job cancellation with immediate termination
- ✅ Connection recovery with message replay
- ✅ Heartbeat monitoring for connection health

## Context

WebSocket infrastructure already exists from the batch processing implementation (Story 3.4). This story focuses on extending real-time updates to single video processing and improving the user experience with detailed progress information.

## Acceptance Criteria ✅

1. **WebSocket Connection Management** ✅
   - ✅ Automatic connection on process start
   - ✅ Graceful reconnection on disconnect with exponential backoff
   - ✅ Connection status indicator in UI
   - ✅ Message queuing for offline recovery (enhanced feature)

2. **Progress Stages Display** ✅
   - ✅ Clear visualization of processing stages:
     - URL Validation (5%)
     - Metadata Extraction (15%)
     - Transcript Retrieval (30%)
     - Content Analysis (40%)
     - Summary Generation (80%)
     - Quality Validation (90%)
     - Complete (100%)
   - ✅ Visual progress bar with stage labels
   - ✅ Current stage highlighted with icons

3. **Percentage Calculation** ✅
   - ✅ Accurate progress based on actual work done
   - ✅ Sub-progress for long operations (e.g., chunk processing)
   - ✅ Smooth progress transitions
   - ✅ Never goes backwards

4. **Time Estimation** ✅
   - ✅ Calculate based on similar video processing times
   - ✅ Update dynamically as processing progresses
   - ✅ Show elapsed time and estimated remaining
   - ✅ Format as MM:SS for both elapsed and remaining

5. **Cancel Operation** ✅
   - ✅ Cancel button available during processing
   - ✅ Immediate response to cancellation
   - ✅ Cleanup of partial results
   - ✅ Clear feedback when cancelled

6. **Connection Recovery** ✅
   - ✅ Auto-reconnect with exponential backoff
   - ✅ Queue missed messages during disconnect
   - ✅ Resume progress display after reconnect
   - ✅ Show connection status to user
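
Criterion 3 (progress based on actual work that never goes backwards) can be sketched as a small tracker. This is an illustrative sketch, not the shipped implementation: `ProgressTracker` and `STAGE_RANGES` are hypothetical names, and the ranges are taken from the stage comments in the pipeline code later in this story.

```python
from dataclasses import dataclass

# Stage boundaries (start %, end %), assumed from the pipeline stage comments.
STAGE_RANGES = {
    "validating_url": (0, 5),
    "extracting_metadata": (5, 15),
    "extracting_transcript": (15, 30),
    "analyzing_content": (30, 40),
    "generating_summary": (40, 80),
    "validating_quality": (80, 90),
    "completed": (100, 100),
}


@dataclass
class ProgressTracker:
    """Hypothetical helper: maps stage-local progress to an overall percentage."""

    last_reported: float = 0.0

    def update(self, stage: str, done: int = 0, total: int = 0) -> float:
        start, end = STAGE_RANGES[stage]
        # Fraction of the current stage that is finished, clamped to [0, 1].
        fraction = (done / total) if total > 0 else 0.0
        fraction = min(max(fraction, 0.0), 1.0)
        candidate = start + (end - start) * fraction
        # Never report a value lower than one already shown to the user.
        self.last_reported = max(self.last_reported, candidate)
        return self.last_reported
```

With this shape, a late or out-of-order update from an earlier stage leaves the displayed percentage unchanged instead of pulling the bar back.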

## Technical Design

### WebSocket Protocol Enhancement

#### Message Types
```typescript
// Client -> Server
interface ClientMessage {
  type: 'subscribe' | 'unsubscribe' | 'cancel' | 'ping';
  job_id?: string;
  timestamp: string;
}

// Server -> Client
interface ServerMessage {
  type: 'progress' | 'stage_change' | 'complete' | 'error' | 'cancelled' | 'pong';
  job_id: string;
  data: ProgressData | StageData | ResultData | ErrorData;
  timestamp: string;
}

interface ProgressData {
  percentage: number;
  stage: ProcessingStage;
  message: string;
  sub_progress?: {
    current: number;
    total: number;
    description: string;
  };
  time_elapsed: number;
  estimated_remaining?: number;
}

interface StageData {
  previous_stage: ProcessingStage;
  current_stage: ProcessingStage;
  stage_progress: number;
  stage_message: string;
}
```
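
On the server side, incoming `ClientMessage` payloads need validating before they are acted on. The sketch below shows one way to do that and to build the `pong` heartbeat reply. `parse_client_message` and `build_pong` are hypothetical names, and the empty `data` object on `pong` is an assumption, since the interfaces above do not define a payload for it.

```python
import json
from datetime import datetime, timezone

VALID_CLIENT_TYPES = {"subscribe", "unsubscribe", "cancel", "ping"}


def parse_client_message(raw: str) -> dict:
    """Parse and validate a ClientMessage; raises ValueError on bad input."""
    try:
        message = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError("message is not valid JSON") from exc
    if not isinstance(message, dict):
        raise ValueError("message must be a JSON object")
    msg_type = message.get("type")
    if not isinstance(msg_type, str) or msg_type not in VALID_CLIENT_TYPES:
        raise ValueError(f"unknown message type: {msg_type!r}")
    # job_id is optional in the interface, but only a ping makes sense without one.
    if msg_type != "ping" and not message.get("job_id"):
        raise ValueError("job_id is required for this message type")
    return message


def build_pong(job_id: str) -> dict:
    """Build the heartbeat reply (empty data payload is an assumption)."""
    return {
        "type": "pong",
        "job_id": job_id,
        "data": {},
        "timestamp": datetime.now(timezone.utc).isoformat(),
    }
```

A receive loop could call `parse_client_message` on each frame, reply with `build_pong` for `ping`, and route `cancel` to the pipeline's cancellation path.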

### Backend Enhancements

#### WebSocket Manager Updates
```python
from datetime import datetime, timezone
from typing import Dict, List, Optional, Set

# Assumes FastAPI/Starlette, whose WebSocket provides accept() and send_json().
from fastapi import WebSocket

# ConnectionInfo, Message, and PipelineStage are project types defined elsewhere.


class WebSocketManager:
    """Enhanced WebSocket manager with connection tracking"""

    def __init__(self):
        self.connections: Dict[str, Set[WebSocket]] = {}
        self.connection_metadata: Dict[str, ConnectionInfo] = {}
        self.message_queue: Dict[str, List[Message]] = {}

    async def connect(self, websocket: WebSocket, job_id: str):
        await websocket.accept()

        # Track connection
        self.connections.setdefault(job_id, set()).add(websocket)

        # Replay messages queued while no client was connected
        for message in self.message_queue.pop(job_id, []):
            await websocket.send_json(message)

        # Send current status
        await self.send_current_status(websocket, job_id)

    async def broadcast_progress(
        self,
        job_id: str,
        stage: PipelineStage,
        percentage: float,
        message: str,
        details: Optional[Dict] = None
    ):
        """Broadcast progress to all connected clients"""

        message_data = {
            "type": "progress",
            "job_id": job_id,
            "data": {
                "percentage": percentage,
                "stage": stage.value,
                "message": message,
                "time_elapsed": self.get_elapsed_time(job_id),
                "estimated_remaining": self.estimate_remaining_time(job_id, percentage)
            },
            # Timezone-aware UTC timestamp (datetime.utcnow() is deprecated)
            "timestamp": datetime.now(timezone.utc).isoformat()
        }

        if details:
            message_data["data"]["sub_progress"] = details

        # Send to connected clients. Iterate over a copy, because connect()
        # may add to the set while we await a send.
        delivered = False
        for connection in list(self.connections.get(job_id, ())):
            try:
                await connection.send_json(message_data)
                delivered = True
            except Exception:
                # Treat any send failure as a dead connection
                self.connections[job_id].discard(connection)

        # Drop the entry once no clients remain, so later messages are queued
        if not self.connections.get(job_id):
            self.connections.pop(job_id, None)

        if not delivered:
            # Queue message for later delivery
            self.message_queue.setdefault(job_id, []).append(message_data)
```
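
`broadcast_progress` calls `estimate_remaining_time`, which is not shown in this story. A minimal sketch of what such an estimate might look like follows; the function names, the linear extrapolation, and the blending weight are all assumptions, and the real implementation may differ. `historical_total` stands for the average total duration of similar past jobs, when one is available.

```python
from typing import Optional


def estimate_remaining_seconds(
    elapsed: float,
    percentage: float,
    historical_total: Optional[float] = None,
) -> Optional[float]:
    """Estimate seconds remaining; returns None when no estimate is possible."""
    if percentage >= 100:
        return 0.0
    projected = None
    if percentage > 0:
        # Linear extrapolation from the work done so far.
        projected = elapsed * (100 - percentage) / percentage
    if historical_total is None:
        return projected
    from_history = max(historical_total - elapsed, 0.0)
    if projected is None:
        return from_history
    # Trust history early and the live extrapolation more as the job advances.
    weight = percentage / 100
    return weight * projected + (1 - weight) * from_history


def format_mm_ss(seconds: float) -> str:
    """Format a duration as MM:SS (minutes are not capped at 59)."""
    total = max(int(round(seconds)), 0)
    return f"{total // 60:02d}:{total % 60:02d}"
```

For example, 30 seconds elapsed at 50% with a historical total of 100 seconds blends the live projection (30 s) and the historical remainder (70 s) equally, giving 50 seconds, which `format_mm_ss` renders as `00:50`.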

#### Pipeline Progress Tracking
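```python
import uuid
from datetime import datetime
from typing import Callable, Optional

# PipelineConfig and PipelineStage are project types defined elsewhere.


class SummaryPipeline: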
    """Enhanced pipeline with granular progress tracking"""

    async def process_video_with_progress(
        self,
        video_url: str,
        config: PipelineConfig,
        progress_callback: Optional[Callable] = None
    ) -> str:
        """Process video with detailed progress updates"""

        job_id = str(uuid.uuid4())
        start_time = datetime.utcnow()

        # Stage 1: URL Validation (0-5%)
        await self._update_progress(job_id, PipelineStage.VALIDATING_URL, 0, "Validating URL...")
        try:
            video_id = await self.video_service.validate_url(video_url)
            await self._update_progress(job_id, PipelineStage.VALIDATING_URL, 5, "URL validated")
        except Exception as e:
            await self._handle_error(job_id, PipelineStage.VALIDATING_URL, e)
            raise

        # Stage 2: Metadata Extraction (5-15%)
        await self._update_progress(job_id, PipelineStage.EXTRACTING_METADATA, 5, "Fetching video information...")
        metadata = await self.video_service.get_metadata(video_id)
        await self._update_progress(job_id, PipelineStage.EXTRACTING_METADATA, 15, f"Video: {metadata.title}")

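        # Stage 3: Transcript Extraction (15-30%)
        await self._update_progress(job_id, PipelineStage.EXTRACTING_TRANSCRIPT, 15, "Retrieving transcript...")
        transcript = await self.transcript_service.extract_transcript(video_id)
        # Report the end of the stage so the bar reaches 30% before analysis starts
        await self._update_progress(job_id, PipelineStage.EXTRACTING_TRANSCRIPT, 30, "Transcript retrieved")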

        # Calculate transcript chunks for sub-progress
        chunks = self._chunk_transcript(transcript)
        total_chunks = len(chunks)

        # Stage 4: Content Analysis (30-40%)
        await self._update_progress(
            job_id,
            PipelineStage.ANALYZING_CONTENT,
            30,
            "Analyzing content structure..."
        )
        analysis = await self._analyze_content(transcript, metadata)
        await self._update_progress(job_id, PipelineStage.ANALYZING_CONTENT, 40, "Content analysis complete")

        # Stage 5: Summary Generation (40-80%)
        await self._update_progress(
            job_id,
            PipelineStage.GENERATING_SUMMARY,
            40,
            f"Generating summary (0/{total_chunks} chunks)..."
        )

        # Process chunks with sub-progress
        summary_parts = []
        for i, chunk in enumerate(chunks, start=1):
            part = await self.ai_service.summarize_chunk(chunk, analysis)
            summary_parts.append(part)

            # Report only after the chunk is done, so that 80% means
            # every chunk has actually been summarized
            sub_progress = {
                "current": i,
                "total": total_chunks,
                "description": f"Processed chunk {i} of {total_chunks}"
            }

            percentage = 40 + (40 * i / total_chunks)
            await self._update_progress(
                job_id,
                PipelineStage.GENERATING_SUMMARY,
                percentage,
                f"Generating summary ({i}/{total_chunks} chunks)...",
                sub_progress
            )

        # Combine summaries
        final_summary = await self.ai_service.combine_summaries(summary_parts)

        # Stage 6: Quality Validation (80-90%)
        await self._update_progress(
            job_id,
            PipelineStage.VALIDATING_QUALITY,
            80,
            "Validating summary quality..."
        )
        quality_score = await self._validate_quality(final_summary, transcript)
        await self._update_progress(job_id, PipelineStage.VALIDATING_QUALITY, 90, f"Quality score: {quality_score:.1%}")

        # Stage 7: Completion (90-100%)
        await self._update_progress(job_id, PipelineStage.COMPLETED, 100, "Processing complete!")

        return job_id
```
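
The pipeline above does not show how cancellation is honored. One common approach is cooperative cancellation, where the pipeline checks a per-job flag at safe points such as between chunks. The sketch below illustrates the idea; `CancellationToken`, `JobCancelled`, and `summarize_chunks` are hypothetical names, and the actual implementation may use a different mechanism. Note that this style responds at the next checkpoint rather than mid-request.

```python
import asyncio


class JobCancelled(Exception):
    """Raised at a checkpoint when the job has been cancelled."""


class CancellationToken:
    """Hypothetical per-job flag checked between units of work."""

    def __init__(self) -> None:
        self._cancelled = False

    def cancel(self) -> None:
        self._cancelled = True

    def raise_if_cancelled(self) -> None:
        if self._cancelled:
            raise JobCancelled()


async def summarize_chunks(chunks, token, summarize):
    """Summarize chunks one by one, checking for cancellation before each."""
    parts = []
    try:
        for chunk in chunks:
            token.raise_if_cancelled()  # safe cancellation point
            parts.append(await summarize(chunk))
    except JobCancelled:
        parts.clear()  # discard partial results before propagating
        raise
    return parts
```

A cancel request (WebSocket message or REST call) would look up the job's token and call `cancel()`; the caller of `summarize_chunks` can then catch `JobCancelled` and broadcast a `cancelled` message.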

### Frontend Components

#### ProcessingProgress Component
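```tsx
import { CheckCircle2, Clock, Loader2, Timer, WifiOff, X } from 'lucide-react';

// Card, CardHeader, CardTitle, CardContent, Badge, Button, Progress, cn, and
// formatDuration come from the project's UI kit and utilities (import paths not shown here).

export function ProcessingProgress({ jobId }: { jobId: string }) {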
  const { progress, isConnected, cancel } = useProcessingProgress(jobId);

  const stages = [
    { key: 'validating_url', label: 'Validating', percentage: 5 },
    { key: 'extracting_metadata', label: 'Metadata', percentage: 15 },
    { key: 'extracting_transcript', label: 'Transcript', percentage: 30 },
    { key: 'analyzing_content', label: 'Analysis', percentage: 40 },
    { key: 'generating_summary', label: 'Summary', percentage: 80 },
    { key: 'validating_quality', label: 'Quality', percentage: 90 },
    { key: 'completed', label: 'Complete', percentage: 100 }
  ];

  return (
    <Card className="w-full">
      <CardHeader>
        <div className="flex justify-between items-center">
          <CardTitle className="flex items-center gap-2">
            <Loader2 className="h-5 w-5 animate-spin" />
            Processing Video
            {!isConnected && (
              <Badge variant="outline" className="ml-2">
                <WifiOff className="h-3 w-3 mr-1" />
                Reconnecting...
              </Badge>
            )}
          </CardTitle>
          <Button
            variant="outline"
            size="sm"
            onClick={cancel}
            disabled={progress?.stage === 'completed'}
          >
            <X className="h-4 w-4 mr-1" />
            Cancel
          </Button>
        </div>
      </CardHeader>

      <CardContent className="space-y-4">
        {/* Stage Progress */}
        <div className="space-y-2">
          <div className="flex justify-between text-sm">
            <span className="font-medium">{progress?.message}</span>
            <span>{Math.round(progress?.percentage || 0)}%</span>
          </div>

          <Progress value={progress?.percentage || 0} className="h-3" />

          {/* Sub-progress for chunks */}
          {progress?.sub_progress && (
            <div className="ml-4 space-y-1">
              <div className="flex justify-between text-xs text-muted-foreground">
                <span>{progress.sub_progress.description}</span>
                <span>{progress.sub_progress.current}/{progress.sub_progress.total}</span>
              </div>
              <Progress
                value={(progress.sub_progress.current / progress.sub_progress.total) * 100}
                className="h-1"
              />
            </div>
          )}
        </div>

        {/* Stage Indicators */}
        <div className="flex justify-between">
          {stages.map((stage, index) => (
            <div
              key={stage.key}
              className={cn(
                "flex flex-col items-center gap-1",
                (progress?.percentage ?? 0) >= stage.percentage
                  ? "text-primary"
                  : "text-muted-foreground"
              )}
            >
              <div className={cn(
                "w-8 h-8 rounded-full border-2 flex items-center justify-center",
                (progress?.percentage ?? 0) >= stage.percentage
                  ? "border-primary bg-primary/10"
                  : "border-muted"
              )}>
                {(progress?.percentage ?? 0) >= stage.percentage ? (
                  <CheckCircle2 className="h-4 w-4" />
                ) : (
                  <span className="text-xs">{index + 1}</span>
                )}
              </div>
              <span className="text-xs">{stage.label}</span>
            </div>
          ))}
        </div>

        {/* Time Estimation */}
        {progress?.estimated_remaining != null && (
          <div className="flex justify-between items-center text-sm text-muted-foreground">
            <div className="flex items-center gap-1">
              <Clock className="h-4 w-4" />
              <span>Time elapsed: {formatDuration(progress.time_elapsed)}</span>
            </div>
            <div className="flex items-center gap-1">
              <Timer className="h-4 w-4" />
              <span>About {formatDuration(progress.estimated_remaining)} remaining</span>
            </div>
          </div>
        )}
      </CardContent>
    </Card>
  );
}
```

#### useProcessingProgress Hook
```typescript
import { useCallback, useEffect, useRef, useState } from 'react';

// apiClient and ProgressData come from the project (import paths not shown here).

const MAX_RECONNECT_ATTEMPTS = 5;

export function useProcessingProgress(jobId: string) {
  const [progress, setProgress] = useState<ProgressData | null>(null);
  const [isConnected, setIsConnected] = useState(false);
  // State rather than a ref, so exhausting the retries re-renders and starts polling
  const [useFallback, setUseFallback] = useState(false);
  const ws = useRef<WebSocket | null>(null);
  const reconnectAttempts = useRef(0);
  const reconnectTimeout = useRef<ReturnType<typeof setTimeout> | null>(null);
  // Cleared on unmount or once the job reaches a terminal state
  const shouldReconnect = useRef(true);

  const connect = useCallback(() => {
    // Development URL; derive it from configuration (and use wss://) in production
    const wsUrl = `ws://localhost:8000/ws/progress/${jobId}`;
    const socket = new WebSocket(wsUrl);
    ws.current = socket;

    socket.onopen = () => {
      console.log('WebSocket connected');
      setIsConnected(true);
      reconnectAttempts.current = 0;

      // Subscribe to job updates
      socket.send(JSON.stringify({
        type: 'subscribe',
        job_id: jobId,
        timestamp: new Date().toISOString()
      }));
    };

    socket.onmessage = (event) => {
      const message = JSON.parse(event.data);

      switch (message.type) {
        case 'progress':
          setProgress(message.data);
          break;
        case 'stage_change':
          // Update UI for stage change
          break;
        case 'complete':
          shouldReconnect.current = false;
          setProgress({
            ...message.data,
            percentage: 100,
            stage: 'completed'
          });
          break;
        case 'cancelled':
          // Terminal state: stop reconnecting
          shouldReconnect.current = false;
          break;
        case 'error':
          // Handle error
          break;
      }
    };

    socket.onclose = () => {
      // Ignore close events from a socket that has already been replaced
      if (ws.current !== socket) return;
      setIsConnected(false);
      if (!shouldReconnect.current) return;

      // Attempt reconnection with exponential backoff
      if (reconnectAttempts.current < MAX_RECONNECT_ATTEMPTS) {
        const delay = Math.min(1000 * Math.pow(2, reconnectAttempts.current), 10000);
        reconnectAttempts.current++;

        reconnectTimeout.current = setTimeout(() => {
          connect();
        }, delay);
      } else {
        // Retries exhausted: switch to polling
        setUseFallback(true);
      }
    };

    socket.onerror = (error) => {
      console.error('WebSocket error:', error);
    };
  }, [jobId]);

  const cancel = useCallback(async () => {
    // Send cancel message
    if (ws.current?.readyState === WebSocket.OPEN) {
      ws.current.send(JSON.stringify({
        type: 'cancel',
        job_id: jobId,
        timestamp: new Date().toISOString()
      }));
    }

    // Also call API endpoint as fallback
    try {
      await apiClient.post(`/api/pipeline/cancel/${jobId}`);
    } catch (error) {
      console.error('Failed to cancel job:', error);
    }
  }, [jobId]);

  useEffect(() => {
    shouldReconnect.current = true;
    reconnectAttempts.current = 0;
    setUseFallback(false);
    connect();

    return () => {
      // Prevent the close handler from scheduling a reconnect after unmount
      shouldReconnect.current = false;
      if (reconnectTimeout.current) {
        clearTimeout(reconnectTimeout.current);
      }
      ws.current?.close();
    };
  }, [connect]);

  // Fallback to polling if WebSocket fails
  useEffect(() => {
    if (!useFallback) return;

    const pollInterval = setInterval(async () => {
      try {
        const status = await apiClient.get(`/api/pipeline/status/${jobId}`);
        setProgress(status.data);
        // Assumes the status payload has the same shape as ProgressData
        if (status.data?.stage === 'completed') {
          clearInterval(pollInterval);
        }
      } catch (error) {
        console.error('Polling failed:', error);
      }
    }, 2000);

    return () => clearInterval(pollInterval);
  }, [useFallback, jobId]);

  return {
    progress,
    isConnected,
    cancel
  };
}
```
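
The reconnect delay in the hook follows `min(1000 * 2^attempt, 10000)` milliseconds for up to five attempts. The short sketch below works through that schedule; it is written in Python to match the backend examples, and the function names are hypothetical.

```python
from itertools import accumulate
from typing import List


def reconnect_delays_ms(
    max_attempts: int = 5, base_ms: int = 1000, cap_ms: int = 10000
) -> List[int]:
    """Delay before each reconnect attempt, doubling up to a cap."""
    return [min(base_ms * 2 ** attempt, cap_ms) for attempt in range(max_attempts)]


def attempt_times_ms(max_attempts: int = 5) -> List[int]:
    """Cumulative time after the first disconnect at which each attempt starts."""
    return list(accumulate(reconnect_delays_ms(max_attempts)))
```

With the hook's parameters the delays are 1 s, 2 s, 4 s, 8 s, and 10 s, so attempts start at 1 s, 3 s, 7 s, 15 s, and 25 s after the first disconnect. Only the first two attempts fall inside the 5-second reconnection target listed under Success Metrics, and the polling fallback starts only after the fifth attempt fails.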

## Implementation Tasks

### Backend Tasks (4-5 hours)
1. **WebSocket Infrastructure Enhancement**
   - [ ] Update WebSocketManager with connection tracking
   - [ ] Implement message queuing for disconnected clients
   - [ ] Add heartbeat/ping-pong mechanism
   - [ ] Create connection recovery logic

2. **Pipeline Progress Integration**
   - [ ] Add granular progress tracking to SummaryPipeline
   - [ ] Implement sub-progress for chunk processing
   - [ ] Create time estimation algorithm
   - [ ] Add cancellation support throughout pipeline

3. **API Endpoints**
   - [ ] Create `/api/pipeline/cancel/{job_id}` endpoint
   - [ ] Update `/api/pipeline/status/{job_id}` with detailed progress
   - [ ] Add WebSocket endpoint `/ws/progress/{job_id}`
   - [ ] Implement progress history tracking

### Frontend Tasks (4-5 hours)
4. **Progress Components**
   - [ ] Create ProcessingProgress component
   - [ ] Build stage indicator visualization
   - [ ] Implement progress bar with sub-progress
   - [ ] Add time estimation display

5. **WebSocket Integration**
   - [ ] Create useProcessingProgress hook
   - [ ] Implement connection management
   - [ ] Add reconnection with backoff
   - [ ] Create fallback to polling

6. **User Interface Updates**
   - [ ] Update SummarizePage with progress display
   - [ ] Add connection status indicator
   - [ ] Implement cancel button functionality
   - [ ] Create smooth transitions between stages

### Testing (2-3 hours)
7. **Unit Tests**
   - [ ] Test WebSocket manager functionality
   - [ ] Test progress calculation accuracy
   - [ ] Test cancellation at various stages
   - [ ] Test reconnection logic

8. **Integration Tests**
   - [ ] Test full processing with progress updates
   - [ ] Test connection recovery scenarios
   - [ ] Test fallback to polling
   - [ ] Test concurrent processing jobs

## Success Metrics

1. **Performance Metrics**
   - WebSocket latency < 100ms
   - Progress updates at least every 2 seconds
   - Reconnection within 5 seconds
   - Zero lost messages during brief disconnects

2. **User Experience Metrics**
   - Clear indication of current stage
   - Accurate time estimates (±20% accuracy)
   - Smooth progress bar movement
   - Immediate response to cancel action

3. **Technical Metrics**
   - 100% of processing stages tracked
   - Graceful degradation to polling
   - No memory leaks in WebSocket connections
   - Clean cancellation without orphaned processes

## Definition of Done

- [ ] All acceptance criteria met
- [ ] WebSocket connection auto-manages lifecycle
- [ ] Progress updates show for all processing stages
- [ ] Time estimation becomes accurate after 2-3 videos
- [ ] Cancel operation works at any stage
- [ ] Connection recovery handles network interruptions
- [ ] Fallback to polling when WebSocket unavailable
- [ ] Unit and integration tests pass
- [ ] Documentation updated
- [ ] No console errors or warnings

## Risk Mitigation

1. **WebSocket Compatibility**: Some corporate firewalls block WebSocket connections
   - Solution: Automatic fallback to polling

2. **Progress Accuracy**: Transcript size varies greatly
   - Solution: Dynamic progress calculation based on actual work

3. **Memory Leaks**: Long-lived WebSocket connections
   - Solution: Proper cleanup and connection limits

4. **Time Estimation**: Insufficient historical data
   - Solution: Use conservative estimates initially

5. **Cancellation Complexity**: Pipeline may be in critical section
   - Solution: Safe cancellation points throughout pipeline
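
For risk 3, one way to keep the offline message queue from growing without bound is to cap it per job. The sketch below is an assumption-laden illustration, not the shipped design: `BoundedMessageQueue` and the default limit of 100 are hypothetical. It drops the oldest messages first, which trades strict replay completeness for bounded memory; that is usually acceptable for progress updates, since each one supersedes the last.

```python
from collections import deque
from typing import Deque, Dict, List


class BoundedMessageQueue:
    """Hypothetical per-job buffer that keeps only the newest messages."""

    def __init__(self, max_per_job: int = 100) -> None:
        self._max = max_per_job
        self._queues: Dict[str, Deque[dict]] = {}

    def push(self, job_id: str, message: dict) -> None:
        # deque(maxlen=...) silently evicts the oldest entry when full.
        queue = self._queues.setdefault(job_id, deque(maxlen=self._max))
        queue.append(message)

    def drain(self, job_id: str) -> List[dict]:
        """Return queued messages in arrival order and forget the job."""
        return list(self._queues.pop(job_id, ()))
```

A manager could call `push` while no client is connected and `drain` on reconnect, which also removes the job's entry so finished jobs do not linger in memory.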

## Notes

- WebSocket infrastructure from Story 3.4 provides a good foundation
- Consider using Server-Sent Events (SSE) as an alternative to WebSocket
- Time estimation could use machine learning in the future
- Progress data could be used for performance analytics
- Consider adding sound/notification when processing completes

---

**Story Status**: Completed (2025-08-27)
**Assigned To**: Developer
**Sprint**: Next
**Story Points**: 5