# Autonomous Operations & Webhook System
The YouTube Summarizer includes an autonomous operations system with webhook support. It provides rule-based automation, real-time event notifications, and self-managing workflows.
## 🚀 Quick Start
### Enable Autonomous Operations
```python
from backend.autonomous.autonomous_controller import start_autonomous_operations
# Start autonomous operations
await start_autonomous_operations()
```
### Register Webhooks
```python
from backend.autonomous.webhook_system import register_webhook, WebhookEvent
# Register webhook for transcription events
await register_webhook(
    webhook_id="my_app_webhook",
    url="https://myapp.com/webhooks/youtube-summarizer",
    events=[WebhookEvent.TRANSCRIPTION_COMPLETED, WebhookEvent.SUMMARIZATION_COMPLETED]
)
```
### API Usage
```bash
# Start autonomous operations via API
curl -X POST "http://localhost:8000/api/autonomous/automation/start"
# Register webhook via API
curl -X POST "http://localhost:8000/api/autonomous/webhooks/my_webhook" \
  -H "Content-Type: application/json" \
  -d '{
    "url": "https://myapp.com/webhook",
    "events": ["transcription.completed", "summarization.completed"]
  }'
```
## 🎯 Features
### Webhook System
- **Real-time Notifications**: Instant webhook delivery for all system events
- **Secure Delivery**: HMAC-SHA256, Bearer Token, and API Key authentication
- **Reliable Delivery**: Automatic retries with exponential backoff
- **Event Filtering**: Advanced filtering conditions for targeted notifications
- **Delivery Tracking**: Comprehensive logging and status monitoring
### Autonomous Controller
- **Intelligent Automation**: Rule-based automation with multiple trigger types
- **Resource Management**: Automatic scaling and performance optimization
- **Scheduled Operations**: Cron-like scheduling for recurring tasks
- **Event-Driven Actions**: Respond automatically to system events
- **Self-Healing**: Automatic error recovery and system optimization
### Monitoring & Analytics
- **Real-time Metrics**: System performance and health monitoring
- **Execution History**: Detailed logs of all autonomous operations
- **Success Tracking**: Comprehensive statistics and success rates
- **Health Checks**: Automatic system health assessment
## 📡 Webhook System
### Supported Events
| Event | Description | Payload |
|-------|-------------|---------|
| `transcription.completed` | Video transcription finished successfully | `{video_id, transcript, quality_score, processing_time}` |
| `transcription.failed` | Video transcription failed | `{video_id, error, retry_count}` |
| `summarization.completed` | Video summarization finished | `{video_id, summary, key_points, processing_time}` |
| `summarization.failed` | Video summarization failed | `{video_id, error, retry_count}` |
| `batch.started` | Batch processing started | `{batch_id, video_count, estimated_time}` |
| `batch.completed` | Batch processing completed | `{batch_id, results, total_time}` |
| `batch.failed` | Batch processing failed | `{batch_id, error, completed_videos}` |
| `video.processed` | Complete video processing finished | `{video_id, transcript, summary, metadata}` |
| `error.occurred` | System error occurred | `{error_type, message, context}` |
| `system.status` | System status change | `{status, component, details}` |
| `user.quota_exceeded` | User quota exceeded | `{user_id, quota_type, current_usage}` |
| `processing.delayed` | Processing delayed due to high load | `{queue_depth, estimated_delay}` |
### Security Methods
#### HMAC SHA256 (Recommended)
```python
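import hashlib
import hmac

def verify_webhook(payload: str, signature: str, secret: str) -> bool:
    """Check a `sha256=<hex digest>` signature against the payload."""
    expected = hmac.new(
        secret.encode('utf-8'),
        payload.encode('utf-8'),
        hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking timing information
    return hmac.compare_digest(signature.encode('utf-8'), f"sha256={expected}".encode('utf-8'))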
```
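For local testing it can help to produce a matching signature yourself. The following is a minimal sketch of the sender side, assuming the signature is the hex HMAC-SHA256 of the raw request body prefixed with `sha256=`, as the verifier above expects. `sign_payload` is a hypothetical helper name, not part of the project's API.

```python
import hashlib
import hmac
import json


def sign_payload(body: bytes, secret: str) -> str:
    """Return a `sha256=<hex digest>` signature for the raw request body.

    Hypothetical helper for illustration; the project's own signing code may differ.
    """
    digest = hmac.new(secret.encode("utf-8"), body, hashlib.sha256).hexdigest()
    return f"sha256={digest}"


# Sign the exact bytes that will be sent; re-serializing JSON can change them.
body = json.dumps({"event": "system.status", "data": {"test": True}}).encode("utf-8")
signature = sign_payload(body, "example-secret")
```

The important design point is that signing and verification both operate on the same raw bytes, not on a parsed and re-encoded payload.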
#### Bearer Token
```javascript
// Verify in your webhook handler
const auth = req.headers.authorization;
if (auth !== `Bearer ${YOUR_SECRET_TOKEN}`) {
  return res.status(401).json({error: 'Unauthorized'});
}
```
#### API Key Header
```python
# Verify API key
api_key = request.headers.get('X-API-Key')
if api_key != YOUR_API_KEY:
    return {'error': 'Invalid API key'}, 401
```
### Webhook Registration
```python
from backend.autonomous.webhook_system import WebhookConfig, WebhookEvent, WebhookSecurityType
# Advanced webhook configuration
config = WebhookConfig(
    url="https://your-app.com/webhooks/youtube-summarizer",
    events=[WebhookEvent.VIDEO_PROCESSED],
    security_type=WebhookSecurityType.HMAC_SHA256,
    timeout_seconds=30,
    retry_attempts=3,
    retry_delay_seconds=5,
    filter_conditions={
        "video_duration": {"$lt": 3600},  # Only videos < 1 hour
        "processing_quality": {"$in": ["high", "premium"]}
    }
)
```
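The exact semantics of `filter_conditions` are not spelled out here. As a rough sketch, assuming MongoDB-style operators where every condition must hold for an event to be delivered, the evaluation could look like the following. `matches_filters` and the `$gt`/`$eq` operators are assumptions for illustration; only `$lt` and `$in` appear in the configuration above.

```python
from typing import Any, Callable, Dict

# Operator table: "$lt" and "$in" appear in the example above;
# "$gt" and "$eq" are assumed here for illustration.
OPERATORS: Dict[str, Callable[[Any, Any], bool]] = {
    "$lt": lambda value, operand: value < operand,
    "$gt": lambda value, operand: value > operand,
    "$eq": lambda value, operand: value == operand,
    "$in": lambda value, operand: value in operand,
}


def matches_filters(payload: Dict[str, Any], conditions: Dict[str, Dict[str, Any]]) -> bool:
    """Return True only if every condition holds for the payload."""
    for field, checks in conditions.items():
        if field not in payload:
            return False  # Assumption: a missing field fails the filter.
        for op, operand in checks.items():
            if op not in OPERATORS:
                raise ValueError(f"Unsupported operator: {op}")
            if not OPERATORS[op](payload[field], operand):
                return False
    return True


conditions = {
    "video_duration": {"$lt": 3600},
    "processing_quality": {"$in": ["high", "premium"]},
}
short_high = {"video_duration": 1200, "processing_quality": "high"}
long_high = {"video_duration": 7200, "processing_quality": "high"}
```

Under these assumptions, `short_high` passes the filter and `long_high` does not, because its duration is not below 3600 seconds.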
### Sample Webhook Handler
```python
from fastapi import FastAPI, Request, HTTPException
import hmac
import hashlib

app = FastAPI()

# WEBHOOK_SECRET and the handle_* coroutines below are placeholders
# that your application is expected to define.

@app.post("/webhooks/youtube-summarizer")
async def handle_webhook(request: Request):
    # Get raw payload and signature
    payload = await request.body()
    signature = request.headers.get("X-Hub-Signature-256", "")
    # Verify signature against the raw bytes before parsing
    if not verify_signature(payload, signature, WEBHOOK_SECRET):
        raise HTTPException(status_code=401, detail="Invalid signature")
    # Parse payload
    data = await request.json()
    event = data["event"]
    # Handle different events
    if event == "transcription.completed":
        await handle_transcription_completed(data["data"])
    elif event == "summarization.completed":
        await handle_summarization_completed(data["data"])
    elif event == "error.occurred":
        await handle_error(data["data"])
    return {"status": "received"}

def verify_signature(payload: bytes, signature: str, secret: str) -> bool:
    if not signature.startswith("sha256="):
        return False
    expected = hmac.new(
        secret.encode(),
        payload,
        hashlib.sha256
    ).hexdigest()
    # Constant-time comparison avoids leaking timing information
    return hmac.compare_digest(signature.encode(), f"sha256={expected}".encode())
```
## 🤖 Autonomous Controller
### Automation Rules
The system supports multiple automation rule types:
#### Scheduled Rules
Time-based automation using cron-like syntax:
```python
# Daily cleanup at 2 AM
autonomous_controller.add_rule(
    name="Daily Cleanup",
    trigger=AutomationTrigger.SCHEDULED,
    action=AutomationAction.CLEANUP_CACHE,
    parameters={
        "schedule": "0 2 * * *",  # Cron format
        "max_age_hours": 24
    }
)
```
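To make the `schedule` string concrete, here is a simplified sketch of how a five-field cron expression can be checked against a point in time. It is an illustration only and not the controller's scheduler: `cron_matches` is a hypothetical helper, it supports only `*` and single integers (no ranges, lists, or steps), and it requires all fields to match, which is stricter than standard cron when both day fields are restricted.

```python
from datetime import datetime


def cron_matches(expression: str, moment: datetime) -> bool:
    """Check a five-field cron expression (minute hour day month weekday).

    Simplified: only `*` and single integers are supported.
    """
    fields = expression.split()
    if len(fields) != 5:
        raise ValueError("Expected five cron fields")
    # Cron counts weekdays from Sunday = 0; isoweekday() has Sunday = 7.
    actual = [
        moment.minute,
        moment.hour,
        moment.day,
        moment.month,
        moment.isoweekday() % 7,
    ]
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))
```

With this helper, `"0 2 * * *"` matches any date at 02:00 and nothing else, which is the "daily at 2 AM" behavior the rule above describes.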
#### Queue-Based Rules
Trigger actions based on queue depth:
```python
# Process batch when queue exceeds threshold
autonomous_controller.add_rule(
    name="Queue Monitor",
    trigger=AutomationTrigger.QUEUE_BASED,
    action=AutomationAction.BATCH_PROCESS,
    parameters={
        "queue_threshold": 10,
        "batch_size": 5
    },
    conditions={
        "min_queue_age_minutes": 10
    }
)
```
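How `queue_threshold` and `batch_size` interact is not documented here. One plausible reading, sketched below, is that nothing happens until the queue depth exceeds the threshold, after which the queue is drained in chunks of `batch_size`. `plan_batches` is a hypothetical helper, and the `min_queue_age_minutes` condition is left out for brevity.

```python
from typing import List, Sequence


def plan_batches(queue: Sequence[str], queue_threshold: int, batch_size: int) -> List[List[str]]:
    """Split the queue into batches once its depth exceeds the threshold."""
    if len(queue) <= queue_threshold:
        return []  # At or below the threshold: nothing to do yet.
    return [list(queue[i:i + batch_size]) for i in range(0, len(queue), batch_size)]


# Twelve queued video IDs with the parameters from the rule above.
pending = [f"video_{n}" for n in range(12)]
batches = plan_batches(pending, queue_threshold=10, batch_size=5)
```

Here the twelve pending items exceed the threshold of 10, so they are split into batches of 5, 5, and 2.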
#### Threshold-Based Rules
Monitor system metrics and respond:
```python
# Optimize performance when thresholds exceeded
autonomous_controller.add_rule(
    name="Performance Monitor",
    trigger=AutomationTrigger.THRESHOLD_BASED,
    action=AutomationAction.OPTIMIZE_PERFORMANCE,
    parameters={
        "cpu_threshold": 80,
        "memory_threshold": 85,
        "response_time_threshold": 5.0
    }
)
```
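As an illustration of what a threshold check might involve, the sketch below compares current metrics with every `*_threshold` parameter and reports which ones are exceeded. The metric names (`cpu`, `memory`, `response_time`), the strict "greater than" comparison, and the `exceeded_thresholds` helper are assumptions, not the controller's actual logic.

```python
from typing import Dict, List


def exceeded_thresholds(metrics: Dict[str, float], parameters: Dict[str, float]) -> List[str]:
    """Return names of metrics whose value is above its `<name>_threshold`."""
    exceeded = []
    for key, limit in parameters.items():
        if not key.endswith("_threshold"):
            continue
        name = key[: -len("_threshold")]
        if name in metrics and metrics[name] > limit:
            exceeded.append(name)
    return exceeded


parameters = {"cpu_threshold": 80, "memory_threshold": 85, "response_time_threshold": 5.0}
metrics = {"cpu": 91.0, "memory": 60.0, "response_time": 6.2}
over_limit = exceeded_thresholds(metrics, parameters)
```

A rule could then fire its action whenever the returned list is non-empty.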
#### Event-Driven Rules
Respond to specific system events:
```python
# Scale resources based on user activity
autonomous_controller.add_rule(
    name="Auto Scaling",
    trigger=AutomationTrigger.USER_ACTIVITY,
    action=AutomationAction.SCALE_RESOURCES,
    parameters={
        "activity_threshold": 5,
        "scale_factor": 1.5
    }
)
```
### Available Actions
| Action | Description | Parameters |
|--------|-------------|------------|
| `PROCESS_VIDEO` | Process individual videos | `video_url`, `processing_options` |
| `BATCH_PROCESS` | Process multiple videos | `batch_size`, `queue_selection` |
| `CLEANUP_CACHE` | Clean up old cached data | `max_age_hours`, `cleanup_types` |
| `GENERATE_REPORT` | Generate system reports | `report_types`, `recipients` |
| `SCALE_RESOURCES` | Scale system resources | `scale_factor`, `target_metrics` |
| `SEND_NOTIFICATION` | Send notifications | `recipients`, `message`, `urgency` |
| `OPTIMIZE_PERFORMANCE` | Optimize system performance | `optimization_targets` |
| `BACKUP_DATA` | Backup system data | `backup_types`, `retention_days` |
## 🔗 API Endpoints
### Webhook Management
```bash
# Register webhook
POST /api/autonomous/webhooks/{webhook_id}
# Get webhook status
GET /api/autonomous/webhooks/{webhook_id}
# Update webhook
PUT /api/autonomous/webhooks/{webhook_id}
# Delete webhook
DELETE /api/autonomous/webhooks/{webhook_id}
# List all webhooks
GET /api/autonomous/webhooks
# Get system stats
GET /api/autonomous/webhooks/system/stats
```
### Automation Management
```bash
# Start/stop automation
POST /api/autonomous/automation/start
POST /api/autonomous/automation/stop
# Get system status
GET /api/autonomous/automation/status
# Manage rules
POST /api/autonomous/automation/rules
GET /api/autonomous/automation/rules
PUT /api/autonomous/automation/rules/{rule_id}
DELETE /api/autonomous/automation/rules/{rule_id}
# Execute rule manually
POST /api/autonomous/automation/rules/{rule_id}/execute
```
### Monitoring
```bash
# System health
GET /api/autonomous/system/health
# System metrics
GET /api/autonomous/system/metrics
# Execution history
GET /api/autonomous/automation/executions
# Recent events
GET /api/autonomous/events
```
## 📊 Monitoring & Analytics
### System Health Dashboard
```python
# Get comprehensive system status
status = await get_automation_status()
print(f"Controller Status: {status['controller_status']}")
print(f"Active Rules: {status['active_rules']}")
print(f"Success Rate: {status['success_rate']:.2%}")
print(f"Average Execution Time: {status['average_execution_time']:.2f}s")
```
### Webhook Delivery Monitoring
```python
# Monitor webhook performance
stats = webhook_manager.get_system_stats()
print(f"Active Webhooks: {stats['active_webhooks']}")
print(f"Success Rate: {stats['success_rate']:.2%}")
print(f"Pending Deliveries: {stats['pending_deliveries']}")
print(f"Average Response Time: {stats['average_response_time']:.3f}s")
```
### Execution History
```python
# Get recent executions
executions = autonomous_controller.get_execution_history(limit=20)
for execution in executions:
    print(f"Rule: {execution['rule_id']}")
    print(f"Status: {execution['status']}")
    print(f"Started: {execution['started_at']}")
    # error_message may be missing or empty for successful runs
    if execution.get('error_message'):
        print(f"Error: {execution['error_message']}")
```
## 🚨 Error Handling & Recovery
### Automatic Retry Logic
Failed webhook deliveries are retried automatically:
- **Exponential Backoff**: Increasing delays between retries
- **Maximum Attempts**: Configurable retry limits
- **Failure Tracking**: Detailed error logging
- **Dead Letter Queue**: Failed deliveries tracked for analysis
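The retry schedule itself is not specified here. A common form of exponential backoff, sketched below, doubles the delay after each failed attempt up to a cap and optionally adds jitter so that many failing deliveries do not retry in lockstep. The doubling factor, the 300-second cap, and both helper names are assumptions; the inputs mirror the `retry_attempts=3` and `retry_delay_seconds=5` values used in the registration example.

```python
import random
from typing import List


def backoff_delays(retry_attempts: int, base_delay: float, max_delay: float = 300.0) -> List[float]:
    """Delays before each retry: base, 2*base, 4*base, ... capped at max_delay."""
    return [min(base_delay * (2 ** attempt), max_delay) for attempt in range(retry_attempts)]


def with_jitter(delay: float, rng: random.Random) -> float:
    """Full jitter: pick a random delay in [0, delay] to spread out retries."""
    return rng.uniform(0, delay)


delays = backoff_delays(retry_attempts=3, base_delay=5)
```

With these inputs the delays are 5, 10, and 20 seconds. A delivery that still fails after the last attempt would then be recorded as failed for later analysis.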
### Self-Healing Operations
```python
# Automatic error recovery
autonomous_controller.add_rule(
    name="Error Recovery",
    trigger=AutomationTrigger.EVENT_DRIVEN,
    action=AutomationAction.OPTIMIZE_PERFORMANCE,
    parameters={
        "recovery_actions": [
            "restart_services",
            "clear_error_queues",
            "reset_connections"
        ]
    }
)
```
### Health Monitoring
```python
# Continuous health monitoring
@autonomous_controller.health_check
async def check_system_health():
    # Custom health check logic
    cpu_usage = get_cpu_usage()
    memory_usage = get_memory_usage()
    if cpu_usage > 90:
        await trigger_action(AutomationAction.SCALE_RESOURCES)
    if memory_usage > 95:
        await trigger_action(AutomationAction.CLEANUP_CACHE)
```
## 🛠️ Configuration
### Environment Variables
```bash
# Webhook Configuration
WEBHOOK_MAX_TIMEOUT=300 # Maximum webhook timeout (seconds)
WEBHOOK_DEFAULT_RETRIES=3 # Default retry attempts
WEBHOOK_CLEANUP_DAYS=7 # Days to keep delivery records
# Automation Configuration
AUTOMATION_CHECK_INTERVAL=30 # Rule check interval (seconds)
AUTOMATION_EXECUTION_TIMEOUT=3600 # Maximum execution time (seconds)
AUTOMATION_MAX_CONCURRENT=10 # Maximum concurrent executions
# Security
WEBHOOK_SECRET_LENGTH=32 # Generated secret length
REQUIRE_WEBHOOK_AUTH=true # Require webhook authentication
```
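If you need the same settings in your own tooling, they can be read with typed defaults. This is a sketch only: `AutonomousSettings` and `load_settings` are hypothetical names, the defaults simply repeat the values listed above, and the project may load its configuration differently.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class AutonomousSettings:
    webhook_max_timeout: int
    webhook_default_retries: int
    automation_check_interval: int
    require_webhook_auth: bool


def load_settings(env: Mapping[str, str] = os.environ) -> AutonomousSettings:
    """Read settings from the environment, falling back to the values listed above."""
    return AutonomousSettings(
        webhook_max_timeout=int(env.get("WEBHOOK_MAX_TIMEOUT", "300")),
        webhook_default_retries=int(env.get("WEBHOOK_DEFAULT_RETRIES", "3")),
        automation_check_interval=int(env.get("AUTOMATION_CHECK_INTERVAL", "30")),
        require_webhook_auth=env.get("REQUIRE_WEBHOOK_AUTH", "true").lower() == "true",
    )
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests.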
### Advanced Configuration
```python
from backend.autonomous.webhook_system import WebhookManager
from backend.autonomous.autonomous_controller import AutonomousController
# Custom webhook manager
webhook_manager = WebhookManager()
webhook_manager.stats["max_retries"] = 5
webhook_manager.stats["default_timeout"] = 45
# Custom autonomous controller
autonomous_controller = AutonomousController()
autonomous_controller.metrics["check_interval"] = 15
autonomous_controller.metrics["max_executions"] = 50
```
## 📈 Performance Optimization
### Webhook Performance
- **Connection Pooling**: Reuse HTTP connections for webhook deliveries
- **Batch Deliveries**: Group multiple events for the same endpoint
- **Async Processing**: Non-blocking webhook delivery queue
- **Circuit Breaker**: Temporarily disable failing endpoints
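The circuit breaker idea can be sketched in a few lines. The version below is a generic illustration, not the project's implementation: it opens after a number of consecutive failures and allows a trial delivery once a cool-down has passed. The class name, the default thresholds, and the injectable `clock` are all assumptions.

```python
import time
from typing import Callable, Optional


class CircuitBreaker:
    """Open after `failure_threshold` consecutive failures; retry after `reset_seconds`."""

    def __init__(self, failure_threshold: int = 3, reset_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.failure_threshold = failure_threshold
        self.reset_seconds = reset_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allow_request(self) -> bool:
        """Return True if a delivery attempt may be made right now."""
        if self.opened_at is None:
            return True
        # After the cool-down, let a trial request through (half-open state).
        return self.clock() - self.opened_at >= self.reset_seconds

    def record_success(self) -> None:
        """A successful delivery closes the circuit and clears the failure count."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        """Count a failure and open (or re-open) the circuit at the threshold."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

One breaker per endpoint keeps a single failing receiver from consuming delivery capacity that healthy endpoints could use.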
### Automation Performance
- **Rule Prioritization**: Execute high-priority rules first
- **Resource Limits**: Prevent resource exhaustion
- **Execution Throttling**: Limit concurrent executions
- **Smart Scheduling**: Optimize execution timing
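Execution throttling is commonly built on a semaphore. The sketch below, which assumes an asyncio-based executor, runs a list of coroutine factories while never allowing more than `max_concurrent` of them to run at once. `run_throttled` is a hypothetical helper and not part of the controller's API.

```python
import asyncio
from typing import Awaitable, Callable, List


async def run_throttled(tasks: List[Callable[[], Awaitable[str]]], max_concurrent: int) -> List[str]:
    """Run all tasks, never more than `max_concurrent` at the same time."""
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(task: Callable[[], Awaitable[str]]) -> str:
        # Each task waits here until one of the slots is free.
        async with semaphore:
            return await task()

    # gather() preserves the order of the input tasks in its result list.
    return await asyncio.gather(*(guarded(task) for task in tasks))
```

A limit such as `AUTOMATION_MAX_CONCURRENT` would map naturally onto the `max_concurrent` argument.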
## 🔒 Security Considerations
### Webhook Security
1. **Always Use HTTPS**: Never send webhooks to plain HTTP endpoints
2. **Verify Signatures**: Always validate HMAC signatures
3. **Rotate Secrets**: Regularly rotate webhook secrets
4. **Rate Limiting**: Implement rate limiting on webhook endpoints
5. **Input Validation**: Validate all webhook payloads
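Secret rotation is easier when the receiver accepts both the old and the new secret for a short overlap window. The sketch below shows one way to do that for HMAC-SHA256 signatures in the `sha256=<hex digest>` format. `verify_with_rotation` is a hypothetical helper, and the overlap policy is an assumption rather than documented behavior.

```python
import hashlib
import hmac
from typing import Iterable


def verify_with_rotation(payload: bytes, signature: str, secrets: Iterable[str]) -> bool:
    """Accept a signature made with any currently valid secret."""
    for secret in secrets:
        digest = hmac.new(secret.encode("utf-8"), payload, hashlib.sha256).hexdigest()
        expected = f"sha256={digest}"
        # Constant-time comparison, done on bytes so any header value is safe to compare.
        if hmac.compare_digest(signature.encode("utf-8"), expected.encode("utf-8")):
            return True
    return False


# A payload signed with the old secret is still accepted during the overlap.
payload = b'{"event": "system.status"}'
old_signature = "sha256=" + hmac.new(b"old-secret", payload, hashlib.sha256).hexdigest()
```

Once all senders use the new secret, the old one is removed from the list and signatures made with it stop verifying.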
### Automation Security
1. **Least Privilege**: Limit automation rule capabilities
2. **Audit Logging**: Log all automation activities
3. **Resource Limits**: Prevent resource exhaustion attacks
4. **Secure Parameters**: Encrypt sensitive parameters
5. **Access Control**: Restrict rule modification access
## 🧪 Testing
### Webhook Testing
```python
# Test webhook delivery
from backend.autonomous.webhook_system import trigger_event, WebhookEvent
# Trigger test event
delivery_ids = await trigger_event(
    WebhookEvent.SYSTEM_STATUS,
    {"test": True, "message": "Test webhook delivery"}
)
print(f"Triggered {len(delivery_ids)} webhook deliveries")
```
### Automation Testing
```python
# Test automation rule
from backend.autonomous.autonomous_controller import trigger_manual_execution
# Manually execute rule
success = await trigger_manual_execution("rule_id")
if success:
    print("Rule executed successfully")
```
### Integration Testing
```bash
# Test webhook endpoint
curl -X POST "http://localhost:8000/api/autonomous/webhooks/test" \
  -H "Content-Type: application/json" \
  -d '{
    "event": "system.status",
    "data": {"test": true}
  }'
# Test automation status
curl -X GET "http://localhost:8000/api/autonomous/automation/status"
```
## 🚀 Production Deployment
### Infrastructure Requirements
- **Redis**: For webhook delivery queue and caching
- **Database**: For persistent rule and execution storage
- **Monitoring**: Prometheus/Grafana for metrics
- **Load Balancer**: For high availability webhook delivery
### Deployment Checklist
- [ ] Configure webhook secrets and authentication
- [ ] Set up monitoring and alerting
- [ ] Configure backup and recovery procedures
- [ ] Test all webhook endpoints
- [ ] Verify automation rule execution
- [ ] Set up log aggregation
- [ ] Configure resource limits
- [ ] Test failover scenarios
## 📚 Examples
See the complete example implementations in:
- `backend/autonomous/example_usage.py` - Basic usage examples
- `backend/api/autonomous.py` - API integration examples
- `tests/autonomous/` - Comprehensive test suite
## 🤝 Contributing
1. Follow existing code patterns and error handling
2. Add comprehensive tests for new features
3. Update documentation for API changes
4. Include monitoring and logging
5. Consider security implications
## 📄 License
This autonomous operations system is part of the YouTube Summarizer project and follows the same licensing terms.