# YouTube Thumbnail Watcher Service

## 🎯 Overview

A Python backend service that automatically monitors the Directus `media_items` collection and downloads YouTube thumbnails for items that don't have one yet. It is intended as a more robust alternative to Directus Flows.

## ✅ Features Implemented

- **Automatic Polling**: Checks Directus every 30 seconds for unprocessed YouTube items
- **Smart Filtering**: Only processes items with `type='youtube_video'` or `type='youtube'` that have URLs but no thumbnails
- **Quality Fallback**: Downloads best available thumbnail (maxres → high → medium → default)
- **Robust Error Handling**: Continues processing on individual failures
- **Comprehensive Logging**: Detailed logs with statistics and error tracking
- **Stateless Design**: Can be restarted anytime without data loss

## 🏗️ Architecture

```
projects/youtube-automation/
├── config.py                  # Configuration management
├── src/
│   ├── directus_client.py     # Directus API wrapper
│   ├── youtube_processor.py   # YouTube thumbnail logic
│   └── watcher_service.py     # Main polling service
├── run_watcher.sh             # Startup script
├── requirements.txt           # Dependencies
└── .env                       # Environment variables
```

## 🚀 Usage

### Start the Service
```bash
# Make sure you have a .env file configured
./run_watcher.sh
```

### Environment Variables (.env)
```bash
DIRECTUS_URL="https://enias.zeabur.app/"
DIRECTUS_TOKEN="your_token"
YOUTUBE_API_KEY="optional_youtube_api_key"
POLL_INTERVAL=30
BATCH_SIZE=10
LOG_LEVEL=INFO
```

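One way these variables could be read and validated in Python is sketched below. It assumes the `.env` values have already been loaded into the process environment (how `run_watcher.sh` or `config.py` does this is not shown here). `WatcherConfig` and `load_config` are hypothetical names, and the defaults simply mirror the example values above.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class WatcherConfig:
    directus_url: str
    directus_token: str
    youtube_api_key: Optional[str]
    poll_interval: int
    batch_size: int
    log_level: str


def load_config(env: Mapping[str, str] = os.environ) -> WatcherConfig:
    """Build a config object from environment-style key/value pairs."""
    missing = [k for k in ("DIRECTUS_URL", "DIRECTUS_TOKEN") if not env.get(k)]
    if missing:
        raise ValueError(f"Missing required settings: {', '.join(missing)}")
    return WatcherConfig(
        # Strip the trailing slash so API paths can be appended consistently.
        directus_url=env["DIRECTUS_URL"].rstrip("/"),
        directus_token=env["DIRECTUS_TOKEN"],
        # The API key is optional, so an empty value becomes None.
        youtube_api_key=env.get("YOUTUBE_API_KEY") or None,
        poll_interval=int(env.get("POLL_INTERVAL", "30")),
        batch_size=int(env.get("BATCH_SIZE", "10")),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
    )


example_config = load_config({
    "DIRECTUS_URL": "https://enias.zeabur.app/",
    "DIRECTUS_TOKEN": "your_token",
})
```

Failing fast on a missing URL or token keeps configuration problems out of the polling loop, where they would otherwise surface as repeated request errors.
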
### Monitor Logs
```bash
tail -f /tmp/youtube_watcher.log
```

## 📊 Service Statistics

The service tracks and reports:
- Items processed
- Success/failure rates
- Uptime
- Processing speed

Example output:
```
📊 YouTube Thumbnail Watcher Statistics
Uptime: 0:02:15
Items Processed: 5
Succeeded: 3
Failed: 2
Success Rate: 60.0%
```

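The report above could be produced by a small counter object along the following lines. This is only a sketch: `WatcherStats` is a hypothetical name, and the actual bookkeeping in the service may differ.

```python
import time
from dataclasses import dataclass, field
from datetime import timedelta


@dataclass
class WatcherStats:
    succeeded: int = 0
    failed: int = 0
    # Monotonic clock so uptime is unaffected by system clock changes.
    started_at: float = field(default_factory=time.monotonic)

    @property
    def processed(self) -> int:
        return self.succeeded + self.failed

    @property
    def success_rate(self) -> float:
        # Avoid dividing by zero before the first item is processed.
        if not self.processed:
            return 0.0
        return 100.0 * self.succeeded / self.processed

    def report(self) -> str:
        uptime = timedelta(seconds=int(time.monotonic() - self.started_at))
        return "\n".join([
            "📊 YouTube Thumbnail Watcher Statistics",
            f"Uptime: {uptime}",
            f"Items Processed: {self.processed}",
            f"Succeeded: {self.succeeded}",
            f"Failed: {self.failed}",
            f"Success Rate: {self.success_rate:.1f}%",
        ])


stats = WatcherStats(succeeded=3, failed=2)
summary = stats.report()
```

With 3 successes and 2 failures, the success rate is 3 / 5 = 60.0%, which matches the example output.
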
## 🔄 Processing Flow

1. **Poll Directus**: Query `media_items` for unprocessed YouTube videos
2. **Extract Video IDs**: Parse YouTube URLs to get video identifiers
3. **Download Thumbnails**: Try multiple quality levels until success
4. **Upload to Directus**: Create file entries in Directus files collection
5. **Update Items**: Link thumbnails to original media items
6. **Log Results**: Track success/failure for monitoring

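Step 2 is the part most sensitive to URL variations, so a minimal sketch of ID extraction is shown here. `extract_video_id` is a hypothetical helper, not the function in `youtube_processor.py`. It assumes video IDs are 11 characters drawn from letters, digits, `_`, and `-`, which holds for current YouTube URLs but is not a documented guarantee.

```python
import re
from typing import Optional
from urllib.parse import parse_qs, urlparse

# Assumed shape of a video ID: 11 URL-safe characters.
_VIDEO_ID = re.compile(r"[A-Za-z0-9_-]{11}")


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from common YouTube URL shapes, or None."""
    parsed = urlparse(url.strip())
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]

    candidate = None
    if host == "youtu.be":
        # Short links: https://youtu.be/<id>
        candidate = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "m.youtube.com", "music.youtube.com"):
        if parsed.path == "/watch":
            # Standard links: https://www.youtube.com/watch?v=<id>
            candidate = parse_qs(parsed.query).get("v", [None])[0]
        else:
            # Path-based links such as /embed/<id> or /shorts/<id>
            parts = parsed.path.strip("/").split("/")
            if len(parts) >= 2 and parts[0] in ("embed", "shorts", "live", "v"):
                candidate = parts[1]

    if candidate and _VIDEO_ID.fullmatch(candidate):
        return candidate
    return None
```

Returning `None` instead of raising lets the caller count the item as a failure and continue with the next one, in line with the error handling described earlier.
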
## 🧪 Testing Results

Successfully tested with:
- ✅ Rick Roll video (`dQw4w9WgXcQ`) - 65 KB thumbnail
- ✅ LLM Introduction video (`zjkBMFhNj_g`) - 184 KB thumbnail
- ✅ RAG tutorial video - Full processing pipeline

Failed gracefully with:
- ❌ Private/deleted videos - Proper error handling
- ❌ Age-restricted videos - Continues to next item

## 🔧 Configuration

### Database Query
```python
# Select YouTube items that have a URL but no thumbnail yet
filter = {
    "_and": [
        {"_or": [{"type": {"_eq": "youtube_video"}}, {"type": {"_eq": "youtube"}}]},
        {"url": {"_nnull": True}},
        {"youtube_thumbnail": {"_null": True}},
    ]
}
```

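The Directus REST API accepts a filter like this as a JSON-encoded `filter` query parameter on the items endpoint. The sketch below only builds the request URL and makes no network call. `build_items_url` is a hypothetical helper, and the `fields` selection is an assumption about what the watcher needs.

```python
import json
from urllib.parse import parse_qs, urlencode, urlparse


def build_items_url(base_url: str, collection: str, item_filter: dict, limit: int) -> str:
    """Return a Directus items URL with the filter JSON-encoded in the query."""
    query = urlencode({
        "filter": json.dumps(item_filter, separators=(",", ":")),
        "limit": limit,
        "fields": "id,url",  # assumed: only the fields needed for processing
    })
    return f"{base_url.rstrip('/')}/items/{collection}?{query}"


pending_filter = {
    "_and": [
        {"_or": [{"type": {"_eq": "youtube_video"}}, {"type": {"_eq": "youtube"}}]},
        {"url": {"_nnull": True}},
        {"youtube_thumbnail": {"_null": True}},
    ]
}

items_url = build_items_url("https://enias.zeabur.app/", "media_items", pending_filter, limit=10)

# Round-trip check: decoding the query string yields the original filter.
decoded_query = parse_qs(urlparse(items_url).query)
decoded_filter = json.loads(decoded_query["filter"][0])
```

The request itself would typically be sent with an `Authorization: Bearer <DIRECTUS_TOKEN>` header, and `limit` would come from `BATCH_SIZE`.
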
### Thumbnail Quality Priority
1. **maxres**: 1280x720 (best quality)
2. **high**: 480x360
3. **medium**: 320x180
4. **default**: 120x90 (fallback)

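The fallback order above can be expressed as a simple loop over candidate URLs. The sketch below assumes thumbnails are fetched from YouTube's static image host (`i.ytimg.com`) using its conventional file names; the actual service may instead use the YouTube Data API when `YOUTUBE_API_KEY` is set. `download_best_thumbnail` and the injected `fetch` callable are hypothetical.

```python
from typing import Callable, List, Optional, Tuple

# Quality labels paired with the static file names, best first.
QUALITY_FILES = [
    ("maxres", "maxresdefault.jpg"),
    ("high", "hqdefault.jpg"),
    ("medium", "mqdefault.jpg"),
    ("default", "default.jpg"),
]


def thumbnail_urls(video_id: str) -> List[Tuple[str, str]]:
    """Return (quality, url) pairs in priority order."""
    return [
        (quality, f"https://i.ytimg.com/vi/{video_id}/{name}")
        for quality, name in QUALITY_FILES
    ]


def download_best_thumbnail(
    video_id: str,
    fetch: Callable[[str], Optional[bytes]],
) -> Optional[Tuple[str, bytes]]:
    """Try each quality in order; return the first hit, or None if all fail."""
    for quality, url in thumbnail_urls(video_id):
        data = fetch(url)  # expected to return None on 404 or network error
        if data:
            return quality, data
    return None


def _fake_fetch(url: str) -> Optional[bytes]:
    # Pretend the video has no maxres thumbnail, as happens for some uploads.
    return None if url.endswith("maxresdefault.jpg") else b"jpeg-bytes"


best = download_best_thumbnail("dQw4w9WgXcQ", _fake_fetch)
```

Passing `fetch` in as a parameter keeps the fallback logic testable without network access; in production it would wrap an HTTP GET with a timeout.
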
## 🎉 Benefits Over Directus Flows

- ✅ **Better Error Handling**: Individual failures don't stop the service
- ✅ **Comprehensive Logging**: Full visibility into processing
- ✅ **Easy Testing**: Can test individual components
- ✅ **Flexible Deployment**: Run anywhere, not tied to Directus
- ✅ **Stateless Recovery**: Restart anytime without issues
- ✅ **Performance Monitoring**: Built-in statistics and metrics

## 🔍 Monitoring

### Key Metrics to Watch
- Processing success rate (should be >80% for public videos)
- Queue size (items waiting for processing)
- Error patterns (404s vs network issues)
- Processing speed (items per minute)

### Log Levels
- **INFO**: Normal operation and successful processing
- **WARNING**: Failed downloads, retries
- **ERROR**: Critical failures, configuration issues
- **DEBUG**: Detailed processing information

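A minimal sketch of wiring `LOG_LEVEL` to Python's standard `logging` module follows. `configure_logging` and the logger name are hypothetical, and falling back to INFO for an unrecognised level name is a design assumption rather than documented behaviour of the service.

```python
import logging
from typing import Optional

LOG_FORMAT = "%(asctime)s %(levelname)s %(name)s: %(message)s"


def configure_logging(level_name: str = "INFO", log_file: Optional[str] = None) -> logging.Logger:
    """Return a logger set to the requested level; unknown names fall back to INFO."""
    level = getattr(logging, level_name.upper(), None)
    if not isinstance(level, int):
        level = logging.INFO

    logger = logging.getLogger("youtube_watcher_sketch")
    logger.setLevel(level)
    # Replace existing handlers so repeated calls do not duplicate output.
    logger.handlers.clear()
    handler = logging.FileHandler(log_file) if log_file else logging.StreamHandler()
    handler.setFormatter(logging.Formatter(LOG_FORMAT))
    logger.addHandler(handler)
    return logger


log = configure_logging("warning")
log.info("Processed item 42")  # suppressed at WARNING level
log.warning("Thumbnail download failed, trying next quality")
```

Passing `log_file="/tmp/youtube_watcher.log"` would write to the path used in the `tail -f` command shown under Monitor Logs.
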
## 🚀 Production Deployment

### Systemd Service (Linux)
```bash
# Create the service file
sudo tee /etc/systemd/system/youtube-watcher.service > /dev/null <<'EOF'
[Unit]
Description=YouTube Thumbnail Watcher
After=network.target

[Service]
Type=simple
User=www-data
WorkingDirectory=/path/to/youtube-automation
ExecStart=/path/to/youtube-automation/run_watcher.sh
Restart=always

[Install]
WantedBy=multi-user.target
EOF

# Reload systemd, then enable and start the service
sudo systemctl daemon-reload
sudo systemctl enable --now youtube-watcher.service
```

### Docker Alternative
```dockerfile
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install -r requirements.txt
COPY . .
CMD ["python", "src/watcher_service.py"]
```

## 🎯 Next Steps

The core YouTube thumbnail automation is complete and working! The service successfully:

1. ✅ Polls Directus for unprocessed YouTube items
2. ✅ Downloads thumbnails with quality fallback
3. ✅ Uploads files to Directus
4. ✅ Updates media items with thumbnail references
5. ✅ Handles errors gracefully
6. ✅ Provides comprehensive logging and statistics

The service is ready for production use!