# Troubleshooting and Security Guide
Common issues, solutions, and security best practices for Trax.
## Installation Issues
### "Python 3.11+ required"
Trax requires Python 3.11+ for advanced type annotations.
**Solution:**
```bash
# Install Python 3.11 with pyenv
pyenv install 3.11.8
pyenv local 3.11.8
# Or with homebrew (macOS)
brew install python@3.11
```
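To confirm which interpreter is actually active after installing, a minimal sketch like the following can help. The helper name `meets_minimum` is illustrative and not part of Trax.

```python
import sys

MIN_VERSION = (3, 11)  # minimum stated in this guide


def meets_minimum(version_info=None, minimum=MIN_VERSION):
    """Return True if the given (or running) interpreter is at least `minimum`."""
    current = tuple(version_info or sys.version_info)[:2]
    return current >= minimum


if __name__ == "__main__":
    found = ".".join(str(part) for part in sys.version_info[:3])
    status = "OK" if meets_minimum() else "too old"
    print(f"Python {found}: {status}")
```

Run it with the same `python` that `uv` resolves, since a pyenv or Homebrew install does not help if a different interpreter is first on `PATH`.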
### "PostgreSQL connection failed"
Database connection issues during setup.
**Solution:**
```bash
# Check PostgreSQL status
brew services list | grep postgresql
# Start PostgreSQL
brew services start postgresql@15
# Create database
createdb trax_dev
# Run setup script
./scripts/setup_postgresql.sh
```
### "FFmpeg not found"
Audio preprocessing requires FFmpeg 6.0+.
**Solution:**
```bash
# Install FFmpeg (macOS)
brew install ffmpeg
# Install FFmpeg (Ubuntu)
sudo apt update && sudo apt install ffmpeg
# Verify installation
ffmpeg -version
```
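The version requirement can also be checked programmatically. This is a sketch only: `parse_ffmpeg_major` and `check_ffmpeg` are hypothetical helpers, and the regular expression assumes the usual `ffmpeg version X.Y...` banner, so nightly or git builds with other banners return `None`.

```python
import re
import shutil
import subprocess

MIN_MAJOR = 6  # this guide asks for FFmpeg 6.0+


def parse_ffmpeg_major(version_output):
    """Return the major version from `ffmpeg -version` output, or None."""
    match = re.search(r"ffmpeg version n?(\d+)\.", version_output)
    return int(match.group(1)) if match else None


def check_ffmpeg():
    """Return (ok, message) describing the FFmpeg found on PATH."""
    path = shutil.which("ffmpeg")
    if path is None:
        return False, "ffmpeg not found on PATH"
    result = subprocess.run(
        [path, "-version"], capture_output=True, text=True, check=False
    )
    major = parse_ffmpeg_major(result.stdout)
    if major is None:
        return False, f"could not parse version from {path}"
    return major >= MIN_MAJOR, f"{path} reports major version {major}"
```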
## Runtime Errors
### "Invalid YouTube URL"
URL format not recognized by the extractor.
**Supported Formats:**
- `https://www.youtube.com/watch?v=VIDEO_ID`
- `https://youtu.be/VIDEO_ID`
- `https://www.youtube.com/watch?v=VIDEO_ID&t=123s`
**Unsupported:**
- Playlist URLs
- Channel URLs
- Live stream URLs
- Shorts URLs
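
The supported and unsupported shapes above can be approximated with a small pre-check before submitting URLs. This is a sketch, not the extractor Trax uses: `extract_video_id` is a hypothetical helper, the ID alphabet is an assumption, and a live stream served under `/watch` cannot be detected from the URL alone.

```python
import re
from urllib.parse import parse_qs, urlparse

VIDEO_ID = re.compile(r"^[A-Za-z0-9_-]+$")  # assumed ID alphabet


def extract_video_id(url):
    """Return the video ID for a supported URL shape, else None."""
    parts = urlparse(url)
    if parts.scheme != "https":
        return None
    host = parts.netloc.lower()
    if host == "youtu.be":
        candidate = parts.path.lstrip("/")
    elif host == "www.youtube.com" and parts.path == "/watch":
        # Extra parameters such as t=123s are ignored.
        candidate = parse_qs(parts.query).get("v", [""])[0]
    else:
        return None  # playlist, channel, /live/ and /shorts/ paths land here
    return candidate if VIDEO_ID.match(candidate) else None
```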
### "File too large, max 500MB"
Media file exceeds size limit.
**Solutions:**
```bash
# Compress video (reduce quality)
ffmpeg -i large_video.mp4 -crf 28 compressed_video.mp4
# Extract audio only
ffmpeg -i large_video.mp4 -vn -acodec mp3 audio_only.mp3
# Split into chunks
ffmpeg -i large_audio.mp3 -f segment -segment_time 1800 -c copy chunk_%03d.mp3
```
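Before starting a batch, it can be useful to list the files that would trip the limit. The sketch below assumes "500MB" means 500 MiB (the exact threshold Trax applies is not stated here), and `oversized_files` is an illustrative helper.

```python
from pathlib import Path

MAX_BYTES = 500 * 1024 * 1024  # assumption: "500MB" is interpreted as 500 MiB


def exceeds_limit(size_bytes, limit=MAX_BYTES):
    """True when a file of `size_bytes` is strictly above the limit."""
    return size_bytes > limit


def oversized_files(folder, limit=MAX_BYTES):
    """Yield (path, size_in_bytes) for files under `folder` above the limit."""
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file():
            size = path.stat().st_size
            if exceeds_limit(size, limit):
                yield path, size
```

Any file reported by `oversized_files("folder")` is a candidate for one of the FFmpeg commands above.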
### "Rate limit exceeded"
Too many YouTube requests in a short time.
**Solution:**
- Wait 60 seconds before retrying
- Process URLs in smaller batches
- Use `--workers 2` to reduce concurrent requests
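
When driving Trax from a script, the batching and waiting advice above might look like the following sketch. `process_in_batches`, the batch size of 5, and the `handler` callable are all assumptions for illustration; only the 60-second wait comes from this guide.

```python
import time


def batched(items, size):
    """Split `items` into consecutive lists of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]


def process_in_batches(urls, handler, batch_size=5, pause_seconds=60, sleep=time.sleep):
    """Call `handler(url)` for each URL, pausing between batches."""
    batches = batched(list(urls), batch_size)
    for index, batch in enumerate(batches):
        for url in batch:
            handler(url)
        if index < len(batches) - 1:
            sleep(pause_seconds)  # no pause after the final batch


# Demonstration with a recording stub instead of a real sleep.
seen, pauses = [], []
process_in_batches(["a", "b", "c"], seen.append, batch_size=2, sleep=pauses.append)
```

The `sleep` parameter is injectable so the loop can be exercised without actually waiting.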
### "'MediaRepository' object has no attribute 'get'"
CLI video download pipeline error due to incorrect service initialization.
**Solution:**
This was a known issue in the CLI code that has been fixed. If you encounter this error:
```bash
# Update to latest version with the fix
git pull origin main
# Or if using development version, ensure you have the corrected CLI code
# The fix involves proper parameter passing to create_media_service()
```
**Technical Details:**
- **Root Cause:** Incorrect factory function parameter order in CLI commands
- **Fixed In:** CLI commands now use `create_media_service(media_repository=repo)` instead of `create_media_service(repo)`
- **Affects:** `youtube` and `batch-urls` commands with `--download` flag
- **Status:** ✅ Resolved in current version
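
To see how a positional argument can produce this exact message, consider the sketch below. The factory signature is hypothetical (the real `create_media_service()` parameters are not shown in this guide); it only illustrates the failure mode, where the repository lands in a slot that expects a dict-like config.

```python
class MediaRepository:
    """Stand-in for the real repository; it has no dict-style .get()."""


def create_media_service(config=None, media_repository=None):
    """Hypothetical factory: the first positional slot is a config mapping."""
    settings = config or {}
    download_dir = settings.get("download_dir", "downloads")
    return {"repository": media_repository, "download_dir": download_dir}


repo = MediaRepository()

try:
    create_media_service(repo)  # repo is bound to `config`, not `media_repository`
    positional_error = ""
except AttributeError as exc:
    positional_error = str(exc)

# Passing by keyword binds the repository to the intended parameter.
service = create_media_service(media_repository=repo)
```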
### "unsupported operand type(s) for *: 'DownloadProgress' and 'int'"
Progress callback type mismatch in video download commands.
**Solution:**
This was also a known issue that has been fixed. The progress callback now correctly handles the `DownloadProgress` object.
**Technical Details:**
- **Root Cause:** Progress callback expected a number but received a `DownloadProgress` object
- **Fixed In:** Progress callbacks now use `p.percentage` instead of `p * 100`
- **Affects:** Progress bars in download operations
- **Status:** ✅ Resolved in current version
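
The mismatch can be reproduced with a minimal sketch. The `DownloadProgress` fields here are assumptions; only the `percentage` attribute is named in this guide.

```python
from dataclasses import dataclass


@dataclass
class DownloadProgress:
    """Hypothetical shape; only `percentage` is named in this guide."""
    downloaded_bytes: int
    total_bytes: int

    @property
    def percentage(self):
        if self.total_bytes == 0:
            return 0.0
        return 100.0 * self.downloaded_bytes / self.total_bytes


def broken_callback(p):
    return int(p * 100)  # treats the object as a 0..1 fraction


def fixed_callback(p):
    return int(p.percentage)  # reads the value the object already exposes


progress = DownloadProgress(downloaded_bytes=256, total_bytes=1024)

try:
    broken_callback(progress)
    broken_error = ""
except TypeError as exc:
    broken_error = str(exc)
```

Custom progress callbacks written against an older version should be updated the same way as `fixed_callback`.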
### "Enhancement service unavailable"
DeepSeek API connection issues.
**Check API Key:**
```bash
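# Verify API key is set (without printing the secret to the terminal)
[ -n "$DEEPSEEK_API_KEY" ] && echo "DEEPSEEK_API_KEY is set" || echo "DEEPSEEK_API_KEY is missing"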
# Test API connection
curl -H "Authorization: Bearer $DEEPSEEK_API_KEY" \
https://api.deepseek.com/v1/models
```
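The same presence check can be done from Python without revealing the full key, which is safer when output ends up in logs or bug reports. `mask_secret` and `describe_key` are illustrative helpers, not part of `src.config`.

```python
import os


def mask_secret(value, visible=4):
    """Return a redacted form that keeps only the last `visible` characters."""
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


def describe_key(name, environ=None):
    """Report whether `name` is set without revealing its full value."""
    env = os.environ if environ is None else environ
    value = env.get(name, "").strip()
    if not value:
        return f"{name}: missing"
    return f"{name}: set ({mask_secret(value)})"


if __name__ == "__main__":
    print(describe_key("DEEPSEEK_API_KEY"))
```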
**Fallback to v1:**
```bash
# Use v1 pipeline without enhancement
uv run python -m src.cli.main transcribe audio.mp3 --v1
```
### "Out of memory"
System running out of memory during batch processing.
**Solutions:**
```bash
# Reduce worker count
uv run python -m src.cli.main batch folder --workers 4
# Process smaller batches
uv run python -m src.cli.main batch folder --pattern "*.mp3" --workers 2
# Monitor memory usage
./scripts/tm_status.sh system
```
## Performance Issues
### "Transcription too slow"
Processing speed below expected performance.
**Optimization Steps:**
1. **Verify M3 optimization:**
```bash
sysctl -n machdep.cpu.brand_string
# Should show "Apple M3"
```
2. **Check memory available:**
```bash
vm_stat | grep "Pages free"
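# Prints a page count, not bytes; multiply by the page size on vm_stat's first line
# (16384 bytes on Apple Silicon). Aim for more than 2GB free.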
```
3. **Close memory-intensive apps:**
```bash
# Check top memory consumers
top -o MEM
```
4. **Use optimal worker count:**
```bash
# M3 optimized (default)
--workers 8
# Conservative for memory-constrained systems
--workers 4
```
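For step 2, `vm_stat` reports pages rather than bytes, so the conversion is easy to get wrong by hand. The sketch below parses the output and converts it; `free_bytes_from_vm_stat` is an illustrative helper and `SAMPLE` is made-up output in the usual `vm_stat` layout.

```python
import re


def free_bytes_from_vm_stat(output):
    """Convert the "Pages free" count in vm_stat output to bytes."""
    page_size = re.search(r"page size of (\d+) bytes", output)
    pages_free = re.search(r"Pages free:\s+(\d+)\.", output)
    if not page_size or not pages_free:
        raise ValueError("unrecognized vm_stat output")
    return int(page_size.group(1)) * int(pages_free.group(1))


# Made-up sample in the layout vm_stat normally prints.
SAMPLE = """Mach Virtual Memory Statistics: (page size of 16384 bytes)
Pages free:                              131072.
Pages active:                            400000.
"""

free_gib = free_bytes_from_vm_stat(SAMPLE) / 1024**3
```

macOS can also reclaim inactive pages, so the "Pages free" figure is a conservative estimate of what is available.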
### "High CPU usage"
System overloaded during processing.
**Solutions:**
```bash
# Limit CPU usage
nice -n 10 uv run python -m src.cli.main batch folder
# Reduce workers
uv run python -m src.cli.main batch folder --workers 4
# Process during off-hours
echo "uv run python -m src.cli.main batch folder" | at 2am
```
## Database Issues
### "Migration failed"
Alembic migration errors.
**Recovery Steps:**
```bash
# Check current revision
uv run alembic current
# Show migration history
uv run alembic history
# Downgrade to last working version
uv run alembic downgrade -1
# Re-run migration
uv run alembic upgrade head
```
### "Database lock error"
Queries blocked by locks held by idle or hung PostgreSQL sessions.
**Solutions:**
```bash
# Check active connections
psql -d trax_dev -c "SELECT pid, state, query FROM pg_stat_activity;"
# Kill hanging connections
psql -d trax_dev -c "SELECT pg_terminate_backend(pid) FROM pg_stat_activity WHERE state = 'idle in transaction';"
# Restart PostgreSQL
brew services restart postgresql@15
```
## Security Configuration
### API Key Management
**Secure Storage:**
API keys are encrypted and stored in `~/.trax/config.json` with 0600 permissions.
**Environment Variables:**
```bash
# Root project .env (inherited)
DEEPSEEK_API_KEY=sk-...
OPENAI_API_KEY=sk-...
# Local overrides (.env.local)
DEEPSEEK_API_KEY=sk-local-override...
```
**Key Validation:**
```python
from src.config import config
# Check available services
services = config.get_available_ai_services()
# Validate required keys
config.validate_required_keys(["DEEPSEEK_API_KEY"])
```
### File Access Permissions
**Allowed Directories:**
- `~/Documents` (read/write)
- `~/Downloads` (read/write)
- `~/.trax` (read/write)
- Project directory (read/write)
**Restricted Access:**
- System directories (`/System`, `/usr`)
- Other user directories
- Network mounted drives (unless explicitly allowed)
**File Permissions:**
```bash
# Secure config directory
chmod 700 ~/.trax
chmod 600 ~/.trax/config.json
# Secure project directory
chmod 755 ~/projects/trax
chmod 600 ~/projects/trax/.env.local  # holds API keys, so owner-only
```
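The permissions can be verified afterwards with a short script. This is a sketch: `is_owner_only` and `check_private` are illustrative helpers, and the check simply confirms that no group or other permission bits are set.

```python
import os
import stat


def is_owner_only(mode):
    """True when no group or other permission bits are set."""
    return (stat.S_IMODE(mode) & 0o077) == 0


def check_private(path):
    """Return (ok, octal_mode) for the file or directory at `path`."""
    mode = os.stat(os.path.expanduser(path)).st_mode
    return is_owner_only(mode), oct(stat.S_IMODE(mode))
```

For example, `check_private("~/.trax/config.json")` should report `0o600` once the `chmod` commands above have been applied.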
### Network Security
**Outbound Connections:**
- YouTube (metadata extraction, and media download when `--download` is used)
- DeepSeek API (enhancement service)
- OpenAI API (optional Whisper API)
**No Inbound Connections:**
Trax operates as a local-only application with no server component.
**Data Privacy:**
- Media files processed locally only
- No media files uploaded to cloud services; only requests to the APIs listed above leave the machine
- Transcripts stored in local PostgreSQL database
### Database Security
**Connection Security:**
```bash
# pg_hba.conf: local connections only (`trust` means no password check)
host all all 127.0.0.1/32 trust
host all all ::1/128 trust
# No remote connections allowed
```
**Data Encryption:**
- Database files encrypted at rest (FileVault on macOS)
- API keys encrypted with system keychain
- No plaintext storage of sensitive data
## Logging and Debugging
### Enable Debug Logging
```bash
# Set debug level
export TRAX_LOG_LEVEL=DEBUG
# Run with verbose output
uv run python -m src.cli.main transcribe audio.mp3 --verbose
```
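How `TRAX_LOG_LEVEL` is consumed internally is not documented here. As a sketch of one common approach, a script that wraps Trax could map the variable onto the standard `logging` levels like this (`resolve_log_level` and `configure_logging` are hypothetical names):

```python
import logging
import os


def resolve_log_level(environ=None, default="INFO"):
    """Map TRAX_LOG_LEVEL to a logging constant, falling back to `default`."""
    env = os.environ if environ is None else environ
    name = env.get("TRAX_LOG_LEVEL", default).strip().upper()
    level = getattr(logging, name, None)
    # Unknown names fall back to the default instead of raising.
    return level if isinstance(level, int) else getattr(logging, default)


def configure_logging(environ=None):
    """Configure the root logger from the environment."""
    logging.basicConfig(
        level=resolve_log_level(environ),
        format="%(asctime)s %(levelname)s %(name)s: %(message)s",
    )
```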
### Log Locations
```bash
# Application logs
tail -f logs/trax.log
# Database queries (if enabled)
tail -f logs/database.log
# Performance metrics
tail -f logs/performance.log
```
### Performance Profiling
```bash
# Profile transcription
uv run python -m cProfile -o profile.stats -m src.cli.main transcribe audio.mp3
# Analyze profile
python -c "import pstats; pstats.Stats('profile.stats').sort_stats('cumulative').print_stats(20)"
```
## Emergency Recovery
### Database Corruption
```bash
# Backup current database
BACKUP="backup_$(date +%Y%m%d).sql"
pg_dump trax_dev > "$BACKUP"
# Recreate database
dropdb trax_dev
createdb trax_dev
# Restore from the backup just taken
psql trax_dev < "$BACKUP"
# Run migrations
uv run alembic upgrade head
```
### Complete Reset
```bash
# Nuclear option: complete reset
rm -rf ~/.trax
dropdb trax_dev
createdb trax_dev
# Reinitialize
./scripts/setup_postgresql.sh
uv run alembic upgrade head
```
## Getting Help
### Check System Status
```bash
# Overall system health
./scripts/tm_status.sh overview
# Performance metrics
./scripts/tm_status.sh performance
# Recent errors
./scripts/tm_status.sh errors
```
### Collect Debug Information
```bash
# Generate debug report
./scripts/tm_status.sh debug > debug_report.txt
# System information
uname -a > system_info.txt
python --version >> system_info.txt
postgres --version >> system_info.txt
```
### Support Channels
1. **Check logs:** `logs/trax.log` for application errors
2. **Performance issues:** Run system diagnostics
3. **Database issues:** Check PostgreSQL logs
4. **API issues:** Verify API keys and network connectivity
**Contact Information:**
- Project Documentation: [README.md](../README.md)
- Architecture Details: [docs/architecture/](architecture/)
- Taskmaster Integration: [scripts/README_taskmaster_helpers.md](../scripts/README_taskmaster_helpers.md)