# CLI Command Reference
Complete reference for all Trax CLI commands with examples and options.
## Command Structure
Trax provides two CLI interfaces:
### Standard CLI
```bash
uv run python -m src.cli.main <command> [options] [arguments]
```
### Enhanced CLI (Recommended)
```bash
uv run python -m src.cli.enhanced_cli <command> [options] [arguments]
```
The enhanced CLI provides:
- **Real-time progress reporting** with Rich progress bars
- **Performance monitoring** (CPU, memory, temperature)
- **Intelligent batch processing** with concurrent execution
- **Enhanced error handling** with user-friendly guidance
- **Multiple export formats** (JSON, TXT, SRT, VTT)
- **Advanced features** (speaker diarization, domain adaptation)
## Enhanced CLI Commands
### Enhanced CLI Overview
The enhanced CLI (`src.cli.enhanced_cli`) provides a modern, feature-rich interface with real-time progress reporting and advanced capabilities.
**Key Features:**
- **Rich Progress Bars**: Real-time transcription progress with time estimates
- **Performance Monitoring**: Live CPU, memory, and temperature tracking
- **Intelligent Queuing**: Batch processing with size-based prioritization
- **Advanced Export**: Multiple formats including SRT and VTT subtitles
- **Error Guidance**: Helpful suggestions for common issues
- **Optional Features**: Speaker diarization and domain adaptation
### `transcribe <input>`
Enhanced single file transcription with progress reporting.
**Usage:**
```bash
uv run python -m src.cli.enhanced_cli transcribe input.wav
```
**Options:**
- `-o, --output PATH` - Output directory (default: current directory)
- `-f, --format [json|txt|srt|vtt]` - Output format (default: json)
- `-m, --model [tiny|base|small|medium|large]` - Model size (default: base)
- `-d, --device [cpu|cuda]` - Processing device (default: cpu)
- `--domain [general|technical|medical|academic]` - Domain adaptation
- `--diarize` - Enable speaker diarization
- `--speakers INTEGER` - Number of speakers (for diarization)
**Examples:**
```bash
# Basic transcription with progress bar
uv run python -m src.cli.enhanced_cli transcribe lecture.mp3
# Enhanced transcription with domain adaptation
uv run python -m src.cli.enhanced_cli transcribe medical_audio.wav --domain medical
# Speaker diarization with SRT output
uv run python -m src.cli.enhanced_cli transcribe interview.mp4 --diarize --speakers 2 -f srt
# High-quality transcription with large model
uv run python -m src.cli.enhanced_cli transcribe podcast.mp3 -m large -f vtt
```
### `batch <input>`
Enhanced batch processing with intelligent queuing and concurrent execution.
**Usage:**
```bash
uv run python -m src.cli.enhanced_cli batch /path/to/audio/files
```
**Options:**
- `-o, --output PATH` - Output directory (default: current directory)
- `-c, --concurrency INTEGER` - Number of concurrent processes (default: 4)
- `-f, --format [json|txt|srt|vtt]` - Output format (default: json)
- `-m, --model [tiny|base|small|medium|large]` - Model size (default: base)
- `-d, --device [cpu|cuda]` - Processing device (default: cpu)
- `--domain [general|technical|medical|academic]` - Domain adaptation
- `--diarize` - Enable speaker diarization
- `--speakers INTEGER` - Number of speakers (for diarization)
**Examples:**
```bash
# Batch process with 8 concurrent workers
uv run python -m src.cli.enhanced_cli batch ~/Podcasts -c 8
# Process with domain adaptation and speaker diarization
uv run python -m src.cli.enhanced_cli batch ~/Lectures --domain academic --diarize
# Conservative processing for memory-constrained systems
uv run python -m src.cli.enhanced_cli batch ~/Audio -c 2 -m small
# High-quality batch processing
uv run python -m src.cli.enhanced_cli batch ~/Interviews -m large -f srt --diarize --speakers 3
```
**Intelligent Queuing:**
The enhanced batch processor automatically:
- Sorts files by size (smaller files first for faster feedback)
- Monitors system resources in real-time
- Provides detailed progress for each file
- Handles errors gracefully without stopping the batch
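The queuing behaviour listed above can be sketched in a few lines of Python. This is an illustration only, not the Trax implementation: `order_by_size` and `run_batch` are hypothetical names, and the `process` callback stands in for whatever transcribes a single file.

```python
from pathlib import Path


def scan(directory):
    """Collect (path, size_in_bytes) pairs for the files in a directory."""
    return [(p, p.stat().st_size) for p in Path(directory).iterdir() if p.is_file()]


def order_by_size(files):
    """Return paths smallest-first so short files report results early."""
    return [path for path, _size in sorted(files, key=lambda item: item[1])]


def run_batch(paths, process):
    """Process every path; record failures instead of aborting the batch."""
    results, failures = {}, {}
    for path in paths:
        try:
            results[path] = process(path)
        except Exception as exc:  # keep going and report the failure at the end
            failures[path] = str(exc)
    return results, failures
```

Sorting before dispatch means the first results arrive quickly, and collecting failures in a separate mapping lets the batch finish before anything is reported.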
## Enhanced Progress Tracking Features
### Multi-Pass Pipeline Progress Visualization
When using the `--multi-pass` option, the CLI provides detailed progress tracking for each stage of the multi-pass transcription pipeline:
**Stage 1: Fast Transcription Pass**
- Real-time progress with confidence scoring
- Segment generation and quality assessment
- Low-confidence segment identification
**Stage 2: Refinement Pass**
- Progress tracking for low-confidence segments
- Audio slicing and re-transcription
- Quality improvement monitoring
**Stage 3: Enhancement Pass**
- Domain-specific enhancement progress
- Content optimization tracking
- Final quality validation
**Stage 4: Speaker Diarization (if enabled)**
- Parallel speaker identification
- Speaker count and segmentation progress
- Integration with transcription results
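The hand-off from Stage 1 to Stage 2 amounts to filtering segments by confidence. A minimal sketch, assuming a per-segment confidence score between 0 and 1; the `Segment` class, the function name, and the 0.80 cutoff are hypothetical and not taken from the Trax source:

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float       # seconds
    end: float         # seconds
    text: str
    confidence: float  # 0.0 to 1.0

# Hypothetical cutoff; the real pipeline may use a different value.
LOW_CONFIDENCE_THRESHOLD = 0.80


def select_for_refinement(segments, threshold=LOW_CONFIDENCE_THRESHOLD):
    """Return the segments a refinement pass would re-transcribe."""
    return [s for s in segments if s.confidence < threshold]
```

Only the selected segments need their audio sliced and re-transcribed, which keeps the refinement pass cheap when most of the first pass is already confident.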
### System Resource Monitoring
The enhanced CLI includes real-time system resource monitoring:
**CPU Usage Monitoring**
- Current and peak CPU utilization
- Performance warnings at 80%+ and 95%+ thresholds
- Processing optimization recommendations
**Memory Usage Tracking**
- Real-time memory consumption
- Peak memory usage during processing
- Memory optimization suggestions
**Disk and Network I/O**
- Storage usage monitoring
- Network activity tracking
- Performance bottleneck identification
**Temperature Monitoring**
- CPU temperature tracking (when available)
- Thermal throttling warnings
- Performance impact assessment
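The 80% and 95% CPU thresholds mentioned above suggest a three-level classification. The sketch below shows one way to express it; the function name and the level names are assumptions, and it takes the percentage as an argument rather than reading it from the system.

```python
WARNING_THRESHOLD = 80.0   # first warning level quoted in this document
CRITICAL_THRESHOLD = 95.0  # second warning level quoted in this document


def classify_cpu(percent):
    """Map a CPU utilization percentage to a hypothetical status level."""
    if percent >= CRITICAL_THRESHOLD:
        return "critical"
    if percent >= WARNING_THRESHOLD:
        return "warning"
    return "healthy"
```

For example, `classify_cpu(45.2)` falls in the healthy band, while a reading of 80 or above triggers the first warning.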
### Error Recovery and Export Progress
**Error Recovery Tracking**
- Automatic error detection and classification
- Recovery attempt progress monitoring
- Success/failure rate reporting
- User guidance for common issues
**Multi-Format Export Progress**
- Concurrent export to multiple formats
- Individual format progress tracking
- Export success rate monitoring
- Output file path reporting
### Progress Display Features
**Rich Visual Interface**
- Progress bars rendered with the Rich library
- Real-time stage and sub-stage updates
- Time remaining estimates
- Spinner animations for active operations
**Status Indicators**
- 🟢 Healthy resource usage
- 🟡 Moderate resource usage (warning)
- 🔴 High resource usage (critical)
- ✅ Completed operations
- ⚠️ Warnings and issues
- ❌ Errors and failures
**Progress Callbacks**
- Stage transition notifications
- Quality metric updates
- Performance benchmark reporting
- User guidance and tips
## Standard CLI Commands
### `youtube <url>`
Extract metadata from YouTube URLs without requiring API access.
**Usage:**
```bash
uv run python -m src.cli.main youtube "https://youtube.com/watch?v=VIDEO_ID"
```
**Options:**
- `--download` - Download media after metadata extraction
- `--queue` - Add to batch queue for processing
- `--json` - Output as JSON (default)
- `--txt` - Output as plain text
**Examples:**
```bash
# Extract metadata only (quote the URL so the shell does not treat "?" as a glob character)
uv run python -m src.cli.main youtube "https://youtube.com/watch?v=dQw4w9WgXcQ"
# Extract and download immediately ✅ WORKING
uv run python -m src.cli.main youtube "https://youtube.com/watch?v=dQw4w9WgXcQ" --download
# Plain text output
uv run python -m src.cli.main youtube "https://youtube.com/watch?v=dQw4w9WgXcQ" --txt
```
**Download Pipeline Status:** **FULLY FUNCTIONAL**
- Media download with progress tracking
- Automatic file format detection
- Downloaded files saved to `data/media/downloads/`
- File hash generation for integrity verification
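Integrity hashing of a downloaded file can be sketched as follows. The document does not say which hash algorithm Trax uses; SHA-256 and the function name `file_sha256` are assumptions made for illustration.

```python
import hashlib


def file_sha256(path, chunk_size=1024 * 1024):
    """Hash a file in fixed-size chunks so large media never loads fully into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps reading until read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the stored digest with a freshly computed one detects truncated or corrupted downloads.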
**Supported URL Formats:**
- `https://www.youtube.com/watch?v=VIDEO_ID`
- `https://youtu.be/VIDEO_ID`
- `https://www.youtube.com/watch?v=VIDEO_ID&t=123s`
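All three URL shapes above carry the same video ID, which can be recovered with the standard library alone. The helper below is a hedged sketch (the name `extract_video_id` is hypothetical), not the parser Trax ships:

```python
from urllib.parse import parse_qs, urlparse


def extract_video_id(url):
    """Return the video ID from a supported YouTube URL, or None if unrecognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the path: https://youtu.be/VIDEO_ID
        return parsed.path.lstrip("/") or None
    if host in {"youtube.com", "www.youtube.com"}:
        # Watch links carry the ID in the "v" query parameter; extras like "t" are ignored.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```

Parsing the query string rather than matching on text keeps the `&t=123s` variant from leaking into the ID.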
### `batch-urls <file>`
Process multiple YouTube URLs from a text file.
**Usage:**
```bash
uv run python -m src.cli.main batch-urls urls.txt
```
**File Format:**
```
https://youtube.com/watch?v=video1
https://youtube.com/watch?v=video2
https://youtu.be/video3
```
**Options:**
- `--download` - Download all media after metadata extraction
- `--queue` - Add all to batch processing queue
- `--workers <n>` - Number of parallel workers (default: 4)
**Examples:**
```bash
# Process URLs file
uv run python -m src.cli.main batch-urls my_videos.txt
# Process and download with parallel processing ✅ WORKING
uv run python -m src.cli.main batch-urls my_videos.txt --download
# Download with text output format
uv run python -m src.cli.main batch-urls my_videos.txt --download --txt
```
**Batch Download Status:** **FULLY FUNCTIONAL**
- Parallel processing of multiple URLs
- Progress tracking for each download
- Comprehensive success/failure reporting
- Automatic error handling and retry logic
### `transcribe <file>`
Transcribe a single audio or video file.
**Usage:**
```bash
uv run python -m src.cli.main transcribe path/to/audio.mp3
```
**Options:**
- `--v1` - Use v1 pipeline (Whisper only, default)
- `--v2` - Use v2 pipeline (Whisper + DeepSeek enhancement)
- `--json` - Output as JSON (default)
- `--txt` - Output as plain text
- `--min-accuracy <percent>` - Minimum accuracy threshold (default: 80%)
**Supported Formats:**
- Audio: MP3, WAV, M4A, FLAC, OGG
- Video: MP4, AVI, MOV, MKV, WEBM
**Examples:**
```bash
# Basic transcription (v1 pipeline)
uv run python -m src.cli.main transcribe lecture.mp3
# Enhanced transcription (v2 pipeline)
uv run python -m src.cli.main transcribe podcast.mp4 --v2
# Plain text output with accuracy threshold
uv run python -m src.cli.main transcribe audio.wav --txt --min-accuracy 90
```
### `batch <folder>`
Batch process multiple audio/video files in a directory.
**Usage:**
```bash
uv run python -m src.cli.main batch /path/to/audio/files
```
**Options:**
- `--v1` - Use v1 pipeline (default)
- `--v2` - Use v2 pipeline with enhancement
- `--workers <n>` - Number of parallel workers (default: 8)
- `--min-accuracy <percent>` - Minimum accuracy threshold (default: 80%)
- `--recursive` - Process subdirectories recursively
- `--pattern <glob>` - File pattern to match (e.g., "*.mp3")
**Examples:**
```bash
# Process all audio files with 8 workers
uv run python -m src.cli.main batch /Users/me/podcasts
# Enhanced processing with custom settings
uv run python -m src.cli.main batch /Users/me/lectures --v2 --workers 4 --min-accuracy 95
# Process only MP3 files recursively
uv run python -m src.cli.main batch /Users/me/audio --recursive --pattern "*.mp3"
```
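The interaction between `--recursive` and `--pattern` can be pictured with `pathlib`. This is a sketch under the assumption that the pattern is an ordinary glob applied to file names; `find_media` is a hypothetical helper, not a Trax function.

```python
from pathlib import Path


def find_media(folder, pattern="*", recursive=False):
    """List files matching a glob, optionally descending into subdirectories."""
    root = Path(folder)
    # rglob applies the pattern at every directory level; glob only at the top level.
    matches = root.rglob(pattern) if recursive else root.glob(pattern)
    return sorted(p for p in matches if p.is_file())
```

With `pattern="*.mp3"` and `recursive=True`, this would pick up MP3 files in nested folders as well as the top level, mirroring the last example above.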
## Enhanced CLI Features
### Real-Time Performance Monitoring
The enhanced CLI provides live system monitoring during processing:
```bash
# Performance stats are displayed automatically
CPU: 45.2% | Memory: 2.1GB/8GB (26%) | Temp: 65°C
```
**Monitored Metrics:**
- **CPU Usage**: Real-time CPU utilization percentage
- **Memory Usage**: Current and total memory with percentage
- **Temperature**: CPU temperature monitoring (when available)
- **Processing Speed**: Time estimates and completion percentages
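For readers who want to reproduce the status line in their own tooling, the formatting can be sketched as below. The function name and argument names are hypothetical; the temperature is optional because it is only reported when available.

```python
def format_stats(cpu_percent, used_gb, total_gb, temp_c=None):
    """Build a one-line resource summary in the style shown above."""
    memory_percent = used_gb / total_gb * 100
    parts = [
        f"CPU: {cpu_percent:.1f}%",
        f"Memory: {used_gb:.1f}GB/{total_gb:g}GB ({memory_percent:.0f}%)",
    ]
    if temp_c is not None:
        # Temperature sensors are not exposed on every platform.
        parts.append(f"Temp: {temp_c:.0f}°C")
    return " | ".join(parts)
```

Calling `format_stats(45.2, 2.1, 8, 65)` yields the sample line shown in this section.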
### Enhanced Error Handling
The enhanced CLI provides intelligent error guidance:
```bash
# Memory error with helpful suggestions
❌ Memory error. Try using a smaller model with --model small or reduce concurrency.
# File not found with guidance
❌ File not found: lecture.mp3
💡 Check that the input file path is correct and the file exists.
# GPU error with alternatives
❌ CUDA out of memory
💡 GPU-related error. Try using --device cpu instead.
```
**Error Categories:**
- **File Errors**: Path validation and existence checks
- **Memory Errors**: Model size and concurrency suggestions
- **GPU Errors**: Device fallback recommendations
- **Permission Errors**: File access guidance
- **Generic Errors**: General troubleshooting tips
## Performance Guidelines
### Enhanced CLI Optimization
- **Default Concurrency:** 4 (balanced for most systems)
- **Memory Usage:** <2GB per pipeline
- **Processing Speed:** <30s for 5-minute audio (v1)
- **Real-time Factor:** <0.1 (processing time divided by audio duration, so at least 10x faster than real time)
- **Progress Updates:** Every 2-5 seconds
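The speed and real-time-factor targets above describe the same budget: 30 seconds of processing for 5 minutes (300 seconds) of audio is a factor of 30 / 300 = 0.1. A small helper (hypothetical name) makes the relationship explicit:

```python
def real_time_factor(processing_seconds, audio_seconds):
    """Processing time divided by audio duration; lower is faster."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds
```

A run that transcribes a 5-minute file in 20 seconds has a factor of about 0.067 and meets the target; one that takes 45 seconds has a factor of 0.15 and does not.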
### M3 MacBook Optimization
- **Default Workers:** 8 (optimal for M3 chip)
- **Memory Usage:** <2GB per pipeline
- **Processing Speed:** <30s for 5-minute audio (v1)
- **Real-time Factor:** <0.1 (processing time divided by audio duration, so at least 10x faster than real time)
### Worker Configuration
```bash
# Conservative (low memory)
--workers 4
# Balanced (default)
--workers 8
# Aggressive (high-end M3)
--workers 12
```
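One way to choose between these settings is to cap the worker count by available memory, using the roughly 2 GB per pipeline figure quoted above. The heuristic below is an assumption for illustration, not a rule Trax applies; `suggest_workers` and its parameters are hypothetical.

```python
import os

MEMORY_PER_PIPELINE_GB = 2  # upper bound per pipeline quoted in this document


def suggest_workers(available_memory_gb, cpu_count=None, ceiling=12):
    """Pick a worker count limited by memory, CPU cores, and a fixed ceiling."""
    cpu_count = cpu_count or os.cpu_count() or 1
    by_memory = int(available_memory_gb // MEMORY_PER_PIPELINE_GB)
    # Always allow at least one worker, even on very small machines.
    return max(1, min(by_memory, cpu_count, ceiling))
```

With 8 GB free and 8 cores this suggests 4 workers, matching the conservative setting; with 32 GB free it suggests 8.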
## Output Formats
### Enhanced CLI Formats
The enhanced CLI supports multiple output formats:
#### JSON Output (Default)
```json
{
  "text_content": "Never gonna give you up...",
  "segments": [
    {
      "start": 0.0,
      "end": 2.5,
      "text": "Never gonna give you up"
    }
  ],
  "confidence": 0.95,
  "processing_time": 5.2
}
```
#### Text Output
```
Never gonna give you up
Never gonna let you down
Never gonna run around and desert you
...
```
#### SRT Subtitles
```
1
00:00:00,000 --> 00:00:02,500
Never gonna give you up

2
00:00:02,500 --> 00:00:05,000
Never gonna let you down
```
#### VTT Subtitles
```
WEBVTT

00:00:00.000 --> 00:00:02.500
Never gonna give you up

00:00:02.500 --> 00:00:05.000
Never gonna let you down
```
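Both subtitle formats can be derived from the `segments` list in the JSON output. The sketch below assumes each segment is a dict with `start`, `end` (seconds), and `text`, as in the JSON example; the function names are hypothetical and this is not the Trax exporter. SRT uses a comma before the milliseconds and numbers each cue, VTT uses a dot and starts with a `WEBVTT` header, and both separate cues with a blank line.

```python
def format_timestamp(seconds, separator):
    """Render seconds as HH:MM:SS<sep>mmm using integer millisecond arithmetic."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(segments):
    """Numbered cues, comma before milliseconds, blank line between cues."""
    cues = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"], ",")
        end = format_timestamp(seg["end"], ",")
        cues.append(f"{index}\n{start} --> {end}\n{seg['text']}")
    return "\n\n".join(cues) + "\n"


def to_vtt(segments):
    """WEBVTT header, dot before milliseconds, blank line between cues."""
    cues = []
    for seg in segments:
        start = format_timestamp(seg["start"], ".")
        end = format_timestamp(seg["end"], ".")
        cues.append(f"{start} --> {end}\n{seg['text']}")
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"
```

Converting to whole milliseconds before splitting into fields avoids rounding a value such as 59.9996 seconds up to an invalid `60` in the seconds position.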
### Standard CLI Formats
#### JSON Output (Default)
```json
{
  "youtube_id": "dQw4w9WgXcQ",
  "title": "Rick Astley - Never Gonna Give You Up",
  "channel": "Rick Astley",
  "duration_seconds": 212,
  "transcript": {
    "text": "Never gonna give you up...",
    "segments": [...],
    "confidence": 0.95
  }
}
```
#### Text Output
```
Title: Rick Astley - Never Gonna Give You Up
Channel: Rick Astley
Duration: 3:32
Transcript:
Never gonna give you up
Never gonna let you down
...
```
## Common Workflows
### Enhanced CLI Workflows
#### Research Workflow (Enhanced)
```bash
# 1. Extract metadata for the URLs listed in research_videos.txt
uv run python -m src.cli.main batch-urls research_videos.txt
# 2. Download selected videos
uv run python -m src.cli.main youtube "https://youtube.com/watch?v=interesting" --download
# 3. Enhanced transcription with progress monitoring
uv run python -m src.cli.enhanced_cli transcribe downloaded_video.mp4 -m large --domain academic
# 4. Batch process with intelligent queuing
uv run python -m src.cli.enhanced_cli batch ~/Downloads/research_audio -c 6 -f srt
```
#### Academic Lecture Processing
```bash
# Process academic lectures with domain adaptation
uv run python -m src.cli.enhanced_cli batch ~/Lectures \
--domain academic \
-m large \
-f srt \
-c 4 \
--diarize \
--speakers 1
```
#### Podcast Production
```bash
# High-quality podcast transcription with speaker diarization
uv run python -m src.cli.enhanced_cli batch ~/Podcasts \
-m large \
-f vtt \
--diarize \
--speakers 3 \
-c 2
```
### Standard CLI Workflows
#### Research Workflow ✅ FUNCTIONAL
```bash
# 1. Extract metadata for the URLs listed in research_videos.txt
uv run python -m src.cli.main batch-urls research_videos.txt
# 2. Download selected videos ✅ WORKING
uv run python -m src.cli.main youtube "https://youtube.com/watch?v=interesting" --download
# 3. Transcribe downloaded media
uv run python -m src.cli.main transcribe data/media/downloads/video.m4a --v2
# 4. Batch process entire folder
uv run python -m src.cli.main batch data/media/downloads --v2
```
**Complete Pipeline Status:**
- **YouTube metadata extraction** - Working
- **Media download** - Working with progress tracking
- 🚧 **Transcription** - Ready for implementation
- 🚧 **Batch processing** - Ready for implementation
### Podcast Processing
```bash
# Process entire podcast folder with high accuracy
uv run python -m src.cli.main batch ~/Podcasts --v2 --min-accuracy 95 --workers 6
```
### Academic Lectures
```bash
# Conservative processing for complex academic content
uv run python -m src.cli.main batch ~/Lectures --v2 --workers 4 --min-accuracy 99
```
## Error Handling
Commands automatically handle common errors:
- **Network timeouts** - Automatic retry with exponential backoff
- **File format issues** - Automatic conversion to supported formats
- **Memory limits** - Automatic chunking for large files
- **API rate limits** - Automatic throttling and retry
For troubleshooting specific errors, see [TROUBLESHOOTING.md](TROUBLESHOOTING.md).
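The retry behaviour listed above for network timeouts can be sketched as follows. The attempt count, base delay, and function name are hypothetical; the document does not state the values Trax uses.

```python
import time


def retry_with_backoff(operation, max_attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call operation(), doubling the wait after each timeout (1s, 2s, 4s, ...)."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the original error
            sleep(base_delay * 2 ** attempt)
```

Passing `sleep` as a parameter keeps the helper easy to test without real delays. Production code would typically add random jitter to the delay so that parallel workers do not retry in lockstep.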
## Integration with Taskmaster
All CLI operations can be tracked using Taskmaster:
```bash
# Create task for batch processing
./scripts/tm_master.sh add "Process podcast archive with v2 pipeline"
# Track progress
./scripts/tm_workflow.sh update 15 "Processed 50 files, 10 remaining"
# Mark complete
./scripts/tm_master.sh done 15
```
See [Taskmaster Helper Scripts](../scripts/README_taskmaster_helpers.md) for complete integration guide.