# CLI Command Reference

Complete reference for all Trax CLI commands with examples and options.

## Command Structure

Trax provides two command-line interfaces:

### Standard CLI
```bash
uv run python -m src.cli.main <command> [options] [arguments]
```

### Enhanced CLI (Recommended)
```bash
uv run python -m src.cli.enhanced_cli <command> [options] [arguments]
```

The enhanced CLI provides:
- **Real-time progress reporting** with Rich progress bars
- **Performance monitoring** (CPU, memory, temperature)
- **Intelligent batch processing** with concurrent execution
- **Enhanced error handling** with user-friendly guidance
- **Multiple export formats** (JSON, TXT, SRT, VTT)
- **Advanced features** (speaker diarization, domain adaptation)

## Enhanced CLI Commands

### Enhanced CLI Overview

The enhanced CLI (`src.cli.enhanced_cli`) offers `transcribe` and `batch` commands with real-time progress reporting, system resource monitoring, and subtitle export.

**Key Features:**
- **Rich Progress Bars**: Real-time transcription progress with time estimates
- **Performance Monitoring**: Live CPU, memory, and temperature tracking
- **Intelligent Queuing**: Batch processing with size-based prioritization
- **Advanced Export**: Multiple formats including SRT and VTT subtitles
- **Error Guidance**: Helpful suggestions for common issues
- **Optional Features**: Speaker diarization and domain adaptation

### `transcribe <input>`
Enhanced single-file transcription with progress reporting.

**Usage:**
```bash
uv run python -m src.cli.enhanced_cli transcribe input.wav
```

**Options:**
- `-o, --output PATH` - Output directory (default: current directory)
- `-f, --format [json|txt|srt|vtt]` - Output format (default: json)
- `-m, --model [tiny|base|small|medium|large]` - Model size (default: base)
- `-d, --device [cpu|cuda]` - Processing device (default: cpu)
- `--domain [general|technical|medical|academic]` - Domain adaptation
- `--diarize` - Enable speaker diarization
- `--speakers INTEGER` - Number of speakers (for diarization)

**Examples:**
```bash
# Basic transcription with progress bar
uv run python -m src.cli.enhanced_cli transcribe lecture.mp3

# Enhanced transcription with domain adaptation
uv run python -m src.cli.enhanced_cli transcribe medical_audio.wav --domain medical

# Speaker diarization with SRT output
uv run python -m src.cli.enhanced_cli transcribe interview.mp4 --diarize --speakers 2 -f srt

# High-quality transcription with large model
uv run python -m src.cli.enhanced_cli transcribe podcast.mp3 -m large -f vtt
```

### `batch <input>`
Enhanced batch processing with intelligent queuing and concurrent execution.

**Usage:**
```bash
uv run python -m src.cli.enhanced_cli batch /path/to/audio/files
```

**Options:**
- `-o, --output PATH` - Output directory (default: current directory)
- `-c, --concurrency INTEGER` - Number of concurrent processes (default: 4)
- `-f, --format [json|txt|srt|vtt]` - Output format (default: json)
- `-m, --model [tiny|base|small|medium|large]` - Model size (default: base)
- `-d, --device [cpu|cuda]` - Processing device (default: cpu)
- `--domain [general|technical|medical|academic]` - Domain adaptation
- `--diarize` - Enable speaker diarization
- `--speakers INTEGER` - Number of speakers (for diarization)

**Examples:**
```bash
# Batch process with 8 concurrent workers
uv run python -m src.cli.enhanced_cli batch ~/Podcasts -c 8

# Process with domain adaptation and speaker diarization
uv run python -m src.cli.enhanced_cli batch ~/Lectures --domain academic --diarize

# Conservative processing for memory-constrained systems
uv run python -m src.cli.enhanced_cli batch ~/Audio -c 2 -m small

# High-quality batch processing
uv run python -m src.cli.enhanced_cli batch ~/Interviews -m large -f srt --diarize --speakers 3
```

**Intelligent Queuing:**
The enhanced batch processor automatically:
- Sorts files by size (smaller files first for faster feedback)
- Monitors system resources in real-time
- Provides detailed progress for each file
- Handles errors gracefully without stopping the batch
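
The queuing behavior listed above can be pictured with a small sketch. This is not the Trax implementation: `order_by_size`, `run_batch`, and the `process` callback are hypothetical names, and a thread pool is assumed only for illustration.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable


def order_by_size(paths: list[Path]) -> list[Path]:
    """Return paths sorted smallest-first so short files report back sooner."""
    return sorted(paths, key=lambda p: p.stat().st_size)


def run_batch(
    paths: list[Path],
    process: Callable[[Path], str],
    concurrency: int = 4,  # mirrors the documented default for -c
) -> tuple[dict[Path, str], dict[Path, Exception]]:
    """Process files concurrently and collect failures instead of aborting."""
    results: dict[Path, str] = {}
    errors: dict[Path, Exception] = {}
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        # Submit in size order; the pool starts the smallest files first.
        futures = {pool.submit(process, p): p for p in order_by_size(paths)}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:  # one bad file must not stop the batch
                errors[path] = exc
    return results, errors
```

Returning the failures alongside the results lets the caller print a per-file summary once the whole batch has finished.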

## Enhanced Progress Tracking Features

### Multi-Pass Pipeline Progress Visualization

When using the `--multi-pass` option, the CLI provides detailed progress tracking for each stage of the multi-pass transcription pipeline:

**Stage 1: Fast Transcription Pass**
- Real-time progress with confidence scoring
- Segment generation and quality assessment
- Low-confidence segment identification

**Stage 2: Refinement Pass**
- Progress tracking for low-confidence segments
- Audio slicing and re-transcription
- Quality improvement monitoring

**Stage 3: Enhancement Pass**
- Domain-specific enhancement progress
- Content optimization tracking
- Final quality validation

**Stage 4: Speaker Diarization (if enabled)**
- Parallel speaker identification
- Speaker count and segmentation progress
- Integration with transcription results
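
As a rough illustration of how Stage 1 might hand work to Stage 2, the sketch below filters segments by confidence. The `Segment` class, the `needs_refinement` helper, and the 0.8 threshold are assumptions made for this example and are not taken from the Trax source.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    start: float       # seconds
    end: float         # seconds
    text: str
    confidence: float  # 0.0 to 1.0


def needs_refinement(segments: list[Segment], threshold: float = 0.8) -> list[Segment]:
    """Return the segments whose confidence is below the assumed threshold."""
    return [s for s in segments if s.confidence < threshold]


segments = [
    Segment(0.0, 2.5, "Never gonna give you up", 0.95),
    Segment(2.5, 5.0, "Never gonna let you down", 0.62),
]
for seg in needs_refinement(segments):
    # Only this time range would be sliced out and re-transcribed.
    print(f"refine {seg.start:.1f}s to {seg.end:.1f}s")
```

Only the flagged time ranges need to be re-transcribed, so the refinement pass does not repeat work on segments that were already confident.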

### System Resource Monitoring

The enhanced CLI includes real-time system resource monitoring:

**CPU Usage Monitoring**
- Current and peak CPU utilization
- Performance warnings at 80%+ and 95%+ thresholds
- Processing optimization recommendations

**Memory Usage Tracking**
- Real-time memory consumption
- Peak memory usage during processing
- Memory optimization suggestions

**Disk and Network I/O**
- Storage usage monitoring
- Network activity tracking
- Performance bottleneck identification

**Temperature Monitoring**
- CPU temperature tracking (when available)
- Thermal throttling warnings
- Performance impact assessment
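
One way to read the 80% and 95% CPU thresholds listed above is as a three-level classification. The function name `cpu_status` and the level names are hypothetical; the CLI's actual mapping may differ.

```python
def cpu_status(percent: float) -> str:
    """Classify CPU utilization using the documented 80% and 95% thresholds."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError(f"percent must be between 0 and 100, got {percent}")
    if percent >= 95.0:
        return "critical"
    if percent >= 80.0:
        return "warning"
    return "healthy"


for sample in (45.2, 80.0, 97.5):
    print(sample, cpu_status(sample))
```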

### Error Recovery and Export Progress

**Error Recovery Tracking**
- Automatic error detection and classification
- Recovery attempt progress monitoring
- Success/failure rate reporting
- User guidance for common issues

**Multi-Format Export Progress**
- Concurrent export to multiple formats
- Individual format progress tracking
- Export success rate monitoring
- Output file path reporting

### Progress Display Features

**Rich Visual Interface**
- Progress bars rendered with the Rich library
- Real-time stage and sub-stage updates
- Time remaining estimates
- Spinner animations for active operations

**Status Indicators**
- 🟢 Healthy resource usage
- 🟡 Moderate resource usage (warning)
- 🔴 High resource usage (critical)
- ✅ Completed operations
- ⚠️ Warnings and issues
- ❌ Errors and failures

**Progress Callbacks**
- Stage transition notifications
- Quality metric updates
- Performance benchmark reporting
- User guidance and tips

## Standard CLI Commands

### `youtube <url>`
Extract metadata from YouTube URLs without requiring API access.

**Usage:**
```bash
uv run python -m src.cli.main youtube https://youtube.com/watch?v=VIDEO_ID
```

**Options:**
- `--download` - Download media after metadata extraction
- `--queue` - Add to batch queue for processing
- `--json` - Output as JSON (default)
- `--txt` - Output as plain text

**Examples:**
```bash
# Extract metadata only
uv run python -m src.cli.main youtube https://youtube.com/watch?v=dQw4w9WgXcQ

# Extract and download immediately ✅ WORKING
uv run python -m src.cli.main youtube https://youtube.com/watch?v=dQw4w9WgXcQ --download

# Plain text output
uv run python -m src.cli.main youtube https://youtube.com/watch?v=dQw4w9WgXcQ --txt
```

**Download Pipeline Status:** ✅ **FULLY FUNCTIONAL**
- Media download with progress tracking
- Automatic file format detection
- Downloaded files saved to `data/media/downloads/`
- File hash generation for integrity verification

**Supported URL Formats:**
- `https://www.youtube.com/watch?v=VIDEO_ID`
- `https://youtu.be/VIDEO_ID`
- `https://www.youtube.com/watch?v=VIDEO_ID&t=123s`
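
All three formats carry the same video ID. The sketch below shows one way to pull it out with the standard library. `extract_video_id` is a hypothetical helper, not the parser Trax uses, and it also accepts the bare `youtube.com` host that appears in the examples.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a supported YouTube URL, or None if unrecognized."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links keep the ID in the path: https://youtu.be/VIDEO_ID
        return parsed.path.lstrip("/") or None
    if host in {"youtube.com", "www.youtube.com"} and parsed.path == "/watch":
        # Watch links keep the ID in the v= parameter; extras such as t= are ignored.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```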

### `batch-urls <file>`
Process multiple YouTube URLs from a text file.

**Usage:**
```bash
uv run python -m src.cli.main batch-urls urls.txt
```

**File Format:**
```
https://youtube.com/watch?v=video1
https://youtube.com/watch?v=video2
https://youtu.be/video3
```

**Options:**
- `--download` - Download all media after metadata extraction
- `--queue` - Add all to batch processing queue
- `--workers <n>` - Number of parallel workers (default: 4)

**Examples:**
```bash
# Process URLs file
uv run python -m src.cli.main batch-urls my_videos.txt

# Process and download with parallel processing ✅ WORKING
uv run python -m src.cli.main batch-urls my_videos.txt --download

# Download with text output format
uv run python -m src.cli.main batch-urls my_videos.txt --download --txt
```

**Batch Download Status:** ✅ **FULLY FUNCTIONAL**
- Parallel processing of multiple URLs
- Progress tracking for each download
- Comprehensive success/failure reporting
- Automatic error handling and retry logic

### `transcribe <file>`
Transcribe a single audio or video file.

**Usage:**
```bash
uv run python -m src.cli.main transcribe path/to/audio.mp3
```

**Options:**
- `--v1` - Use v1 pipeline (Whisper only, default)
- `--v2` - Use v2 pipeline (Whisper + DeepSeek enhancement)
- `--json` - Output as JSON (default)
- `--txt` - Output as plain text
- `--min-accuracy <percent>` - Minimum accuracy threshold (default: 80%)

**Supported Formats:**
- Audio: MP3, WAV, M4A, FLAC, OGG
- Video: MP4, AVI, MOV, MKV, WEBM

**Examples:**
```bash
# Basic transcription (v1 pipeline)
uv run python -m src.cli.main transcribe lecture.mp3

# Enhanced transcription (v2 pipeline)
uv run python -m src.cli.main transcribe podcast.mp4 --v2

# Plain text output with accuracy threshold
uv run python -m src.cli.main transcribe audio.wav --txt --min-accuracy 90
```

### `batch <folder>`
Batch process multiple audio/video files in a directory.

**Usage:**
```bash
uv run python -m src.cli.main batch /path/to/audio/files
```

**Options:**
- `--v1` - Use v1 pipeline (default)
- `--v2` - Use v2 pipeline with enhancement
- `--workers <n>` - Number of parallel workers (default: 8)
- `--min-accuracy <percent>` - Minimum accuracy threshold (default: 80%)
- `--recursive` - Process subdirectories recursively
- `--pattern <glob>` - File pattern to match (e.g., "*.mp3")

**Examples:**
```bash
# Process all audio files with 8 workers
uv run python -m src.cli.main batch /Users/me/podcasts

# Enhanced processing with custom settings
uv run python -m src.cli.main batch /Users/me/lectures --v2 --workers 4 --min-accuracy 95

# Process only MP3 files recursively
uv run python -m src.cli.main batch /Users/me/audio --recursive --pattern "*.mp3"
```
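
The interaction between `--recursive` and `--pattern` can be approximated with `pathlib`. This is a sketch under the assumption that the pattern is a plain glob; `find_media` is a hypothetical name, and the real command may apply further filtering by file type.

```python
from pathlib import Path


def find_media(folder: Path, pattern: str = "*", recursive: bool = False) -> list[Path]:
    """List files matching a glob pattern, optionally descending into subfolders."""
    matches = folder.rglob(pattern) if recursive else folder.glob(pattern)
    return sorted(p for p in matches if p.is_file())
```

With `recursive=False` only the top-level folder is searched, so a file such as `season1/ep1.mp3` is found only when recursion is enabled.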

## Enhanced CLI Features

### Real-Time Performance Monitoring

The enhanced CLI provides live system monitoring during processing:

```
# Performance stats are displayed automatically
CPU: 45.2% | Memory: 2.1GB/8GB (26%) | Temp: 65°C
```

**Monitored Metrics:**
- **CPU Usage**: Real-time CPU utilization percentage
- **Memory Usage**: Current and total memory with percentage
- **Temperature**: CPU temperature monitoring (when available)
- **Processing Speed**: Time estimates and completion percentages

### Enhanced Error Handling

The enhanced CLI provides intelligent error guidance:

```
# Memory error with helpful suggestions
❌ Memory error. Try using a smaller model with --model small or reduce concurrency.

# File not found with guidance
❌ File not found: lecture.mp3
💡 Check that the input file path is correct and the file exists.

# GPU error with alternatives
❌ CUDA out of memory
💡 GPU-related error. Try using --device cpu instead.
```

**Error Categories:**
- **File Errors**: Path validation and existence checks
- **Memory Errors**: Model size and concurrency suggestions
- **GPU Errors**: Device fallback recommendations
- **Permission Errors**: File access guidance
- **Generic Errors**: General troubleshooting tips
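
The categories above could be implemented as a simple dispatch on exception type. The sketch below is illustrative only: `guidance_for` is a hypothetical function, the file, memory, and GPU hints reuse the wording from the sample output, and the permission and generic hints are invented for this example.

```python
def guidance_for(exc: Exception) -> str:
    """Map an exception to a user-facing hint, checking specific cases first."""
    if isinstance(exc, FileNotFoundError):
        return "Check that the input file path is correct and the file exists."
    if isinstance(exc, PermissionError):
        # Hypothetical wording; the source does not show the permission message.
        return "Check read and write permissions for the input and output paths."
    if isinstance(exc, MemoryError):
        return "Try using a smaller model with --model small or reduce concurrency."
    if "cuda" in str(exc).lower():
        return "GPU-related error. Try using --device cpu instead."
    # Hypothetical fallback for anything not recognized above.
    return "Re-run the command and check the error details."


print(guidance_for(RuntimeError("CUDA out of memory")))
```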

## Performance Guidelines

### Enhanced CLI Optimization
- **Default Concurrency:** 4 (balanced for most systems)
- **Memory Usage:** <2GB per pipeline
- **Processing Speed:** <30s for 5-minute audio (v1)
- **Real-time Factor:** <0.1 (much faster than real-time)
- **Progress Updates:** Every 2-5 seconds

### M3 MacBook Optimization
- **Default Workers:** 8 (optimal for M3 chip)
- **Memory Usage:** <2GB per pipeline
- **Processing Speed:** <30s for 5-minute audio (v1)
- **Real-time Factor:** <0.1 (much faster than real-time)
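
The speed and real-time factor targets express the same limit, assuming the usual definition of real-time factor as processing time divided by audio duration: 30 seconds of processing for 300 seconds of audio gives 0.1. The helper name below is hypothetical.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# 5-minute audio processed in 30 seconds sits exactly at the 0.1 target.
print(real_time_factor(30, 5 * 60))
```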

### Worker Configuration
```bash
# Conservative (low memory)
--workers 4

# Balanced (default)
--workers 8

# Aggressive (high-end M3)
--workers 12
```

## Output Formats

### Enhanced CLI Formats

The enhanced CLI supports multiple output formats:

#### JSON Output (Default)
```json
{
  "text_content": "Never gonna give you up...",
  "segments": [
    {
      "start": 0.0,
      "end": 2.5,
      "text": "Never gonna give you up"
    }
  ],
  "confidence": 0.95,
  "processing_time": 5.2
}
```

#### Text Output
```
Never gonna give you up
Never gonna let you down
Never gonna run around and desert you
...
```

#### SRT Subtitles
```
1
00:00:00,000 --> 00:00:02,500
Never gonna give you up

2
00:00:02,500 --> 00:00:05,000
Never gonna let you down
```

#### VTT Subtitles
```
WEBVTT

00:00:00.000 --> 00:00:02.500
Never gonna give you up

00:00:02.500 --> 00:00:05.000
Never gonna let you down
```
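
The SRT and VTT samples differ mainly in the millisecond separator (comma versus period), the numbered cues, and the `WEBVTT` header. A minimal sketch of the conversion from the JSON `segments` list follows; `format_timestamp` and `to_srt` are hypothetical helpers, not the exporter Trax ships.

```python
def format_timestamp(seconds: float, separator: str = ",") -> str:
    """Format seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT with separator='.')."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def to_srt(segments: list[dict]) -> str:
    """Render segments with 'start', 'end', and 'text' keys as numbered SRT cues."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}")
    return "\n\n".join(blocks) + "\n"


print(to_srt([{"start": 0.0, "end": 2.5, "text": "Never gonna give you up"}]))
```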

### Standard CLI Formats

#### JSON Output (Default)
```json
{
  "youtube_id": "dQw4w9WgXcQ",
  "title": "Rick Astley - Never Gonna Give You Up",
  "channel": "Rick Astley",
  "duration_seconds": 212,
  "transcript": {
    "text": "Never gonna give you up...",
    "segments": [...],
    "confidence": 0.95
  }
}
```

#### Text Output
```
Title: Rick Astley - Never Gonna Give You Up
Channel: Rick Astley
Duration: 3:32

Transcript:
Never gonna give you up
Never gonna let you down
...
```

## Common Workflows

### Enhanced CLI Workflows

#### Research Workflow (Enhanced)
```bash
# 1. Extract metadata for each URL listed in research_videos.txt
uv run python -m src.cli.main batch-urls research_videos.txt

# 2. Download selected videos
uv run python -m src.cli.main youtube https://youtube.com/watch?v=interesting --download

# 3. Enhanced transcription with progress monitoring
uv run python -m src.cli.enhanced_cli transcribe downloaded_video.mp4 -m large --domain academic

# 4. Batch process with intelligent queuing
uv run python -m src.cli.enhanced_cli batch ~/Downloads/research_audio -c 6 -f srt
```

#### Academic Lecture Processing
```bash
# Process academic lectures with domain adaptation
uv run python -m src.cli.enhanced_cli batch ~/Lectures \
  --domain academic \
  -m large \
  -f srt \
  -c 4 \
  --diarize \
  --speakers 1
```

#### Podcast Production
```bash
# High-quality podcast transcription with speaker diarization
uv run python -m src.cli.enhanced_cli batch ~/Podcasts \
  -m large \
  -f vtt \
  --diarize \
  --speakers 3 \
  -c 2
```

### Standard CLI Workflows

#### Research Workflow ✅ FUNCTIONAL
```bash
# 1. Extract metadata for each URL listed in research_videos.txt
uv run python -m src.cli.main batch-urls research_videos.txt

# 2. Download selected videos ✅ WORKING
uv run python -m src.cli.main youtube https://youtube.com/watch?v=interesting --download

# 3. Transcribe downloaded media
uv run python -m src.cli.main transcribe data/media/downloads/video.m4a --v2

# 4. Batch process entire folder
uv run python -m src.cli.main batch data/media/downloads --v2
```

**Complete Pipeline Status:**
- ✅ **YouTube metadata extraction** - Working
- ✅ **Media download** - Working with progress tracking
- 🚧 **Transcription** - Ready for implementation
- 🚧 **Batch processing** - Ready for implementation

#### Podcast Processing
```bash
# Process entire podcast folder with high accuracy
uv run python -m src.cli.main batch ~/Podcasts --v2 --min-accuracy 95 --workers 6
```

#### Academic Lectures
```bash
# Conservative processing for complex academic content
uv run python -m src.cli.main batch ~/Lectures --v2 --workers 4 --min-accuracy 99
```

## Error Handling

Commands automatically handle common errors:
- **Network timeouts** - Automatic retry with exponential backoff
- **File format issues** - Automatic conversion to supported formats
- **Memory limits** - Automatic chunking for large files
- **API rate limits** - Automatic throttling and retry

For troubleshooting specific errors, see [TROUBLESHOOTING.md](TROUBLESHOOTING.md).
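
For readers unfamiliar with the retry strategy named in the list above, here is a generic sketch of exponential backoff. It is not the Trax implementation: `retry_with_backoff`, its defaults, and the choice of retryable exceptions are all assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,    # hypothetical default
    base_delay: float = 1.0,  # seconds before the first retry (hypothetical)
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Retry on timeouts and connection errors, doubling the wait after each failure."""
    for attempt in range(max_attempts):
        try:
            return operation()
        except (TimeoutError, ConnectionError):
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # 1s, 2s, 4s, ...
    raise ValueError("max_attempts must be at least 1")
```

Passing `sleep` as a parameter keeps the function testable without real delays. Production code often adds random jitter to each wait so that parallel workers do not retry in lockstep.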

## Integration with Taskmaster

All CLI operations can be tracked using Taskmaster:

```bash
# Create task for batch processing
./scripts/tm_master.sh add "Process podcast archive with v2 pipeline"

# Track progress
./scripts/tm_workflow.sh update 15 "Processed 50 files, 10 remaining"

# Mark complete
./scripts/tm_master.sh done 15
```

See [Taskmaster Helper Scripts](../scripts/README_taskmaster_helpers.md) for the complete integration guide.