CLI Command Reference
Complete reference for all Trax CLI commands with examples and options.
Command Structure
Trax provides two CLI interfaces:
Standard CLI
uv run python -m src.cli.main <command> [options] [arguments]
Enhanced CLI (Recommended)
uv run python -m src.cli.enhanced_cli <command> [options] [arguments]
The enhanced CLI provides:
- Real-time progress reporting with Rich progress bars
- Performance monitoring (CPU, memory, temperature)
- Intelligent batch processing with concurrent execution
- Enhanced error handling with user-friendly guidance
- Multiple export formats (JSON, TXT, SRT, VTT)
- Advanced features (speaker diarization, domain adaptation)
Enhanced CLI Commands
Enhanced CLI Overview
The enhanced CLI (src.cli.enhanced_cli) provides a modern, feature-rich interface with real-time progress reporting and advanced capabilities.
Key Features:
- Rich Progress Bars: Real-time transcription progress with time estimates
- Performance Monitoring: Live CPU, memory, and temperature tracking
- Intelligent Queuing: Batch processing with size-based prioritization
- Advanced Export: Multiple formats including SRT and VTT subtitles
- Error Guidance: Helpful suggestions for common issues
- Optional Features: Speaker diarization and domain adaptation
transcribe <input>
Enhanced single file transcription with progress reporting.
Usage:
uv run python -m src.cli.enhanced_cli transcribe input.wav
Options:
- -o, --output PATH - Output directory (default: current directory)
- -f, --format [json|txt|srt|vtt] - Output format (default: json)
- -m, --model [tiny|base|small|medium|large] - Model size (default: base)
- -d, --device [cpu|cuda] - Processing device (default: cpu)
- --domain [general|technical|medical|academic] - Domain adaptation
- --diarize - Enable speaker diarization
- --speakers INTEGER - Number of speakers (for diarization)
Examples:
# Basic transcription with progress bar
uv run python -m src.cli.enhanced_cli transcribe lecture.mp3
# Enhanced transcription with domain adaptation
uv run python -m src.cli.enhanced_cli transcribe medical_audio.wav --domain medical
# Speaker diarization with SRT output
uv run python -m src.cli.enhanced_cli transcribe interview.mp4 --diarize --speakers 2 -f srt
# High-quality transcription with large model
uv run python -m src.cli.enhanced_cli transcribe podcast.mp3 -m large -f vtt
batch <input>
Enhanced batch processing with intelligent queuing and concurrent execution.
Usage:
uv run python -m src.cli.enhanced_cli batch /path/to/audio/files
Options:
- -o, --output PATH - Output directory (default: current directory)
- -c, --concurrency INTEGER - Number of concurrent processes (default: 4)
- -f, --format [json|txt|srt|vtt] - Output format (default: json)
- -m, --model [tiny|base|small|medium|large] - Model size (default: base)
- -d, --device [cpu|cuda] - Processing device (default: cpu)
- --domain [general|technical|medical|academic] - Domain adaptation
- --diarize - Enable speaker diarization
- --speakers INTEGER - Number of speakers (for diarization)
Examples:
# Batch process with 8 concurrent workers
uv run python -m src.cli.enhanced_cli batch ~/Podcasts -c 8
# Process with domain adaptation and speaker diarization
uv run python -m src.cli.enhanced_cli batch ~/Lectures --domain academic --diarize
# Conservative processing for memory-constrained systems
uv run python -m src.cli.enhanced_cli batch ~/Audio -c 2 -m small
# High-quality batch processing
uv run python -m src.cli.enhanced_cli batch ~/Interviews -m large -f srt --diarize --speakers 3
Intelligent Queuing: The enhanced batch processor automatically:
- Sorts files by size (smaller files first for faster feedback)
- Monitors system resources in real-time
- Provides detailed progress for each file
- Handles errors gracefully without stopping the batch
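The size-based ordering described above can be sketched in a few lines. This is a minimal illustration, not Trax's actual scheduler: the function name order_by_size and the injectable size_of parameter are assumptions made for the example.

```python
import os
from typing import Callable, Iterable, List


def order_by_size(
    paths: Iterable[str],
    size_of: Callable[[str], int] = os.path.getsize,
) -> List[str]:
    """Return paths sorted so the smallest files come first.

    Hypothetical helper: size_of defaults to the on-disk size in bytes but
    can be swapped for any function that returns a byte count.
    """
    return sorted(paths, key=size_of)
```

Processing the smallest files first means the first completed transcripts appear quickly, which is the "faster feedback" the list refers to; total batch time is not reduced by the ordering itself.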
Enhanced Progress Tracking Features
Multi-Pass Pipeline Progress Visualization
When using the --multi-pass option, the CLI provides detailed progress tracking for each stage of the multi-pass transcription pipeline:
Stage 1: Fast Transcription Pass
- Real-time progress with confidence scoring
- Segment generation and quality assessment
- Low-confidence segment identification
Stage 2: Refinement Pass
- Progress tracking for low-confidence segments
- Audio slicing and re-transcription
- Quality improvement monitoring
Stage 3: Enhancement Pass
- Domain-specific enhancement progress
- Content optimization tracking
- Final quality validation
Stage 4: Speaker Diarization (if enabled)
- Parallel speaker identification
- Speaker count and segmentation progress
- Integration with transcription results
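The hand-off from Stage 1 to Stage 2 amounts to filtering segments by confidence. The sketch below shows one way that selection could look. It assumes each segment carries its own "confidence" value and uses 0.8 as an arbitrary cutoff; neither detail is documented here, and low_confidence_segments is a hypothetical name.

```python
from typing import Dict, List


def low_confidence_segments(
    segments: List[Dict], threshold: float = 0.8
) -> List[Dict]:
    """Return the segments whose confidence falls below the threshold.

    Assumptions: segments are dicts with a per-segment "confidence" key,
    and 0.8 is an illustrative cutoff rather than a Trax default.
    Segments without a confidence value are treated as low confidence.
    """
    return [seg for seg in segments if seg.get("confidence", 0.0) < threshold]
```

Only the returned segments would need audio slicing and re-transcription in the refinement pass; the rest pass through unchanged.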
System Resource Monitoring
The enhanced CLI includes real-time system resource monitoring:
CPU Usage Monitoring
- Current and peak CPU utilization
- Performance warnings at 80%+ and 95%+ thresholds
- Processing optimization recommendations
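The two warning thresholds can be read as a simple three-way classification. The following sketch assumes that readings at or above 80% count as a warning and at or above 95% as critical; the function name cpu_status and the returned labels are illustrative, not part of the CLI.

```python
def cpu_status(percent: float) -> str:
    """Classify a CPU utilization reading against the 80% and 95% thresholds.

    The labels are hypothetical; only the threshold values come from the
    documentation above.
    """
    if percent >= 95:
        return "critical"
    if percent >= 80:
        return "warning"
    return "healthy"
```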
Memory Usage Tracking
- Real-time memory consumption
- Peak memory usage during processing
- Memory optimization suggestions
Disk and Network I/O
- Storage usage monitoring
- Network activity tracking
- Performance bottleneck identification
Temperature Monitoring
- CPU temperature tracking (when available)
- Thermal throttling warnings
- Performance impact assessment
Error Recovery and Export Progress
Error Recovery Tracking
- Automatic error detection and classification
- Recovery attempt progress monitoring
- Success/failure rate reporting
- User guidance for common issues
Multi-Format Export Progress
- Concurrent export to multiple formats
- Individual format progress tracking
- Export success rate monitoring
- Output file path reporting
Progress Display Features
Rich Visual Interface
- Progress bars rendered with the Rich library
- Real-time stage and sub-stage updates
- Time remaining estimates
- Spinner animations for active operations
Status Indicators
- 🟢 Healthy resource usage
- 🟡 Moderate resource usage (warning)
- 🔴 High resource usage (critical)
- ✅ Completed operations
- ⚠️ Warnings and issues
- ❌ Errors and failures
Progress Callbacks
- Stage transition notifications
- Quality metric updates
- Performance benchmark reporting
- User guidance and tips
Standard CLI Commands
youtube <url>
Extract metadata from YouTube URLs without requiring API access.
Usage:
uv run python -m src.cli.main youtube "https://youtube.com/watch?v=VIDEO_ID"
Options:
- --download - Download media after metadata extraction
- --queue - Add to batch queue for processing
- --json - Output as JSON (default)
- --txt - Output as plain text
Examples:
# Extract metadata only
uv run python -m src.cli.main youtube "https://youtube.com/watch?v=dQw4w9WgXcQ"
# Extract and download immediately ✅ WORKING
uv run python -m src.cli.main youtube "https://youtube.com/watch?v=dQw4w9WgXcQ" --download
# Plain text output
uv run python -m src.cli.main youtube "https://youtube.com/watch?v=dQw4w9WgXcQ" --txt
Download Pipeline Status: ✅ FULLY FUNCTIONAL
- Media download with progress tracking
- Automatic file format detection
- Downloaded files saved to data/media/downloads/
- File hash generation for integrity verification
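Integrity hashing of a downloaded file is typically done in chunks so that large media files never need to fit in memory. The sketch below uses SHA-256 as an assumption, since the hash algorithm Trax uses is not stated here, and stream_sha256 is a hypothetical helper name.

```python
import hashlib
from typing import BinaryIO


def stream_sha256(stream: BinaryIO, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a binary stream, read in chunks.

    SHA-256 and the 1 MiB chunk size are illustrative choices, not
    documented Trax behavior.
    """
    digest = hashlib.sha256()
    # iter() with a sentinel keeps calling read() until it returns b"".
    for chunk in iter(lambda: stream.read(chunk_size), b""):
        digest.update(chunk)
    return digest.hexdigest()
```

For a file on disk, open it in binary mode and pass the handle, for example `with open(path, "rb") as fh: stream_sha256(fh)`.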
Supported URL Formats:
- https://www.youtube.com/watch?v=VIDEO_ID
- https://youtu.be/VIDEO_ID
- https://www.youtube.com/watch?v=VIDEO_ID&t=123s
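All three URL shapes carry the same video ID, either in the v query parameter or as the path of a youtu.be link. As a rough sketch of how such URLs can be normalized with the standard library (extract_video_id is a hypothetical helper, not the parser Trax uses):

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Return the video ID from a supported YouTube URL, or None.

    Handles watch URLs (ID in the "v" query parameter, extra parameters
    such as "t" ignored) and youtu.be short links (ID in the path).
    """
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        return parsed.path.lstrip("/") or None
    if host in ("youtube.com", "www.youtube.com") and parsed.path == "/watch":
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```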
batch-urls <file>
Process multiple YouTube URLs from a text file.
Usage:
uv run python -m src.cli.main batch-urls urls.txt
File Format:
https://youtube.com/watch?v=video1
https://youtube.com/watch?v=video2
https://youtu.be/video3
Options:
- --download - Download all media after metadata extraction
- --queue - Add all to batch processing queue
- --workers <n> - Number of parallel workers (default: 4)
Examples:
# Process URLs file
uv run python -m src.cli.main batch-urls my_videos.txt
# Process and download with parallel processing ✅ WORKING
uv run python -m src.cli.main batch-urls my_videos.txt --download
# Download with text output format
uv run python -m src.cli.main batch-urls my_videos.txt --download --txt
Batch Download Status: ✅ FULLY FUNCTIONAL
- Parallel processing of multiple URLs
- Progress tracking for each download
- Comprehensive success/failure reporting
- Automatic error handling and retry logic
transcribe <file>
Transcribe a single audio or video file.
Usage:
uv run python -m src.cli.main transcribe path/to/audio.mp3
Options:
- --v1 - Use v1 pipeline (Whisper only, default)
- --v2 - Use v2 pipeline (Whisper + DeepSeek enhancement)
- --json - Output as JSON (default)
- --txt - Output as plain text
- --min-accuracy <percent> - Minimum accuracy threshold (default: 80%)
Supported Formats:
- Audio: MP3, WAV, M4A, FLAC, OGG
- Video: MP4, AVI, MOV, MKV, WEBM
Examples:
# Basic transcription (v1 pipeline)
uv run python -m src.cli.main transcribe lecture.mp3
# Enhanced transcription (v2 pipeline)
uv run python -m src.cli.main transcribe podcast.mp4 --v2
# Plain text output with accuracy threshold
uv run python -m src.cli.main transcribe audio.wav --txt --min-accuracy 90
batch <folder>
Batch process multiple audio/video files in a directory.
Usage:
uv run python -m src.cli.main batch /path/to/audio/files
Options:
- --v1 - Use v1 pipeline (default)
- --v2 - Use v2 pipeline with enhancement
- --workers <n> - Number of parallel workers (default: 8)
- --min-accuracy <percent> - Minimum accuracy threshold (default: 80%)
- --recursive - Process subdirectories recursively
- --pattern <glob> - File pattern to match (e.g., "*.mp3")
Examples:
# Process all audio files with the default 8 workers
uv run python -m src.cli.main batch /Users/me/podcasts
# Enhanced processing with custom settings
uv run python -m src.cli.main batch /Users/me/lectures --v2 --workers 4 --min-accuracy 95
# Process only MP3 files recursively
uv run python -m src.cli.main batch /Users/me/audio --recursive --pattern "*.mp3"
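The combination of --recursive and --pattern behaves like a glob over the input folder. The sketch below shows the equivalent selection with pathlib; find_media is a hypothetical helper for illustration and does not claim to mirror how the batch command walks directories internally.

```python
from pathlib import Path
from typing import List, Union


def find_media(
    folder: Union[str, Path], pattern: str = "*", recursive: bool = False
) -> List[Path]:
    """Return files under folder whose names match a glob pattern.

    With recursive=False only the top level is searched (Path.glob);
    with recursive=True subdirectories are included (Path.rglob).
    """
    root = Path(folder)
    matches = root.rglob(pattern) if recursive else root.glob(pattern)
    return sorted(p for p in matches if p.is_file())
```

Quoting the pattern on the command line, as in "*.mp3", keeps the shell from expanding it before the CLI sees it.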
Enhanced CLI Features
Real-Time Performance Monitoring
The enhanced CLI provides live system monitoring during processing:
# Performance stats are displayed automatically
CPU: 45.2% | Memory: 2.1GB/8GB (26%) | Temp: 65°C
Monitored Metrics:
- CPU Usage: Real-time CPU utilization percentage
- Memory Usage: Current and total memory with percentage
- Temperature: CPU temperature monitoring (when available)
- Processing Speed: Time estimates and completion percentages
Enhanced Error Handling
The enhanced CLI provides intelligent error guidance:
# Memory error with helpful suggestions
❌ Memory error. Try using a smaller model with --model small or reduce concurrency.
# File not found with guidance
❌ File not found: lecture.mp3
💡 Check that the input file path is correct and the file exists.
# GPU error with alternatives
❌ CUDA out of memory
💡 GPU-related error. Try using --device cpu instead.
Error Categories:
- File Errors: Path validation and existence checks
- Memory Errors: Model size and concurrency suggestions
- GPU Errors: Device fallback recommendations
- Permission Errors: File access guidance
- Generic Errors: General troubleshooting tips
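One way to picture the categories above is as a mapping from an exception to a guidance string. The sketch below reuses the three messages shown earlier; the permission and generic messages, the guidance_for name, and the matching rules are assumptions for illustration only.

```python
def guidance_for(error: BaseException) -> str:
    """Pick a user-facing hint for an exception (hypothetical mapping)."""
    message = str(error).lower()
    if isinstance(error, FileNotFoundError):
        return "Check that the input file path is correct and the file exists."
    if isinstance(error, PermissionError):
        # Wording assumed; the document only names the category.
        return "Check read and write permissions for the input and output paths."
    if "cuda" in message:
        # Checked before the memory case so "CUDA out of memory" maps to GPU.
        return "GPU-related error. Try using --device cpu instead."
    if isinstance(error, MemoryError) or "out of memory" in message:
        return (
            "Memory error. Try using a smaller model with --model small "
            "or reduce concurrency."
        )
    # Wording assumed for the generic category.
    return "Unexpected error. See TROUBLESHOOTING.md for general tips."
```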
Performance Guidelines
Enhanced CLI Optimization
- Default Concurrency: 4 (balanced for most systems)
- Memory Usage: <2GB per pipeline
- Processing Speed: <30s for 5-minute audio (v1)
- Real-time Factor: <0.1 (much faster than real-time)
- Progress Updates: Every 2-5 seconds
M3 MacBook Optimization
- Default Workers: 8 (optimal for M3 chip)
- Memory Usage: <2GB per pipeline
- Processing Speed: <30s for 5-minute audio (v1)
- Real-time Factor: <0.1 (much faster than real-time)
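The two speed targets are the same claim stated two ways: the real-time factor is processing time divided by audio duration, so 30 seconds of processing for 5 minutes (300 seconds) of audio gives 30 / 300 = 0.1. A small sketch of that arithmetic (real_time_factor is an illustrative helper name):

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration.

    Values below 1.0 mean faster than real time; 0.1 means the audio is
    processed in one tenth of its playback length.
    """
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


# 30 s of processing for a 5-minute file sits exactly at the 0.1 target.
rtf = real_time_factor(30, 5 * 60)
```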
Worker Configuration
# Conservative (low memory)
--workers 4
# Balanced (default)
--workers 8
# Aggressive (high-end M3)
--workers 12
Output Formats
Enhanced CLI Formats
The enhanced CLI supports multiple output formats:
JSON Output (Default)
{
"text_content": "Never gonna give you up...",
"segments": [
{
"start": 0.0,
"end": 2.5,
"text": "Never gonna give you up"
}
],
"confidence": 0.95,
"processing_time": 5.2
}
Text Output
Never gonna give you up
Never gonna let you down
Never gonna run around and desert you
...
SRT Subtitles
1
00:00:00,000 --> 00:00:02,500
Never gonna give you up
2
00:00:02,500 --> 00:00:05,000
Never gonna let you down
VTT Subtitles
WEBVTT
00:00:00.000 --> 00:00:02.500
Never gonna give you up
00:00:02.500 --> 00:00:05.000
Never gonna let you down
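The SRT and VTT outputs are both derived from the start, end, and text fields of the JSON segments; they differ mainly in the millisecond separator (comma for SRT, period for VTT), the numbered cues in SRT, and the WEBVTT header. The following sketch shows that conversion under the assumption that segments look like the JSON example above; the function names are hypothetical and not the exporter Trax ships.

```python
from typing import Dict, List


def srt_timestamp(seconds: float) -> str:
    """Format seconds as HH:MM:SS,mmm (SRT style)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(segments: List[Dict]) -> str:
    """Render segments as numbered SRT cues separated by blank lines."""
    cues = [
        f"{i}\n{srt_timestamp(s['start'])} --> {srt_timestamp(s['end'])}\n{s['text']}"
        for i, s in enumerate(segments, start=1)
    ]
    return "\n\n".join(cues) + "\n"


def segments_to_vtt(segments: List[Dict]) -> str:
    """Render segments as WebVTT cues (period before the milliseconds)."""
    cues = [
        f"{srt_timestamp(s['start']).replace(',', '.')} --> "
        f"{srt_timestamp(s['end']).replace(',', '.')}\n{s['text']}"
        for s in segments
    ]
    return "WEBVTT\n\n" + "\n\n".join(cues) + "\n"
```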
Standard CLI Formats
JSON Output (Default)
{
"youtube_id": "dQw4w9WgXcQ",
"title": "Rick Astley - Never Gonna Give You Up",
"channel": "Rick Astley",
"duration_seconds": 212,
"transcript": {
"text": "Never gonna give you up...",
"segments": [...],
"confidence": 0.95
}
}
Text Output
Title: Rick Astley - Never Gonna Give You Up
Channel: Rick Astley
Duration: 3:32
Transcript:
Never gonna give you up
Never gonna let you down
...
Common Workflows
Enhanced CLI Workflows
Research Workflow (Enhanced)
# 1. Extract metadata for the YouTube URLs listed in research_videos.txt
uv run python -m src.cli.main batch-urls research_videos.txt
# 2. Download selected videos
uv run python -m src.cli.main youtube "https://youtube.com/watch?v=interesting" --download
# 3. Enhanced transcription with progress monitoring
uv run python -m src.cli.enhanced_cli transcribe downloaded_video.mp4 -m large --domain academic
# 4. Batch process with intelligent queuing
uv run python -m src.cli.enhanced_cli batch ~/Downloads/research_audio -c 6 -f srt
Academic Lecture Processing
# Process academic lectures with domain adaptation
uv run python -m src.cli.enhanced_cli batch ~/Lectures \
--domain academic \
-m large \
-f srt \
-c 4 \
--diarize \
--speakers 1
Podcast Production
# High-quality podcast transcription with speaker diarization
uv run python -m src.cli.enhanced_cli batch ~/Podcasts \
-m large \
-f vtt \
--diarize \
--speakers 3 \
-c 2
Standard CLI Workflows
Research Workflow ✅ FUNCTIONAL
# 1. Extract metadata for the YouTube URLs listed in research_videos.txt
uv run python -m src.cli.main batch-urls research_videos.txt
# 2. Download selected videos ✅ WORKING
uv run python -m src.cli.main youtube "https://youtube.com/watch?v=interesting" --download
# 3. Transcribe downloaded media
uv run python -m src.cli.main transcribe data/media/downloads/video.m4a --v2
# 4. Batch process entire folder
uv run python -m src.cli.main batch data/media/downloads --v2
Complete Pipeline Status:
- ✅ YouTube metadata extraction - Working
- ✅ Media download - Working with progress tracking
- 🚧 Transcription - Ready for implementation
- 🚧 Batch processing - Ready for implementation
Podcast Processing
# Process entire podcast folder with high accuracy
uv run python -m src.cli.main batch ~/Podcasts --v2 --min-accuracy 95 --workers 6
Academic Lectures
# Conservative processing for complex academic content
uv run python -m src.cli.main batch ~/Lectures --v2 --workers 4 --min-accuracy 99
Error Handling
Commands automatically handle common errors:
- Network timeouts - Automatic retry with exponential backoff
- File format issues - Automatic conversion to supported formats
- Memory limits - Automatic chunking for large files
- API rate limits - Automatic throttling and retry
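Retry with exponential backoff means waiting longer after each failed attempt, usually doubling the delay. The sketch below illustrates the pattern in general terms. The attempt count, base delay, caught exception type, and the retry_with_backoff name are all assumptions, since the values Trax uses are not documented here.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call operation, retrying on TimeoutError with doubling delays.

    Delays are base_delay * 2**attempt (1 s, 2 s, 4 s with the defaults).
    The final failure is re-raised. All parameters are illustrative.
    """
    for attempt in range(max_attempts):
        try:
            return operation()
        except TimeoutError:
            if attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("max_attempts must be at least 1")
```

The sleep parameter is injectable so the delays can be observed or skipped in tests without actually waiting.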
For troubleshooting specific errors, see TROUBLESHOOTING.md.
Integration with Taskmaster
All CLI operations can be tracked using Taskmaster:
# Create task for batch processing
./scripts/tm_master.sh add "Process podcast archive with v2 pipeline"
# Track progress
./scripts/tm_workflow.sh update 15 "Processed 50 files, 10 remaining"
# Mark complete
./scripts/tm_master.sh done 15
See Taskmaster Helper Scripts for the complete integration guide.