CLI Command Reference

Complete reference for all Trax CLI commands with examples and options.

Command Structure

Trax provides two CLI interfaces:

Standard CLI

uv run python -m src.cli.main <command> [options] [arguments]

Enhanced CLI (Recommended)

uv run python -m src.cli.enhanced_cli <command> [options] [arguments]

The enhanced CLI provides:

Real-time progress reporting with Rich progress bars
Performance monitoring (CPU, memory, temperature)
Intelligent batch processing with concurrent execution
Enhanced error handling with user-friendly guidance
Multiple export formats (JSON, TXT, SRT, VTT)
Advanced features (speaker diarization, domain adaptation)

Enhanced CLI Commands

Enhanced CLI Overview

The enhanced CLI (src.cli.enhanced_cli) builds on the standard commands with real-time progress reporting, system resource monitoring, and additional export formats.

Key Features:

Rich Progress Bars: Real-time transcription progress with time estimates
Performance Monitoring: Live CPU, memory, and temperature tracking
Intelligent Queuing: Batch processing with size-based prioritization
Advanced Export: Multiple formats including SRT and VTT subtitles
Error Guidance: Helpful suggestions for common issues
Optional Features: Speaker diarization and domain adaptation

`transcribe <input>`

Enhanced single file transcription with progress reporting.

Usage:

uv run python -m src.cli.enhanced_cli transcribe input.wav

Options:

-o, --output PATH - Output directory (default: current directory)
-f, --format [json|txt|srt|vtt] - Output format (default: json)
-m, --model [tiny|base|small|medium|large] - Model size (default: base)
-d, --device [cpu|cuda] - Processing device (default: cpu)
--domain [general|technical|medical|academic] - Domain adaptation
--diarize - Enable speaker diarization
--speakers INTEGER - Number of speakers (for diarization)

Examples:

# Basic transcription with progress bar
uv run python -m src.cli.enhanced_cli transcribe lecture.mp3

# Enhanced transcription with domain adaptation
uv run python -m src.cli.enhanced_cli transcribe medical_audio.wav --domain medical

# Speaker diarization with SRT output
uv run python -m src.cli.enhanced_cli transcribe interview.mp4 --diarize --speakers 2 -f srt

# High-quality transcription with large model
uv run python -m src.cli.enhanced_cli transcribe podcast.mp3 -m large -f vtt

`batch <input>`

Enhanced batch processing with intelligent queuing and concurrent execution.

Usage:

uv run python -m src.cli.enhanced_cli batch /path/to/audio/files

Options:

-o, --output PATH - Output directory (default: current directory)
-c, --concurrency INTEGER - Number of concurrent processes (default: 4)
-f, --format [json|txt|srt|vtt] - Output format (default: json)
-m, --model [tiny|base|small|medium|large] - Model size (default: base)
-d, --device [cpu|cuda] - Processing device (default: cpu)
--domain [general|technical|medical|academic] - Domain adaptation
--diarize - Enable speaker diarization
--speakers INTEGER - Number of speakers (for diarization)

Examples:

# Batch process with 8 concurrent workers
uv run python -m src.cli.enhanced_cli batch ~/Podcasts -c 8

# Process with domain adaptation and speaker diarization
uv run python -m src.cli.enhanced_cli batch ~/Lectures --domain academic --diarize

# Conservative processing for memory-constrained systems
uv run python -m src.cli.enhanced_cli batch ~/Audio -c 2 -m small

# High-quality batch processing
uv run python -m src.cli.enhanced_cli batch ~/Interviews -m large -f srt --diarize --speakers 3

Intelligent Queuing: The enhanced batch processor automatically:

Sorts files by size (smaller files first for faster feedback)
Monitors system resources in real-time
Provides detailed progress for each file
Handles errors gracefully without stopping the batch
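
The queuing behavior described above can be sketched in a few lines of Python. This is a minimal illustration, not the Trax implementation: the function name `run_batch`, the `worker` callable, and the use of a thread pool are all assumptions made for the example.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed
from pathlib import Path
from typing import Callable


def run_batch(files: list[Path], worker: Callable[[Path], str], concurrency: int = 4):
    """Hypothetical helper: process files smallest-first and keep going on errors."""
    # Smaller files are submitted first so the first results arrive quickly.
    queue = sorted(files, key=lambda p: p.stat().st_size)
    results: dict[Path, str] = {}
    errors: dict[Path, Exception] = {}
    with ThreadPoolExecutor(max_workers=concurrency) as pool:
        futures = {pool.submit(worker, path): path for path in queue}
        for future in as_completed(futures):
            path = futures[future]
            try:
                results[path] = future.result()
            except Exception as exc:
                # A failing file is recorded; the rest of the batch continues.
                errors[path] = exc
    return queue, results, errors
```

Returning the failures alongside the results lets the caller print a summary at the end of the batch instead of aborting on the first bad file.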

Enhanced Progress Tracking Features

Multi-Pass Pipeline Progress Visualization

When using the --multi-pass option, the CLI provides detailed progress tracking for each stage of the multi-pass transcription pipeline:

Stage 1: Fast Transcription Pass

Real-time progress with confidence scoring
Segment generation and quality assessment
Low-confidence segment identification

Stage 2: Refinement Pass

Progress tracking for low-confidence segments
Audio slicing and re-transcription
Quality improvement monitoring

Stage 3: Enhancement Pass

Domain-specific enhancement progress
Content optimization tracking
Final quality validation

Stage 4: Speaker Diarization (if enabled)

Parallel speaker identification
Speaker count and segmentation progress
Integration with transcription results

System Resource Monitoring

The enhanced CLI includes real-time system resource monitoring:

CPU Usage Monitoring

Current and peak CPU utilization
Performance warnings at 80%+ and 95%+ thresholds
Processing optimization recommendations
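
As a rough illustration of the 80% and 95% warning thresholds, the sketch below maps a CPU reading to a status label. The function name `cpu_status` and the labels it returns are assumptions for this example; the source does not say how Trax represents these states internally.

```python
def cpu_status(percent: float) -> str:
    """Hypothetical helper: classify a CPU utilization reading (0-100)."""
    if not 0.0 <= percent <= 100.0:
        raise ValueError(f"CPU percentage out of range: {percent}")
    if percent >= 95.0:
        return "critical"  # second warning threshold (95%+)
    if percent >= 80.0:
        return "warning"   # first warning threshold (80%+)
    return "healthy"
```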

Memory Usage Tracking

Real-time memory consumption
Peak memory usage during processing
Memory optimization suggestions

Disk and Network I/O

Storage usage monitoring
Network activity tracking
Performance bottleneck identification

Temperature Monitoring

CPU temperature tracking (when available)
Thermal throttling warnings
Performance impact assessment

Error Recovery and Export Progress

Error Recovery Tracking

Automatic error detection and classification
Recovery attempt progress monitoring
Success/failure rate reporting
User guidance for common issues

Multi-Format Export Progress

Concurrent export to multiple formats
Individual format progress tracking
Export success rate monitoring
Output file path reporting

Progress Display Features

Rich Visual Interface

Progress bars rendered with the Rich library
Real-time stage and sub-stage updates
Time remaining estimates
Spinner animations for active operations

Status Indicators

🟢 Healthy resource usage
🟡 Moderate resource usage (warning)
🔴 High resource usage (critical)
✅ Completed operations
⚠️ Warnings and issues
❌ Errors and failures

Progress Callbacks

Stage transition notifications
Quality metric updates
Performance benchmark reporting
User guidance and tips

Standard CLI Commands

`youtube <url>`

Extract metadata from YouTube URLs without requiring API access.

Usage:

uv run python -m src.cli.main youtube https://youtube.com/watch?v=VIDEO_ID

Options:

--download - Download media after metadata extraction
--queue - Add to batch queue for processing
--json - Output as JSON (default)
--txt - Output as plain text

Examples:

# Extract metadata only
uv run python -m src.cli.main youtube https://youtube.com/watch?v=dQw4w9WgXcQ

# Extract and download immediately ✅ WORKING
uv run python -m src.cli.main youtube https://youtube.com/watch?v=dQw4w9WgXcQ --download

# Plain text output
uv run python -m src.cli.main youtube https://youtube.com/watch?v=dQw4w9WgXcQ --txt

Download Pipeline Status: ✅ FULLY FUNCTIONAL

Media download with progress tracking
Automatic file format detection
Downloaded files saved to data/media/downloads/
File hash generation for integrity verification

Supported URL Formats:

https://www.youtube.com/watch?v=VIDEO_ID
https://youtu.be/VIDEO_ID
https://www.youtube.com/watch?v=VIDEO_ID&t=123s
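
For readers who want to see how the three URL shapes above reduce to one video ID, here is a small sketch using only the Python standard library. The function name `extract_video_id` is hypothetical, and this is not necessarily how Trax parses URLs.

```python
from typing import Optional
from urllib.parse import parse_qs, urlparse


def extract_video_id(url: str) -> Optional[str]:
    """Hypothetical helper: return the video ID, or None for unsupported URLs."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    if host == "youtu.be":
        # Short links carry the ID as the first path segment.
        video_id = parsed.path.lstrip("/").split("/")[0]
        return video_id or None
    if host in {"youtube.com", "www.youtube.com"} and parsed.path == "/watch":
        # Extra query parameters such as t=123s are ignored.
        values = parse_qs(parsed.query).get("v")
        return values[0] if values else None
    return None
```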

`batch-urls <file>`

Process multiple YouTube URLs from a text file.

Usage:

uv run python -m src.cli.main batch-urls urls.txt

File Format:

https://youtube.com/watch?v=video1
https://youtube.com/watch?v=video2
https://youtu.be/video3

Options:

--download - Download all media after metadata extraction
--queue - Add all to batch processing queue
--workers <n> - Number of parallel workers (default: 4)

Examples:

# Process URLs file
uv run python -m src.cli.main batch-urls my_videos.txt

# Process and download with parallel processing ✅ WORKING
uv run python -m src.cli.main batch-urls my_videos.txt --download

# Download with text output format
uv run python -m src.cli.main batch-urls my_videos.txt --download --txt

Batch Download Status: ✅ FULLY FUNCTIONAL

Parallel processing of multiple URLs
Progress tracking for each download
Comprehensive success/failure reporting
Automatic error handling and retry logic

`transcribe <file>`

Transcribe a single audio or video file.

Usage:

uv run python -m src.cli.main transcribe path/to/audio.mp3

Options:

--v1 - Use v1 pipeline (Whisper only, default)
--v2 - Use v2 pipeline (Whisper + DeepSeek enhancement)
--json - Output as JSON (default)
--txt - Output as plain text
--min-accuracy <percent> - Minimum accuracy threshold (default: 80%)

Supported Formats:

Audio: MP3, WAV, M4A, FLAC, OGG
Video: MP4, AVI, MOV, MKV, WEBM
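
A pre-flight check against the supported formats might look like the sketch below. The extension sets mirror the list above; the function name `media_kind` and the idea of checking by file extension (rather than by inspecting file contents) are assumptions for illustration.

```python
from pathlib import Path
from typing import Optional

# Extensions taken from the "Supported Formats" list above.
AUDIO_EXTENSIONS = {".mp3", ".wav", ".m4a", ".flac", ".ogg"}
VIDEO_EXTENSIONS = {".mp4", ".avi", ".mov", ".mkv", ".webm"}


def media_kind(path: str) -> Optional[str]:
    """Hypothetical helper: return "audio", "video", or None if unsupported."""
    suffix = Path(path).suffix.lower()  # case-insensitive match
    if suffix in AUDIO_EXTENSIONS:
        return "audio"
    if suffix in VIDEO_EXTENSIONS:
        return "video"
    return None
```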

Examples:

# Basic transcription (v1 pipeline)
uv run python -m src.cli.main transcribe lecture.mp3

# Enhanced transcription (v2 pipeline)
uv run python -m src.cli.main transcribe podcast.mp4 --v2

# Plain text output with accuracy threshold
uv run python -m src.cli.main transcribe audio.wav --txt --min-accuracy 90

`batch <folder>`

Batch process multiple audio/video files in a directory.

Usage:

uv run python -m src.cli.main batch /path/to/audio/files

Options:

--v1 - Use v1 pipeline (default)
--v2 - Use v2 pipeline with enhancement
--workers <n> - Number of parallel workers (default: 8)
--min-accuracy <percent> - Minimum accuracy threshold (default: 80%)
--recursive - Process subdirectories recursively
--pattern <glob> - File pattern to match (e.g., "*.mp3")

Examples:

# Process all audio files with the default 8 workers
uv run python -m src.cli.main batch /Users/me/podcasts

# Enhanced processing with custom settings
uv run python -m src.cli.main batch /Users/me/lectures --v2 --workers 4 --min-accuracy 95

# Process only MP3 files recursively
uv run python -m src.cli.main batch /Users/me/audio --recursive --pattern "*.mp3"
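
The effect of combining --recursive with --pattern can be pictured with `pathlib`. This is a sketch of the general technique only; the function name `find_media` is hypothetical and the CLI's actual file discovery may differ.

```python
from pathlib import Path


def find_media(folder: Path, pattern: str = "*", recursive: bool = False) -> list[Path]:
    """Hypothetical helper: list files in `folder` that match a glob pattern."""
    # rglob descends into subdirectories; glob stays at the top level.
    matches = folder.rglob(pattern) if recursive else folder.glob(pattern)
    return sorted(p for p in matches if p.is_file())
```

Calling `find_media(Path("/Users/me/audio"), "*.mp3", recursive=True)` would correspond to the last example above.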

Enhanced CLI Features

Real-Time Performance Monitoring

The enhanced CLI provides live system monitoring during processing:

# Performance stats are displayed automatically
CPU: 45.2% | Memory: 2.1GB/8GB (26%) | Temp: 65°C

Monitored Metrics:

CPU Usage: Real-time CPU utilization percentage
Memory Usage: Current and total memory with percentage
Temperature: CPU temperature monitoring (when available)
Processing Speed: Time estimates and completion percentages

Enhanced Error Handling

The enhanced CLI provides intelligent error guidance:

# Memory error with helpful suggestions
❌ Memory error. Try using a smaller model with --model small or reduce concurrency.

# File not found with guidance
❌ File not found: lecture.mp3
💡 Check that the input file path is correct and the file exists.

# GPU error with alternatives
❌ CUDA out of memory
💡 GPU-related error. Try using --device cpu instead.

Error Categories:

File Errors: Path validation and existence checks
Memory Errors: Model size and concurrency suggestions
GPU Errors: Device fallback recommendations
Permission Errors: File access guidance
Generic Errors: General troubleshooting tips
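
One way to picture the mapping from error category to guidance is the sketch below. The file, memory, and GPU messages are copied from the sample output above; the permission and generic messages, the function name `guidance_for`, and the string match on "cuda" are assumptions, not the CLI's actual logic.

```python
def guidance_for(exc: Exception) -> str:
    """Hypothetical helper: pick a guidance message for an exception."""
    if isinstance(exc, FileNotFoundError):
        return "Check that the input file path is correct and the file exists."
    if isinstance(exc, PermissionError):
        # Assumed wording; the source does not show this message.
        return "Check read and write permissions for the input and output paths."
    if isinstance(exc, MemoryError):
        return "Try using a smaller model with --model small or reduce concurrency."
    if "cuda" in str(exc).lower():
        # GPU failures often surface as generic runtime errors mentioning CUDA.
        return "GPU-related error. Try using --device cpu instead."
    # Assumed wording for the generic category.
    return "Unexpected error. See TROUBLESHOOTING.md for general tips."
```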

Performance Guidelines

Enhanced CLI Optimization

Default Concurrency: 4 (balanced for most systems)
Memory Usage: <2GB per pipeline
Processing Speed: <30s for 5-minute audio (v1)
Real-time Factor: <0.1 (processing time divided by audio duration, so at least 10x faster than real time)
Progress Updates: Every 2-5 seconds
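
The speed and real-time-factor targets above are two views of the same number: 30 seconds of processing for 300 seconds of audio is a factor of 0.1. A tiny worked example, with a hypothetical function name:

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio duration must be positive")
    return processing_seconds / audio_seconds


# 5-minute audio (300 s) processed in 30 s sits exactly at the 0.1 target.
rtf = real_time_factor(30.0, 300.0)
```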

M3 MacBook Optimization

Default Workers: 8 (optimal for M3 chip)
Memory Usage: <2GB per pipeline
Processing Speed: <30s for 5-minute audio (v1)
Real-time Factor: <0.1 (processing time divided by audio duration, so at least 10x faster than real time)

Worker Configuration

# Conservative (low memory)
--workers 4

# Balanced (default)
--workers 8

# Aggressive (high-end M3)
--workers 12

Output Formats

Enhanced CLI Formats

The enhanced CLI supports multiple output formats:

JSON Output (Default)

{
  "text_content": "Never gonna give you up...",
  "segments": [
    {
      "start": 0.0,
      "end": 2.5,
      "text": "Never gonna give you up"
    }
  ],
  "confidence": 0.95,
  "processing_time": 5.2
}

Text Output

Never gonna give you up
Never gonna let you down
Never gonna run around and desert you
...

SRT Subtitles

1
00:00:00,000 --> 00:00:02,500
Never gonna give you up

2
00:00:02,500 --> 00:00:05,000
Never gonna let you down

VTT Subtitles

WEBVTT

00:00:00.000 --> 00:00:02.500
Never gonna give you up

00:00:02.500 --> 00:00:05.000
Never gonna let you down
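
To show how the JSON segments relate to the subtitle formats, the sketch below renders a list of segments as SRT. It assumes segments shaped like the JSON example above (`start`, `end`, `text`); the function names are hypothetical and this is not the exporter Trax ships.

```python
def format_timestamp(seconds: float, separator: str = ",") -> str:
    """Render seconds as HH:MM:SS,mmm (SRT) or HH:MM:SS.mmm (VTT, separator=".")."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}{separator}{ms:03d}"


def segments_to_srt(segments: list[dict]) -> str:
    """Number each segment from 1 and separate cues with a blank line."""
    blocks = []
    for index, seg in enumerate(segments, start=1):
        start = format_timestamp(seg["start"])
        end = format_timestamp(seg["end"])
        blocks.append(f"{index}\n{start} --> {end}\n{seg['text']}")
    return "\n\n".join(blocks) + "\n"
```

The only differences a VTT writer would need are the `WEBVTT` header line, the `.` millisecond separator, and optional cue numbers.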

Standard CLI Formats

JSON Output (Default)

{
  "youtube_id": "dQw4w9WgXcQ",
  "title": "Rick Astley - Never Gonna Give You Up",
  "channel": "Rick Astley",
  "duration_seconds": 212,
  "transcript": {
    "text": "Never gonna give you up...",
    "segments": [...],
    "confidence": 0.95
  }
}

Text Output

Title: Rick Astley - Never Gonna Give You Up
Channel: Rick Astley
Duration: 3:32

Transcript:
Never gonna give you up
Never gonna let you down
...
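
The `Duration: 3:32` line in the text output corresponds to `"duration_seconds": 212` in the JSON output. A minimal conversion, with a hypothetical function name and an assumed `H:MM:SS` form for media longer than an hour:

```python
def format_duration(seconds: int) -> str:
    """Render whole seconds as M:SS, or H:MM:SS once there is an hour component."""
    hours, rest = divmod(seconds, 3600)
    minutes, secs = divmod(rest, 60)
    if hours:
        return f"{hours}:{minutes:02d}:{secs:02d}"
    return f"{minutes}:{secs:02d}"
```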

Common Workflows

Enhanced CLI Workflows

Research Workflow (Enhanced)

# 1. Extract metadata for the YouTube URLs listed in research_videos.txt
uv run python -m src.cli.main batch-urls research_videos.txt

# 2. Download selected videos
uv run python -m src.cli.main youtube https://youtube.com/watch?v=interesting --download

# 3. Enhanced transcription with progress monitoring
uv run python -m src.cli.enhanced_cli transcribe downloaded_video.mp4 -m large --domain academic

# 4. Batch process with intelligent queuing
uv run python -m src.cli.enhanced_cli batch ~/Downloads/research_audio -c 6 -f srt

Academic Lecture Processing

# Process academic lectures with domain adaptation
uv run python -m src.cli.enhanced_cli batch ~/Lectures \
  --domain academic \
  -m large \
  -f srt \
  -c 4 \
  --diarize \
  --speakers 1

Podcast Production

# High-quality podcast transcription with speaker diarization
uv run python -m src.cli.enhanced_cli batch ~/Podcasts \
  -m large \
  -f vtt \
  --diarize \
  --speakers 3 \
  -c 2

Standard CLI Workflows

Research Workflow ✅ FUNCTIONAL

# 1. Extract metadata for the YouTube URLs listed in research_videos.txt
uv run python -m src.cli.main batch-urls research_videos.txt

# 2. Download selected videos ✅ WORKING
uv run python -m src.cli.main youtube https://youtube.com/watch?v=interesting --download

# 3. Transcribe downloaded media
uv run python -m src.cli.main transcribe data/media/downloads/video.m4a --v2

# 4. Batch process entire folder
uv run python -m src.cli.main batch data/media/downloads --v2

Complete Pipeline Status:

✅ YouTube metadata extraction - Working
✅ Media download - Working with progress tracking
🚧 Transcription - Ready for implementation
🚧 Batch processing - Ready for implementation

Podcast Processing

# Process entire podcast folder with high accuracy
uv run python -m src.cli.main batch ~/Podcasts --v2 --min-accuracy 95 --workers 6

Academic Lectures

# Conservative processing for complex academic content
uv run python -m src.cli.main batch ~/Lectures --v2 --workers 4 --min-accuracy 99

Error Handling

Commands automatically handle common errors:

Network timeouts - Automatic retry with exponential backoff
File format issues - Automatic conversion to supported formats
Memory limits - Automatic chunking for large files
API rate limits - Automatic throttling and retry
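
Retry with exponential backoff, as mentioned for network timeouts, generally follows the pattern below. This is a generic sketch rather than Trax's code: the function name, the default of four attempts, the one-second base delay, and the choice of exceptions to retry are all assumptions.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def retry_with_backoff(
    operation: Callable[[], T],
    attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Hypothetical helper: retry on network-style errors, doubling the wait each time."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return operation()
        except (TimeoutError, ConnectionError):
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            sleep(base_delay * 2 ** attempt)  # waits 1s, 2s, 4s, ... by default
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the helper testable without real delays.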

For troubleshooting specific errors, see TROUBLESHOOTING.md.

Integration with Taskmaster

All CLI operations can be tracked using Taskmaster:

# Create task for batch processing
./scripts/tm_master.sh add "Process podcast archive with v2 pipeline"

# Track progress
./scripts/tm_workflow.sh update 15 "Processed 50 files, 10 remaining"

# Mark complete
./scripts/tm_master.sh done 15

See Taskmaster Helper Scripts for the complete integration guide.
