# Clean-Tracks API Documentation

## Base URL
```
http://localhost:5000/api
```

## Authentication
The API currently requires no authentication. Before deploying to production, add an authentication layer such as JWT or OAuth2.

## File Size Limits
- Maximum file size: 500 MB
- Supported formats: MP3, WAV, FLAC, M4A, OGG, AAC
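
A client can check these limits before uploading, which avoids sending a large file only to have it rejected. The following is a minimal sketch, not part of the API: `validateUpload` is a hypothetical helper, it assumes the size limit means 500 × 1024 × 1024 bytes, and it infers the format from the file extension only. The server's own checks may differ.

```javascript
// Hypothetical client-side pre-check; the server remains the source of truth.
const MAX_FILE_BYTES = 500 * 1024 * 1024; // assumption: 1 MB = 1024 * 1024 bytes
const SUPPORTED_EXTENSIONS = ['mp3', 'wav', 'flac', 'm4a', 'ogg', 'aac'];

function validateUpload(filename, sizeBytes) {
  // Take everything after the last dot as the extension, case-insensitively.
  const dot = filename.lastIndexOf('.');
  const ext = dot === -1 ? '' : filename.slice(dot + 1).toLowerCase();
  if (!SUPPORTED_EXTENSIONS.includes(ext)) {
    return { ok: false, reason: `Unsupported format: ${ext || '(none)'}` };
  }
  if (sizeBytes > MAX_FILE_BYTES) {
    return { ok: false, reason: 'File exceeds the maximum upload size' };
  }
  return { ok: true, reason: null };
}
```

In a browser, this could be called as `validateUpload(file.name, file.size)` with a `File` taken from an `<input type="file">` element.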

## Endpoints

### Health Check

#### GET /api/health
Check if the API is running.

**Response:**
```json
{
  "status": "healthy",
  "timestamp": "2024-01-15T10:30:00Z",
  "version": "0.1.0"
}
```

---

### Audio Processing

#### POST /api/process
Process an audio file to detect and censor explicit content.

**Request:**
- Method: `POST`
- Content-Type: `multipart/form-data`

**Form Data:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| file | File | Yes | Audio file to process |
| word_list_id | Integer | No | ID of word list to use (default: system default) |
| censor_method | String | No | Method: `silence`, `beep`, `white_noise` (default: `beep`) |
| min_severity | String | No | Minimum severity: `low`, `medium`, `high`, `extreme` (default: `low`) |

**Response (202 Accepted):**
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued",
  "message": "File uploaded and queued for processing"
}
```
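
As an illustration of how a client might assemble this request, the sketch below validates the optional fields against the values documented above and then posts the form. The helper names `buildProcessFields` and `submitForProcessing` are hypothetical, and the code assumes an environment with global `fetch` and `FormData` (a browser or a recent Node.js release).

```javascript
// Documented values for the optional form fields.
const CENSOR_METHODS = ['silence', 'beep', 'white_noise'];
const SEVERITIES = ['low', 'medium', 'high', 'extreme'];

// Hypothetical helper: returns the optional form fields as strings,
// omitting anything the caller did not set so the server defaults apply.
function buildProcessFields({ wordListId, censorMethod, minSeverity } = {}) {
  const fields = {};
  if (wordListId !== undefined) {
    if (!Number.isInteger(wordListId)) throw new TypeError('word_list_id must be an integer');
    fields.word_list_id = String(wordListId);
  }
  if (censorMethod !== undefined) {
    if (!CENSOR_METHODS.includes(censorMethod)) throw new RangeError(`Unknown censor_method: ${censorMethod}`);
    fields.censor_method = censorMethod;
  }
  if (minSeverity !== undefined) {
    if (!SEVERITIES.includes(minSeverity)) throw new RangeError(`Unknown min_severity: ${minSeverity}`);
    fields.min_severity = minSeverity;
  }
  return fields;
}

// Uploads a File or Blob and returns the parsed 202 body ({ job_id, status, message }).
async function submitForProcessing(file, options, baseUrl = 'http://localhost:5000/api') {
  const form = new FormData();
  form.append('file', file);
  for (const [name, value] of Object.entries(buildProcessFields(options))) {
    form.append(name, value);
  }
  // Do not set Content-Type manually; fetch adds the multipart boundary itself.
  const response = await fetch(`${baseUrl}/process`, { method: 'POST', body: form });
  if (response.status !== 202) throw new Error(`Upload failed with HTTP ${response.status}`);
  return response.json();
}
```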

---

### Job Management

#### GET /api/jobs/{job_id}
Get the status of a processing job.

**Response:**
```json
{
  "id": 1,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "input_filename": "audio.mp3",
  "output_filename": "audio_censored.mp3",
  "status": "completed",
  "started_at": "2024-01-15T10:30:00Z",
  "completed_at": "2024-01-15T10:31:30Z",
  "processing_time_seconds": 90.5,
  "audio_duration_seconds": 180.0,
  "words_detected": 15,
  "words_censored": 12,
  "error_message": null
}
```
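
Clients that do not use the WebSocket events can poll this endpoint until the job finishes. The sketch below is one possible approach: `pollJob` is a hypothetical helper, `intervalMs` and `maxAttempts` are illustrative values rather than server requirements, and it treats `completed` and `failed` as the only terminal statuses.

```javascript
// `completed` and `failed` are treated as terminal; any other status keeps polling.
function isTerminal(status) {
  return status === 'completed' || status === 'failed';
}

// Hypothetical helper: polls GET /api/jobs/{job_id} until the job finishes.
// `fetchImpl` can be swapped for a stub in tests; it defaults to the global fetch.
async function pollJob(jobId, options = {}) {
  const {
    baseUrl = 'http://localhost:5000/api',
    intervalMs = 2000,   // illustrative delay between checks
    maxAttempts = 150,   // illustrative upper bound on checks
    fetchImpl = fetch,
  } = options;

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetchImpl(`${baseUrl}/jobs/${encodeURIComponent(jobId)}`);
    if (!response.ok) throw new Error(`Status check failed with HTTP ${response.status}`);
    const job = await response.json();
    if (isTerminal(job.status)) return job;
    // Wait before the next check.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} checks`);
}
```

The returned object has the shape shown above, so a caller can inspect `status` and `error_message` to decide whether to download the result.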

#### GET /api/jobs
List recent processing jobs.

**Query Parameters:**
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| limit | Integer | 10 | Maximum number of jobs to return |
| status | String | - | Filter by status: `pending`, `processing`, `completed`, `failed` |

**Response:**
```json
[
  {
    "id": 1,
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "input_filename": "audio.mp3",
    "status": "completed",
    "created_at": "2024-01-15T10:30:00Z"
  }
]
```
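
The query string for this endpoint can be built with the standard `URLSearchParams` class. This is a small sketch under stated assumptions: `buildJobsUrl` is a hypothetical helper, and rejecting a `limit` below 1 is a guess about sensible input rather than documented server behavior.

```javascript
// Status values documented for the `status` filter.
const JOB_STATUSES = ['pending', 'processing', 'completed', 'failed'];

// Hypothetical helper: builds the GET /api/jobs URL, leaving out unset parameters
// so the server defaults apply.
function buildJobsUrl({ limit, status } = {}, baseUrl = 'http://localhost:5000/api') {
  const params = new URLSearchParams();
  if (limit !== undefined) {
    // Assumption: only positive integers make sense for `limit`.
    if (!Number.isInteger(limit) || limit < 1) throw new RangeError('limit must be a positive integer');
    params.set('limit', String(limit));
  }
  if (status !== undefined) {
    if (!JOB_STATUSES.includes(status)) throw new RangeError(`Unknown status: ${status}`);
    params.set('status', status);
  }
  const query = params.toString();
  return query ? `${baseUrl}/jobs?${query}` : `${baseUrl}/jobs`;
}
```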

#### GET /api/jobs/{job_id}/download
Download the processed audio file.

**Response:**
- Binary audio file download
- Content-Type: Based on processed file format
- Content-Disposition: attachment

---

### Word List Management

#### GET /api/wordlists
Get all word lists.

**Query Parameters:**
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| active_only | Boolean | true | Only return active lists |

**Response:**
```json
[
  {
    "id": 1,
    "name": "English - General",
    "description": "General English profanity",
    "language": "en",
    "is_default": true,
    "is_active": true,
    "word_count": 150,
    "created_at": "2024-01-15T10:00:00Z",
    "updated_at": "2024-01-15T10:00:00Z"
  }
]
```

#### POST /api/wordlists
Create a new word list.

**Request Body:**
```json
{
  "name": "Custom List",
  "description": "My custom word list",
  "language": "en",
  "is_default": false
}
```

**Response (201 Created):**
```json
{
  "id": 2,
  "message": "Word list created successfully"
}
```

#### GET /api/wordlists/{list_id}
Get details and statistics for a specific word list.

**Response:**
```json
{
  "id": 1,
  "name": "English - General",
  "total_words": 150,
  "by_severity": {
    "low": 30,
    "medium": 60,
    "high": 50,
    "extreme": 10
  },
  "by_category": {
    "profanity": 100,
    "slur": 30,
    "sexual": 20
  },
  "has_variations": 75,
  "created_at": "2024-01-15T10:00:00Z",
  "updated_at": "2024-01-15T10:00:00Z",
  "version": 1
}
```

#### PUT /api/wordlists/{list_id}
Update a word list.

**Request Body:**
```json
{
  "name": "Updated Name",
  "description": "Updated description",
  "is_default": true
}
```

#### DELETE /api/wordlists/{list_id}
Delete a word list.

#### POST /api/wordlists/{list_id}/words
Add words to a word list.

**Request Body:**
```json
{
  "words": {
    "word1": {
      "severity": "high",
      "category": "profanity",
      "variations": ["w0rd1", "word_1"],
      "notes": "Common misspellings"
    },
    "word2": {
      "severity": "medium",
      "category": "slur"
    }
  }
}
```
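
Because the body is keyed by word rather than being an array, it can be convenient to build it from a list of entries. The sketch below shows one way to do that. `buildAddWordsBody` is a hypothetical helper, and it assumes every entry carries `severity` and `category` as in the example above; whether the server requires both is not stated here.

```javascript
// Severity levels used elsewhere in this API.
const WORD_SEVERITIES = ['low', 'medium', 'high', 'extreme'];

// Hypothetical helper: turns [{ word, severity, category, variations?, notes? }]
// into the { words: { ... } } request body.
function buildAddWordsBody(entries) {
  const words = {};
  for (const { word, severity, category, variations, notes } of entries) {
    if (!word) throw new TypeError('Each entry needs a non-empty word');
    if (!WORD_SEVERITIES.includes(severity)) throw new RangeError(`Unknown severity: ${severity}`);
    const details = { severity, category };
    // Optional fields are included only when provided.
    if (variations !== undefined) details.variations = variations;
    if (notes !== undefined) details.notes = notes;
    words[word] = details;
  }
  return { words };
}
```

The result can be passed to `JSON.stringify` and sent with a `Content-Type: application/json` header.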

#### DELETE /api/wordlists/{list_id}/words
Remove words from a word list.

**Request Body:**
```json
{
  "words": ["word1", "word2", "word3"]
}
```

#### GET /api/wordlists/{list_id}/export
Export a word list to file.

**Query Parameters:**
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| format | String | json | Export format: `json`, `csv`, `txt` |

**Response:**
- File download in requested format

#### POST /api/wordlists/{list_id}/import
Import words from a file into a word list.

**Request:**
- Method: `POST`
- Content-Type: `multipart/form-data`

**Form Data:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| file | File | Yes | Word list file (JSON, CSV, or TXT) |
| merge | Boolean | No | Merge with existing words (default: false) |

---

### User Settings

#### GET /api/settings
Get user settings.

**Query Parameters:**
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| user_id | String | default | User identifier |

**Response:**
```json
{
  "id": 1,
  "user_id": "default",
  "processing": {
    "default_word_list_id": 1,
    "default_censor_method": "beep",
    "default_min_severity": "low"
  },
  "audio": {
    "preferred_output_format": "mp3",
    "preferred_bitrate": "192k",
    "preserve_metadata": true
  },
  "model": {
    "whisper_model_size": "base",
    "use_gpu": true
  },
  "ui": {
    "theme": "light",
    "language": "en",
    "show_waveform": true,
    "auto_play_preview": false
  },
  "privacy": {
    "save_history": true,
    "save_transcriptions": false,
    "anonymous_mode": false
  }
}
```

#### PUT /api/settings
Update user settings.

**Request Body:**
```json
{
  "user_id": "default",
  "default_censor_method": "silence",
  "whisper_model_size": "small",
  "theme": "dark"
}
```
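
The GET response groups settings under `processing`, `audio`, `model`, `ui`, and `privacy`, while the PUT example uses flat keys. A client that edits fetched settings may therefore want to flatten them first. The sketch below is only an illustration: `flattenSettings` is a hypothetical helper, and it assumes that PUT accepts the leaf key names from the GET response, which the example suggests but does not confirm.

```javascript
// Group names taken from the GET /api/settings response.
const SETTING_GROUPS = ['processing', 'audio', 'model', 'ui', 'privacy'];

// Hypothetical helper: collapses the grouped GET response into flat keys.
// Assumes leaf names are unique across groups, as they are in the example response.
function flattenSettings(settings) {
  const flat = { user_id: settings.user_id };
  for (const group of SETTING_GROUPS) {
    Object.assign(flat, settings[group] || {});
  }
  return flat;
}
```

A changed value can then be merged in before sending, for example `{ ...flattenSettings(current), theme: 'dark' }`.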

---

### Statistics

#### GET /api/statistics
Get overall processing statistics.

**Response:**
```json
{
  "total_jobs": 1000,
  "completed_jobs": 950,
  "success_rate": 95.0,
  "total_audio_duration_hours": 500.5,
  "total_words_detected": 15000,
  "total_words_censored": 12000,
  "average_processing_time_seconds": 45.3
}
```
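
The example figures are consistent with `success_rate` being `completed_jobs / total_jobs × 100` (950 / 1000 = 95.0). The sketch below recomputes that value and a similar censored-word share on the client. Both helper names are hypothetical, and the formula is inferred from the example rather than confirmed by the server code.

```javascript
// Percentage of `part` in `whole`, rounded to one decimal place; 0 when `whole` is 0.
function percentage(part, whole) {
  if (whole === 0) return 0;
  return Math.round((part * 1000) / whole) / 10;
}

// Assumed definition: completed jobs as a share of all jobs.
function successRate(stats) {
  return percentage(stats.completed_jobs, stats.total_jobs);
}

// Share of detected words that were censored.
function censoredShare(stats) {
  return percentage(stats.total_words_censored, stats.total_words_detected);
}
```

With the response above, `successRate` gives 95 and `censoredShare` gives 80.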

---

## WebSocket Events

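Real-time events are delivered over Socket.IO, so use a Socket.IO client rather than a plain WebSocket client. The underlying endpoint is `ws://localhost:5000/socket.io/`.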

### Client Events (send to server)

#### connect
Fired automatically when the Socket.IO client establishes the connection. It is not emitted manually.

#### join_job
Join a job room to receive updates.
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000"
}
```

#### leave_job
Leave a job room.
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000"
}
```

#### ping
Keep connection alive.

### Server Events (receive from server)

#### connected
Connection established.
```json
{
  "message": "Connected to Clean-Tracks server"
}
```

#### job_progress
Processing progress update.
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "progress": {
    "stage": "transcription",
    "percent": 45.5,
    "message": "Transcribing audio...",
    "timestamp": "2024-01-15T10:30:45Z"
  }
}
```

#### job_completed
Job completed successfully.
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "result": {
    "output_filename": "audio_censored.mp3",
    "statistics": {
      "words_detected": 15,
      "words_censored": 12,
      "processing_time": 90.5
    },
    "timestamp": "2024-01-15T10:31:30Z"
  }
}
```

#### job_failed
Job failed with error.
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "error": "Failed to process audio: Invalid format"
}
```

---

## Error Responses

All endpoints may return the following error responses:

### 400 Bad Request
```json
{
  "error": "Description of what went wrong"
}
```

### 404 Not Found
```json
{
  "error": "Resource not found"
}
```

### 500 Internal Server Error
```json
{
  "error": "Internal server error"
}
```
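
Since every error body shares the same `error` field, clients can handle failures in one place. The following sketch shows one way to do so. `ApiError` and `unwrapResponse` are hypothetical names, and the fallback message for bodies without an `error` field is an assumption.

```javascript
// Hypothetical error type carrying the HTTP status alongside the API's message.
class ApiError extends Error {
  constructor(status, message) {
    super(message);
    this.name = 'ApiError';
    this.status = status;
  }
}

// Returns the body for 2xx statuses. Otherwise throws ApiError using the
// documented { "error": "..." } field, with a generic fallback if it is missing.
function unwrapResponse(status, body) {
  if (status >= 200 && status < 300) return body;
  const message = body && typeof body.error === 'string' ? body.error : `HTTP ${status}`;
  throw new ApiError(status, message);
}
```

With `fetch`, this could be used as `unwrapResponse(response.status, await response.json())`.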

---

## Rate Limiting

In production, implement rate limiting. Suggested limits:
- 100 requests per minute for general endpoints
- 10 file uploads per minute
- 1000 WebSocket messages per minute
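
To illustrate what such a limit means in practice, here is a minimal fixed-window counter. It is a sketch of the general technique, not the server's implementation: `FixedWindowLimiter` is a hypothetical class, the per-client `key` (for example an IP address) is an assumption, and the in-memory state would not be shared across processes. A real deployment would more likely rely on a rate-limiting extension for the web framework or on a reverse proxy.

```javascript
// Minimal in-memory fixed-window limiter. `now` is injectable so the clock can be
// controlled in tests; it defaults to Date.now.
class FixedWindowLimiter {
  constructor(limit, windowMs = 60000, now = Date.now) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
    this.windows = new Map(); // key -> { start, count }
  }

  // Returns true if the request identified by `key` is allowed in the current window.
  allow(key) {
    const t = this.now();
    const entry = this.windows.get(key);
    if (!entry || t - entry.start >= this.windowMs) {
      // First request for this key, or the previous window has expired.
      this.windows.set(key, { start: t, count: 1 });
      return true;
    }
    if (entry.count >= this.limit) return false;
    entry.count += 1;
    return true;
  }
}

// Example: at most 10 uploads per minute per client key.
const uploadLimiter = new FixedWindowLimiter(10);
```

A request handler would call `uploadLimiter.allow(clientKey)` and reject the request when it returns `false`.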

---

## CORS

By default, CORS is enabled for all origins. In production, restrict it to specific allowed origins.

---

## Examples

### Complete Processing Workflow

1. **Upload file for processing:**
```bash
curl -X POST http://localhost:5000/api/process \
  -F "file=@audio.mp3" \
  -F "word_list_id=1" \
  -F "censor_method=beep"
```

2. **Check job status:**
```bash
curl http://localhost:5000/api/jobs/550e8400-e29b-41d4-a716-446655440000
```

3. **Download processed file:**
```bash
curl -o audio_censored.mp3 http://localhost:5000/api/jobs/550e8400-e29b-41d4-a716-446655440000/download
```

### WebSocket Connection (JavaScript)

```javascript
const socket = io('http://localhost:5000');

socket.on('connect', () => {
  console.log('Connected to server');

  // Join job room
  socket.emit('join_job', { job_id: '550e8400-e29b-41d4-a716-446655440000' });
});

socket.on('job_progress', (data) => {
  console.log(`Progress: ${data.progress.percent}% - ${data.progress.message}`);
});

socket.on('job_completed', (data) => {
  console.log('Job completed!', data.result);
});

socket.on('job_failed', (data) => {
  console.error('Job failed:', data.error);
});
```
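
The listeners above can also be wrapped in a promise so that calling code can simply `await` a job. The sketch below is one way to do that: `awaitJobViaSocket` is a hypothetical helper, and it assumes `socket` is a Socket.IO client instance exposing `on`, `off`, and `emit`.

```javascript
// Hypothetical helper: joins the job room and settles when the job finishes.
// Resolves with the `result` of job_completed, rejects with the job_failed error.
function awaitJobViaSocket(socket, jobId) {
  return new Promise((resolve, reject) => {
    // Remove both listeners and leave the room once the job has settled.
    const cleanup = () => {
      socket.off('job_completed', onCompleted);
      socket.off('job_failed', onFailed);
      socket.emit('leave_job', { job_id: jobId });
    };
    const onCompleted = (data) => {
      if (data.job_id !== jobId) return; // ignore events for other jobs
      cleanup();
      resolve(data.result);
    };
    const onFailed = (data) => {
      if (data.job_id !== jobId) return; // ignore events for other jobs
      cleanup();
      reject(new Error(data.error));
    };
    socket.on('job_completed', onCompleted);
    socket.on('job_failed', onFailed);
    socket.emit('join_job', { job_id: jobId });
  });
}
```

Usage might look like `const result = await awaitJobViaSocket(socket, jobId);` inside an `async` function, with a `try`/`catch` around it to handle failed jobs.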