# Clean-Tracks API Documentation
## Base URL
```
http://localhost:5000/api
```
## Authentication
The API currently requires no authentication. Before deploying to production, add an authentication layer such as JWT or OAuth2.
## File Size Limits
- Maximum file size: 500MB
- Supported formats: MP3, WAV, FLAC, M4A, OGG, AAC
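
A client can check these limits before uploading. The following is a minimal sketch, not part of the Clean-Tracks codebase: the name `checkUpload` and its return shape are hypothetical, the check is by file extension only, and "500MB" is assumed to mean 500 × 1024 × 1024 bytes.

```javascript
// Limits copied from the list above. The helper name and return shape
// are illustrative assumptions, not part of the API.
const MAX_FILE_BYTES = 500 * 1024 * 1024; // assumes "500MB" means 500 MiB
const SUPPORTED_EXTENSIONS = ["mp3", "wav", "flac", "m4a", "ogg", "aac"];

function checkUpload(filename, sizeBytes) {
  // Take the text after the last dot as the extension, case-insensitively.
  const dot = filename.lastIndexOf(".");
  const ext = dot === -1 ? "" : filename.slice(dot + 1).toLowerCase();
  if (!SUPPORTED_EXTENSIONS.includes(ext)) {
    return { ok: false, reason: `Unsupported format: ${ext || "(none)"}` };
  }
  if (sizeBytes > MAX_FILE_BYTES) {
    return { ok: false, reason: "File exceeds the 500MB limit" };
  }
  return { ok: true, reason: null };
}
```

The server remains the authority on what it accepts; a check like this only avoids an upload that is certain to be rejected.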
## Endpoints
### Health Check
#### GET /api/health
Check if the API is running.
**Response:**
```json
{
  "status": "healthy",
  "timestamp": "2024-01-15T10:30:00Z",
  "version": "0.1.0"
}
```
---
### Audio Processing
#### POST /api/process
Process an audio file to detect and censor explicit content.
**Request:**
- Method: `POST`
- Content-Type: `multipart/form-data`

**Form Data:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| file | File | Yes | Audio file to process |
| word_list_id | Integer | No | ID of word list to use (default: system default) |
| censor_method | String | No | Method: `silence`, `beep`, `white_noise` (default: `beep`) |
| min_severity | String | No | Minimum severity: `low`, `medium`, `high`, `extreme` (default: `low`) |
**Response (202 Accepted):**
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "status": "queued",
  "message": "File uploaded and queued for processing"
}
```
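The request above can be assembled in JavaScript roughly as follows. This is a sketch under assumptions: `buildProcessForm` and `submitForProcessing` are hypothetical names, and the code relies on the global `FormData`, `Blob`, and `fetch` available in browsers and Node 18+. The field names, allowed values, and defaults come from the table above.

```javascript
// Allowed values and defaults are taken from the Form Data table.
const CENSOR_METHODS = ["silence", "beep", "white_noise"];
const MIN_SEVERITIES = ["low", "medium", "high", "extreme"];

// Hypothetical helper: builds the multipart body for POST /api/process.
function buildProcessForm(fileBlob, filename, options = {}) {
  const { wordListId, censorMethod = "beep", minSeverity = "low" } = options;
  if (!CENSOR_METHODS.includes(censorMethod)) {
    throw new Error(`Unknown censor_method: ${censorMethod}`);
  }
  if (!MIN_SEVERITIES.includes(minSeverity)) {
    throw new Error(`Unknown min_severity: ${minSeverity}`);
  }
  const form = new FormData();
  form.append("file", fileBlob, filename);
  // Omit word_list_id so the server falls back to its default list.
  if (wordListId !== undefined) form.append("word_list_id", String(wordListId));
  form.append("censor_method", censorMethod);
  form.append("min_severity", minSeverity);
  return form;
}

// Hypothetical helper: sends the form and returns { job_id, status, message }.
async function submitForProcessing(form, baseUrl = "http://localhost:5000/api") {
  const response = await fetch(`${baseUrl}/process`, { method: "POST", body: form });
  if (response.status !== 202) {
    throw new Error(`Upload failed with HTTP ${response.status}`);
  }
  return response.json();
}
```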
---
### Job Management
#### GET /api/jobs/{job_id}
Get the status of a processing job.
**Response:**
```json
{
  "id": 1,
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "input_filename": "audio.mp3",
  "output_filename": "audio_censored.mp3",
  "status": "completed",
  "started_at": "2024-01-15T10:30:00Z",
  "completed_at": "2024-01-15T10:31:30Z",
  "processing_time_seconds": 90.5,
  "audio_duration_seconds": 180.0,
  "words_detected": 15,
  "words_censored": 12,
  "error_message": null
}
```
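Clients that do not use the WebSocket events can poll this endpoint until the job finishes. The sketch below assumes that `completed` and `failed` are the only terminal statuses and treats every other value as still in progress; `waitForJob`, its options, and the 2-second interval are illustrative choices, not part of the API.

```javascript
// Assumption: these two statuses mean the job will not change again.
const TERMINAL_STATUSES = ["completed", "failed"];

function isFinished(job) {
  return TERMINAL_STATUSES.includes(job.status);
}

// Hypothetical helper: polls GET /api/jobs/{job_id} until the job finishes.
async function waitForJob(jobId, options = {}) {
  const {
    baseUrl = "http://localhost:5000/api",
    intervalMs = 2000,
    maxAttempts = 150,
    fetchImpl = fetch, // injectable so the loop can be exercised without a server
  } = options;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const response = await fetchImpl(`${baseUrl}/jobs/${encodeURIComponent(jobId)}`);
    if (!response.ok) {
      throw new Error(`Status check failed with HTTP ${response.status}`);
    }
    const job = await response.json();
    if (isFinished(job)) return job;
    // Wait before asking again.
    await new Promise((resolve) => setTimeout(resolve, intervalMs));
  }
  throw new Error(`Job ${jobId} did not finish after ${maxAttempts} checks`);
}
```

When the returned job has `status` set to `failed`, `error_message` carries the reason.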
#### GET /api/jobs
List recent processing jobs.
**Query Parameters:**
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| limit | Integer | 10 | Maximum number of jobs to return |
| status | String | - | Filter by status: `pending`, `processing`, `completed`, `failed` |
**Response:**
```json
[
  {
    "id": 1,
    "job_id": "550e8400-e29b-41d4-a716-446655440000",
    "input_filename": "audio.mp3",
    "status": "completed",
    "created_at": "2024-01-15T10:30:00Z"
  }
]
```
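The query parameters can be attached with `URLSearchParams`. This is a small sketch; `jobsUrl` is a hypothetical helper name, and parameters left undefined are omitted so the server applies its defaults.

```javascript
// Hypothetical helper: builds the URL for GET /api/jobs.
function jobsUrl(filters = {}, baseUrl = "http://localhost:5000/api") {
  const { limit, status } = filters;
  const params = new URLSearchParams();
  if (limit !== undefined) params.set("limit", String(limit));
  if (status !== undefined) params.set("status", status);
  const query = params.toString();
  return query ? `${baseUrl}/jobs?${query}` : `${baseUrl}/jobs`;
}
```

For example, `jobsUrl({ limit: 5, status: "failed" })` yields `http://localhost:5000/api/jobs?limit=5&status=failed`.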
#### GET /api/jobs/{job_id}/download
Download the processed audio file.
**Response:**
- Binary audio file download
- Content-Type: Based on processed file format
- Content-Disposition: attachment
---
### Word List Management
#### GET /api/wordlists
Get all word lists.
**Query Parameters:**
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| active_only | Boolean | true | Only return active lists |
**Response:**
```json
[
  {
    "id": 1,
    "name": "English - General",
    "description": "General English profanity",
    "language": "en",
    "is_default": true,
    "is_active": true,
    "word_count": 150,
    "created_at": "2024-01-15T10:00:00Z",
    "updated_at": "2024-01-15T10:00:00Z"
  }
]
```
#### POST /api/wordlists
Create a new word list.
**Request Body:**
```json
{
  "name": "Custom List",
  "description": "My custom word list",
  "language": "en",
  "is_default": false
}
```
**Response (201 Created):**
```json
{
  "id": 2,
  "message": "Word list created successfully"
}
```
#### GET /api/wordlists/{list_id}
Get details and statistics for a specific word list.
**Response:**
```json
{
  "id": 1,
  "name": "English - General",
  "total_words": 150,
  "by_severity": {
    "low": 30,
    "medium": 60,
    "high": 50,
    "extreme": 10
  },
  "by_category": {
    "profanity": 100,
    "slur": 30,
    "sexual": 20
  },
  "has_variations": 75,
  "created_at": "2024-01-15T10:00:00Z",
  "updated_at": "2024-01-15T10:00:00Z",
  "version": 1
}
```
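The severity counts can be turned into percentages of the whole list, for instance to draw a breakdown chart. The sketch below assumes, as in the example response, that the `by_severity` counts are parts of `total_words`; `severityShares` is a hypothetical helper.

```javascript
// Hypothetical helper: converts by_severity counts into percentages of total_words.
function severityShares(stats) {
  const shares = {};
  for (const [level, count] of Object.entries(stats.by_severity)) {
    // Multiply before dividing to keep round values exact where possible.
    shares[level] = stats.total_words === 0 ? 0 : (count * 100) / stats.total_words;
  }
  return shares;
}
```

With the example response, `low` comes to 20 percent (30 of 150) and `medium` to 40 percent (60 of 150).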
#### PUT /api/wordlists/{list_id}
Update a word list.
**Request Body:**
```json
{
  "name": "Updated Name",
  "description": "Updated description",
  "is_default": true
}
```
#### DELETE /api/wordlists/{list_id}
Delete a word list.
#### POST /api/wordlists/{list_id}/words
Add words to a word list.
**Request Body:**
```json
{
  "words": {
    "word1": {
      "severity": "high",
      "category": "profanity",
      "variations": ["w0rd1", "word_1"],
      "notes": "Common misspellings"
    },
    "word2": {
      "severity": "medium",
      "category": "slur"
    }
  }
}
```
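The body is keyed by word rather than being a list, so a client holding an array of entries has to reshape it. A possible sketch follows; `buildAddWordsBody` and the input entry shape are hypothetical, and the severity check assumes the four levels listed for `min_severity` are the only valid ones. Categories are passed through unchecked because the full set is not documented here.

```javascript
// Assumption: these are the only valid severities (taken from min_severity).
const WORD_SEVERITIES = ["low", "medium", "high", "extreme"];

// Hypothetical helper: turns [{ word, severity, category, ... }] into the
// keyed request body for POST /api/wordlists/{list_id}/words.
function buildAddWordsBody(entries) {
  const words = {};
  for (const { word, severity, category, variations, notes } of entries) {
    if (!WORD_SEVERITIES.includes(severity)) {
      throw new Error(`Unknown severity for "${word}": ${severity}`);
    }
    const details = { severity, category };
    // Optional fields are left out when empty, as in the "word2" example.
    if (variations && variations.length > 0) details.variations = variations;
    if (notes) details.notes = notes;
    words[word] = details;
  }
  return { words };
}
```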
#### DELETE /api/wordlists/{list_id}/words
Remove words from a word list.
**Request Body:**
```json
{
  "words": ["word1", "word2", "word3"]
}
```
#### GET /api/wordlists/{list_id}/export
Export a word list to file.
**Query Parameters:**
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| format | String | json | Export format: `json`, `csv`, `txt` |
**Response:**
- File download in requested format
#### POST /api/wordlists/{list_id}/import
Import words from a file into a word list.
**Request:**
- Method: `POST`
- Content-Type: `multipart/form-data`

**Form Data:**
| Field | Type | Required | Description |
|-------|------|----------|-------------|
| file | File | Yes | Word list file (JSON, CSV, or TXT) |
| merge | Boolean | No | Merge with existing words (default: false) |
---
### User Settings
#### GET /api/settings
Get user settings.
**Query Parameters:**
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| user_id | String | default | User identifier |
**Response:**
```json
{
  "id": 1,
  "user_id": "default",
  "processing": {
    "default_word_list_id": 1,
    "default_censor_method": "beep",
    "default_min_severity": "low"
  },
  "audio": {
    "preferred_output_format": "mp3",
    "preferred_bitrate": "192k",
    "preserve_metadata": true
  },
  "model": {
    "whisper_model_size": "base",
    "use_gpu": true
  },
  "ui": {
    "theme": "light",
    "language": "en",
    "show_waveform": true,
    "auto_play_preview": false
  },
  "privacy": {
    "save_history": true,
    "save_transcriptions": false,
    "anonymous_mode": false
  }
}
```
#### PUT /api/settings
Update user settings.
**Request Body:**
```json
{
  "user_id": "default",
  "default_censor_method": "silence",
  "whisper_model_size": "small",
  "theme": "dark"
}
```
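Note that the `GET` response groups settings under `processing`, `audio`, `model`, `ui`, and `privacy`, while the `PUT` body shown here is flat. A client that edits the grouped object may want to flatten it before sending. The sketch below assumes every leaf key is accepted by `PUT` under its own name, which the example only confirms for three keys; `flattenSettings` is a hypothetical helper.

```javascript
// Group names as they appear in the GET /api/settings response.
const SETTINGS_GROUPS = ["processing", "audio", "model", "ui", "privacy"];

// Hypothetical helper: lifts grouped settings into the flat shape of the PUT body.
function flattenSettings(settings) {
  const flat = { user_id: settings.user_id };
  for (const group of SETTINGS_GROUPS) {
    // Missing groups are skipped; leaf names are unique across groups in the example.
    Object.assign(flat, settings[group] ?? {});
  }
  return flat;
}
```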
---
### Statistics
#### GET /api/statistics
Get overall processing statistics.
**Response:**
```json
{
  "total_jobs": 1000,
  "completed_jobs": 950,
  "success_rate": 95.0,
  "total_audio_duration_hours": 500.5,
  "total_words_detected": 15000,
  "total_words_censored": 12000,
  "average_processing_time_seconds": 45.3
}
```
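In the example, `success_rate` matches `completed_jobs / total_jobs` expressed as a percentage (950 of 1000 is 95.0). Assuming that is how the field is defined, a client can derive the same figure, and an analogous censored-word share, from the raw counts. `deriveRates` is a hypothetical helper.

```javascript
// Hypothetical helper: recomputes percentage figures from the raw counters.
function deriveRates(stats) {
  const percent = (part, whole) => (whole === 0 ? 0 : (part * 100) / whole);
  return {
    // Assumed definition, consistent with the example response.
    successRate: percent(stats.completed_jobs, stats.total_jobs),
    // Share of detected words that were actually censored.
    censoredShare: percent(stats.total_words_censored, stats.total_words_detected),
  };
}
```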
---
## WebSocket Events
Connect to WebSocket at `ws://localhost:5000/socket.io/`
### Client Events (send to server)
#### connect
Establish WebSocket connection.
#### join_job
Join a job room to receive updates.
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000"
}
```
#### leave_job
Leave a job room.
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000"
}
```
#### ping
Keep connection alive.
### Server Events (receive from server)
#### connected
Connection established.
```json
{
  "message": "Connected to Clean-Tracks server"
}
```
#### job_progress
Processing progress update.
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "progress": {
    "stage": "transcription",
    "percent": 45.5,
    "message": "Transcribing audio...",
    "timestamp": "2024-01-15T10:30:45Z"
  }
}
```
#### job_completed
Job completed successfully.
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "result": {
    "output_filename": "audio_censored.mp3",
    "statistics": {
      "words_detected": 15,
      "words_censored": 12,
      "processing_time": 90.5
    },
    "timestamp": "2024-01-15T10:31:30Z"
  }
}
```
#### job_failed
Job failed with error.
```json
{
  "job_id": "550e8400-e29b-41d4-a716-446655440000",
  "error": "Failed to process audio: Invalid format"
}
```
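The join, progress, and completion events above can be wrapped in a single promise. This is one possible sketch: `watchJob` is a hypothetical helper, and `socket` is assumed to be a connected Socket.IO client (anything with `on`, `off`, and `emit` works). Events for other jobs are ignored by comparing `job_id`.

```javascript
// Hypothetical helper: resolves with `result` on job_completed,
// rejects with the server's error text on job_failed.
function watchJob(socket, jobId, onProgress = () => {}) {
  return new Promise((resolve, reject) => {
    // Remove listeners and leave the room once the job reaches an end state.
    const cleanup = () => {
      socket.off("job_progress", handleProgress);
      socket.off("job_completed", handleCompleted);
      socket.off("job_failed", handleFailed);
      socket.emit("leave_job", { job_id: jobId });
    };
    const handleProgress = (data) => {
      if (data.job_id === jobId) onProgress(data.progress);
    };
    const handleCompleted = (data) => {
      if (data.job_id !== jobId) return;
      cleanup();
      resolve(data.result);
    };
    const handleFailed = (data) => {
      if (data.job_id !== jobId) return;
      cleanup();
      reject(new Error(data.error));
    };
    socket.on("job_progress", handleProgress);
    socket.on("job_completed", handleCompleted);
    socket.on("job_failed", handleFailed);
    // Register listeners first, then join the room to start receiving updates.
    socket.emit("join_job", { job_id: jobId });
  });
}
```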
---
## Error Responses
All endpoints may return the following error responses:
### 400 Bad Request
```json
{
  "error": "Description of what went wrong"
}
```
### 404 Not Found
```json
{
  "error": "Resource not found"
}
```
### 500 Internal Server Error
```json
{
  "error": "Internal server error"
}
```
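Because every error body shares the `{ "error": ... }` shape, a client can convert failures into one exception type. The following sketch uses hypothetical names (`ApiError`, `toApiError`, `readJson`) and falls back to the HTTP status when the body is missing or not in the documented shape.

```javascript
// Hypothetical error type carrying the HTTP status alongside the message.
class ApiError extends Error {
  constructor(status, message) {
    super(message);
    this.name = "ApiError";
    this.status = status;
  }
}

// Builds an ApiError from a status code and a parsed (possibly null) body.
function toApiError(status, body) {
  const message = body && typeof body.error === "string" ? body.error : `HTTP ${status}`;
  return new ApiError(status, message);
}

// Parses a fetch Response, throwing ApiError for non-2xx statuses.
async function readJson(response) {
  const body = await response.json().catch(() => null); // tolerate non-JSON bodies
  if (!response.ok) throw toApiError(response.status, body);
  return body;
}
```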
---
## Rate Limiting
In production, implement rate limiting. Suggested limits:
- 100 requests per minute for general endpoints
- 10 file uploads per minute
- 1000 WebSocket messages per minute
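
A sliding-window counter is one common way to enforce per-minute limits like these. The sketch below illustrates the technique only; it is not how Clean-Tracks implements limiting, and `createWindowLimiter` is a hypothetical name. The clock is injectable so the behavior can be checked without waiting.

```javascript
// Hypothetical helper: allows at most `maxCalls` within any `windowMs` span.
function createWindowLimiter(maxCalls, windowMs, now = Date.now) {
  let timestamps = [];
  return function tryAcquire() {
    const current = now();
    // Forget calls that have aged out of the window.
    timestamps = timestamps.filter((stamp) => current - stamp < windowMs);
    if (timestamps.length >= maxCalls) return false;
    timestamps.push(current);
    return true;
  };
}

// Example: the suggested upload limit of 10 per minute.
const allowUpload = createWindowLimiter(10, 60 * 1000);
```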
---
## CORS
By default, CORS is enabled for all origins. In production, configure specific allowed origins.

---
## Examples
### Complete Processing Workflow
1. **Upload file for processing:**
```bash
curl -X POST http://localhost:5000/api/process \
  -F "file=@audio.mp3" \
  -F "word_list_id=1" \
  -F "censor_method=beep"
```
2. **Check job status:**
```bash
curl http://localhost:5000/api/jobs/550e8400-e29b-41d4-a716-446655440000
```
3. **Download processed file:**
```bash
curl -o audio_censored.mp3 http://localhost:5000/api/jobs/550e8400-e29b-41d4-a716-446655440000/download
```
### WebSocket Connection (JavaScript)
```javascript
const socket = io('http://localhost:5000');

socket.on('connect', () => {
  console.log('Connected to server');
  // Join job room
  socket.emit('join_job', { job_id: '550e8400-e29b-41d4-a716-446655440000' });
});

socket.on('job_progress', (data) => {
  console.log(`Progress: ${data.progress.percent}% - ${data.progress.message}`);
});

socket.on('job_completed', (data) => {
  console.log('Job completed!', data.result);
});

socket.on('job_failed', (data) => {
  console.error('Job failed:', data.error);
});
```