CLAUDE.md - Mixcloud RSS Generator

This file provides guidance to Claude Code when working with the Mixcloud RSS Generator component of the Personal AI Assistant project.

Relationship to Main Project

  • Part of the Personal AI Assistant ecosystem
  • See main CLAUDE.md for general project guidelines
  • Update CHANGELOG.md when making changes to this component

Understanding Component History

New to this component? Review CHANGELOG.md to understand:

  • Evolution from v0.1.0 CLI tool to v0.3.0 with specialized feeds
  • API changes and adaptations
  • Integration timeline with main project

Project Overview

A backend-only CLI tool that converts a Mixcloud user's shows into RSS feeds compatible with podcast apps and feed readers. It uses the shared content syndication services from the main AI Assistant project for reusability and consistency.

Architecture Change (v1.0): Refactored from Flask web app to backend-only CLI using shared services.

Technology Stack

  • Language: Python 3.8+
  • Architecture: Backend-only CLI with shared services
  • Key Libraries:
    • requests - HTTP requests to Mixcloud (via shared services)
    • xml.etree.ElementTree - RSS/XML generation (via shared services)
    • Shared Services: Content syndication components from shared/services/content_syndication/
  • Removed: Flask, BeautifulSoup4 (moved to shared services)
  • Caching: File-based caching in ./cache directory

Quick Start

Backend CLI Usage

# REQUIRED FIRST STEP - Activate virtual environment (when using local development)
source venv/bin/activate

# Set PYTHONPATH for shared services
export PYTHONPATH=/path/to/my-ai-projects:$PYTHONPATH

# Generate RSS feed for a Mixcloud user
python src/cli.py WRFG

# Save to file
python src/cli.py WRFG -o feed.xml

# Limit number of episodes
python src/cli.py WRFG -l 50

# Advanced filtering
python src/cli.py WRFG --keywords "rap,public affairs" --limit 100
python src/cli.py WRFG --rap-only --limit 200

Legacy Command Line (Deprecated)

# Still available for compatibility
python src/mixcloud_rss.py username -o feed.xml

Architecture Notes

New Backend-Only Architecture

  1. CLI Interface: src/cli.py - New primary interface with advanced filtering
  2. Shared Services: Uses shared/services/content_syndication/ components:
    • ContentSyndicationService - Main orchestration
    • MixcloudAPIClient - API interactions with caching
    • RSSFeedGenerator - RSS 2.0 generation with iTunes extensions
    • FeedFilterService - Advanced filtering (dates, keywords, tags)

RSS Feed Generation Flow

  1. CLI parses arguments and builds filters
  2. ContentSyndicationService orchestrates the process
  3. MixcloudAPIClient fetches user data with caching
  4. FeedFilterService applies filtering criteria
  5. RSSFeedGenerator creates RSS 2.0 compliant XML
  6. Results output to file or stdout
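
The six steps above can be sketched end to end. This is a minimal, self-contained illustration, not the shared services' actual code: fetch_shows, filter_shows, build_rss, and generate_feed are hypothetical stand-ins for MixcloudAPIClient, FeedFilterService, RSSFeedGenerator, and ContentSyndicationService, and the show data is canned rather than fetched from Mixcloud.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def fetch_shows(username):
    """Hypothetical stand-in for MixcloudAPIClient: returns canned data."""
    base = f"https://www.mixcloud.com/{username}"
    return [
        {"name": "Morning Interview", "url": f"{base}/morning-interview/",
         "created": datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)},
        {"name": "Jazz Hour", "url": f"{base}/jazz-hour/",
         "created": datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)},
    ]


def filter_shows(shows, keywords=None, limit=None):
    """Hypothetical stand-in for FeedFilterService: keyword match, then limit."""
    if keywords:
        lowered = [k.lower() for k in keywords]
        shows = [s for s in shows
                 if any(k in s["name"].lower() for k in lowered)]
    return shows if limit is None else shows[:limit]


def build_rss(username, shows):
    """Hypothetical stand-in for RSSFeedGenerator: plain RSS 2.0, no iTunes tags."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = f"{username} on Mixcloud"
    ET.SubElement(channel, "link").text = f"https://www.mixcloud.com/{username}/"
    ET.SubElement(channel, "description").text = f"Shows uploaded by {username}"
    for show in shows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = show["name"]
        ET.SubElement(item, "link").text = show["url"]
        ET.SubElement(item, "guid").text = show["url"]
        # format_datetime produces the RFC 822 style date that RSS expects.
        ET.SubElement(item, "pubDate").text = format_datetime(show["created"])
    return ET.tostring(rss, encoding="unicode")


def generate_feed(username, keywords=None, limit=None):
    """Hypothetical orchestration step: fetch, filter, then render."""
    shows = fetch_shows(username)
    return build_rss(username, filter_shows(shows, keywords, limit))
```

Calling generate_feed("WRFG", keywords=["interview"]) returns an RSS string containing only the show whose title matches the keyword. Writing it to a file or stdout corresponds to step 6.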

Key Files

  • src/cli.py - NEW Backend CLI interface (primary)
  • src/mixcloud_rss.py - Legacy RSS generation logic (deprecated)
  • generate_*.py - Specialized feed generators (work with legacy code)
  • cache/ - Cached API responses (gitignored)
  • Archived: archived_projects/mixcloud-ui/ - Former Flask web interface

Caching Strategy

  • Default TTL: 3600 seconds (1 hour)
  • Cache key: MD5 hash of request parameters
  • Stored as JSON files in ./cache directory
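
A minimal sketch of this strategy, assuming the cache key is the MD5 of the JSON-serialized request parameters and that expiry is judged by file modification time. The function names (cache_key, cache_get, cache_set) are hypothetical; the real MixcloudAPIClient may serialize parameters or track expiry differently.

```python
import hashlib
import json
import time
from pathlib import Path

CACHE_TTL = 3600  # seconds (1 hour), the default stated above


def cache_key(params):
    """MD5 hex digest of the request parameters (assumed JSON-serializable)."""
    # sort_keys makes the key independent of dict ordering.
    payload = json.dumps(params, sort_keys=True).encode("utf-8")
    return hashlib.md5(payload).hexdigest()


def cache_get(cache_dir, params, ttl=CACHE_TTL):
    """Return cached data, or None if the entry is missing or older than ttl."""
    path = Path(cache_dir) / f"{cache_key(params)}.json"
    if not path.exists():
        return None
    if time.time() - path.stat().st_mtime > ttl:
        return None  # expired; caller should refetch and call cache_set
    return json.loads(path.read_text(encoding="utf-8"))


def cache_set(cache_dir, params, data):
    """Store data as a JSON file named after the cache key."""
    directory = Path(cache_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{cache_key(params)}.json"
    path.write_text(json.dumps(data), encoding="utf-8")
```

MD5 is used here only to derive a short, stable file name, not for security.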

Development Commands

Testing Feed Generation

# REQUIRED FIRST STEP - Activate virtual environment (when using local development)
source venv/bin/activate
export PYTHONPATH=/path/to/my-ai-projects:$PYTHONPATH

# Test with new CLI
python src/cli.py WRFG --validate  # Quick user validation
python src/cli.py WRFG --user-info  # User information
python src/cli.py WRFG --verbose   # Verbose RSS generation

# Test filtering
python src/cli.py WRFG --rap-only --limit 100
python src/cli.py WRFG --keywords "interview" --output test.xml

# Validate RSS output
python -c "import sys, xml.dom.minidom as m; print(m.parse(sys.argv[1]).toprettyxml())" test.xml  # Pretty-print; fails if the XML is not well-formed

# Legacy testing (still works)
python src/mixcloud_rss.py WRFG
python generate_rap_feed.py

Cache Management

# Clear cache
rm -rf cache/*.json

# View cached data
ls -la cache/

Common Issues and Solutions

Mixcloud API Changes

Problem: Feed generation fails with extraction errors.

Solution:

  • Check if Mixcloud HTML structure changed
  • Update BeautifulSoup selectors in extract_shows_from_html()
  • Look for new API endpoints in browser network tab

Audio URL Extraction

Problem: "Could not extract audio URL" errors.

Solution:

  • Mixcloud often changes its audio URL format
  • Check the extract_audio_url() method
  • Regex patterns or API calls may need updating

RSS Feed Validation

Problem: Podcast apps reject the feed.

Solution:

  • Ensure all required RSS elements are present
  • Check iTunes podcast extensions
  • Verify that dates are in RFC 822 format (for example, Tue, 02 Jan 2024 12:00:00 +0000)
  • Use online RSS validators
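
Some of these checks can be scripted before reaching for an online validator. The sketch below is an assumption-laden starting point, not part of this component: check_feed is a hypothetical helper that verifies well-formedness, the channel elements RSS 2.0 requires (title, link, description), and that each pubDate parses as an RFC 822 style date. It does not check the iTunes extensions.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Channel elements that RSS 2.0 requires.
REQUIRED_CHANNEL = ("title", "link", "description")


def check_feed(xml_text):
    """Return a list of problems; an empty list means these basic checks passed."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]

    channel = root.find("channel")
    if root.tag != "rss" or channel is None:
        return ["missing <rss><channel> structure"]

    problems = []
    for tag in REQUIRED_CHANNEL:
        if not (channel.findtext(tag) or "").strip():
            problems.append(f"channel is missing <{tag}>")

    for index, item in enumerate(channel.findall("item")):
        # RSS 2.0 requires each item to have a title or a description.
        if item.find("title") is None and item.find("description") is None:
            problems.append(f"item {index} needs a <title> or <description>")
        pub_date = item.findtext("pubDate")
        if pub_date:
            try:
                parsedate_to_datetime(pub_date)
            except (TypeError, ValueError):
                # Older Python versions raise TypeError, newer ones ValueError.
                problems.append(f"item {index} has a non-RFC 822 pubDate: {pub_date!r}")
    return problems
```

A feed that passes these checks can still be rejected by a podcast app, so an online validator and a test in a real app remain worthwhile.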

Character Encoding

Problem: Special characters appear garbled.

Solution:

  • Ensure UTF-8 encoding throughout
  • Open output files with encoding='utf-8' when writing text, or write the bytes produced by .encode('utf-8') in binary mode
  • Set XML encoding declaration
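
With xml.etree.ElementTree, the encoding and the XML declaration can be handled in one call. A minimal sketch, where write_feed is a hypothetical helper rather than a function in this codebase:

```python
import xml.etree.ElementTree as ET


def write_feed(root, path):
    """Write an element tree as UTF-8 with a matching XML declaration."""
    # encoding="utf-8" encodes the body; xml_declaration=True emits the
    # <?xml ...?> header naming the same encoding.
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)
```

Letting the library write bytes directly avoids a mismatch between the declared encoding and the platform's default text encoding.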

Integration with Main Project

Usage in Personal AI Assistant

  1. Podcast Monitoring: RSS feeds enable the podcast processing pipeline
  2. Episode Detection: New episodes detected via RSS polling
  3. Audio Source: Provides URLs for audio downloading
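
Episode detection via polling can be sketched as a comparison between the item identifiers in the current feed and those already seen. How the main project actually tracks seen episodes is not described here; new_episode_guids and the seen_guids set are hypothetical.

```python
import xml.etree.ElementTree as ET


def new_episode_guids(feed_xml, seen_guids):
    """Return identifiers of feed items that are not in seen_guids, in feed order."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Fall back to <link> when an item carries no <guid>.
        guid = item.findtext("guid") or item.findtext("link")
        if guid and guid not in seen_guids:
            fresh.append(guid)
    return fresh
```

After each poll, the caller would add the returned identifiers to seen_guids so that the next poll reports only later episodes.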

Specialized Feeds

The generate_*.py scripts create filtered feeds for specific shows:

  • generate_rap_feed.py - Revolutionary African Perspectives shows
  • generate_july21_feed.py - Specific date filtering

Testing

Manual Testing

# REQUIRED FIRST STEP - Activate virtual environment (when using local development)
source venv/bin/activate

# Test basic functionality
python src/mixcloud_rss.py WRFG -o test.xml
head -20 test.xml  # Check output

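# Web interface (archived): the Flask app now lives in archived_projects/mixcloud-ui/,
# so src/web_app.py may no longer exist in this component
# python src/web_app.py
# Visit http://localhost:5000 and test form submission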

Feed Validation

Deployment Considerations

Docker Support

# Build image
docker build -t mixcloud-rss .

# Run container
docker run -p 5000:5000 mixcloud-rss

Environment Variables

  • No required environment variables
  • Optional: CACHE_DIR, CACHE_TTL for customization
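
One way these optional variables could be read, falling back to the defaults given in the Caching Strategy section. This is an assumption about usage, not the component's actual configuration code; load_cache_settings is a hypothetical helper.

```python
import os


def load_cache_settings(environ=os.environ):
    """Return (cache_dir, cache_ttl_seconds), using documented defaults when unset."""
    cache_dir = environ.get("CACHE_DIR", "./cache")
    # Environment values are strings, so the TTL is converted to an int.
    cache_ttl = int(environ.get("CACHE_TTL", "3600"))
    return cache_dir, cache_ttl
```

Passing the environment as a parameter keeps the function easy to test without modifying the real process environment.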

Troubleshooting

Debug Mode

# Add to scripts for verbose output
import logging
logging.basicConfig(level=logging.DEBUG)

Common Error Messages

  • "No shows found": Check if username is correct or if Mixcloud is accessible
  • "Cache directory not writable": Ensure ./cache exists and has write permissions
  • "Invalid XML": Check for unescaped special characters in show data
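
For the "Invalid XML" case, the usual culprits are raw &, <, and > characters and control characters that XML 1.0 forbids. The sketch below uses a hypothetical clean_text helper and applies only when XML is assembled by string concatenation; xml.etree.ElementTree escapes text itself, so passing already-escaped text to it would double-escape.

```python
import re
from xml.sax.saxutils import escape

# C0 control characters other than tab, LF, and CR are not allowed in XML 1.0.
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def clean_text(value):
    """Drop forbidden control characters, then escape &, <, and >."""
    return escape(_INVALID_XML_CHARS.sub("", value))
```

For example, clean_text("Rock & Roll <live>") returns "Rock &amp; Roll &lt;live&gt;".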

Performance

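  • Caching is crucial for performance: with the default 1-hour TTL, repeat runs within the hour are served from ./cache instead of re-requesting data from Mixcloud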
  • Consider implementing Redis cache for production
  • Batch requests when generating multiple feeds

Best Practices

  1. Always validate generated RSS feeds
  2. Test with multiple podcast apps
  3. Monitor Mixcloud for API/structure changes
  4. Keep cache directory clean in development
  5. Log errors for debugging production issues