Initial commit - Mixcloud RSS Generator

commit d7d82c4211 · 2025-08-14 00:43:59 -04:00
26 changed files with 1633 additions and 0 deletions
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@ -0,0 +1,66 @@
 # Changelog - Mixcloud RSS Generator
 All notable changes to the Mixcloud RSS Generator component will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 ## [Unreleased]
 ### To Do
 - Add support for playlists and categories
 - Implement feed pagination for users with many shows
 - Add configurable cache expiration
 - Support for custom feed metadata
 ## [0.3.0] - 2025-08-04
 ### Added
 - Generated specialized RSS feeds for Revolutionary African Perspectives (RAP) show
 - Created filtered feeds for specific dates (e.g., the July 21 episode)
 - Added precise show filtering capabilities
 - Support for WRFG radio show feeds
 ### Changed
 - Enhanced feed generation scripts for better show filtering
 - Improved caching mechanism for faster feed updates
 ## [0.2.0] - 2025-07-15
 ### Added
 - Web interface for RSS feed generation
 - RESTful API endpoints for programmatic access
 - Built-in caching system for improved performance
 - Docker support with dedicated Dockerfile
 - HTML template for web interface
 ### Changed
 - Restructured code into modular components (src directory)
 - Improved error handling for invalid Mixcloud URLs
 - Enhanced feed metadata with proper iTunes tags
 ### Fixed
 - Audio URL extraction for newer Mixcloud API changes
 - Character encoding issues in show descriptions
 ## [0.1.0] - 2025-06-01
 ### Added
 - Initial release with core functionality
 - Command-line interface for RSS feed generation
 - Support for converting Mixcloud user shows to RSS
 - Basic feed generation with episode metadata
 - Compatible with major podcast apps
 - Configurable episode limits
 ### Technical Details
 - Python-based implementation
 - Uses Mixcloud's public API
 - Generates standard RSS 2.0 feeds with podcast extensions
 ---
 ## Integration with Personal AI Assistant
 This component is used by the main Personal AI Assistant project for:
 - Monitoring podcast RSS feeds for new episodes
 - Providing audio sources for the transcription pipeline
 - Enabling podcast app compatibility for processed shows
 For main project changes, see the [parent changelog](../CHANGELOG.md).
--- a/CLAUDE.md
+++ b/CLAUDE.md
@ -0,0 +1,228 @@
 # CLAUDE.md - Mixcloud RSS Generator
 This file provides guidance to Claude Code when working with the Mixcloud RSS Generator component of the Personal AI Assistant project.
 ## Relationship to Main Project
 - Part of the Personal AI Assistant ecosystem
 - See main [CLAUDE.md](../CLAUDE.md) for general project guidelines
 - Update [CHANGELOG.md](./CHANGELOG.md) when making changes to this component
 ## Understanding Component History
 **New to this component?** Review [CHANGELOG.md](./CHANGELOG.md) to understand:
 - Evolution from v0.1.0 CLI tool to v0.3.0 with specialized feeds
 - API changes and adaptations
 - Integration timeline with main project
 ## Project Overview
 A backend-only CLI tool that converts Mixcloud user shows into RSS feeds compatible with podcast apps and feed readers. Uses shared content syndication services from the main AI Assistant project for reusability and consistency.
 **Architecture Change (v1.0)**: Refactored from Flask web app to backend-only CLI using shared services.
 ## Technology Stack
 - **Language**: Python 3.8+
 - **Architecture**: Backend-only CLI with shared services
 - **Key Libraries**: 
  - `requests` - HTTP requests to Mixcloud (via shared services)
  - `xml.etree.ElementTree` - RSS/XML generation (via shared services)
  - **Shared Services**: Content syndication components from `shared/services/content_syndication/`
 - **Removed**: Flask, BeautifulSoup4 (moved to shared services)
 - **Caching**: File-based caching in `./cache` directory
 ## Quick Start
 ### Backend CLI Usage
 ```bash
 # REQUIRED FIRST STEP - Activate virtual environment (when using local development)
 source venv/bin/activate
 # Set PYTHONPATH for shared services
 export PYTHONPATH=/path/to/my-ai-projects:$PYTHONPATH
 # Generate RSS feed for a Mixcloud user
 python src/cli.py WRFG
 # Save to file
 python src/cli.py WRFG -o feed.xml
 # Limit number of episodes
 python src/cli.py WRFG -l 50
 # Advanced filtering
 python src/cli.py WRFG --keywords "rap,public affairs" --limit 100
 python src/cli.py WRFG --rap-only --limit 200
 ```
 ### Legacy Command Line (Deprecated)
 ```bash
 # Still available for compatibility
 python src/mixcloud_rss.py username -o feed.xml
 ```
 ## Architecture Notes
 ### New Backend-Only Architecture
 1. **CLI Interface**: `src/cli.py` - New primary interface with advanced filtering
 2. **Shared Services**: Uses `shared/services/content_syndication/` components:
   - `ContentSyndicationService` - Main orchestration 
   - `MixcloudAPIClient` - API interactions with caching
   - `RSSFeedGenerator` - RSS 2.0 generation with iTunes extensions
   - `FeedFilterService` - Advanced filtering (dates, keywords, tags)
 ### RSS Feed Generation Flow
 1. CLI parses arguments and builds filters
 2. ContentSyndicationService orchestrates the process
 3. MixcloudAPIClient fetches user data with caching
 4. FeedFilterService applies filtering criteria
 5. RSSFeedGenerator creates RSS 2.0 compliant XML
 6. Results output to file or stdout
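The flow above can be sketched with the standard library alone. This is a minimal illustration, not the shared services' actual code: `filter_by_keywords`, `build_rss`, the show dictionary keys, and the example URLs are all hypothetical stand-ins, and the API fetch in step 3 is replaced by hard-coded data.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def filter_by_keywords(shows, keywords):
    """Step 4: keep shows whose title contains any comma-separated keyword."""
    wanted = [k.strip().lower() for k in keywords.split(",") if k.strip()]
    if not wanted:
        return list(shows)
    return [s for s in shows if any(k in s["name"].lower() for k in wanted)]


def build_rss(title, link, shows):
    """Step 5: build a minimal RSS 2.0 document and return it as a string."""
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = f"Shows from {title}"
    for show in shows:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = show["name"]
        ET.SubElement(item, "link").text = show["url"]
        ET.SubElement(item, "pubDate").text = format_datetime(show["created"])
    return ET.tostring(rss, encoding="unicode")


# Step 3 stand-in: hard-coded data instead of a real API call (assumption).
shows = [
    {"name": "Public Affairs: RAP", "url": "https://example.com/rap",
     "created": datetime(2025, 7, 21, 14, 0, tzinfo=timezone.utc)},
    {"name": "Morning Jazz", "url": "https://example.com/jazz",
     "created": datetime(2025, 7, 22, 9, 0, tzinfo=timezone.utc)},
]
feed = build_rss("WRFG Atlanta", "https://www.mixcloud.com/WRFG/",
                 filter_by_keywords(shows, "public affairs"))
print(feed)  # Step 6: write the result to stdout
```

Keeping filtering separate from XML generation mirrors the split between `FeedFilterService` and `RSSFeedGenerator`, so each step can be tested on its own.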
 ### Key Files
 - `src/cli.py` - **NEW** Backend CLI interface (primary)
 - `src/mixcloud_rss.py` - Legacy RSS generation logic (deprecated)
 - `generate_*.py` - Specialized feed generators (work with legacy code)
 - `cache/` - Cached API responses (gitignored)
 - **Archived**: `archived_projects/mixcloud-ui/` - Former Flask web interface
 ### Caching Strategy
 - Default TTL: 3600 seconds (1 hour)
 - Cache key: MD5 hash of request parameters
 - Stored as JSON files in `./cache` directory
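A minimal sketch of this strategy, assuming the cache key is the MD5 of the JSON-serialized request parameters and that freshness is judged by file modification time. The helper names (`cache_path`, `cache_get`, `cache_put`) and the parameter layout are hypothetical; the shared `MixcloudAPIClient` may differ in detail.

```python
import hashlib
import json
import os
import tempfile
import time


def cache_path(cache_dir, params):
    """Map request parameters to '<md5 hex>.json' inside cache_dir."""
    canonical = json.dumps(params, sort_keys=True)
    digest = hashlib.md5(canonical.encode("utf-8")).hexdigest()
    return os.path.join(cache_dir, digest + ".json")


def cache_get(cache_dir, params, ttl=3600):
    """Return cached data, or None if it is missing or older than ttl seconds."""
    path = cache_path(cache_dir, params)
    try:
        age = time.time() - os.path.getmtime(path)
    except OSError:
        return None  # no cache file yet
    if age > ttl:
        return None  # expired entry
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def cache_put(cache_dir, params, data):
    """Store data as JSON under the key derived from params."""
    os.makedirs(cache_dir, exist_ok=True)
    with open(cache_path(cache_dir, params), "w", encoding="utf-8") as f:
        json.dump(data, f)


# Demo in a temporary directory so the real ./cache is untouched.
demo_dir = tempfile.mkdtemp()
request = {"endpoint": "/WRFG/", "limit": 50}
cache_put(demo_dir, request, {"username": "WRFG"})
print(cache_get(demo_dir, request))
```

Sorting the keys before hashing means the same parameters always produce the same file name, regardless of dictionary order.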
 ## Development Commands
 ### Testing Feed Generation
 ```bash
 # REQUIRED FIRST STEP - Activate virtual environment (when using local development)
 source venv/bin/activate
 export PYTHONPATH=/path/to/my-ai-projects:$PYTHONPATH
 # Test with new CLI
 python src/cli.py WRFG --validate  # Quick user validation
 python src/cli.py WRFG --user-info  # User information
 python src/cli.py WRFG --verbose   # Verbose RSS generation
 # Test filtering
 python src/cli.py WRFG --rap-only --limit 100
 python src/cli.py WRFG --keywords "interview" --output test.xml
 # Validate RSS output
 python -c "import sys, xml.dom.minidom as md; print(md.parse(sys.argv[1]).toprettyxml())" test.xml  # Pretty-print; fails if the XML is not well-formed
 # Legacy testing (still works)
 python src/mixcloud_rss.py WRFG
 python generate_rap_feed.py
 ```
 ### Cache Management
 ```bash
 # Clear cache
 rm -rf cache/*.json
 # View cached data
 ls -la cache/
 ```
 ## Common Issues and Solutions
 ### Mixcloud API Changes
 **Problem**: Feed generation fails with extraction errors
 **Solution**: 
 - Check if Mixcloud HTML structure changed
 - Update BeautifulSoup selectors in `extract_shows_from_html()`
 - Look for new API endpoints in browser network tab
 ### Audio URL Extraction
 **Problem**: "Could not extract audio URL" errors
 **Solution**:
 - Mixcloud often changes their audio URL format
 - Check `extract_audio_url()` method
 - May need to update regex patterns or API calls
 ### RSS Feed Validation
 **Problem**: Podcast apps reject the feed
 **Solution**:
 - Ensure all required RSS elements are present
 - Check iTunes podcast extensions
 - Validate dates are in RFC 822 format (e.g., `Mon, 21 Jul 2025 14:00:00 +0000`)
 - Use online RSS validators
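For the date check, the standard library can produce and parse RFC 822 dates. The sketch below assumes the source timestamps are ISO 8601 UTC strings like the `created_time` values in the cached API responses; `to_rfc822` is a hypothetical helper name.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def to_rfc822(iso_timestamp):
    """Convert e.g. '2019-05-11T19:08:22Z' to an RFC 822 date string."""
    parsed = datetime.strptime(iso_timestamp, "%Y-%m-%dT%H:%M:%SZ")
    return format_datetime(parsed.replace(tzinfo=timezone.utc))


pub_date = to_rfc822("2019-05-11T19:08:22Z")
print(pub_date)

# Round trip: a value that parses back cleanly is a valid pubDate.
round_trip = parsedate_to_datetime(pub_date)
```

`parsedate_to_datetime` can also be run over the `pubDate` values of an existing feed to find entries that podcast apps are likely to reject.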
 ### Character Encoding
 **Problem**: Special characters appear garbled
 **Solution**:
 - Ensure UTF-8 encoding throughout
 - Open output files with `encoding='utf-8'` (as the `generate_*.py` scripts do) instead of relying on the platform default
 - Set XML encoding declaration
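One way to cover all three points at once is to let `xml.etree.ElementTree` do the writing. This is a sketch under the assumption that the feed is held as an element tree; `write_feed` is a hypothetical helper, not a function from this project.

```python
import os
import tempfile
import xml.etree.ElementTree as ET


def write_feed(root, path):
    """Write an element tree as UTF-8 bytes with an XML declaration."""
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


rss = ET.Element("rss", version="2.0")
channel = ET.SubElement(rss, "channel")
# ElementTree escapes '&' and '<' in text, so raw show data can be assigned as-is.
ET.SubElement(channel, "title").text = "Café & Radio <live>"

path = os.path.join(tempfile.mkdtemp(), "feed.xml")
write_feed(rss, path)

with open(path, "rb") as f:
    raw = f.read()
print(raw.decode("utf-8"))
```

Because escaping happens at serialization time, this approach also avoids the "Invalid XML" errors caused by unescaped special characters in show titles or descriptions.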
 ## Integration with Main Project
 ### Usage in Personal AI Assistant
 1. **Podcast Monitoring**: RSS feeds enable the podcast processing pipeline
 2. **Episode Detection**: New episodes detected via RSS polling
 3. **Audio Source**: Provides URLs for audio downloading
 ### Specialized Feeds
 The `generate_*.py` scripts create filtered feeds for specific shows:
 - `generate_rap_feed.py` - Revolutionary African Perspectives shows
 - `generate_july21_feed.py` - Specific date filtering
 ## Testing
 ### Manual Testing
 ```bash
 # REQUIRED FIRST STEP - Activate virtual environment (when using local development)
 source venv/bin/activate
 # Test basic functionality
 python src/mixcloud_rss.py WRFG -o test.xml
 head -20 test.xml  # Check output
 # Test legacy web interface (archived under archived_projects/mixcloud-ui/ after the CLI refactor)
 python src/web_app.py
 # Visit http://localhost:5000 and test form submission
 ```
 ### Feed Validation
 - Use https://validator.w3.org/feed/ for RSS validation
 - Test in actual podcast apps (Apple Podcasts, Overcast)
 - Check all links are accessible
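Before reaching for an online validator, a quick local check can catch the most common omissions. The sketch below only covers the channel elements RSS 2.0 requires (`title`, `link`, `description`) plus the `enclosure` that podcast apps need on each item; `find_feed_problems` is a hypothetical helper and is no substitute for a full validator.

```python
import xml.etree.ElementTree as ET


def find_feed_problems(xml_text):
    """Return a list of human-readable problems; an empty list means none found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    channel = root.find("channel")
    if root.tag != "rss" or channel is None:
        return ["missing <rss> root or <channel> element"]
    problems = []
    for tag in ("title", "link", "description"):
        if not (channel.findtext(tag) or "").strip():
            problems.append(f"channel is missing <{tag}>")
    for index, item in enumerate(channel.findall("item"), start=1):
        enclosure = item.find("enclosure")
        if enclosure is None or not enclosure.get("url"):
            problems.append(f"item {index} has no <enclosure url=...>")
    return problems


good = (
    '<rss version="2.0"><channel><title>T</title><link>https://example.com</link>'
    "<description>D</description><item><title>Ep</title>"
    '<enclosure url="https://example.com/a.mp3" length="1" type="audio/mpeg"/>'
    "</item></channel></rss>"
)
bad = '<rss version="2.0"><channel><title>T</title><item><title>Ep</title></item></channel></rss>'
print(find_feed_problems(good))
print(find_feed_problems(bad))
```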
 ## Deployment Considerations
 ### Docker Support
 ```bash
 # Build image
 docker build -t mixcloud-rss .
 # Run container
 docker run -p 5000:5000 mixcloud-rss
 ```
 ### Environment Variables
 - No required environment variables
 - Optional: `CACHE_DIR`, `CACHE_TTL` for customization
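How the component reads these variables is not shown here, so the following is only a sketch of one reasonable approach: fall back to `./cache` and 3600 seconds (the documented defaults) when a variable is unset or malformed. `load_cache_settings` is a hypothetical helper name.

```python
import os


def load_cache_settings(env=None):
    """Return (cache_dir, ttl_seconds) from CACHE_DIR and CACHE_TTL."""
    env = os.environ if env is None else env
    cache_dir = env.get("CACHE_DIR", "./cache")
    try:
        ttl = int(env.get("CACHE_TTL", "3600"))
    except ValueError:
        ttl = 3600  # ignore a malformed value and keep the default
    return cache_dir, ttl


print(load_cache_settings({}))
print(load_cache_settings({"CACHE_DIR": "/tmp/mixcloud", "CACHE_TTL": "7200"}))
```

Passing the environment as a parameter keeps the function easy to test without touching the real process environment.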
 ## Troubleshooting
 ### Debug Mode
 ```python
 # Add to scripts for verbose output
 import logging
 logging.basicConfig(level=logging.DEBUG)
 ```
 ### Common Error Messages
 - **"No shows found"**: Check if username is correct or if Mixcloud is accessible
 - **"Cache directory not writable"**: Ensure `./cache` exists and has write permissions
 - **"Invalid XML"**: Check for unescaped special characters in show data
 ### Performance
 - Cache is crucial for performance
 - Consider implementing Redis cache for production
 - Batch requests when generating multiple feeds
 ## Best Practices
 1. Always validate generated RSS feeds
 2. Test with multiple podcast apps
 3. Monitor Mixcloud for API/structure changes
 4. Keep cache directory clean in development
 5. Log errors for debugging production issues
--- a/Dockerfile
+++ b/Dockerfile
@ -0,0 +1,23 @@
 FROM python:3.11-slim
 WORKDIR /app
 # Install dependencies
 COPY requirements.txt .
 RUN pip install --no-cache-dir -r requirements.txt
 # Copy application code
 COPY src/ ./src/
 # Create cache directory
 RUN mkdir -p /app/cache
 # Expose port
 EXPOSE 5000
 # Set environment variables
 ENV FLASK_APP=src/web_app.py
 ENV PYTHONUNBUFFERED=1
 # Run the web server
 CMD ["python", "src/web_app.py"]
--- a/README.md
+++ b/README.md
@ -0,0 +1,214 @@
 # Mixcloud RSS Generator (Backend CLI)
 Convert Mixcloud shows into RSS feeds using a lightweight command-line interface that leverages shared content syndication services.
 ## Features
 - 🎵 Convert any Mixcloud user's shows into an RSS feed
 - 📱 Compatible with podcast apps (Apple Podcasts, Overcast, etc.)
 - 🚀 Fast with built-in caching
 - 🔧 Backend-only CLI tool (no web interface)
 - 📡 Advanced filtering options (keywords, dates, tags)
 - ♻️  Uses shared services for reusability across projects
 ## Architecture
 This project has been refactored to use shared services:
 - **Backend Services**: Located in `shared/services/content_syndication/`
 - **CLI Interface**: `src/cli.py` provides command-line access
 - **Legacy Components**: Web UI archived in `archived_projects/mixcloud-ui/`
 ## Installation
 ```bash
 # Navigate to the mixcloud-rss-generator directory
 cd mixcloud-rss-generator
 # Install dependencies
 pip install -r requirements.txt
 # Ensure shared services are accessible
 export PYTHONPATH=/path/to/my-ai-projects:$PYTHONPATH
 ```
 ## Usage
 ### Basic Usage
 Generate RSS feed from Mixcloud user:
 ```bash
 # Basic RSS generation
 python src/cli.py WRFG
 # From full Mixcloud URL
 python src/cli.py --url https://www.mixcloud.com/NTSRadio/
 # Save to file with custom limit
 python src/cli.py WRFG --limit 50 --output feed.xml
 ```
 ### Advanced Filtering
 ```bash
 # Filter by keywords in title
 python src/cli.py WRFG --keywords "rap,public affairs" --limit 100
 # Filter by date range
 python src/cli.py WRFG --date-range 2024-01-01 2024-12-31
 # Filter by specific dates
 python src/cli.py WRFG --specific-dates "July 21,Aug 15,2024-09-01"
 # Revolutionary African Perspectives only (convenience filter)
 python src/cli.py WRFG --rap-only --limit 100
 # Combine multiple filters
 python src/cli.py WRFG --keywords "interview" --tags "house,techno" --limit 30
 ```
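The `--specific-dates` value mixes year-less entries (`July 21`, `Aug 15`) with full ISO dates. The CLI's real parsing rules are not shown here; the sketch below assumes that a year-less entry matches that day in any year, and both `parse_specific_dates` and `matches` are hypothetical names.

```python
from datetime import date, datetime


def parse_specific_dates(spec):
    """Parse 'July 21,Aug 15,2024-09-01' into (year_or_None, month, day) tuples."""
    parsed = []
    for token in (t.strip() for t in spec.split(",")):
        if not token:
            continue
        try:
            full = datetime.strptime(token, "%Y-%m-%d")
            parsed.append((full.year, full.month, full.day))
            continue
        except ValueError:
            pass  # not an ISO date; try month-name forms next
        for fmt in ("%B %d %Y", "%b %d %Y"):
            try:
                # Append a leap year so 'Feb 29' parses; the year is discarded.
                partial = datetime.strptime(f"{token} 2000", fmt)
            except ValueError:
                continue
            parsed.append((None, partial.month, partial.day))
            break
        else:
            raise ValueError(f"unrecognized date: {token!r}")
    return parsed


def matches(day, wanted):
    """True if day equals any wanted entry; a None year matches every year."""
    return any((y is None or y == day.year) and m == day.month and d == day.day
               for y, m, d in wanted)


wanted = parse_specific_dates("July 21,Aug 15,2024-09-01")
print(wanted)
print(matches(date(2025, 7, 21), wanted))
```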
 ### Utility Operations
 ```bash
 # Validate user without generating feed
 python src/cli.py WRFG --validate
 # Get user information
 python src/cli.py WRFG --user-info
 # Verbose output for debugging
 python src/cli.py WRFG --verbose
 ```
 ### Integration with Podcast Apps
 1. Generate RSS feed and save to a publicly accessible location
 2. Use the file path or served URL in your podcast app:
   - **Apple Podcasts**: File → Add Show by URL
   - **Overcast**: Add URL → Plus button → Add URL  
   - **Pocket Casts**: Search → Enter URL
   - **Castro**: Library → Sources → Plus → Add Podcast by URL
 ## Shared Services Architecture
 The RSS generation now uses modular services from `shared/services/content_syndication/`:
 - **ContentSyndicationService**: Main orchestration service
 - **MixcloudAPIClient**: Handles Mixcloud API interactions with caching
 - **RSSFeedGenerator**: Creates RSS 2.0 compliant feeds
 - **FeedFilterService**: Advanced content filtering capabilities
 ## Configuration
 ### Cache Settings
 ```bash
 # Custom cache directory and TTL
 python src/cli.py WRFG --cache-dir ./custom-cache --cache-ttl 7200
 ```
 ### Environment Variables
 - `PYTHONPATH`: Must include parent project directory for shared imports
 - No other environment variables required for basic operation
 ## Examples
 ### Generate Filtered Feed
 ```python
 # In Python script
 from shared.services.content_syndication import ContentSyndicationService
 # Initialize service
 service = ContentSyndicationService(cache_dir="./cache", cache_ttl=3600)
 # Generate with filters
 filters = {"keywords": "rap,public affairs", "start_date": "2024-01-01"}
 rss_feed = service.generate_mixcloud_rss("WRFG", limit=50, filters=filters)
 ```
 ### Integration with Main AI Project
 ```python
 # Use in podcast processing pipeline
 from shared.services.content_syndication import ContentSyndicationService
 service = ContentSyndicationService()
 rss_feed = service.generate_rap_feed("WRFG", limit=100)  # Convenience method
 # Save for podcast processing
 with open("data/feeds/wrfg_rap.xml", "w", encoding="utf-8") as f:
    f.write(rss_feed)
 ```
 ## Migration Notes
 ### From Web Interface
 The web interface (`web_app.py` and `templates/`) has been archived to `archived_projects/mixcloud-ui/`. Key changes:
 - **Before**: `python src/web_app.py` (Flask web server)
 - **After**: `python src/cli.py [options]` (CLI tool)
 ### From Legacy Script
 The original `mixcloud_rss.py` remains for compatibility but new usage should prefer the CLI:
 - **Legacy**: `python src/mixcloud_rss.py username`
 - **New**: `python src/cli.py username`
 ## Troubleshooting
 ### Import Errors
 ```bash
 # Ensure PYTHONPATH includes project root
 export PYTHONPATH=/path/to/my-ai-projects:$PYTHONPATH
 python src/cli.py WRFG --validate
 ```
 ### Cache Issues
 ```bash
 # Clear cache if experiencing stale data
 rm -rf cache/*.json
 python src/cli.py WRFG --verbose
 ```
 ### API Errors
 - **User not found**: Check username spelling and profile visibility
 - **No shows**: User might have private shows or no content
 - **Rate limiting**: Wait between requests or increase cache TTL
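When scripting against the service, "wait between requests" can be automated with exponential backoff. This is a generic sketch: `RateLimited` is a hypothetical exception standing in for whatever error the API client raises on a rate-limit response, and `with_backoff` is not part of the shared services.

```python
import time


class RateLimited(Exception):
    """Hypothetical error raised when the API reports rate limiting."""


def with_backoff(fetch, attempts=4, base_delay=1.0, sleep=time.sleep):
    """Call fetch(), retrying with doubling delays while it raises RateLimited."""
    for attempt in range(attempts):
        try:
            return fetch()
        except RateLimited:
            if attempt == attempts - 1:
                raise  # out of retries, surface the error
            sleep(base_delay * 2 ** attempt)


# Demo: fail twice, then succeed; delays are recorded instead of slept.
calls = []
delays = []


def flaky_fetch():
    calls.append(1)
    if len(calls) < 3:
        raise RateLimited()
    return "ok"


result = with_backoff(flaky_fetch, sleep=delays.append)
print(result, delays)
```

Injecting `sleep` as a parameter lets the retry logic be tested without real delays.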
 ## Advanced Usage
 ### Specialized Feeds
 The CLI includes convenience options for specialized content:
 ```bash
 # Revolutionary African Perspectives shows only
 python src/cli.py WRFG --rap-only --limit 200
 # Recent interviews only  
 python src/cli.py WRFG --keywords "interview" --date-range 2024-01-01 $(date +%Y-%m-%d)
 ```
 ### Batch Processing
 ```bash
 # Process multiple users
 mkdir -p feeds  # make sure the output directory exists
 for user in WRFG NTSRadio ResidentAdvisor; do
    python src/cli.py "$user" --output "feeds/${user}.xml" --limit 50
 done
 ```
 ## Integration with AI Assistant
 This RSS generator integrates with the Personal AI Assistant project for:
 - **Podcast Processing**: RSS feeds enable episode detection
 - **Audio Analysis**: Provides metadata for audio processing
 - **Content Monitoring**: Automated feed checking for new episodes
 ## License
 MIT License - part of the Personal AI Assistant ecosystem.
--- a/WRFG_filtered_July21.xml
+++ b/WRFG_filtered_July21.xml
--- a/WRFG_public_affairs_rap.xml
+++ b/WRFG_public_affairs_rap.xml
--- a/WRFG_rap_only.xml
+++ b/WRFG_rap_only.xml
--- a/WRFG_revolutionary_african_perspectives.xml
+++ b/WRFG_revolutionary_african_perspectives.xml
--- a/cache/20ed7335948f285910dac2bfe2caf0c4.json
+++ b/cache/20ed7335948f285910dac2bfe2caf0c4.json
--- a/cache/482892e27895ecc87ab9adaba8e08a45.json
+++ b/cache/482892e27895ecc87ab9adaba8e08a45.json
--- a/cache/5dab6670d570f975320645984d4175bd.json
+++ b/cache/5dab6670d570f975320645984d4175bd.json
--- a/cache/904c2250515cfb19ab549ec3448bf6c7.json
+++ b/cache/904c2250515cfb19ab549ec3448bf6c7.json
--- a/cache/92537c5644dc21af53706aff276cc284.json
+++ b/cache/92537c5644dc21af53706aff276cc284.json
@ -0,0 +1 @@
 {"key": "/WRFG/", "url": "https://www.mixcloud.com/WRFG/", "name": "WRFG Atlanta", "username": "WRFG", "pictures": {"small": "https://thumbnailer.mixcloud.com/unsafe/25x25/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "thumbnail": "https://thumbnailer.mixcloud.com/unsafe/50x50/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "medium_mobile": "https://thumbnailer.mixcloud.com/unsafe/80x80/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "medium": "https://thumbnailer.mixcloud.com/unsafe/100x100/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "large": "https://thumbnailer.mixcloud.com/unsafe/300x300/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "320wx320h": "https://thumbnailer.mixcloud.com/unsafe/320x320/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "extra_large": "https://thumbnailer.mixcloud.com/unsafe/600x600/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "640wx640h": "https://thumbnailer.mixcloud.com/unsafe/640x640/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d"}, "biog": "Founded in 1973 in Atlanta, GA, Radio Free Georgia is a non-profit, non-commercial, independent, community radio station. It broadcasts on 89.3 FM and is licensed at 100,000 watts. \n\nWRFG is committed to bringing progressive news and handpicked independent music to the metro Atlanta area via FM and the world via our internet stream. \n\nLearn more: https://wrfg.org/\nInstagram: https://www.instagram.com/wrfgatlanta/\nFacebook: https://www.facebook.com/wrfgatl89.3fm", "created_time": "2019-05-11T19:08:22Z", "updated_time": "2019-05-11T19:08:22Z", "follower_count": 673, "following_count": 27, "cloudcast_count": 17838, "favorite_count": 7, "listen_count": 0, "is_pro": true, "is_premium": false, "city": "Atlanta", "country": "United States", "cover_pictures": {"835wx120h": "https://thumbnailer.mixcloud.com/unsafe/835x120/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789", "1113wx160h": "https://thumbnailer.mixcloud.com/unsafe/1113x160/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789", "1670wx240h": "https://thumbnailer.mixcloud.com/unsafe/1670x240/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789"}, "picture_primary_color": "000000"}
--- a/cache/aa8c87fb4a0a056c866e47a6cd35a603.json
+++ b/cache/aa8c87fb4a0a056c866e47a6cd35a603.json
--- a/cache/c2f70b7f8e0ee4340a03f7c5d70f7cbf.json
+++ b/cache/c2f70b7f8e0ee4340a03f7c5d70f7cbf.json
--- a/cache/e81ada4ae2617b36febc8fdfb5f3504b.json
+++ b/cache/e81ada4ae2617b36febc8fdfb5f3504b.json
@ -0,0 +1 @@
 {"key": "/WRFG/", "url": "https://www.mixcloud.com/WRFG/", "name": "WRFG Atlanta", "username": "WRFG", "pictures": {"small": "https://thumbnailer.mixcloud.com/unsafe/25x25/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "thumbnail": "https://thumbnailer.mixcloud.com/unsafe/50x50/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "medium_mobile": "https://thumbnailer.mixcloud.com/unsafe/80x80/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "medium": "https://thumbnailer.mixcloud.com/unsafe/100x100/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "large": "https://thumbnailer.mixcloud.com/unsafe/300x300/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "320wx320h": "https://thumbnailer.mixcloud.com/unsafe/320x320/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "extra_large": "https://thumbnailer.mixcloud.com/unsafe/600x600/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "640wx640h": "https://thumbnailer.mixcloud.com/unsafe/640x640/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d"}, "biog": "Founded in 1973 in Atlanta, GA, Radio Free Georgia is a non-profit, non-commercial, independent, community radio station. It broadcasts on 89.3 FM and is licensed at 100,000 watts. \n\nWRFG is committed to bringing progressive news and handpicked independent music to the metro Atlanta area via FM and the world via our internet stream. \n\nLearn more: https://wrfg.org/\nInstagram: https://www.instagram.com/wrfgatlanta/\nFacebook: https://www.facebook.com/wrfgatl89.3fm", "created_time": "2019-05-11T19:08:22Z", "updated_time": "2019-05-11T19:08:22Z", "follower_count": 673, "following_count": 27, "cloudcast_count": 17838, "favorite_count": 7, "listen_count": 0, "is_pro": true, "is_premium": false, "city": "Atlanta", "country": "United States", "cover_pictures": {"835wx120h": "https://thumbnailer.mixcloud.com/unsafe/835x120/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789", "1113wx160h": "https://thumbnailer.mixcloud.com/unsafe/1113x160/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789", "1670wx240h": "https://thumbnailer.mixcloud.com/unsafe/1670x240/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789"}, "picture_primary_color": "000000"}
--- a/docker-compose.yml
+++ b/docker-compose.yml
@ -0,0 +1,19 @@
 version: '3.8'
 services:
  mixcloud-rss:
    build: .
    container_name: mixcloud-rss-generator
    ports:
      - "5000:5000"
    volumes:
      - ./cache:/app/cache
    environment:
      - SECRET_KEY=${SECRET_KEY:-your-secret-key-here}
    restart: unless-stopped
    healthcheck:
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:5000/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
--- a/generate_july21_feed.py
+++ b/generate_july21_feed.py
@ -0,0 +1,63 @@
 #!/usr/bin/env python3
 """
 Generate RSS feed for specific dates (e.g., July 21 show)
 """
 from src.mixcloud_rss import MixcloudRSSGenerator
 def generate_filtered_feed(username, specific_dates):
    """Generate RSS feed filtered by specific dates."""
    # Create generator
    generator = MixcloudRSSGenerator()
    # Set up filters
    filters = {
        'specific_dates': specific_dates
    }
    # Generate feed
    print(f"Generating RSS feed for {username} filtered by dates: {specific_dates}")
    rss_feed = generator.generate_rss_from_username(username, limit=50, filters=filters)
    if rss_feed:
        # Save to file
        filename = f"{username}_filtered_{specific_dates.replace(',', '_').replace(' ', '')}.xml"
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(rss_feed)
        print(f"✅ RSS feed saved to: {filename}")
        # Also print the RSS URL for the web server (dates URL-encoded, since they may contain spaces)
        from urllib.parse import quote
        print("\n📡 RSS URL for web server:")
        print(f"http://localhost:5000/rss/{username}?limit=50&specific_dates={quote(specific_dates)}")
        # Count episodes
        import xml.etree.ElementTree as ET
        root = ET.fromstring(rss_feed)
        items = root.findall('.//item')
        print(f"\n📊 Found {len(items)} episodes matching the filter")
        # Show episode details
        if items:
            print("\n📅 Matching episodes:")
            for item in items:
                title = item.find('title').text
                pub_date = item.find('pubDate').text
                print(f"  - {title} ({pub_date})")
    else:
        print("❌ Error: Could not generate RSS feed")
 if __name__ == "__main__":
    import sys
    if len(sys.argv) < 2:
        print("Usage: python generate_july21_feed.py <username> [dates]")
        print("Example: python generate_july21_feed.py djusername 'July 21'")
        print("Example: python generate_july21_feed.py djusername 'July 21, August 15'")
        sys.exit(1)
    username = sys.argv[1]
    dates = sys.argv[2] if len(sys.argv) > 2 else "July 21"
    generate_filtered_feed(username, dates)
--- a/generate_rap_feed.py
+++ b/generate_rap_feed.py
@ -0,0 +1,74 @@
 #!/usr/bin/env python3
 """
 Generate RSS feed for Public Affairs RAP (Revolutionary African Perspectives) shows
 """
 from src.mixcloud_rss import MixcloudRSSGenerator
 import xml.etree.ElementTree as ET
 from datetime import datetime
 def generate_rap_feed(username="WRFG"):
    """Generate RSS feed filtered for RAP shows."""
    # Create generator
    generator = MixcloudRSSGenerator()
    # Set up filters for "Public Affairs" in the title
    # This should catch variations like "afrikan" vs "african"
    filters = {
        'keywords': 'public affairs'
    }
    # Generate feed with a higher limit to catch all shows
    print(f"Generating RSS feed for {username} filtered by 'Public Affairs' shows...")
    rss_feed = generator.generate_rss_from_username(username, limit=100, filters=filters)
    if rss_feed:
        # Save to file
        filename = f"{username}_public_affairs_rap.xml"
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(rss_feed)
        print(f"✅ RSS feed saved to: {filename}")
        # Also print the RSS URL for the web server
        print(f"\n📡 RSS URL for web server:")
        print(f"http://localhost:5000/rss/{username}?limit=100&keywords=public%20affairs")
        # Parse and show episodes
        root = ET.fromstring(rss_feed)
        items = root.findall('.//item')
        print(f"\n📊 Found {len(items)} 'Public Affairs' episodes")
        # Show episode details
        if items:
            print("\n📅 Public Affairs RAP episodes:")
            for item in items:
                title = item.find('title').text
                pub_date_str = item.find('pubDate').text
                link = item.find('link').text
                # Parse date for better display
                try:
                    pub_date = datetime.strptime(pub_date_str, "%a, %d %b %Y %H:%M:%S %z")
                    date_display = pub_date.strftime("%B %d, %Y")
                except (ValueError, TypeError):
                    date_display = pub_date_str
                print(f"\n  📻 {title}")
                print(f"     Date: {date_display}")
                print(f"     URL: {link}")
                # Check if it's the July 21 show
                if "21 july" in title.lower() or "july 21" in title.lower():
                    print(f"     ⭐ This is the July 21 show!")
        # Generate specific URL for your podcast system
        print(f"\n🎯 For your podcast processing system, use this RSS URL:")
        print(f"http://localhost:5000/rss/{username}?limit=100&keywords=public%20affairs")
    else:
        print("❌ Error: Could not generate RSS feed")
 if __name__ == "__main__":
    generate_rap_feed()
--- a/generate_rap_only_feed.py
+++ b/generate_rap_only_feed.py
@ -0,0 +1,85 @@
 #!/usr/bin/env python3
 """
 Generate RSS feed for ONLY the RAP (Revolutionary African/Afrikan Perspectives) shows
 """
 from src.mixcloud_rss import MixcloudRSSGenerator
 import xml.etree.ElementTree as ET
 from datetime import datetime
 def generate_rap_only_feed(username="WRFG"):
    """Generate RSS feed filtered for ONLY RAP shows."""
    # Create generator
    generator = MixcloudRSSGenerator()
    # Set up filters for "RAP" in the title
    # This will catch both "African" and "Afrikan" variations
    filters = {
        'keywords': 'RAP'  # This will match "RAP - Revolutionary African/Afrikan Perspectives"
    }
    # Generate feed with a higher limit to catch all shows
    print(f"Generating RSS feed for {username} filtered by RAP shows only...")
    rss_feed = generator.generate_rss_from_username(username, limit=200, filters=filters)
    if rss_feed:
        # Save to file
        filename = f"{username}_rap_only.xml"
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(rss_feed)
        print(f"✅ RSS feed saved to: {filename}")
        # Also print the RSS URL for the web server
        print(f"\n📡 RSS URL for web server:")
        print(f"http://localhost:5000/rss/{username}?limit=200&keywords=RAP")
        # Parse and show episodes
        root = ET.fromstring(rss_feed)
        items = root.findall('.//item')
        print(f"\n📊 Found {len(items)} RAP episodes")
        # Show episode details
        if items:
            print("\n📅 Revolutionary African/Afrikan Perspectives episodes:")
            for item in items:
                title = item.find('title').text
                pub_date_str = item.find('pubDate').text
                link = item.find('link').text
                description = item.find('description').text if item.find('description') is not None else ""
                # Parse date for better display
                try:
                    pub_date = datetime.strptime(pub_date_str, "%a, %d %b %Y %H:%M:%S %z")
                    date_display = pub_date.strftime("%B %d, %Y")
                except (ValueError, TypeError):
                    date_display = pub_date_str
                print(f"\n  📻 {title}")
                print(f"     Date: {date_display}")
                print(f"     URL: {link}")
                # Check if it's the July 21 show
                if "21 july" in title.lower() or "july 21" in title.lower():
                    print(f"     ⭐ This is the July 21 show!")
                # Check for African vs Afrikan spelling
                if "afrikan" in title.lower():
                    print(f"     📝 Note: Uses 'Afrikan' spelling")
                elif "african" in title.lower():
                    print(f"     📝 Note: Uses 'African' spelling")
        # Generate specific URL for your podcast system
        print(f"\n🎯 For your podcast processing system, use this RSS URL:")
        print(f"http://localhost:5000/rss/{username}?limit=200&keywords=RAP")
        # Also create a direct link to the July 21 episode
        print(f"\n🔗 Direct link to July 21 RAP show:")
        print(f"https://www.mixcloud.com/WRFG/public-affairs-rap-revolutionary-african-perspectives-21-july-2025/")
    else:
        print("❌ Error: Could not generate RSS feed")
 if __name__ == "__main__":
    generate_rap_only_feed()
--- a/generate_rap_precise_feed.py
+++ b/generate_rap_precise_feed.py
@ -0,0 +1,90 @@
 #!/usr/bin/env python3
 """
 Generate RSS feed for ONLY the Revolutionary African/Afrikan Perspectives shows
 Filters on the single keyword 'revolutionary' to be more precise
 """
 from src.mixcloud_rss import MixcloudRSSGenerator
 import xml.etree.ElementTree as ET
 from datetime import datetime
 def generate_rap_precise_feed(username="WRFG"):
    """Generate RSS feed filtered for ONLY Revolutionary African/Afrikan Perspectives shows."""
    # Create generator
    generator = MixcloudRSSGenerator()
    # Set up filters - use "revolutionary" as it's unique to these shows
    filters = {
        'keywords': 'revolutionary'  # This should only match the RAP shows
    }
    # Generate feed
    print(f"Generating RSS feed for {username} filtered by Revolutionary African/Afrikan Perspectives shows...")
    rss_feed = generator.generate_rss_from_username(username, limit=200, filters=filters)
    if rss_feed:
        # Save to file
        filename = f"{username}_revolutionary_african_perspectives.xml"
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(rss_feed)
        print(f"✅ RSS feed saved to: {filename}")
        # RSS URLs for different filtering options
        print(f"\n📡 RSS URLs for web server:")
        print(f"Option 1 (by 'revolutionary'): http://localhost:5000/rss/{username}?limit=200&keywords=revolutionary")
        print(f"Option 2 (by 'public affairs' + 'revolutionary'): http://localhost:5000/rss/{username}?limit=200&keywords=public%20affairs,revolutionary")
        # Parse and show episodes
        root = ET.fromstring(rss_feed)
        items = root.findall('.//item')
        print(f"\n📊 Found {len(items)} Revolutionary African/Afrikan Perspectives episodes")
        # Show episode details
        if items:
            print("\n📅 All Revolutionary African/Afrikan Perspectives episodes:")
            july_21_found = False
            for item in items:
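                # findtext() returns the default instead of raising when an element is absent
                title = item.findtext('title', default='')
                pub_date_str = item.findtext('pubDate', default='')
                link = item.findtext('link', default='')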
                # Parse date for better display
                try:
                    pub_date = datetime.strptime(pub_date_str, "%a, %d %b %Y %H:%M:%S %z")
                    date_display = pub_date.strftime("%B %d, %Y")
                    show_date = pub_date.strftime("%Y-%m-%d")
                except (TypeError, ValueError):
                    date_display = pub_date_str
                    show_date = ""
                print(f"\n  📻 {title}")
                print(f"     Date: {date_display}")
                print(f"     URL: {link}")
                # Check if it's the July 21 show
                if "21 july" in title.lower() or "july 21" in title.lower() or "2025-07-21" in show_date:
                    print(f"     ⭐ This is the July 21, 2025 show!")
                    july_21_found = True
                    july_21_url = link
            if july_21_found:
                print(f"\n✨ JULY 21 SHOW FOUND!")
                print(f"Direct URL: {july_21_url}")
                print(f"\nTo analyze this specific show in your podcast system:")
                print(f"1. Use the RSS feed URL above")
                print(f"2. Or process this specific episode URL directly")
        # Summary
        print(f"\n📈 Summary:")
        print(f"- Total RAP episodes found: {len(items)}")
        print(f"- These are weekly shows featuring Revolutionary African/Afrikan Perspectives")
        print(f"- The feed includes both 'African' and 'Afrikan' spelling variations")
    else:
        print("❌ Error: Could not generate RSS feed")
 if __name__ == "__main__":
    generate_rap_precise_feed()
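The episode loop above parses `pubDate` with a fixed `strptime` format. A minimal alternative sketch, assuming only the standard library: `email.utils.parsedate_to_datetime` accepts the RFC 822 dates used in RSS, and `findtext` tolerates missing elements. `SAMPLE_FEED` and `summarize_items` are hypothetical names, and the sample data is illustrative rather than taken from a real feed.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime

# Illustrative feed: one complete item and one item without a pubDate.
SAMPLE_FEED = """<rss version="2.0"><channel>
<item><title>Public Affairs - RAP - 21 July 2025</title>
<pubDate>Mon, 21 Jul 2025 18:00:00 +0000</pubDate>
<link>https://www.mixcloud.com/WRFG/example-show/</link></item>
<item><title>Untitled</title></item>
</channel></rss>"""


def summarize_items(rss_xml: str) -> list:
    """Return a (title, 'YYYY-MM-DD' or '') tuple for every item in the feed."""
    rows = []
    for item in ET.fromstring(rss_xml).findall(".//item"):
        title = item.findtext("title", default="")
        raw_date = item.findtext("pubDate")
        show_date = ""
        if raw_date:
            try:
                show_date = parsedate_to_datetime(raw_date).strftime("%Y-%m-%d")
            except (TypeError, ValueError):
                show_date = ""  # unparseable date: leave it blank
        rows.append((title, show_date))
    return rows


if __name__ == "__main__":
    for title, show_date in summarize_items(SAMPLE_FEED):
        print(f"{show_date or 'unknown date'}: {title}")
```

Comparing the ISO date string (for example against `"2025-07-21"`) avoids matching on title wording such as "21 july" versus "july 21".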
--- a/mixcloud-rss.log
+++ b/mixcloud-rss.log
@ -0,0 +1,68 @@
 * Serving Flask app 'web_app'
 * Debug mode: on
 INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
 * Running on all addresses (0.0.0.0)
 * Running on http://127.0.0.1:5000
 * Running on http://192.168.68.59:5000
 INFO:werkzeug:Press CTRL+C to quit
 INFO:werkzeug: * Restarting with stat
 WARNING:werkzeug: * Debugger is active!
 INFO:werkzeug: * Debugger PIN: 785-868-005
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 22:53:22] "GET /health HTTP/1.1" 200 -
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 22:55:20] "GET / HTTP/1.1" 200 -
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 22:55:21] "GET /favicon.ico HTTP/1.1" 404 -
 INFO:werkzeug: * Detected change in '/var/home/enias/Claude/MyProject/personal-ai-assistant/mixcloud-rss-generator/src/mixcloud_rss.py', reloading
 INFO:werkzeug: * Restarting with stat
 WARNING:werkzeug: * Debugger is active!
 INFO:werkzeug: * Debugger PIN: 785-868-005
 INFO:werkzeug: * Detected change in '/var/home/enias/Claude/MyProject/personal-ai-assistant/mixcloud-rss-generator/src/mixcloud_rss.py', reloading
 INFO:werkzeug: * Restarting with stat
 WARNING:werkzeug: * Debugger is active!
 INFO:werkzeug: * Debugger PIN: 785-868-005
 INFO:werkzeug: * Detected change in '/var/home/enias/Claude/MyProject/personal-ai-assistant/mixcloud-rss-generator/src/web_app.py', reloading
 INFO:werkzeug: * Restarting with stat
 WARNING:werkzeug: * Debugger is active!
 INFO:werkzeug: * Debugger PIN: 785-868-005
 INFO:werkzeug: * Detected change in '/var/home/enias/Claude/MyProject/personal-ai-assistant/mixcloud-rss-generator/src/web_app.py', reloading
 INFO:werkzeug: * Restarting with stat
 WARNING:werkzeug: * Debugger is active!
 INFO:werkzeug: * Debugger PIN: 785-868-005
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:00:37] "GET / HTTP/1.1" 200 -
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:00:51] "POST /generate HTTP/1.1" 200 -
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:01:03] "POST /api/validate HTTP/1.1" 200 -
 ERROR:__main__:Error generating RSS: can't compare offset-naive and offset-aware datetimes
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:01:21] "POST /generate HTTP/1.1" 500 -
 ERROR:__main__:Error generating RSS: can't compare offset-naive and offset-aware datetimes
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:01:30] "POST /generate HTTP/1.1" 500 -
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:01:40] "GET / HTTP/1.1" 200 -
 ERROR:__main__:Error generating RSS: can't compare offset-naive and offset-aware datetimes
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:01:43] "POST /generate HTTP/1.1" 500 -
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:01:53] "POST /generate HTTP/1.1" 200 -
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:07:42] "GET /rss/WRFG?limit=200&keywords=revolutionary HTTP/1.1" 200 -
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:15:57] "GET /rss/WRFG?limit=200&keywords=revolutionary HTTP/1.1" 200 -
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:19:38] "GET /rss/WRFG?limit=200&keywords=revolutionary HTTP/1.1" 200 -
 INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:29:23] "GET /rss/WRFG?limit=200&keywords=revolutionary HTTP/1.1" 200 -
 INFO:werkzeug: * Detected change in '/var/home/enias/.local/lib/python3.10/site-packages/nvidia/__init__.py', reloading
 INFO:werkzeug: * Restarting with stat
 WARNING:werkzeug: * Debugger is active!
 INFO:werkzeug: * Debugger PIN: 785-868-005
 INFO:werkzeug: * Detected change in '/var/home/enias/.local/lib/python3.10/site-packages/nvidia/__init__.py', reloading
 INFO:werkzeug: * Restarting with stat
 WARNING:werkzeug: * Debugger is active!
 INFO:werkzeug: * Debugger PIN: 785-868-005
 INFO:werkzeug: * Detected change in '/var/home/enias/.local/lib/python3.10/site-packages/nvidia/__init__.py', reloading
 INFO:werkzeug: * Restarting with stat
 WARNING:werkzeug: * Debugger is active!
 INFO:werkzeug: * Debugger PIN: 785-868-005
 INFO:werkzeug: * Detected change in '/var/home/enias/.local/lib/python3.10/site-packages/nvidia/__init__.py', reloading
 INFO:werkzeug: * Restarting with stat
 WARNING:werkzeug: * Debugger is active!
 INFO:werkzeug: * Debugger PIN: 785-868-005
 INFO:werkzeug: * Detected change in '/var/home/enias/.local/lib/python3.10/site-packages/nvidia/__init__.py', reloading
 INFO:werkzeug: * Restarting with stat
 WARNING:werkzeug: * Debugger is active!
 INFO:werkzeug: * Debugger PIN: 785-868-005
 INFO:werkzeug: * Detected change in '/var/home/enias/.local/lib/python3.10/site-packages/nvidia/__init__.py', reloading
 INFO:werkzeug: * Restarting with stat
 WARNING:werkzeug: * Debugger is active!
 INFO:werkzeug: * Debugger PIN: 785-868-005
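The `can't compare offset-naive and offset-aware datetimes` errors logged above are what Python raises when a timezone-aware timestamp (such as Mixcloud's `...Z` values) is ordered against a naive one (such as a bare `YYYY-MM-DD` filter bound). A minimal sketch of the failure and one way to normalize both sides; `to_utc` is a hypothetical helper, and treating naive input as UTC is an assumption, not something the log confirms.

```python
from datetime import datetime, timezone


def to_utc(value: str) -> datetime:
    """Parse an ISO 8601 string into a timezone-aware UTC datetime.

    Hypothetical helper: naive inputs are assumed to already be in UTC.
    """
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed.astimezone(timezone.utc)


naive_bound = datetime.fromisoformat("2025-07-01")                # no tzinfo
show_time = datetime.fromisoformat("2025-07-21T14:00:00+00:00")   # aware

try:
    show_time >= naive_bound
    comparison_failed = False
except TypeError:
    comparison_failed = True  # ordering aware against naive is rejected

# With both sides normalized, the comparison is well defined.
is_in_range = to_utc("2025-07-21T14:00:00Z") >= to_utc("2025-07-01")
```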
--- a/requirements.txt
+++ b/requirements.txt
@ -0,0 +1,3 @@
 requests>=2.31.0
 beautifulsoup4>=4.12.0
 lxml>=4.9.0
--- a/src/pycache/mixcloud_rss.cpython-310.pyc
+++ b/src/pycache/mixcloud_rss.cpython-310.pyc
--- a/src/cli.py
+++ b/src/cli.py
@ -0,0 +1,189 @@
 #!/usr/bin/env python3
 """
 Backend-only CLI for Mixcloud RSS Generation
 Uses shared content syndication services for RSS generation.
 Replaces web_app.py and legacy mixcloud_rss.py dependencies.
 """
 import argparse
 import json
 import os
 import sys
 from typing import Dict, Optional
 # Add parent directories to path for shared imports
 sys.path.insert(0, os.path.join(os.path.dirname(__file__), '../../'))
 from shared.services.content_syndication import (
    ContentSyndicationService,
    FeedFilterService
 )
 def main():
    """Command-line interface for backend Mixcloud RSS generation."""
    parser = argparse.ArgumentParser(
        description="Generate RSS feeds from Mixcloud users (Backend CLI)",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
 Examples:
  %(prog)s WRFG                                    # Basic RSS for WRFG user
  %(prog)s --url https://mixcloud.com/NTSRadio/   # From full URL
  %(prog)s WRFG --limit 50 --output feed.xml      # Save 50 episodes to file
  %(prog)s WRFG --keywords "rap,public affairs"   # Filter by keywords
  %(prog)s WRFG --rap-only                        # RAP shows only
  %(prog)s WRFG --date-range 2024-01-01 2024-12-31 # Date filtering
        """
    )
    # Input options
    input_group = parser.add_mutually_exclusive_group(required=True)
    input_group.add_argument("username", nargs='?', help="Mixcloud username")
    input_group.add_argument("--url", help="Mixcloud URL")
    # Output options
    parser.add_argument("-o", "--output", help="Output file path (default: stdout)")
    parser.add_argument("-l", "--limit", type=int, default=20, 
                       help="Number of episodes to include (default: 20)")
    # Caching options
    parser.add_argument("--cache-dir", default="./cache", 
                       help="Cache directory path (default: ./cache)")
    parser.add_argument("--cache-ttl", type=int, default=3600,
                       help="Cache TTL in seconds (default: 3600)")
    # Filtering options
    filter_group = parser.add_argument_group("Filtering Options")
    filter_group.add_argument("--keywords", 
                             help="Filter by keywords in title (comma-separated)")
    filter_group.add_argument("--tags", 
                             help="Filter by tags (comma-separated)")
    filter_group.add_argument("--date-range", nargs=2, metavar=('START', 'END'),
                             help="Filter by date range (YYYY-MM-DD format)")
    filter_group.add_argument("--specific-dates", 
                             help="Filter by specific dates (comma-separated)")
    # Convenience options
    convenience_group = parser.add_argument_group("Convenience Options")
    convenience_group.add_argument("--rap-only", action='store_true',
                                  help="Filter for Revolutionary African Perspectives shows only")
    # Utility options
    parser.add_argument("--validate", action='store_true',
                       help="Validate user without generating feed")
    parser.add_argument("--user-info", action='store_true',
                       help="Show user information only")
    parser.add_argument("--verbose", "-v", action='store_true',
                       help="Verbose output")
    args = parser.parse_args()
    # Initialize content syndication service once; it is needed both to parse
    # a --url argument and for every operation below
    syndication_service = ContentSyndicationService(args.cache_dir, args.cache_ttl)
    # Determine username
    username = args.username
    if args.url:
        try:
            # Extract username from URL using service
            username = syndication_service.mixcloud_client.extract_username_from_url(args.url)
        except Exception as e:
            print(f"Error: {e}", file=sys.stderr)
            return 1
        if not username:
            print(f"Error: Could not extract username from URL: {args.url}", file=sys.stderr)
            return 1
    if not username:
        print("Error: No username provided", file=sys.stderr)
        return 1
    # Handle validation only
    if args.validate:
        result = syndication_service.validate_mixcloud_user(username)
        if result['valid']:
            print(f"✅ Valid user: {result['username']} ({result['name']}) - {result['show_count']} shows")
            return 0
        else:
            print(f"❌ Invalid user: {result['message']}")
            return 1
    # Handle user info only
    if args.user_info:
        user_data = syndication_service.get_mixcloud_user_info(username)
        if user_data:
            print(f"User: {user_data.get('name', username)}")
            print(f"Username: {username}")
            print(f"Bio: {user_data.get('biog', 'N/A')}")
            print(f"Shows: {user_data.get('cloudcast_count', 0)}")
            print(f"Profile: https://www.mixcloud.com/{username}/")
            return 0
        else:
            print(f"Error: User '{username}' not found", file=sys.stderr)
            return 1
    # Build filters
    filters = {}
    if args.rap_only:
        filters = FeedFilterService.create_rap_filter()
        if args.verbose:
            print("Applied RAP filter", file=sys.stderr)
    if args.keywords:
        filters['keywords'] = args.keywords
        if args.verbose:
            print(f"Filter: keywords = {args.keywords}", file=sys.stderr)
    if args.tags:
        filters['tags'] = args.tags  
        if args.verbose:
            print(f"Filter: tags = {args.tags}", file=sys.stderr)
    if args.date_range:
        filters['start_date'] = args.date_range[0]
        filters['end_date'] = args.date_range[1]
        if args.verbose:
            print(f"Filter: date range = {args.date_range[0]} to {args.date_range[1]}", file=sys.stderr)
    if args.specific_dates:
        filters['specific_dates'] = args.specific_dates
        if args.verbose:
            print(f"Filter: specific dates = {args.specific_dates}", file=sys.stderr)
    # Generate RSS feed
    try:
        if args.verbose:
            print(f"Generating RSS for user: {username}", file=sys.stderr)
            print(f"Limit: {args.limit} episodes", file=sys.stderr)
            if filters:
                print(f"Filters applied: {list(filters.keys())}", file=sys.stderr)
        rss_feed = syndication_service.generate_mixcloud_rss(username, args.limit, filters)
        if rss_feed:
            if args.output:
                with open(args.output, "w", encoding="utf-8") as f:
                    f.write(rss_feed)
                print(f"RSS feed saved to: {args.output}", file=sys.stderr)
            else:
                print(rss_feed)
            return 0
        else:
            print(f"Error: Could not generate RSS feed for user '{username}'", file=sys.stderr)
            return 1
    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        if args.verbose:
            import traceback
            traceback.print_exc()
        return 1
 if __name__ == "__main__":
    sys.exit(main())
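The filtering flags in `src/cli.py` end up as a plain dictionary handed to the generator. A reduced sketch of that mapping, which does not depend on the `shared.services` package; `build_filters` is a hypothetical name, and only three of the flags are reproduced.

```python
import argparse


def build_filters(argv: list) -> dict:
    """Parse a subset of the CLI's filter flags into a filters dictionary."""
    parser = argparse.ArgumentParser()
    parser.add_argument("--keywords", help="comma-separated title keywords")
    parser.add_argument("--tags", help="comma-separated tags")
    parser.add_argument("--date-range", nargs=2, metavar=("START", "END"))
    args = parser.parse_args(argv)

    filters = {}
    if args.keywords:
        filters["keywords"] = args.keywords
    if args.tags:
        filters["tags"] = args.tags
    if args.date_range:
        # nargs=2 yields a two-element list: [START, END]
        filters["start_date"], filters["end_date"] = args.date_range
    return filters


if __name__ == "__main__":
    print(build_filters(["--keywords", "rap,public affairs",
                         "--date-range", "2024-01-01", "2024-12-31"]))
```

Flags that are omitted leave no key behind, so an empty dictionary means "no filtering".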
--- a/src/mixcloud_rss.py
+++ b/src/mixcloud_rss.py
@ -0,0 +1,375 @@
 #!/usr/bin/env python3
 """
 Mixcloud to RSS Feed Generator
 Converts Mixcloud user pages or show pages into RSS feeds that can be consumed
 by podcast apps or feed readers.
 """
 import json
 import re
 import xml.etree.ElementTree as ET
 from datetime import datetime
 from typing import Dict, List, Optional, Union
 from urllib.parse import quote, urlencode, urlparse
 import hashlib
 import os
 import requests
 from bs4 import BeautifulSoup
 class MixcloudRSSGenerator:
    """Generate RSS feeds from Mixcloud pages."""
    def __init__(self, cache_dir: str = "./cache", cache_ttl: int = 3600):
        """
        Initialize the Mixcloud RSS Generator.
        Args:
            cache_dir: Directory for caching API responses
            cache_ttl: Cache time-to-live in seconds (default: 1 hour)
        """
        self.cache_dir = cache_dir
        self.cache_ttl = cache_ttl
        self.api_base = "https://api.mixcloud.com"
        self.base_url = "https://www.mixcloud.com"
        # Create cache directory if it doesn't exist
        os.makedirs(cache_dir, exist_ok=True)
    def _get_cache_path(self, url: str) -> str:
        """Generate cache file path for a URL."""
        url_hash = hashlib.md5(url.encode()).hexdigest()
        return os.path.join(self.cache_dir, f"{url_hash}.json")
    def _get_cached_data(self, url: str) -> Optional[Dict]:
        """Get cached data if available and not expired."""
        cache_path = self._get_cache_path(url)
        if os.path.exists(cache_path):
            # Check if cache is still valid
            cache_age = datetime.now().timestamp() - os.path.getmtime(cache_path)
            if cache_age < self.cache_ttl:
                with open(cache_path, 'r') as f:
                    return json.load(f)
        return None
    def _save_to_cache(self, url: str, data: Dict) -> None:
        """Save data to cache."""
        cache_path = self._get_cache_path(url)
        with open(cache_path, 'w') as f:
            json.dump(data, f)
    def _fetch_mixcloud_data(self, api_url: str) -> Optional[Dict]:
        """Fetch data from Mixcloud API with caching."""
        # Check cache first
        cached_data = self._get_cached_data(api_url)
        if cached_data is not None:
            return cached_data
        try:
            response = requests.get(api_url, timeout=10)
            response.raise_for_status()
            data = response.json()
            # Save to cache
            self._save_to_cache(api_url, data)
            return data
        except Exception as e:
            print(f"Error fetching Mixcloud data: {e}")
            return None
    def _extract_username_from_url(self, url: str) -> Optional[str]:
        """Extract username from Mixcloud URL."""
        # Handle various Mixcloud URL formats
        patterns = [
            r'mixcloud\.com/([^/]+)/?$',
            r'mixcloud\.com/([^/]+)/(?:uploads|favorites|listens)?/?$',
            r'mixcloud\.com/([^/]+)/[^/]+/?$',  # Specific show
        ]
        for pattern in patterns:
            match = re.search(pattern, url)
            if match:
                return match.group(1)
        return None
    def _format_duration(self, seconds: int) -> str:
        """Format duration in seconds to HH:MM:SS."""
        hours = seconds // 3600
        minutes = (seconds % 3600) // 60
        secs = seconds % 60
        if hours > 0:
            return f"{hours:02d}:{minutes:02d}:{secs:02d}"
        else:
            return f"{minutes:02d}:{secs:02d}"
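    def _filter_shows(self, shows: List[Dict], filters: Optional[Dict] = None) -> List[Dict]:
        """Filter shows based on criteria."""
        if not filters:
            return shows
        filtered_shows = shows
        # Mixcloud's created_time values are timezone-aware, but a bare "YYYY-MM-DD"
        # bound parses as naive, and comparing the two raises TypeError ("can't compare
        # offset-naive and offset-aware datetimes"). Treat naive values as UTC.
        from datetime import timezone  # local import: the module header imports only datetime
        def parse_utc(value: str) -> datetime:
            parsed = datetime.fromisoformat(value.replace('Z', '+00:00'))
            return parsed if parsed.tzinfo is not None else parsed.replace(tzinfo=timezone.utc)
        # Filter by date range
        if filters.get('start_date'):
            start_date = parse_utc(filters['start_date'])
            filtered_shows = [
                show for show in filtered_shows
                if parse_utc(show['created_time']) >= start_date
            ]
        if filters.get('end_date'):
            end_date = parse_utc(filters['end_date'])
            filtered_shows = [
                show for show in filtered_shows
                if parse_utc(show['created_time']) <= end_date
            ]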
        # Filter by keywords in title
        if filters.get('keywords'):
            keywords = filters['keywords'].lower().split(',')
            filtered_shows = [
                show for show in filtered_shows
                if any(keyword.strip() in show.get('name', '').lower() for keyword in keywords)
            ]
        # Filter by tags
        if filters.get('tags'):
            filter_tags = [tag.strip().lower() for tag in filters['tags'].split(',')]
            filtered_shows = [
                show for show in filtered_shows
                if any(
                    tag['name'].lower() in filter_tags 
                    for tag in show.get('tags', [])
                )
            ]
        # Filter by specific dates (e.g., "July 21")
        if filters.get('specific_dates'):
            dates = filters['specific_dates'].split(',')
            filtered_shows = [
                show for show in filtered_shows
                if self._matches_date(show['created_time'], dates)
            ]
        return filtered_shows
    def _matches_date(self, created_time: str, dates: List[str]) -> bool:
        """Check if created_time matches any of the specified dates."""
        show_date = datetime.fromisoformat(created_time.replace('Z', '+00:00'))
        for date_str in dates:
            date_str = date_str.strip().lower()
            # Handle various date formats
            # "July 21" or "Jul 21"
            if any(month in date_str for month in ['january', 'february', 'march', 'april', 'may', 'june', 
                                                    'july', 'august', 'september', 'october', 'november', 'december',
                                                    'jan', 'feb', 'mar', 'apr', 'may', 'jun', 
                                                    'jul', 'aug', 'sep', 'oct', 'nov', 'dec']):
                try:
                    # Parse month and day
                    parsed_date = datetime.strptime(f"{date_str} {show_date.year}", "%B %d %Y")
                    if show_date.date() == parsed_date.date():
                        return True
                except ValueError:
                    try:
                        parsed_date = datetime.strptime(f"{date_str} {show_date.year}", "%b %d %Y")
                        if show_date.date() == parsed_date.date():
                            return True
                    except ValueError:
                        pass
            # "2024-07-21" format
            elif '-' in date_str:
                try:
                    parsed_date = datetime.fromisoformat(date_str)
                    if show_date.date() == parsed_date.date():
                        return True
                except ValueError:
                    pass
            # "07/21/2024" or "7/21/2024" format
            elif '/' in date_str:
                for fmt in ['%m/%d/%Y', '%m/%d/%y', '%d/%m/%Y', '%d/%m/%y']:
                    try:
                        parsed_date = datetime.strptime(date_str, fmt)
                        if show_date.date() == parsed_date.date():
                            return True
                    except ValueError:
                        pass
        return False
    def _build_rss_feed(self, user_data: Dict, shows: List[Dict]) -> str:
        """Build RSS XML feed from user data and shows."""
        # Create root RSS element
        rss = ET.Element("rss", version="2.0", attrib={
            "xmlns:itunes": "http://www.itunes.com/dtds/podcast-1.0.dtd",
            "xmlns:content": "http://purl.org/rss/1.0/modules/content/"
        })
        channel = ET.SubElement(rss, "channel")
        # Channel metadata
        ET.SubElement(channel, "title").text = user_data.get("name", "Mixcloud Feed")
        ET.SubElement(channel, "link").text = f"{self.base_url}{user_data.get('key', '')}"
        ET.SubElement(channel, "description").text = user_data.get("biog", "Mixcloud podcast feed")
        ET.SubElement(channel, "language").text = "en-us"
        # Emit the real UTC offset; a naive local time must not be labelled +0000
        ET.SubElement(channel, "lastBuildDate").text = datetime.now().astimezone().strftime("%a, %d %b %Y %H:%M:%S %z")
        # iTunes podcast metadata
        ET.SubElement(channel, "itunes:author").text = user_data.get("name", "")
        ET.SubElement(channel, "itunes:summary").text = user_data.get("biog", "")
        if user_data.get("pictures", {}).get("large"):
            image = ET.SubElement(channel, "itunes:image")
            image.set("href", user_data["pictures"]["large"])
        # Add each show as an item
        for show in shows:
            item = ET.SubElement(channel, "item")
            # Basic item elements
            ET.SubElement(item, "title").text = show.get("name", "")
            ET.SubElement(item, "link").text = f"{self.base_url}{show.get('key', '')}"
            # Description with tags
            description = show.get("description", "")
            if show.get("tags"):
                tags = ", ".join([tag["name"] for tag in show["tags"]])
                description += f"\n\nTags: {tags}"
            ET.SubElement(item, "description").text = description
            # Publication date
            created_time = show.get("created_time")
            if created_time:
                pub_date = datetime.fromisoformat(created_time.replace("Z", "+00:00"))
                ET.SubElement(item, "pubDate").text = pub_date.strftime("%a, %d %b %Y %H:%M:%S +0000")
            # GUID
            ET.SubElement(item, "guid", isPermaLink="true").text = f"{self.base_url}{show.get('key', '')}"
            # Audio enclosure (if audio URL is available)
            audio_url = show.get("audio_url") or f"{self.base_url}{show.get('key', '')}"
            enclosure = ET.SubElement(item, "enclosure")
            enclosure.set("url", audio_url)
            enclosure.set("type", "audio/mpeg")
            enclosure.set("length", str(show.get("audio_length", 0)))
            # iTunes elements
            ET.SubElement(item, "itunes:author").text = user_data.get("name", "")
            ET.SubElement(item, "itunes:summary").text = description
            ET.SubElement(item, "itunes:duration").text = self._format_duration(show.get("audio_length", 0))
            if show.get("pictures", {}).get("large"):
                ET.SubElement(item, "itunes:image").set("href", show["pictures"]["large"])
        # Convert to string
        return '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(rss, encoding="unicode")
    def get_user_shows(self, username: str, limit: int = 20) -> Optional[Dict]:
        """Get a user's profile and shows as {"user": ..., "shows": [...]}."""
        # Fetch user data
        user_api_url = f"{self.api_base}/{username}/"
        user_data = self._fetch_mixcloud_data(user_api_url)
        if not user_data:
            return None
        # Fetch user's cloudcasts (shows)
        params = {"limit": limit}
        next_url = f"{self.api_base}/{username}/cloudcasts/?{urlencode(params)}"
        all_shows = []
        while next_url and len(all_shows) < limit:
            shows_data = self._fetch_mixcloud_data(next_url)
            if not shows_data or "data" not in shows_data:
                break
            all_shows.extend(shows_data["data"])
            # The "next" URL already carries its own query string, so follow it as-is
            next_url = shows_data.get("paging", {}).get("next")
        return {"user": user_data, "shows": all_shows[:limit]}
    def generate_rss_from_url(self, mixcloud_url: str, limit: int = 20, filters: Optional[Dict] = None) -> Optional[str]:
        """Generate RSS feed from a Mixcloud URL with optional filters."""
        username = self._extract_username_from_url(mixcloud_url)
        if not username:
            raise ValueError(f"Could not extract username from URL: {mixcloud_url}")
        data = self.get_user_shows(username, limit * 2)  # Get more shows to filter from
        if not data:
            return None
        # Apply filters
        filtered_shows = self._filter_shows(data["shows"], filters)[:limit]
        return self._build_rss_feed(data["user"], filtered_shows)
    def generate_rss_from_username(self, username: str, limit: int = 20, filters: Optional[Dict] = None) -> Optional[str]:
        """Generate RSS feed from a Mixcloud username with optional filters."""
        data = self.get_user_shows(username, limit * 2)  # Get more shows to filter from
        if not data:
            return None
        # Apply filters
        filtered_shows = self._filter_shows(data["shows"], filters)[:limit]
        return self._build_rss_feed(data["user"], filtered_shows)
 def main():
    """Command-line interface for the Mixcloud RSS generator."""
    import argparse
    parser = argparse.ArgumentParser(description="Generate RSS feeds from Mixcloud pages")
    parser.add_argument("input", help="Mixcloud URL or username")
    parser.add_argument("-l", "--limit", type=int, default=20, help="Number of episodes to include (default: 20)")
    parser.add_argument("-o", "--output", help="Output file path (default: stdout)")
    parser.add_argument("-c", "--cache-dir", default="./cache", help="Cache directory path")
    parser.add_argument("-t", "--cache-ttl", type=int, default=3600, help="Cache TTL in seconds (default: 3600)")
    args = parser.parse_args()
    # Create generator
    generator = MixcloudRSSGenerator(cache_dir=args.cache_dir, cache_ttl=args.cache_ttl)
    # Determine if input is URL or username
    if "mixcloud.com" in args.input:
        rss_feed = generator.generate_rss_from_url(args.input, args.limit)
    else:
        rss_feed = generator.generate_rss_from_username(args.input, args.limit)
    if rss_feed:
        if args.output:
            with open(args.output, "w", encoding="utf-8") as f:
                f.write(rss_feed)
            print(f"RSS feed saved to: {args.output}")
        else:
            print(rss_feed)
    else:
        print("Error: Could not generate RSS feed")
        return 1
    return 0
 if __name__ == "__main__":
    raise SystemExit(main())
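`get_user_shows` pages through the cloudcasts endpoint by following `paging.next`. A minimal sketch of that loop in isolation, with the network call replaced by a stub so it runs offline; `collect_shows` and `FAKE_PAGES` are hypothetical names, and the page contents are illustrative data rather than real API output.

```python
from typing import Callable, Dict, List, Optional


def collect_shows(first_url: str, limit: int,
                  fetch: Callable[[str], Optional[Dict]]) -> List[Dict]:
    """Follow paging.next links until `limit` shows are collected or pages run out."""
    shows: List[Dict] = []
    next_url: Optional[str] = first_url
    while next_url and len(shows) < limit:
        page = fetch(next_url)
        if not page or "data" not in page:
            break
        shows.extend(page["data"])
        # The next link is used verbatim; it already includes any query string.
        next_url = page.get("paging", {}).get("next")
    return shows[:limit]


# Stubbed two-page response standing in for the API (illustrative only).
FAKE_PAGES = {
    "page1": {"data": [{"name": "a"}, {"name": "b"}], "paging": {"next": "page2"}},
    "page2": {"data": [{"name": "c"}]},
}

if __name__ == "__main__":
    print([show["name"] for show in collect_shows("page1", 3, FAKE_PAGES.get)])
```

Passing the fetch function as a parameter keeps the loop testable without touching the cache or the network.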
		`@ -0,0 +1 @@`
							{"key": "/WRFG/", "url": "https://www.mixcloud.com/WRFG/", "name": "WRFG Atlanta", "username": "WRFG", "pictures": {"small": "https://thumbnailer.mixcloud.com/unsafe/25x25/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "thumbnail": "https://thumbnailer.mixcloud.com/unsafe/50x50/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "medium_mobile": "https://thumbnailer.mixcloud.com/unsafe/80x80/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "medium": "https://thumbnailer.mixcloud.com/unsafe/100x100/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "large": "https://thumbnailer.mixcloud.com/unsafe/300x300/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "320wx320h": "https://thumbnailer.mixcloud.com/unsafe/320x320/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "extra_large": "https://thumbnailer.mixcloud.com/unsafe/600x600/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "640wx640h": "https://thumbnailer.mixcloud.com/unsafe/640x640/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d"}, "biog": "Founded in 1973 in Atlanta, GA, Radio Free Georgia is a non-profit, non-commercial, independent, community radio station. It broadcasts on 89.3 FM and is licensed at 100,000 watts. \n\nWRFG is committed to bringing progressive news and handpicked independent music to the metro Atlanta area via FM and the world via our internet stream. 
\n\nLearn more: https://wrfg.org/\nInstagram: https://www.instagram.com/wrfgatlanta/\nFacebook: https://www.facebook.com/wrfgatl89.3fm", "created_time": "2019-05-11T19:08:22Z", "updated_time": "2019-05-11T19:08:22Z", "follower_count": 673, "following_count": 27, "cloudcast_count": 17838, "favorite_count": 7, "listen_count": 0, "is_pro": true, "is_premium": false, "city": "Atlanta", "country": "United States", "cover_pictures": {"835wx120h": "https://thumbnailer.mixcloud.com/unsafe/835x120/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789", "1113wx160h": "https://thumbnailer.mixcloud.com/unsafe/1113x160/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789", "1670wx240h": "https://thumbnailer.mixcloud.com/unsafe/1670x240/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789"}, "picture_primary_color": "000000"}