Initial commit - Mixcloud RSS Generator
commit d7d82c4211
@ -0,0 +1,66 @@
# Changelog - Mixcloud RSS Generator

All notable changes to the Mixcloud RSS Generator component will be documented in this file.

The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).

## [Unreleased]
### To Do
- Add support for playlists and categories
- Implement feed pagination for users with many shows
- Add configurable cache expiration
- Add support for custom feed metadata

## [0.3.0] - 2025-08-04
### Added
- Specialized RSS feeds for the Revolutionary African Perspectives (RAP) show
- Filtered feeds for specific dates (for example, the July 21 episode)
- Precise show filtering capabilities
- Support for WRFG radio show feeds

### Changed
- Enhanced feed generation scripts for better show filtering
- Improved caching mechanism for faster feed updates

## [0.2.0] - 2025-07-15
### Added
- Web interface for RSS feed generation
- RESTful API endpoints for programmatic access
- Built-in caching system for improved performance
- Docker support with dedicated Dockerfile
- HTML template for web interface

### Changed
- Restructured code into modular components (src directory)
- Improved error handling for invalid Mixcloud URLs
- Enhanced feed metadata with proper iTunes tags

### Fixed
- Audio URL extraction for newer Mixcloud API changes
- Character encoding issues in show descriptions

## [0.1.0] - 2025-06-01
### Added
- Initial release with core functionality
- Command-line interface for RSS feed generation
- Support for converting Mixcloud user shows to RSS
- Basic feed generation with episode metadata
- Compatible with major podcast apps
- Configurable episode limits

### Technical Details
- Python-based implementation
- Uses Mixcloud's public API
- Generates standard RSS 2.0 feeds with podcast extensions

---

## Integration with Personal AI Assistant

This component is used by the main Personal AI Assistant project for:
- Monitoring podcast RSS feeds for new episodes
- Providing audio sources for the transcription pipeline
- Enabling podcast app compatibility for processed shows

For main project changes, see the [parent changelog](../CHANGELOG.md).
@ -0,0 +1,228 @@
# CLAUDE.md - Mixcloud RSS Generator

This file provides guidance to Claude Code when working with the Mixcloud RSS Generator component of the Personal AI Assistant project.

## Relationship to Main Project
- Part of the Personal AI Assistant ecosystem
- See main [CLAUDE.md](../CLAUDE.md) for general project guidelines
- Update [CHANGELOG.md](./CHANGELOG.md) when making changes to this component

## Understanding Component History
**New to this component?** Review [CHANGELOG.md](./CHANGELOG.md) to understand:
- Evolution from v0.1.0 CLI tool to v0.3.0 with specialized feeds
- API changes and adaptations
- Integration timeline with main project

## Project Overview
A backend-only CLI tool that converts Mixcloud user shows into RSS feeds compatible with podcast apps and feed readers. Uses shared content syndication services from the main AI Assistant project for reusability and consistency.

**Architecture Change (v1.0)**: Refactored from Flask web app to backend-only CLI using shared services.

## Technology Stack
- **Language**: Python 3.8+
- **Architecture**: Backend-only CLI with shared services
- **Key Libraries**:
  - `requests` - HTTP requests to Mixcloud (via shared services)
  - `xml.etree.ElementTree` - RSS/XML generation (via shared services)
- **Shared Services**: Content syndication components from `shared/services/content_syndication/`
- **Removed**: Flask, BeautifulSoup4 (moved to shared services)
- **Caching**: File-based caching in `./cache` directory

## Quick Start

### Backend CLI Usage
```bash
# REQUIRED FIRST STEP - Activate virtual environment (when using local development)
source venv/bin/activate

# Set PYTHONPATH for shared services
export PYTHONPATH=/path/to/my-ai-projects:$PYTHONPATH

# Generate RSS feed for a Mixcloud user
python src/cli.py WRFG

# Save to file
python src/cli.py WRFG -o feed.xml

# Limit number of episodes
python src/cli.py WRFG -l 50

# Advanced filtering
python src/cli.py WRFG --keywords "rap,public affairs" --limit 100
python src/cli.py WRFG --rap-only --limit 200
```

### Legacy Command Line (Deprecated)
```bash
# Still available for compatibility
python src/mixcloud_rss.py username -o feed.xml
```

## Architecture Notes

### New Backend-Only Architecture
1. **CLI Interface**: `src/cli.py` - New primary interface with advanced filtering
2. **Shared Services**: Uses `shared/services/content_syndication/` components:
   - `ContentSyndicationService` - Main orchestration
   - `MixcloudAPIClient` - API interactions with caching
   - `RSSFeedGenerator` - RSS 2.0 generation with iTunes extensions
   - `FeedFilterService` - Advanced filtering (dates, keywords, tags)

### RSS Feed Generation Flow
1. CLI parses arguments and builds filters
2. ContentSyndicationService orchestrates the process
3. MixcloudAPIClient fetches user data with caching
4. FeedFilterService applies filtering criteria
5. RSSFeedGenerator creates RSS 2.0 compliant XML
6. Results output to file or stdout
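
Steps 5 and 6 of the flow can be illustrated with the standard library alone. The sketch below is not the shared `RSSFeedGenerator`: the `build_rss` function, the episode dictionary keys (`title`, `url`, `audio_url`, `published`) and the enclosure `type`/`length` values are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime

ITUNES_NS = "http://www.itunes.com/dtds/podcast-1.0.dtd"
ET.register_namespace("itunes", ITUNES_NS)


def build_rss(channel_title, channel_link, episodes):
    """Return an RSS 2.0 document (as a string) for a list of episode dicts."""
    rss = ET.Element("rss", {"version": "2.0"})
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = channel_title
    ET.SubElement(channel, "link").text = channel_link
    ET.SubElement(channel, "description").text = f"Mixcloud shows from {channel_title}"
    ET.SubElement(channel, f"{{{ITUNES_NS}}}author").text = channel_title

    for episode in episodes:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = episode["title"]
        ET.SubElement(item, "link").text = episode["url"]
        ET.SubElement(item, "guid").text = episode["url"]
        # format_datetime produces the RFC 822 style date that RSS expects.
        ET.SubElement(item, "pubDate").text = format_datetime(episode["published"])
        # The MIME type and length here are placeholders, not real values.
        ET.SubElement(
            item,
            "enclosure",
            {"url": episode["audio_url"], "type": "audio/mp4", "length": "0"},
        )

    body = ET.tostring(rss, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>\n' + body


example = build_rss(
    "WRFG Atlanta",
    "https://www.mixcloud.com/WRFG/",
    [
        {
            "title": "Example show",
            "url": "https://www.mixcloud.com/WRFG/example-show/",
            "audio_url": "https://example.com/audio/example-show.m4a",
            "published": datetime(2025, 7, 21, 12, 0, tzinfo=timezone.utc),
        }
    ],
)
print(example)
```

Printing the result corresponds to the stdout case in step 6; writing it to a file opened with `encoding="utf-8"` corresponds to the `-o` case.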

### Key Files
- `src/cli.py` - **NEW** Backend CLI interface (primary)
- `src/mixcloud_rss.py` - Legacy RSS generation logic (deprecated)
- `generate_*.py` - Specialized feed generators (work with legacy code)
- `cache/` - Cached API responses (gitignored)
- **Archived**: `archived_projects/mixcloud-ui/` - Former Flask web interface

### Caching Strategy
- Default TTL: 3600 seconds (1 hour)
- Cache key: MD5 hash of request parameters
- Stored as JSON files in `./cache` directory
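
A minimal sketch of this strategy, assuming the request parameters are a JSON-serializable dictionary and that entry age is judged by file modification time. The helper names (`cache_key`, `cache_get`, `cache_put`) are hypothetical and are not taken from the shared `MixcloudAPIClient`.

```python
import hashlib
import json
import time
from pathlib import Path


def cache_key(params):
    """MD5 hex digest of the request parameters, serialized in a stable order."""
    payload = json.dumps(params, sort_keys=True).encode("utf-8")
    return hashlib.md5(payload).hexdigest()


def cache_get(cache_dir, params, ttl=3600):
    """Return the cached value, or None if it is missing or older than ttl seconds."""
    path = Path(cache_dir) / f"{cache_key(params)}.json"
    if not path.exists():
        return None
    if time.time() - path.stat().st_mtime > ttl:
        return None
    return json.loads(path.read_text(encoding="utf-8"))


def cache_put(cache_dir, params, value):
    """Write value as JSON under the key derived from params and return the path."""
    directory = Path(cache_dir)
    directory.mkdir(parents=True, exist_ok=True)
    path = directory / f"{cache_key(params)}.json"
    path.write_text(json.dumps(value), encoding="utf-8")
    return path
```

Sorting the keys before hashing means that `{"username": "WRFG", "limit": 50}` and `{"limit": 50, "username": "WRFG"}` map to the same cache file. MD5 is used here only to derive a file name, not for security.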

## Development Commands

### Testing Feed Generation
```bash
# REQUIRED FIRST STEP - Activate virtual environment (when using local development)
source venv/bin/activate
export PYTHONPATH=/path/to/my-ai-projects:$PYTHONPATH

# Test with new CLI
python src/cli.py WRFG --validate        # Quick user validation
python src/cli.py WRFG --user-info       # User information
python src/cli.py WRFG --verbose         # Verbose RSS generation

# Test filtering
python src/cli.py WRFG --rap-only --limit 100
python src/cli.py WRFG --keywords "interview" --output test.xml

# Check RSS output (pretty-prints; fails if the XML is not well-formed)
python -c "import xml.dom.minidom as m; print(m.parse('test.xml').toprettyxml())"

# Legacy testing (still works)
python src/mixcloud_rss.py WRFG
python generate_rap_feed.py
```

### Cache Management
```bash
# Clear cache
rm -rf cache/*.json

# View cached data
ls -la cache/
```

## Common Issues and Solutions

### Mixcloud API Changes
**Problem**: Feed generation fails with extraction errors
**Solution**:
- Check if Mixcloud HTML structure changed
- Update BeautifulSoup selectors in `extract_shows_from_html()`
- Look for new API endpoints in browser network tab

### Audio URL Extraction
**Problem**: "Could not extract audio URL" errors
**Solution**:
- Mixcloud often changes their audio URL format
- Check `extract_audio_url()` method
- May need to update regex patterns or API calls

### RSS Feed Validation
**Problem**: Podcast apps reject the feed
**Solution**:
- Ensure all required RSS elements are present
- Check iTunes podcast extensions
- Validate dates are in RFC 822 format
- Use online RSS validators
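
Two of these checks are easy to script before reaching for an online validator. The sketch below is a rough local pre-check, not a full RSS validator; `check_feed` is a hypothetical helper. It verifies the three channel elements that RSS 2.0 requires (`title`, `link`, `description`) and that every `pubDate` parses as an RFC 822 style date.

```python
import xml.etree.ElementTree as ET
from email.utils import parsedate_to_datetime


def check_feed(xml_text):
    """Return a list of human-readable problems found in an RSS 2.0 document."""
    root = ET.fromstring(xml_text)
    channel = root.find("channel")
    if root.tag != "rss" or channel is None:
        return ["missing <rss><channel> structure"]

    problems = []

    # RSS 2.0 requires these three channel elements.
    for name in ("title", "link", "description"):
        if not (channel.findtext(name) or "").strip():
            problems.append(f"channel is missing <{name}>")

    for index, item in enumerate(channel.findall("item"), start=1):
        pub_date = item.findtext("pubDate")
        if pub_date is None:
            continue  # pubDate is optional on items
        try:
            parsedate_to_datetime(pub_date)
        except (TypeError, ValueError):
            # Older Python versions raise TypeError, newer ones ValueError.
            problems.append(f"item {index} has an unparseable pubDate: {pub_date!r}")

    return problems
```

An empty list means these particular checks passed; it does not guarantee that a podcast app will accept the feed.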

### Character Encoding
**Problem**: Special characters appear garbled
**Solution**:
- Ensure UTF-8 encoding throughout
- Open output files with `encoding='utf-8'` (or write bytes produced by `.encode('utf-8')`)
- Set XML encoding declaration
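
As a minimal illustration of these three points, the sketch below writes a feed either from an `ElementTree` element or from an already serialized string. The function names `write_feed` and `write_feed_text` are hypothetical.

```python
import xml.etree.ElementTree as ET


def write_feed(root, path):
    """Serialize an element as UTF-8 bytes, including the XML declaration."""
    ET.ElementTree(root).write(path, encoding="utf-8", xml_declaration=True)


def write_feed_text(xml_text, path):
    """Write an already serialized feed, making the file encoding explicit."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(xml_text)
```

Relying on the platform default encoding in `open()` is a common source of garbled characters, which is why both helpers name UTF-8 explicitly.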

## Integration with Main Project

### Usage in Personal AI Assistant
1. **Podcast Monitoring**: RSS feeds enable the podcast processing pipeline
2. **Episode Detection**: New episodes detected via RSS polling
3. **Audio Source**: Provides URLs for audio downloading
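
How the main project polls feeds is not shown in this component. As a rough sketch of the idea, a poller can remember the `guid` of every item it has already processed and treat anything else as new. `new_episodes` and its return shape are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def new_episodes(feed_xml, seen_guids):
    """Return (guid, title, audio_url) tuples for items not present in seen_guids."""
    root = ET.fromstring(feed_xml)
    fresh = []
    for item in root.iter("item"):
        # Fall back to the link when an item carries no guid.
        guid = item.findtext("guid") or item.findtext("link")
        if guid is None or guid in seen_guids:
            continue
        enclosure = item.find("enclosure")
        audio_url = enclosure.get("url") if enclosure is not None else None
        fresh.append((guid, item.findtext("title"), audio_url))
    return fresh
```

The `audio_url` taken from the `<enclosure>` element is what a downloader would fetch in the third step.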

### Specialized Feeds
The `generate_*.py` scripts create filtered feeds for specific shows:
- `generate_rap_feed.py` - Revolutionary African Perspectives shows
- `generate_july21_feed.py` - Specific date filtering

## Testing

### Manual Testing
```bash
# REQUIRED FIRST STEP - Activate virtual environment (when using local development)
source venv/bin/activate

# Test basic functionality
python src/mixcloud_rss.py WRFG -o test.xml
head -20 test.xml  # Check output

# Test web interface (legacy: the Flask app was archived to archived_projects/mixcloud-ui/)
python src/web_app.py
# Visit http://localhost:5000 and test form submission
```

### Feed Validation
- Use https://validator.w3.org/feed/ for RSS validation
- Test in actual podcast apps (Apple Podcasts, Overcast)
- Check all links are accessible

## Deployment Considerations

### Docker Support
```bash
# Build image
docker build -t mixcloud-rss .

# Run container
docker run -p 5000:5000 mixcloud-rss
```

### Environment Variables
- No required environment variables
- Optional: `CACHE_DIR`, `CACHE_TTL` for customization
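
How the tool actually reads these variables is not shown here. A minimal sketch, assuming the fallbacks are the documented defaults (`./cache` and 3600 seconds); `cache_settings` is a hypothetical helper.

```python
import os


def cache_settings(environ=None):
    """Read CACHE_DIR and CACHE_TTL, falling back to the documented defaults."""
    env = os.environ if environ is None else environ
    cache_dir = env.get("CACHE_DIR", "./cache")
    raw_ttl = env.get("CACHE_TTL", "3600")
    try:
        ttl = int(raw_ttl)
    except ValueError:
        raise ValueError(
            f"CACHE_TTL must be an integer number of seconds, got {raw_ttl!r}"
        ) from None
    return cache_dir, ttl
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to test.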

## Troubleshooting

### Debug Mode
```python
# Add to scripts for verbose output
import logging
logging.basicConfig(level=logging.DEBUG)
```

### Common Error Messages
- **"No shows found"**: Check if username is correct or if Mixcloud is accessible
- **"Cache directory not writable"**: Ensure `./cache` exists and has write permissions
- **"Invalid XML"**: Check for unescaped special characters in show data
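
For the "Invalid XML" case, the usual culprits are `&`, `<` and `>` in show titles or descriptions. The short example below, with a made-up title, shows two standard-library ways to handle them: escaping by hand when building strings, and letting `ElementTree` escape text during serialization.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

title = 'Rock & Roll <live> at "The Earl"'

# When building XML by string formatting, escape the text first.
manual = f"<title>{escape(title)}</title>"

# ElementTree escapes text automatically when it serializes an element.
element = ET.Element("title")
element.text = title
automatic = ET.tostring(element, encoding="unicode")

print(manual)
print(automatic)
```

Both lines print the same escaped markup, and parsing either one returns the original title.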

### Performance
- Cache is crucial for performance
- Consider implementing Redis cache for production
- Batch requests when generating multiple feeds

## Best Practices
1. Always validate generated RSS feeds
2. Test with multiple podcast apps
3. Monitor Mixcloud for API/structure changes
4. Keep cache directory clean in development
5. Log errors for debugging production issues
@ -0,0 +1,23 @@
FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application code
COPY src/ ./src/

# Create cache directory
RUN mkdir -p /app/cache

# Expose port
EXPOSE 5000

# Set environment variables
ENV FLASK_APP=src/web_app.py
ENV PYTHONUNBUFFERED=1

# Run the web server
CMD ["python", "src/web_app.py"]
@ -0,0 +1,214 @@
# Mixcloud RSS Generator (Backend CLI)

Convert Mixcloud shows into RSS feeds using a lightweight command-line interface that leverages shared content syndication services.

## Features

- 🎵 Convert any Mixcloud user's shows into an RSS feed
- 📱 Compatible with podcast apps (Apple Podcasts, Overcast, etc.)
- 🚀 Fast with built-in caching
- 🔧 Backend-only CLI tool (no web interface)
- 📡 Advanced filtering options (keywords, dates, tags)
- ♻️ Uses shared services for reusability across projects

## Architecture

This project has been refactored to use shared services:
- **Backend Services**: Located in `shared/services/content_syndication/`
- **CLI Interface**: `src/cli.py` provides command-line access
- **Legacy Components**: Web UI archived in `archived_projects/mixcloud-ui/`

## Installation

```bash
# Navigate to the mixcloud-rss-generator directory
cd mixcloud-rss-generator

# Install dependencies
pip install -r requirements.txt

# Ensure shared services are accessible
export PYTHONPATH=/path/to/my-ai-projects:$PYTHONPATH
```

## Usage

### Basic Usage

Generate RSS feed from Mixcloud user:

```bash
# Basic RSS generation
python src/cli.py WRFG

# From full Mixcloud URL
python src/cli.py --url https://www.mixcloud.com/NTSRadio/

# Save to file with custom limit
python src/cli.py WRFG --limit 50 --output feed.xml
```

### Advanced Filtering

```bash
# Filter by keywords in title
python src/cli.py WRFG --keywords "rap,public affairs" --limit 100

# Filter by date range
python src/cli.py WRFG --date-range 2024-01-01 2024-12-31

# Filter by specific dates
python src/cli.py WRFG --specific-dates "July 21,Aug 15,2024-09-01"

# Revolutionary African Perspectives only (convenience filter)
python src/cli.py WRFG --rap-only --limit 100

# Combine multiple filters
python src/cli.py WRFG --keywords "interview" --tags "house,techno" --limit 30
```
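
The exact matching rules live in the shared `FeedFilterService` and are not documented here. As a rough sketch of what a comma-separated `--keywords` filter might do, the code below assumes case-insensitive substring matching on the show title, with a show kept if any keyword matches. The `name` key and both function names are assumptions.

```python
def parse_keywords(raw):
    """Split a comma-separated keyword string into lowercase terms."""
    return [term.strip().lower() for term in raw.split(",") if term.strip()]


def filter_by_keywords(shows, raw_keywords):
    """Keep shows whose title contains at least one keyword (case-insensitive)."""
    terms = parse_keywords(raw_keywords)
    if not terms:
        return list(shows)
    return [
        show
        for show in shows
        if any(term in show["name"].lower() for term in terms)
    ]


shows = [
    {"name": "RAP - Revolutionary African Perspectives"},
    {"name": "Public Affairs Hour"},
    {"name": "Jazz Classics"},
    {"name": "Music Therapy"},
]
matched = [show["name"] for show in filter_by_keywords(shows, "rap, public affairs")]
print(matched)
```

Note that plain substring matching also keeps "Music Therapy", because "therapy" contains "rap". A stricter filter would match whole words, for example with a `\b`-delimited regular expression.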

### Utility Operations

```bash
# Validate user without generating feed
python src/cli.py WRFG --validate

# Get user information
python src/cli.py WRFG --user-info

# Verbose output for debugging
python src/cli.py WRFG --verbose
```

### Integration with Podcast Apps

1. Generate RSS feed and save to a publicly accessible location
2. Use the file path or served URL in your podcast app:
   - **Apple Podcasts**: File → Add Show by URL
   - **Overcast**: Plus button → Add URL
   - **Pocket Casts**: Search → Enter URL
   - **Castro**: Library → Sources → Plus → Add Podcast by URL

## Shared Services Architecture

The RSS generation now uses modular services from `shared/services/content_syndication/`:

- **ContentSyndicationService**: Main orchestration service
- **MixcloudAPIClient**: Handles Mixcloud API interactions with caching
- **RSSFeedGenerator**: Creates RSS 2.0 compliant feeds
- **FeedFilterService**: Advanced content filtering capabilities

## Configuration

### Cache Settings

```bash
# Custom cache directory and TTL
python src/cli.py WRFG --cache-dir ./custom-cache --cache-ttl 7200
```

### Environment Variables

- `PYTHONPATH`: Must include parent project directory for shared imports
- No other environment variables required for basic operation

## Examples

### Generate Filtered Feed

```python
# In Python script
from shared.services.content_syndication import ContentSyndicationService

# Initialize service
service = ContentSyndicationService(cache_dir="./cache", cache_ttl=3600)

# Generate with filters
filters = {"keywords": "rap,public affairs", "start_date": "2024-01-01"}
rss_feed = service.generate_mixcloud_rss("WRFG", limit=50, filters=filters)
```

### Integration with Main AI Project

```python
# Use in podcast processing pipeline
from shared.services.content_syndication import ContentSyndicationService

service = ContentSyndicationService()
rss_feed = service.generate_rap_feed("WRFG", limit=100)  # Convenience method

# Save for podcast processing (explicit UTF-8 avoids garbled show titles)
with open("data/feeds/wrfg_rap.xml", "w", encoding="utf-8") as f:
    f.write(rss_feed)
```

## Migration Notes

### From Web Interface

The web interface (`web_app.py` and `templates/`) has been archived to `archived_projects/mixcloud-ui/`. Key changes:

- **Before**: `python src/web_app.py` (Flask web server)
- **After**: `python src/cli.py [options]` (CLI tool)

### From Legacy Script

The original `mixcloud_rss.py` remains for compatibility, but new usage should prefer the CLI:

- **Legacy**: `python src/mixcloud_rss.py username`
- **New**: `python src/cli.py username`

## Troubleshooting

### Import Errors
```bash
# Ensure PYTHONPATH includes project root
export PYTHONPATH=/path/to/my-ai-projects:$PYTHONPATH
python src/cli.py WRFG --validate
```

### Cache Issues
```bash
# Clear cache if experiencing stale data
rm -rf cache/*.json
python src/cli.py WRFG --verbose
```

### API Errors
- **User not found**: Check username spelling and profile visibility
- **No shows**: User might have private shows or no content
- **Rate limiting**: Wait between requests or increase cache TTL

## Advanced Usage

### Specialized Feeds

The CLI includes convenience options for specialized content:

```bash
# Revolutionary African Perspectives shows only
python src/cli.py WRFG --rap-only --limit 200

# Recent interviews only
python src/cli.py WRFG --keywords "interview" --date-range 2024-01-01 $(date +%Y-%m-%d)
```

### Batch Processing

```bash
# Process multiple users
mkdir -p feeds
for user in WRFG NTSRadio ResidentAdvisor; do
    python src/cli.py "$user" --output "feeds/${user}.xml" --limit 50
done
```

## Integration with AI Assistant

This RSS generator integrates with the Personal AI Assistant project for:
- **Podcast Processing**: RSS feeds enable episode detection
- **Audio Analysis**: Provides metadata for audio processing
- **Content Monitoring**: Automated feed checking for new episodes

## License

MIT License - part of the Personal AI Assistant ecosystem.
File diff suppressed because one or more lines are too long
@ -0,0 +1 @@
{"key": "/WRFG/", "url": "https://www.mixcloud.com/WRFG/", "name": "WRFG Atlanta", "username": "WRFG", "pictures": {"small": "https://thumbnailer.mixcloud.com/unsafe/25x25/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "thumbnail": "https://thumbnailer.mixcloud.com/unsafe/50x50/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "medium_mobile": "https://thumbnailer.mixcloud.com/unsafe/80x80/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "medium": "https://thumbnailer.mixcloud.com/unsafe/100x100/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "large": "https://thumbnailer.mixcloud.com/unsafe/300x300/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "320wx320h": "https://thumbnailer.mixcloud.com/unsafe/320x320/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "extra_large": "https://thumbnailer.mixcloud.com/unsafe/600x600/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "640wx640h": "https://thumbnailer.mixcloud.com/unsafe/640x640/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d"}, "biog": "Founded in 1973 in Atlanta, GA, Radio Free Georgia is a non-profit, non-commercial, independent, community radio station. It broadcasts on 89.3 FM and is licensed at 100,000 watts. \n\nWRFG is committed to bringing progressive news and handpicked independent music to the metro Atlanta area via FM and the world via our internet stream. 
\n\nLearn more: https://wrfg.org/\nInstagram: https://www.instagram.com/wrfgatlanta/\nFacebook: https://www.facebook.com/wrfgatl89.3fm", "created_time": "2019-05-11T19:08:22Z", "updated_time": "2019-05-11T19:08:22Z", "follower_count": 673, "following_count": 27, "cloudcast_count": 17838, "favorite_count": 7, "listen_count": 0, "is_pro": true, "is_premium": false, "city": "Atlanta", "country": "United States", "cover_pictures": {"835wx120h": "https://thumbnailer.mixcloud.com/unsafe/835x120/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789", "1113wx160h": "https://thumbnailer.mixcloud.com/unsafe/1113x160/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789", "1670wx240h": "https://thumbnailer.mixcloud.com/unsafe/1670x240/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789"}, "picture_primary_color": "000000"}
@ -0,0 +1 @@
{"key": "/WRFG/", "url": "https://www.mixcloud.com/WRFG/", "name": "WRFG Atlanta", "username": "WRFG", "pictures": {"small": "https://thumbnailer.mixcloud.com/unsafe/25x25/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "thumbnail": "https://thumbnailer.mixcloud.com/unsafe/50x50/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "medium_mobile": "https://thumbnailer.mixcloud.com/unsafe/80x80/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "medium": "https://thumbnailer.mixcloud.com/unsafe/100x100/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "large": "https://thumbnailer.mixcloud.com/unsafe/300x300/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "320wx320h": "https://thumbnailer.mixcloud.com/unsafe/320x320/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "extra_large": "https://thumbnailer.mixcloud.com/unsafe/600x600/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d", "640wx640h": "https://thumbnailer.mixcloud.com/unsafe/640x640/profile/f/4/3/2/f015-1494-4464-8f0c-9c5efa4ef91d"}, "biog": "Founded in 1973 in Atlanta, GA, Radio Free Georgia is a non-profit, non-commercial, independent, community radio station. It broadcasts on 89.3 FM and is licensed at 100,000 watts. \n\nWRFG is committed to bringing progressive news and handpicked independent music to the metro Atlanta area via FM and the world via our internet stream. 
\n\nLearn more: https://wrfg.org/\nInstagram: https://www.instagram.com/wrfgatlanta/\nFacebook: https://www.facebook.com/wrfgatl89.3fm", "created_time": "2019-05-11T19:08:22Z", "updated_time": "2019-05-11T19:08:22Z", "follower_count": 673, "following_count": 27, "cloudcast_count": 17838, "favorite_count": 7, "listen_count": 0, "is_pro": true, "is_premium": false, "city": "Atlanta", "country": "United States", "cover_pictures": {"835wx120h": "https://thumbnailer.mixcloud.com/unsafe/835x120/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789", "1113wx160h": "https://thumbnailer.mixcloud.com/unsafe/1113x160/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789", "1670wx240h": "https://thumbnailer.mixcloud.com/unsafe/1670x240/profile_cover/0/1/5/2/323b-581d-4ca3-be95-e5ddb0e22789"}, "picture_primary_color": "000000"}
@ -0,0 +1,19 @@
version: '3.8'

services:
  mixcloud-rss:
    build: .
    container_name: mixcloud-rss-generator
    ports:
      - "5000:5000"
    volumes:
      - ./cache:/app/cache
    environment:
      - SECRET_KEY=${SECRET_KEY:-your-secret-key-here}
    restart: unless-stopped
    healthcheck:
      # python:3.11-slim does not ship curl, so probe with the Python standard library
      test: ["CMD", "python", "-c", "import urllib.request; urllib.request.urlopen('http://localhost:5000/health')"]
      interval: 30s
      timeout: 10s
      retries: 3
      start_period: 40s
@ -0,0 +1,63 @@
#!/usr/bin/env python3
"""
Generate RSS feed for specific dates (e.g., July 21 show)
"""

import sys
import xml.etree.ElementTree as ET
from urllib.parse import quote

from src.mixcloud_rss import MixcloudRSSGenerator


def generate_filtered_feed(username, specific_dates):
    """Generate RSS feed filtered by specific dates."""

    # Create generator
    generator = MixcloudRSSGenerator()

    # Set up filters
    filters = {
        'specific_dates': specific_dates
    }

    # Generate feed
    print(f"Generating RSS feed for {username} filtered by dates: {specific_dates}")
    rss_feed = generator.generate_rss_from_username(username, limit=50, filters=filters)

    if rss_feed:
        # Save to file
        filename = f"{username}_filtered_{specific_dates.replace(',', '_').replace(' ', '')}.xml"
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(rss_feed)
        print(f"✅ RSS feed saved to: {filename}")

        # Also print the RSS URL for the web server (dates are URL-encoded)
        print("\n📡 RSS URL for web server:")
        print(f"http://localhost:5000/rss/{username}?limit=50&specific_dates={quote(specific_dates)}")

        # Count episodes
        root = ET.fromstring(rss_feed)
        items = root.findall('.//item')
        print(f"\n📊 Found {len(items)} episodes matching the filter")

        # Show episode details
        if items:
            print("\n📅 Matching episodes:")
            for item in items:
                # findtext tolerates items that lack a title or pubDate
                title = item.findtext('title', default='(untitled)')
                pub_date = item.findtext('pubDate', default='unknown date')
                print(f"  - {title} ({pub_date})")
    else:
        print("❌ Error: Could not generate RSS feed")


if __name__ == "__main__":
    if len(sys.argv) < 2:
        print("Usage: python generate_july21_feed.py <username> [dates]")
        print("Example: python generate_july21_feed.py djusername 'July 21'")
        print("Example: python generate_july21_feed.py djusername 'July 21, August 15'")
        sys.exit(1)

    username = sys.argv[1]
    dates = sys.argv[2] if len(sys.argv) > 2 else "July 21"

    generate_filtered_feed(username, dates)
@ -0,0 +1,74 @@
#!/usr/bin/env python3
"""
Generate RSS feed for Public Affairs RAP (Revolutionary African Perspectives) shows
"""

from src.mixcloud_rss import MixcloudRSSGenerator
import xml.etree.ElementTree as ET
from datetime import datetime


def generate_rap_feed(username="WRFG"):
    """Generate RSS feed filtered for RAP shows."""

    # Create generator
    generator = MixcloudRSSGenerator()

    # Set up filters for "Public Affairs" in the title
    # This should catch variations like "afrikan" vs "african"
    filters = {
        'keywords': 'public affairs'
    }

    # Generate feed with a higher limit to catch all shows
    print(f"Generating RSS feed for {username} filtered by 'Public Affairs' shows...")
    rss_feed = generator.generate_rss_from_username(username, limit=100, filters=filters)

    if rss_feed:
        # Save to file
        filename = f"{username}_public_affairs_rap.xml"
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(rss_feed)
        print(f"✅ RSS feed saved to: {filename}")

        # Also print the RSS URL for the web server
        feed_url = f"http://localhost:5000/rss/{username}?limit=100&keywords=public%20affairs"
        print("\n📡 RSS URL for web server:")
        print(feed_url)

        # Parse and show episodes
        root = ET.fromstring(rss_feed)
        items = root.findall('.//item')
        print(f"\n📊 Found {len(items)} 'Public Affairs' episodes")

        # Show episode details
        if items:
            print("\n📅 Public Affairs RAP episodes:")
            for item in items:
                # findtext tolerates items that lack one of these elements
                title = item.findtext('title', default='(untitled)')
                pub_date_str = item.findtext('pubDate', default='')
                link = item.findtext('link', default='')

                # Parse date for better display; fall back to the raw string
                try:
                    pub_date = datetime.strptime(pub_date_str, "%a, %d %b %Y %H:%M:%S %z")
                    date_display = pub_date.strftime("%B %d, %Y")
                except ValueError:
                    date_display = pub_date_str

                print(f"\n  📻 {title}")
                print(f"     Date: {date_display}")
                print(f"     URL: {link}")

                # Check if it's the July 21 show
                if "21 july" in title.lower() or "july 21" in title.lower():
                    print("     ⭐ This is the July 21 show!")

        # Generate specific URL for your podcast system
        print("\n🎯 For your podcast processing system, use this RSS URL:")
        print(feed_url)

    else:
        print("❌ Error: Could not generate RSS feed")


if __name__ == "__main__":
    generate_rap_feed()
@ -0,0 +1,85 @@
#!/usr/bin/env python3
"""
Generate RSS feed for ONLY the RAP (Revolutionary African/Afrikan Perspectives) shows
"""

from src.mixcloud_rss import MixcloudRSSGenerator
import xml.etree.ElementTree as ET
from datetime import datetime

def generate_rap_only_feed(username="WRFG"):
    """Generate RSS feed filtered for ONLY RAP shows."""

    # Create generator
    generator = MixcloudRSSGenerator()

    # Set up filters for "RAP" in the title
    # This will catch both "African" and "Afrikan" variations
    filters = {
        # Matches "RAP - Revolutionary African/Afrikan Perspectives". The match is a
        # case-insensitive substring test, so any title containing "rap" also matches.
        'keywords': 'RAP'
    }

    # Generate feed with a higher limit to catch all shows
    print(f"Generating RSS feed for {username} filtered by RAP shows only...")
    rss_feed = generator.generate_rss_from_username(username, limit=200, filters=filters)

    if rss_feed:
        # Save to file
        filename = f"{username}_rap_only.xml"
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(rss_feed)
        print(f"✅ RSS feed saved to: {filename}")

        # Also print the RSS URL for the web server
        print("\n📡 RSS URL for web server:")
        print(f"http://localhost:5000/rss/{username}?limit=200&keywords=RAP")

        # Parse and show episodes
        root = ET.fromstring(rss_feed)
        items = root.findall('.//item')
        print(f"\n📊 Found {len(items)} RAP episodes")

        # Show episode details
        if items:
            print("\n📅 Revolutionary African/Afrikan Perspectives episodes:")
            for item in items:
                # findtext() returns the default instead of failing on a missing element
                title = item.findtext('title', default='')
                pub_date_str = item.findtext('pubDate', default='')
                link = item.findtext('link', default='')
                description = item.findtext('description', default='')

                # Parse date for better display
                try:
                    pub_date = datetime.strptime(pub_date_str, "%a, %d %b %Y %H:%M:%S %z")
                    date_display = pub_date.strftime("%B %d, %Y")
                except ValueError:
                    # Fall back to the raw string if it is not in RFC 822 format
                    date_display = pub_date_str

                print(f"\n 📻 {title}")
                print(f" Date: {date_display}")
                print(f" URL: {link}")

                # Check if it's the July 21 show
                if "21 july" in title.lower() or "july 21" in title.lower():
                    print(" ⭐ This is the July 21 show!")

                # Check for African vs Afrikan spelling
                if "afrikan" in title.lower():
                    print(" 📝 Note: Uses 'Afrikan' spelling")
                elif "african" in title.lower():
                    print(" 📝 Note: Uses 'African' spelling")

        # Generate specific URL for your podcast system
        print("\n🎯 For your podcast processing system, use this RSS URL:")
        print(f"http://localhost:5000/rss/{username}?limit=200&keywords=RAP")

        # Also create a direct link to the July 21 episode
        print("\n🔗 Direct link to July 21 RAP show:")
        print("https://www.mixcloud.com/WRFG/public-affairs-rap-revolutionary-african-perspectives-21-july-2025/")

    else:
        print("❌ Error: Could not generate RSS feed")


if __name__ == "__main__":
    generate_rap_only_feed()
@ -0,0 +1,90 @@
#!/usr/bin/env python3
"""
Generate RSS feed for ONLY the Revolutionary African/Afrikan Perspectives shows
Using a single distinctive keyword to be more precise
"""

from src.mixcloud_rss import MixcloudRSSGenerator
import xml.etree.ElementTree as ET
from datetime import datetime

def generate_rap_precise_feed(username="WRFG"):
    """Generate RSS feed filtered for ONLY Revolutionary African/Afrikan Perspectives shows."""

    # Create generator
    generator = MixcloudRSSGenerator()

    # Set up filters - use "revolutionary" as it's unique to these shows
    filters = {
        'keywords': 'revolutionary'  # This should only match the RAP shows
    }

    # Generate feed
    print(f"Generating RSS feed for {username} filtered by Revolutionary African/Afrikan Perspectives shows...")
    rss_feed = generator.generate_rss_from_username(username, limit=200, filters=filters)

    if rss_feed:
        # Save to file
        filename = f"{username}_revolutionary_african_perspectives.xml"
        with open(filename, 'w', encoding='utf-8') as f:
            f.write(rss_feed)
        print(f"✅ RSS feed saved to: {filename}")

        # RSS URLs for different filtering options
        print("\n📡 RSS URLs for web server:")
        print(f"Option 1 (by 'revolutionary'): http://localhost:5000/rss/{username}?limit=200&keywords=revolutionary")
        # Comma-separated keywords are OR-ed: a title matching either one is included
        print(f"Option 2 (by 'public affairs' or 'revolutionary'): http://localhost:5000/rss/{username}?limit=200&keywords=public%20affairs,revolutionary")

        # Parse and show episodes
        root = ET.fromstring(rss_feed)
        items = root.findall('.//item')
        print(f"\n📊 Found {len(items)} Revolutionary African/Afrikan Perspectives episodes")

        # Show episode details
        if items:
            print("\n📅 All Revolutionary African/Afrikan Perspectives episodes:")
            july_21_found = False
            july_21_url = None

            for item in items:
                # findtext() returns the default instead of failing on a missing element
                title = item.findtext('title', default='')
                pub_date_str = item.findtext('pubDate', default='')
                link = item.findtext('link', default='')

                # Parse date for better display
                try:
                    pub_date = datetime.strptime(pub_date_str, "%a, %d %b %Y %H:%M:%S %z")
                    date_display = pub_date.strftime("%B %d, %Y")
                    show_date = pub_date.strftime("%Y-%m-%d")
                except ValueError:
                    # Fall back to the raw string if it is not in RFC 822 format
                    date_display = pub_date_str
                    show_date = ""

                print(f"\n 📻 {title}")
                print(f" Date: {date_display}")
                print(f" URL: {link}")

                # Check if it's the July 21 show
                if "21 july" in title.lower() or "july 21" in title.lower() or show_date == "2025-07-21":
                    print(" ⭐ This is the July 21, 2025 show!")
                    july_21_found = True
                    july_21_url = link

            if july_21_found:
                print("\n✨ JULY 21 SHOW FOUND!")
                print(f"Direct URL: {july_21_url}")
                print("\nTo analyze this specific show in your podcast system:")
                print("1. Use the RSS feed URL above")
                print("2. Or process this specific episode URL directly")

        # Summary
        print("\n📈 Summary:")
        print(f"- Total RAP episodes found: {len(items)}")
        print("- These are weekly shows featuring Revolutionary African/Afrikan Perspectives")
        print("- The feed includes both 'African' and 'Afrikan' spelling variations")

    else:
        print("❌ Error: Could not generate RSS feed")


if __name__ == "__main__":
    generate_rap_precise_feed()
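The three scripts above differ mainly in the keyword they pass. The sketch below reproduces the matching rule used by `_filter_shows` in `src/mixcloud_rss.py` (case-insensitive substring match, where any one of the comma-separated keywords may match) to show why `RAP` is a looser filter than `revolutionary`. The helper name `title_matches` and the show titles are made up for illustration; they are not taken from the project or from Mixcloud.

```python
def title_matches(title: str, keywords: str) -> bool:
    """Return True if any comma-separated keyword occurs in the title (case-insensitive)."""
    lowered = title.lower()
    return any(keyword.strip() in lowered for keyword in keywords.lower().split(","))


# Hypothetical titles, chosen to show the substring effect.
titles = [
    "Public Affairs: RAP - Revolutionary African Perspectives - 21 July 2025",
    "Music Therapy Hour",           # contains "rap" inside "therapy"
    "Public Affairs: Labor Forum",
]

match_counts = {}
for keywords in ("RAP", "revolutionary", "public affairs,revolutionary"):
    match_counts[keywords] = sum(title_matches(t, keywords) for t in titles)
    print(f"{keywords!r}: {match_counts[keywords]} matching title(s)")
```

With these sample titles, `RAP` also picks up the "Therapy" title, and the comma-separated form widens the result rather than narrowing it, because the keywords are OR-ed.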
@ -0,0 +1,68 @@
* Serving Flask app 'web_app'
* Debug mode: on
INFO:werkzeug:WARNING: This is a development server. Do not use it in a production deployment. Use a production WSGI server instead.
* Running on all addresses (0.0.0.0)
* Running on http://127.0.0.1:5000
* Running on http://192.168.68.59:5000
INFO:werkzeug:Press CTRL+C to quit
INFO:werkzeug: * Restarting with stat
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 785-868-005
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 22:53:22] "GET /health HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 22:55:20] "GET / HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 22:55:21] "GET /favicon.ico HTTP/1.1" 404 -
INFO:werkzeug: * Detected change in '/var/home/enias/Claude/MyProject/personal-ai-assistant/mixcloud-rss-generator/src/mixcloud_rss.py', reloading
INFO:werkzeug: * Restarting with stat
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 785-868-005
INFO:werkzeug: * Detected change in '/var/home/enias/Claude/MyProject/personal-ai-assistant/mixcloud-rss-generator/src/mixcloud_rss.py', reloading
INFO:werkzeug: * Restarting with stat
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 785-868-005
INFO:werkzeug: * Detected change in '/var/home/enias/Claude/MyProject/personal-ai-assistant/mixcloud-rss-generator/src/web_app.py', reloading
INFO:werkzeug: * Restarting with stat
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 785-868-005
INFO:werkzeug: * Detected change in '/var/home/enias/Claude/MyProject/personal-ai-assistant/mixcloud-rss-generator/src/web_app.py', reloading
INFO:werkzeug: * Restarting with stat
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 785-868-005
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:00:37] "GET / HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:00:51] "POST /generate HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:01:03] "POST /api/validate HTTP/1.1" 200 -
ERROR:__main__:Error generating RSS: can't compare offset-naive and offset-aware datetimes
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:01:21] "POST /generate HTTP/1.1" 500 -
ERROR:__main__:Error generating RSS: can't compare offset-naive and offset-aware datetimes
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:01:30] "POST /generate HTTP/1.1" 500 -
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:01:40] "GET / HTTP/1.1" 200 -
ERROR:__main__:Error generating RSS: can't compare offset-naive and offset-aware datetimes
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:01:43] "POST /generate HTTP/1.1" 500 -
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:01:53] "POST /generate HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:07:42] "GET /rss/WRFG?limit=200&keywords=revolutionary HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:15:57] "GET /rss/WRFG?limit=200&keywords=revolutionary HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:19:38] "GET /rss/WRFG?limit=200&keywords=revolutionary HTTP/1.1" 200 -
INFO:werkzeug:127.0.0.1 - - [26/Jul/2025 23:29:23] "GET /rss/WRFG?limit=200&keywords=revolutionary HTTP/1.1" 200 -
INFO:werkzeug: * Detected change in '/var/home/enias/.local/lib/python3.10/site-packages/nvidia/__init__.py', reloading
INFO:werkzeug: * Restarting with stat
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 785-868-005
INFO:werkzeug: * Detected change in '/var/home/enias/.local/lib/python3.10/site-packages/nvidia/__init__.py', reloading
INFO:werkzeug: * Restarting with stat
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 785-868-005
INFO:werkzeug: * Detected change in '/var/home/enias/.local/lib/python3.10/site-packages/nvidia/__init__.py', reloading
INFO:werkzeug: * Restarting with stat
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 785-868-005
INFO:werkzeug: * Detected change in '/var/home/enias/.local/lib/python3.10/site-packages/nvidia/__init__.py', reloading
INFO:werkzeug: * Restarting with stat
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 785-868-005
INFO:werkzeug: * Detected change in '/var/home/enias/.local/lib/python3.10/site-packages/nvidia/__init__.py', reloading
INFO:werkzeug: * Restarting with stat
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 785-868-005
INFO:werkzeug: * Detected change in '/var/home/enias/.local/lib/python3.10/site-packages/nvidia/__init__.py', reloading
INFO:werkzeug: * Restarting with stat
WARNING:werkzeug: * Debugger is active!
INFO:werkzeug: * Debugger PIN: 785-868-005
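The `can't compare offset-naive and offset-aware datetimes` entries in the log above are the message of the `TypeError` Python raises when a timezone-aware `datetime` is ordered against a naive one. The sketch below shows how a date filter can trigger it and one way to avoid it. It assumes, which the log itself does not confirm, that the failing requests passed a date-only value such as `2025-07-01` while show timestamps carry a UTC offset. `to_aware_utc` is a hypothetical helper and the timestamps are illustrative.

```python
from datetime import datetime, timezone


def to_aware_utc(value: str) -> datetime:
    """Parse an ISO 8601 string, treating values without an offset as UTC."""
    parsed = datetime.fromisoformat(value.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        parsed = parsed.replace(tzinfo=timezone.utc)
    return parsed


show_time = datetime.fromisoformat("2025-07-21T18:00:00+00:00")  # aware
start_naive = datetime.fromisoformat("2025-07-01")               # naive

# Ordering an aware datetime against a naive one raises TypeError.
try:
    show_time >= start_naive
    comparison_failed = False
except TypeError as exc:
    comparison_failed = True
    print(f"TypeError: {exc}")

# Normalizing both sides to aware UTC makes the comparison well defined.
in_range = to_aware_utc("2025-07-21T18:00:00Z") >= to_aware_utc("2025-07-01")
print(f"in range: {in_range}")
```

Comparing only the `.date()` parts is another option when the filter is meant to work at day granularity.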
@ -0,0 +1,3 @@
requests>=2.31.0
beautifulsoup4>=4.12.0
lxml>=4.9.0
Binary file not shown.
@ -0,0 +1,189 @@
#!/usr/bin/env python3
"""
Backend-only CLI for Mixcloud RSS Generation

Uses shared content syndication services for RSS generation.
Replaces web_app.py and legacy mixcloud_rss.py dependencies.
"""

import argparse
import os
import sys

# Add parent directories to path for shared imports
sys.path.insert(0, os.path.join(os.path.dirname(__file__), '../../'))

from shared.services.content_syndication import (
    ContentSyndicationService,
    FeedFilterService
)


def main():
    """Command-line interface for backend Mixcloud RSS generation."""
    parser = argparse.ArgumentParser(
        description="Generate RSS feeds from Mixcloud users (Backend CLI)",
        formatter_class=argparse.RawDescriptionHelpFormatter,
        epilog="""
Examples:
  %(prog)s WRFG                                      # Basic RSS for WRFG user
  %(prog)s --url https://mixcloud.com/NTSRadio/      # From full URL
  %(prog)s WRFG --limit 50 --output feed.xml         # Save 50 episodes to file
  %(prog)s WRFG --keywords "rap,public affairs"      # Filter by keywords
  %(prog)s WRFG --rap-only                           # RAP shows only
  %(prog)s WRFG --date-range 2024-01-01 2024-12-31   # Date filtering
"""
    )

    # Input options
    input_group = parser.add_mutually_exclusive_group(required=True)
    input_group.add_argument("username", nargs='?', help="Mixcloud username")
    input_group.add_argument("--url", help="Mixcloud URL")

    # Output options
    parser.add_argument("-o", "--output", help="Output file path (default: stdout)")
    parser.add_argument("-l", "--limit", type=int, default=20,
                        help="Number of episodes to include (default: 20)")

    # Caching options
    parser.add_argument("--cache-dir", default="./cache",
                        help="Cache directory path (default: ./cache)")
    parser.add_argument("--cache-ttl", type=int, default=3600,
                        help="Cache TTL in seconds (default: 3600)")

    # Filtering options
    filter_group = parser.add_argument_group("Filtering Options")
    filter_group.add_argument("--keywords",
                              help="Filter by keywords in title (comma-separated)")
    filter_group.add_argument("--tags",
                              help="Filter by tags (comma-separated)")
    filter_group.add_argument("--date-range", nargs=2, metavar=('START', 'END'),
                              help="Filter by date range (YYYY-MM-DD format)")
    filter_group.add_argument("--specific-dates",
                              help="Filter by specific dates (comma-separated)")

    # Convenience options
    convenience_group = parser.add_argument_group("Convenience Options")
    convenience_group.add_argument("--rap-only", action='store_true',
                                   help="Filter for Revolutionary African Perspectives shows only")

    # Utility options
    parser.add_argument("--validate", action='store_true',
                        help="Validate user without generating feed")
    parser.add_argument("--user-info", action='store_true',
                        help="Show user information only")
    parser.add_argument("--verbose", "-v", action='store_true',
                        help="Verbose output")

    args = parser.parse_args()

    # Initialize content syndication service once; it is also used to resolve --url
    syndication_service = ContentSyndicationService(args.cache_dir, args.cache_ttl)

    # Determine username
    username = args.username
    if args.url:
        # Extract username from URL using the service's Mixcloud client
        try:
            username = syndication_service.mixcloud_client.extract_username_from_url(args.url)
        except Exception as e:
            print(f"Error: {e}", file=sys.stderr)
            return 1
        if not username:
            print(f"Error: Could not extract username from URL: {args.url}", file=sys.stderr)
            return 1

    if not username:
        print("Error: No username provided", file=sys.stderr)
        return 1

    # Handle validation only
    if args.validate:
        result = syndication_service.validate_mixcloud_user(username)
        if result['valid']:
            print(f"✅ Valid user: {result['username']} ({result['name']}) - {result['show_count']} shows")
            return 0
        else:
            print(f"❌ Invalid user: {result['message']}")
            return 1

    # Handle user info only
    if args.user_info:
        user_data = syndication_service.get_mixcloud_user_info(username)
        if user_data:
            print(f"User: {user_data.get('name', username)}")
            print(f"Username: {username}")
            print(f"Bio: {user_data.get('biog', 'N/A')}")
            print(f"Shows: {user_data.get('cloudcast_count', 0)}")
            print(f"Profile: https://www.mixcloud.com/{username}/")
            return 0
        else:
            print(f"Error: User '{username}' not found", file=sys.stderr)
            return 1

    # Build filters
    filters = {}

    if args.rap_only:
        filters = FeedFilterService.create_rap_filter()
        if args.verbose:
            print("Applied RAP filter", file=sys.stderr)

    if args.keywords:
        filters['keywords'] = args.keywords
        if args.verbose:
            print(f"Filter: keywords = {args.keywords}", file=sys.stderr)

    if args.tags:
        filters['tags'] = args.tags
        if args.verbose:
            print(f"Filter: tags = {args.tags}", file=sys.stderr)

    if args.date_range:
        filters['start_date'] = args.date_range[0]
        filters['end_date'] = args.date_range[1]
        if args.verbose:
            print(f"Filter: date range = {args.date_range[0]} to {args.date_range[1]}", file=sys.stderr)

    if args.specific_dates:
        filters['specific_dates'] = args.specific_dates
        if args.verbose:
            print(f"Filter: specific dates = {args.specific_dates}", file=sys.stderr)

    # Generate RSS feed
    try:
        if args.verbose:
            print(f"Generating RSS for user: {username}", file=sys.stderr)
            print(f"Limit: {args.limit} episodes", file=sys.stderr)
            if filters:
                print(f"Filters applied: {list(filters.keys())}", file=sys.stderr)

        rss_feed = syndication_service.generate_mixcloud_rss(username, args.limit, filters)

        if rss_feed:
            if args.output:
                with open(args.output, "w", encoding="utf-8") as f:
                    f.write(rss_feed)
                print(f"RSS feed saved to: {args.output}", file=sys.stderr)
            else:
                print(rss_feed)
            return 0
        else:
            print(f"Error: Could not generate RSS feed for user '{username}'", file=sys.stderr)
            return 1

    except Exception as e:
        print(f"Error: {e}", file=sys.stderr)
        if args.verbose:
            import traceback
            traceback.print_exc()
        return 1


if __name__ == "__main__":
    sys.exit(main())
@ -0,0 +1,375 @@
#!/usr/bin/env python3
"""
Mixcloud to RSS Feed Generator

Converts Mixcloud user pages or show pages into RSS feeds that can be consumed
by podcast apps or feed readers.
"""

import json
import re
import xml.etree.ElementTree as ET
from datetime import datetime
from typing import Dict, List, Optional, Union
from urllib.parse import quote, urlencode, urlparse
import hashlib
import os

import requests
from bs4 import BeautifulSoup


class MixcloudRSSGenerator:
    """Generate RSS feeds from Mixcloud pages."""

    def __init__(self, cache_dir: str = "./cache", cache_ttl: int = 3600):
        """
        Initialize the Mixcloud RSS Generator.

        Args:
            cache_dir: Directory for caching API responses
            cache_ttl: Cache time-to-live in seconds (default: 1 hour)
        """
        self.cache_dir = cache_dir
        self.cache_ttl = cache_ttl
        self.api_base = "https://api.mixcloud.com"
        self.base_url = "https://www.mixcloud.com"

        # Create cache directory if it doesn't exist
        os.makedirs(cache_dir, exist_ok=True)

    def _get_cache_path(self, url: str) -> str:
        """Generate cache file path for a URL."""
        # MD5 only derives a stable file name here; it is not used for security
        url_hash = hashlib.md5(url.encode()).hexdigest()
        return os.path.join(self.cache_dir, f"{url_hash}.json")

    def _get_cached_data(self, url: str) -> Optional[Dict]:
        """Get cached data if available and not expired."""
        cache_path = self._get_cache_path(url)

        if os.path.exists(cache_path):
            # Check if cache is still valid
            cache_age = datetime.now().timestamp() - os.path.getmtime(cache_path)
            if cache_age < self.cache_ttl:
                with open(cache_path, 'r') as f:
                    return json.load(f)

        return None

    def _save_to_cache(self, url: str, data: Dict) -> None:
        """Save data to cache."""
        cache_path = self._get_cache_path(url)
        with open(cache_path, 'w') as f:
            json.dump(data, f)

    def _fetch_mixcloud_data(self, api_url: str) -> Optional[Dict]:
        """Fetch data from Mixcloud API with caching."""
        # Check cache first
        cached_data = self._get_cached_data(api_url)
        if cached_data:
            return cached_data

        try:
            response = requests.get(api_url, timeout=10)
            response.raise_for_status()
            data = response.json()

            # Save to cache
            self._save_to_cache(api_url, data)

            return data
        except Exception as e:
            print(f"Error fetching Mixcloud data: {e}")
            return None

    def _extract_username_from_url(self, url: str) -> Optional[str]:
        """Extract username from Mixcloud URL."""
        # Handle various Mixcloud URL formats
        patterns = [
            r'mixcloud\.com/([^/]+)/?$',
            r'mixcloud\.com/([^/]+)/(?:uploads|favorites|listens)?/?$',
            r'mixcloud\.com/([^/]+)/[^/]+/?$',  # Specific show
        ]

        for pattern in patterns:
            match = re.search(pattern, url)
            if match:
                return match.group(1)

        return None

    def _format_duration(self, seconds: int) -> str:
        """Format duration in seconds to HH:MM:SS."""
        hours = seconds // 3600
        minutes = (seconds % 3600) // 60
        secs = seconds % 60

        if hours > 0:
            return f"{hours:02d}:{minutes:02d}:{secs:02d}"
        else:
            return f"{minutes:02d}:{secs:02d}"

    def _filter_shows(self, shows: List[Dict], filters: Optional[Dict] = None) -> List[Dict]:
        """Filter shows based on criteria."""
        if not filters:
            return shows

        filtered_shows = shows

        def parse_day(value: str):
            # Compare calendar dates only. A date-only filter value such as
            # "2024-01-01" parses as a naive datetime, which cannot be ordered
            # against the timezone-aware show timestamps (TypeError).
            return datetime.fromisoformat(value.replace('Z', '+00:00')).date()

        # Filter by date range (both bounds inclusive)
        if filters.get('start_date'):
            start_date = parse_day(filters['start_date'])
            filtered_shows = [
                show for show in filtered_shows
                if parse_day(show['created_time']) >= start_date
            ]

        if filters.get('end_date'):
            end_date = parse_day(filters['end_date'])
            filtered_shows = [
                show for show in filtered_shows
                if parse_day(show['created_time']) <= end_date
            ]

        # Filter by keywords in title (case-insensitive substring; any keyword may match)
        if filters.get('keywords'):
            keywords = filters['keywords'].lower().split(',')
            filtered_shows = [
                show for show in filtered_shows
                if any(keyword.strip() in show.get('name', '').lower() for keyword in keywords)
            ]

        # Filter by tags
        if filters.get('tags'):
            filter_tags = [tag.strip().lower() for tag in filters['tags'].split(',')]
            filtered_shows = [
                show for show in filtered_shows
                if any(
                    tag['name'].lower() in filter_tags
                    for tag in show.get('tags', [])
                )
            ]

        # Filter by specific dates (e.g., "July 21")
        if filters.get('specific_dates'):
            dates = filters['specific_dates'].split(',')
            filtered_shows = [
                show for show in filtered_shows
                if self._matches_date(show['created_time'], dates)
            ]

        return filtered_shows

    def _matches_date(self, created_time: str, dates: List[str]) -> bool:
        """Check if created_time matches any of the specified dates."""
        show_date = datetime.fromisoformat(created_time.replace('Z', '+00:00')).date()

        month_names = (
            'january', 'february', 'march', 'april', 'may', 'june',
            'july', 'august', 'september', 'october', 'november', 'december',
            'jan', 'feb', 'mar', 'apr', 'jun',
            'jul', 'aug', 'sep', 'oct', 'nov', 'dec',
        )

        for date_str in dates:
            date_str = date_str.strip().lower()

            # Handle various date formats
            # "July 21" or "Jul 21": no year is given, so the show's own year is assumed
            if any(month in date_str for month in month_names):
                for fmt in ("%B %d %Y", "%b %d %Y"):
                    try:
                        parsed_date = datetime.strptime(f"{date_str} {show_date.year}", fmt)
                    except ValueError:
                        continue
                    if show_date == parsed_date.date():
                        return True

            # "2024-07-21" format
            elif '-' in date_str:
                try:
                    parsed_date = datetime.fromisoformat(date_str)
                except ValueError:
                    continue
                if show_date == parsed_date.date():
                    return True

            # "07/21/2024" or "7/21/2024" format; day-first variants are tried as a fallback
            elif '/' in date_str:
                for fmt in ('%m/%d/%Y', '%m/%d/%y', '%d/%m/%Y', '%d/%m/%y'):
                    try:
                        parsed_date = datetime.strptime(date_str, fmt)
                    except ValueError:
                        continue
                    if show_date == parsed_date.date():
                        return True

        return False

|
def _build_rss_feed(self, user_data: Dict, shows: List[Dict]) -> str:
|
||||||
|
"""Build RSS XML feed from user data and shows."""
|
||||||
|
# Create root RSS element
|
||||||
|
rss = ET.Element("rss", version="2.0", attrib={
|
||||||
|
"xmlns:itunes": "http://www.itunes.com/dtds/podcast-1.0.dtd",
|
||||||
|
"xmlns:content": "http://purl.org/rss/1.0/modules/content/"
|
||||||
|
})
|
||||||
|
|
||||||
|
        channel = ET.SubElement(rss, "channel")

        # Channel metadata
        ET.SubElement(channel, "title").text = user_data.get("name", "Mixcloud Feed")
        ET.SubElement(channel, "link").text = f"{self.base_url}{user_data.get('key', '')}"
        ET.SubElement(channel, "description").text = user_data.get("biog", "Mixcloud podcast feed")
        ET.SubElement(channel, "language").text = "en-us"
        # Use an aware local timestamp so the emitted UTC offset matches the clock time
        build_time = datetime.now().astimezone()
        ET.SubElement(channel, "lastBuildDate").text = build_time.strftime("%a, %d %b %Y %H:%M:%S %z")

        # iTunes podcast metadata
        ET.SubElement(channel, "itunes:author").text = user_data.get("name", "")
        ET.SubElement(channel, "itunes:summary").text = user_data.get("biog", "")

        if user_data.get("pictures", {}).get("large"):
            image = ET.SubElement(channel, "itunes:image")
            image.set("href", user_data["pictures"]["large"])

        # Add each show as an item
        for show in shows:
            item = ET.SubElement(channel, "item")

            # Basic item elements
            ET.SubElement(item, "title").text = show.get("name", "")
            ET.SubElement(item, "link").text = f"{self.base_url}{show.get('key', '')}"

            # Description with tags
            description = show.get("description", "")
            if show.get("tags"):
                tags = ", ".join([tag["name"] for tag in show["tags"]])
                description += f"\n\nTags: {tags}"
            ET.SubElement(item, "description").text = description

            # Publication date
            created_time = show.get("created_time")
            if created_time:
                pub_date = datetime.fromisoformat(created_time.replace("Z", "+00:00"))
                ET.SubElement(item, "pubDate").text = pub_date.strftime("%a, %d %b %Y %H:%M:%S +0000")

            # GUID
            ET.SubElement(item, "guid", isPermaLink="true").text = f"{self.base_url}{show.get('key', '')}"

            # Audio enclosure (falls back to the show page URL when no audio URL is available)
            audio_url = show.get("audio_url") or f"{self.base_url}{show.get('key', '')}"
            enclosure = ET.SubElement(item, "enclosure")
            enclosure.set("url", audio_url)
            enclosure.set("type", "audio/mpeg")
            # NOTE: RSS defines the enclosure length as a size in bytes; audio_length is also
            # used as the duration below, so treat this value as a placeholder
            enclosure.set("length", str(show.get("audio_length", 0)))

            # iTunes elements
            ET.SubElement(item, "itunes:author").text = user_data.get("name", "")
            ET.SubElement(item, "itunes:summary").text = description
            ET.SubElement(item, "itunes:duration").text = self._format_duration(show.get("audio_length", 0))

            if show.get("pictures", {}).get("large"):
                ET.SubElement(item, "itunes:image").set("href", show["pictures"]["large"])

        # Convert to string
        return '<?xml version="1.0" encoding="UTF-8"?>\n' + ET.tostring(rss, encoding="unicode")

    def get_user_shows(self, username: str, limit: int = 20) -> Optional[Dict]:
        """Get profile data and a list of shows for a Mixcloud user."""
        # Fetch user data
        user_api_url = f"{self.api_base}/{username}/"
        user_data = self._fetch_mixcloud_data(user_api_url)

        if not user_data:
            return None

        # Fetch user's cloudcasts (shows)
        shows_api_url = f"{self.api_base}/{username}/cloudcasts/"
        params = {"limit": limit}
        # Only the first request needs the query string; "next" paging URLs already carry one
        next_url = f"{shows_api_url}?{urlencode(params)}"

        all_shows = []

        while len(all_shows) < limit:
            shows_data = self._fetch_mixcloud_data(next_url)

            if not shows_data or "data" not in shows_data:
                break

            all_shows.extend(shows_data["data"])

            # Check for next page
            if "paging" in shows_data and "next" in shows_data["paging"]:
                next_url = shows_data["paging"]["next"]
            else:
                break

        return {"user": user_data, "shows": all_shows[:limit]}

    def generate_rss_from_url(self, mixcloud_url: str, limit: int = 20, filters: Optional[Dict] = None) -> Optional[str]:
        """Generate RSS feed from a Mixcloud URL with optional filters."""
        username = self._extract_username_from_url(mixcloud_url)

        if not username:
            raise ValueError(f"Could not extract username from URL: {mixcloud_url}")

        data = self.get_user_shows(username, limit * 2)  # Get more shows to filter from

        if not data:
            return None

        # Apply filters
        filtered_shows = self._filter_shows(data["shows"], filters)[:limit]

        return self._build_rss_feed(data["user"], filtered_shows)

    def generate_rss_from_username(self, username: str, limit: int = 20, filters: Optional[Dict] = None) -> Optional[str]:
        """Generate RSS feed from a Mixcloud username with optional filters."""
        data = self.get_user_shows(username, limit * 2)  # Get more shows to filter from

        if not data:
            return None

        # Apply filters
        filtered_shows = self._filter_shows(data["shows"], filters)[:limit]

        return self._build_rss_feed(data["user"], filtered_shows)


def main():
    """Command-line interface for the Mixcloud RSS generator."""
    import argparse
    import sys

    parser = argparse.ArgumentParser(description="Generate RSS feeds from Mixcloud pages")
    parser.add_argument("input", help="Mixcloud URL or username")
    parser.add_argument("-l", "--limit", type=int, default=20, help="Number of episodes to include (default: 20)")
    parser.add_argument("-o", "--output", help="Output file path (default: stdout)")
    parser.add_argument("-c", "--cache-dir", default="./cache", help="Cache directory path")
    parser.add_argument("-t", "--cache-ttl", type=int, default=3600, help="Cache TTL in seconds (default: 3600)")

    args = parser.parse_args()

    # Create generator
    generator = MixcloudRSSGenerator(cache_dir=args.cache_dir, cache_ttl=args.cache_ttl)

    # Determine if input is URL or username
    try:
        if "mixcloud.com" in args.input:
            rss_feed = generator.generate_rss_from_url(args.input, args.limit)
        else:
            rss_feed = generator.generate_rss_from_username(args.input, args.limit)
    except ValueError as exc:
        # generate_rss_from_url raises ValueError when no username can be extracted
        print(f"Error: {exc}", file=sys.stderr)
        return 1

    if rss_feed:
        if args.output:
            with open(args.output, "w", encoding="utf-8") as f:
                f.write(rss_feed)
            print(f"RSS feed saved to: {args.output}")
        else:
            print(rss_feed)
    else:
        # Report errors on stderr so they never mix with feed output on stdout
        print("Error: Could not generate RSS feed", file=sys.stderr)
        return 1

    return 0


if __name__ == "__main__":
    # SystemExit works without the site-provided exit() helper
    raise SystemExit(main())
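
The feed builder above formats `pubDate` by hand with a fixed `+0000` suffix. As a minimal sketch of an alternative, assuming the `created_time` strings are ISO 8601 timestamps, the standard library's `email.utils.format_datetime` can produce the RFC 822 style date after normalizing to UTC. The helper name `rfc822_from_iso` is hypothetical and not part of this commit.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def rfc822_from_iso(created_time: str) -> str:
    """Convert an ISO 8601 timestamp to an RFC 822 style date in UTC.

    Hypothetical helper for illustration; the committed code formats dates inline.
    """
    # fromisoformat() on older Python versions does not accept a trailing "Z"
    parsed = datetime.fromisoformat(created_time.replace("Z", "+00:00"))
    if parsed.tzinfo is None:
        # Assumption: timestamps without an offset are already in UTC
        parsed = parsed.replace(tzinfo=timezone.utc)
    # Normalize to UTC so the emitted offset is always +0000
    return format_datetime(parsed.astimezone(timezone.utc))


if __name__ == "__main__":
    print(rfc822_from_iso("2025-07-21T12:00:00Z"))
```

Normalizing before formatting keeps the offset in the output consistent with the time value even if a timestamp arrives with a non-UTC offset.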
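
The pagination loop in `get_user_shows` follows `paging.next` links until enough shows are collected. The sketch below isolates that pattern so it can be exercised without network access. It assumes each page is a dict with a `data` list and an optional `paging.next` URL that already includes its own query string. The names `collect_pages` and `fetch` are hypothetical, with `fetch` standing in for `_fetch_mixcloud_data`.

```python
from typing import Callable, Dict, List, Optional
from urllib.parse import urlencode


def collect_pages(
    first_url: str,
    limit: int,
    fetch: Callable[[str], Optional[Dict]],
) -> List[Dict]:
    """Follow "next" links until `limit` items are collected or pages run out."""
    # Only the first request gets the query string appended
    url: Optional[str] = f"{first_url}?{urlencode({'limit': limit})}"
    items: List[Dict] = []
    while url and len(items) < limit:
        page = fetch(url)
        if not page or "data" not in page:
            break
        items.extend(page["data"])
        # The "next" URL is used as-is; a missing key ends the loop
        url = page.get("paging", {}).get("next")
    return items[:limit]
```

Passing the fetch function as a parameter lets a test supply canned pages, for example `collect_pages(base_url, 3, pages.get)` with a dict of URL to page.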