This project aims to generate structured PDF reports from podcast interviews, highlighting key takeaways, quotes, and insights. The goal is to create shareable and accessible summaries for a broader audience.
- Summarization using LLMs
- Search & Retrieval
- PDF Report Generation
- Web UI (Streamlit) for user interaction
- Clone the repository:
git clone https://github.com/DataTalksClub/podcast-summary-generation.git
cd podcast-summary-generation
- Install dependencies:
pip install -r requirements.txt
# Start the backend services (if needed)
docker-compose up -d
# Run the application
python main.py
- LLM Processing → Summarization, Extracting Key Insights
- Storage & Retrieval → Search Engine (Elasticsearch/In-memory DB)
- PDF Generation → Formatted Report
- Web UI → User Interaction & Downloads
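
The flow above can be sketched end to end in a few lines of Python. This is only an illustrative sketch, not the project's actual code: `summarize`, `InMemoryIndex`, and `render_report` are hypothetical names, the "summarizer" simply keeps the first few sentences where an LLM call would go, and the "report" is plain text where the PDF step would go.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Summary:
    episode_id: str
    takeaways: list[str]


def summarize(episode_id: str, transcript: str, max_points: int = 3) -> Summary:
    """Placeholder for the LLM step (assumption): keep the first few sentences."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", transcript) if s.strip()]
    return Summary(episode_id, sentences[:max_points])


@dataclass
class InMemoryIndex:
    """Toy stand-in for the search engine: maps lowercase words to episode ids."""
    postings: dict[str, set[str]] = field(default_factory=dict)

    def add(self, summary: Summary) -> None:
        for point in summary.takeaways:
            for word in re.findall(r"\w+", point.lower()):
                self.postings.setdefault(word, set()).add(summary.episode_id)

    def search(self, query: str) -> set[str]:
        """Return ids of episodes whose takeaways contain every query word."""
        words = re.findall(r"\w+", query.lower())
        if not words:
            return set()
        return set.intersection(*(self.postings.get(w, set()) for w in words))


def render_report(summary: Summary) -> str:
    """Placeholder for PDF generation (assumption): return a plain-text report."""
    lines = [f"Episode: {summary.episode_id}", "Key takeaways:"]
    lines += [f"  {i}. {point}" for i, point in enumerate(summary.takeaways, start=1)]
    return "\n".join(lines)


if __name__ == "__main__":
    summary = summarize("ep-01", "Data quality matters. Start small! Then what?")
    index = InMemoryIndex()
    index.add(summary)
    print(index.search("data quality"))
    print(render_report(summary))
```

Keeping each stage behind a small function makes it easy to swap the placeholders for a real LLM client, a search backend, or a PDF library without touching the other stages.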
- Open an issue before working on any feature.
- Use feature branches for development.
- Submit a PR and get at least 2 approvals before merging.
🚀 Let's build something great together! 🎙️📄