podcast-generation
Generate real-time, podcast-style AI audio narratives using Azure OpenAI's GPT Realtime Mini model with WebSocket streaming, complete with PCM-to-WAV conversion and frontend playback integration.
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
Search, browse, and download podcast episodes from Apple Podcasts via the iTunes Search API. Ideal for archiving audio content, batch downloading, and retrieving rich metadata for research or personal media libraries.
Process and generate multimedia with Google Gemini. Analyze audio, images, videos, and PDFs using large context windows. Supports transcription, visual QA, OCR, and AI-driven image creation.
Quick-reference guide and utility skill for Helm chart development, template syntax, and Kubernetes application deployment.
An end-to-end video processing pipeline that transforms raw recordings into transcripts, key insights, short clips, and polished articles.
Transcribe YouTube videos and local audio/video files with high-precision speaker diarization. Supports major formats and produces transcripts ready for LLM analysis.
Development guide for creating custom nodes in FlowGram.ai workflows, supporting both auto-generated simple forms and complex custom UI components.
Helps prevent AI hallucination and keeps outputs evidence-based and verifiable when analyzing code, reviewing technical documents, or providing recommendations.
Unified content extraction and action planning engine. Automatically processes URLs (YouTube, articles, PDFs) into actionable plans.
Physical hardware synthesis bridge for PAI. Generates blueprints, 3D-printing code, SVG paths for laser cutting, and G-code for CNC machining to bring agentic designs into the physical world.
Local speech-to-text transcription using the OpenAI Whisper CLI, providing private, high-accuracy audio processing without external API keys.
Anthropic Claude integration patterns: streaming, RAG with pgvector, tool use, model selection (Haiku/Sonnet/Opus), prompt caching, and cost management for AI-powered engineering.