gemini-audio
Implement Google Gemini API audio capabilities: process, transcribe, and summarize audio files, analyze environmental sounds, and generate natural speech with controllable TTS.
Unified AI gateway for 100+ LLMs with OpenAI-compatible API, model fallbacks, load balancing, and enterprise-grade tools.
Implement Google Gemini API vision capabilities for image/document analysis including captioning, object detection, segmentation, and multi-image comparison.
Build AI agents with the OpenAI Agents SDK for Python. Supports multi-agent handoffs, function tools, stateful sessions, streaming, and Azure OpenAI integration via LiteLLM.
A structured prompting framework to transform casual inputs into professional, modular LLM prompts with persona, context, task, format, and guardrails.
Build AI agents with tool calling and multi-step reasoning.
Generate, manage, and orchestrate custom skill files for Claude Code, Cursor, Cline, and other AI assistants to standardize your development workflows.
Intelligent strategic planning and requirements gathering with multi-perspective consensus loops and structured deliberation.
A comprehensive financial modeling suite for investment analysis, featuring DCF valuation, sensitivity testing, Monte Carlo simulations, and scenario planning.
Generate artistic 3D city-themed food diorama images using the Google Gemini API. Creates Pop Mart-style four-quadrant layouts featuring iconic dishes, cultural symbols, and city-specific heritage elements.
Implement LlamaExtract for robust structured data extraction from PDF, DOCX, and PPTX files using Pydantic schemas.
Analyze your product and codebase to identify, qualify, and rank high-potential business leads with actionable outreach strategies.
Creates professional, editable PowerPoint (.pptx) presentations with AI-generated full-slide images, brand consistency, and style references.