gemini-audio
Implement Google Gemini API audio capabilities: process, transcribe, and summarize audio files, analyze environmental sounds, and generate natural speech with controllable TTS.
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
Search the web for real-time data and research using the Turing Tavily proxy. Use for up-to-date information, current events, and web-based research tasks.
AI-powered generator for viral XiaoHongShu posts, including titles, captions, hashtags, cover image prompts, and posting strategies.
Automate Instagram posts via Telegram or CLI. Features residential proxy bypass, session caching, and WaveSpeed image integration.
Free AI-powered web search via Exa MCP. Includes deep research, company/people lookup, and code context without API keys.
Local speech-to-text transcription using the OpenAI Whisper CLI, providing private, high-accuracy audio processing without external API keys.
Build production-grade AI agents using LangGraph, Anthropic/OpenAI/vLLM, and structured outputs. Features streaming, A2A protocol, Pydantic validation, vector memory, and guardrails for resilient, multi-agent workflows.
Implement Google Gemini API vision capabilities for image/document analysis including captioning, object detection, segmentation, and multi-image comparison.
Implement production-ready AI chat interfaces using OpenAI ChatKit React components. Features include hook configuration, streaming, theming, conversation history, and custom tool integration for Next.js applications.
An all-in-one Chinese daily utility toolkit: weather, currency exchange, news, and package tracking. Zero configuration, no API keys required.
Translates Excel (.xlsx) files from English to Chinese while preserving all formatting, images, and charts.
Expert guide for OpenCode AI: TUI commands, CLI operations, AGENTS.md configuration, custom agent workflows, and project setup.