speak
Local text-to-speech conversion using Kokoro TTS. Generate audio, read text aloud, and handle multilingual speech synthesis directly in your terminal.
103 skills found
Expert Kokoro TTS implementation skill for real-time, secure, and offline voice synthesis in JARVIS-style assistants. Features streaming output, prosody control, and performance-optimized audio generation.
Generate high-quality text-to-speech audio using Microsoft Edge's neural voice engine via uvx edge-tts.
Transcribe audio files (wav, mp3, ogg) to text using the Qwen ASR model. Fast, local-friendly, and requires no API keys.
Local speech-to-text transcription using the OpenAI Whisper CLI, providing private, high-accuracy audio processing without external API keys.
AI-powered video editing agent for talking-head videos, featuring speech-to-text, disfluency detection, and browser-based review workflows.
Unified local ML inference server for ASR, TTS, Translation, Image Generation, and Vision on Apple Silicon, powered by MLX.
Convert clinical text to natural, empathetic speech using ElevenLabs for patient instructions, medication reminders, and accessible health content.
Enforces professional voice, tone, and technical style guidelines for React documentation, ensuring consistency across Learn, Reference, and Blog pages.
Process and generate multimedia with Google Gemini. Analyze audio, images, videos, and PDFs using large context windows. Supports transcription, visual QA, OCR, and AI-driven image creation.
Generate high-quality text-to-speech audio using Microsoft Edge's neural TTS service. Supports multiple languages, voices, and adjustable audio parameters.
Transforms content to match specific voice profiles, tones, or styles using configurable YAML templates for consistent brand and narrative output.