Agent Skills Hub

Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.

103 skills found

speak

Local text-to-speech conversion using Kokoro TTS. Generate audio, read text aloud, and handle multilingual speech synthesis directly in your terminal.

Views: 11 | ★ 4,453

Engineering, Automation

text-to-speech

Expert Kokoro TTS implementation skill for real-time, secure, and offline voice synthesis in JARVIS-style assistants. Features streaming output, prosody control, and performance-optimized audio generation.

Views: 23 | ★ 37

Productivity, Content, Automation

edge-tts

Generate high-quality text-to-speech audio using Microsoft Edge's neural voice engine via uvx edge-tts.

Views: 19 | ★ 120

Productivity, Engineering, Automation

qwen-asr

Transcribe audio files (wav, mp3, ogg) to text using the Qwen ASR model. Fast, local-friendly, and requires no API keys.

Views: 11 | ★ 4,456

Productivity, Automation, Research

openai-whisper

Local speech-to-text transcription using the OpenAI Whisper CLI, providing private, high-accuracy audio processing without external API keys.

Views: 17 | ★ 366,037

Content, Automation, Productivity

videocut:剪口播

AI-powered video editing agent for talking head videos, featuring speech-to-text, disfluency detection, and browser-based review workflows.

Views: 26 | ★ 1,515

Productivity, Engineering, Automation, Content

mls

Unified local ML inference server for ASR, TTS, Translation, Image Generation, and Vision on Apple Silicon, powered by MLX.

Views: 109 | ★ 11

Productivity, Education, Automation

elevenlabs

Convert clinical text to natural, empathetic speech using ElevenLabs for patient instructions, medication reminders, and accessible health content.

Views: 9 | ★ 4,456

Engineering, Content

docs-voice

Enforces professional voice, tone, and technical style guidelines for React documentation, ensuring consistency across Learn, Reference, and Blog pages.

Views: 6 | ★ 173

Productivity, Engineering, Data Analysis, Content, Research

ai-multimodal

Process and generate multimedia with Google Gemini. Analyze audio, images, videos, and PDFs using large context windows. Supports transcription, visual QA, OCR, and AI-driven image creation.

Views: 14 | ★ 9

Productivity, Automation, Content

edge-tts-uvx

Generate high-quality text-to-speech audio using Microsoft Edge's neural TTS service. Supports multiple languages, voices, and adjustable audio parameters.

Views: 14 | ★ 4,454

Content, Productivity, Automation