audio
Generate high-quality audio with support for ElevenLabs, OpenAI, and Google Text-to-Speech. Features voice cloning, multilingual capabilities, and flexible CLI controls.
Introduction
The Audio Generation skill offers a unified, API-driven interface for converting text into natural-sounding speech. Designed for AI coding agents and developers, this tool abstracts the complexity of multiple text-to-speech (TTS) providers into a single, cohesive CLI. It is ideal for developers building interactive applications, automated narration systems, or accessibility tools that require high-fidelity synthetic audio output.
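One way to picture the "single interface over several providers" idea is the sketch below. It is an illustration under assumptions, not the skill's actual source: the `SpeechProvider` interface, the `selectProvider` helper, and the `silent` stand-in provider are all hypothetical names introduced here.

```typescript
// Hypothetical interface: each provider turns text plus a voice ID into audio bytes.
interface SpeechProvider {
  readonly name: string;
  synthesize(text: string, voice: string): Promise<Uint8Array>;
}

// Look up a provider by name (case-insensitive), failing loudly on unknown names.
function selectProvider(providers: SpeechProvider[], name: string): SpeechProvider {
  const wanted = name.toLowerCase();
  const match = providers.find((p) => p.name === wanted);
  if (!match) {
    const known = providers.map((p) => p.name).join(", ");
    throw new Error(`Unknown provider "${name}". Known providers: ${known}`);
  }
  return match;
}

// Stand-in provider used only to exercise the lookup; it makes no network call
// and returns an empty byte array.
const silentProvider: SpeechProvider = {
  name: "silent",
  synthesize: async () => new Uint8Array(0),
};
```

With this shape, the CLI layer only needs to call `selectProvider` and then `synthesize`; provider-specific request details stay behind the interface.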
- Multi-provider support: Switch between ElevenLabs for voice cloning and natural-sounding synthesis, OpenAI for the tts-1 and tts-1-hd models, and Google Text-to-Speech for broad international language coverage.
- Native CLI implementation: Written in TypeScript with native fetch, so no external HTTP library is required.
- Flexible voice management: List the voices available from each provider and pick the one that fits your use case.
- Multilingual capability: Uses each provider's current models to support a wide range of languages and regional accents.
- High-quality output: Configurable audio formats and models (e.g., eleven_multilingual_v2, tts-1-hd).
- To get started, set the environment variables ELEVENLABS_API_KEY, OPENAI_API_KEY, and GOOGLE_API_KEY to valid API keys.
- To generate audio, pass --provider, --text, and --voice; to list the available voices, use the voices command.
- The tool targets the Bun 1.0+ runtime, which keeps startup and execution fast in CI/CD pipelines and local development environments.
- Constraints: Requires an active API subscription for each provider you use. When processing large batches of files, make sure audio playback and storage are handled on the host system.
- Practical tips: Use the --output flag to set the file path and name of each generated asset. Chain this skill with other agentic workflows to automate content narration.
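The setup and usage notes above can be sketched as follows. This is a minimal illustration rather than the skill's own code: `resolveApiKey` and `buildOpenAISpeechRequest` are hypothetical helpers, and the provider names `elevenlabs`, `openai`, and `google` are assumed values for `--provider`. Only the three environment variable names and the model names come from this page; the endpoint and JSON fields follow OpenAI's public speech API.

```typescript
// Assumed --provider values mapped to the environment variables named above.
type ProviderName = "elevenlabs" | "openai" | "google";

const API_KEY_ENV: Record<ProviderName, string> = {
  elevenlabs: "ELEVENLABS_API_KEY",
  openai: "OPENAI_API_KEY",
  google: "GOOGLE_API_KEY",
};

// Return the API key for a provider, or throw with the name of the missing variable.
// `env` is passed in explicitly (for example process.env) so the helper stays testable.
function resolveApiKey(provider: ProviderName, env: Record<string, string | undefined>): string {
  const variable = API_KEY_ENV[provider];
  const key = env[variable];
  if (!key) throw new Error(`Missing environment variable ${variable}`);
  return key;
}

// Hypothetical shape for a request that is ready to hand to fetch(url, init).
interface PreparedRequest {
  url: string;
  init: { method: "POST"; headers: Record<string, string>; body: string };
}

// Build (but do not send) a request for OpenAI's text-to-speech endpoint.
function buildOpenAISpeechRequest(
  text: string,
  voice: string,
  apiKey: string,
  model: "tts-1" | "tts-1-hd" = "tts-1-hd",
): PreparedRequest {
  if (text.trim() === "") throw new Error("text must not be empty");
  return {
    url: "https://api.openai.com/v1/audio/speech",
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${apiKey}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({ model, input: text, voice }),
    },
  };
}
```

Under Bun, the prepared request could be sent with `fetch(request.url, request.init)` and the response body saved to the path given by `--output`, for example with `Bun.write`. Keeping request construction separate from the network call makes the key lookup and payload easy to check without spending API credits.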
Repository Stats
- Stars: 0
- Forks: 0
- Open Issues: 0
- Language: TypeScript
- Default Branch: main
- Sync Status: Idle
- Last Synced: May 4, 2026, 12:09 AM