edge-tts
Generate high-quality text-to-speech audio using Microsoft Edge's neural voice engine via uvx edge-tts.
Introduction
The edge-tts skill converts text into natural-sounding speech using Microsoft Edge's neural TTS service. It runs the uvx edge-tts command, so agents can give spoken responses, assist with accessibility, or generate audio content for users who are multitasking. It supports voices across multiple languages, including English, Chinese, and French, with control over speech rate, volume, and pitch.
- Supports high-quality neural voices with natural prosody and intonation.
- Configurable audio settings: speech rate, volume, and pitch.
- Can generate a subtitle file alongside the standard audio output.
- Lists all available voices, so you can choose one by region and gender.
- Suited to accessibility tools, reading assistance for long-form content, and voice-overs for video or multimedia projects.
- Typical usage: pass the text content as an input variable to the shell command, and provide the output file path in the expected temporary directory format.
- Output: a media file, typically in .mp3 format, which the agent environment can process further or play back.
- Constraints: requires a system with uv/uvx installed to run the edge-tts utility, and internet connectivity, because synthesis is performed by Microsoft's backend.
- For best results, set the --voice parameter to match the desired language and persona, for example en-US-AriaNeural for professional reading or zh-CN-XiaoxiaoNeural for conversational tasks.
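The usage notes above can be sketched as a small Python wrapper around the shell command. This is a minimal illustration, not the skill's own implementation: the helper names `build_tts_command` and `synthesize` are hypothetical, and the flags (`--voice`, `--rate`, `--volume`, `--pitch`, `--text`, `--write-media`, `--write-subtitles`) are the ones the edge-tts CLI is understood to accept. Confirm them with `uvx edge-tts --help` for your installed version.

```python
import subprocess
from pathlib import Path


def build_tts_command(text, media_path, voice="en-US-AriaNeural",
                      rate="+0%", volume="+0%", pitch="+0Hz",
                      subtitles_path=None):
    """Return the argv list for one `uvx edge-tts` synthesis call.

    Hypothetical helper: it only assembles arguments and runs nothing.
    """
    cmd = [
        "uvx", "edge-tts",
        "--voice", voice,
        # The `=` form keeps negative values such as -20% from being
        # mistaken for a separate command-line option.
        f"--rate={rate}",
        f"--volume={volume}",
        f"--pitch={pitch}",
        "--text", text,
        "--write-media", str(media_path),
    ]
    if subtitles_path is not None:
        # Optional subtitle file written alongside the audio.
        cmd += ["--write-subtitles", str(subtitles_path)]
    return cmd


def synthesize(text, media_path, **options):
    """Run the command; needs uvx on PATH and a network connection."""
    cmd = build_tts_command(text, media_path, **options)
    subprocess.run(cmd, check=True)
    return Path(media_path)
```

A call such as `synthesize("Hello", "/tmp/hello.mp3", voice="zh-CN-XiaoxiaoNeural", rate="-10%")` would write the audio to the given path; the `/tmp` location is only an example, so substitute whatever temporary directory your agent environment expects. Keeping command construction separate from execution makes the arguments easy to inspect or log before anything contacts the service.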
Repository Stats
- Stars: 120
- Forks: 12
- Open Issues: 0
- Language: Python
- Default Branch: main
- Sync Status: Idle
- Last Synced: May 1, 2026, 09:12 AM