edge-tts-uvx
Generate high-quality text-to-speech audio using Microsoft Edge's neural TTS service. Supports multiple languages, voices, and adjustable audio parameters.
Introduction
The edge-tts-uvx skill provides a command-line interface to Microsoft Edge's neural text-to-speech service, converting text documents, messages, or scripts into natural-sounding audio files. It is aimed at developers, content creators, and accessibility-focused users who need reliable audio generation without installing dependencies globally. Using the node-edge-tts engine, the skill fits into automated workflows such as reading back AI-generated text, creating accessible content, or generating voiceovers for media projects. It also suits users who want to listen while multitasking or who need a particular voice, speed, or pitch for a given context.
- Support for a large library of neural voices covering multiple regions, languages, and accents, including natural-sounding male and female voices.
- Granular control over audio output, including rate (speed), pitch, and volume adjustments to match project requirements.
- Native support for generating subtitles alongside audio files, simplifying the production of accessible multimedia content.
- Flexible output options that let users direct audio to a chosen file, primarily in MP3 format.
- Direct access to voice metadata and status through shell commands, making it easier to audit and manage the available voices.
- Input is plain text; output is rendered as high-fidelity audio media files in the designated temporary directories.
- The tool is invoked via `uvx edge-tts`, which runs it in a portable, isolated environment without requiring a global installation of dependencies.
- Performance depends on network connectivity to the Microsoft Edge TTS endpoints; ensure stable internet access before starting voice synthesis.
- Use the `--list-voices` option to discover and select the most appropriate neural voice, such as `en-US-AriaNeural` (US English) or `zh-CN-XiaoxiaoNeural` (Mainland Chinese), depending on the language of the target text.
- When generating voiceovers, specify the output file path carefully to avoid overwriting existing media, and use valid file names.
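The options listed above can be combined from a script. The following is a minimal Python sketch, not the skill's own implementation. It assumes `uvx` is on the PATH and that the `edge-tts` command accepts `--voice`, `--rate`, `--pitch`, `--volume`, `--text`, `--write-media`, and `--write-subtitles`, as the upstream edge-tts package does. The helper names `unique_path`, `build_command`, and `synthesize` are hypothetical and exist only for illustration, and the default voice `en-US-AriaNeural` is an arbitrary choice.

```python
import shutil
import subprocess
from pathlib import Path


def unique_path(path: Path) -> Path:
    """Return `path` if it is free, otherwise a numbered variant (name-1.ext, ...).

    Hypothetical helper: it avoids overwriting existing media files.
    """
    path = Path(path)
    if not path.exists():
        return path
    counter = 1
    while True:
        candidate = path.with_name(f"{path.stem}-{counter}{path.suffix}")
        if not candidate.exists():
            return candidate
        counter += 1


def build_command(text, out_path, voice="en-US-AriaNeural",
                  rate="+0%", pitch="+0Hz", volume="+0%", subtitles=None):
    """Build the argv list for one `uvx edge-tts` invocation.

    Values are passed in `--option=value` form so that negative
    adjustments such as "-20%" are not mistaken for separate flags.
    """
    cmd = [
        "uvx", "edge-tts",
        f"--voice={voice}",
        f"--rate={rate}",
        f"--pitch={pitch}",
        f"--volume={volume}",
        f"--text={text}",
        "--write-media", str(out_path),
    ]
    if subtitles is not None:
        # The subtitle file format depends on the installed edge-tts version.
        cmd += ["--write-subtitles", str(subtitles)]
    return cmd


def synthesize(text, out_path, **options):
    """Run the command and return the path that was written.

    Requires network access to the Microsoft Edge TTS endpoints.
    """
    if shutil.which("uvx") is None:
        raise RuntimeError("uvx was not found on PATH")
    target = unique_path(Path(out_path))
    subprocess.run(build_command(text, target, **options), check=True)
    return target
```

A call such as `synthesize("Hello from edge-tts.", "hello.mp3", rate="+10%")` would then write `hello.mp3`, or `hello-1.mp3` if that file already exists. Keeping `build_command` separate from `synthesize` makes it possible to inspect or log the exact command without touching the network.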
Repository Stats
- Stars: 4,454
- Forks: 1,215
- Open Issues: 7
- Language: Python
- Default Branch: main
- Sync Status: Idle
- Last Synced: Apr 30, 2026, 11:26 AM