edge-tts

Introduction

The edge-tts skill converts text into natural-sounding speech using Microsoft Edge's neural TTS service. It runs the uvx edge-tts command, so agents can give spoken responses, support accessibility, or generate audio content while the user is busy with other tasks. It supports voices in multiple languages, including English, Chinese, and French, with control over audio parameters such as rate, volume, and pitch.
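As a minimal sketch of the invocation described above, the following Python helper assembles a uvx edge-tts command line and, separately, runs it. The function names build_tts_command and synthesize are hypothetical, and the --voice, --text, and --write-media flags are assumed to match the installed edge-tts version; check uvx edge-tts --help if in doubt.

```python
import subprocess
from typing import List


def build_tts_command(text: str, output_path: str,
                      voice: str = "en-US-AriaNeural") -> List[str]:
    """Return the argv list for a basic edge-tts synthesis call."""
    if not text.strip():
        raise ValueError("text must not be empty")
    return [
        "uvx", "edge-tts",
        "--voice", voice,
        # '--text=...' keeps text that starts with '-' from being
        # mistaken for a command-line option.
        f"--text={text}",
        "--write-media", output_path,
    ]


def synthesize(text: str, output_path: str,
               voice: str = "en-US-AriaNeural") -> None:
    """Run edge-tts; needs uv/uvx on PATH and internet access."""
    subprocess.run(build_tts_command(text, output_path, voice), check=True)
```

Passing the arguments as a list rather than a single shell string avoids quoting problems when the text contains quotes or other shell metacharacters.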

Supports high-quality neural voices with natural prosody and intonation.
Configurable audio settings: speech rate, volume, and pitch.
Can generate a subtitle file alongside the standard audio output.
Can list all available voices, so a voice can be chosen by region and gender.
Ideal for accessibility tools, reading assistance for long-form content, and creating voice-overs for video or multimedia projects.
Typical usage: pass the text content to the shell command and supply an output file path in the temporary directory format the agent environment expects.
The output is a media file, typically in .mp3 format, which the agent environment can process further or play back.
Constraints: uv/uvx must be installed to run the edge-tts utility, and an internet connection is required because synthesis is performed by Microsoft's backend.
For best results, specify the --voice parameter to match the desired language and character personality, such as en-US-AriaNeural for professional reading or zh-CN-XiaoxiaoNeural for conversational tasks.
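The audio settings and subtitle output listed above can be sketched as extra command-line options. This is an illustration under assumptions, not the skill's own code: the helper names signed, build_tts_options, and filter_voice_lines are hypothetical, and the --rate, --volume, --pitch, and --write-subtitles flags, along with the signed percent and Hz value formats, are assumed to match the installed edge-tts version.

```python
from typing import List, Optional


def signed(value: int, unit: str) -> str:
    """Format an offset with an explicit sign, e.g. 10 -> '+10%'."""
    return f"{value:+d}{unit}"


def build_tts_options(rate_pct: int = 0, volume_pct: int = 0,
                      pitch_hz: int = 0,
                      subtitles_path: Optional[str] = None) -> List[str]:
    """Return extra edge-tts arguments for prosody and subtitles."""
    # '--flag=value' keeps negative offsets such as '-20%' from being
    # parsed as separate options.
    opts = [
        f"--rate={signed(rate_pct, '%')}",
        f"--volume={signed(volume_pct, '%')}",
        f"--pitch={signed(pitch_hz, 'Hz')}",
    ]
    if subtitles_path is not None:
        opts += ["--write-subtitles", subtitles_path]
    return opts


def filter_voice_lines(listing: str, locale: str) -> List[str]:
    """Keep the lines of a voice listing that mention a locale.

    The layout of the listing differs between edge-tts versions, so
    this only does a plain substring match on each line.
    """
    return [line for line in listing.splitlines() if locale in line]
```

For example, build_tts_options(rate_pct=-20, pitch_hz=5, subtitles_path="out.srt") returns ['--rate=-20%', '--volume=+0%', '--pitch=+5Hz', '--write-subtitles', 'out.srt'], where out.srt is a hypothetical path. The voice listing itself can be captured from uvx edge-tts --list-voices and passed to filter_voice_lines with a locale such as zh-CN.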
