openai-whisper-api
Transcribe audio files into text using the OpenAI Whisper API for your OpenClaw assistant.
Introduction
The openai-whisper-api skill adds OpenAI’s speech-to-text transcription to your OpenClaw workflow. It is designed for users who rely on voice notes, recordings, or audio messages and need a fast, reliable way to turn spoken content into text. The skill calls the Whisper model through the standard OpenAI Audio Transcriptions API, which handles a wide range of languages and audio formats. It suits researchers documenting interviews, professionals capturing meeting minutes, or anyone who wants to give their assistant voice processing. The skill is configurable: you can choose the model, pass a language hint, and supply a prompt that gives the model context for better transcription accuracy.
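To make the configurable options concrete, here is a minimal Python sketch of a request to the OpenAI Audio Transcriptions endpoint. It is not the skill's own script: the function names (`build_fields`, `encode_multipart`, `transcribe`) are hypothetical, and the sketch assumes the standard endpoint path `/audio/transcriptions` with the `model`, `language`, `prompt`, and `response_format` form fields.

```python
import urllib.request
import uuid
from pathlib import Path


def build_fields(model="whisper-1", language=None, prompt=None, response_format="text"):
    """Collect the form fields for one request, leaving out options that are unset."""
    fields = {"model": model, "response_format": response_format}
    if language:
        fields["language"] = language  # language hint, e.g. "de"
    if prompt:
        fields["prompt"] = prompt  # context such as speaker names or jargon
    return fields


def encode_multipart(fields, filename, audio_bytes):
    """Encode text fields plus one audio file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f'--{boundary}\r\nContent-Disposition: form-data; name="{name}"'
            f"\r\n\r\n{value}\r\n".encode()
        )
    parts.append(
        f'--{boundary}\r\nContent-Disposition: form-data; name="file"; '
        f'filename="{filename}"\r\n'
        f"Content-Type: application/octet-stream\r\n\r\n".encode()
        + audio_bytes
        + b"\r\n"
    )
    parts.append(f"--{boundary}--\r\n".encode())
    return b"".join(parts), f"multipart/form-data; boundary={boundary}"


def transcribe(path, api_key, base_url="https://api.openai.com/v1", **options):
    """Send one audio file for transcription and return the raw response body."""
    audio = Path(path)
    body, content_type = encode_multipart(
        build_fields(**options), audio.name, audio.read_bytes()
    )
    request = urllib.request.Request(
        f"{base_url.rstrip('/')}/audio/transcriptions",
        data=body,
        headers={"Authorization": f"Bearer {api_key}", "Content-Type": content_type},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.read().decode("utf-8")
```

A call such as `transcribe("interview.m4a", api_key, language="de", prompt="Speakers: Ana, Raj")` would send the file with a language hint and a context prompt. The file name and prompt text here are placeholders.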
- Full support for standard audio formats including .m4a, .ogg, and others accepted by OpenAI.
- Direct integration with OpenClaw configuration for seamless API key management.
- Flexibility to specify custom Whisper models, such as whisper-1, to balance performance and quality.
- Ability to pass specific language parameters to improve recognition in non-English scenarios.
- Support for context-aware transcriptions via the prompt flag, which is particularly useful for including speaker names or specific technical terminology.
- Output versatility with options for raw text or structured JSON formatting for downstream integration.
- Ensure that your OPENAI_API_KEY is correctly set in your ~/.openclaw/openclaw.json file to authenticate requests.
- You can use an OpenAI-compatible proxy or a local gateway by setting the OPENAI_BASE_URL variable, allowing for air-gapped or private cloud deployments.
- For best results, ensure audio files are clear; while Whisper is robust to background noise, extreme distortion may impact transcription precision.
- The script-based execution model allows this skill to be integrated into broader shell-based automation chains or batch processing tasks.
- Users handling sensitive information should ensure their endpoint configuration reflects their security requirements regarding data transmission.
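The API key and endpoint settings described in the list above could be resolved along the following lines. This is a sketch under stated assumptions, not the skill's actual logic: the layout of ~/.openclaw/openclaw.json is not documented here, so the hypothetical helper `find_key` searches the parsed file for any key named OPENAI_API_KEY. The sketch also assumes the file is plain JSON, that environment variables take precedence over the file, and that the default endpoint is the public OpenAI API.

```python
import json
import os
from pathlib import Path

DEFAULT_BASE_URL = "https://api.openai.com/v1"


def find_key(node, name):
    """Depth-first search of nested dicts and lists for a non-empty string under `name`."""
    if isinstance(node, dict):
        value = node.get(name)
        if isinstance(value, str) and value:
            return value
        children = node.values()
    elif isinstance(node, list):
        children = node
    else:
        return None
    for child in children:
        found = find_key(child, name)
        if found:
            return found
    return None


def resolve_settings(config_path=Path.home() / ".openclaw" / "openclaw.json", env=None):
    """Return (api_key, base_url), preferring environment variables over the file."""
    env = os.environ if env is None else env
    api_key = env.get("OPENAI_API_KEY")
    config_file = Path(config_path)
    if not api_key and config_file.is_file():
        # Assumption: the config file parses as plain JSON.
        config = json.loads(config_file.read_text(encoding="utf-8"))
        api_key = find_key(config, "OPENAI_API_KEY")
    if not api_key:
        raise RuntimeError("OPENAI_API_KEY not found in the environment or the config file")
    # OPENAI_BASE_URL can point at an OpenAI-compatible proxy or local gateway.
    base_url = env.get("OPENAI_BASE_URL") or DEFAULT_BASE_URL
    return api_key, base_url.rstrip("/")
```

Failing early when no key is found keeps a batch job from sending a series of unauthenticated requests. Printing the resolved base URL before a run is also a simple way to confirm where audio will be sent when handling sensitive recordings.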
Repository Stats
- Stars: 365,626
- Forks: 74,932
- Open Issues: 7,000
- Language: TypeScript
- Default Branch: main
- Sync Status: Idle
- Last Synced: Apr 28, 2026, 11:37 AM