
openai-whisper

Local speech-to-text transcription using the Whisper CLI. Transcribe audio files or voice recordings directly on your machine without requiring an API key.

Introduction

The openai-whisper skill provides a local-first speech-to-text transcription engine for your OpenClaw assistant. By running the OpenAI Whisper CLI locally, this tool eliminates the need for cloud-based API subscriptions, ensuring your audio data remains private and is processed entirely on your hardware. It is designed for users who require high-quality, reliable transcription of voice notes, meeting recordings, or media files without the latency or privacy concerns associated with remote AI services.

  • Local transcription processing: Whisper models run directly on your device, ensuring complete data sovereignty and privacy.

  • CLI-driven efficiency: Utilizes the robust Whisper command-line interface for reliable batch processing and automation.

  • Flexible model selection: Supports multiple Whisper model sizes (tiny, base, small, medium, large, turbo), allowing users to balance transcription speed against linguistic accuracy.

  • Multi-format output support: Easily generate output in various formats such as plain text (txt) or subtitle files (srt).

  • Translation capabilities: Built-in support for translating audio content into English as part of the transcription workflow.

  • Setup requirements: Models are automatically downloaded to ~/.cache/whisper upon the first execution; ensure sufficient disk space for the chosen model size.

  • Performance optimization: For faster turnaround, prioritize smaller model variants; for maximum precision and difficult audio or heavy accents, opt for larger model versions.

  • Usage: Execute the tool by providing the path to your audio file (e.g., .mp3, .m4a), specifying the desired model, and defining your output directory.

  • Constraints: Performance is dependent on the host machine's hardware capabilities (CPU/GPU availability); avoid running excessively large models on resource-constrained devices to prevent system slowdowns.
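The usage described above can be sketched as a short shell script. This is a minimal, hedged example: the file names and output directories are illustrative placeholders, and the commands are printed (dry-run) rather than executed so the sketch works without the CLI installed. Setting `DRY_RUN=""` would run them for real, which requires the `openai-whisper` package and ffmpeg on PATH.

```shell
# Dry-run sketch of the Whisper CLI workflow described above.
# Set DRY_RUN="" to execute for real (requires `pip install -U openai-whisper`
# and ffmpeg; models download to ~/.cache/whisper on first run).
DRY_RUN=echo

# Transcribe a voice memo to plain text with the fast "small" model.
# (File names and directories here are illustrative placeholders.)
$DRY_RUN whisper voice-note.m4a --model small --output_format txt --output_dir ./transcripts

# Produce subtitles from a meeting recording with the higher-accuracy "turbo" model.
$DRY_RUN whisper meeting.mp3 --model turbo --output_format srt --output_dir ./subtitles

# Translate non-English speech into English as part of transcription.
$DRY_RUN whisper interview.mp3 --model medium --task translate --output_dir ./transcripts
```

Each invocation writes its output files into the given `--output_dir`, named after the input file; batching is simply a matter of looping the same command over a directory of audio files.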

Repository Stats

Stars: 365,661
Forks: 74,940
Open Issues: 6,976
Language: TypeScript
Default Branch: main
Sync Status: Idle
Last Synced: Apr 28, 2026, 12:36 PM