videocut:剪口播
AI-powered video editing agent for talking head videos, featuring speech-to-text, disfluency detection, and browser-based review workflows.
Introduction
This agent automates the editing of talking-head videos (口播视频). It addresses the limitations of standard consumer video editors by using AI to analyze the semantic content of the speech rather than only the timeline. By combining high-accuracy transcription from Volcengine with LLM-powered linguistic analysis, it identifies filler words, repeated speech, stuttering, and silent segments. It turns a raw recording into a structured project that can be audited through a dedicated local web server before frame-accurate cuts are applied.
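To make the idea of disfluency detection concrete, here is a minimal rule-based sketch. It is not the project's LLM analysis, only a simplified stand-in. The word shape `{ text, start, end }`, the `FILLERS` list, and the function name `flagDisfluencies` are assumptions made for illustration.

```javascript
// Hypothetical word-level transcript entries: { text, start, end }, times in seconds.
// The filler list is an illustrative assumption, not the project's configuration.
const FILLERS = new Set(["um", "uh", "ah", "嗯", "啊"]);

// Returns the indices of words that look like fillers or immediate repeats.
function flagDisfluencies(words) {
  const flagged = [];
  words.forEach((word, i) => {
    const token = word.text.trim().toLowerCase();
    const prev = i > 0 ? words[i - 1].text.trim().toLowerCase() : null;
    if (FILLERS.has(token)) {
      flagged.push({ index: i, reason: "filler" });
    } else if (token === prev) {
      flagged.push({ index: i, reason: "repeat" });
    }
  });
  return flagged;
}

const sample = [
  { text: "So", start: 0.0, end: 0.3 },
  { text: "um", start: 0.3, end: 0.7 },
  { text: "today", start: 0.7, end: 1.1 },
  { text: "today", start: 1.1, end: 1.5 },
  { text: "we", start: 1.5, end: 1.7 },
];

// Flags index 1 as a filler and index 3 as a repeat.
console.log(flagDisfluencies(sample));
```

A real pipeline would likely hand these candidates to a reviewer rather than delete them outright, which matches the review-before-export workflow described here.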
- Precise linguistic analysis: Identifies and suggests removal for repeated sentences, verbal corrections, residual fragments (残句), and excessive filler words (um, ah).
- Intelligent segmentation: Performs silence detection and speech-to-text with character-level timestamps to isolate valid content from noise.
- Integrated review workflow: Automatically generates a review.html interface where users can preview clips, toggle specific segments for deletion, and confirm changes before final export.
- High-fidelity processing: Uses FFmpeg with frame-level trimming and original parameter-matching re-encoding, ensuring that the final output maintains consistent bitrate and visual quality.
- Automated directory management: Organizes raw source files, intermediate analysis logs, subtitles, and final exports into timestamped project folders for easy tracking.
- Inputs: Accepts standard MP4 video files and requires a configured Volcengine API key.
- Outputs: Generates subtitles_words.json, auto-selected deletion lists, and final edited MP4 videos using complex filters for frame-accurate cuts.
- Usage constraints: Designed for local operation with Node.js and Python environments; requires FFmpeg installed for video manipulation.
- Best practices: Review the generated speech-error analysis report (口误分析.md) and adjust semantic preferences in the user habit configuration files to improve accuracy over time.
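The "complex filters" mentioned in the outputs can be illustrated with a short sketch that turns a deletion list into keep segments and then into an FFmpeg `-filter_complex` string. The helper names `keepSegments` and `buildFilterComplex`, the `{ start, end }` range shape, and the sample times are assumptions for illustration; the repository's actual scripts may be organized differently. The filters used (`trim`, `atrim`, `setpts`, `asetpts`, `concat`) are standard FFmpeg filters.

```javascript
// Turn deletion ranges into the complementary "keep" segments (times in seconds).
function keepSegments(deletions, duration) {
  const sorted = [...deletions].sort((a, b) => a.start - b.start);
  const keeps = [];
  let cursor = 0;
  for (const { start, end } of sorted) {
    if (start > cursor) keeps.push({ start: cursor, end: start });
    cursor = Math.max(cursor, end); // tolerate overlapping deletions
  }
  if (cursor < duration) keeps.push({ start: cursor, end: duration });
  return keeps;
}

// Build a filter_complex string that trims each kept segment and concatenates them.
function buildFilterComplex(keeps) {
  const parts = keeps.map(({ start, end }, i) =>
    `[0:v]trim=start=${start}:end=${end},setpts=PTS-STARTPTS[v${i}];` +
    `[0:a]atrim=start=${start}:end=${end},asetpts=PTS-STARTPTS[a${i}];`
  );
  const inputs = keeps.map((_, i) => `[v${i}][a${i}]`).join("");
  return parts.join("") + `${inputs}concat=n=${keeps.length}:v=1:a=1[outv][outa]`;
}

// Hypothetical example: a 12-second clip with two ranges marked for deletion.
const keeps = keepSegments([{ start: 2, end: 3.5 }, { start: 8, end: 9 }], 12);
const filter = buildFilterComplex(keeps);
console.log(filter);
```

The resulting string could be passed to a command along the lines of `ffmpeg -i input.mp4 -filter_complex "<filter>" -map "[outv]" -map "[outa]" output.mp4`. Because the filtergraph re-encodes the stream, cuts are not restricted to keyframes, which is what makes frame-level trimming possible.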
Repository Stats
- Stars: 1,515
- Forks: 248
- Open Issues: 17
- Language: JavaScript
- Default Branch: main
- Sync Status: Idle
- Last Synced: May 3, 2026, 08:22 PM