instagram-reels

Download Instagram Reels, extract metadata, and generate full audio transcripts using Groq Whisper. Supports TikTok and YouTube Shorts via yt-dlp.

Introduction

This skill provides a command-line pipeline that lets content creators, researchers, and social media analysts extract and process short-form video data. It uses yt-dlp for media extraction and the Groq API's whisper-large-v3-turbo model for transcription, so the agent can turn a video URL into a text transcript and structured metadata in seconds. It is designed for users who need to repurpose video content, archive social media data, or run large-scale sentiment and thematic analysis on video platforms.
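As a rough illustration of the extraction half of this pipeline, the sketch below builds the two yt-dlp command lines such a workflow might use: one to dump metadata as JSON and one to extract the audio as MP3 (which yt-dlp delegates to ffmpeg). The function name `build_commands`, the `reels_tmp` directory, and the output template are assumptions made for this example, not the skill's actual script. Only standard yt-dlp options are used.

```python
from pathlib import Path
from typing import Optional


def build_commands(url: str, workdir: str = "reels_tmp",
                   cookies: Optional[str] = None) -> dict:
    """Return yt-dlp command lines for one video URL (nothing is executed here)."""
    # --cookies is only needed for private content; public videos work without it.
    auth = ["--cookies", cookies] if cookies else []
    # Hypothetical output template: <workdir>/<video id>.<extension>
    template = str(Path(workdir) / "%(id)s.%(ext)s")
    return {
        # Print the video's metadata as a single JSON object, without downloading.
        "metadata": ["yt-dlp", *auth, "--dump-json", "--skip-download", url],
        # Download the media and convert the audio track to MP3 via ffmpeg.
        "audio": ["yt-dlp", *auth, "-x", "--audio-format", "mp3",
                  "-o", template, url],
    }
```

Each list can be passed to `subprocess.run(cmd, check=True, capture_output=True, text=True)`. The resulting MP3 would then be uploaded to Groq for transcription, a step this sketch leaves out because it needs an API key.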

  • Automated extraction of video metadata including captions, uploader information, and video duration using yt-dlp JSON parsing.

  • High-speed audio transcription via Groq Cloud, offering near real-time processing of up to 25-minute audio segments.

  • Broad platform compatibility: supports Instagram Reels, TikTok, YouTube Shorts, and any other video source that yt-dlp can extract.

  • Integrated FFmpeg support for converting audio from the downloaded video container (for example MP4) to MP3, ready for transcription.

  • Detailed JSON output including full text transcripts and timestamped segments for precise content alignment.

  • Requires local installation of yt-dlp and ffmpeg for media processing.

  • Requires a valid Groq API key for cloud-based transcription services.

  • Metadata extraction works natively for public videos; private content requires authentication via cookies.txt integration.

  • Temporary files should be cleaned up periodically so that downloaded media does not accumulate in the workspace.

  • Suitable for developers and power users looking to automate social media research workflows or content repurposing pipelines without manual interaction.
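To make the metadata and transcript outputs described above more concrete, here is a minimal sketch of how they might be post-processed. It assumes the metadata comes from `yt-dlp --dump-json`, where `description`, `uploader`, and `duration` are standard fields (treating `description` as the caption is an assumption about how the extractor fills it). It also assumes the transcript is a verbose-JSON style object with a `segments` list carrying `start`, `end`, and `text`. The function names are hypothetical and not taken from the skill.

```python
import json


def summarize_metadata(raw_json: str) -> dict:
    """Pick caption, uploader, and duration out of yt-dlp's JSON dump."""
    info = json.loads(raw_json)
    return {
        "caption": info.get("description"),   # assumed to hold the post caption
        "uploader": info.get("uploader"),
        "duration_s": info.get("duration"),   # seconds, may be None
    }


def _mmss(seconds: float) -> str:
    """Format a number of seconds as MM:SS, dropping the fractional part."""
    minutes, secs = divmod(int(seconds), 60)
    return f"{minutes:02d}:{secs:02d}"


def format_segments(transcript: dict) -> list:
    """Turn timestamped segments into '[MM:SS-MM:SS] text' lines."""
    lines = []
    for seg in transcript.get("segments", []):
        span = f"{_mmss(seg['start'])}-{_mmss(seg['end'])}"
        lines.append(f"[{span}] {seg['text'].strip()}")
    return lines
```

Keeping the parsing separate from the download and transcription calls means these helpers can be tried on saved JSON files without network access or an API key.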

Repository Stats

Stars: 4,455
Forks: 1,215
Open Issues: 7
Language: Python
Default Branch: main
Sync Status: Idle
Last Synced: Apr 30, 2026, 09:49 AM