gemini-video-understanding
Perform advanced video analysis using Google's Gemini API: summarize content, transcribe audio, extract timestamps, clip segments, and analyze YouTube URLs or local files with support for multiple models and long contexts.
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
Manage and automate Obsidian tasks directly via the TaskNotes plugin HTTP API with CLI-based task creation, listing, status updates, and project filtering.
Essential guide to llmemory for document storage and search: installation, database setup with pgvector, document ingestion, hybrid/semantic retrieval, and building RAG systems with multi-tenant support.
Provider-agnostic MCP skill for wait-for-change automation on PR events like status checks, merges, and comments.
Create new Figma design or FigJam files directly via the MCP server. Automatically resolves plans and initializes new canvases for your design workflows.
Free AI-powered web search via Exa MCP. Includes deep research, company and people lookup, and code context, with no API key required.
Home Assistant OS (HAOS) operations skill for agents. Features read-only diagnostics, automation design, health auditing, and safety-first configuration management.
Manage Vercel DNS records for 0 Finance domains using the Vercel CLI.
Framework for building, registering, and orchestrating Model Context Protocol (MCP) tools and AI agent workflows within the Hive native Rust architecture.
Comprehensive email management and automation tool. Send, receive, and organize emails with attachment support across multiple providers.
Generate images using the Cloudflare Workers AI flux-1-schnell model. Enables text-to-image capabilities directly within your workflow.
Aggressively prune grammatical scaffolding and filler text from inputs to optimize LLM token usage while retaining core semantic content.