ai-multimodal
Process and generate multimedia with Google Gemini. Analyze audio, images, videos, and PDFs using large context windows. Supports transcription, visual QA, OCR, and AI-driven image creation.
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
Analyze and audit Excel spreadsheets to understand logic, identify formula errors, detect risks, and generate documentation for legacy or unknown files.
CLI tool that automatically renames academic PDF files using AI-extracted metadata, supporting multiple providers like Claude, OpenAI, Gemini, and Ollama.
Expert copyeditor and writing specialist that identifies grammar, logic, and flow errors in text with actionable, targeted fix suggestions.
Advanced Google search using a real, JavaScript-rendered Chrome browser. Ideal for scraping full page content, site-specific queries, and time-filtered results.
Review backend pull requests with security enforcement and GitHub CLI integration in a strict read-only environment.
Normalizes testing defect logs by correcting typos, abbreviations, and ambiguous descriptions based on product-specific codebooks and station validation.
Master core marketing concepts, psychology, and frameworks. Includes funnel mapping, positioning, value propositions, and customer journey analysis for effective strategy.
AI-powered coach for Xiaohongshu (XHS) note writing. Generate viral, platform-optimized content with storytelling templates, engagement hacks, and automated compliance tagging.
Analyzes codebases to generate hierarchical documentation, onboarding guides, and architectural mapping, helping teams understand and document their projects efficiently.
Persistent, Git-friendly memory for Claude. Automatically store and retrieve project decisions, bug fixes, and coding patterns in a local .mv2 file.
Framework for building multi-agent systems, AgentOS runtimes, and MCP-integrated AI agents.