ai-multimodal
Process and generate multimedia with Google Gemini. Analyze audio, images, videos, and PDFs using large context windows. Supports transcription, visual QA, OCR, and AI-driven image creation.
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
Automated inbound and outbound AI email workflow for 0 Finance, enabling agents to manage invoices, bank transfers, and financial conversations.
Train and manage neural networks in distributed E2B sandboxes using the Flow Nexus platform, supporting custom architectures like Transformers, LSTMs, and GANs.
Generate realistic virtual product try-on visualizations to help customers evaluate fit, drape, and scale before purchasing.
Multi-platform content generator for Chinese social media including Xiaohongshu, Zhihu, Official Accounts, and Douyin with native formatting.
Standardizes project context by managing artifacts (product, tech-stack, workflow, tracks) in a conductor/ directory. Supports project scaffolding, artifact synchronization, and AI alignment for greenfield and brownfield projects.
Automated generation of customer-facing App Store 'What's New' release notes by analyzing git commit history since the last version bump.
Capture and formalize software development ideas into structured design documents within the Hashbrown repository, including research and conceptual sketches.
Brainstorm creative domain names and instantly verify availability across multiple TLDs like .com, .io, .ai, and more to streamline your branding process.
A microworld operating system for LLM-based agent living memory, transforming filesystems into navigable rooms and code into habitable worlds.
Expert consultant for designing and building high-quality, consistent AI agent skills. Guides you through discovery, architecture, and creation phases to ensure reliable, composable, and efficient skill delivery.
Queen-led multi-agent orchestration for Claude Code, featuring Byzantine consensus, persistent collective memory, and adaptive task distribution for complex software projects.