gemini-vision
Implement Google Gemini API vision capabilities for image and document analysis, including captioning, object detection, segmentation, and multi-image comparison.
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
A comprehensive framework for deep analysis of articles, papers, and long-form content using 10+ thinking models like SCQA, First Principles, and Systems Thinking.
Build production-grade RAG systems using vector databases, semantic search, and LangGraph to ground LLMs in external knowledge.
Generate comprehensive unit tests that follow best practices for functions and classes, with support for multiple frameworks such as pytest, unittest, and Jest.
Implement production-grade AI agents with LangGraph, tool-calling guardrails, SSE streaming, and episodic memory. Includes anti-patterns, fix pairs, and stateful architecture patterns.
Apply behavioral science, mental models, and psychological principles to marketing strategy, copywriting, and decision-making.
Diagnose dotCMS CI/CD GitHub Actions failures, including PR builds, merge queue issues, and nightly test reports.
Standardized debugging and diagnostic guidelines for AI coding agents.
End-to-end GitHub repository maintenance agent. Automates triage, PR review, issue analysis, and maintenance reporting to ensure long-term repository health, stability, and growth.
Generate high-converting, professional README.md files for open source projects and CLI tools using a value-driven, scannable framework.
Generate consistent, Conventional Commits-compliant messages directly from your staged git diffs.
Comprehensive secure coding guidelines for 15+ languages, covering OWASP Top 10, infrastructure security, and best practices to identify vulnerabilities in code, configurations, and cloud setups.