evaluation
Build systematic evaluation frameworks for AI agents using multi-dimensional rubrics, LLM-as-a-judge, and regression testing to measure performance, quality, and context engineering effectiveness.
Automated LinkedIn lead generation for tech services. Identifies non-tech founders, performs website gap analysis, and generates professional PDF audit reports for high-value B2B outreach.
Seamlessly toggle between live and mocked external dependencies using the Model Context Protocol (MCP) for autonomous development environments.
Architectural expert for the SpecKit template, managing Spec-Driven Development, design patterns, and microservices lifecycle automation.
Expert guidance for Claude Messages API: structured outputs, prompt caching, tool use, and migration from deprecated Claude 3.x models to 4.5. Prevents common API errors.
Local text-to-speech conversion using Kokoro TTS. Generate audio, read text aloud, and handle multilingual speech synthesis directly in your terminal.
Validates and coordinates batch study guide operations, preventing errors by enforcing template compatibility, file availability, and source-only policies before agent execution.
Search and download 3D models from Printables with automated manifest generation for 3D printing and prototyping workflows.
Manage, sync, and apply AI agent skills, kits, and presets using the Skills Hub CLI. Streamline your project setup by browsing catalogs, inspecting configurations, and deploying curated instruction policies and skill packages.
Find, connect, and use over 100,000 MCP tools and skills via the Smithery CLI to integrate external services, manage agent workspaces, and automate workflows.
Evaluate Deca agent prompts and behavioral consistency through automated test runners, manual LLM judgment, and structured reporting.
GoHighLevel workflow automation expert. Integrates with Hylo GHL API to manage workflows, API endpoints, UI navigation, and automation planning.