eval-harness
Official evaluation framework for AI agent sessions, implementing Evaluation-Driven Development (EDD) principles to ensure reliability.
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
Dialectical reasoning and adversarial coding agent for MCP-enabled editors, forcing LLMs to resolve internal contradictions for higher quality outputs.
Production-ready Scrum Master assistant for sprint management, capacity planning, and real-time team analytics.
Implement production-grade AI agents with LangGraph, tool-calling guardrails, SSE streaming, and episodic memory. Includes anti-pattern/fix pairs and stateful architecture patterns.
Autonomous research specialist for verified information gathering, source evaluation, and structured synthesis.
Prevents AI hallucination and ensures evidence-based, verifiable outputs when analyzing code, reviewing technical documents, or providing recommendations.
Complete browser automation with Playwright. Features local dev server detection, script generation, screenshot capture, form filling, responsive testing, and UX validation.
AI-powered video editing agent for talking head videos, featuring speech-to-text, disfluency detection, and browser-based review workflows.
Manage calendar events, check availability, and schedule meetings, whether during or outside voice and text interactions.
Comprehensive API test automation suite supporting REST and GraphQL. Features functional, performance, and contract testing with integrated mock services.
Implementation patterns for MERIDIAN autonomous AI agents using the Claude API, including BaseAgent lifecycle, structured tool use, token budget enforcement, and cron scheduling.
GoHighLevel workflow automation expert. Integrates with the Hylo GHL API to manage workflows, API endpoints, UI navigation, and automation planning.