eval-harness
Official evaluation framework for AI agent sessions, implementing Evaluation-Driven Development (EDD) principles to ensure reliability.
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
Connect to the Notion API to create, manage, and query pages, databases, and blocks for your AI-powered knowledge management.
Validates cryptographic implementations using the Google Wycheproof test vector suite to detect security edge cases and known vulnerabilities.
Lightweight MCP (Model Context Protocol) connection handler supporting stdio, SSE, and streamable HTTP transports for seamless server integration.
Intelligent strategic planning and requirements gathering with multi-perspective consensus loops and structured deliberation.
Manage your Whop digital store via API: create products and plans, and track payments and memberships. Suited to automating digital-product business workflows.
Explains complex concepts using established teaching frameworks such as the Feynman technique, Socratic questioning, and cognitive load theory to build clear, deep understanding.
Sends debugging data, logs, and visual output to the Ray desktop application via its local API for real-time developer feedback.