eval-harness
Official evaluation framework for AI agent sessions, implementing Evaluation-Driven Development (EDD) principles to ensure reliability.
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
Scans Solana programs (native/Anchor) for 6 critical vulnerabilities, including arbitrary CPI, improper PDA validation, and missing ownership checks, providing detailed fix recommendations.
Deploy specialized AI swarms to perform comprehensive, multi-domain GitHub pull request reviews covering security, performance, architecture, and style.
Manage Fly.io edge infrastructure: deploy apps, scale machines, configure volumes, secrets, and networking via the Fly.io Machines API. Python-based, zero-dependency.
Create, alter, and validate Snowflake semantic views via the CLI. Automate the generation, documentation, and testing of semantic layer definitions to ensure model accuracy and star schema compliance.
Framework for building multi-agent systems, AgentOS runtimes, and MCP-integrated AI agents.
Build distinctive, high-end React Native Expo interfaces using liquid glass design and iOS Human Interface Guidelines for production-grade mobile apps.
Scaffold and generate new GitHub Copilot Agent Skills. Provides templates, directory structures, and instructions to build specialized AI capabilities with bundled resources.
Systematic debugging workflow for Claude Code hooks. Use to troubleshoot hooks that fail to fire, produce output errors, or behave unexpectedly.
Diagnose and resolve connection, sync, subscription, and type issues in Dojo.js applications. Use for troubleshooting Torii, entity queries, and state updates.
Analyze codebase statistics: LOC, language distribution, and code-to-comment ratios using pygount.
A CTF solver agent that triages challenges, identifies the vulnerability category, and routes tasks to specialized skills for web, pwn, crypto, forensics, and reverse engineering analysis.