eval-harness
Official evaluation framework for AI agent sessions, implementing Evaluation-Driven Development (EDD) principles to ensure reliability.
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
Guide for implementing features using architecture-first design, TDD, rich domain models, and Swift 6.2 patterns, ensuring a clean separation between Domain, Infrastructure, and App layers.
Master dynamic programming (DP) patterns with complete implementations of memoization, tabulation, and state design for production-ready solutions.
Evaluate Deca agent prompts and behavioral consistency through automated test runners, manual LLM judgment, and structured reporting.
Standardized NixOS module patterns for system configuration, package management, and home-manager setups.
A framework for building modular AI agent rigs using Nix, featuring parameterizable skills, knowledge management, and automated tool configuration.
Standardized Rust documentation practices for the HASH codebase, ensuring consistency in doc comments, intra-doc links, and error handling.
Apply reality-first coding standards: intentional naming, focused functions, guard clauses, and deterministic side effects, with no speculative features.
Navi programming language expert. Use for writing Navi code, debugging, implementing concurrency, handling error states, and managing Navi's type system or module integrations.
Quick-reference guide and utility skill for Helm chart development, template syntax, and Kubernetes application deployment.
Expert Rust analysis for ownership, borrowing, and lifetime errors, including E0382, E0597, and memory safety patterns.
Pragmatic AI-assisted coding standards focused on clean code, simplicity, and maintainability. Enforces best practices like SRP, DRY, and KISS to prevent over-engineering.