Engineering
eval-harness avatar

eval-harness

Official evaluation framework for AI agent sessions, implementing Evaluation-Driven Development (EDD) principles to ensure reliability.

Installation

Agent type

Claude Code

Install Command (macOS)
curl -fsSL "https://mentalok.io/api/v1/skills/eval-harness/install?os=mac&agent=claude" | bash
Install Command (Windows)
curl -L "https://mentalok.io/api/v1/skills/eval-harness/install?os=windows&agent=claude" -o install-eval-harness.bat && install-eval-harness.bat

Download Skill Project

/agent-skill/eval-harness