verification-before-completion
Enforces a mandatory verification protocol before any claim of success: the agent must run the relevant tests or checks and confirm their output first, so that no false assertion reaches the user.
Introduction
The verification-before-completion skill acts as a gatekeeper for coding agents. Its rule is absolute: no status claim (such as 'fixed', 'complete', or 'passing') may be made without independent, fresh evidence. The skill is designed for autonomous coding agents and developers who want to eliminate a common failure mode: assuming code works without empirical validation. By shifting the agent from confidence-based reporting to evidence-based verification, it ensures that every reported status is backed by command output that can be checked.
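The identify, execute, and confirm cycle described above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual implementation: the names `Evidence`, `run_fresh`, and `claim` are hypothetical, and a zero exit code is assumed to mean success for the chosen command.

```python
import subprocess
import sys
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Evidence:
    """Raw result of one verification command (hypothetical helper type)."""
    command: List[str]
    exit_code: int
    output: str


def run_fresh(command: List[str]) -> Evidence:
    """Execute the command now; never reuse a remembered or cached result."""
    proc = subprocess.run(command, capture_output=True, text=True)
    return Evidence(command, proc.returncode, proc.stdout + proc.stderr)


def claim(status: str, evidence: Optional[Evidence]) -> str:
    """Return the status claim only when fresh, passing evidence backs it."""
    if evidence is None:
        return f"UNVERIFIED: cannot claim '{status}' without running a command"
    cmd = " ".join(evidence.command)
    if evidence.exit_code != 0:
        return f"FAILED: '{cmd}' exited with code {evidence.exit_code}"
    return f"{status} (verified by '{cmd}', exit code 0)"


if __name__ == "__main__":
    # Stand-in for a real test command such as `npm test` or `cargo build`.
    result = run_fresh([sys.executable, "-c", "print('3 passed')"])
    print(claim("tests passing", result))
```

In a real task the stand-in command would be replaced by the project's own test or build command, and the output would be read for failure counts in addition to the exit code, since some tools report failures while still exiting with zero.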
- Mandates a strict identification, execution, and confirmation cycle for every completion claim.
- Provides a systematic framework for checking build successes, test pass rates, linting status, and requirement compliance.
- Integrates directly into development loops, specifically before Git operations like committing, pushing, or opening PRs.
- Actively combats 'rationalization' and 'agent hallucination' by treating any claim of satisfaction or success without supporting command output as a red flag.
- Features a comprehensive mapping of common software development claims to their required proof, such as requiring full red-green TDD cycles for bug fixes.
- Usage: This skill should be active during the final stages of any task, specifically when an agent is about to declare a task finished or verified.
- Input/Output: Expects the agent to identify the relevant terminal command (e.g., npm test, cargo build) and process the raw output stream to confirm zero failures or expected exit codes before signaling success to the user.
- Practical constraints: The agent must ignore internal confidence metrics or previous run results; it must initiate a fresh execution to ensure current environment context is accounted for.
- Safety: Adhering to this skill prevents shipping bugs, undefined functions, or incomplete features caused by premature confidence or exhaustion-related errors.
- Integration: It acts as an essential check for subagent-driven development, ensuring that agents do not delegate or approve work that has not been demonstrably verified.
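The mapping of claims to required proof mentioned in the list above might look something like the sketch below. The table entries and the names `REQUIRED_PROOF`, `required_proof`, and `red_green_verified` are illustrative assumptions rather than the skill's actual wording. The red-green check encodes the usual TDD idea that a regression test must fail without the fix and pass with it.

```python
from typing import Callable, Dict

# Hypothetical claim-to-proof table; the skill's real entries may differ.
REQUIRED_PROOF: Dict[str, str] = {
    "tests pass": "fresh test command output showing zero failures",
    "build succeeds": "fresh build command exiting with code 0",
    "lint clean": "fresh linter output showing zero errors",
    "bug fixed": "red-green cycle: test fails without the fix, passes with it",
    "requirements met": "item-by-item check against the stated requirements",
}


def required_proof(claim: str) -> str:
    """Look up the evidence a claim needs before it may be stated."""
    return REQUIRED_PROOF.get(
        claim, "no known proof: run a fresh check and show its output"
    )


def red_green_verified(test: Callable[[bool], bool]) -> bool:
    """test(fix_applied) returns True when the regression test passes.

    A bug fix counts as verified only if the test is red (fails) without
    the fix and green (passes) with it.
    """
    red = not test(False)
    green = test(True)
    return red and green


def divide_by_zero_test(fix_applied: bool) -> bool:
    """Toy regression test: dividing by zero should return None, not raise."""
    def safe_div(a: float, b: float):
        if fix_applied and b == 0:
            return None
        return a / b

    try:
        return safe_div(1, 0) is None
    except ZeroDivisionError:
        return False


if __name__ == "__main__":
    print(required_proof("bug fixed"))
    print(red_green_verified(divide_by_zero_test))
```

A test that passes both with and without the fix is rejected by `red_green_verified`, because it never demonstrated the bug in the first place.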
Repository Stats
- Stars: 170,798
- Forks: 15,078
- Open Issues: 285
- Language: Shell
- Default Branch: main
- Sync Status: Idle
- Last Synced: Apr 28, 2026, 11:38 AM