
evidence-first-debugging

Enforces a strict evidence-based debugging workflow using structured observation, hypothesis testing, and causality validation to eliminate speculation in technical investigations.

Introduction

The evidence-first-debugging skill is a diagnostic framework for software engineers, site reliability engineers (SREs), and QA analysts. It replaces intuition-based guessing with a rigorous, scientific-method workflow. By mandating a 15-section Unified Investigation Template, the skill anchors every claim to a verifiable signal and guards against common traps such as correlation-causation fallacies, incomplete verification, and ungrounded speculation during complex incident response.

  • Structured Observation Recording: Mandates that all FACTS, OBSERVATIONS, and RESULTS are tagged with unique evidence IDs [En] to maintain an audit trail for every investigation claim.
  • Hypothesis-Driven Testing: Requires explicit documentation of hypotheses, including clear prediction statements and falsifiable tests, ensuring that every debugging branch can be logically disproven or confirmed.
  • Causality Gate Validation: Implements strict classification rules for action-result links to ensure that code changes or configuration tweaks are backed by evidence rather than correlative guesswork.
  • Domain-Specific Extensions: Dynamically loads specialized debugging or performance modules (e.g., call stack analysis, dependency graphs, baseline metrics, and resource utilization) based on the investigation type (bug vs. performance regression).
  • Verification Gates: Prevents an investigation from being marked resolved-verified until an explicit verification command or test case that addresses the original issue has run successfully.
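
The evidence tagging, hypothesis records, and verification gate described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the skill's actual implementation: the class names (`Evidence`, `Hypothesis`, `Investigation`), the method names, and every status string other than `resolved-verified` are hypothetical and chosen only for this example.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Evidence kinds named in the feature list; everything else here is assumed.
EVIDENCE_KINDS = {"FACT", "OBSERVATION", "RESULT"}


@dataclass
class Evidence:
    eid: str   # unique evidence ID, e.g. "E1"
    kind: str  # one of EVIDENCE_KINDS
    text: str

    def render(self) -> str:
        # Render with the [En] tag so claims can cite it.
        return f"[{self.eid}] {self.kind}: {self.text}"


@dataclass
class Hypothesis:
    statement: str
    prediction: str       # what should be observed if the hypothesis holds
    test: str             # a falsifiable test that could disprove it
    status: str = "open"  # hypothetical states: open / confirmed / refuted
    evidence_ids: List[str] = field(default_factory=list)


class Investigation:
    def __init__(self) -> None:
        self.evidence: List[Evidence] = []
        self.verification_command: Optional[str] = None
        self.verification_passed = False

    def record(self, kind: str, text: str) -> str:
        """Store a piece of evidence and return its new ID."""
        if kind not in EVIDENCE_KINDS:
            raise ValueError(f"unknown evidence kind: {kind}")
        eid = f"E{len(self.evidence) + 1}"
        self.evidence.append(Evidence(eid, kind, text))
        return eid

    def conclude(self, hyp: Hypothesis, status: str, evidence_ids: List[str]) -> None:
        """Confirm or refute a hypothesis, but only with recorded evidence."""
        if status not in {"confirmed", "refuted"}:
            raise ValueError(f"unknown conclusion: {status}")
        known = {e.eid for e in self.evidence}
        if not evidence_ids or any(i not in known for i in evidence_ids):
            raise ValueError("a conclusion must cite recorded evidence IDs")
        hyp.status = status
        hyp.evidence_ids = list(evidence_ids)

    def verify(self, command: str, passed: bool) -> None:
        """Record the verification command and whether it succeeded."""
        self.verification_command = command
        self.verification_passed = passed

    def status(self) -> str:
        # Verification gate: no command or a failed run means not resolved.
        if self.verification_command and self.verification_passed:
            return "resolved-verified"
        return "unverified"
```

In this sketch, `conclude` raises if a hypothesis is confirmed or refuted without citing at least one recorded evidence ID, and `status` returns `resolved-verified` only after `verify` has been called with a command that passed.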

Usage and Constraints:

  • Ideal for debugging software bugs, crashes, flaky tests, memory leaks, latency regressions, and complex performance throughput issues.
  • The skill requires input in the form of system signals, logs, or metrics. Outputs are strictly formatted; any abbreviated output must include a mandatory truncation disclosure block (total lines, method, fingerprint, and command).
  • Users provide raw signal data and follow the agent's prompts through sections 0-14 of the Investigation Template.
  • Keywords for search and integration: root cause analysis, causality check, debugging extension, performance monitoring, investigation template, flaky tests, evidence-based development, verification gate, and scientific method for software.
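
The truncation disclosure block mentioned under Usage and Constraints lists four fields: total lines, method, fingerprint, and command. A minimal sketch of how such a block might be produced follows. The function name, the field labels, the `--- TRUNCATED OUTPUT ---` marker, the head-style truncation, and the use of a shortened SHA-256 digest as the fingerprint are all assumptions made for illustration; the skill's actual format may differ.

```python
import hashlib


def truncate_with_disclosure(output: str, command: str, keep: int = 20) -> str:
    """Return output unchanged if short, else its head plus a disclosure block."""
    lines = output.splitlines()
    if len(lines) <= keep:
        return output

    # Fingerprint the full, untruncated output so it can be re-identified later.
    # (Assumption: a shortened SHA-256 hex digest serves as the fingerprint.)
    fingerprint = hashlib.sha256(output.encode("utf-8")).hexdigest()[:16]

    disclosure = [
        "--- TRUNCATED OUTPUT ---",
        f"total_lines: {len(lines)}",
        f"method: first {keep} lines",
        f"fingerprint: sha256:{fingerprint}",
        f"command: {command}",
    ]
    return "\n".join(lines[:keep] + disclosure)
```

Fingerprinting the full output rather than the shown excerpt means two investigators can confirm they are looking at the same underlying signal even when each sees only the truncated form.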

Repository Stats

Stars: 40
Forks: 7
Open Issues: 469
Language: Python
Default Branch: main
Sync Status: Idle
Last Synced: May 3, 2026, 05:04 AM