shift-right-testing

Implement production reliability with feature flags, canary deployments, synthetic monitoring, and chaos engineering. Use for progressive delivery and data-driven quality loops.

Introduction

The Shift-Right Testing skill provides a framework for testing and validating system resilience directly in production environments. It is aimed at DevOps engineers, site reliability engineers (SREs), and quality assurance professionals, and supports the move from 'testing before release' to 'continuous validation after deployment.' By combining production data with automated progressive delivery patterns and proactive observability, it is meant to let teams ship faster while keeping the risk of each release small.

  • Progressive Rollout Orchestration: Automate traffic management using feature flag patterns (1% to 100%) and canary deployment strategies to minimize blast radius during new code releases.

  • Proactive Production Monitoring: Integrate synthetic monitoring for 24/7 endpoint validation and real user monitoring (RUM) to capture actual user experience metrics and latency data.

  • Resilience Validation: Use chaos engineering tasks to inject controlled failures, such as network latency or dependency outages, and verify system stability and automated recovery.

  • Incident-to-Test Conversion: Turn production incidents into regression test cases, so that an issue resolved in production stays covered by the pre-production test suites.

  • Fleet-Wide Agent Coordination: Automatically spawn domain-specific agents (qe-production-intelligence, qe-chaos-engineer, qe-performance-tester) to handle complex production tasks in parallel.

  • Inputs include feature flag identifiers, deployment manifests (e.g., Flagger/Kubernetes configs), and target SLO metrics like p95 latency, error rates, and Apdex scores.

  • Outputs typically consist of deployment health summaries, chaos experiment results, incident replay logs, and automated rollback triggers.

  • Best practice: treat production as the final testing ground, and always keep instant rollback available via feature flags before starting any traffic shift.

  • Monitor the suggested memory namespaces (aqe/shift-right/*) to track canary results, synthetic test configurations, and the status of ongoing chaos experiments.
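
The progressive rollout and rollback ideas in the list above can be sketched roughly as follows. This is a minimal illustration, not the skill's actual API: every name here (`bucketFor`, `isEnabled`, `evaluateStep`, `nextPercent`, `ROLLOUT_STEPS`) is hypothetical, and the intermediate steps between 1% and 100% as well as the minimum sample count are assumptions chosen for the example.

```typescript
// Hypothetical sketch: percentage-based feature flag rollout with an SLO gate.
// None of these names come from the skill itself.

type Decision = "promote" | "hold" | "rollback";

interface SloTargets {
  maxP95LatencyMs: number;
  maxErrorRate: number; // fraction in [0, 1]
  minApdex: number;     // score in [0, 1]
}

interface CanaryMetrics {
  p95LatencyMs: number;
  errorRate: number;
  apdex: number;
  sampleCount: number;
}

// FNV-1a 32-bit hash, used to give each user a stable bucket in [0, 100).
function bucketFor(flagKey: string, userId: string): number {
  let hash = 0x811c9dc5;
  const input = `${flagKey}:${userId}`;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash % 100;
}

// A user is in the rollout when their bucket falls below the current percentage.
// Because the bucket is stable, raising the percentage only ever adds users.
function isEnabled(flagKey: string, userId: string, rolloutPercent: number): boolean {
  return bucketFor(flagKey, userId) < rolloutPercent;
}

// Assumed step schedule; the source only states the 1% and 100% endpoints.
const ROLLOUT_STEPS = [1, 5, 25, 50, 100];

// Compare canary metrics against the SLO targets and decide what to do next.
function evaluateStep(metrics: CanaryMetrics, slo: SloTargets, minSamples: number): Decision {
  if (metrics.sampleCount < minSamples) return "hold"; // not enough data yet
  const healthy =
    metrics.p95LatencyMs <= slo.maxP95LatencyMs &&
    metrics.errorRate <= slo.maxErrorRate &&
    metrics.apdex >= slo.minApdex;
  return healthy ? "promote" : "rollback";
}

// Rollback drops traffic to 0%; promote moves to the next configured step.
function nextPercent(current: number, decision: Decision): number {
  if (decision === "rollback") return 0;
  if (decision === "hold") return current;
  const next = ROLLOUT_STEPS.find((step) => step > current);
  return next === undefined ? 100 : next;
}
```

In a setup like this, setting the percentage to 0 acts as the instant rollback: no redeploy is needed, because `isEnabled` returns false for every user as soon as the flag value changes.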
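
The SLO metrics named in the inputs above (p95 latency, Apdex) can be derived from raw latency samples. The sketch below is illustrative only: the function names `percentile` and `apdex` are hypothetical, the percentile uses the nearest-rank method (one of several common definitions), and Apdex follows the standard formula of satisfied requests plus half of the tolerating requests, divided by the total, with the tolerating band running up to four times the threshold.

```typescript
// Hypothetical helpers for turning latency samples into SLO metrics.

// Nearest-rank percentile: the smallest sample such that at least p% of
// all samples are less than or equal to it.
function percentile(samplesMs: number[], p: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  const sorted = [...samplesMs].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  return sorted[Math.max(rank, 1) - 1];
}

// Apdex = (satisfied + tolerating / 2) / total, where a request is
// "satisfied" at or below thresholdMs and "tolerating" up to 4x thresholdMs.
function apdex(samplesMs: number[], thresholdMs: number): number {
  if (samplesMs.length === 0) throw new Error("no samples");
  let satisfied = 0;
  let tolerating = 0;
  for (const ms of samplesMs) {
    if (ms <= thresholdMs) satisfied++;
    else if (ms <= 4 * thresholdMs) tolerating++;
  }
  return (satisfied + tolerating / 2) / samplesMs.length;
}
```

For example, with samples of 100, 600, 3000, and 400 ms and a 500 ms threshold, two requests are satisfied, one is tolerating, and one is frustrated, which gives an Apdex of (2 + 0.5) / 4 = 0.625.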

Repository Stats

Stars: 329
Forks: 65
Open Issues: 4
Language: TypeScript
Default Branch: main
Sync Status: Idle
Last Synced: Apr 28, 2026, 12:34 PM