shift-right-testing
Production-grade testing strategy implementing feature flags, canary releases, synthetic monitoring, and chaos engineering for continuous reliability in live environments.
Build and orchestrate end-to-end MLOps pipelines covering data preparation, training, validation, and automated deployment.
Tools for deploying, managing, and monitoring DataRobot models, including prediction environment configuration, champion/challenger workflows, and deployment operations.
Production-ready reinforcement learning using Stable Baselines3. Train agents, design custom environments, implement training callbacks, and optimize workflows with a scikit-learn-style API.
Connect your AI agent to the Hugging Face Hub via MCP. Search models, datasets, and papers, manage repos, run cloud compute jobs, and invoke Gradio Spaces as functional AI tools.
PyTorch Lightning skill for scalable deep learning: automates model training, multi-GPU orchestration, data pipelines, and distributed training strategies like DDP, FSDP, and DeepSpeed.
Data engineering skill for building scalable data pipelines, ETL/ELT systems, and modern data infrastructure using Python, Spark, dbt, and Kafka.
A framework to transform experimental ML prototypes into robust, production-ready Python packages using src layout, hybrid architecture, and strict configuration management.
End-to-end autonomous research agent: from idea generation and literature review to experiment execution, adversarial review loops, and paper writing.
Provides resiliency, health monitoring, and fault tolerance utilities for NVIDIA GPU-accelerated distributed applications, including process management and API key handling.
Expert guidance for configuring FeatBit observability via OpenTelemetry. Use for setting up metrics, logs, and traces, and for connecting OTEL backends such as Seq, Jaeger, or Prometheus to monitor the FeatBit backend.
Train and manage neural networks in distributed E2B sandboxes using the Flow Nexus platform, supporting custom architectures like Transformers, LSTMs, and GANs.