ops-devops-platform
DevOps and platform engineering patterns: Kubernetes, Terraform, GitOps, CI/CD, observability, incident response, and cloud-native ops.
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
155 skills found
Gate 2 development cycle skill that validates observability implementation, including structured logging, OpenTelemetry tracing, and instrumentation coverage, without modifying code.
Production-grade observability stack featuring Prometheus metrics, Grafana dashboards, PromQL queries, alerting rules, and AI-powered anomaly detection for cloud-native applications.
Production-grade testing strategy implementing feature flags, canary releases, synthetic monitoring, and chaos engineering for continuous reliability in live environments.
Master KPI dashboard design with proven metrics frameworks, SMART goals, and hierarchy patterns to drive business performance from executive insights to operational monitoring.
Systematic performance engineering: baseline measurement, profiling, bottleneck diagnosis, and evidence-based optimization guidance for high-performance applications.
Expert guidance for configuring FeatBit observability via OpenTelemetry. Use for setting up metrics, logs, traces, and connecting OTEL backends like Seq, Jaeger, or Prometheus for FeatBit backend monitoring.
Advanced multi-language debugging support with stack trace analysis, runtime error triage, and automated diagnostic tools for containerized and distributed systems.
Senior-level data engineering skill for building scalable data pipelines, ETL/ELT systems, and modern data infrastructure using Python, Spark, dbt, and Kafka.
A project-specific template skill for maintaining architectural consistency, coding standards, and deployment workflows in AI-powered full-stack applications.
Master cross-language error handling patterns: exceptions, Result types, and graceful degradation for resilient application development.
A guide for building high-quality MCP (Model Context Protocol) servers in Python or TypeScript to integrate external APIs and services into LLM workflows.