senior-data-engineer
Senior data engineering skill for building scalable data pipelines, ETL/ELT systems, and modern data infrastructure using Python, Spark, dbt, and Kafka.
Introduction
The Senior Data Engineer skill provides advanced expertise for designing, deploying, and maintaining production-grade data systems and AI/ML infrastructure. It is aimed at senior data engineers, architects, and MLOps professionals who manage complex, high-throughput environments and must ensure data quality, security, and scalability. It helps users automate data workflows and implement robust architectural patterns.
- Advanced data pipeline orchestration using Airflow and custom Python scripts for reliable execution.
- Comprehensive performance optimization techniques for ETL/ELT workflows to minimize latency and cloud infrastructure costs.
- Expertise in distributed computing frameworks including Spark and Kafka for real-time processing and batch data ingestion.
- Implementation of data governance, quality validation frameworks, and DataOps best practices to maintain pipeline integrity.
- Support for modern data stack components including dbt for transformation, and databases like BigQuery, Snowflake, and PostgreSQL.
- MLOps integration capabilities for model deployment, feature store management, and real-time inference monitoring using Prometheus and MLflow.
- Use this skill when initiating new data architecture projects or refactoring legacy pipelines to meet modern performance targets (P50 < 50ms).
- Provide input in the form of raw data configurations, SQL schema definitions, or performance bottlenecks, and receive structured pipeline scripts or optimization strategies as output.
- Ensure all deployments adhere to security and compliance standards, including PII handling and encryption protocols.
- Adhere to test-driven development (TDD) and CI/CD best practices when executing infrastructure changes to ensure high availability and minimal error rates.
- Leverage the included reference documentation to align team practices with industry-standard patterns for system design and scalability.
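
The P50 < 50ms target mentioned above can be checked with a few lines of Python. The following is a minimal sketch, not part of the skill itself: the helper names (`p50_ms`, `meets_p50_target`, `time_calls`) and the idea of timing a pipeline step in-process are assumptions made for illustration, and the source does not say which operation the target applies to.

```python
import statistics
import time
from typing import Callable, Iterable, List

# Target quoted in the list above; what it measures is an assumption here
# (per-call latency of a pipeline step, in milliseconds).
P50_TARGET_MS = 50.0


def p50_ms(latencies_ms: Iterable[float]) -> float:
    """Return the median (P50) of latency samples given in milliseconds."""
    samples = list(latencies_ms)
    if not samples:
        raise ValueError("at least one latency sample is required")
    return statistics.median(samples)


def meets_p50_target(latencies_ms: Iterable[float],
                     target_ms: float = P50_TARGET_MS) -> bool:
    """True when the median latency is strictly below the target."""
    return p50_ms(latencies_ms) < target_ms


def time_calls(step: Callable[[], object], runs: int = 20) -> List[float]:
    """Call a (hypothetical) pipeline step `runs` times; return per-call latency in ms."""
    latencies = []
    for _ in range(runs):
        start = time.perf_counter()
        step()
        latencies.append((time.perf_counter() - start) * 1000.0)
    return latencies
```

A check such as `meets_p50_target(time_calls(my_step))`, where `my_step` is a placeholder for the code under test, could run as a CI gate. The median is robust to a few slow outliers, so tail percentiles (P95, P99) would need a separate check.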
Repository Stats
- Stars: 16
- Forks: 6
- Open Issues: 1
- Language: Python
- Default Branch: main
- Sync Status: Idle
- Last Synced: May 3, 2026, 05:55 AM