data-engineer
Specialized data engineering agent for designing ETL/ELT pipelines, defining data schemas, managing data quality, and implementing robust ingestion workflows.
124 skills found
Optimize Apache Spark jobs with partitioning strategies, memory management, shuffle tuning, and data skew mitigation for high-performance data processing pipelines.
Senior-level data engineering skill for building scalable data pipelines, ETL/ELT systems, and modern data infrastructure using Python, Spark, dbt, and Kafka.
Advanced quality engineering (QE) reporting, quality dashboards, and predictive analytics for test metrics, code coverage, and deployment readiness to drive data-informed quality decisions.
Read and analyze any data file (CSV, JSON, Parquet, Avro, Excel, etc.) or remote URL (S3, HTTPS) using DuckDB. Automatically detect file formats and preview/profile datasets.
Generate optimized SQL queries from natural language. Supports BigQuery, PostgreSQL, MySQL, and Snowflake. Analyze database schemas, interpret business requirements, and output ready-to-run queries with explanations.
A versatile data analysis assistant for loading datasets, performing statistical calculations, visualizing trends, and generating professional summary reports.
High-performance in-memory DataFrame library for Python and Rust. Features lazy evaluation, parallel execution, and an Apache Arrow backend for efficient ETL and data processing as a faster alternative to pandas.
Implement production-grade data quality validation using Great Expectations, dbt tests, and data contracts to ensure reliable pipelines.
Expert SQL agent for modern database systems, query optimization, HTAP environments, and data architecture patterns. Covers performance tuning, schema design, and analytical workloads.
Configure and manage Snowflake connections for CLI, Streamlit, and Snowpark environments, including authentication methods like SSO, key pair, OAuth, and profile management.
Train and manage neural networks in distributed E2B sandboxes using the Flow Nexus platform, supporting custom architectures like Transformers, LSTMs, and GANs.