data-engineer
Specialized data engineering agent for designing ETL/ELT pipelines, defining data schemas, managing data quality, and implementing robust ingestion workflows.
120 skills found
High-performance document intelligence library for extracting text, tables, code, and metadata from 91+ file formats, with OCR and LLM-ready output.
Comprehensive biosignal processing toolkit for ECG, EEG, EDA, RSP, PPG, EMG, and EOG signal analysis, enabling psychophysiology research and multi-modal integration.
Classical machine learning with scikit-learn. Use for classification, regression, clustering, dimensionality reduction, preprocessing, model evaluation, and building robust ML pipelines in Python.
Optimize Apache Spark jobs with partitioning strategies, memory management, shuffle tuning, and data skew mitigation for high-performance data processing pipelines.
Process massive files and large codebases (10M+ tokens) by recursively chunking, sub-querying, and aggregating results to overcome LLM context limits.
A multi-paradigm ETL pipeline agent supporting batch and streaming data processing, schema inference, and configurable DAG-based transformations for heterogeneous data sources.
Implement production-grade data quality validation using Great Expectations, dbt tests, and data contracts to ensure reliable pipelines.
Control and monitor Xiaomi Mijia smart home devices, with support for status switching, device discovery, automation scenes, and environmental statistics.
Guided statistical analysis with test selection, assumption checking, power analysis, and APA-formatted reporting for academic and experimental research.
Generate optimized SQL queries from natural language. Supports BigQuery, PostgreSQL, MySQL, and Snowflake. Analyze database schemas, interpret business requirements, and output ready-to-run queries with explanations.
Statistical visualization library for Python. Create publication-quality graphics like box plots, heatmaps, and violin plots with pandas integration and automatic statistical estimation.