training-data-curation
Guidelines for curating high-quality datasets for LLM post-training (SFT/DPO/RLHF), covering data formats, quality filtering, and collection strategies.
A multi-paradigm ETL pipeline agent supporting batch and streaming data processing, schema inference, and configurable DAG-based transformations for heterogeneous data sources.
Strategic test data generation, management, and privacy compliance for scalable, secure, and realistic quality engineering workflows.
Access AI-ready datasets, benchmarks, and molecular oracles for drug discovery, including ADME, toxicity, DTI, and molecular generation tasks.
Data Analysis Specialist for EDA, statistical modeling, SQL queries, and Python-based visualization. Turn raw datasets into actionable insights through rigorous quantitative methods.
High-performance document intelligence library for extracting text, tables, code, and metadata from 91+ file formats, with OCR and LLM-ready output.
Classical machine learning with scikit-learn. Use for classification, regression, clustering, dimensionality reduction, preprocessing, model evaluation, and building robust ML pipelines in Python.
Enhance image quality, resolution, and sharpness for screenshots and digital media. Perfect for professional documentation, blogs, and presentations.
Generate high-quality visual content, characters, and scenes using structured JSON prompts and automated Python execution for guided image synthesis.
Generate and edit images, diagrams, and infographics using Google's Gemini 3 Pro model. Supports text-to-image, style transformation, and data-accurate visual creation.
Generates data cleaning pipelines for pandas/polars/PySpark, handling missing values, duplicates, outliers, type conversions, and validation.
Process and manipulate images using ImageMagick. Supports resizing, format conversion, batch processing, and retrieving image metadata for developers and automated workflows.