training-data-curation
Guidelines for curating high-quality datasets for LLM post-training (SFT/DPO/RLHF), covering data formats, quality filtering, and collection strategies.
Analyze local system hardware (RAM, CPU, GPU/VRAM) to receive expert recommendations for optimized local LLM models, quantization settings, and performance estimates.
A framework for building modular, reusable agent skills. Provides guidelines for structuring SKILL.md, bundled scripts, references, and assets to extend Claude's capabilities.
A framework for software teams and AI agents to prevent feature creep, enforce scope discipline, and ship focused MVPs by applying strict validation, backlog hygiene, and clear decision-making processes.
Preprocessing and cleaning astronomical light curves using Lightkurve. Tools for outlier removal, flattening, detrending, and quality flag handling for time-series analysis.
Classical machine learning with scikit-learn. Use for classification, regression, clustering, dimensionality reduction, preprocessing, model evaluation, and building robust ML pipelines in Python.
A versatile data analysis assistant for loading datasets, performing statistical calculations, visualizing trends, and generating professional summary reports.
Systematic methodology for reproducing published academic papers using provided data, including sample selection, statistical verification, and automated reporting.
Statistical modeling and econometrics library for Python. Performs OLS, GLM, mixed models, ARIMA, diagnostics, and inference for rigorous scientific analysis.
Explains complex concepts using established teaching frameworks such as the Feynman technique, Socratic questioning, and cognitive load theory to ensure deep, clear understanding.
Classify and group meteorological and environmental variables into specific driver categories for consistent attribution analysis and environmental modeling.
Bayesian modeling and probabilistic programming with PyMC. Build hierarchical models, perform MCMC sampling (NUTS), variational inference, and conduct rigorous model comparison using LOO and WAIC.