debug-distributed
Debugging guide for AReaL distributed training issues, including hangs, NCCL errors, OOM, and numerical consistency in FSDP2/TP/CP/EP.
End-to-end autonomous research agent: from idea generation and literature review to experiment execution, adversarial review loops, and paper writing.
Production-grade testing strategy implementing feature flags, canary releases, synthetic monitoring, and chaos engineering for continuous reliability in live environments.
Apply behavioral science, mental models, and psychological principles to marketing strategy, copywriting, and decision-making.
Access Y Combinator’s library of 443+ startup resources for expert advice on fundraising, co-founders, product development, growth, and scaling your business.
Self-modify your Milady agent by managing plugins. Edit code, rebuild, and restart the runtime to develop new capabilities or improve agent workflows locally.
A framework to transform experimental ML prototypes into robust, production-ready Python packages using src layout, hybrid architecture, and strict configuration management.
Monitor and manage a margin-living strategy by tracking balances, interest costs, and coverage ratios. Provides automated scaling recommendations and safety alerts based on portfolio-to-margin thresholds.
Evaluate code generation models using BigCode Evaluation Harness. Benchmarks include HumanEval, MBPP, and MultiPL-E with pass@k metrics for multi-language coding models.
Classify and group meteorological and environmental variables into specific driver categories for consistent attribution analysis and environmental modeling.
Evidence-first literature collector for automated research pipelines. Scales paper pools to 1200+ with metadata normalization, provenance tracking, and multi-source ingestion.
Epistemic safety analysis for JSON data in prompts to prevent LLM hallucinations and reasoning errors when handling incomplete or large-scale datasets.