debug-distributed
Debugging guide for AReaL distributed training issues, including hangs, NCCL errors, out-of-memory (OOM) failures, and numerical inconsistencies under FSDP2/TP/CP/EP parallelism.
Installation
Agent type: Claude Code
Install Command (macOS)
curl -fsSL "https://mentalok.io/api/v1/skills/debug-distributed/install?os=mac&agent=claude" | bash
Install Command (Windows)
curl -L "https://mentalok.io/api/v1/skills/debug-distributed/install?os=windows&agent=claude" -o install-debug-distributed.bat && install-debug-distributed.bat