massgen-develops-massgen
Development guide for self-improving MassGen via programmatic automation testing and visual UI/UX evaluation.
Introduction
The massgen-develops-massgen skill is a toolkit that AI agents use to develop, test, and refine the MassGen multi-agent framework iteratively. With it, developers and autonomous agents can evaluate both the backend coordination logic and the quality of the frontend terminal display. It provides two mutually exclusive workflows, each aimed at a specific improvement goal: agents either run programmatic experiments or analyze the terminal UI's layout, color rendering, and overall UX.
- Automation Mode runs experiments programmatically through CLI parameters, configuration files, and background monitoring scripts that track metrics such as token usage, coordination phase, and error rates.
- Visual Evaluation workflows cover the aesthetic and functional aspects of the terminal UI, checking that visual feedback, ANSI formatting, and dashboard layouts align with project standards.
- Supports multi-agent collaboration analysis using Gemini 2.5, Claude, and other supported models, with granular control over workflow concurrency and workspace isolation.
- Provides detailed monitoring by parsing the status.json file in real time, giving live telemetry on agent status, completion percentages, and voting outcomes.
- Offers background shell management for parallel processing, so agents can run independent monitors for error tracking, cost analysis, and coordination metrics without blocking the primary execution task.
- To use Automation Mode, invoke the skill with the required configuration YAML file and parse the generated LOG_DIR output to retrieve artifacts.
- For programmatic integration, use custom tools such as custom_tool__start_background_tool and the corresponding status polling endpoints to manage task lifecycles.
- Follow the defined exit code standards (0 for success, 1-4 for various failure modes) to interpret experiment results automatically.
- Maintain workspace isolation during parallel testing to prevent cross-contamination of log directories and state files.
- Consult the provided status.json reference to integrate custom monitors for token counts, cost management, and agent-specific error logs.
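As a rough illustration of the monitoring and exit-code points above, the sketch below reads a status.json file and classifies a process exit code. It is not taken from MassGen itself: the keys `phase`, `completion_percentage`, `agents`, and `error` are hypothetical placeholders, and the real field names should come from the skill's status.json reference. The exit-code helper relies only on the ranges stated above (0 for success, 1-4 for failure) and does not guess what each failure code means.

```python
import json
import tempfile
from pathlib import Path


def read_status(path):
    """Parse a status.json file; return None if it is missing or only partly written."""
    try:
        return json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        return None


def summarize_status(status):
    """Build a one-line summary. All key names here are hypothetical."""
    phase = status.get("phase", "unknown")
    percent = status.get("completion_percentage", 0)
    agents = status.get("agents", {})
    with_errors = sorted(name for name, info in agents.items() if info.get("error"))
    return f"phase={phase} done={percent}% agents={len(agents)} errors={with_errors}"


def describe_exit_code(code):
    """Map an exit code to a coarse outcome using only the ranges the guide states."""
    if code == 0:
        return "success"
    if 1 <= code <= 4:
        return f"failure (exit code {code})"
    return f"unexpected exit code {code}"


if __name__ == "__main__":
    # Write a made-up status file into an isolated temporary directory, then read it back.
    with tempfile.TemporaryDirectory() as log_dir:
        status_path = Path(log_dir) / "status.json"
        sample = {
            "phase": "voting",
            "completion_percentage": 75,
            "agents": {"agent_a": {"error": None}, "agent_b": {"error": "timeout"}},
        }
        status_path.write_text(json.dumps(sample), encoding="utf-8")
        print(summarize_status(read_status(status_path)))
        print(describe_exit_code(0))
```

Returning None for a missing or half-written file lets a background monitor retry on its next poll instead of crashing while the run is still writing. Giving each parallel run its own directory, as the temporary directory does here, is one simple way to keep log and state files from mixing.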
Repository Stats
- Stars: 969
- Forks: 151
- Open Issues: 6
- Language: Python
- Default Branch: main
- Sync Status: Idle
- Last Synced: Apr 29, 2026, 12:21 PM