
research-pipeline

End-to-end autonomous research agent: from idea generation and literature review to experiment execution, adversarial review loops, and paper writing.

Introduction

The research-pipeline skill provides a comprehensive, end-to-end autonomous framework for researchers and engineers conducting ML research. It orchestrates a complete lifecycle that begins with idea discovery—leveraging literature surveys and novelty checks—and progresses through implementation, multi-seed experiment execution, adversarial auto-review loops, and final paper drafting. By integrating cross-model collaboration, it helps avoid the blind spots and local optima common in single-model research workflows.

  • Full cycle orchestration: Automates Workflow 1 (idea discovery), Workflow 2 (experimentation and recursive review), and Workflow 3 (PDF/LaTeX paper generation).

  • Adversarial review mechanism: Uses multi-agent setups (e.g., a Codex executor with Claude/Gemini reviewers) to ensure critical feedback and experimental rigor.

  • Flexible configuration: Supports fine-grained controls such as AUTO_PROCEED for human-in-the-loop validation, REVIEWER_DIFFICULTY levels ranging from standard to adversarial, and target venues like ICLR or NeurIPS.

  • Persistent research memory: Maintains a research wiki and experiment bridge to track findings, claims, and relationships across complex project milestones.

  • Workflow adaptability: Works within popular agentic IDEs like Claude Code, Cursor, and Trae, or as a standalone CLI application, without rigid framework or database dependencies.
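The adversarial review mechanism above can be sketched as a simple executor/reviewer loop. This is a minimal illustration of the pattern only — the function names (`run_experiment`, `review`, `adversarial_loop`) and the `MAX_ROUNDS` cap are hypothetical stand-ins, not the skill's actual API; in the real pipeline the executor and reviewer calls would be backed by the respective model agents.

```python
# Minimal sketch of an adversarial executor/reviewer loop.
# All names here are illustrative stand-ins for the skill's real agents.

MAX_ROUNDS = 3  # illustrative cap on review iterations

def run_experiment(plan: str) -> str:
    """Stand-in executor: the real skill would dispatch a Codex-style agent."""
    return f"results for: {plan}"

def review(results: str, difficulty: str = "adversarial") -> list[str]:
    """Stand-in reviewer: the real skill would prompt Claude/Gemini critics.

    Returns a list of concrete objections, or [] to accept the results.
    """
    return [] if "revised" in results else ["baseline comparison missing"]

def adversarial_loop(plan: str) -> str:
    results = run_experiment(plan)
    for _ in range(MAX_ROUNDS):
        objections = review(results)
        if not objections:
            break  # reviewer accepted the results
        # Executor revises the plan to address each objection, then reruns.
        plan = plan + " | revised to address: " + "; ".join(objections)
        results = run_experiment(plan)
    return results

print(adversarial_loop("compare optimizer schedules"))
```

The loop terminates either when the reviewer raises no objections or when the round budget is exhausted, which is the same accept-or-iterate shape the recursive review workflow describes.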

  • Use this skill when you need to move from a broad research direction to a polished, submission-ready PDF autonomously.

  • It accepts research topics as arguments; define constraints (e.g., venue, difficulty) via argument overrides or edit the local SKILL.md constants.

  • Monitor outputs in directories such as idea-stage/ and figures/ai_generated/ to track progress.

  • Configure experimental batch sizes using the /experiment-queue scheduler for large-scale sweeps.

  • Leverage human checkpoints (HUMAN_CHECKPOINT=true) for iterative refinement during adversarial stages.
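Putting the constants mentioned above together, a SKILL.md override section might look like the following. The exact file layout is an assumption — only the constant names (AUTO_PROCEED, REVIEWER_DIFFICULTY, HUMAN_CHECKPOINT) and the venue targets come from the feature list; the values shown are examples, not defaults.

```
# Illustrative SKILL.md constants (values are examples, not defaults)
AUTO_PROCEED=false            # pause for human-in-the-loop validation
REVIEWER_DIFFICULTY=adversarial  # standard ... adversarial
TARGET_VENUE=ICLR             # e.g., ICLR or NeurIPS
HUMAN_CHECKPOINT=true         # enable checkpoints during adversarial stages
```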

Repository Stats

Stars: 7,817
Forks: 729
Open Issues: 53
Language: Python
Default Branch: main