autoresearch
Autonomous multi-team codebase improvement agent with specialized modes: narrow (goal-directed), broad (hypothesis-divergent), and sweep (quality-focused).
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
Get deep, critical, NeurIPS/ICML-style peer reviews of your research, paper drafts, and experimental setups using external LLMs via Codex MCP.
Evaluate code generation models using BigCode Evaluation Harness. Benchmarks include HumanEval, MBPP, and MultiPL-E with pass@k metrics for multi-language coding models.
Expert automated code review for Go CLI applications, focusing on Cobra/urfave patterns, security, performance, idiomatic Go, and robust error handling.
Six-phase, read-only Python analysis workflow that identifies design principle violations, code smells, and modernization opportunities, tailored to the project type (from POC to open source).
Official evaluation framework for AI agent sessions, implementing Evaluation-Driven Development (EDD) principles to ensure reliability.
Framework for orchestrating long-running agentic tasks, evidence-based delivery, and automated QA gates following Simon Willison's iterative loop.
A structured file-based system for tracking todos, managing technical debt, and coordinating code review workflows directly within your repository.
Automates the creation and maintenance of CLAUDE.md files. It monitors codebase evolution and keeps project memory in sync with file changes, structure, and build commands.
Audit and synchronize the supported LLM model list in assets.py against the authoritative litellm registry.
Architectural governance and project standards for React 19 SPA development, ensuring consistency in stack integration, project structure, and agent execution rules.
Fetches expert perspectives from OpenAI Codex and Google Gemini for architecture, code reviews, and debugging, with transparent LLM synthesis.