eval
Evaluate Deca agent prompts and behavioral consistency through automated test runners, manual LLM judgment, and structured reporting.
Guide for implementing a new AI coding agent analyzer in Splitrail to track token usage, costs, and performance metrics.
Synchronize project documentation with code. Maintain feature specs, API contracts, and READMEs using init-project standards to ensure traceability and completeness.
Normalizes testing defect logs by correcting typos, abbreviations, and ambiguous descriptions based on product-specific codebooks and station validation.
Act as a skeptical technical recruiter to evaluate daily.dev Recruiter features. Review UI/UX, code, and workflows through the lens of a hiring platform built for high-quality developer-recruiter matching.
Manage AWS EC2 virtual machines, AMIs, and networking. Use for instance lifecycle management, security group configuration, key pair handling, and troubleshooting connectivity.
Optimize developer experience for multi-component solutions: standardize onboarding, inner-loop, debugging, and cross-platform setup to eliminate friction and tribal knowledge.
Manage Obsidian vault operations: file creation, YAML frontmatter, wiki-links, and templated note processing for PKM systems.
Activates Prometheus planning mode for structured requirement gathering, codebase research, and task planning within Claude Code.
Build professional, accessible, and responsive user interfaces using React, Next.js, and modern design systems like shadcn/ui. Focuses on developer tools, chat interfaces, and real-time streaming components.
Systematically trace code flows, locate implementations, diagnose performance issues, and map system architecture to understand complex codebases.
Generate finite-difference stencils, select optimal numerical schemes for PDEs/ODEs, and perform truncation error analysis to improve simulation accuracy.