Agent Skills Hub

Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.


137 skills found

Engineering, Data Analysis, Research

evaluation

Build systematic evaluation frameworks for AI agents using multi-dimensional rubrics, LLM-as-a-judge, and regression testing to measure performance, quality, and context engineering effectiveness.

Views: 2315,339
Engineering, Automation

eval

Evaluate Deca agent prompts and behavioral consistency through automated test runners, manual LLM judgment, and structured reporting.

Views: 171
Productivity, Content, Education

prompt-rewriter

Advanced prompt rewriting and optimization service. Analyzes prompts for clarity, specificity, and structure, providing actionable improvements, variations for testing, and prompt engineering best practices.

Views: 204,453
Engineering, Automation

eval-harness

Official evaluation framework for AI agent sessions, implementing Evaluation-Driven Development (EDD) principles to ensure reliability.

Views: 30169,888
Research, Education, Content

peer-review

Structured manuscript and grant review assistant utilizing checklist-based evaluation for methodology, statistical validity, and compliance with reporting standards like CONSORT and STROBE.

Views: 2719,688
Research, Education, Productivity

scholar-evaluation

Systematically evaluate scholarly work using the ScholarEval framework, providing structured, quantitative, and qualitative assessment across research quality dimensions with actionable feedback.

Views: 819,706
Engineering, Data Analysis, Automation

trulens-evaluation-workflow

A systematic workflow to instrument, evaluate, and monitor LLM applications using TruLens, supporting frameworks like LangChain, LangGraph, and LlamaIndex.

Views: 113,286
#trulens #llm #evaluation #workflow
Research, Content, Engineering

ai-writing-detection

Comprehensive AI-generated text detection framework. Features multi-layer analysis of vocabulary, structural patterns, model-specific fingerprints, and technical metadata artifacts to identify AI authorship.

Views: 121,108
Engineering, Productivity

context-compression

Optimize agent performance and token usage through advanced context compression, structured summarization, and task-oriented state management for long-running sessions.

Views: 19
Engineering, Research

evaluating-code-models

Evaluate code generation models using BigCode Evaluation Harness. Benchmarks include HumanEval, MBPP, and MultiPL-E with pass@k metrics for multi-language coding models.

Views: 197,624
#Evaluation #Code Generation #HumanEval #MBPP
Productivity, Engineering, Data Analysis, Content, Research

ai-multimodal

Process and generate multimedia with Google Gemini. Analyze audio, images, videos, and PDFs using large context windows. Supports transcription, visual QA, OCR, and AI-driven image creation.

Views: 149
Engineering, Productivity

ai-collaboration-standards

Prevents AI hallucination and ensures evidence-based, verifiable outputs when analyzing code, reviewing technical documents, or providing recommendations.

Views: 2744