trulens-evaluation-workflow
A systematic workflow to instrument, evaluate, and monitor LLM applications using TruLens, supporting frameworks like LangChain, LangGraph, and LlamaIndex.
Standardize your internal communications using company-specific guidelines and templates for reports, newsletters, and project updates.
Multi-perspective AI consultation for technical architecture, complex refactoring, and structured debugging.
Project bootstrap for Claude Code with safety guardrails, git workflow automation, project auditing, and structured multi-phase planning.
A suite of professional tools for auditing, evaluating, chunking, and scaffolding production-ready RAG pipelines within Claude Code.
Guide for implementing a new AI coding agent analyzer in Splitrail to track token usage, costs, and performance metrics.
Provision and manage Railway database services (Postgres, Redis, MySQL, MongoDB) with automated configuration and environment wiring.
Build systematic evaluation frameworks for AI agents using multi-dimensional rubrics, LLM-as-a-judge, and regression testing to measure performance, quality, and context engineering effectiveness.
A structured repository of Agent Skills for context engineering, multi-agent architectures, and production-grade agent system optimization.
A friendly welcome skill that displays operating system details in ASCII art when triggered by casual greetings like 'hello' or 'hi'.
Stress-test existing product feature ideas by identifying risky assumptions across Value, Usability, Viability, and Feasibility using a multi-perspective devil's advocate framework.
A perspective engineering engine that researches, extracts mental models, and generates runnable persona skills based on deep expression DNA analysis.