trulens-evaluation-workflow
A systematic workflow to instrument, evaluate, and monitor LLM applications using TruLens, supporting frameworks like LangChain, LangGraph, and LlamaIndex.
Detects timing side-channel vulnerabilities in cryptographic code through static and dynamic analysis across multiple programming languages.
Production-grade observability stack featuring Prometheus metrics, Grafana dashboards, PromQL queries, alerting rules, and AI-powered anomaly detection for cloud-native applications.
Implement Linkerd service mesh patterns for security, traffic policy management, and zero-trust networking in Kubernetes environments.
Symbol-level code understanding and navigation agent toolkit using LSP for precise code analysis, reference tracking, and surgical refactoring across 30+ programming languages.
A rigorous, four-phase methodology to enforce systematic root cause analysis before applying any code fixes.
Parallel task orchestration CLI for AI workers using isolated git workspaces.
Gate 2 development cycle skill that validates observability implementation, including structured logging, OpenTelemetry tracing, and instrumentation coverage, without modifying code.
Diagnose and debug Agent-to-Agent (A2A) communication, including orchestrator routing, transport connectivity, agent status, and log analysis for multi-agent systems.
Language-agnostic debugging framework: scientific method, stack trace analysis, logging strategies, and advanced techniques like Git bisect and rubber ducking.
Automated runtime observability changelog for Claude Code development sessions, tracking file changes, test results, and git commits.
Analyze and debug fast-agent session histories, tool execution logs, and conversation timing to resolve performance bottlenecks, tool loops, and unexpected session terminations.