trulens-evaluation-workflow
A systematic workflow to instrument, evaluate, and monitor LLM applications using TruLens, supporting frameworks like LangChain, LangGraph, and LlamaIndex.
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
Comprehensive mobile testing for iOS and Android, covering gestures, sensors, permissions, device fragmentation, and performance across 1000+ real and virtual devices.
A systematic code auditing framework for identifying technical debt, security vulnerabilities, dead code, and code quality issues in software projects.
Implement production-grade data quality validation using Great Expectations, dbt tests, and data contracts to ensure reliable pipelines.
Fast lookup for SaaS finance metrics, formulas, and benchmarks. Optimize your financial analysis with quick access to definitions, decision frameworks, and red-flag indicators.
Manual testing suite for JUCE audio plugins. Features automated test execution, pluginval validation (strictness 10), and structured DAW testing checklists for stability and quality assurance.
Test Adobe EDS blocks interactively in the browser with Jupyter notebooks. Features ES6 imports, overlay previews, responsive device testing, and zero-dependency execution.
Automated single-cell RNA-seq quality control pipeline following scverse best practices. Performs MAD-based outlier detection, cell filtering, and diagnostic visualization for .h5ad and .h5 datasets.
Tools for deploying, managing, and monitoring DataRobot models, including prediction environment configuration, champion/challenger workflows, and deployment operations.
Conduct automated code reviews for local changes or remote GitHub Pull Requests. It analyzes code for correctness, maintainability, and standards using git and gh CLI integration.
Production-ready Go development support: concurrency patterns, idiomatic error handling, interface design, testing with testify, and Go best practices for scalable backend services.
A testing fixture for validating AI agent skill configurations and detecting rule violations.