evaluation
Build systematic evaluation frameworks for AI agents using multi-dimensional rubrics, LLM-as-a-judge, and regression testing to measure performance, quality, and context engineering effectiveness.
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
Search, discover, and refine AI prompts using the prompts.chat library. Access thousands of community-curated prompts for ChatGPT, Claude, and other AI models.
Advanced visual regression testing with pixel-perfect and AI-powered diff analysis, cross-browser validation, and responsive design checks to prevent UI regressions in CI/CD pipelines.
Generate real-time, podcast-style AI audio narratives using Azure OpenAI's GPT Realtime Mini model with WebSocket streaming, complete with PCM-to-WAV conversion and frontend playback integration.
Generates a random lucky number between 0 and 9999 for games, decision-making, or entertainment.
Your collaborative writing partner for research, outlining, drafting, and feedback. Perfect for technical documentation, blog posts, and articles with proper citation management.
Extract and document authentic writing voice from samples. Create comprehensive voice guides for AI training, ghostwriting, and brand consistency.
Enhance image quality, resolution, and sharpness for screenshots and digital media. Perfect for professional documentation, blogs, and presentations.
A strategic marketing ideation engine for SaaS founders. Generate actionable, stage-appropriate growth strategies, content tactics, and promotional ideas tailored to your budget and specific product context.
Automate Discord server operations including message management, channel organization, and role assignments via MCP.
Implement the 'Engineering as Marketing' growth strategy: build free SEO-driven utility tools to drive organic traffic, capture leads, and convert visitors into customers without ad spend.
Fetches expert perspectives from OpenAI Codex and Google Gemini for architecture, code reviews, and debugging, with transparent LLM synthesis.