data-cleaning-pipeline-generator
Generates data cleaning pipelines for pandas/polars/PySpark, handling missing values, duplicates, outliers, type conversions, and validation.
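As a rough illustration of the kind of pipeline this description refers to, here is a minimal pandas sketch. It is not the skill's actual output: the function name `clean_numeric_column`, the `age` column, the median fill, and the 1.5 × IQR clipping rule are all assumptions chosen for the example.

```python
import pandas as pd


def clean_numeric_column(df: pd.DataFrame, column: str) -> pd.DataFrame:
    """Hypothetical cleaning pipeline for one numeric column (illustrative only)."""
    # Duplicates: drop rows that are exact copies of an earlier row.
    out = df.drop_duplicates().copy()

    # Type conversion: parse to numbers; unparseable values become NaN.
    out[column] = pd.to_numeric(out[column], errors="coerce")

    # Missing values: fill with the column median (an assumed strategy).
    out[column] = out[column].fillna(out[column].median())

    # Outliers: clip to the 1.5 * IQR fences (an assumed rule).
    q1, q3 = out[column].quantile([0.25, 0.75])
    iqr = q3 - q1
    out[column] = out[column].clip(lower=q1 - 1.5 * iqr, upper=q3 + 1.5 * iqr)

    # Validation: fail loudly if any missing values remain.
    if out[column].isna().any():
        raise ValueError(f"column {column!r} still contains missing values")

    return out.reset_index(drop=True)


# Made-up sample data: one duplicate row, two unparseable/missing ages, one outlier.
raw = pd.DataFrame(
    {
        "name": ["a", "b", "b", "c", "d", "e", "f"],
        "age": ["30", "41", "41", "35", "n/a", None, "500"],
    }
)
cleaned = clean_numeric_column(raw, "age")
```

The same steps map onto polars or PySpark with different method names; the order (deduplicate, convert, fill, clip, validate) is one reasonable choice, not the only one.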
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
Advanced web search and reasoning tool for OpenClaw agents. Features citation-heavy synthesis, multi-step reasoning, and live internet access via OpenRouter.
Free AI-powered web search via Exa MCP. Includes deep research, company/people lookup, and code context without API keys.
Generate high-quality visual content, characters, and scenes using structured JSON prompts and automated Python execution for guided image synthesis.
Universal CLI tool to convert and synchronize AI agent skills between Claude Code and Gemini CLI extensions.
Convert diverse file formats like PDFs, Office docs, images, audio, and web content into clean Markdown, specifically optimized for LLM ingestion, RAG pipelines, and automated text analysis workflows.
Fetch and parse transcripts from YouTube and Bilibili videos for summarization, QA, and content extraction using yt-dlp.
Extracts mathematical content like definitions, theorems, and proofs from documents (PDF, MD, TEX, TXT) using AI-based cleaning and conversion.
A unified interface for integrating and managing LLM chat providers like OpenAI, Anthropic, Google, Azure, and Bedrock within LangChain applications.
Open-source infrastructure for reliable, multi-destination event delivery. Route webhooks to HTTP, SQS, RabbitMQ, Pub/Sub, EventBridge, or Kafka with built-in retries and observability.
Pre-execution security guardrails for AI agents. Validates shell commands and file reads against 400+ security patterns to block destructive operations, credential theft, and unauthorized system access.
Proven patterns for extracting, caching, and processing analytics data from GA4 and GSC using MCP servers.