ocr
Discover reusable agent skills, browse implementation details, and find the right skill for your workflow.
136 skills found
Extract text from images using the Tesseract OCR engine, supporting multiple languages, image preprocessing, and various formats.
Enhance image quality, resolution, and sharpness for screenshots and digital media. Perfect for professional documentation, blogs, and presentations.
Generate Bilibili-compatible video chapter lists from SRT subtitle files with strict format validation.
Find, review, and remove duplicate or near-duplicate images in FiftyOne datasets using computer vision similarity embeddings.
Google Gemini Image Generation API interface for text-to-image, editing, style templates, and automated retry workflows.
Classical machine learning with scikit-learn. Use for classification, regression, clustering, dimensionality reduction, preprocessing, model evaluation, and building robust ML pipelines in Python.
A CLI tool for image compression and conversion, supporting batch processing, multiple engines (mozjpeg, pngquant, sharp, etc.), format conversion (WebP, AVIF), and recursive directory optimization.
Create and manage TikTok image carousels via the ViralBaby API. Automate image search, text overlays, and draft uploads for social media content creation.
Analyze geospatial data using GeoPandas with proper coordinate projections for accurate distance, filtering, and spatial relationship calculations.
Meta-skill for generating publication-ready scientific figures, multi-panel layouts, and journal-compliant visualizations using Python's matplotlib, seaborn, and plotly libraries.
Generate realistic virtual product try-on visualizations to help customers evaluate fit, drape, and scale before purchasing.
Generate and process 16-bit pixel art office assets for the Claude Office Visualizer using Nano Banana MCP and multi-pass ImageMagick workflows.