fiftyone-find-duplicates
Find, review, and remove duplicate or near-duplicate images in FiftyOne datasets using computer vision similarity embeddings.
A rigorous, four-phase methodology to enforce systematic root cause analysis before applying any code fixes.
Comprehensive citation management: search academic databases, extract metadata from DOIs/PMIDs/arXiv, validate references, and generate correctly formatted BibTeX for scientific manuscripts.
Specialized data engineering agent for designing ETL/ELT pipelines, defining data schemas, managing data quality, and implementing robust ingestion workflows.
Implement production-grade data quality validation using Great Expectations, dbt tests, and data contracts to ensure reliable pipelines.
A multi-paradigm ETL pipeline agent supporting batch and streaming data processing, schema inference, and configurable DAG-based transformations for heterogeneous data sources.
High-performance document intelligence library for extracting text, tables, code, and metadata from 91+ file formats, with OCR and LLM-ready output.
Proven patterns for extracting, caching, and processing analytics data from GA4 and GSC using MCP servers.
An Obsidian vault curator for identifying stub notes, detecting duplicates, fixing outdated information, and improving documentation quality in both English and Korean.
Normalizes testing defect logs by correcting typos, abbreviations, and ambiguous descriptions based on product-specific codebooks and station validation.
Python toolkit for mass spectrometry data processing. Enables spectral file importing (mzML, MGF, MSP), metadata harmonization, peak filtering, and calculating spectral similarity scores (cosine, modified cosine) for metabolomics.
AI-powered video editing agent for talking head videos, featuring speech-to-text, disfluency detection, and browser-based review workflows.