
ai-writing-detection

Comprehensive AI-generated text detection framework. Features multi-layer analysis of vocabulary, structural patterns, model-specific fingerprints, and technical metadata artifacts to identify AI authorship.

Introduction

This skill provides an expert-level knowledge base and analytical methodology for identifying AI-authored content across various large language models. It moves beyond simple heuristic checking by employing a multi-layered verification strategy that correlates linguistic markers with technical artifacts embedded in generated text. The skill is designed for researchers, content auditors, and technical writers who need to assess the provenance of digital information and maintain high standards of human-authored integrity.

  • Technical Artifact Scanning: Directly detects definitive markers like ChatGPT/GPT-4 output indicators (turn0search, oaicite, utm_source tracking), Grok-specific XML tags, and Markdown structural irregularities.

  • Linguistic Pattern Matching: Analyzes high-signal AI vocabulary (e.g., 'delve', 'tapestry', 'pivotal'), repetitive tricolon structures, negative parallelisms, and 'elegant variation' synonym cycling.

  • Model Fingerprinting: Identifies specific stylistic tendencies across major platforms, differentiating between the formal, cautious tone of Claude and the fact-dense, conversational synthesis characteristic of Gemini.

  • Multi-Layered Confidence Scoring: Evaluates text across nine distinct layers, including structural burstiness, formatting anomalies (Title Case overuse, inline-header lists), and citation integrity (checksum verification for ISBNs and format validation for DOIs, which carry no check digit).

  • False Positive Prevention: Provides guidance on distinguishing between human idiosyncratic writing and algorithmic patterns, ensuring balanced judgment.
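The artifact and vocabulary layers above lend themselves to a first-pass regex scan. A minimal sketch, with hypothetical pattern names; the Grok tag regex in particular is a placeholder assumption, since the specific tags are not enumerated here:

```python
import re

# Sketch of a technical-artifact scanner. Pattern names and the Grok tag
# regex are illustrative assumptions, not a definitive ruleset.
ARTIFACT_PATTERNS = {
    "chatgpt_turn_marker": re.compile(r"turn\d+search\d+"),
    "oai_citation": re.compile(r"oaicite"),
    "chatgpt_utm_tracking": re.compile(r"utm_source=chatgpt"),
    "grok_xml_tag": re.compile(r"<grok:[^>]+>"),  # placeholder pattern
}

# A tiny subset of the high-signal vocabulary named above.
HIGH_SIGNAL_VOCAB = {"delve", "tapestry", "pivotal"}

def scan_text(text: str) -> dict:
    """Return counts of technical artifacts and high-signal vocabulary hits."""
    artifacts = {name: len(p.findall(text)) for name, p in ARTIFACT_PATTERNS.items()}
    words = re.findall(r"[a-z]+", text.lower())
    vocab_hits = sorted(w for w in set(words) if w in HIGH_SIGNAL_VOCAB)
    return {"artifacts": artifacts, "vocab_hits": vocab_hits}

report = scan_text(
    "Let's delve into this rich tapestry "
    "(https://example.com?utm_source=chatgpt.com) turn0search4."
)
```

In practice a scan like this only feeds the confidence score; a single hit is a signal, not a verdict.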

  • Use this skill to audit documents, verify source reliability, or analyze the stylistic footprints of different LLM architectures.

  • Input typically consists of raw text strings; output provides a confidence-weighted analysis of potential AI-generated signals.

  • Always cross-reference multiple categories (Layers 1-9) before making a definitive claim of AI authorship, as isolated patterns also occur in human creative writing.

  • The methodology is optimized for detecting common pitfalls like 'importance puffery', 'Challenges and Future' template structures, and placeholder reference values.
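For the citation-integrity layer, ISBN-13 carries a real check digit that can expose placeholder reference values (DOIs do not, so they only admit format checks). A minimal sketch of the ISBN-13 verification:

```python
def isbn13_valid(isbn: str) -> bool:
    """Validate an ISBN-13 check digit: digits weighted 1,3,1,3,...
    must sum to a multiple of 10. Hyphens and spaces are ignored."""
    digits = [int(c) for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    return sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits)) % 10 == 0
```

A reference whose ISBN fails this check is a strong candidate for a hallucinated or placeholder citation, since a fabricated digit string passes only one time in ten by chance.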

Repository Stats

  • Stars: 1,108
  • Forks: 100
  • Open Issues: 4
  • Language: Python
  • Default Branch: main
  • Sync Status: Idle
  • Last Synced: May 1, 2026, 07:17 AM