EdgarTools

A comprehensive Python library for querying, parsing, and analyzing SEC EDGAR filings, financial statements, and institutional holdings as structured data objects.

Introduction

EdgarTools is an AI-native Python library that turns raw SEC EDGAR filing data into structured Python objects. It removes the need for manual web scraping, HTML parsing, and raw XML/XBRL handling, and provides a consistent API for financial professionals, researchers, and AI agents. It is optimized for fast data extraction, so users can go from a ticker or CIK lookup to detailed financial analysis in a few lines of code.
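As a rough illustration of the ticker-to-filing workflow described above, the sketch below looks up a company's most recent 10-K. It assumes the edgartools package is installed and importable as `edgar`, and that `set_identity`, `Company`, `get_filings(form=...)`, and `latest()` behave as in the project's published examples. The helper name `latest_annual_report` and the identity string are hypothetical, and the call needs network access to SEC EDGAR.

```python
def latest_annual_report(ticker: str, identity: str) -> dict:
    """Return basic metadata for a company's most recent 10-K.

    Hypothetical helper; it assumes the edgartools calls named in the
    lead-in exist with these signatures.
    """
    # Imported inside the function so this module still loads on machines
    # where edgartools is not installed.
    from edgar import Company, set_identity

    # Identify the requester to SEC EDGAR before any request is made.
    set_identity(identity)

    company = Company(ticker)  # assumption: a CIK is accepted here as well
    filing = company.get_filings(form="10-K").latest()
    if filing is None:
        raise LookupError(f"no 10-K filings found for {ticker!r}")

    return {
        "form": filing.form,
        "filing_date": str(filing.filing_date),
        "accession_number": filing.accession_no,
    }


# Example call (performs network requests):
# latest_annual_report("AAPL", "Jane Doe jane.doe@example.com")
```

Returning a plain dictionary keeps the example independent of the library's richer filing objects, which are the better starting point for actual analysis.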

  • Full support for 20+ filing types, including 10-K, 10-Q, 8-K, 13F, and Form 4 (insider transactions).

  • Advanced financial statement parsing that converts tables directly into pandas DataFrames for immediate analysis.

  • Native XBRL (eXtensible Business Reporting Language) support for cross-company comparison and low-level fact extraction.

  • Built-in MCP (Model Context Protocol) server for integration with LLM-based agents, allowing for autonomous reasoning over SEC filings.

  • High-performance HTML parsing engine that handles large documents efficiently with multi-strategy section detection.

  • Specialized modules for niche analysis, such as Business Development Companies (BDCs), institutional holdings, and insider ownership.

  • API discovery via the .docs interface on all objects, providing real-time assistance and method documentation within the development environment.

  • Users should call set_identity() with an email address to identify requests to the SEC EDGAR API.

  • The library is rate-limit aware; use its built-in caching to avoid repeated downloads and to stay within the SEC's request limits.

  • Inputs are typically company tickers, CIK numbers, or accession numbers; outputs are typed Python objects, DataFrames, or cleaned text.

  • Ideal for quantitative researchers needing historical financial data, auditors tracking insider trades, and developers building financial AI agents.

  • The tool leverages lxml and PyArrow for efficient processing, making it suitable for large-scale data harvesting or production-ready financial applications.
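The notes above on identifiers, rate limits, and caching can be made concrete with a small standard-library sketch. It is not EdgarTools' implementation: the names `normalize_cik`, `is_accession_number`, `Throttle`, and `make_cached_fetcher` are hypothetical, the `fetch` callable is a placeholder for whatever performs the download, and the default of 10 requests per second reflects the SEC's published fair-access guidance as an assumption you should verify.

```python
import re
import time
from functools import lru_cache

# Accession numbers look like 0000320193-23-000106 (10, 2, and 6 digits).
ACCESSION_RE = re.compile(r"^\d{10}-\d{2}-\d{6}$")


def normalize_cik(cik) -> str:
    """Zero-pad a CIK (int or str) to the 10-digit form used in EDGAR URLs."""
    digits = str(cik).strip().lstrip("0") or "0"
    if not (digits.isascii() and digits.isdigit()) or len(digits) > 10:
        raise ValueError(f"not a valid CIK: {cik!r}")
    return digits.zfill(10)


def is_accession_number(value: str) -> bool:
    """Check the dashed accession-number layout (format only, not existence)."""
    return bool(ACCESSION_RE.match(value.strip()))


class Throttle:
    """Space calls at least 1 / max_per_second seconds apart."""

    def __init__(self, max_per_second=10.0, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._next_allowed = 0.0

    def wait(self) -> float:
        """Sleep until the next call is allowed; return the delay applied."""
        now = self._clock()
        delay = max(0.0, self._next_allowed - now)
        if delay:
            self._sleep(delay)
        self._next_allowed = max(now, self._next_allowed) + self.min_interval
        return delay


def make_cached_fetcher(fetch, throttle, maxsize=256):
    """Wrap `fetch(url)` so repeated URLs are served from an in-memory cache.

    Only cache misses pass through the throttle and reach the network.
    """

    @lru_cache(maxsize=maxsize)
    def cached(url: str):
        throttle.wait()
        return fetch(url)

    return cached
```

Injecting the clock and sleep functions keeps the throttle testable without real delays. An in-memory `lru_cache` only helps within one process, so a long-running harvest would more likely want an on-disk cache.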

Repository Stats

Stars: 2,086
Forks: 355
Open Issues: 16
Language: Python
Default Branch: main
Sync Status: Idle
Last Synced: May 3, 2026, 09:34 PM