SEC EDGAR Scraper — Filings, Full-Text & XBRL Financials
Extract SEC EDGAR filing metadata, full-text 10-K/10-Q sections, EDGAR full-text search results, and XBRL financial facts as clean, RAG-ready JSON. No API key required.
Key Features
- Four modes in one actor: filing metadata, full document text, EDGAR full-text search, and XBRL financial facts
- RAG-ready chunking: section, paragraph (~2000 chars), or none; every chunk tagged with its source Item and order
- Automatic 'Item N' section parsing for 10-K/10-Q (Item 1A Risk Factors, Item 7 MD&A, and more)
- XBRL facts with taxonomy, tag, label, unit, value, fiscal year/period, form, and accession number
- Fact deduplication collapses XBRL restatements to one row per period, keeping the earliest disclosure
- Company resolution by ticker, CIK, or name against SEC's official ticker file, with fuzzy fallback
- EDGAR full-text search with quoted phrases, form-type, and date-range filters (2001 → present)
- SEC-compliant rate limiting and User-Agent; no API key and no login required
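The fact deduplication described above can be pictured with a short Python sketch. This is not the actor's internal code: the key names (`taxonomy`, `tag`, `unit`, `fiscalYear`, `fiscalPeriod`, `filedAt`) are assumed for illustration, and the sample values are made up.

```python
from typing import Iterable


def dedupe_facts(facts: Iterable[dict]) -> list[dict]:
    """Keep one row per (taxonomy, tag, unit, fiscal year, fiscal period),
    preferring the row with the earliest filing date."""
    earliest: dict[tuple, dict] = {}
    for fact in facts:
        # Hypothetical key names; adjust to the fields your dataset actually has.
        key = (
            fact["taxonomy"],
            fact["tag"],
            fact["unit"],
            fact["fiscalYear"],
            fact["fiscalPeriod"],
        )
        kept = earliest.get(key)
        # ISO dates (YYYY-MM-DD) sort correctly as plain strings.
        if kept is None or fact["filedAt"] < kept["filedAt"]:
            earliest[key] = fact
    return list(earliest.values())


# Made-up sample rows: the first two describe the same period, filed a year apart.
sample_facts = [
    {"taxonomy": "us-gaap", "tag": "Revenues", "unit": "USD",
     "fiscalYear": 2023, "fiscalPeriod": "FY", "filedAt": "2024-11-01", "value": 100},
    {"taxonomy": "us-gaap", "tag": "Revenues", "unit": "USD",
     "fiscalYear": 2023, "fiscalPeriod": "FY", "filedAt": "2023-11-03", "value": 101},
    {"taxonomy": "us-gaap", "tag": "Revenues", "unit": "USD",
     "fiscalYear": 2024, "fiscalPeriod": "FY", "filedAt": "2024-11-01", "value": 120},
]

deduped = dedupe_facts(sample_facts)
```

With the sample rows, the two FY 2023 rows collapse into the one filed on 2023-11-03, and the FY 2024 row is kept unchanged.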
Use Cases
- Financial RAG / LLM pipelines needing section-chunked, embedding-ready 10-K and 10-Q text
- Investment research — pull revenue, net income, and diluted EPS time series as clean XBRL rows
- Compliance & ownership monitoring — Form 4, SC 13D/13G, and risk-factor language across companies
- Fintech products that need structured EDGAR data without building a custom crawler
- Market and industry research via full-text search — who discloses a phrase, and since when
- BI and spreadsheet workflows consuming clean XBRL financial fact rows
Input Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| mode | string | Yes | Extraction mode: filings, fulltext, search, or facts (default: filings). |
| ticker | string | No | Stock ticker, e.g. AAPL. One of ticker/cik/companyName is required at runtime. |
| cik | string | No | SEC Central Index Key, e.g. 320193. Takes precedence over ticker and companyName. |
| companyName | string | No | SEC registrant name, exact or fuzzy-matched against the official ticker file. |
| query | string | No | EDGAR full-text search expression, e.g. "supply chain disruption". Required for search mode. |
| formTypes | array | No | Form types to include, e.g. 10-K, 10-Q, 8-K, S-1, DEF 14A, Form 4. Empty = all forms. |
| chunking | string | No | Fulltext RAG strategy: section, paragraph (~2000 chars), or none (default: section). |
| maxItems | number | No | Max items saved; each saved item is one billed event (default: 100). |
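As a rough illustration of how these parameters fit together, the Python sketch below builds an input object and checks it against the rules in the table before a run. The `check_input` helper is hypothetical and not part of the actor. The table does not say whether search mode also needs a company identifier, so the sketch assumes it does not.

```python
VALID_MODES = {"filings", "fulltext", "search", "facts"}
VALID_CHUNKING = {"section", "paragraph", "none"}


def check_input(run_input: dict) -> list[str]:
    """Return a list of problems; an empty list means the input looks consistent."""
    problems = []
    mode = run_input.get("mode", "filings")
    if mode not in VALID_MODES:
        problems.append(f"unknown mode: {mode!r}")
    if mode == "search":
        if not run_input.get("query"):
            problems.append("search mode requires query")
    elif not any(run_input.get(k) for k in ("ticker", "cik", "companyName")):
        # Assumption: the identifier rule is only enforced outside search mode.
        problems.append("one of ticker, cik, or companyName is required")
    if run_input.get("chunking", "section") not in VALID_CHUNKING:
        problems.append("chunking must be section, paragraph, or none")
    if run_input.get("maxItems", 100) < 1:
        problems.append("maxItems must be at least 1")
    return problems


# Example: section-chunked 10-K text for one company, capped at five items.
run_input = {
    "mode": "fulltext",
    "ticker": "AAPL",
    "formTypes": ["10-K"],
    "chunking": "section",
    "maxItems": 5,
}
problems = check_input(run_input)
```

An empty `problems` list means the object matches the table. Because each saved item is a billed event, setting a small `maxItems` on a first run keeps the test inexpensive.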
Output Example
{
  "itemType": "filing-fulltext",
  "title": "Apple Inc. — 10-K 2025-10-31",
  "company": "Apple Inc.",
  "ticker": "AAPL",
  "cik": "0000320193",
  "formType": "10-K",
  "filedAt": "2025-10-31",
  "accessionNo": "0000320193-25-000123",
  "documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/...",
  "sections": [
    { "name": "Item 1A — Risk Factors", "charCount": 38241 },
    { "name": "Item 7 — Management's Discussion and Analysis", "charCount": 21077 }
  ],
  "chunks": [
    { "text": "The Company's business, reputation, results of operations...", "section": "Item 1A — Risk Factors", "order": 12 }
  ],
  "textLength": 220151
}
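One way to feed an item like this into an embedding pipeline is to flatten it into one record per chunk and carry the citation fields along. The Python sketch below assumes only the fields shown in the example. The record layout (`id`, `text`, `metadata`) is a hypothetical convention, not something the actor emits.

```python
def to_embedding_records(item: dict) -> list[dict]:
    """Flatten one filing-fulltext item into one record per chunk, in reading order."""
    records = []
    for chunk in sorted(item.get("chunks", []), key=lambda c: c["order"]):
        records.append({
            # Accession number plus chunk order gives a stable, citable id.
            "id": f'{item["accessionNo"]}#{chunk["order"]}',
            "text": chunk["text"],
            "metadata": {
                "ticker": item.get("ticker"),
                "formType": item.get("formType"),
                "filedAt": item.get("filedAt"),
                "section": chunk.get("section"),
                "source": item.get("documentUrl"),
            },
        })
    return records


# Trimmed copy of the example item above (the document URL stays elided).
item = {
    "ticker": "AAPL",
    "formType": "10-K",
    "filedAt": "2025-10-31",
    "accessionNo": "0000320193-25-000123",
    "documentUrl": "https://www.sec.gov/Archives/edgar/data/320193/...",
    "chunks": [
        {
            "text": "The Company's business, reputation, results of operations...",
            "section": "Item 1A — Risk Factors",
            "order": 12,
        }
    ],
}

records = to_embedding_records(item)
```

Keeping the section name and filing date in the metadata lets a retriever filter by Item or period and cite the source filing alongside each passage.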
Pricing
Pay-per-event — you’re billed only for items actually saved:
| Event | Price | What it covers |
|---|---|---|
| Filing metadata / search hit | $0.001 | One filing metadata record or full-text search result |
| Full-text filing | $0.005 | One filing extracted, section-parsed, and chunked for RAG |
| XBRL fact | $0.0002 | One XBRL financial fact row |
A 1,000-filing metadata pull is ~$1.00; 1,000 XBRL facts is ~$0.20. maxItems caps both volume and cost.
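The arithmetic behind these figures can be reproduced with a small Python helper. The prices are copied from the table above. The `estimate_cost` function and its parameter names are hypothetical, and the sketch ignores any platform fees outside the per-event prices.

```python
from decimal import Decimal

# Per-event prices in USD, taken from the pricing table.
PRICES = {
    "metadata": Decimal("0.001"),   # filing metadata record or search hit
    "fulltext": Decimal("0.005"),   # one full-text filing
    "fact": Decimal("0.0002"),      # one XBRL fact row
}


def estimate_cost(metadata: int = 0, fulltext: int = 0, fact: int = 0) -> Decimal:
    """Estimate the per-event charge for a run with the given item counts."""
    counts = {"metadata": metadata, "fulltext": fulltext, "fact": fact}
    return sum((PRICES[name] * n for name, n in counts.items()), Decimal("0"))


print(f"${estimate_cost(metadata=1000):.2f}")  # $1.00
print(f"${estimate_cost(fact=1000):.2f}")      # $0.20
print(f"${estimate_cost(fulltext=200):.2f}")   # $1.00
```

Using `Decimal` avoids floating-point rounding noise on sub-cent prices.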
Tips
- Start with `facts` mode for financials. Pulling XBRL rows (revenue, net income, EPS) is far cheaper and cleaner than parsing full filings when you only need the numbers.
- Use `chunking: section` for RAG. It keeps each Item (Risk Factors, MD&A) intact so retrieval returns coherent, citable passages.
- Full-text search covers 2001 onward. For older disclosures, resolve the company by CIK and pull filing metadata directly.
Frequently Asked Questions
Do I need an SEC API key?
How far back does the data go?
Is the output ready for LLMs and RAG?
Why are some filings skipped in fulltext mode?
Related Tools
ClinicalTrials.gov Scraper — Studies, Eligibility & Results
Search ClinicalTrials.gov studies, eligibility, and results.
CourtListener Scraper — Opinions, Dockets & Full Text
US case law, dockets, and full opinion text for legal AI.
USAspending.gov Scraper — Federal Awards, Recipients & Aggregates
Federal awards, recipients, and spending aggregates from USAspending.gov.