CourtListener Scraper — Opinions, Dockets & Full Text

Query CourtListener for U.S. court opinions, RECAP dockets, oral arguments, judges, and citations — with complete opinion text and RAG-ready chunks. Filter by court, date, or keyword.

TypeScript Cheerio United States

Key Features

Six search types in one actor — opinions, dockets (RECAP), RECAP documents, oral arguments, judges, and citation lookup

Full opinion text — follows search hits into the database for complete decisions, not 300-character snippets

RAG-ready chunking — ~2000-character paragraph chunks, each with an order index

Citation resolution — resolve up to 250 citation strings (e.g. 576 U.S. 644) to matched cases

Boolean full-text search plus court ID, filed-date range, and sort-order filters

Rate-limit-aware pacing, with 429 back-off that handles each of CourtListener's rate-limit response formats

Cursor-pagination streaming up to your configured item cap

Bring-your-own token, or use the built-in rotating token pool
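
The RAG-ready chunking feature above can be pictured with a small sketch. This is not the actor's actual implementation, which is not shown here; it is one plausible way to pack paragraphs into chunks of roughly 2000 characters, each tagged with an order index. The names chunkParagraphs and TextChunk are hypothetical, and the blank-line paragraph split is an assumption.

```typescript
// Hypothetical shape, mirroring the "text" and "order" fields of a chunk.
interface TextChunk {
  text: string;
  order: number;
}

// Greedy paragraph packing: keep appending paragraphs to the current chunk
// until adding the next one would exceed maxChars, then start a new chunk.
// Assumption: paragraphs are separated by blank lines.
function chunkParagraphs(fullText: string, maxChars: number = 2000): TextChunk[] {
  const paragraphs = fullText
    .split(/\n\s*\n/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  const chunks: TextChunk[] = [];
  let current = "";

  // Emit the chunk being built, if any, using the next free order index.
  const flush = (): void => {
    if (current.length > 0) {
      chunks.push({ text: current, order: chunks.length });
      current = "";
    }
  };

  for (const paragraph of paragraphs) {
    // A single paragraph longer than maxChars is hard-split into slices.
    if (paragraph.length > maxChars) {
      flush();
      for (let i = 0; i < paragraph.length; i += maxChars) {
        chunks.push({ text: paragraph.slice(i, i + maxChars), order: chunks.length });
      }
      continue;
    }
    const candidate = current.length === 0 ? paragraph : current + "\n\n" + paragraph;
    if (candidate.length > maxChars) {
      flush();
      current = paragraph;
    } else {
      current = candidate;
    }
  }
  flush();
  return chunks;
}
```

Keeping whole paragraphs together where possible tends to give embeddings more coherent context than fixed-width slicing; the hard split is only a fallback for unusually long paragraphs.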

Use Cases

Legal research — pull full-text case law by court, date range, or keyword
Litigation intelligence — track a court's docket activity by subject and date
Legal AI / RAG — build a chunked, embeddings-ready case-law corpus
Citation analysis — resolve citations from a brief to linked case records and cite counts
Empirical legal studies — judges, oral-argument metadata, and opinion trends
Legal journalism — monitor newly filed opinions or dockets in specific courts

Input Parameters

Parameter	Type	Required	Description
`searchType`	string	No	What to retrieve: opinions, dockets, recap_docs, oral_arguments, judges, or citation (default: opinions).
`query`	string	No	Full-text query with boolean operators — required unless using a court filter or citation lookup.
`citations`	array	No	Citation strings to resolve (up to 250) — required for citation mode.
`court`	string	No	CourtListener court ID, e.g. scotus, ca9, nyed.
`dateFrom`	string	No	Filed-after date (YYYY-MM-DD).
`includeFullText`	boolean	No	Fetch complete opinion text and RAG chunks for opinions (default: false).
`apiToken`	string	No	Your CourtListener API token; falls back to the built-in rotating pool if omitted.
`maxItems`	number	No	Max items saved; each is one billed event (default: 5).
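
The rules in the table above can be written down as a typed input plus a small pre-flight check. This is a sketch under stated assumptions, not the actor's own validation: the ActorInput type and validateInput function are hypothetical names, the example query is purely illustrative, and the requirement that maxItems be a positive whole number is an assumption.

```typescript
type SearchType =
  | "opinions"
  | "dockets"
  | "recap_docs"
  | "oral_arguments"
  | "judges"
  | "citation";

// Field names and defaults follow the parameter table.
interface ActorInput {
  searchType?: SearchType; // default: "opinions"
  query?: string;
  citations?: string[]; // up to 250, citation mode only
  court?: string; // e.g. "scotus", "ca9", "nyed"
  dateFrom?: string; // YYYY-MM-DD
  includeFullText?: boolean; // default: false
  apiToken?: string;
  maxItems?: number; // default: 5
}

// Returns a list of problems; an empty list means the input looks usable.
function validateInput(input: ActorInput): string[] {
  const errors: string[] = [];
  const searchType: SearchType = input.searchType ?? "opinions";

  if (searchType === "citation") {
    const count = input.citations ? input.citations.length : 0;
    if (count === 0) errors.push("citation mode needs at least one citation string");
    if (count > 250) errors.push("citation mode accepts at most 250 citation strings");
  } else if (!input.query && !input.court) {
    errors.push("query is required unless a court filter is set");
  }

  // Format check only: it does not verify that the date exists on the calendar.
  if (input.dateFrom !== undefined && !/^\d{4}-\d{2}-\d{2}$/.test(input.dateFrom)) {
    errors.push("dateFrom must be formatted as YYYY-MM-DD");
  }

  // Assumption: maxItems should be a positive whole number.
  if (
    input.maxItems !== undefined &&
    (input.maxItems < 1 || Math.floor(input.maxItems) !== input.maxItems)
  ) {
    errors.push("maxItems must be a positive whole number");
  }
  return errors;
}

// Illustrative input: full-text Ninth Circuit opinions filed after 2024-01-01.
const exampleInput: ActorInput = {
  searchType: "opinions",
  query: "\"qualified immunity\" AND excessive",
  court: "ca9",
  dateFrom: "2024-01-01",
  includeFullText: true,
  maxItems: 25,
};
```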

Output Example

{
  "itemType": "legal",
  "searchType": "opinions",
  "id": "10380001",
  "title": "Climate United Fund v. Citibank, N.A.",
  "court": "Court of Appeals for the D.C. Circuit",
  "date": "2025-04-16",
  "url": "https://www.courtlistener.com/opinion/10380001/...",
  "citations": [],
  "citeCount": 0,
  "docketNumber": "23-5138",
  "fullText": "...",
  "chunks": [{ "text": "...", "order": 0 }],
  "meta": { "courtId": "cadc", "clusterId": 10380001 }
}
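
For a RAG pipeline, an item like the one above usually needs to be flattened into one record per chunk. The sketch below shows one way to do that, using only fields that appear in the example. The OpinionItem and EmbeddingRecord types, the toEmbeddingRecords function, and the "id:order" record key are all hypothetical and not part of the actor's output.

```typescript
// Subset of the output item, limited to fields shown in the example.
interface OpinionItem {
  id: string;
  title: string;
  court: string;
  date: string;
  url: string;
  chunks?: { text: string; order: number }[];
}

// Hypothetical record shape for an embeddings store.
interface EmbeddingRecord {
  recordId: string; // "<opinion id>:<chunk order>"
  text: string;
  metadata: { title: string; court: string; date: string; url: string };
}

// One record per chunk, sorted by the chunk's order index.
// Items saved without full text have no chunks and yield an empty list.
function toEmbeddingRecords(item: OpinionItem): EmbeddingRecord[] {
  const chunks = item.chunks ?? [];
  return chunks
    .slice() // copy so the caller's array is not reordered
    .sort((a, b) => a.order - b.order)
    .map((chunk) => ({
      recordId: item.id + ":" + chunk.order,
      text: chunk.text,
      metadata: { title: item.title, court: item.court, date: item.date, url: item.url },
    }));
}
```

Carrying the title, court, date, and URL on every record lets a retrieval step cite the source opinion without a second lookup.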

Pricing

Pay-per-event — you’re billed only for items saved:

Event	Price	What it covers
Opinion with full text	$0.005	One opinion saved with complete text + RAG chunks
Metadata item	$0.002	A docket, oral argument, judge, citation, or metadata-only record

A 1,000-opinion full-text corpus is ~$5.00; maxItems caps both volume and cost.
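
The arithmetic behind that estimate can be checked with a short helper. The two prices come from the table above; the function names are hypothetical, and prices are held in thousandths of a dollar so the sums stay exact.

```typescript
// Prices from the pricing table, in thousandths of a dollar.
const PRICE_MILLS = {
  fullTextOpinion: 5, // $0.005 per opinion saved with full text
  metadataItem: 2, // $0.002 per metadata-only item
};

// Estimated run cost in dollars for a given mix of saved items.
function estimateCostUsd(fullTextOpinions: number, metadataItems: number): number {
  const mills =
    fullTextOpinions * PRICE_MILLS.fullTextOpinion +
    metadataItems * PRICE_MILLS.metadataItem;
  return mills / 1000;
}

// Largest maxItems value that fits a budget, for one kind of item.
function maxItemsForBudget(budgetUsd: number, withFullText: boolean): number {
  const price = withFullText ? PRICE_MILLS.fullTextOpinion : PRICE_MILLS.metadataItem;
  // Round to whole mills first to avoid floating-point drift in the budget.
  return Math.floor(Math.round(budgetUsd * 1000) / price);
}
```

For example, estimateCostUsd(1000, 0) gives 5, matching the figure above, and a $2.50 budget covers 500 full-text opinions.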

Tips

Turn on includeFullText only for opinions you’ll actually embed. It costs extra requests per opinion and is off by default — filter tightly by court and date first.
Use citation mode to enrich a brief. Paste the citation strings and get back linked case records and cite counts in one run.
Bring your own CourtListener token for larger jobs — the free tier’s low rate limit is the main throughput bottleneck.
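
If a brief yields more citations than citation mode accepts, the list has to be split across runs. A minimal sketch of that batching follows, assuming the 250-citation limit applies per run; buildCitationInputs and CitationRunInput are hypothetical names, and only the searchType and citations fields are taken from the parameter table.

```typescript
const MAX_CITATIONS_PER_RUN = 250;

// Input for one citation-mode run.
interface CitationRunInput {
  searchType: "citation";
  citations: string[];
}

// Trims and de-duplicates citation strings, then splits them into
// run inputs of at most MAX_CITATIONS_PER_RUN citations each.
function buildCitationInputs(citations: string[]): CitationRunInput[] {
  const cleaned = citations.map((c) => c.trim()).filter((c) => c.length > 0);
  // Keep only the first occurrence of each citation string.
  const unique = cleaned.filter((c, i) => cleaned.indexOf(c) === i);

  const inputs: CitationRunInput[] = [];
  for (let i = 0; i < unique.length; i += MAX_CITATIONS_PER_RUN) {
    inputs.push({
      searchType: "citation",
      citations: unique.slice(i, i + MAX_CITATIONS_PER_RUN),
    });
  }
  return inputs;
}
```

De-duplicating before batching matters for cost as well as size, since each saved item is a billed event.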

Frequently Asked Questions

Do I need a CourtListener API token?

CourtListener requires a token, but it does not have to be yours. Supply your own free token via apiToken, or rely on the actor's built-in rotating fallback pool. Bringing your own token lets you raise throughput on a paid/membership tier.

Is full opinion text available for every search type?

No. Full text plus chunking (includeFullText) applies to opinions only, and is off by default to keep runs fast. Dockets, RECAP documents, oral arguments, judges, and citation results return metadata only.

Which courts are covered?

Whatever CourtListener indexes — U.S. federal courts (SCOTUS, Courts of Appeals, and District courts via RECAP) plus many state courts, addressed by CourtListener's own court IDs like scotus, ca9, or nyed.

How fast can it run?

Throughput is bounded by your token's rate limit, and the actor backs off automatically on 429 responses. CourtListener's free tier allows only a few requests per minute; a membership token raises the ceiling.
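
To make the 429 back-off concrete, here is a sketch of how a client might choose its wait time. It is not the actor's implementation. It assumes the server may send the standard HTTP Retry-After header (either a number of seconds or an HTTP date) and falls back to capped exponential back-off otherwise; whether CourtListener sends that header on every 429 is not stated here. The function name retryDelayMs and the 1-second base and 60-second cap are illustrative choices.

```typescript
const BASE_DELAY_MS = 1000; // illustrative starting delay
const MAX_DELAY_MS = 60000; // illustrative cap

// Picks a wait time after a 429 response.
// retryAfter: raw Retry-After header value, or null if absent.
// attempt: zero-based count of retries already made.
// nowMs: current time in ms, injectable so the function is easy to test.
function retryDelayMs(
  retryAfter: string | null,
  attempt: number,
  nowMs: number = Date.now()
): number {
  if (retryAfter !== null) {
    const value = retryAfter.trim();
    // Form 1: delay in whole seconds, e.g. "30".
    if (/^\d+$/.test(value)) {
      return parseInt(value, 10) * 1000;
    }
    // Form 2: an HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT".
    const retryAt = Date.parse(value);
    if (!isNaN(retryAt)) {
      return Math.max(0, retryAt - nowMs);
    }
  }
  // No usable header: exponential back-off, capped at MAX_DELAY_MS.
  return Math.min(MAX_DELAY_MS, BASE_DELAY_MS * Math.pow(2, attempt));
}
```

Honoring a server-supplied delay when one is present avoids both retrying too early and waiting longer than necessary.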