YouTube Subtitle & Transcript Scraper — JSON, SRT, VTT, LLM
Extract YouTube subtitles and transcripts from videos, Shorts, playlists, and channels as JSON, SRT, VTT, plain text, or clean LLM-ready text. 100+ languages, rich metadata, no API key — and failed extractions are free.
Key Features
- One input handles videos, Shorts, youtu.be links, playlists, and channels, mixed in a single run
- Five output formats: JSON (timestamped), SRT, VTT, plain text, and LLM-ready (strips [Music], [Applause], and speaker labels)
- 100+ languages with a priority-ordered language list and toggleable auto-caption fallback
- Rich metadata: title, channel, description, publish date, view count, thumbnail, duration, and available languages
- Batch entire playlists and channels with a maxVideos cap and a concurrency of 1–10
- Residential proxy support plus optional cookies to reduce bot-check blocks
- Multi-layer extraction: up to nine fallbacks across InnerTube clients, with a yt-dlp PO-token last resort
- Circuit breaker and per-item error handling keep large batches running
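The priority-ordered language list with auto-caption fallback can be pictured with a small sketch. This is one plausible reading of that behavior, not the actor's confirmed logic: the function name pick_track, the track dictionaries, and the matching on the primary language subtag (so "de" matches "de-DE") are all assumptions made for illustration.

```python
from typing import Optional


def primary_subtag(code: str) -> str:
    """Return the primary language subtag, e.g. 'pt-BR' -> 'pt'."""
    return code.split("-")[0].lower()


def pick_track(tracks: list, preferred: list, include_auto: bool = True) -> Optional[dict]:
    """Pick a caption track (hypothetical helper, not the actor's API).

    Assumed policy: try manual captions for each preferred language in
    order; only if none match, and include_auto is True, try
    auto-generated captions in the same order.
    """
    passes = [False, True] if include_auto else [False]
    for want_auto in passes:
        for lang in preferred:
            for track in tracks:
                same_lang = primary_subtag(track["lang"]) == primary_subtag(lang)
                if track["auto"] == want_auto and same_lang:
                    return track
    return None


if __name__ == "__main__":
    available = [
        {"lang": "en", "auto": True},
        {"lang": "de-DE", "auto": False},
    ]
    print(pick_track(available, ["en", "de"]))
    print(pick_track(available, ["en"], include_auto=False))
```

Under this policy a manual track in a lower-priority language wins over an auto-generated track in a higher-priority one; the actor may order these differently.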
Use Cases
- AI/ML teams building RAG or fine-tuning datasets from spoken video (LLM-ready text output)
- Content teams repurposing transcripts into blog posts, show notes, and social captions
- SEO marketers extracting searchable video text for indexing and keyword research
- Editors and publishers needing standard SRT/VTT subtitle files
- Researchers batch-collecting transcripts across an entire channel or playlist
- Developers needing structured, timestamped captions without a YouTube API key
Input Parameters
| Parameter | Type | Required | Description |
|---|---|---|---|
| urls | array | No | YouTube URLs or bare IDs: videos, Shorts, youtu.be links, playlists, or channels. Required at runtime. |
| outputFormat | string | No | Transcript format: json, srt, vtt, text, or llm (default: json). |
| languages | array | No | Preferred subtitle languages in priority order, ISO 639-1 codes (default: en). |
| includeAutoGenerated | boolean | No | Fall back to auto-generated captions when manual ones are missing (default: true). |
| maxVideos | number | No | Cap on videos processed per run, e.g. for playlists/channels (default: 0 = unlimited). |
| maxConcurrency | number | No | Videos processed in parallel, 1–10 (default: 3). |
| proxyConfiguration | object | No | Proxy settings; defaults to Apify Residential pinned to the US. |
| youtubeCookies | string | No | Optional YouTube cookies (Cookie header or cookies.txt) to reduce bot-check blocks. |
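As a rough illustration of how these parameters fit together, the sketch below assembles a run input and checks it against the ranges and defaults listed in the table. The helper build_run_input is hypothetical, and representing the default language as the list ["en"] is an assumption; the actor performs its own validation, which may differ.

```python
from typing import Any, Optional

# Allowed values taken from the parameter table above.
ALLOWED_FORMATS = {"json", "srt", "vtt", "text", "llm"}


def build_run_input(
    urls: list,
    output_format: str = "json",
    languages: Optional[list] = None,
    include_auto_generated: bool = True,
    max_videos: int = 0,
    max_concurrency: int = 3,
) -> dict:
    """Build an input dict for the actor (hypothetical helper)."""
    if not urls:
        raise ValueError("urls must contain at least one YouTube URL or ID")
    if output_format not in ALLOWED_FORMATS:
        raise ValueError(f"outputFormat must be one of {sorted(ALLOWED_FORMATS)}")
    if not 1 <= max_concurrency <= 10:
        raise ValueError("maxConcurrency must be between 1 and 10")
    if max_videos < 0:
        raise ValueError("maxVideos must be 0 (unlimited) or a positive number")
    run_input: dict[str, Any] = {
        "urls": list(urls),
        "outputFormat": output_format,
        # Assumption: the documented default "en" is passed as a one-item list.
        "languages": list(languages) if languages else ["en"],
        "includeAutoGenerated": include_auto_generated,
        "maxVideos": max_videos,
        "maxConcurrency": max_concurrency,
    }
    return run_input


if __name__ == "__main__":
    print(build_run_input(["dQw4w9WgXcQ"], output_format="llm", max_videos=50))
```

The resulting dictionary could then be passed as the run input when starting the actor, for example through the Apify console or an Apify API client; the actor ID to use is not stated in this document.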
Output Example
    {
      "videoId": "dQw4w9WgXcQ",
      "url": "https://www.youtube.com/watch?v=dQw4w9WgXcQ",
      "title": "Rick Astley - Never Gonna Give You Up (Official Video)",
      "channelName": "Rick Astley",
      "channelId": "UCuAXFkgsw1L7xaCfnd5JJOw",
      "publishDate": "2009-10-25",
      "viewCount": 1761003712,
      "availableLanguages": ["en", "de-DE", "ja", "pt-BR", "es-419"],
      "language": "en",
      "isAutoGenerated": false,
      "duration": 213,
      "wordCount": 487,
      "segmentCount": 61,
      "text": "We're no strangers to love, you know the rules and so do I...",
      "segments": [{ "text": "We're no strangers to love", "start": 18.64, "end": 21.88 }],
      "error": null
    }
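To show how the timestamped segments relate to the SRT format, here is a minimal sketch that turns a JSON result like the one above into SRT text on the client side. It assumes start and end are seconds as floats, as in the example; the function names are illustrative and this is not the actor's own SRT writer.

```python
def srt_timestamp(seconds: float) -> str:
    """Format seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def segments_to_srt(item: dict) -> str:
    """Convert one result item to SRT text (hypothetical helper).

    Items whose "error" field is set are skipped and yield an empty string.
    """
    if item.get("error") is not None:
        return ""
    blocks = []
    for index, seg in enumerate(item.get("segments", []), start=1):
        timing = f"{srt_timestamp(seg['start'])} --> {srt_timestamp(seg['end'])}"
        blocks.append(f"{index}\n{timing}\n{seg['text']}\n")
    # A blank line separates consecutive cues.
    return "\n".join(blocks)


if __name__ == "__main__":
    example = {
        "segments": [{"text": "We're no strangers to love", "start": 18.64, "end": 21.88}],
        "error": None,
    }
    print(segments_to_srt(example))
```

When only subtitle files are needed, requesting outputFormat srt or vtt directly avoids this step; the conversion is mainly useful when the JSON output is already stored.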
Pricing
Pay-per-event: billed once per successfully extracted transcript — failed videos are never charged. Feed a whole channel or playlist and pay only for the captions you actually get back.
Tips
- Use outputFormat: llm for RAG and fine-tuning. It removes non-speech annotations so your embeddings see clean prose.
- Keep a residential proxy on. YouTube aggressively blocks datacenter IPs; the actor defaults to US residential for a reason.
- Start maxConcurrency at 1–3 for big jobs and raise it gradually, since high concurrency increases rate-limit risk on large channel scrapes.
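For readers curious what removing non-speech annotations might involve, the following sketch applies a similar cleanup to caption text. The regular expressions are simple heuristics chosen for illustration (bracketed tags, a leading ">>" marker, and an all-caps speaker label); they are assumptions and do not reproduce the actor's actual llm formatting rules.

```python
import re

# Bracketed annotations such as [Music] or [Applause].
ANNOTATION = re.compile(r"\[[^\]]*\]")
# Optional leading ">>" marker, then an optional all-caps label like "HOST:".
SPEAKER = re.compile(r"^(?:>>+\s*)?(?:[A-Z][A-Z0-9 ]{1,30}:\s*)?")


def clean_caption(text: str) -> str:
    """Strip annotations and a leading speaker label (heuristic sketch)."""
    text = ANNOTATION.sub(" ", text).strip()
    text = SPEAKER.sub("", text, count=1)
    # Collapse the whitespace left behind by removed annotations.
    return re.sub(r"\s+", " ", text).strip()


if __name__ == "__main__":
    print(clean_caption("[Music] >> HOST: Welcome back [Applause] everyone"))
    print(clean_caption("We're no strangers to love"))
```

The speaker-label pattern is deliberately narrow and can still misfire on ordinary text that begins with capitals followed by a colon, so it is best treated as a starting point.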
Frequently Asked Questions
Do I need a YouTube API key or to log in?
Which languages and caption types are supported?
Am I billed for videos that fail to extract?
Can I get clean, LLM-ready text for RAG?