Extracting subtitles or captions from a video can be helpful for things like accessibility and localization. Using FFmpeg you can easily extract subtitle tracks from a video file in a variety of formats. In this article we will demonstrate different methods for extracting subtitles and show how to work around subtle gotchas in text formatting.
If you're looking for somewhere to host and stream your videos for you, Mux's Video API has everything you need to manage video for your application.
Why Extract Subtitles?
There are several reasons you may want to extract subtitles:
- Localization: When translating a video, having the subtitles as a separate file is easier for editors to work with.
- Accessibility: Adding subtitles to different platforms or reformatting them for screen readers often requires them in a separate file.
- Automation: If you're using AI tools to do sentiment analysis or things like automatic translation then you'll likely want to automate extracting the subtitles or captions.
Types of Subtitles in Video Files
Videos can contain different types of subtitles and only some of them are easily extracted:
Embedded subtitles
Subtitles embedded as part of the video file. These can be extracted without altering the video at all because they are stored as a text track alongside the other media tracks inside the video container. Because they are independent tracks, extracting them is usually fast and does not require re-encoding the video.
Soft subtitles
These are stored in a separate file alongside the video file and so are already separate from the video itself.
Hardcoded subtitles
These are burned directly into the video and can’t be extracted as a separate file. They are literally part of the video image and don't exist as text in any parsable way. Extracting these types of subtitles would require analyzing the frames themselves and attempting to convert the image into text. This is not something we'll cover in this article.
We'll be extracting embedded subtitles with the examples below.
Things to consider before extracting subtitles and captions
Videos can have multiple subtitle streams, each in a different language or format, so you’ll need to identify the stream that needs extracting and its ID within the list of streams in the file. Running FFprobe against the file, for example ffprobe my-file.mp4, outputs information about which streams are available and their respective IDs.
Some subtitles also use character sets other than UTF-8, so it’s best to specify the encoding where needed, especially with non-Latin languages. We'll show an example of how to do this later on.
Extracting Subtitles and captions with Mux
With any video Asset in Mux you can auto-generate subtitles and captions or add subtitles and captions manually. When the subtitles and captions tracks are ready you can download the transcript as shown here.
How to Extract Subtitles and captions with FFmpeg
Identifying which streams to extract
To identify the subtitle streams in a video, run:
ffprobe video.mp4
This command lists all streams within the file, including video, audio, and subtitle streams. Subtitle streams will be marked as Stream #0:x, where x is the stream index, followed by the label Subtitle.
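To narrow the output to subtitle streams only, FFprobe can print selected fields as CSV. The following is a minimal sketch, assuming an input named video.mp4; the function names list_subtitle_streams and describe_subtitle_streams are our own and not part of FFmpeg. It also prints the -map selector for each stream, which counts subtitle streams only and so can differ from the overall stream index.

```shell
# Print one CSV line per subtitle stream: index,codec_name,language
# (the language field is absent when the stream has no language tag).
list_subtitle_streams() {
  ffprobe -v error -select_streams s \
    -show_entries stream=index,codec_name:stream_tags=language \
    -of csv=p=0 "$1"
}

# Read those CSV lines on stdin and print a readable row per stream,
# including the subtitle-relative selector to pass to -map.
describe_subtitle_streams() (
  n=0
  while IFS=, read -r index codec lang; do
    printf 'stream 0:%s codec=%s lang=%s selector=0:s:%s\n' \
      "$index" "$codec" "${lang:-und}" "$n"
    n=$((n + 1))
  done
)

# Usage: list_subtitle_streams video.mp4 | describe_subtitle_streams
```

Keeping the parsing separate from the FFprobe call makes it easy to reuse the same rows in a script that picks a stream by language or codec.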
Extracting subtitles to an SRT File
Once you know the subtitle stream index, you can extract it like this:
ffmpeg -i video.mp4 -map 0:s:0 subtitles.srt
Here, -map 0:s:0 selects the stream to extract. The first 0 identifies the input file, which will always be 0 when working with a single input. s restricts the selection to subtitle streams, and the last 0 is the zero-based position among those subtitle streams. This means 0:s:0 is always the first subtitle stream, whatever its overall stream index in the file.
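When streams carry language tags, you can also select by language instead of by position. This is a small sketch rather than a definitive recipe: the function name and the FFMPEG override variable are our own inventions, and it assumes exactly one subtitle stream is tagged with the requested code, since an SRT file can hold only one stream.

```shell
# FFMPEG is a hypothetical override for the binary to run; it defaults to "ffmpeg".
FFMPEG="${FFMPEG:-ffmpeg}"

# Extract the subtitle stream tagged with a language code (default: eng) to SRT.
# Writes e.g. video.spa.srt next to video.mp4.
extract_subtitles_by_language() (
  input="$1"
  lang="${2:-eng}"
  # 0:s:m:language:CODE matches subtitle streams whose language tag equals CODE.
  "$FFMPEG" -v error -i "$input" -map "0:s:m:language:$lang" "${input%.*}.$lang.srt"
)

# Usage: extract_subtitles_by_language video.mp4 spa
```

Language tags are usually three-letter codes such as eng or spa, so check the FFprobe output for the exact value before relying on it.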
Extracting subtitles to VTT (WebVTT)
ffmpeg -i video.mp4 -map 0:s:0 subtitles.vtt
Extracting subtitles to ASS (Advanced SubStation Alpha)
For more complex styling and positioning:
ffmpeg -i video.mp4 -map 0:s:0 subtitles.ass
Dealing with Character Encoding
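If the extracted subtitle text doesn’t render correctly, tell FFmpeg which character set the input subtitles use with -sub_charenc, placed before -i. The option describes the source encoding, so substitute the track's actual encoding (such as windows-1252) where it differs. For example, to declare the input as UTF-8:
ffmpeg -sub_charenc UTF-8 -i video.mp4 -map 0:s:0 subtitles.srt
Automating Subtitle Extraction for Multiple Streams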
If a video has multiple subtitle streams, you can extract each with a loop in a bash script like this:
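for i in $(ffprobe -v error -select_streams s -show_entries stream=index -of csv=p=0 video.mp4); do
  ffmpeg -i video.mp4 -map 0:$i subtitles_$i.srt
done
This loop asks FFprobe for the index of each subtitle stream, then extracts the streams one at a time into separate .srt files. FFprobe reports each stream's overall index within the file, so the loop maps it with 0:$i rather than 0:s:$i, which would count subtitle streams only and select the wrong stream or none at all.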
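A variation on this loop can put the language tag in each file name, which helps when a file has many tracks. Treat this as a sketch under a few assumptions: the function names are ours, every subtitle stream is text-based (bitmap formats such as PGS cannot be written to .srt), and missing language tags are labeled und.

```shell
# Build an output name such as "video.2.eng.srt" from the base name,
# the overall stream index, and the language tag ("und" when missing).
subtitle_filename() {
  printf '%s.%s.%s.srt\n' "$1" "$2" "${3:-und}"
}

# Extract every subtitle stream of one file into its own SRT file.
extract_all_subtitles() (
  input="$1"
  base="${input%.*}"
  ffprobe -v error -select_streams s \
    -show_entries stream=index:stream_tags=language \
    -of csv=p=0 "$input" |
  while IFS=, read -r index lang; do
    # -nostdin stops ffmpeg from consuming the stream list that feeds this loop;
    # -n refuses to overwrite files that already exist.
    ffmpeg -nostdin -n -v error -i "$input" -map "0:$index" \
      "$(subtitle_filename "$base" "$index" "$lang")"
  done
)

# Usage: extract_all_subtitles video.mp4
```

Reading index and language together from one FFprobe call avoids probing the file once per stream.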
Taking it further
Here are some more articles that you may find helpful for doing common tasks with FFmpeg:
- Extract audio from a video file with FFmpeg
- How to convert MP4 to HLS format with ffmpeg: A step-by-step guide
- Change video bitrate with FFmpeg
Subtitle extraction FAQs
What's the difference between SRT, VTT, and ASS subtitle formats?
SRT (SubRip) is the simplest and most widely supported format, containing just timestamps and plain text. VTT (WebVTT) is designed for web video and supports additional features like positioning, styling, and metadata. ASS (Advanced SubStation Alpha) offers the most sophisticated styling including fonts, colors, animations, and precise positioning. For web playback, VTT is preferred.
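If you need more than one of these formats, a single FFmpeg run can write all three from the same stream by repeating -map before each output. The sketch below assumes the first subtitle stream is text-based; the function name and the FFMPEG override variable are hypothetical conveniences, not FFmpeg features.

```shell
# FFMPEG is a hypothetical override for the binary to run; it defaults to "ffmpeg".
FFMPEG="${FFMPEG:-ffmpeg}"

# Write the first subtitle stream to SRT, WebVTT and ASS in one pass.
# Each -map applies to the output file that follows it.
extract_three_formats() (
  input="$1"
  base="${input%.*}"
  "$FFMPEG" -v error -i "$input" \
    -map 0:s:0 "$base.srt" \
    -map 0:s:0 "$base.vtt" \
    -map 0:s:0 "$base.ass"
)

# Usage: extract_three_formats video.mp4
```

Styling that only ASS can express is not created by the conversion, so an ASS file produced from a plain SRT-style track will carry default styles.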
How do I know which subtitle stream to extract?
Run ffprobe input.mp4 to list all streams in the file. Subtitle streams appear as Stream #0:2(eng): Subtitle: subrip where the number after the colon is the stream index and the language code appears in parentheses. If multiple subtitle streams exist, they'll have different indices (0:2, 0:3, etc.) and potentially different language codes. Use -map 0:s:0 for the first subtitle stream, -map 0:s:1 for the second, and so on.
Can I extract subtitles from streaming video URLs?
Yes, if the streaming protocol includes subtitle tracks (like HLS or DASH manifests with subtitle renditions), FFmpeg can extract them by treating the URL as an input: ffmpeg -i "https://example.com/video.m3u8" -map 0:s:0 output.srt. However, hardcoded subtitles burned into the video stream cannot be extracted—only separate text-based subtitle tracks work with this method.
Why do my extracted subtitles have garbled characters?
This indicates a character encoding mismatch: the subtitle track uses a different character set than FFmpeg assumes. Specify the track's actual encoding with -sub_charenc before the input file, for example windows-1252 or iso-8859-1 for many older Western European files. Subtitles in non-Latin languages (Arabic, Chinese, Japanese, etc.) are especially prone to this when they were authored in a legacy encoding rather than UTF-8.
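If a file has already been extracted and still looks garbled, one alternative is to check and re-encode it with iconv rather than re-running FFmpeg. This is a sketch, not part of FFmpeg: the function names are ours, iconv must be installed, and the source encoding is a guess you supply.

```shell
# Succeed only if the file decodes cleanly as UTF-8.
is_valid_utf8() {
  iconv -f UTF-8 -t UTF-8 "$1" >/dev/null 2>&1
}

# Re-encode a subtitle file to UTF-8 from a caller-supplied source encoding.
# Arguments: source file, source encoding (e.g. WINDOWS-1252), destination file.
convert_to_utf8() {
  iconv -f "$2" -t UTF-8 "$1" > "$3"
}

# Usage:
#   is_valid_utf8 subtitles.srt || convert_to_utf8 subtitles.srt WINDOWS-1252 subtitles.utf8.srt
```

A clean conversion does not prove the guess was right, since many single-byte encodings accept any byte, so it is worth opening the result and reading a few lines.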
Can I extract subtitles without re-encoding the video?
Yes. Extracting embedded subtitles doesn't touch the video or audio streams; FFmpeg only reads the subtitle track from the container, so it is fast. If the track's codec already matches the output format (for example, a SubRip track written to .srt), add -c:s copy to copy it unchanged. Otherwise FFmpeg converts the subtitle text to the format implied by the output extension, such as .srt or .vtt, which is still quick because only text is processed.
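One way to decide between copying and converting is to look at the codec name FFprobe reports for the stream. The sketch below covers only three text codecs we are confident map cleanly to a file extension, and falls back to SRT conversion for anything else; the function names are our own.

```shell
# Map an ffprobe codec_name to a file extension that can hold it with -c:s copy.
# Returns non-zero for codecs not listed here (e.g. mov_text, which needs conversion).
copy_extension_for_codec() {
  case "$1" in
    subrip) echo srt ;;
    ass)    echo ass ;;
    webvtt) echo vtt ;;
    *)      return 1 ;;
  esac
}

# Extract the first subtitle stream, copying it when possible.
extract_first_subtitle() (
  input="$1"
  codec=$(ffprobe -v error -select_streams s:0 \
    -show_entries stream=codec_name -of csv=p=0 "$input")
  if ext=$(copy_extension_for_codec "$codec"); then
    ffmpeg -v error -i "$input" -map 0:s:0 -c:s copy "${input%.*}.$ext"
  else
    # Fall back to converting the text to SRT; this fails for bitmap subtitles.
    ffmpeg -v error -i "$input" -map 0:s:0 "${input%.*}.srt"
  fi
)

# Usage: extract_first_subtitle video.mkv
```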
How do I extract all subtitle streams at once?
Give each stream its own -map and output file in a single command, for example: ffmpeg -i input.mp4 -map 0:s:0 output_0.srt -map 0:s:1 output_1.srt. Formats like SRT hold one subtitle stream per file, so -map 0:s with a single .srt output does not produce numbered files. Alternatively, write a bash loop that calls FFmpeg once per stream index. The script example in the article demonstrates this approach for batch extraction.
What do I do if my video has no subtitle streams?
If ffprobe shows no subtitle streams, the video either has hardcoded subtitles (burned into the video image) or no subtitles at all. Hardcoded subtitles require OCR (optical character recognition) to extract, which the stream extraction shown in this article cannot do. Some AI services can perform this task, but it's significantly more complex and less accurate than extracting embedded subtitle tracks.