How to transcode video 100x faster; or, a Gordian knot cut

Five years ago, in a blog post titled “The fastest video publishing in the multiverse,” I made the following claim:

Mux Video is the first video streaming platform that doesn’t make users wait for transcoding. We can prepare short video in seconds instead of minutes, and feature-length films in a couple of minutes instead of half an hour. Over time we expect to get these times even lower.

That earlier blog post showed numbers. In our early testing, we were about 10x faster than others, but those were simulated tests. At that point, Mux Video wasn’t even really launched. We had no real scale.

Five years later, how are we doing?

Video duration	Median time to publish (all inputs)
0-1 minute	2.0 seconds
1-5 minutes	3.3 seconds
5-20 minutes	9.0 seconds
20-60 minutes	46.4 seconds
60+ minutes	93.5 seconds

This is fast. In comparison, a typical transcoder takes about half the duration of a video to transcode. So a 10-minute video might take 5 minutes to be ready for playback. At Mux, a typical 10-minute video takes 9 seconds.

That includes all videos submitted to Mux. We’re even faster with what we call “standard input.” Standard input refers to the dominant profile of files uploaded by users today, based on resolution (up to 2K), codecs (e.g., H.264), frame rate, bitrate, and so on. 82% of files ingested to Mux are standard input.

(This is generally user-controllable; see our Minimize processing time guide for tips on ensuring video publishes quickly.)

Video duration	Median time to publish (standard input)	95th-percentile time to publish (standard input)
0-1 minute	1.8 seconds	3.6 seconds
1-5 minutes	2.6 seconds	18.7 seconds
5-20 minutes	5.5 seconds	37.2 seconds
20-60 minutes	26.7 seconds	11.0 minutes
60+ minutes	50.1 seconds	16.1 minutes

The typical 60-minute “standard” file takes less than 1 minute to publish at Mux, and even a slow outlier (95th percentile) takes about 16 minutes, which is still roughly twice as fast as a typical transcoder.
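As a rough check on these comparisons, the arithmetic can be sketched in Python. The sketch assumes the rule of thumb stated earlier (a typical transcoder takes about half of a video's duration) and plugs in figures from the tables. The function names are illustrative, and applying a bucket-wide median to one specific duration is a simplification.

```python
# Rough speedup arithmetic, assuming the rule of thumb from the text:
# a typical transcoder takes about half of a video's duration.
BASELINE_FRACTION = 0.5  # assumption: taken from the article's rule of thumb


def baseline_seconds(duration_minutes: float) -> float:
    """Estimated transcode time for a typical transcoder, in seconds."""
    return duration_minutes * 60 * BASELINE_FRACTION


def speedup(duration_minutes: float, publish_seconds: float) -> float:
    """How many times faster a given time-to-publish is than the baseline."""
    return baseline_seconds(duration_minutes) / publish_seconds


# 10-minute video at the all-inputs median for the 5-20 minute bucket (9.0 s).
print(f"{speedup(10, 9.0):.1f}x")
# 60-minute standard-input video at the 95th percentile (16.1 minutes).
print(f"{speedup(60, 16.1 * 60):.1f}x")
```

Under these assumptions the first case works out to roughly 33x and the second to just under 2x, which is consistent with the "twice as fast" comparison for the slowest standard-input files.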

How do you publish video that quickly?

A “Gordian knot” is an intractable problem. Around 2500 years ago, an ox-cart was supposedly tied to a post in ancient Phrygia with a famously intricate knot, for somewhat inscrutable reasons. An oracle had said that whoever untied the knot would rule Asia, and so when Alexander the Great came to Phrygia, he obviously took on the challenge. When he couldn’t untie it the traditional way, he pulled out his sword to cut the knot. “Cutting the Gordian knot” means solving a seemingly impossible problem by reframing or sidestepping the problem.

[Image: Alexander the Great in Phrygia cutting the Gordian knot (artist's depiction)]

A standard video publishing workflow looks like this:

[Diagram: the ingest pipeline runs upload → storage → transcode → storage; the playback pipeline runs storage → CDN → playback]

Steps in this diagram happen serially. Video is uploaded to storage, then sent to a transcoder, then sent back to storage for playback. This transcoding step is computationally intensive and slow. This isn’t ideal; what you really want, if you care about user experience, is instant transcoding. But instant transcoding is an intractable problem — a computational Gordian knot.

Before starting Mux, I co-founded a company called Zencoder, which was one of the first cloud-based video transcoding products. Our system worked exactly like that diagram.

Back at Zencoder, we had two ideas that we never implemented.

First: What if we break every incoming video into smaller chunks (say, 10 seconds), and process them in parallel?

Second: What if we could stream video into and out of a transcoder, so that instead of waiting for transcoding to finish, a video could be watched while it was being transcoded?
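The first idea can be sketched in a few lines of Python. This is a minimal illustration rather than Zencoder's or Mux's implementation: `chunk_ranges`, `transcode_chunk`, and `transcode_parallel` are hypothetical names, and the per-chunk "transcode" is a placeholder that only labels the chunk. A real system would more likely hand chunks to separate processes or machines than to threads.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import List, Tuple

Span = Tuple[float, float]


def chunk_ranges(duration_s: float, chunk_s: float = 10.0) -> List[Span]:
    """Split [0, duration_s) into consecutive spans of at most chunk_s seconds."""
    spans: List[Span] = []
    start = 0.0
    while start < duration_s:
        end = min(start + chunk_s, duration_s)
        spans.append((start, end))
        start = end
    return spans


def transcode_chunk(span: Span) -> str:
    """Placeholder for real work: a real version would encode this time range."""
    start, end = span
    return f"chunk[{start:g}-{end:g}s]"


def transcode_parallel(duration_s: float, chunk_s: float = 10.0, workers: int = 4) -> List[str]:
    """Process every chunk concurrently and return results in playback order."""
    spans = chunk_ranges(duration_s, chunk_s)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so chunks stay in sequence.
        return list(pool.map(transcode_chunk, spans))


print(transcode_parallel(25))
```

Because each chunk is independent, total wall-clock time in a setup like this is bounded by the slowest chunk plus scheduling overhead rather than by the full duration of the video.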

When we built Mux Video, we implemented both of these ideas. We call this just-in-time (JIT) transcoding. Effectively, when frames of video leave the transcoder, they’re streamed out before the rest of the file is complete. This means our publishing workflow looks like this:

[Diagram: the ingest pipeline uploads directly to storage; the playback pipeline runs storage → transcode → CDN → playback]

So how do you transcode video 100x faster? The answer is you don’t — you cheat. You sidestep the problem by eliminating the need to pre-transcode video in the first place.
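One way to picture this is a transcoder that does work only when a segment is actually requested. The toy below is an assumption-laden sketch, not Mux's implementation: the `JitTranscoder` class, the string "segments", and the uppercase "encode" step are all hypothetical stand-ins.

```python
from typing import Dict, Iterator, List


class JitTranscoder:
    """Toy just-in-time transcoder: segments are transcoded on first request."""

    def __init__(self, source_segments: List[str]) -> None:
        self._source = source_segments
        self._cache: Dict[int, str] = {}
        self.transcode_calls = 0

    def _transcode(self, raw: str) -> str:
        # Placeholder for a real encode step.
        self.transcode_calls += 1
        return raw.upper()

    def get_segment(self, index: int) -> str:
        """Return one transcoded segment, doing the work only if needed."""
        if index not in self._cache:
            self._cache[index] = self._transcode(self._source[index])
        return self._cache[index]

    def stream(self, start: int = 0) -> Iterator[str]:
        """Yield segments in order, each as soon as it is ready."""
        for index in range(start, len(self._source)):
            yield self.get_segment(index)


player = JitTranscoder(["seg0", "seg1", "seg2", "seg3"])
first = next(player.stream())   # playback can begin after a single segment
again = player.get_segment(0)   # served from the cache, no second transcode
print(first, again, player.transcode_calls)
```

In this sketch the first segment is available after one unit of work, and segments that are never requested are never transcoded at all.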

What else do we do with JIT transcoding?

Just-in-time transcoding doesn’t only let us publish video quickly. It also gives Mux capabilities that no one else has. Examples:

Instant thumbnails: If you use a typical cloud transcoder, you need to specify in advance what thumbnails you want to extract from a video. If you ever want new thumbnails, you need to send the entire file back to the transcoder for a slow, expensive operation.

With Mux Video, you can instantly extract any thumbnail from a video using a simple GET request, like this:

GET https://image.mux.com/{PLAYBACK_ID}/thumbnail.{png|jpg|webp}?time=35

Instant GIF generation: Same story with animated GIFs.

GET https://image.mux.com/{PLAYBACK_ID}/animated.{gif|webp}?start=5&end=10
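Both request shapes can be built programmatically. The helper below is a small sketch that only constructs URLs following the two patterns shown here; the function names and the `abc123` playback ID are hypothetical, and only the query parameters that appear in the examples (`time`, `start`, `end`) are used.

```python
from urllib.parse import urlencode

IMAGE_BASE = "https://image.mux.com"  # host taken from the examples in the text


def thumbnail_url(playback_id: str, time_s: float, fmt: str = "jpg") -> str:
    """Build a thumbnail URL for a frame at time_s seconds."""
    if fmt not in ("png", "jpg", "webp"):
        raise ValueError(f"unsupported thumbnail format: {fmt}")
    query = urlencode({"time": time_s})
    return f"{IMAGE_BASE}/{playback_id}/thumbnail.{fmt}?{query}"


def animated_url(playback_id: str, start_s: float, end_s: float, fmt: str = "gif") -> str:
    """Build an animated GIF/WebP URL covering start_s to end_s seconds."""
    if fmt not in ("gif", "webp"):
        raise ValueError(f"unsupported animation format: {fmt}")
    if end_s <= start_s:
        raise ValueError("end_s must be greater than start_s")
    query = urlencode({"start": start_s, "end": end_s})
    return f"{IMAGE_BASE}/{playback_id}/animated.{fmt}?{query}"


# "abc123" is a made-up playback ID used only for illustration.
print(thumbnail_url("abc123", 35, "png"))
print(animated_url("abc123", 5, 10))
```

The resulting URLs can be fetched with any HTTP client or dropped straight into an image tag, since each one is a plain GET request.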

Instant live clipping: Just-in-time transcoding means you can extract any clip from any video with basically zero delay. While a live event is running, you can select arbitrary portions of the video to turn into on-demand clips. View docs for live clipping.

Unified live and on-demand video: Because we treat files and live video in the same way, any live event that we ingest is available as an on-demand asset as soon as the event finishes. View docs for live streaming.

Cost efficiency: Last but certainly not least, JIT transcoding means we don’t need to transcode segments of video that are never watched. This allows us to lower our ingest pricing. If you used (say) AWS transcoding to create an adaptive bitrate (ABR) ladder of 2 HD and 2-3 SD renditions, you’d pay between 25% and 100% more than our rates at Mux.

Why does this matter?

First, it doesn’t necessarily matter for every user. If you’re streaming long-form content that is uploaded by an in-house editor, it might not matter if a video takes 30 minutes to be ready to publish. You might upload a video long before you’re ready to actually publish it.

But if you’re dealing with user-generated content, it’s a big deal.

Fast time-to-publish is a better experience for users. No one likes waiting on a progress bar.

Fast publishing can increase user activity. When users have to wait a few minutes after uploading a video before they can take their next step (previewing a video, clicking “publish”), they’re more likely to abandon what they’re doing. If they get annoyed by the wait, they’re less likely to upload another video.

Instant clipping and thumbnails enable new workflows, like allowing users to choose their own thumbnails and poster images, or letting users change these over time.

Some content is more relevant when it’s fresh. If you’re dealing with breaking news or time-sensitive content, the difference between 2-second publishing and 2-minute publishing might be enormous.

Ultimately, if users are uploading video, fast publishing means a better user experience. What’s more important than that?

Written By

Jon Dahl – Co-founder, CEO
