April 4, 2023
In the previous installment of this series about using Mux Data and CMCD (Common Media Client Data) together, I walked through how you can configure CMCD properties in a few players and in Mux Data. By using CMCD with Mux Data, you can easily tie your viewer’s quality of experience analytics together with your CDN performance logs to achieve end-to-end visibility.
In this blog, I’ll be joined by Will Law from Akamai as we explore the next step down the path of data integration. We’ll walk you through how to join Mux Data view logs and CDN logs to do analysis that will help identify performance issues that could not have been easily solved without combining these datasets.
Content delivery networks (CDNs) respond to HTTP requests for objects. In the case of Akamai, that occurs at a peak rate of over 110 million requests per second, across the global network. The majority of those requests are for media objects, such as playlists, manifests, and audio and video segments. Each request is logged. The binary payload is opaque to the CDN. We know the size and the path of the object being requested, but we don’t know much about its media characteristics or the health of the player making the request. Consider a request for
(this link is just an example, it's not valid)
The CDN logs can’t tell us the duration of this object, whether it’s audio or video, whether it’s HLS or DASH, whether the player that requested it is healthy or struggling, and, if we deliver it in 2s, whether we are doing a good job or a bad job (if this is a 6s segment, then 2s delivery time is good; if it’s a 1s segment, then 2s delivery time would be terrible). The CDN by necessity presents log data in terms of “hits,” as this is the only dimension on which it has visibility. The screenshot below shows a CDN metrics report. We can see there were 8.5 million requests at the edge, but we cannot infer how many unique viewers there were, how many videos they watched, how long they watched, and whether they rebuffered.
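To make the 2s example concrete, one way to judge a delivery is to compare transfer time against the media duration of the segment. The sketch below is only illustrative: the function name and the 1.0 cutoff are assumptions made for this example, not something the CDN logs or the CMCD standard define.

```python
def delivery_ratio(transfer_seconds: float, segment_seconds: float) -> float:
    """Return transfer time as a fraction of the segment's media duration."""
    if segment_seconds <= 0:
        raise ValueError("segment duration must be positive")
    return transfer_seconds / segment_seconds


# A ratio below 1.0 means the segment arrived faster than it plays back.
for duration in (6.0, 1.0):
    ratio = delivery_ratio(2.0, duration)
    verdict = "keeping up" if ratio < 1.0 else "falling behind"
    print(f"{duration:.0f}s segment delivered in 2s: ratio {ratio:.2f}, {verdict}")
```

Without the segment duration, which the CDN cannot see on its own, the same 2s transfer time cannot be classified either way.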
Media players, however, have a notion of a playback session (or a “view” in Mux terminology), during which a contiguous piece of content is consumed by the end user. Each session can involve the request of hundreds or thousands of different objects from the CDN. Quality of Experience (QoE) metrics (such as Video Start Time, Rebuffer Rate, Average Bitrate, etc.) are presented as a function of a playback session. To solve this fundamental disconnect between player QoE and its “sessions” and CDNs and their “hits,” the CTA WAVE Project created the Common Media Client Data (CMCD) standard to convey mutually beneficial information between players, CDNs, and analytics systems.
Let’s walk through a demo scenario to illustrate the interaction of player and CDN log data via CMCD. An FFmpeg encoder and packager, hosted on an Akamai Linode Linux instance, produces a live HLS stream with 6 different bitrates, ranging from 500 kbps to 6 Mbps. This live stream is pushed to the Akamai Media Services Live origin and distributed via the Akamai Media Delivery CDN to a series of 10 web-based Akamai AMP players. Each player has a Mux analytics plugin added. As the media plays, live logs are sent via Akamai Datastream to a Mux S3 bucket running on Amazon AWS. Concurrently, the Mux plugin beacons back analytics data to the Mux Data platform, which also sends view session logs to S3. Various SQL tools are then used on AWS to analyze and visualize the data. We intentionally caused throughput distress to our players and then used the testbed to measure and investigate the consequences.
For a player, we used the web version of the Akamai AMP player. The media is intentionally embedded at a fixed size of 1920x1080 so that the player will always attempt to play a higher bitrate if it has sufficient throughput. The AMP player has an automatic data-saving mode, so if a player is embedded at a smaller size, it won’t play variants larger than the embedded size and won’t switch up when exposed to higher throughput.
A common sessionID is created and then shared between the CMCD implementation and the Mux plugin (the cmcd object definition below). This is a key point in this demo, as that session ID provides the join, later in the workflow, between the Datastream logs and the Mux Data logs.
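The idea of one ID feeding both systems can be sketched in a few lines. This is a minimal Python illustration rather than the player's own JavaScript configuration. The `CMCD-Session` header and the quoted `sid` key follow the CMCD standard, while the function names and the `view_session_id` metadata field are hypothetical names chosen for this example.

```python
import uuid


def new_session_id() -> str:
    """Create a session ID; the CMCD standard recommends a UUID for sid."""
    return str(uuid.uuid4())


def cmcd_session_header(session_id: str) -> dict[str, str]:
    """Build the CMCD-Session request header carrying the sid key."""
    # String values in CMCD are wrapped in double quotes.
    return {"CMCD-Session": f'sid="{session_id}"'}


session_id = new_session_id()
headers = cmcd_session_header(session_id)

# Hypothetical analytics metadata: the same ID is handed to the analytics
# plugin so both log sources carry an identical join key.
analytics_metadata = {"view_session_id": session_id}
```

The important design choice is that the ID is generated once and passed to both consumers, rather than each system generating its own.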
CMCD can be sent either as query args or as headers. In this demo, we chose headers. Ten players were instantiated, five tabs on each of two different machines. All tabs on a machine share that machine's public IP address, meaning that from a CDN log perspective, if client IP addresses are being used to infer unique visitors (which is often done), this would incorrectly suggest that two users were playing the content, when in reality there were ten.
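The undercount is easy to reproduce with made-up data. The records below are hypothetical (documentation-range IP addresses and invented session IDs), shaped to mirror the two-machine, ten-player setup.

```python
# Hypothetical request records: two machines (two public IPs), five tabs each.
requests = [
    {"client_ip": f"203.0.113.{machine}", "sid": f"session-{machine}-{tab}"}
    for machine in (1, 2)
    for tab in range(5)
    for _ in range(3)  # several segment requests per player
]

# Counting distinct IPs sees machines; counting distinct sids sees players.
viewers_by_ip = len({r["client_ip"] for r in requests})
viewers_by_sid = len({r["sid"] for r in requests})
print(viewers_by_ip, viewers_by_sid)  # 2 10
```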
We created an Adaptive Media Delivery (AMD) property on Akamai. The property pointed at Media Services Live (MSL) as the origin of the live stream.
We then activated Datastream for this property, linked it to the configured stream, and set the CDN to deliver 100% of the loglines to the target destination.
Next, we added some Advanced Metadata, which captured the incoming CMCD data on both headers and query args and wrote it to the custom log field.
The Datastream output was configured as below. The destination was set to an Amazon S3 bucket, and JSON was selected as the log format. Field #35 — the custom log field — will hold the CMCD data once it is dispatched.
The actual JSON log object pushed to S3 will look like the sample shown below. The IP addresses have been blurred for privacy. The CMCD data is carried in the customField.
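Before the CMCD values can be queried, the string in the custom field has to be split into keys and values. The sketch below is a simplified parser, assuming the field holds the comma-separated CMCD key/value string unchanged. The sample log line is hypothetical, and only the `customField` name comes from the description above.

```python
import json


def parse_cmcd(payload: str) -> dict:
    """Parse a comma-separated CMCD key/value string into a dict.

    Simplification: assumes no commas inside quoted string values.
    """
    parsed = {}
    for item in payload.split(","):
        item = item.strip()
        if not item:
            continue
        key, sep, value = item.partition("=")
        if not sep:
            parsed[key] = True         # a bare key means boolean true
        elif value.startswith('"') and value.endswith('"'):
            parsed[key] = value[1:-1]  # quoted string, e.g. sid
        else:
            try:
                parsed[key] = int(value)
            except ValueError:
                parsed[key] = value    # token, e.g. ot=v
    return parsed


# Hypothetical log line with made-up values.
log_line = json.dumps({"customField": 'br=3000,d=6000,mtp=25400,ot=v,sid="abc-123",su'})
cmcd = parse_cmcd(json.loads(log_line)["customField"])
```

A production parser would also need to handle whatever escaping or URL encoding the log pipeline applies, which this sketch does not attempt.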
In Mux Data, a single video playback session is called a "View". The viewer QoE is aggregated across all the views that happened during the time period and is reported in the Metrics dashboard. The calculated metrics allow you to find and diagnose playback issues across the multitude of dimensions (CDNs, devices, types of video, ads, and more) that make up modern streaming platforms.
Once you isolate the population of viewers that are experiencing poor video playback, you can debug specific views to better understand specifically what happened. You can investigate the playback timeline and the metrics derived from the view.
Views, and all the associated metrics and metadata, can be exported from Mux Data to a system that allows you to ingest, store, and report on the analytics. Mux Streaming Exports sends the view logs to a streaming data service, such as Amazon Kinesis or Google Pub/Sub, as the views complete. These services can connect to just about every data system, either using an off-the-shelf connector or by writing a small amount of code to forward the logs to an external service.
A new Streaming Export can be created from the Streaming Exports tab on the Settings page of the Mux Dashboard.
Click the “New streaming export” button and configure the export settings.
You will need to do some configuration of the destination services. We’ve captured the basics for Kinesis and Pub/Sub. From where those blogs left off, Kinesis Firehose was configured to pull the view logs from Kinesis and store them in Amazon S3.
The logs will contain all the data from the view session. An excerpt is shown below:
Once the Akamai request logs and Mux Data view logs are in S3 buckets, Amazon Athena tables are configured for each log type, which allows the data to be queried and analyzed. We can use the common Session ID to join the view-level QoE metrics from Mux Data with the request-level data (including CMCD data) from Akamai.
For each view session, you can report on every request that reached the CDN and correlate poor scores to the request performance.
For example, to demonstrate how easy the shared session ID makes it, here is a query result for one view that shows the Overall Viewer Experience score from Mux Data with the bitrate and transfer time from Akamai requests joined together by the CMCD session id from both sources.
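The shape of that join can be sketched outside of SQL as well. The Python below works on in-memory rows rather than Athena tables, and every field name and value in it is a hypothetical stand-in, not the real Mux Data or Datastream schema.

```python
from collections import defaultdict

# Hypothetical rows; field names and values are invented for illustration.
views = [
    {"session_id": "s1", "viewer_experience_score": 0.92},
    {"session_id": "s2", "viewer_experience_score": 0.41},
]
cdn_requests = [
    {"sid": "s1", "br": 6000, "transfer_ms": 310},
    {"sid": "s1", "br": 6000, "transfer_ms": 295},
    {"sid": "s2", "br": 500, "transfer_ms": 1840},
]

# Index the request-level rows by CMCD session ID.
requests_by_sid = defaultdict(list)
for request in cdn_requests:
    requests_by_sid[request["sid"]].append(request)

# One output row per CDN request, carrying the view-level score with it.
joined = [
    {**view, **request}
    for view in views
    for request in requests_by_sid.get(view["session_id"], [])
]
```

Each joined row pairs a view-level metric with a single request, which is the same one-to-many relationship the session ID provides between the two log sources.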
Now that we’ve got the logs in from both Mux Data and Akamai and they are queryable, it’s time to do some data analysis.
For this demonstration, we created a collection of video view sessions, each of us running multiple test streams from our different locations. We ran concurrent streams at different levels of available bandwidth: unthrottled, 4 Mbps, 2 Mbps, 1 Mbps, and 400 kbps to simulate real-world scenarios.
Historically, CDN logs have been blind to player sessionization, which makes it hard to diagnose issues that occur for a specific viewer. Other than our two different IPs, there is little information available to the CDN to disambiguate the streams and identify poorly performing streams.
Currently, teams doing this analysis will attempt to disambiguate single streams by clustering based on IP, device, and the request path to identify the video being watched. This matching is complicated in the simplest case and is basically impossible if you have viewers watching the same content from one location on the same devices. Additionally, Apple Private Relay and other proxy solutions, as well as VPNs, will intentionally obfuscate true client IPs.
With that in mind, we’ll start with throughput, which is the bandwidth available to download content as measured by the player. This data is available from CMCD, but without sessionization, the best you can do is report that someone (Will!) has a much better internet connection. It is difficult to find viewer-specific information when a customer complains about a poor experience.
With session ids from CMCD, you can see much more clearly the throughput available to viewers on the stream level.
The available throughput limits the quality level that the viewer will experience. The higher the available throughput, the higher the bitrate that can be delivered to the player. CDNs are usually blind to the actual bitrate of the segments they are delivering, but if the player is sending the rendered bitrate via CMCD with the media request, that data can be used by the CDN as well.
We can report on the specific time-weighted average bitrate that is requested by each session, and it’s a lot easier to see how the available throughput translates to the bitrates actually delivered during each session.
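One plausible way to compute a time-weighted average is to weight each segment's bitrate by its media duration, using the CMCD `br` (encoded bitrate in kbps) and `d` (object duration in milliseconds) keys. The sketch below assumes that approach and uses made-up sample values. The exact calculation used for the report is not specified here.

```python
def time_weighted_bitrate(segments: list[tuple[int, int]]) -> float:
    """Average bitrate in kbps, weighting each segment by its duration.

    Each tuple is (br, d): encoded bitrate in kbps and duration in ms.
    """
    total_ms = sum(d for _, d in segments)
    if total_ms == 0:
        raise ValueError("no media duration to average over")
    return sum(br * d for br, d in segments) / total_ms


# Hypothetical session: two 6 s segments at 6000 kbps, one 6 s segment at 500 kbps.
session = [(6000, 6000), (6000, 6000), (500, 6000)]
average = time_weighted_bitrate(session)
print(f"{average:.0f} kbps")
```

Weighting by duration matters when segment lengths differ, since a plain mean over requests would overcount short segments.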
Now let’s look at how the client-side analytics from Mux Data can be used with CMCD data.
The most powerful use case is simply the ability to easily trace session logs from the client player to the CDN. Before CMCD, there was no standard way to provide a shared ID across a player and CDN, so each team needed to build a bespoke solution into their players. Now it becomes much simpler.
Let’s say you find a period of high rebuffering. Mux Data allows you to triangulate a number of metadata properties (such as CDN, ISP, device, and video title) to identify the issue and find the views with the worst experience. Once you find the views in Mux Data, often the next step is to check the CDN logs to see if there are any obvious issues for those view sessions.
You can record the Mux Data View Session ID, which contains the CMCD Session ID, from each poorly performing view. Now it just takes a simple SQL query using the CMCD session id to identify the CDN logs for that specific session. No more fuzzy matching or custom solutions to tie player and CDN sessions together.
Easily accessing the sessionized CDN logs after identifying an issue in the playback analytics allows you to diagnose root causes more quickly, which translates to faster response times and lower support costs.
Video teams are always trying to improve the video experience in their application. To help measure the video playback experience, Mux calculates an “Overall Viewer Experience Score” based on a few categories of the playback experience, including Startup Time, Smoothness, Playback Success, and Video Quality. But it’s important to understand how the quality of the experience correlates with some of the capabilities of the device and the network the viewer is watching from.
This analysis is going to change with your specific audience, but for our demo sessions, we can see that the quality was largely determined by two dimensions. This chart shows the Mux Overall Viewer Experience Score, which is the score for each session, on the y-axis; throughput reported via CMCD is on the x-axis; and the size of the bubbles represents the Mux Rebuffer Percentage.
For this example, the Overall Viewer Experience Score is largely determined by the available throughput, which dictates the bitrates that can be delivered. Time spent rebuffering can significantly reduce the score as well.
What is the takeaway from this chart? You can’t change a viewer’s throughput, but you probably want to try as much as possible to reduce the bitrate without noticeably sacrificing quality. Perhaps consider using a more efficient codec or optimizing the encoder presets to allow for a small reduction in quality for a larger reduction in bitrate.
Of course, this is just a simple example. Your data will show different causes affecting your viewer experiences.
Akamai is rolling out a number of CMCD/Datastream2 enhancements in Q2-23. While in this version we used the custom log field to hold our CMCD data, in the next release of Datastream2, the CMCD metrics will be first-class citizens of the selectable log fields, with a UI similar to the one shown here. The baseline metadata in AMD will also be improved so that all CMCD variables are captured automatically with no advanced metadata required. Finally, prefetching via CMCD@nor (next object request) hints will be enabled. Prefetching pulls the next segment to the edge in advance of the player requesting it, improving delivery performance, especially in cases in which the origin is far from the edge. CMCD-based prefetching is robust over nonpredictable segment naming and also allows a player to signal an up-switch, avoiding the cold hit that normally results after up-switches with alternate prefetching schemes.
Mux is focused on making it easier to integrate video analytics data into more environments. The ability to export data in JSON format, in addition to the current protobuf support, is expected to be released in early Q2. Additional Streaming Exports destinations will be added over time, allowing for even faster integration with the services customers use for data warehousing and reporting. Let us know what service you use for your data analytics, and we’ll add it to the list for consideration. Mux SDKs do not normally need to be updated when players add CMCD features, but we will continue to monitor player functionality and add support as necessary.
If you have any questions about the methods, workflows, or results shown here, please don’t hesitate to get in touch with us. Both Akamai (Meeting Room W235LMR overlooking West Hall at LVCC) and Mux will be at NAB 2023, and we’d love to meet with you and answer your questions. We are hosting a Lunch + Learn event about CMCD on Tuesday, April 18 and would love to meet you there. CMCD is a simple yet powerful tool for linking client-side and server-side data, discovering relationships, debugging delivery problems, and ultimately improving the quality of every end user’s experience.