AI Content Moderation for UGC Video: How to Build a Safe Upload Pipeline with Mux Robots

Every UGC video platform eventually encounters the same hard truth: users will upload content that should never see the light of day. Not as an edge case. Not eventually. Immediately, and at scale.

The first time you open uploads to the public, someone will test your limits. The question isn't whether harmful content will arrive — it's whether your platform has the infrastructure to catch it before it reaches an audience. Manual review queues don't scale. Bolting on third-party moderation APIs adds integration overhead, frame extraction pipelines, and latency you didn't plan for. And building your own ML models for nudity and violence detection is a multi-month project before you've written a single line of product code.

This guide covers how to build a complete, production-ready content moderation pipeline using Mux Robots, which includes a native moderate workflow that runs directly on your Mux video assets. No separate services. No custom ML infrastructure. No frame extraction. The pipeline looks like this: upload → transcode → moderate (async via webhook) → publish or flag for human review.

By the end, you'll have working code for every stage of that pipeline, a strategy for tuning thresholds to your platform's risk tolerance, and a three-tier routing system that sends only the genuinely ambiguous content to human reviewers.

Why Content Moderation Is Non-Negotiable Infrastructure

Before getting into implementation, it's worth being specific about why this matters — because "moderation is important" is easy to nod at and harder to prioritize when you're shipping features.

App store requirements are a hard gate. Apple and Google both require UGC platforms to have active content moderation in place for app approval. Without it, your app doesn't ship.

Legal exposure is real. CSAM reporting obligations exist regardless of your platform's size. Liability for user-uploaded content varies by jurisdiction but is a growing area of regulatory attention in the US, EU, and UK.

Advertisers require brand safety. If you're ad-supported or planning to be, buyers need confidence that their ads won't appear alongside harmful content. Without content classification, you can't make that guarantee.

One incident can define your platform. Trust takes months to build and seconds to destroy. A single viral clip of harmful content associated with your product creates a reputation problem that outlasts any feature launch.

And then there's the economics. Manual review at scale costs roughly $0.02–$0.10 per video depending on length and reviewer cost. Screening with AI first can cut human review volume by 90% or more, because reviewers only see what the model flags, and that keeps the math workable as upload volume grows.

How Mux Robots Moderate Works

Mux Robots provides hosted AI workflows that run directly on your Mux video assets. The moderate workflow analyzes sampled frames from a video and returns confidence scores for two categories: sexual content and violence. Both categories are scored from 0.0 to 1.0, where higher values indicate higher confidence that the content is present.

You create a moderation job by POSTing to /robots/v0/jobs/moderate:

json


POST /robots/v0/jobs/moderate
{
  "asset_id": "DS00Spx1C00abYN",
  "settings": {
    "thresholds": {
      "sexual": 0.7,
      "violence": 0.8
    },
    "sampling_interval": 10
  }
}

The defaults (0.7 for sexual content, 0.8 for violence) are reasonable starting points for a general-audience platform. Lower values make the system stricter — a threshold of 0.3 means content scoring above 0.3 gets flagged, catching more borderline material at the cost of more false positives.

The sampling_interval parameter controls how often frames are sampled, in seconds. The minimum is 5 seconds, and the default is 10 seconds. For a 10-minute video at the default, that's 60 frames analyzed. You can also set max_samples to cap the total number of frames analyzed — useful for cost predictability when you're processing high volumes.
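The frame arithmetic is easy to sanity-check with a small helper. This is a rough sketch, assuming one frame per interval and that max_samples acts as a simple ceiling; estimateSampleCount is a hypothetical name rather than part of the Mux API, and the real sampler may behave differently at the edges of a video.

```javascript
// Hypothetical helper: estimates how many frames a moderate job would analyze.
// Assumes one frame per interval and max_samples applied as a simple cap.
function estimateSampleCount(durationSeconds, samplingInterval = 10, maxSamples = Infinity) {
  if (samplingInterval < 5) {
    throw new RangeError("sampling_interval has a minimum of 5 seconds");
  }
  const byInterval = Math.floor(durationSeconds / samplingInterval);
  return Math.min(byInterval, maxSamples);
}

console.log(estimateSampleCount(600)); // 10-minute video at the default interval: 60
console.log(estimateSampleCount(600, 5, 50)); // 5-second interval capped at 50 samples: 50
```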

The response includes:

thumbnail_scores: an array of per-frame results, each with a timestamp and scores for each category
max_scores: the highest score seen across all sampled frames for each category
exceeds_threshold: a boolean indicating whether any frame crossed either threshold
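
To see how the three fields relate, here is a sketch that recomputes max_scores and exceeds_threshold from a thumbnail_scores array. The exact field layout is an assumption (each entry is treated as having timestamp, sexual, and violence fields), the sample data is made up, and whether Mux compares with "greater than" or "greater than or equal" is not confirmed here.

```javascript
// Assumed entry shape: { timestamp, sexual, violence }. Sample data is invented.
function summarizeScores(thumbnailScores, thresholds) {
  const maxScores = { sexual: 0, violence: 0 };
  for (const frame of thumbnailScores) {
    maxScores.sexual = Math.max(maxScores.sexual, frame.sexual);
    maxScores.violence = Math.max(maxScores.violence, frame.violence);
  }
  // Flag the video if the worst frame in either category crosses its threshold.
  const exceedsThreshold =
    maxScores.sexual > thresholds.sexual || maxScores.violence > thresholds.violence;
  return { max_scores: maxScores, exceeds_threshold: exceedsThreshold };
}

const sampleFrames = [
  { timestamp: 0, sexual: 0.02, violence: 0.1 },
  { timestamp: 10, sexual: 0.05, violence: 0.85 },
  { timestamp: 20, sexual: 0.01, violence: 0.3 },
];
console.log(summarizeScores(sampleFrames, { sexual: 0.7, violence: 0.8 }));
```

A single high-scoring frame is enough to flag the whole video, which is why the per-frame timestamps matter later for human review.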

Here's a strict-mode configuration appropriate for a children's platform:

json


POST /robots/v0/jobs/moderate
{
  "asset_id": "DS00Spx1C00abYN",
  "settings": {
    "thresholds": {
      "sexual": 0.3,
      "violence": 0.4
    },
    "sampling_interval": 5,
    "max_samples": 50
  }
}

Lower thresholds, more frequent sampling, and a capped sample count for cost control. This configuration will flag more borderline content — which is the right tradeoff when your audience is children and your platform's reputation depends on a zero-tolerance standard.

Building the Async Moderation Pipeline with Mux Webhooks

The right architectural pattern for upload moderation is async — you don't want to block the upload experience while waiting for moderation to complete. Instead, you trigger moderation as soon as the asset is ready, and gate publication on the result.

The full flow:

User uploads a video (via direct upload or server-side ingest)
Mux transcodes the asset and fires video.asset.ready
Your webhook handler triggers a Robots moderate job
Mux fires robots.job.moderate.completed when analysis is done
Your handler routes the asset to published, flagged for review, or auto-rejected

The asset starts private. It only becomes public if moderation passes.
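
One way to keep that guarantee honest in your own database is to model moderation status as a small state machine. The sketch below is hypothetical: the pending, approved, review, and rejected names match the handlers in this guide, while uploaded and the transition table itself are assumptions about how you might track state.

```javascript
// Hypothetical status model for an upload's moderation lifecycle.
const ALLOWED_TRANSITIONS = {
  uploaded: ["pending"],
  pending: ["approved", "review", "rejected"],
  review: ["approved", "rejected"], // a human reviewer makes the final call
  approved: [],
  rejected: [],
};

function canTransition(from, to) {
  return (ALLOWED_TRANSITIONS[from] ?? []).includes(to);
}

function isPubliclyPlayable(status) {
  // Only approved assets should ever be exposed to viewers.
  return status === "approved";
}

console.log(canTransition("pending", "approved")); // true
console.log(canTransition("rejected", "approved")); // false
```

Checking canTransition before every status write makes it harder for a stray or replayed event to publish something that was already rejected.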

First, let's handle webhooks to trigger moderation when a new upload is ready. Here's a Node.js webhook handler:

javascript


import Mux from "@mux/mux-node";

const mux = new Mux({
  tokenId: process.env.MUX_TOKEN_ID,
  tokenSecret: process.env.MUX_TOKEN_SECRET,
});

export async function handleMuxWebhook(req, res) {
  const event = req.body;

  if (event.type === "video.asset.ready") {
    const assetId = event.data.id;

    // Only moderate user uploads. This assumes your upload code tagged the
    // asset's passthrough field with "ugc_upload" when the upload was created.
    if (event.data.passthrough?.includes("ugc_upload")) {
      await triggerModeration(assetId);
    }
  }

  res.sendStatus(200);
}

async function triggerModeration(assetId) {
  await mux.robots.moderate.create({
    asset_id: assetId,
    settings: {
      thresholds: {
        sexual: 0.7,
        violence: 0.8,
      },
      sampling_interval: 10,
    },
  });

  // Mark asset as pending moderation in your database
  await db.assets.update({
    where: { muxAssetId: assetId },
    data: { moderationStatus: "pending" },
  });
}
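
The handler above only moderates assets whose passthrough contains "ugc_upload", so that marker has to be set when the upload is created. Here is a minimal sketch of building those parameters. buildUgcUploadParams is a hypothetical helper, and the "ugc_upload:<userId>" format is just one possible convention.

```javascript
// Hypothetical helper: builds the request body for creating a Mux direct upload.
// The "ugc_upload" marker matches the check in the webhook handler above.
function buildUgcUploadParams(userId, corsOrigin) {
  const passthrough = `ugc_upload:${userId}`;
  if (passthrough.length > 255) {
    throw new RangeError("passthrough is limited to 255 characters");
  }
  return {
    cors_origin: corsOrigin,
    new_asset_settings: {
      passthrough,
      // No playback_policy is set, on the assumption that the new asset then
      // gets no playback ID and cannot be played until one is added later.
    },
  };
}

console.log(buildUgcUploadParams("user_123", "https://example.com"));
```

You would pass the returned object to your upload-creation call, for example mux.video.uploads.create(...) in the Node SDK.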

Now the handler that processes moderation results:

javascript


export async function handleModerationComplete(req, res) {
  const event = req.body;

  if (event.type === "robots.job.moderate.completed") {
    const { asset_id, results } = event.data;
    const routing = routeByModerationResult(results);

    await applyModerationDecision(asset_id, routing, results);
  }

  res.sendStatus(200);
}

function routeByModerationResult(results) {
  const { exceeds_threshold, max_scores } = results;
  const { sexual, violence } = max_scores;

  // Auto-reject: very high confidence scores
  if (sexual > 0.9 || violence > 0.95) {
    return "rejected";
  }

  // Human review: threshold exceeded but not extreme, or borderline scores
  if (exceeds_threshold || sexual > 0.5 || violence > 0.6) {
    return "review";
  }

  // Auto-approve: low scores across the board
  return "approved";
}

async function applyModerationDecision(assetId, routing, results) {
  switch (routing) {
    case "approved":
      // Make the asset public by adding a public playback ID.
      // (Mux's asset update endpoint does not change playback policy.)
      await mux.video.assets.createPlaybackId(assetId, {
        policy: "public",
      });
      await db.assets.update({
        where: { muxAssetId: assetId },
        data: { moderationStatus: "approved", publishedAt: new Date() },
      });
      break;

    case "review":
      // Store results for human reviewers, asset stays private
      await db.assets.update({
        where: { muxAssetId: assetId },
        data: {
          moderationStatus: "review",
          moderationScores: results.max_scores,
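          // Assumes each thumbnail_scores entry exposes its scores as
          // top-level sexual and violence fields alongside the timestamp.
          flaggedTimestamps: results.thumbnail_scores
            .filter((f) => f.sexual > 0.4 || f.violence > 0.5)
            .map((f) => ({
              timestamp: f.timestamp,
              scores: { sexual: f.sexual, violence: f.violence },
            })),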
        },
      });
      await notifyReviewTeam(assetId);
      break;

    case "rejected":
      // Auto-reject and notify the uploader
      await db.assets.update({
        where: { muxAssetId: assetId },
        data: { moderationStatus: "rejected" },
      });
      await notifyUploader(assetId, "content_policy_violation");
      break;
  }
}

The asset's playback policy stays private until moderation passes. This is the critical detail: video that hasn't been screened is never publicly accessible, even if the upload succeeded.
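
Webhook deliveries can be retried, so the same event may reach your handler more than once. A small guard keeps a repeated completion event from notifying reviewers or uploaders twice. This is an in-memory sketch only, assuming each event carries a unique id field. In production you would more likely rely on a unique constraint in your database.

```javascript
// In-memory sketch only; a real deployment would persist processed event IDs.
function createEventDeduper(maxEntries = 10000) {
  const seen = new Set();
  return function isFirstDelivery(eventId) {
    if (seen.has(eventId)) return false;
    seen.add(eventId);
    // Evict the oldest entry so memory stays bounded (Sets iterate in insertion order).
    if (seen.size > maxEntries) {
      seen.delete(seen.values().next().value);
    }
    return true;
  };
}

const isFirstDelivery = createEventDeduper();
console.log(isFirstDelivery("evt_1")); // true
console.log(isFirstDelivery("evt_1")); // false
```

In the handlers above, you would call isFirstDelivery(event.id) first and return a 200 immediately when it is false.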

Three-Tier Routing: Beyond Simple Pass/Fail

The exceeds_threshold boolean is useful for simple cases, but real moderation pipelines need more nuance. The three-tier routing approach — auto-approve, human review, auto-reject — is what makes AI moderation economically viable.

Think of it as a confidence filter:

Very high scores (sexual > 0.9, violence > 0.95): The model is highly confident this is harmful content. Auto-reject without human review. This tier should handle the clear-cut violations that don't need human eyes.
Middle scores (over your configured thresholds, or over the lower review cutoffs of 0.5 for sexual content and 0.6 for violence, but not extreme): The model flagged something, but there's enough ambiguity that a human should look. This is where context matters: a frame that scores 0.75 for violence might be from a news clip, a documentary, or a video game.
Low scores (well below thresholds): Auto-approve and publish. The vast majority of your uploads will land here.
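
If you expect to tune these tiers, it helps to pull the cutoffs out of the routing function. The sketch below is one possible refactor of the routing logic shown earlier. createRouter and DEFAULT_TIERS are hypothetical names, and the numbers are the ones used in this guide.

```javascript
// Tier cutoffs copied from the routing code in this guide; tune them per platform.
const DEFAULT_TIERS = {
  reject: { sexual: 0.9, violence: 0.95 },
  review: { sexual: 0.5, violence: 0.6 },
};

function createRouter(tiers = DEFAULT_TIERS) {
  return function route({ exceeds_threshold, max_scores }) {
    const { sexual, violence } = max_scores;
    if (sexual > tiers.reject.sexual || violence > tiers.reject.violence) {
      return "rejected";
    }
    if (exceeds_threshold || sexual > tiers.review.sexual || violence > tiers.review.violence) {
      return "review";
    }
    return "approved";
  };
}

const route = createRouter();
console.log(route({ exceeds_threshold: false, max_scores: { sexual: 0.1, violence: 0.2 } })); // approved
console.log(route({ exceeds_threshold: true, max_scores: { sexual: 0.1, violence: 0.85 } })); // review
console.log(route({ exceeds_threshold: true, max_scores: { sexual: 0.95, violence: 0.1 } })); // rejected
```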

The key insight for human reviewers is the thumbnail_scores array. Each entry includes a timestamp and the scores for that frame. Instead of asking a reviewer to watch an entire 10-minute video, you surface only the flagged moments. A reviewer can jump to timestamp 3:42 where the score spiked rather than watching 8 minutes of benign content to reach 2 minutes of questionable material.

This is the mechanism that makes human review sustainable at scale. You're not reviewing videos — you're reviewing moments.
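
As a sketch of how "reviewing moments" might look in code, the helper below merges consecutive flagged frames into time ranges a reviewer can jump to. It assumes frames are sorted by timestamp (in seconds) and carry top-level sexual and violence scores. The function names and cutoff values are illustrative.

```javascript
// Assumed frame shape: { timestamp (seconds), sexual, violence }, sorted by timestamp.
function formatTimestamp(totalSeconds) {
  const minutes = Math.floor(totalSeconds / 60);
  const seconds = Math.floor(totalSeconds % 60);
  return `${minutes}:${String(seconds).padStart(2, "0")}`;
}

function flaggedSegments(frames, cutoffs, samplingInterval = 10) {
  const segments = [];
  for (const frame of frames) {
    if (frame.sexual <= cutoffs.sexual && frame.violence <= cutoffs.violence) continue;
    const last = segments[segments.length - 1];
    // Extend the current segment when this frame directly follows the previous flagged one.
    if (last && frame.timestamp - last.end <= samplingInterval) {
      last.end = frame.timestamp;
    } else {
      segments.push({ start: frame.timestamp, end: frame.timestamp });
    }
  }
  return segments.map((s) => ({ start: formatTimestamp(s.start), end: formatTimestamp(s.end) }));
}

const frames = [
  { timestamp: 200, sexual: 0.1, violence: 0.2 },
  { timestamp: 210, sexual: 0.1, violence: 0.7 },
  { timestamp: 220, sexual: 0.1, violence: 0.65 },
  { timestamp: 230, sexual: 0.1, violence: 0.1 },
  { timestamp: 400, sexual: 0.6, violence: 0.1 },
];
console.log(flaggedSegments(frames, { sexual: 0.4, violence: 0.5 }));
// [ { start: '3:30', end: '3:40' }, { start: '6:40', end: '6:40' } ]
```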

Tuning Thresholds for Your Platform

There's no universal right answer for moderation thresholds. The correct settings depend on your audience, your platform's risk tolerance, and your capacity for human review.

General social platforms (think short-form video for adults): Default thresholds (0.7/0.8) are a reasonable starting point. You'll iterate from there based on what you see in your review queue.

Education platforms: Tighten to 0.5/0.6. Educators and institutions expect a higher standard, and a single incident with a student audience creates disproportionate reputational damage.

Children's content: Strict mode (0.3/0.4) with a 5-second sampling_interval for thorough coverage, plus max_samples: 50 if you need to cap cost. The false positive rate will be higher, but that's the right tradeoff. More human review is preferable to any inappropriate content reaching a young audience.

Enterprise / B2B: Context-dependent, but generally tighter than consumer social. Enterprise procurement teams ask hard questions about content safety policies.
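
These starting points can live in a small preset table so the threshold choice is explicit in code. The sketch below is an assumption about how you might organize it: the preset names and moderationSettingsFor are hypothetical, and enterprise is left out because the right numbers there depend on context.

```javascript
// Threshold values come from the guidance above; the preset names are hypothetical.
const THRESHOLD_PRESETS = {
  general: { sexual: 0.7, violence: 0.8 },
  education: { sexual: 0.5, violence: 0.6 },
  children: { sexual: 0.3, violence: 0.4 },
};

function moderationSettingsFor(platformType, overrides = {}) {
  const thresholds = THRESHOLD_PRESETS[platformType];
  if (!thresholds) {
    throw new Error(`No preset for platform type: ${platformType}`);
  }
  // Copy the preset so callers cannot mutate the shared table.
  return { thresholds: { ...thresholds }, sampling_interval: 10, ...overrides };
}

console.log(moderationSettingsFor("children", { sampling_interval: 5, max_samples: 50 }));
```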

The operational feedback loop matters as much as the initial settings. Track what percentage of uploads get flagged over time, because this is a health metric for your moderation system. If that number spikes, investigate whether thresholds need adjusting or whether a specific content category is driving the change. If it drops to near zero, your thresholds may be too loose.

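Track false positive rates from your human review queue too. If reviewers are approving 95% of what hits their queue, your thresholds are too aggressive and can be loosened. If they're approving less than 50%, most of what reaches the queue is a real violation, which suggests borderline violations may be slipping under the thresholds, so tighten them.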
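
Both metrics are simple to compute from data you already store. The sketch below turns the rules of thumb above into a rough signal. reviewQueueSignal is a hypothetical helper, and the 95% and 50% cutoffs are the ones from this section, not hard rules.

```javascript
// reviewerDecisions: array of "approved" | "rejected" outcomes from human reviewers.
function reviewQueueSignal(totalUploads, flaggedCount, reviewerDecisions) {
  const flagRate = totalUploads === 0 ? 0 : flaggedCount / totalUploads;
  const approved = reviewerDecisions.filter((d) => d === "approved").length;
  const approvalRate =
    reviewerDecisions.length === 0 ? null : approved / reviewerDecisions.length;

  let suggestion = "hold";
  if (approvalRate !== null && approvalRate >= 0.95) {
    suggestion = "loosen"; // reviewers approve nearly everything: mostly false positives
  } else if (approvalRate !== null && approvalRate < 0.5) {
    suggestion = "tighten"; // most flagged uploads are real violations
  }
  return { flagRate, approvalRate, suggestion };
}

const decisions = [...Array(95).fill("approved"), ...Array(5).fill("rejected")];
console.log(reviewQueueSignal(1000, 100, decisions));
```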

Sampling Strategy: Temporal Coverage vs. Cost Predictability

The sampling_interval and max_samples parameters solve different problems.

sampling_interval gives you consistent temporal coverage. At 10 seconds, every 10-second span of the video is represented. For a 5-minute video, that's 30 frames. This is good for videos where harmful content could appear anywhere — you want uniform coverage across the full duration.

max_samples gives you cost predictability. If you set max_samples: 30, you know exactly how many frames will be analyzed regardless of video length. This matters when you're processing high upload volumes and need predictable per-video costs. The tradeoff is that for long videos, the effective sampling interval increases — a 30-minute video with max_samples: 30 samples one frame per minute.

For most UGC platforms, combining both makes sense: set a sampling_interval that gives good coverage for typical video lengths, and a max_samples cap that prevents runaway costs on unusually long uploads.
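
To check how a given combination behaves on long videos, you can estimate the effective gap between sampled frames. This sketch assumes samples are spread evenly across the video once the max_samples cap applies, which matches the description above but is not verified against the service. effectiveSamplingInterval is a hypothetical helper.

```javascript
// Assumes samples are spread evenly across the video once the cap is reached.
function effectiveSamplingInterval(durationSeconds, samplingInterval, maxSamples) {
  const uncapped = Math.floor(durationSeconds / samplingInterval);
  if (uncapped <= maxSamples) return samplingInterval; // the cap never kicks in
  return durationSeconds / maxSamples;
}

console.log(effectiveSamplingInterval(300, 10, 30)); // 5-minute video: 10
console.log(effectiveSamplingInterval(1800, 10, 30)); // 30-minute video: 60
```

If the effective interval for your longest expected uploads is wider than you are comfortable with, raise max_samples or route long videos to a separate policy.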

Live Content and the Real-Time Challenge

Async moderation works well for uploaded video, but live streaming is a different problem. You can't hold a live stream for moderation — the content is real-time by definition.

The practical approaches for live content are:

Delay-based moderation windows: Introduce a 30–60 second delay in the player, giving a moderation system time to analyze recent frames and pull the stream before it reaches viewers. This is common for large live platforms and requires real-time frame analysis infrastructure.

Live-to-VOD moderation: When a live stream ends, the recording becomes a Mux asset. Run Robots moderate on that recording to catch harmful content in the VOD catalog even if it slipped through during the live broadcast.

Moderation signals for post-stream cleanup: Flag timestamps during live streams where signals suggest potential violations, then use those flags to review and redact the VOD recording before it's indexed or promoted.

The latest thumbnail API can also be used with multimodal models to build real-time frame monitoring during live streams — a complementary approach for platforms where live safety is a top concern.

Scaling the Pipeline

The webhook-driven architecture scales naturally. Each upload triggers its own independent moderation job. There's no shared queue to bottleneck, no coordination overhead between uploads. A platform processing 10 uploads per day and a platform processing 100,000 uploads per day use the same code.

A few patterns for operating at scale:

Creator tiering: New creators get stricter thresholds and more thorough sampling. Established creators with a history of clean content can use looser thresholds, reducing human review volume for your most prolific uploaders.

Parallel Robots workflows: The moderate job doesn't have to be the only Robot running on an asset. You can trigger moderate, summarize, and chapter generation in parallel — all three complete asynchronously and report back via webhooks. The asset stays private until moderation passes regardless of what the other jobs find.

Batch processing for existing catalogs: If you're adding moderation to a platform with an existing video catalog, the same Robots API works for historical assets. You can backfill moderation scores on your entire catalog to establish a safety baseline.
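
The creator tiering pattern above can be as simple as choosing settings from a creator's track record before creating the moderation job. The sketch below is illustrative only: the creator record shape, the 20-upload cutoff, and the specific tier values are assumptions, not recommendations from Mux.

```javascript
// Hypothetical creator record: { approvedUploads, violations }. Tier numbers are illustrative.
function settingsForCreator(creator) {
  const isEstablished = creator.approvedUploads >= 20 && creator.violations === 0;
  if (isEstablished) {
    return { thresholds: { sexual: 0.7, violence: 0.8 }, sampling_interval: 10 };
  }
  // New or previously flagged creators get stricter thresholds and denser sampling.
  return { thresholds: { sexual: 0.5, violence: 0.6 }, sampling_interval: 5 };
}

console.log(settingsForCreator({ approvedUploads: 3, violations: 0 }));
console.log(settingsForCreator({ approvedUploads: 150, violations: 0 }));
```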

Content Moderation Is a Pipeline Problem, Not a Product Problem

The framing matters here. Content moderation isn't a feature you build — it's infrastructure you operate. The question isn't whether to do it, but how to integrate it without slowing down your core upload experience or your engineering team.

Mux Robots moderate runs directly on your assets. There's no separate service to integrate, no frame extraction pipeline to build, no ML model to train or maintain. The per-frame scores with timestamps give human reviewers exactly what they need to work efficiently. The webhook-driven architecture means the upload experience is never blocked. The three-tier routing system keeps human review focused on genuine ambiguity.

The pattern — upload → asset.ready → trigger moderate → moderate.completed → publish or flag — scales from your first hundred uploads to your first million without architectural changes.

If you're building a UGC platform or adding user uploads to an existing product, this is the right time to build this pipeline. The cost of retrofitting content moderation after an incident is orders of magnitude higher than building it in from day one — in engineering time, in legal exposure, and in the trust you'll spend the next year trying to rebuild.

Start with the Mux Robots documentation and the moderation workflow reference. The Mux UGC solutions page covers the broader context for building on Mux for user-generated content. And because the entire async pipeline runs on webhooks, the Mux guide to listening for and verifying webhooks is a good place to start.
