# Find key moments
Identify the most compelling moments in a video using the Mux Robots API.
This workflow analyzes both audio and visual content to find segments that stand out for their hook strength, clarity, emotional intensity, novelty, or soundbite quality. It's useful for generating highlight reels, social media clips, or preview content. See the <ApiRefLink href="/docs/api-reference/robots/find-key-moments">Find Key Moments API reference</ApiRefLink> for the full endpoint specification and [Mux Robots pricing](/docs/pricing/overview#mux-robots-pricing) for unit costs.

## Create a `find-key-moments` job

```bash
curl https://api.mux.com/robots/v0/jobs/find-key-moments \
  -H "Content-Type: application/json" \
  -X POST \
  -d '{
    "parameters": {
      "asset_id": "YOUR_ASSET_ID",
      "max_moments": 5
    }
  }' \
  -u "${MUX_TOKEN_ID}:${MUX_TOKEN_SECRET}"
```

<Callout type="info">
  This request is **asynchronous**. The `POST` returns immediately with the job in `pending` status and does not include results. **We strongly recommend listening for the [`robots.job.find_key_moments.completed` webhook](/docs/guides/robots#webhooks).** The webhook payload contains the full completed job, so no follow-up API call is needed. If webhooks aren't an option, poll `GET /robots/v0/jobs/find-key-moments/{JOB_ID}`, using the `id` from the `POST` response, until the status is `completed`.
</Callout>
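
If you do need to poll, the loop can look roughly like the following Python sketch. It assumes the `GET` response wraps the job in a top-level `data` object, as in the example response on this page. The names `fetch_job` and `wait_for_job`, the polling interval, and the timeout are illustrative choices, not part of the API. Only the `pending` and `completed` statuses are described here, so the sketch does not try to detect failed jobs and relies on the timeout instead.

```python
import base64
import json
import os
import time
import urllib.request

API_BASE = "https://api.mux.com/robots/v0/jobs/find-key-moments"


def fetch_job(job_id):
    """GET the job with HTTP Basic auth, mirroring the curl example's -u flag."""
    token = f"{os.environ['MUX_TOKEN_ID']}:{os.environ['MUX_TOKEN_SECRET']}"
    auth = base64.b64encode(token.encode()).decode()
    request = urllib.request.Request(
        f"{API_BASE}/{job_id}",
        headers={"Authorization": f"Basic {auth}"},
    )
    with urllib.request.urlopen(request) as response:
        return json.load(response)["data"]


def wait_for_job(job_id, fetch=fetch_job, interval_s=5.0, timeout_s=600.0,
                 sleep=time.sleep):
    """Poll until the job status is `completed`, or raise TimeoutError."""
    deadline = time.monotonic() + timeout_s
    while True:
        job = fetch(job_id)
        if job["status"] == "completed":
            return job
        if time.monotonic() >= deadline:
            raise TimeoutError(
                f"job {job_id} still {job['status']!r} after {timeout_s}s"
            )
        sleep(interval_s)
```

Passing `fetch` and `sleep` as arguments keeps the loop testable without network access or real delays.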

<Callout type="info">
  Key moment extraction uses transcript cues from the asset to identify compelling segments. Make sure your asset has captions, either [auto-generated](/docs/guides/add-autogenerated-captions-and-use-transcripts) or [manually added](/docs/guides/add-subtitles-to-your-videos), before creating a find-key-moments job.
</Callout>
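
One way to guard against submitting a job for an asset without captions is to inspect the asset before creating the job. The sketch below is hedged: it assumes you have already fetched the asset object from the Mux Video API and that its `tracks` entries expose `type` and `status` fields, with `"text"` and `"ready"` marking a usable caption track. Check those field names against the asset API reference before relying on them. The helper name `has_ready_text_track` is hypothetical.

```python
def has_ready_text_track(asset):
    """Return True if the asset dict lists at least one ready text track.

    Assumption: tracks look like {"type": "text", "status": "ready", ...}.
    """
    return any(
        track.get("type") == "text" and track.get("status") == "ready"
        for track in asset.get("tracks", [])
    )


# Hand-written asset dicts for illustration, not real API output.
captioned = {"tracks": [{"type": "video"}, {"type": "text", "status": "ready"}]}
uncaptioned = {"tracks": [{"type": "video"}, {"type": "audio"}]}

print(has_ready_text_track(captioned))
print(has_ready_text_track(uncaptioned))
```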

## Parameters

| Parameter | Type | Description |
| :-- | :-- | :-- |
| `asset_id` | string | **Required.** The Mux asset ID of the video to analyze. |
| `max_moments` | integer | Maximum number of key moments to extract (1-10). Defaults to 5. |
| `target_duration_ms` | object | Preferred highlight duration range in milliseconds. Both `min` and `max` are required when provided. |
| `target_duration_ms.min` | integer | Required when `target_duration_ms` is provided. Preferred minimum highlight duration in milliseconds. |
| `target_duration_ms.max` | integer | Required when `target_duration_ms` is provided. Preferred maximum highlight duration in milliseconds. |
| `output_steering` | object | Curated controls that guide moment selection, titles, audience, and concepts without changing the output schema. See [Output steering](#output-steering). |
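
The constraints in the table can be checked client-side before sending the request. The following is a minimal Python sketch of a request-body builder. The function name `build_parameters` is hypothetical, and the check that `min` does not exceed `max` is an assumption rather than a documented rule.

```python
def build_parameters(asset_id, max_moments=None, target_duration_ms=None,
                     output_steering=None):
    """Build the JSON body for a find-key-moments job, validating as we go."""
    if not asset_id:
        raise ValueError("asset_id is required")
    parameters = {"asset_id": asset_id}

    if max_moments is not None:
        # Documented range is 1-10; the API defaults to 5 when omitted.
        if not 1 <= max_moments <= 10:
            raise ValueError("max_moments must be between 1 and 10")
        parameters["max_moments"] = max_moments

    if target_duration_ms is not None:
        # Both bounds are required whenever the object is provided.
        missing = {"min", "max"} - set(target_duration_ms)
        if missing:
            raise ValueError(f"target_duration_ms is missing {sorted(missing)}")
        # Assumption: a range with min greater than max is not meaningful.
        if target_duration_ms["min"] > target_duration_ms["max"]:
            raise ValueError("target_duration_ms.min must not exceed max")
        parameters["target_duration_ms"] = {
            "min": target_duration_ms["min"],
            "max": target_duration_ms["max"],
        }

    if output_steering is not None:
        parameters["output_steering"] = output_steering

    return {"parameters": parameters}


body = build_parameters(
    "YOUR_ASSET_ID",
    max_moments=3,
    target_duration_ms={"min": 15000, "max": 60000},
)
```

Serializing `body` with `json.dumps` gives a payload suitable for the `-d` argument of the `POST` request.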

### Output steering

Use `output_steering` when you want best-effort control over which moments are selected and how they're described. These fields guide the workflow but do not guarantee exact output.

| Field | Type | Description |
| :-- | :-- | :-- |
| `selection_strategy` | string | Preferred definition of a strong standalone moment. Supported values: `standalone_hooks`, `educational_takeaways`, `story_beats`, `product_moments`, and `speaker_highlights`. |
| `title_style` | string | Preferred style for generated moment titles. Supported values: `descriptive`, `punchy`, `educational`, and `social`. |
| `audience` | string | Intended audience used to guide moment selection and titles. |
| `brand_terms` | array of strings | Preferred brand or domain terms to use when supported by the source content. |
| `rubric_priorities` | array of strings | Up to 4 rubric dimensions used as tie-breakers after applying the selection strategy. Supported values: `clarity_in_isolation`, `emotional_intensity`, `novelty`, and `soundbite_quality`. |
| `topic_taxonomy` | object | Controlled vocabulary used to steer notable audible concepts without changing the response schema. |
| `topic_taxonomy.name` | string | Optional customer-facing name for the taxonomy. |
| `topic_taxonomy.values` | array | Controlled vocabulary values. Each value has a required `label` and optional `description` and `aliases`. |
| `topic_taxonomy.allow_other` | boolean | When `true`, non-taxonomy values may be used when no taxonomy value applies. |

```json
{
  "parameters": {
    "asset_id": "YOUR_ASSET_ID",
    "max_moments": 5,
    "output_steering": {
      "selection_strategy": "standalone_hooks",
      "title_style": "social",
      "audience": "developers scrolling a social feed",
      "brand_terms": ["Mux Video", "Mux Data"],
      "rubric_priorities": ["soundbite_quality", "emotional_intensity"],
      "topic_taxonomy": {
        "name": "Themes",
        "values": [
          {
            "label": "Video as data",
            "description": "Treating video content as structured, queryable information",
            "aliases": ["structured video", "queryable video"]
          },
          {
            "label": "Developer experience"
          }
        ],
        "allow_other": true
      }
    }
  }
}
```
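
Because `output_steering` accepts a fixed set of values for several fields, it can help to validate the object locally before submitting a job. The Python sketch below checks only the rules listed in the table above. The function name `validate_output_steering` is hypothetical, and the API may enforce additional rules that this sketch does not know about.

```python
SELECTION_STRATEGIES = {
    "standalone_hooks", "educational_takeaways", "story_beats",
    "product_moments", "speaker_highlights",
}
TITLE_STYLES = {"descriptive", "punchy", "educational", "social"}
RUBRIC_PRIORITIES = {
    "clarity_in_isolation", "emotional_intensity", "novelty", "soundbite_quality",
}


def validate_output_steering(steering):
    """Return a list of problems found; an empty list means none were found."""
    problems = []

    strategy = steering.get("selection_strategy")
    if strategy is not None and strategy not in SELECTION_STRATEGIES:
        problems.append(f"unsupported selection_strategy: {strategy!r}")

    style = steering.get("title_style")
    if style is not None and style not in TITLE_STYLES:
        problems.append(f"unsupported title_style: {style!r}")

    priorities = steering.get("rubric_priorities", [])
    if len(priorities) > 4:
        problems.append("rubric_priorities accepts at most 4 entries")
    for priority in priorities:
        if priority not in RUBRIC_PRIORITIES:
            problems.append(f"unsupported rubric priority: {priority!r}")

    # Each taxonomy value needs a label; description and aliases are optional.
    taxonomy = steering.get("topic_taxonomy") or {}
    for index, value in enumerate(taxonomy.get("values", [])):
        if not value.get("label"):
            problems.append(f"topic_taxonomy.values[{index}] is missing a label")

    return problems


steering = {
    "selection_strategy": "standalone_hooks",
    "title_style": "social",
    "rubric_priorities": ["soundbite_quality", "emotional_intensity"],
    "topic_taxonomy": {
        "name": "Themes",
        "values": [{"label": "Video as data"}, {"label": "Developer experience"}],
        "allow_other": True,
    },
}
print(validate_output_steering(steering))
```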

## Output

The `outputs` object is included in the job once its status is `completed`. You'll receive it on the [`robots.job.find_key_moments.completed`](/docs/guides/robots#webhooks) webhook (recommended), or you can fetch it with `GET /robots/v0/jobs/find-key-moments/{JOB_ID}`. It contains:

| Field | Type | Description |
| :-- | :-- | :-- |
| `moments` | array | Extracted key moments, ordered by position in the video. |
| `moments[].start_ms` | number | Moment start time in milliseconds. |
| `moments[].end_ms` | number | Moment end time in milliseconds. |
| `moments[].overall_score` | number | Weighted quality score (0.0-1.0) based on hook strength, clarity, emotional intensity, novelty, and soundbite quality. |
| `moments[].title` | string | Short, catchy title for the moment (3-8 words). |
| `moments[].audible_narrative` | string | One-sentence summary of what is being said. |
| `moments[].notable_audible_concepts` | array | Key audible concepts (2-5 word phrases). |
| `moments[].visual_narrative` | string | One-sentence summary of what is visually happening. Present for video assets only. |
| `moments[].notable_visual_concepts` | array | Scored visual concepts extracted from sampled frames (video assets only). Each has `concept`, `score`, and `rationale`. |
| `moments[].cues` | array | Contiguous transcript segments with `start_ms`, `end_ms`, and `text`. |
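
Since `moments` is ordered by position rather than by quality, a common next step is to rank moments by `overall_score` and turn the millisecond offsets into readable timestamps. The following Python sketch shows one way to do that with the fields in the table above. The helper names and the `min_score` threshold are illustrative choices, not part of the API.

```python
def format_timestamp(ms):
    """Format a millisecond offset as HH:MM:SS.mmm."""
    total_seconds, millis = divmod(int(ms), 1000)
    minutes, seconds = divmod(total_seconds, 60)
    hours, minutes = divmod(minutes, 60)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d}.{millis:03d}"


def rank_moments(moments, min_score=0.0):
    """Keep moments scoring at least min_score, highest score first."""
    kept = [m for m in moments if m["overall_score"] >= min_score]
    return sorted(kept, key=lambda m: m["overall_score"], reverse=True)


def describe(moment):
    """One-line summary: time range, duration, title, and score."""
    duration_s = (moment["end_ms"] - moment["start_ms"]) / 1000
    return (
        f"{format_timestamp(moment['start_ms'])} to "
        f"{format_timestamp(moment['end_ms'])} ({duration_s:.1f}s) "
        f"{moment['title']} [score {moment['overall_score']:.2f}]"
    )


# Hand-written sample data using the documented field names.
moments = [
    {"start_ms": 12400, "end_ms": 28900, "overall_score": 0.92,
     "title": "The Future of Video Data"},
    {"start_ms": 95000, "end_ms": 110500, "overall_score": 0.41,
     "title": "A Quick Aside"},
]

for moment in rank_moments(moments, min_score=0.5):
    print(describe(moment))
# prints: 00:00:12.400 to 00:00:28.900 (16.5s) The Future of Video Data [score 0.92]
```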

## Example response

This is the payload delivered to the [`robots.job.find_key_moments.completed`](/docs/guides/robots#webhooks) webhook, and the same shape you get from `GET /robots/v0/jobs/find-key-moments/{JOB_ID}`:

```json
{
  "data": {
    "id": "rjob_mno345",
    "workflow": "find-key-moments",
    "status": "completed",
    "units_consumed": 1,
    "parameters": {
      "asset_id": "YOUR_ASSET_ID",
      "max_moments": 3
    },
    "outputs": {
      "moments": [
        {
          "start_ms": 12400,
          "end_ms": 28900,
          "overall_score": 0.92,
          "title": "The Future of Video Data",
          "audible_narrative": "The speaker explains how AI transforms video from passive content into structured, queryable data.",
          "notable_audible_concepts": ["video as data", "AI transformation", "structured information"],
          "visual_narrative": "The speaker gestures at a diagram showing video processing pipeline stages.",
          "notable_visual_concepts": [
            { "concept": "pipeline diagram", "score": 0.87, "rationale": "Directly illustrates the concept being discussed" }
          ],
          "cues": [
            { "start_ms": 12400, "end_ms": 16200, "text": "What's exciting is that video isn't just content anymore." },
            { "start_ms": 16200, "end_ms": 22100, "text": "Every video you upload is a dataset waiting to be queried." }
          ]
        }
      ]
    }
  }
}
```
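
A payload of this shape can be unpacked with a few lines of Python. The sketch below assumes the body has already been parsed from JSON into a dict, and that webhook signature verification, if you use it, happens elsewhere. The helper names `extract_moments` and `transcript_for` are hypothetical.

```python
def extract_moments(payload):
    """Return the moments list from a completed find-key-moments job payload."""
    job = payload.get("data", {})
    if job.get("workflow") != "find-key-moments":
        return []
    if job.get("status") != "completed":
        return []
    return job.get("outputs", {}).get("moments", [])


def transcript_for(moment):
    """Join a moment's transcript cues into a single string."""
    return " ".join(cue["text"] for cue in moment.get("cues", []))


# Trimmed copy of the example response above.
payload = {
    "data": {
        "id": "rjob_mno345",
        "workflow": "find-key-moments",
        "status": "completed",
        "outputs": {
            "moments": [
                {
                    "start_ms": 12400,
                    "end_ms": 28900,
                    "title": "The Future of Video Data",
                    "cues": [
                        {"start_ms": 12400, "end_ms": 16200,
                         "text": "What's exciting is that video isn't just content anymore."},
                        {"start_ms": 16200, "end_ms": 22100,
                         "text": "Every video you upload is a dataset waiting to be queried."},
                    ],
                }
            ]
        },
    }
}

for moment in extract_moments(payload):
    print(moment["title"], "-", transcript_for(moment))
```

Returning an empty list for jobs that are not `completed` lets the same helper handle both webhook deliveries and polling responses.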
