Creates a new job that generates high-quality captions for a Mux Video asset and uploads the resulting VTT as a text track.
Arbitrary string stored with the job and returned in responses. Useful for correlating jobs with your own systems.
The Mux asset ID of the video to caption.
BCP 47 language code of the audio (e.g. "en", "es"). When omitted, the language is auto-detected.
When true, any existing text track with the same language code is deleted before uploading the new caption track. When false (default), the request is rejected if a matching track already exists.
Custom name for the uploaded Mux text track. Defaults to "{Language} (Generated)" using the resolved language code.
When true, speaker labels are identified and added to each caption cue. Useful for interviews, podcasts, and multi-speaker content.
When true, word-level timestamps are exported as a JSON file accessible via temporary_words_url in the job outputs. The URL expires 7 days after the job completes. Jobs with this option enabled are billed at a higher unit rate.
Whether to upload the generated VTT to the Mux asset as a new text track. Defaults to true. When false, no track is created and replace_existing must also be false; the generated VTT remains available via temporary_vtt_url.
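The parameter rules above can be checked client-side before a request is sent. The following is a minimal sketch, not part of any official SDK: the helper name build_caption_job_body is hypothetical, and the field names are taken from the example request body. It covers only the fields shown in that example, leaves language_code out when it is not given so the language can be auto-detected, and rejects the one combination the descriptions rule out (upload_to_mux false with replace_existing true).

```python
from typing import Any, Dict, Optional


def build_caption_job_body(
    asset_id: str,
    language_code: Optional[str] = None,
    replace_existing: bool = False,
    include_speakers: bool = False,
    include_words: bool = False,
    upload_to_mux: bool = True,
) -> Dict[str, Any]:
    """Build a request body for a caption job (hypothetical helper)."""
    if not asset_id:
        raise ValueError("asset_id is required")

    # Documented constraint: when no track is uploaded,
    # replace_existing must also be false.
    if not upload_to_mux and replace_existing:
        raise ValueError(
            "replace_existing must be false when upload_to_mux is false"
        )

    parameters: Dict[str, Any] = {
        "asset_id": asset_id,
        "replace_existing": replace_existing,
        "include_speakers": include_speakers,
        "include_words": include_words,
        "upload_to_mux": upload_to_mux,
    }

    # Omit language_code entirely so the language is auto-detected.
    if language_code is not None:
        parameters["language_code"] = language_code

    return {"parameters": parameters}
```

Calling build_caption_job_body("mux_asset_123abc", language_code="en") produces a body with the same fields as the example request that follows. Failing early on the invalid combination avoids a round trip for a request that would presumably be rejected anyway.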
{
"parameters": {
"asset_id": "mux_asset_123abc",
"language_code": "en",
"replace_existing": false,
"include_speakers": false,
"include_words": false,
"upload_to_mux": true
}
}

{
"data": {
"id": "rjob_example123",
"workflow": "generate-premium-captions",
"status": "pending",
"units_consumed": 0,
"created_at": 1700000000,
"updated_at": 1700000060,
"parameters": {
"asset_id": "mux_asset_123abc",
"language_code": "en",
"replace_existing": false,
"include_speakers": false,
"include_words": false,
"upload_to_mux": true
}
}
}
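A response like the one above can be reduced to the fields a caller usually tracks. This is a sketch under stated assumptions: summarize_job is a hypothetical helper, and created_at and updated_at are assumed to be Unix timestamps in seconds, which the example values suggest but the text does not state.

```python
from datetime import datetime, timezone
from typing import Any, Dict


def summarize_job(response: Dict[str, Any]) -> Dict[str, Any]:
    """Extract the job ID, status, and timestamps from a job response."""
    data = response["data"]
    # Assumption: timestamps are Unix seconds.
    created = datetime.fromtimestamp(data["created_at"], tz=timezone.utc)
    updated = datetime.fromtimestamp(data["updated_at"], tz=timezone.utc)
    return {
        "id": data["id"],
        "status": data["status"],
        "units_consumed": data["units_consumed"],
        "created_at": created,
        "updated_at": updated,
        "asset_id": data["parameters"]["asset_id"],
    }


# Trimmed copy of the example response above.
example_response = {
    "data": {
        "id": "rjob_example123",
        "workflow": "generate-premium-captions",
        "status": "pending",
        "units_consumed": 0,
        "created_at": 1700000000,
        "updated_at": 1700000060,
        "parameters": {"asset_id": "mux_asset_123abc"},
    }
}

summary = summarize_job(example_response)
print(summary["id"], summary["status"])
```

Storing the job ID alongside your own record of the asset makes it possible to look the job up again later.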