Subtitle and Caption Sources
Hybrik can use subtitle or closed caption files for several operations. These include:
- Burning closed captions or subtitles into the video
- Embedding closed captions into a video
- Adding subtitles as a target for DASH or HLS streaming
- Converting from one subtitle or caption type to another
Supported Source Types
Hybrik can accept the following sources, but there may be edge cases that are not supported. For example, Hybrik supports TTML and elements of most TTML profiles, such as IMSC1, DFXP, and SMPTE-TT. Within each of these formats, some variants work and some do not. For example, Hybrik may accept a DFXP profile of TTML but will not support any image or binary data. In general, if you are unsure whether your subtitle format will work, we recommend testing it in a job.
Formats
- Closed Caption Sources
  - Scenarist Closed Caption (.scc extension)
  - Embedded CEA-608
- Subtitle Sources
  - Timed Text Markup Language, or TTML (.xml)
  - SubRip (.srt)
  - WebVTT (.vtt)
The contents array
The contents array is part of a source where you can provide Hybrik with more parameters about your source. At its simplest, you can say that your source contains audio, or video, or both.
```json
"contents": [
  {
    "kind": "video"
  },
  {
    "kind": "audio"
  }
]
```
But you can also specify other details in the contents array. For example, when we specify a closed_caption source, we can specify the format of the captions or subtitles. More specific examples will follow.
Specifying a Sidecar Closed Caption File
You can specify a closed caption file as a sidecar file in an asset_complex alongside a video. Read our Tutorial on Complex Assets for full examples. Typically, additional parameters about an asset are specified inside that component’s contents array.
```json
{
  "uid": "sources",
  "kind": "source",
  "payload": {
    "kind": "asset_complex",
    "payload": {
      "asset_versions": [
        {
          "asset_components": [
            {
              "kind": "name",
              "name": "{{source_name}}",
              "location": {
                "storage_provider": "s3",
                "path": "{{source_path}}"
              },
              "contents": [
                {
                  "kind": "video"
                },
                {
                  "kind": "audio"
                }
              ]
            },
            {
              "component_uid": "caption1",
              "kind": "name",
              "name": "{{captions_name}}",
              "location": {
                "storage_provider": "s3",
                "path": "{{captions_path}}"
              },
              "contents": [
                {
                  "kind": "closed_caption",
                  "payload": {
                    "format": "scc"
                  }
                }
              ]
            }
          ]
        }
      ]
    }
  }
}
```
Specifying a Sidecar Subtitle File
A subtitle sidecar file is specified in much the same way as a closed caption sidecar file:
```json
{
  "uid": "sources",
  "kind": "source",
  "payload": {
    "kind": "asset_complex",
    "payload": {
      "asset_versions": [
        {
          "asset_components": [
            {
              "kind": "name",
              "name": "{{source_name}}",
              "location": {
                "storage_provider": "s3",
                "path": "{{source_path}}"
              },
              "contents": [
                {
                  "kind": "video"
                },
                {
                  "kind": "audio"
                }
              ]
            },
            {
              "component_uid": "caption1",
              "kind": "name",
              "name": "{{captions_name}}",
              "location": {
                "storage_provider": "s3",
                "path": "{{captions_path}}"
              },
              "contents": [
                {
                  "kind": "subtitle",
                  "payload": {
                    "format": "auto"
                  }
                }
              ]
            }
          ]
        }
      ]
    }
  }
}
```
Embedded Closed Captions as a Source
Hybrik has limited support for embedded closed captions as a source. Hybrik can generally use embedded CEA-608 closed captions for embedding or for converting to another text format, such as a subtitle file output. If the captions are not auto-detected, you can tell Hybrik that a source has closed captions in the contents array:
```json
{
  "uid": "source",
  "kind": "source",
  "payload": {
    "kind": "asset_url",
    "payload": {
      "storage_provider": "s3",
      "url": "{{source}}",
      "contents": [
        {
          "kind": "video"
        },
        {
          "kind": "audio"
        },
        {
          "kind": "closed_caption"
        }
      ]
    }
  }
}
```
Mapping Multiple Languages to CEA-608 Fields and CEA-708 Services
When you want to support multiple languages, you typically embed your primary language on CEA-608 CC1, which is mapped to CEA-708 Service 1. Your secondary language is typically mapped to CEA-608 CC3, which is mapped to CEA-708 Service 2. You can specify the first embedded language with scc0 and the second language with scc1 within the contents::payload of each asset_component in your source. Here is a snippet:
```json
"contents": [
  {
    "kind": "closed_caption",
    "payload": {
      "format": "scc0"
    }
  }
]
```
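To show how the two formats might sit side by side, here is a rough sketch of an asset_complex source with one video component and two SCC sidecar components. It assumes the same sidecar layout used earlier on this page; the component_uid values and the `{{primary_captions_*}}` and `{{secondary_captions_*}}` placeholders are hypothetical names chosen for illustration, not values Hybrik requires.

```json
{
  "uid": "sources",
  "kind": "source",
  "payload": {
    "kind": "asset_complex",
    "payload": {
      "asset_versions": [
        {
          "asset_components": [
            {
              "kind": "name",
              "name": "{{source_name}}",
              "location": {
                "storage_provider": "s3",
                "path": "{{source_path}}"
              },
              "contents": [
                { "kind": "video" },
                { "kind": "audio" }
              ]
            },
            {
              "component_uid": "caption_primary",
              "kind": "name",
              "name": "{{primary_captions_name}}",
              "location": {
                "storage_provider": "s3",
                "path": "{{primary_captions_path}}"
              },
              "contents": [
                {
                  "kind": "closed_caption",
                  "payload": { "format": "scc0" }
                }
              ]
            },
            {
              "component_uid": "caption_secondary",
              "kind": "name",
              "name": "{{secondary_captions_name}}",
              "location": {
                "storage_provider": "s3",
                "path": "{{secondary_captions_path}}"
              },
              "contents": [
                {
                  "kind": "closed_caption",
                  "payload": { "format": "scc1" }
                }
              ]
            }
          ]
        }
      ]
    }
  }
}
```

In this sketch the scc0 component would carry the primary language (CC1 / Service 1) and the scc1 component the secondary language (CC3 / Service 2), following the mapping described above.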
Sync to Timecode
If your caption/subtitle source is aligned to your video source’s timecode track, you will want to enable timecode synchronization. The option to do this in Hybrik is "sync_to_timecode": true, which will align the subtitles to the video’s timecode.
This example shows sync_to_timecode in a closed_caption source, but it is also valid in subtitle sources.
```json
"contents": [
  {
    "kind": "closed_caption",
    "payload": {
      "format": "scc",
      "sync_to_timecode": true
    }
  }
]
```
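For the subtitle case, a minimal sketch of a single sidecar asset component might look like the following. It assumes the sidecar layout used earlier on this page. The source_timecode_selector value is the documented default (first), written out explicitly only to show where the option would go. Whether a given selector works depends on your codec and container, so treat this as a starting point to test in a job.

```json
{
  "component_uid": "caption1",
  "kind": "name",
  "name": "{{captions_name}}",
  "location": {
    "storage_provider": "s3",
    "path": "{{captions_path}}"
  },
  "contents": [
    {
      "kind": "subtitle",
      "payload": {
        "format": "auto",
        "sync_to_timecode": true,
        "source_timecode_selector": "first"
      }
    }
  ]
}
```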
Offset Subtitle Sources
If you need to offset your subtitles by some number of seconds, use delay_sec. For example, "delay_sec": -30 would start the subtitles 30 seconds earlier:
```json
"contents": [
  {
    "kind": "subtitle",
    "payload": {
      "format": "auto",
      "delay_sec": -30
    }
  }
]
```
Subtitle and Caption source parameters
These are some parameters that you can specify in a source component’s contents array.
Closed Caption source parameters
These options are valid for closed_caption sources such as an scc file.
Name | Type | Description |
---|---|---|
mode | enum: disabled, enabled, auto | Optional. disabled: disable all tracks of this media type. enabled: use this track (fail if it doesn’t exist). auto: use the track if it exists. Default is enabled. If you have captions in a video that you want to suppress, this can be set to disabled on video in the contents array. |
track_name | string | Set closed caption track name. |
format | enum: scc, scc0, scc1 | The format of the closed caption track. |
category | enum: default, sdh, forced, described_music_and_sound | Set the category of the closed caption track. |
delay_sec | integer | Optional. By how many seconds to delay the closed caption track from the start of the video track. Can be positive or negative (a negative value starts the closed caption track before the video). |
sync_to_timecode | boolean | Optional. When set to true, the closed caption track will be synchronized with timecode information from the video track (for example from a timecode track, or timecode embedded in the video track). If false, the closed caption track will start at the beginning of the video track. Default is false. |
source_timecode_selector | enum: first, highest, lowest, mxf, gop, sdti, smpte, material_package, source_package | Selects which metadata track to be used for timecode data. Not all options are valid with all codecs/containers. Default is first. |
timecode_format | enum: df, ndf, auto | Optional. Override the timecode metadata type as drop frame (df) or non-drop frame (ndf), or keep the existing type of the track with auto. |
timecode_frame_rate | enum: 59.94, 29.97, 23.98 | Optional. Override the frame rate of the timecode metadata. |
ingest_repeat_rate | integer | Optional. Minimum: 0, maximum: 2. How often to re-read the closed caption track when processing the asset_complex source. |
language | string | Select closed caption language. |
track_group_id | string | This indicates which Group this track belongs to. Multiple tracks with the same content but different bitrates would have the same track_group_id. |
layer_id | string | This indicates which Layer this track belongs to. For example, this allows bundling one video layer and multiple audio layers with the same bitrates but different languages. |
layer_affinities | array | This indicates which other layers this layer can be combined with. For example, to combine audio and video layers. |
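As an illustration of the mode option, the sketch below shows one possible reading of the note about suppressing captions: a closed_caption entry with mode set to disabled, placed in the contents array of the video source. This placement is an assumption based on the embedded captions example earlier on this page, so verify it in a test job before relying on it.

```json
{
  "uid": "source",
  "kind": "source",
  "payload": {
    "kind": "asset_url",
    "payload": {
      "storage_provider": "s3",
      "url": "{{source}}",
      "contents": [
        { "kind": "video" },
        { "kind": "audio" },
        {
          "kind": "closed_caption",
          "payload": { "mode": "disabled" }
        }
      ]
    }
  }
}
```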
Subtitle Source Parameters
These options are valid on subtitle source files.
Name | Type | Description |
---|---|---|
format | enum: ttml, imsc1, srt, stl, scc, scc0, scc1, webvtt, auto | The format of the subtitle track. |
category | enum: default, sdh, forced, described_music_and_sound | Set the category of the subtitle track. |
delay_sec | integer | Optional. By how many seconds to delay the subtitle track from the start of the video track. Can be positive or negative (a negative value starts the subtitle track before the video). |
sync_to_timecode | boolean | Optional. When set to true, the subtitle track will be synchronized with timecode information from the source (for example from a timecode track, or timecode embedded in the video track). If false, the subtitle track will run synchronously to the video track. Default is false. |
source_timecode_selector | enum: first, highest, lowest, mxf, gop, sdti, smpte, material_package, source_package | Specifies the metadata track to be used for timecode data. Default is first. |
timecode_frame_rate | enum: 59.94, 29.97, 23.98 | Optional. Override the frame rate of the timecode metadata. |
language | string | Select subtitle language. |
track_group_id | string | This indicates which Group this track belongs to. Multiple tracks with the same content but different bitrates would have the same track_group_id. |
layer_id | string | This indicates which Layer this track belongs to. For example, this allows bundling one video layer and multiple audio layers with the same bitrates but different languages. |
layer_affinities | array | This indicates which other layers this layer can be combined with. For example, to combine audio and video layers. |
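To show how several of these parameters might be combined, here is a hedged sketch of a contents entry for an SRT sidecar that sets a category and a language. The language value "en" is an assumption (the table only says the field is a string, so check which language codes your workflow expects), and the two-second delay is an arbitrary illustrative number.

```json
{
  "contents": [
    {
      "kind": "subtitle",
      "payload": {
        "format": "srt",
        "category": "sdh",
        "language": "en",
        "delay_sec": 2
      }
    }
  ]
}
```

The outer braces are there only to make the fragment valid JSON on its own. In a job, the contents array would sit inside the subtitle asset component, as in the sidecar examples earlier on this page.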
See the examples below for sample operations.
Examples
- Burning in Subtitles
- Embedding Captions
- Embedding Multi-Language Captions
- Converting TTML to SCC
- Read more on subtitles/captions and Packaging tasks