Still Image Extraction Tutorial – Thumbnails, Keyframes etc.
Many systems that consume video assets (e.g. VOD and MAM systems) also use metadata such as still images extracted from the source video material. This tutorial demonstrates how to extract still images alongside video transcoding operations and explains the available options, such as the type, location, and quantity of still images extracted. Some examples include:
- Generating jpg or png type image files
- Specifying the frame size of extracted image files
- Extracting one or many images
- Selecting specific frames to extract
Within the Hybrik job JSON, you can specify each of these types of still image extractions using the image_sequence object alongside your normal video output targets, as we will show you in this short tutorial. You should already be familiar with Hybrik’s JSON structure before going further. If not, please take a few minutes to read the Hybrik JSON Tutorial before continuing.
The Image Extraction Operation
The image extraction operation is specified by adding it to the targets element inside of a transcode task. Normal (video) targets elements look similar to this:
{
"uid": "transcode_task",
"kind": "transcode",
"payload": {
"location": {
"storage_provider": "s3",
"path": "{{destination}}"
},
"targets": [
{
"file_pattern": "{source_basename}_h264.mp4",
"existing_files": "replace",
"container": {
"kind": "mp4"
},
"video": {
"codec": "h264",
"bitrate_kb": 5000
},
"audio": [
{
"codec": "aac_lc",
"sample_rate": 48000
}
]
}
]
}
}
By adding an image_sequence object to the video element of a target, you can specify how you would like Hybrik to extract still frames from the source content. Here is the same example targets element, but this time, in addition to the output H.264 MP4 file, we have added a second target that creates an image file from the frame 5 seconds into the video:
{
"uid": "transcode_task",
"kind": "transcode",
"payload": {
"location": {
"storage_provider": "s3",
"path": "{{destination}}"
},
"targets": [
{
"file_pattern": "{source_basename}_h264.mp4",
"existing_files": "replace",
"container": {
"kind": "mp4"
},
"video": {
"codec": "h264",
"bitrate_kb": 5000
},
"audio": [
{
"codec": "aac_lc",
"sample_rate": 48000
}
]
},
{
"file_pattern": "{source_basename}_thumbnail_single_5s.png",
"existing_files": "replace",
"video": {
"codec": "png",
"width": 256,
"height": 144,
"image_sequence": {
"total_number": 1,
"offset_sec": 5
}
}
}
]
}
}
Image File Types and Quality
Hybrik will create either jpg or png output files, selected by using the codec element. The example above creates PNG output files, and the example below shows how to change the codec element to create JPEG files instead. It also shows how to change the JPEG quality setting by using qscale, as well as the frame size of the output image files:
{
"file_pattern": "{source_basename}_thumbnail_single_5s.jpeg",
"existing_files": "replace",
"video": {
"codec": "jpeg",
"width": 512,
"height": 384,
"qscale": 6,
"image_sequence": {
"total_number": 1,
"offset_sec": 5
}
}
}
Another method of controlling image quality and frame size is shown in the example below. "ffmpeg_args": "-q:v 4" is used in this case to set the image quality. The effective range of -q:v for JPEG is 2-31, with 31 being the worst quality. Recommended values are between 2 and 5. "par": 1 sets the pixel aspect ratio and is only required if just the width value is specified, without the corresponding height value.
{
"uid": "thumbnails",
"file_pattern": "{source_basename}_thumbnails_%05d.jpg",
"existing_files": "replace",
"ffmpeg_args": "-q:v 4",
"video": {
"frame_rate": "1/30",
"width": 420,
"par": 1
}
}
Output Filename Formatting
If you want to extract a number of still images instead of a single one, you need to specify not only how many still image files you want, but also how the multiple output image files should be named in the filesystem. To accomplish that, you set the total_number element and add a numbering parameter to the file_pattern output filename string.
Take a file_pattern containing %03d as an example. This part of the string is formatted like a programming language string format specifier (see the C programming language function printf(), for example). The % indicates the start of a format specifier, the 03 specifies that file names should include three digits with leading zeros (000, 001, 002, etc.), and the d indicates a decimal integer. Unless you also specify a value for start_number, filenames are numbered starting with 0.
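As a sketch of the naming rules above, the target below requests 10 images and numbers them with a three-digit counter. The file name, frame size, and the values chosen for total_number and start_number are illustrative assumptions and are not taken from a tested job. With start_number set to 1, the output files would be expected to run from {source_basename}_thumbnail_001.png to {source_basename}_thumbnail_010.png.

```json
{
  "file_pattern": "{source_basename}_thumbnail_%03d.png",
  "existing_files": "replace",
  "video": {
    "codec": "png",
    "width": 256,
    "height": 144,
    "image_sequence": {
      "start_number": 1,
      "total_number": 10
    }
  }
}
```

If start_number is left out, numbering should begin at 000 instead, as described above.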
Image Sequence Variations
Offset From the Start
If you don’t wish to start at the beginning of your file, for example because the beginning is black, you can offset the point at which you begin extracting frames. The options to do so are:
- offset_sec: specify an offset to begin extracting frames, in seconds from the beginning of the file
- relative_offset: specify an offset to begin extracting frames, as a percentage from the beginning of the file, written as a fraction (for example, 0.2 for 20%)
It is important to understand that if your image_sequence is based on a percentage (total_number), the percentage is applied to the “Remaining Period”, not the total file duration. In the above image, the offset is about 15%, and if total_number were set to 5, those 5 images would be extracted evenly from the remaining period (roughly the last 85% of the file).
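To illustrate the remaining period, here is a sketch of a target that skips the first 30 seconds of the source and then requests 5 images. The 30-second offset, file name, and frame size are assumptions chosen for illustration. Based on the behavior described above, the 5 images would be spread evenly across the part of the file after the 30-second mark rather than across the full duration.

```json
{
  "file_pattern": "{source_basename}_thumbnails_offset_30sec_%03d.jpg",
  "existing_files": "replace",
  "video": {
    "codec": "jpeg",
    "width": 256,
    "height": 144,
    "qscale": 6,
    "image_sequence": {
      "start_number": 0,
      "offset_sec": 30,
      "total_number": 5
    }
  }
}
```

For a 10-minute source, for instance, the remaining period would be 9 minutes 30 seconds.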
Extract Frame Every N Seconds
Below, instead of creating a single output still image from the specified source video frame, we create a still image every 10 seconds using frame_rate. When you use a fraction like 1/x, you are telling Hybrik to extract 1 frame every x seconds.
{
"file_pattern": "{source_basename}_thumbnails_every_10seconds_framerate_%05d.png",
"existing_files": "replace",
"video": {
"codec": "png",
"width": 256,
"height": 144,
"frame_rate": "1/10",
"image_sequence": {
"start_number": 0
}
}
},
Here is another example, where we extract an image every 10 seconds, with the first frame extracted at an offset of 10 seconds into the file:
{
"file_pattern": "{source_basename}_thumbnails_every_10seconds_offset_10sec_%05d.png",
"existing_files": "replace",
"video": {
"codec": "png",
"width": 256,
"height": 144,
"frame_rate": "1/10",
"image_sequence": {
"start_number": 0,
"offset_sec": 10
}
}
},
Extract Frame Every N Percent
Instead of specifying the number of seconds, you can also specify an elapsed percentage of the total duration of the source file using the total_number parameter. Below we show how to extract a frame every 20% of the source file. total_number will divide the total duration evenly:
- 10 will result in 10 images, one every 10%
- 5 will result in 5 images, one every 20%
- 100 will result in 100 images, one every 1%
{
"file_pattern": "{source_basename}_thumbnails_every_20percent_%05d.jpg",
"existing_files": "replace",
"video": {
"codec": "jpeg",
"width": 256,
"height": 144,
"qscale": 6,
"image_sequence": {
"start_number": 0,
"total_number": 5
}
}
},
The above code snippet would result in an extraction every 20% as pictured below:
Alternatively, if you told Hybrik to extract 20 images (normally one every 5%) with a relative_offset of 20%, those 20 images would be extracted evenly from the last 80% of the file, that is, one every 4% of the total duration.
{
"file_pattern": "{source_basename}_thumbnails_20images_relative_offset_20percent-%04d.jpg",
"existing_files": "replace",
"video": {
"codec": "jpeg",
"width": 256,
"height": 144,
"qscale": 6,
"image_sequence": {
"start_number": 1,
"relative_offset": 0.2,
"total_number": 20
}
}
},