Delete Task
Introduction
The ability to delete a file or series of files as part of a Hybrik job can be very useful. Sometimes you create intermediate outputs, or a task generates temporary files that you don’t ultimately want.
The delete_asset task removes such files and can be inserted anywhere in your Hybrik job workflow. There are two basic use cases:
- Delete an entire folder
- Delete a single file
WARNING: Deleting objects from cloud storage is a PERMANENT OPERATION and CANNOT BE UNDONE unless the object is stored on a bucket with Object Versioning enabled.
Object Versioning is NOT ENABLED BY DEFAULT on Amazon S3, Google Cloud Storage, or Azure Storage. To learn more about enabling Object Versioning (allowing deleted files to be recovered), check the instructions for your storage provider below:
- Amazon S3
- Google Cloud Storage
- Azure Storage (referred to as “Soft Delete”)
Delete a single file
Below is the JSON snippet for deleting a single file (temp.mp4):
{
  "uid": "delete_file_by_name",
  "kind": "delete_asset",
  "payload": {
    "asset_selector": "config",
    "asset": {
      "kind": "asset_url",
      "payload": {
        "storage_provider": "s3",
        "url": "s3://foo/bar/baz/temp.mp4"
      }
    }
  }
}
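If a workflow needs to delete several known files, the same task can be generated rather than written by hand. The following is a minimal Python sketch that only reproduces the JSON structure shown above. The helper name delete_file_task is hypothetical and not part of the Hybrik API, and nothing here submits a job.

```python
import json


def delete_file_task(uid: str, url: str, storage_provider: str = "s3") -> dict:
    """Build a delete_asset task for one file (hypothetical helper).

    The layout mirrors the single-file JSON snippet in this section.
    """
    return {
        "uid": uid,
        "kind": "delete_asset",
        "payload": {
            "asset_selector": "config",
            "asset": {
                "kind": "asset_url",
                "payload": {
                    "storage_provider": storage_provider,
                    "url": url,
                },
            },
        },
    }


if __name__ == "__main__":
    task = delete_file_task("delete_file_by_name", "s3://foo/bar/baz/temp.mp4")
    # Print the task so it can be pasted into a job definition.
    print(json.dumps(task, indent=2))
```

The printed output should match the snippet above field for field, which makes it easy to compare against a hand-written task.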
Delete an entire folder
Deleting an entire directory, possibly with multiple files and/or sub-folders, can be a dangerous operation. When deleting an entire folder, you have to confirm the folder name using the delete_folder_acknowledgement parameter. Given the path:
s3://foo/bar/baz/temp
delete_folder_acknowledgement would need to be set to temp. Everything within s3://foo/bar/baz/temp will be deleted recursively, including all sub-folders and their files. Your IAM user will need appropriate permissions to remove files, or your task will fail.
{
  "uid": "delete_temp_folder",
  "kind": "delete_asset",
  "payload": {
    "asset_selector": "config",
    "location": {
      "storage_provider": "s3",
      "path": "s3://foo/bar/baz/temp"
    },
    "delete_folder_acknowledgement": "temp"
  }
}
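Because the acknowledgement exists to catch mistakes, a script that generates folder-delete tasks can apply the same check locally before a job is ever submitted. The sketch below assumes, based only on the example above, that the acknowledgement must equal the last component of the path. The helper names folder_name and delete_folder_task are hypothetical and not part of the Hybrik API.

```python
from urllib.parse import urlparse


def folder_name(path: str) -> str:
    """Return the last path component, e.g. 'temp' for s3://foo/bar/baz/temp."""
    parts = [p for p in urlparse(path).path.split("/") if p]
    if not parts:
        # A bare bucket such as s3://foo has no folder component to confirm.
        raise ValueError(f"no folder component in {path!r}")
    return parts[-1]


def delete_folder_task(uid: str, path: str, acknowledgement: str,
                       storage_provider: str = "s3") -> dict:
    """Build a folder delete_asset task (hypothetical helper).

    The caller must pass the folder name explicitly; a mismatch is rejected
    here instead of being discovered when the job runs.
    """
    expected = folder_name(path)
    if acknowledgement != expected:
        raise ValueError(
            f"acknowledgement {acknowledgement!r} does not match folder {expected!r}"
        )
    return {
        "uid": uid,
        "kind": "delete_asset",
        "payload": {
            "asset_selector": "config",
            "location": {"storage_provider": storage_provider, "path": path},
            "delete_folder_acknowledgement": acknowledgement,
        },
    }
```

The helper deliberately does not fill in the acknowledgement automatically. Deriving it from the path would remove the second confirmation that the parameter is meant to provide.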
Delete Source Files in Other Tasks
You may also delete sources or intermediate files in other tasks such as transcode, package, or copy. For example, if you generate mp4s in a transcode task which are then transmuxed to fmp4 in a downstream package task, you may wish to remove the intermediate mp4 files output from the transcode task.
To do this, use the options.delete_sources flag in the task:
"options": {
  "delete_sources": true
},
Below is a JSON snippet for deleting input files in a package task:
{
  "uid": "package_hls",
  "kind": "package",
  "payload": {
    "kind": "hls",
    "options": {
      "delete_sources": true
    },
    "location": {
      "storage_provider": "s3",
      "path": "s3:///hls_manifests",
      "attributes": [
        {
          "name": "ContentType",
          "value": "application/x-mpegURL"
        }
      ]
    },
    "file_pattern": "master_manifest.m3u8",
    "segmentation_mode": "fmp4",
    "segment_duration_sec": "",
    "force_original_media": false,
    "media_location": {
      "storage_provider": "s3",
      "path": "s3:///hls_media",
      "attributes": [
        {
          "name": "ContentType",
          "value": "video/MP2T"
        }
      ]
    },
    "media_file_pattern": "{source_basename}.mp4",
    "hls": {
      "media_playlist_location": {
        "storage_provider": "s3",
        "path": "s3:///hls_manifests"
      }
    }
  }
}
In the above task, options.delete_sources is set to true, which causes the input source files for the package task to be deleted.
Please note:
- When delete_sources is set to true, all the files from the preceding task will be deleted. This may not be desirable if you are packaging directly from source files.
- You will probably not want to delete the source files if force_original_media is set to true. This tells the packager not to demux/remux the media files and assumes they are already fragmented.
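The two cautions above can be turned into a small pre-flight check when tasks are assembled by a script. The following Python sketch is illustrative only: enable_delete_sources is a hypothetical helper, it places options inside payload as in the package example in this section, and the guard on force_original_media reflects the advice in the notes, not a rule enforced by Hybrik.

```python
import copy


def enable_delete_sources(task: dict, allow_original_media: bool = False) -> dict:
    """Return a copy of a task with payload.options.delete_sources set to true.

    Hypothetical helper. It refuses when force_original_media is true unless
    the caller explicitly opts in, since the sources are then likely needed.
    """
    payload = task.get("payload", {})
    if payload.get("force_original_media") and not allow_original_media:
        raise ValueError(
            "force_original_media is true; deleting sources is probably not wanted"
        )
    updated = copy.deepcopy(task)  # leave the caller's task untouched
    options = updated.setdefault("payload", {}).setdefault("options", {})
    options["delete_sources"] = True
    return updated


if __name__ == "__main__":
    package_task = {
        "uid": "package_hls",
        "kind": "package",
        "payload": {"kind": "hls", "force_original_media": False},
    }
    print(enable_delete_sources(package_task)["payload"]["options"])
```

Returning a copy keeps the original definition available for jobs that package directly from source files, where deleting the inputs would not be desirable.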
Example Jobs
In our sample jobs, we first burn subtitles into a video, then extract some thumbnail images with the burned-in text. Finally, we delete the temporary video file that was used to extract the images.