Compressions
Deep Lake can read, compress, decompress and recompress data to different formats. The supported htype-compression configurations are given below.
| Sample Type | Htype | Compressions |
|---|---|---|
| Image | image | bmp, dib, gif, ico, jpeg, jpeg2000, pcx, png, ppm, sgi, tga, tiff, webp, wmf, xbm, eps, fli, im, msp, mpo, apng |
| Video | video | mp4, mkv, avi |
| Audio | audio | flac, mp3, wav |
| Dicom | dicom | dcm |
| Point Cloud | point_cloud | las |
| Mesh | mesh | ply |
| Other | bbox, text, list, json, generic, etc. | lz4 |
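As a quick way to check a configuration before creating a tensor, the table above can be mirrored as a plain lookup. This is only a sketch: `SUPPORTED_COMPRESSIONS`, `ALIASES` and `is_supported` are hypothetical names and not part of the Deep Lake API, and treating "jpg" as an alias of "jpeg" is an assumption based on the examples on this page.

```python
# Hypothetical lookup built from the table above; not part of the Deep Lake API.
SUPPORTED_COMPRESSIONS = {
    "image": {
        "bmp", "dib", "gif", "ico", "jpeg", "jpeg2000", "pcx", "png", "ppm",
        "sgi", "tga", "tiff", "webp", "wmf", "xbm", "eps", "fli", "im", "msp",
        "mpo", "apng",
    },
    "video": {"mp4", "mkv", "avi"},
    "audio": {"flac", "mp3", "wav"},
    "dicom": {"dcm"},
    "point_cloud": {"las"},
    "mesh": {"ply"},
}

# Assumption: "jpg" (used in the examples on this page) is an alias of "jpeg".
ALIASES = {"jpg": "jpeg"}


def is_supported(htype, compression):
    """Return True if the htype/compression pair appears in the table."""
    if compression is None:
        # No compression requested, so there is nothing to validate.
        return True
    name = ALIASES.get(compression, compression)
    # Htypes outside the media rows fall under "Other", which lists lz4 only.
    allowed = SUPPORTED_COMPRESSIONS.get(htype, {"lz4"})
    return name in allowed
```

For example, `is_supported("image", "jpg")` is true under these assumptions, while `is_supported("video", "png")` is false.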
Sample Compression
If sample compression is specified when a tensor is created, each sample is compressed to the given format where possible. If the given data is already compressed in a format that matches the provided sample_compression, it is stored as is. If sample_compression is left as None, samples are stored uncompressed.
Note
For audio and video, Deep Lake does not compress raw frames; it only reads audio and video data that is already compressed.
Examples:
>>> ds.create_tensor("images", htype="image", sample_compression="jpg")
>>> ds.create_tensor("videos", htype="video", sample_compression="mp4")
>>> ds.create_tensor("point_clouds", htype="point_cloud", sample_compression="las")
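The sample compression rules described above can be summarized as a small decision function. This is a minimal sketch of the documented behavior, not Deep Lake's implementation: `plan_sample_storage` and its return strings are hypothetical, the "recompress" branch for mismatched formats is inferred from the statement that Deep Lake can recompress data, and the audio/video restriction from the note is not modeled.

```python
from typing import Optional


def plan_sample_storage(source_format: Optional[str],
                        sample_compression: Optional[str]) -> str:
    """Describe what happens to one incoming sample (hypothetical helper).

    source_format is the format of the incoming data (e.g. "png"),
    or None if the data is a raw, uncompressed array.
    """
    if sample_compression is None:
        # No sample compression configured: samples are stored uncompressed.
        return "store uncompressed"
    if source_format == sample_compression:
        # Already compressed in the requested format: keep the bytes as they are.
        return "store as is"
    if source_format is None:
        # Raw data: compress it to the requested format.
        return "compress"
    # Compressed in a different format (assumption: it is recompressed).
    return "recompress"
```

Under these assumptions, a PNG file added to a tensor created with sample_compression="png" is stored without re-encoding, while a raw array added to the same tensor is compressed first.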
Chunk Compression
If chunk compression is specified when a tensor is created, added samples are grouped into chunks and each chunk is compressed to the given format. If the given data is already compressed, it is decompressed and then recompressed chunk-wise.
Note
Chunk-wise compression is not supported for audio, video and point_cloud htypes.
Examples:
>>> ds.create_tensor("images", htype="image", chunk_compression="jpg")
>>> ds.create_tensor("boxes", htype="bbox", chunk_compression="lz4")
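To illustrate the difference between sample-wise and chunk-wise compression in general terms, the sketch below compresses the same samples both ways. It uses zlib from the Python standard library as a stand-in for lz4, and the byte layout is made up for illustration; it does not reproduce Deep Lake's actual chunk format.

```python
import zlib

# Ten small, similar samples (stand-ins for bounding boxes serialized to bytes).
samples = [bytes([i, 10, 20, 30] * 8) for i in range(10)]

# Sample-wise: each sample is compressed on its own.
sample_wise = [zlib.compress(s) for s in samples]

# Chunk-wise: samples are concatenated, then compressed once as a single chunk.
chunk = b"".join(samples)
chunk_wise = zlib.compress(chunk)

# A sample-wise compressed sample can be decompressed independently.
assert zlib.decompress(sample_wise[3]) == samples[3]

# With chunk-wise compression the whole chunk is decompressed, then sliced.
size = len(samples[0])
restored = zlib.decompress(chunk_wise)
assert restored[3 * size:4 * size] == samples[3]

# Compare the total compressed sizes of the two approaches.
print(sum(len(c) for c in sample_wise), len(chunk_wise))
```

For many small, similar samples the chunk-wise total is typically smaller, because the compressor can exploit redundancy across samples and pays its fixed overhead once. The trade-off is that reading one sample requires decompressing its whole chunk.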
Note
See deeplake.read() to learn how to read data from files and populate these tensors.