[Libav-user] Encoding depth video from Kinect

Daniel Henell henell at gmail.com
Thu Feb 9 10:15:38 CET 2012


I'm working on a project where we would like to transfer RGBD video
captured with a Kinect from our users to a server hosted by us for
research purposes. It will be used by people at home on ordinary
Internet connections, so it has to be compressed somehow so that it
won't take hours to upload a video. We are not looking for real-time
streaming but we would like it to be as quick as possible.

We use OpenNI to capture the video and it gives us a 640x480 24-bit
RGB image and a 640x480 16-bit depth image for each frame.

The depth values are stored in mm and the max range of the depth
sensor is around 5 m so the full 16-bit depth is not used. It's also
possible to get the raw depth values (without calibration and
conversion to mm) and I think the raw data is around 12-14 bits per pixel.
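
As a quick check of the claim that the full 16-bit range is not used, the
number of bits needed for millimetre depths can be worked out directly. This
is only a sketch: the 5000 mm ceiling is taken from the "around 5 m" figure
above and is an assumption, and `bits_needed` is a hypothetical helper name.

```python
# Assumption: the sensor tops out at roughly 5 m, i.e. 5000 mm.
MAX_RANGE_MM = 5000


def bits_needed(max_value: int) -> int:
    """Return how many bits are required to store integers 0..max_value."""
    # int.bit_length() is exact, so there are no floating-point log2 issues.
    return max(1, max_value.bit_length())


bits = bits_needed(MAX_RANGE_MM)
unused_bits = 16 - bits

print(f"{MAX_RANGE_MM} mm fits in {bits} bits, leaving {unused_bits} of 16 unused")
```

Under that assumption the calibrated depth fits in 13 bits, which is still
more than an 8-bit grayscale codec can carry in a single channel.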

The RGB video is encoded with x264 and that is working perfectly.
However, there doesn't seem to be any video codec that can handle >8
bit grayscale video.

I have done quite a lot of testing to compress the depth by

 - mapping depth <-> RGB values and encoding as lossless x264
 - compressing as JPEG 2000, lossless and lossy
 - compressing groups of frames with bzip2
 - compressing groups of frames with zlib
 - compressing with the algorithm presented in "Low-Complexity,
Near-Lossless Coding of Depth Maps from Kinect-Like Depth Cameras",
Mehrotra et al. (Microsoft Research)
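
To make the first item concrete, here is one possible depth <-> RGB mapping.
It is not necessarily the mapping used in the tests above: it simply puts the
high byte of each 16-bit depth value in R, the low byte in G, and leaves B at
zero. The function names are hypothetical. A mapping like this only survives
a truly lossless path, since any lossy step (including a lossy colour-space
conversion) would disturb the low byte and produce large depth errors.

```python
def depth_to_rgb(depth_values):
    """Pack 16-bit depth values into RGB bytes: R = high byte, G = low byte, B = 0."""
    out = bytearray()
    for d in depth_values:
        if not 0 <= d <= 0xFFFF:
            raise ValueError(f"depth value {d} does not fit in 16 bits")
        out += bytes((d >> 8, d & 0xFF, 0))
    return bytes(out)


def rgb_to_depth(rgb):
    """Inverse of depth_to_rgb: rebuild 16-bit depth values from RGB bytes."""
    if len(rgb) % 3 != 0:
        raise ValueError("RGB buffer length must be a multiple of 3")
    return [(rgb[i] << 8) | rgb[i + 1] for i in range(0, len(rgb), 3)]


# Round trip on a few sample depths in mm (made-up values, not sensor data).
sample = [0, 1, 255, 256, 5000, 65535]
packed = depth_to_rgb(sample)
restored = rgb_to_depth(packed)
print(restored == sample)
```

The resulting buffer can be handed to an encoder as a normal 24-bit RGB
frame, at the cost of one wasted channel.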

Of the above, bzip2 gives the best compression ratio but is also very
slow. It's actually so CPU intensive that it interferes with the
OpenNI driver, so we get flicker in the video if we compress while
capturing. zlib is the fastest but does not give very good compression.
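
A comparison like this can be reproduced with a small harness along the
following lines. It is a sketch only: the frames are synthetic (a slanted
plane that shifts a little per frame), the frame size is reduced to keep it
quick, and `make_frames` and `compare` are hypothetical names. Real Kinect
depth is much noisier, so the numbers it prints say nothing about the results
above; the point is just how ratio and time can be measured side by side.

```python
import bz2
import time
import zlib
from array import array


def make_frames(n_frames=5, width=64, height=48):
    """Build synthetic 16-bit 'depth' frames (NOT real sensor data)."""
    frames = []
    for t in range(n_frames):
        values = (1000 + 4 * x + 2 * y + t for y in range(height) for x in range(width))
        frames.append(array("H", values).tobytes())
    return frames


def compare(frames):
    """Compress a group of frames with zlib and bz2; return {name: (ratio, seconds)}."""
    blob = b"".join(frames)
    codecs = (
        ("zlib", lambda data: zlib.compress(data, 6), zlib.decompress),
        ("bz2", lambda data: bz2.compress(data, 9), bz2.decompress),
    )
    results = {}
    for name, compress, decompress in codecs:
        start = time.perf_counter()
        packed = compress(blob)
        elapsed = time.perf_counter() - start
        # Both codecs are lossless, so the round trip must be exact.
        if decompress(packed) != blob:
            raise RuntimeError(f"{name} round trip failed")
        results[name] = (len(blob) / len(packed), elapsed)
    return results


results = compare(make_frames())
for name, (ratio, seconds) in results.items():
    print(f"{name}: ratio {ratio:.1f}, {seconds * 1000:.2f} ms")
```

Feeding it captured frames instead of `make_frames()` output, and varying the
compression levels, would show where the speed/ratio trade-off sits for real
data.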

I'm wondering if there is any way of doing this that I haven't thought
of yet, for example a video codec that can handle 16-bit grayscale.
Someone must have done something similar before?

Best regards,

Daniel Henell
