Stereo Matching | Depth Estimation
License: Unknown


Time-of-Flight (ToF) sensors and stereo vision systems are two of the most widespread depth acquisition devices for commercial and industrial applications. They have complementary strengths and weaknesses, so combining the data acquired from these devices can improve the final depth estimation performance. We introduce a dataset acquired with a multi-camera system composed of a Kinect v2 ToF sensor, an Intel RealSense R200 active stereo sensor and a ZED passive stereo camera system. The acquired scenes include indoor settings with different external lighting conditions. The depth ground truth has been acquired for each scene of the dataset using a line laser. The data can be used to develop fusion and denoising algorithms for depth estimation and to test them under different lighting conditions. A subset of the data has already been used for the experimental evaluation of the stereo-ToF fusion method of Agresti et al.

Data Collection

The multi-camera acquisition system used to acquire the proposed dataset is arranged as shown in the figure below. The reference system is the ZED camera in the center; the Kinect is underneath the ZED and the RealSense R200 is above it. The three cameras are held in place by a plastic mount specifically designed to fit them. The depth camera of the Kinect is approximately horizontally aligned with the left camera of the ZED, with a 40 mm vertical displacement, while its color camera lies approximately in between the two cameras of the passive stereo pair. The RealSense R200 is placed approximately 20 mm above the ZED camera, with its two IR cameras and its color camera inside the baseline of the passive stereo pair.


The subjects of the 10 scenes in the REAL3EXT dataset are chosen to stress various flaws of the stereo and ToF systems. Critical points are, for example, the lack of texture for the passive stereo system and the presence of low-reflectivity elements and external illumination for the active sensors. The scenes are composed of flat surfaces with and without texture, plants, and objects of various materials such as plastic, paper and cotton fabric. These are characterized by various specularity properties, such as reflective and glossy surfaces and rough materials. Each scene was recorded under 4 different external lighting conditions: with no external light; with regular lighting; with stronger light; with an additional incandescent light source. Each lighting condition can highlight the weaknesses and strengths of the different depth estimation algorithms. We added the acquisitions with the additional incandescent light source because its spectrum covers, in the IR wavelengths, the working range of the active depth cameras, which is a known problem for those devices.

Data Format

File Format

The dataset is distributed as a zip file. It contains one folder for each of the 10 scenes, and each scene folder contains one sub-folder for each of the considered external illumination conditions. Each of these sub-folders contains the following data:

  • left color image from the ZED stereo system (zed_left.png)
  • right color image from the ZED stereo system (zed_right.png)
  • depth map, measured in millimeters, from the Kinect v2 ToF sensor (kinect_depth.mat)
  • amplitude map from the Kinect v2 ToF sensor (kinect_amplitude.mat)
  • color image from the Kinect v2 color camera (kinect_color.png)
  • left IR image from the R200 active stereo system (r200_left.png)
  • right IR image from the R200 active stereo system (r200_right.png)
  • color image from the R200 color camera (r200_color.png)
  • depth ground truth, measured in millimeters, from the left camera in the ZED stereo system (gt.mat)
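
The layout described above can be checked after unzipping with a short script. The following is a minimal sketch in Python using only the standard library. It assumes the structure root/scene/lighting described in this section; the actual scene and lighting folder names are not stated here, so the code does not rely on them. The function names `missing_files` and `check_dataset` are hypothetical helpers introduced for illustration.

```python
from pathlib import Path

# File names as listed in the dataset description above.
EXPECTED_FILES = (
    "zed_left.png",
    "zed_right.png",
    "kinect_depth.mat",
    "kinect_amplitude.mat",
    "kinect_color.png",
    "r200_left.png",
    "r200_right.png",
    "r200_color.png",
    "gt.mat",
)


def missing_files(acquisition_dir):
    """Return the expected file names absent from one scene/lighting folder."""
    folder = Path(acquisition_dir)
    return [name for name in EXPECTED_FILES if not (folder / name).is_file()]


def check_dataset(root):
    """Map each incomplete scene/lighting sub-folder to its missing files.

    Assumes the layout root/<scene>/<lighting>/ (folder names are not assumed).
    Complete sub-folders are left out of the returned dictionary.
    """
    root = Path(root)
    report = {}
    for scene in sorted(p for p in root.iterdir() if p.is_dir()):
        for lighting in sorted(p for p in scene.iterdir() if p.is_dir()):
            missing = missing_files(lighting)
            if missing:
                report[lighting.relative_to(root).as_posix()] = missing
    return report
```

Calling `check_dataset` on the unzipped root folder returns an empty dictionary when every sub-folder is complete. The .mat files themselves can typically be read with `scipy.io.loadmat`, provided they were not saved in the MATLAB v7.3 format; the variable names stored inside them are not documented in this section and should be inspected after loading.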

Finally, the calibrationREAL.xml file contains the intrinsic and extrinsic parameters of the employed setup. The calibration data is stored in the format used by the OpenCV computer vision library; refer to the OpenCV documentation for more details.
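
OpenCV can read such a file directly through its `cv2.FileStorage` class. When OpenCV is not available, the matrices can also be extracted with the Python standard library, since OpenCV's XML storage writes each matrix as a node carrying `type_id="opencv-matrix"` with `rows`, `cols`, `dt` and `data` children. The sketch below assumes the calibration matrices are top-level, single-channel nodes; the node names used in calibrationREAL.xml are not listed in this section, so the function simply returns whatever it finds. `read_opencv_matrices` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET


def read_opencv_matrices(source):
    """Parse all top-level opencv-matrix nodes of an OpenCV XML storage file.

    `source` is a file path or a file object. Returns a dictionary mapping
    each node name to a list of rows, each row being a list of floats.
    Assumes single-channel matrices (rows * cols values per node).
    """
    root = ET.parse(source).getroot()
    matrices = {}
    for node in root:
        if node.get("type_id") != "opencv-matrix":
            continue  # skip scalars, strings and nested structures
        rows = int(node.findtext("rows"))
        cols = int(node.findtext("cols"))
        values = [float(v) for v in node.findtext("data").split()]
        if len(values) != rows * cols:
            raise ValueError(
                f"{node.tag}: expected {rows * cols} values, got {len(values)}"
            )
        matrices[node.tag] = [
            values[r * cols:(r + 1) * cols] for r in range(rows)
        ]
    return matrices
```

For example, `sorted(read_opencv_matrices("calibrationREAL.xml"))` would list the names of the matrices stored in the file, which is a convenient first step before deciding which ones hold the intrinsic and extrinsic parameters of each camera.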


Please use the following citation when referencing the dataset:

title={A multi-camera dataset for depth estimation in an indoor scenario},
author={Giulio Marin and Gianluca Agresti and L. Minto and P. Zanuttigh},
journal={Data in Brief},

Data Summary

Depth, Image

Provided by

The Multimedia Technology and Telecommunications Lab, a lab of the University of Padova