The dataset contains 68 pedestrian sequences recorded from a vehicle that is either stationary or moving. Four pedestrian motion types are considered: crossing, stopping, starting to walk, and bending-in. Each sequence contains at most one pedestrian, and pedestrians are not occluded. The following data are provided:
- original stereo pairs (8 bit PGM, 1176x640)
- calibration data
- ground truth (GT) annotations
- pedestrian detector measurements
- vehicle data (speed, yaw-rate)
- event tags and time-to-event labels (TTE in frames)
In particular, we provide trajectory data (GT annotations,
detector measurements) in (u, d) form, where u is the lateral
image coordinate and d is disparity. With the static ground
plane assumption this allows the computation of the
pedestrian location on the ground plane (X, Z), also provided.
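The (u, d) to (X, Z) conversion can be sketched as follows. This is a minimal illustration using standard rectified-stereo geometry, not the dataset's own conversion code: the `StereoCalib` class, the function name `ud_to_ground`, and the numeric calibration values are all hypothetical, and the sketch assumes a camera whose optical axis is parallel to a flat ground plane (no pitch or roll). Real values should be taken from the calibration data shipped with the dataset.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class StereoCalib:
    """Hypothetical container for the calibration values the sketch needs."""
    focal_px: float    # focal length in pixels
    baseline_m: float  # stereo baseline in meters
    u0_px: float       # principal point column in pixels


def ud_to_ground(u: float, d: float, calib: StereoCalib) -> Tuple[float, float]:
    """Map lateral image coordinate u and disparity d to ground-plane (X, Z).

    Assumes rectified images and an optical axis parallel to the ground
    plane, so depth follows Z = f * B / d and lateral offset follows
    X = (u - u0) * Z / f.
    """
    if d <= 0.0:
        raise ValueError("disparity must be positive to compute depth")
    z = calib.focal_px * calib.baseline_m / d
    x = (u - calib.u0_px) * z / calib.focal_px
    return x, z


# Illustrative values only; they are not the dataset's calibration.
calib = StereoCalib(focal_px=1000.0, baseline_m=0.25, u0_px=588.0)
x, z = ud_to_ground(u=688.0, d=10.0, calib=calib)
print(f"X = {x:.2f} m, Z = {z:.2f} m")
```

Because depth is inversely proportional to disparity, a fixed disparity error produces a larger position error for distant pedestrians than for close ones.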
Pedestrian bounding boxes (i.e. measured lateral image location) were computed with a HOG/linSVM-based detector. Disparity d was computed with semi-global matching (H. Hirschmueller, Stereo processing by semi-global matching and mutual information, IEEE PAMI, 30(2):328-341, 2008). Because the stereo images are available, users can experiment with a different pedestrian detector and/or stereo algorithm.
The sequences comprise a total of 19612 stereo image pairs. Of these, 12485 images contain manually labeled pedestrian bounding boxes and 9366 images contain pedestrian detector measurements. In our paper, we only evaluate in a distance range of 5-50 m and require “valid” disparity measurements. This selection leads to 9152 ground truth and 7937 measurement objects.
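A selection of this kind can be sketched as a simple filter. The record layout (dictionaries with `frame`, `Z`, and `d` keys) and the function names are hypothetical and do not reflect the dataset's file format. The sketch also makes two assumptions the text does not spell out: that the 5-50 range is the longitudinal distance Z in meters, and that a "valid" disparity is one that is present and positive.

```python
from typing import Iterable, List, Mapping, Optional


def is_evaluated(z_m: float, d: Optional[float],
                 z_min: float = 5.0, z_max: float = 50.0) -> bool:
    """Return True if a sample passes the assumed evaluation criteria."""
    # Assumption: a disparity is "valid" when it exists and is positive.
    if d is None or d <= 0.0:
        return False
    # Assumption: the distance range applies to Z, in meters, inclusive.
    return z_min <= z_m <= z_max


def select_for_evaluation(records: Iterable[Mapping]) -> List[Mapping]:
    """Keep only the records that pass is_evaluated."""
    return [r for r in records if is_evaluated(r["Z"], r["d"])]


# Made-up samples for illustration; they are not taken from the dataset.
records = [
    {"frame": 0, "Z": 3.2, "d": 78.0},   # closer than 5 m
    {"frame": 1, "Z": 12.5, "d": 20.0},  # inside the range, disparity present
    {"frame": 2, "Z": 48.0, "d": None},  # no disparity measurement
    {"frame": 3, "Z": 61.0, "d": 4.1},   # farther than 50 m
]
kept = select_for_evaluation(records)
print([r["frame"] for r in kept])
```

Applying the same filter to both ground truth and detector measurements keeps the two sets comparable, which is why each count shrinks after the selection.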