KITTI-raw
2D Box
3D Box
Autonomous Driving
License: CC BY-NC-SA 3.0

Overview

The dataset comprises the following information, captured and synchronized at 10 Hz:

  • Raw (unsynced+unrectified) and processed (synced+rectified) grayscale stereo sequences (0.5
    Megapixels, stored in PNG format)
  • Raw (unsynced+unrectified) and processed (synced+rectified) color stereo sequences (0.5
    Megapixels, stored in PNG format)
  • 3D Velodyne point clouds (100k points per frame, stored as a binary float matrix)
  • 3D GPS/IMU data (location, speed, acceleration, meta information, stored as a text file)
  • Calibration (Camera, Camera-to-GPS/IMU, Camera-to-Velodyne, stored as a text file)
  • 3D object tracklet labels (cars, trucks, trams, pedestrians, cyclists, stored as an XML file)
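
As a rough illustration of the "binary float matrix" storage listed for the Velodyne point clouds, the sketch below decodes a buffer of 32-bit floats into points. It assumes little-endian float32 values with four values per point (x, y, z, reflectance). That layout is commonly described for KITTI Velodyne scans but is not stated on this page, and the function names are hypothetical.

```python
import struct
from typing import List, Tuple

# Assumed per-point layout: x, y, z, reflectance (not confirmed by this page).
Point = Tuple[float, ...]


def parse_velodyne(buf: bytes, floats_per_point: int = 4) -> List[Point]:
    """Decode a flat little-endian float32 buffer into one tuple per point."""
    record = struct.Struct("<" + "f" * floats_per_point)
    if len(buf) % record.size != 0:
        raise ValueError("buffer length is not a multiple of the point size")
    return [record.unpack_from(buf, off) for off in range(0, len(buf), record.size)]


def load_velodyne(path: str) -> List[Point]:
    """Read one scan file from disk and decode it with parse_velodyne."""
    with open(path, "rb") as f:
        return parse_velodyne(f.read())
```

With roughly 100k points per frame, a list of tuples is adequate for inspection. For heavier processing, an array library would likely be a better fit.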

Here, "unsynced+unrectified" refers to the raw input frames where images are distorted and the
frame indices do not correspond, while "synced+rectified" refers to the processed data where
images have been rectified and undistorted and where the data frame numbers correspond across
all sensor streams.
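
Because frame numbers correspond across all sensor streams in the synced+rectified data, files from different sensors can be matched by frame index alone. The sketch below shows one way to do this. It assumes each file is named after its frame index with a zero-padded number plus an extension (for example `0000000005.png`), which this page does not specify; the stream names and helper functions are hypothetical.

```python
from typing import Dict, List


def frame_id(filename: str) -> str:
    """Return the frame index part of a file name such as '0000000005.png'."""
    return filename.rsplit(".", 1)[0]


def pair_frames(streams: Dict[str, List[str]]) -> List[Dict[str, str]]:
    """Group files from several sensor streams by their shared frame index.

    Only frame indices present in every stream are returned, in ascending
    order (lexical order, which matches numeric order for zero-padded names).
    """
    indexed = {
        name: {frame_id(f): f for f in files} for name, files in streams.items()
    }
    if not indexed:
        return []
    common = set.intersection(*(set(m) for m in indexed.values()))
    return [{name: indexed[name][fid] for name in indexed} for fid in sorted(common)]
```

For the unsynced+unrectified data this matching would not be valid, since the frame indices of the individual streams do not correspond.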

Data Collection

Our recording platform is a Volkswagen Passat B6, which has been modified with actuators for the
pedals (acceleration and brake) and the steering wheel. The data is recorded using an eight-core
i7 computer equipped with a RAID system, running Ubuntu Linux and a real-time database. We use
the following sensors:

The laser scanner spins at 10 frames per second, capturing approximately 100k points per cycle.
The vertical resolution of the laser scanner is 64. The cameras are mounted approximately level
with the ground plane. The camera images are cropped to a size of 1382 x 512 pixels using libdc's
format 7 mode. After rectification, the images are slightly smaller. The cameras are triggered
at 10 frames per second by the laser scanner (when facing forward), with the shutter time adjusted
dynamically (maximum shutter time: 2 ms). Our sensor setup with respect to the vehicle is illustrated
in the following figure. Note that more information on calibration parameters is given in the
calibration files and the development kit (see raw data section).

[Figure: sensor setup with respect to the vehicle]
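
The figures quoted in the paragraph above imply a few derived quantities, worked out in the short sketch below. All inputs come from the text; the variable names are illustrative only.

```python
# Values quoted in the Data Collection text.
SCAN_RATE_HZ = 10            # laser scanner revolutions (and camera triggers) per second
POINTS_PER_SCAN = 100_000    # approximate points captured per revolution
CROPPED_WIDTH = 1382         # cropped image width in pixels
CROPPED_HEIGHT = 512         # cropped image height in pixels
MAX_SHUTTER_S = 0.002        # maximum shutter time in seconds

# Time between consecutive frames: 0.1 s at 10 Hz.
frame_period_s = 1.0 / SCAN_RATE_HZ

# Approximate laser points captured per second of driving.
points_per_second = SCAN_RATE_HZ * POINTS_PER_SCAN

# Pixel count of one cropped image, before rectification.
cropped_pixels = CROPPED_WIDTH * CROPPED_HEIGHT

# Largest share of a frame period during which the shutter can be open.
max_shutter_fraction = MAX_SHUTTER_S / frame_period_s

print(f"frame period: {frame_period_s * 1000:.0f} ms")
print(f"points per second: {points_per_second:,}")
print(f"cropped image: {cropped_pixels:,} pixels")
print(f"max shutter share of frame period: {max_shutter_fraction:.0%}")
```

In other words, the scanner delivers on the order of one million points per second, and the shutter is open for at most about 2% of each 100 ms frame period.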

Citation

Please use the following citation when referencing the dataset:

@INPROCEEDINGS{Geiger2012CVPR,
 author = {Andreas Geiger and Philip Lenz and Raquel Urtasun},
 title = {Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite},
 booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
 year = {2012}
}

License

CC BY-NC-SA 3.0

Data Summary

Type: Point Cloud, Image
Amount: --
Size: 442.3 GB
Provided by: Max Planck Institute for Intelligent Systems

Our goal is to understand the principles of Perception, Action and Learning in autonomous systems that successfully interact with complex environments and to use this understanding to design future systems.