SemanticKITTI
3D Semantic Segmentation
Autonomous Driving
License: CC BY-NC-SA 4.0

Overview

We present a large-scale dataset based on the KITTI Vision Benchmark, using all sequences provided by the odometry task.
We provide dense annotations for each individual scan of sequences 00-10, which enables the
use of multiple sequential scans for semantic scene interpretation, such as semantic segmentation
and semantic scene completion.

The remaining sequences, i.e., sequences 11-21, are used as a test set and show a large variety
of challenging traffic situations and environment types. Labels for the test set are not provided;
instead, an evaluation service scores submissions and provides test set results.

Classes

The dataset contains 28 classes, including classes that distinguish
non-moving and moving objects. Overall, the classes cover traffic participants as well as functional
classes for ground, such as parking areas and sidewalks.


Folder structure and format

Semantic Segmentation and Panoptic Segmentation

[Figure: folder structure for the semantic segmentation task]

For each scan XXXXXX.bin in the velodyne folder of a sequence folder of
the original KITTI Odometry Benchmark, we provide a file XXXXXX.label in the labels folder that contains
a label for each point in binary format. Each label is a 32-bit unsigned integer (uint32_t),
where the lower 16 bits correspond to the semantic label and the upper 16 bits encode the
instance id. The instance id is temporally consistent over the whole sequence, i.e., the same object
in two different scans gets the same id. This holds for moving cars, and also for static objects
seen again after loop closures.
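
For illustration, these per-point labels can be read with numpy and split into their two parts; this is only a minimal sketch (paths are placeholders), not the official development kit code:

import numpy as np

# Read all per-point labels of one scan (path is a placeholder).
raw = np.fromfile("sequences/00/labels/000000.label", dtype=np.uint32)
semantic_label = raw & 0xFFFF   # lower 16 bits: semantic class
instance_id = raw >> 16         # upper 16 bits: temporally consistent instance id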

We furthermore provide the poses.txt file containing the poses used to annotate the data,
estimated by a surfel-based SLAM approach (SuMa).
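
Assuming the standard KITTI odometry convention for poses.txt (one scan per line, 12 values forming a 3x4 transformation matrix), a minimal loading sketch could look as follows; the path is a placeholder:

import numpy as np

# Each line holds the 12 entries of a 3x4 pose matrix (KITTI odometry convention).
poses = np.loadtxt("sequences/00/poses.txt").reshape(-1, 3, 4)
print(poses.shape)  # (number of scans, 3, 4)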

Semantic Scene Completion

[Figure: folder structure for the semantic scene completion task]

For each scan XXXXXX.bin of the velodyne folder in the sequence folder of
the original KITTI Odometry Benchmark, we provide in the voxel folder:

  • a file XXXXXX.bin
    in a packed binary format that indicates for each voxel whether it is occupied by laser measurements.
    This is the input to the semantic scene completion task and corresponds to the voxelization
    of a single LiDAR scan.
  • a file XXXXXX.label that contains a label for each voxel of the completed
    scene in binary format. The label is a 16-bit unsigned integer (uint16_t) per voxel.
  • a file XXXXXX.invalid in a packed binary format that contains for each voxel
    a flag indicating whether that voxel is considered invalid, i.e., the voxel is never directly seen
    from any of the positions used to generate the completed scene. These voxels are not considered
    in the evaluation.
  • a file XXXXXX.occluded in a packed binary format that contains for each voxel a flag that
    specifies whether the voxel is either occupied by LiDAR measurements or occluded by another voxel
    in the line of sight of all poses used to generate the completed scene.

The files shown in blue in the figure above are only given for the training data, and the label
file must be predicted for the semantic scene completion task.

To allow a higher compression rate, we store the binary flags in a custom packed
format, i.e., each byte of the file corresponds to 8 voxels of the unpacked voxel grid.
Please see the development kit for further information on how to efficiently read these
files using numpy.
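
As a rough sketch of such reading code (the development kit remains the authoritative reference, and paths, folder names, and the bit order within each byte are assumptions here), the packed flag files can be unpacked with numpy, while the voxel labels are plain uint16 values:

import numpy as np

# Packed flag files: one bit per voxel, 8 voxels per byte.
occupancy = np.unpackbits(np.fromfile("000000.bin", dtype=np.uint8))
invalid = np.unpackbits(np.fromfile("000000.invalid", dtype=np.uint8))
occluded = np.unpackbits(np.fromfile("000000.occluded", dtype=np.uint8))

# Voxel labels of the completed scene: one uint16 per voxel.
labels = np.fromfile("000000.label", dtype=np.uint16)

# Check the bit order and the voxel grid dimensions against the development kit
# before reshaping the flat arrays into a 3D grid.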

See also our development kit for further information on the labels and how to read them
using Python. The development kit also provides tools for visualizing the point clouds.

Citation

Please use the following citation when referencing the dataset:

@inproceedings{behley2019iccv,
  author = {J. Behley and M. Garbade and A. Milioto and J. Quenzel and S. Behnke and C. Stachniss and J. Gall},
  title = {{SemanticKITTI: A Dataset for Semantic Scene Understanding of LiDAR Sequences}},
  booktitle = {Proc.~of the IEEE/CVF International Conf.~on Computer Vision (ICCV)},
  year = {2019}
}

Please also cite the original KITTI Vision Benchmark:

@inproceedings{geiger2012cvpr,
  author = {A. Geiger and P. Lenz and R. Urtasun},
  title = {{Are we ready for Autonomous Driving? The KITTI Vision Benchmark Suite}},
  booktitle = {Proc.~of the IEEE Conf.~on Computer Vision and Pattern Recognition (CVPR)},
  pages = {3354--3361},
  year = {2012}
}

License

CC BY-NC-SA 4.0

Data Summary
Type: Point Cloud
Amount: --
Size: 833.47 MB
Provided by: University of Bonn
The dataset is the result of a collaboration between the Photogrammetry & Robotics Group, the Computer Vision Group, and the Autonomous Intelligent Systems Group, all part of the University of Bonn.