A2D2
3D Semantic Segmentation
3D Instance Segmentation
2D Box
2D Polygon
3D Box
Autonomous Driving
License: CC BY-ND 4.0

Overview

We have published the Audi Autonomous Driving Dataset (A2D2) to support startups and academic researchers working on autonomous driving. Equipping a vehicle with a multimodal sensor suite, recording a large dataset, and labelling it is time- and labour-intensive. Our dataset removes this high entry barrier and frees researchers and developers to focus on developing new technologies instead. The dataset features 2D semantic segmentation, 3D point clouds, 3D bounding boxes, and vehicle bus data.

Data Collection

Sensor setup

Our sensor suite consists of six cameras, five LiDAR sensors, and an automotive gateway for recording bus data. This configuration provides 360° coverage of the environment with camera and LiDAR. The bus data give information about vehicle state and driver control input.

Sensors

Five LiDAR sensors

  • Up to 100 m range
  • +/- 3 cm accuracy
  • 16 channels
  • 10 Hz rotation rate
  • 360° horizontal field of view
  • +/- 15° vertical field of view
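
As a rough illustration of what these figures imply, the sketch below derives the angle between adjacent beams and the resulting vertical gap at a given range. It assumes the 16 channels are evenly spaced across the vertical field of view, which the list above does not state; the function names are illustrative only.

```python
import math

CHANNELS = 16            # from the list above
VERTICAL_FOV_DEG = 30.0  # +/- 15 degrees
ROTATION_HZ = 10.0       # rotation rate

def beam_spacing_deg(channels=CHANNELS, fov_deg=VERTICAL_FOV_DEG):
    """Angle between adjacent beams, ASSUMING evenly spaced channels."""
    return fov_deg / (channels - 1)

def beam_gap_m(range_m, spacing_deg):
    """Approximate vertical gap between adjacent beams at a given range (arc length)."""
    return range_m * math.radians(spacing_deg)

spacing = beam_spacing_deg()   # degrees between neighbouring beams
sweep_s = 1.0 / ROTATION_HZ    # duration of one full rotation in seconds
gap_at_max_range = beam_gap_m(100.0, spacing)
```

Under this assumption the vertical sampling becomes sparse at long range, which is one reason several sensors are combined.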

Front centre camera

  • 1920 × 1208 resolution
  • 60° horizontal field of view
  • 38° vertical field of view
  • 30 fps framerate
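
For a quick plausibility check, the resolution and field of view above can be turned into an approximate focal length in pixels. This sketch assumes an ideal, distortion-free pinhole camera, which real lenses are not; measured calibration values should always be preferred over this estimate. The function name is illustrative.

```python
import math

def focal_length_px(image_size_px, fov_deg):
    """Pinhole focal length in pixels for one image axis (ASSUMES no lens distortion)."""
    return (image_size_px / 2.0) / math.tan(math.radians(fov_deg) / 2.0)

# Front centre camera figures from the list above.
fx = focal_length_px(1920, 60.0)  # horizontal axis
fy = focal_length_px(1208, 38.0)  # vertical axis
```

The two estimates need not agree exactly, since the quoted angles are rounded and the pinhole model is only an approximation.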

Surround cameras (5x)

  • 1920 × 1208 resolution
  • 120° horizontal view angle
  • 73° vertical view angle
  • 30 fps framerate

Bus gateway

  • Connected to built-in car gateway
  • Connection to all car buses and their sensors
  • Timestamping and forwarding of sensor data via Ethernet

Overview of sensor carrier with sensors (top view)

Other hardware

Our vehicle is equipped with additional hardware for recording data from the sensor suite and vehicle bus. The cameras are connected to an embedded computer via LVDS, while the LiDAR sensors are connected via a 1G-Ethernet switch. Each LiDAR sensor is connected to a GNSS receiver which acts as a clock. A further GNSS clock serves as a time master for the gateway and embedded computer. The bus gateway connects to the embedded computer via 1G-Ethernet. All data is stored on a crash-safe network storage device, equipped with 48 TB of SSD storage, and accessed via 10G-Ethernet.

Overview of the recording hardware and its setup

Sensor synchronization

All sensor signals are timestamped in UTC format. Camera images are timestamped when they arrive at the embedded computer, which is synchronised to the time master. Bus data are timestamped at the gateway, which is also synchronised to the time master. LiDAR signals are timestamped at the sensors, which get their time from GNSS.
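
Because the sensors are timestamped at different points in the pipeline and run at different rates, users typically pair samples by nearest timestamp. The following is a minimal sketch of such a pairing, not the procedure used to build the dataset. It assumes timestamps are integers in microseconds and already sorted; the function names and the tolerance are hypothetical.

```python
from bisect import bisect_left

def nearest_index(sorted_ts, t):
    """Index of the timestamp in sorted_ts that is closest to t."""
    if not sorted_ts:
        raise ValueError("sorted_ts must not be empty")
    i = bisect_left(sorted_ts, t)
    if i == 0:
        return 0
    if i == len(sorted_ts):
        return len(sorted_ts) - 1
    before, after = sorted_ts[i - 1], sorted_ts[i]
    return i if (after - t) < (t - before) else i - 1

def match_frames(camera_ts, lidar_ts, max_gap_us):
    """Pair each camera timestamp with the nearest LiDAR timestamp.

    Pairs further apart than max_gap_us are dropped.
    Returns a list of (camera_index, lidar_index) tuples.
    """
    pairs = []
    for ci, t in enumerate(camera_ts):
        li = nearest_index(lidar_ts, t)
        if abs(lidar_ts[li] - t) <= max_gap_us:
            pairs.append((ci, li))
    return pairs

# Illustrative values: a 30 fps camera and a 10 Hz LiDAR, in microseconds.
example_pairs = match_frames([0, 33333, 66667, 100000], [0, 100000], 20000)
```

A tolerance is useful here because a 30 fps camera produces frames that have no LiDAR sweep close enough in time.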

Calibration

LiDAR-to-Vehicle

The LiDAR sensor pose relative to the vehicle is determined by direct measurement of positions and orientation when mounted on the vehicle.

Camera-to-Vehicle

Camera poses with respect to the vehicle are determined by direct in-situ measurements of position and orientation.

LiDAR-to-LiDAR optimization

We use one LiDAR as a reference and initialise the other LiDAR sensor poses to their measured positions and orientations. Next, an Iterative Closest Point algorithm is used to refine the poses of the other LiDAR sensors within the vehicle coordinate system. This registration uses a recording of a static environment with a static ego vehicle and does not require any fiducial targets.
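
To make the refinement step concrete, here is a heavily simplified sketch of point-to-point Iterative Closest Point. It is not the implementation used for the dataset: it works in 2D (rotation about the vertical axis plus translation) rather than estimating a full 3D pose, uses brute-force nearest-neighbour matching, and all names are illustrative.

```python
import math

def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src points onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(p[0] for p in dst) / n
    dy = sum(p[1] for p in dst) / n
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - sx, ay - sy, bx - dx, by - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated source centroid onto the target centroid.
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)

def apply_pose(points, theta, tx, ty):
    """Rotate points by theta and shift them by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def icp_2d(src, ref, init=(0.0, 0.0, 0.0), iterations=20):
    """Refine an initial pose (theta, tx, ty) of src against the reference cloud ref."""
    theta, tx, ty = init
    for _ in range(iterations):
        moved = apply_pose(src, theta, tx, ty)
        # Brute-force closest reference point for every transformed source point.
        matches = [
            min(ref, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
            for p in moved
        ]
        # Re-estimate the full pose from the original source points.
        theta, tx, ty = best_rigid_2d(src, matches)
    return theta, tx, ty
```

Starting from the measured pose matters because nearest-neighbour matching only finds correct correspondences when the initial misalignment is small.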

Camera-to-LiDAR optimization

The camera poses are optimized using camera and LiDAR recordings of fiducial targets (e.g. checkerboards). Additionally, a low-speed driving scene is used to improve the calibration of sensor orientation. This process uses features (e.g. edges) in the camera and LiDAR data to optimize the relative poses.

Data Annotation

Semantic segmentation

The dataset features 41,280 frames with semantic segmentation in 38 categories. Each pixel in an image is given a label describing the type of object it represents, e.g. pedestrian, car, vegetation, etc.

Point cloud segmentation

Point cloud segmentation is produced by fusing semantic pixel information and LiDAR point clouds. Each 3D point is thereby assigned an object type label. This relies on accurate camera-LiDAR registration.
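
The idea of fusing pixel labels with LiDAR points can be sketched as follows: project each 3D point into the labelled image and copy the label found at that pixel. This is an illustration under simplifying assumptions, not the dataset's actual pipeline. It assumes points are already expressed in the camera frame (x right, y down, z forward), uses an ideal pinhole model without distortion, and ignores occlusion; all names and parameters are hypothetical.

```python
def label_points(points_cam, label_image, fx, fy, cx, cy):
    """Return one label per 3D point, or None if it does not project into the image.

    points_cam:  iterable of (x, y, z) in the ASSUMED camera frame
    label_image: 2D list indexed as label_image[row][column]
    fx, fy, cx, cy: pinhole intrinsics in pixels
    """
    height = len(label_image)
    width = len(label_image[0])
    labels = []
    for x, y, z in points_cam:
        if z <= 0:
            # Point lies behind the camera and cannot be seen.
            labels.append(None)
            continue
        u = round(fx * x / z + cx)  # pixel column
        v = round(fy * y / z + cy)  # pixel row
        if 0 <= u < width and 0 <= v < height:
            labels.append(label_image[v][u])
        else:
            labels.append(None)
    return labels

# Tiny illustrative label image: left half "road", right half "car".
demo_image = [["road", "road", "car", "car"] for _ in range(4)]
demo_points = [(0.0, 0.0, 1.0), (-1.0, 0.0, 1.0), (5.0, 0.0, 1.0), (0.0, 0.0, -1.0)]
demo_labels = label_points(demo_points, demo_image, fx=2.0, fy=2.0, cx=2.0, cy=2.0)
```

Any error in the camera-LiDAR registration shifts the projected pixel, so points near object boundaries are the first to receive wrong labels.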

3D bounding boxes

3D bounding boxes are provided for 12,499 frames. LiDAR points within the field of view of the front camera are labelled with 3D bounding boxes. We annotate 14 classes relevant to driving, e.g. cars, pedestrians, buses, etc.
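
A common operation on such annotations is selecting the LiDAR points that fall inside a box. The sketch below assumes a box described by its centre, its size (length, width, height), and a single yaw angle about the vertical axis. This parametrisation is an assumption made for illustration and may differ from the format in which the annotations are stored.

```python
import math

def points_in_box(points, center, size, yaw):
    """Return a list of booleans, True where a point lies inside the box.

    center: (x, y, z) of the box centre
    size:   (length, width, height) along the box's own axes
    yaw:    rotation about the vertical axis in radians (ASSUMED yaw-only)
    """
    cx, cy, cz = center
    length, width, height = size
    c, s = math.cos(yaw), math.sin(yaw)
    inside = []
    for x, y, z in points:
        dx, dy, dz = x - cx, y - cy, z - cz
        # Rotate the offset by -yaw to express it in the box frame.
        bx = c * dx + s * dy
        by = -s * dx + c * dy
        inside.append(
            abs(bx) <= length / 2 and abs(by) <= width / 2 and abs(dz) <= height / 2
        )
    return inside

# Illustrative box: 4 m long, 2 m wide, 1.5 m high, rotated by 90 degrees.
demo_mask = points_in_box(
    [(10.0, 1.5, 0.0), (11.5, 0.0, 0.0), (10.0, 0.0, 1.0)],
    center=(10.0, 0.0, 0.0),
    size=(4.0, 2.0, 1.5),
    yaw=math.pi / 2,
)
```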

Citation

Please use the following citation when referencing the dataset:

@article{geyer2020a2d2,
  title={{A2D2}: {A}udi Autonomous Driving Dataset},
  author={Geyer, Jakob and Kassahun, Yohannes and Mahmudi, Mentar and Ricou, Xavier and Durgesh, Rupesh and Chung, Andrew S and Hauswald, Lorenz and Pham, Viet Hoang and M{\"u}hlegg, Maximilian and Dorn, Sebastian and others},
  journal={arXiv preprint arXiv:2004.06320},
  year={2020}
}

License

CC BY-ND 4.0

Data Summary

Type: Point Cloud, Image
Amount: --
Size: --
Provided by: Audi AG

The AUDI AG stands for sporty vehicles, high build quality and progressive design – for “Vorsprung durch Technik.” The Audi Group is among the world’s leading producers of premium cars.