Overview
What is Argoverse?
- One dataset with 3D tracking annotations for 113 scenes
- One dataset with 324,557 interesting vehicle trajectories extracted from over 1000 driving hours
- Two high-definition (HD) maps with lane centerlines, traffic direction, ground height, and more
- One API to connect the map data with sensor information
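The map and sensor data are meant to be queried together through this single Python API. As a minimal, hedged sketch (assuming the open-source argoverse-api package; the class and function names below come from that package, not from this overview, and the query coordinates are made up):

# Query HD map content around a city-frame (x, y) position.
from argoverse.map_representation.map_api import ArgoverseMap

avm = ArgoverseMap()
city_name = "MIA"                    # "MIA" for Miami, "PIT" for Pittsburgh
query_x, query_y = 365.0, 3000.0     # hypothetical city-frame coordinates, in meters

# Lane segments within a 50 m Manhattan radius of the query point.
lane_ids = avm.get_lane_ids_in_xy_bbox(query_x, query_y, city_name, 50.0)

# 3D centerline waypoints for one of those lane segments.
centerline = avm.get_lane_segment_centerline(lane_ids[0], city_name)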
Data collection
Where was the data collected?
The data in Argoverse comes from a subset of the area in which Argo AI’s self-driving test vehicles are operating in Miami and Pittsburgh — two US cities with distinct urban driving challenges and local driving habits. We include recordings of our sensor data, or "log segments," across different seasons, weather conditions, and times of day to provide a broad range of real-world driving scenarios.
Total lane coverage: 204 linear kilometers in Miami and 86 linear kilometers in Pittsburgh.
Miami
Beverly Terrace, Edgewater, Town Square
Pittsburgh
Downtown, Strip District, Lower Lawrenceville
How was the data collected?
We collected all of our data using a fleet of identical Ford Fusion Hybrids, fully integrated with Argo AI self-driving technology. We include data from two LiDAR sensors, seven ring cameras and two front-facing stereo cameras. All sensors are roof-mounted:
LiDAR
- 2 roof-mounted LiDAR sensors
- Overlapping 40° vertical field of view
- Range of 200m
- On average, our LiDAR sensors produce a point cloud with ~ 107,000 points at 10 Hz
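As a quick way to inspect one of these sweeps, here is a minimal sketch assuming the open-source argoverse-api ArgoverseTrackingLoader; the dataset path is hypothetical and the loader's method names come from that package rather than from this page:

from argoverse.data_loading.argoverse_tracking_loader import ArgoverseTrackingLoader

# Hypothetical local path to an argoverse-tracking split.
loader = ArgoverseTrackingLoader("argoverse-tracking/sample/")
log = loader.get(loader.log_list[0])   # select the first log in the split

sweep = log.get_lidar(0)               # (N, 3) array; N is roughly 107,000 points
print(sweep.shape)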
Localization
We use a city-specific coordinate system for vehicle localization. We include 6-DOF localization for each timestamp, from a combination of GPS-based and sensor-based localization methods.
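Concretely, each pose can be treated as a rotation plus a translation that maps egovehicle-frame points into the city frame. A minimal numpy sketch (the pose values below are made up for illustration):

import numpy as np

# 6-DOF city pose of the egovehicle at one timestamp:
# rotation (built from a yaw angle for brevity) and translation in meters.
yaw = np.deg2rad(30.0)
R_city_ego = np.array([
    [np.cos(yaw), -np.sin(yaw), 0.0],
    [np.sin(yaw),  np.cos(yaw), 0.0],
    [0.0,          0.0,         1.0],
])
t_city_ego = np.array([2560.0, 1220.0, -18.5])   # hypothetical city-frame position

# Transform a LiDAR point from the egovehicle frame into the city frame.
point_ego = np.array([12.0, -3.5, 0.4])
point_city = R_city_ego @ point_ego + t_city_ego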
Cameras
- Seven high-resolution ring cameras (1920 x 1200) recording at 30 Hz with a combined 360° field of view
- Two front-facing stereo cameras (2056 x 2464) sampled at 5 Hz
Calibration
Sensor measurements for each driving session are stored in “logs.” For each log, we provide intrinsic and extrinsic calibration data for the LiDAR sensors and all nine cameras.
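To illustrate how the extrinsic and intrinsic calibration are used together, the sketch below projects a single LiDAR return into one camera image. The rotation, translation, and intrinsic matrix are placeholders, not values from any actual log:

import numpy as np

# Extrinsics: rigid transform from the LiDAR frame (x forward, y left, z up)
# to the camera frame (x right, y down, z forward). Placeholder values.
R_cam_lidar = np.array([
    [ 0.0, -1.0,  0.0],
    [ 0.0,  0.0, -1.0],
    [ 1.0,  0.0,  0.0],
])
t_cam_lidar = np.array([0.0, -0.2, -0.1])

# Intrinsics for a 1920 x 1200 ring camera (placeholder focal length / principal point).
K = np.array([
    [1400.0,    0.0, 960.0],
    [   0.0, 1400.0, 600.0],
    [   0.0,    0.0,   1.0],
])

point_lidar = np.array([10.0, 1.5, 0.3])           # a LiDAR return, in meters
point_cam = R_cam_lidar @ point_lidar + t_cam_lidar
uvw = K @ point_cam
u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]            # pixel coordinates
depth = point_cam[2]                               # keep only points with depth > 0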
Data Annotation
Argoverse contains amodal 3D bounding cuboids on all objects of interest on or near the drivable area. By “amodal” we mean that the 3D extent of each cuboid represents the spatial extent of the object in 3D space — and not simply the extent of observed pixels or observed LiDAR returns, which is smaller for occluded objects and ambiguous for objects seen from only one face.
Our amodal annotations are automatically generated by fitting cuboids to each object’s LiDAR returns observed throughout an entire tracking sequence. If the full spatial extent of an object is ambiguous in one frame, information from previous or later frames can be used to constrain the shape. The size of amodal cuboids is fixed over time. A few objects in the dataset change size dynamically (e.g., a car opening a door), which causes an imperfect amodal cuboid fit.
To create amodal cuboids, we identify the points that belong to each object at every timestep. This information, as well as the orientation of each object, comes from human annotators.
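The core idea of the fitting step can be sketched as follows: accumulate each object’s LiDAR returns, expressed in the object’s annotated body frame, over the whole track, and take one bounding box over the accumulated points so the cuboid size is fixed for the sequence. A simplified numpy sketch, not the production fitting code:

import numpy as np

def fit_amodal_extent(points_per_frame, poses_per_frame):
    """points_per_frame: list of (N_i, 3) LiDAR returns on the object, in the city frame.
    poses_per_frame: list of (R, t) annotated object poses (object -> city), one per frame.
    Returns a single (length, width, height) extent used for every frame of the track."""
    accumulated = []
    for pts, (R, t) in zip(points_per_frame, poses_per_frame):
        # Move this frame's returns into the object's body frame.
        accumulated.append((pts - t) @ R)      # equivalent to R.T @ (p - t) per point
    accumulated = np.vstack(accumulated)
    # One box over all frames: occluded frames are constrained by better-observed ones.
    return accumulated.max(axis=0) - accumulated.min(axis=0)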
We provide ground truth labels for 15 object classes. Two of these classes, *ON_ROAD_OBSTACLE* and *OTHER_MOVER*, cover static and dynamic objects that lie outside of the other key categories we defined. The distribution of these object classes across all of the annotated objects in Argoverse 3D Tracking is shown in the accompanying figure.
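To tally that distribution from a local copy of the data, here is a hedged sketch; it assumes the per-sweep amodal annotation files are JSON lists whose entries carry a "label_class" field, and the directory pattern below reflects a typical argoverse-tracking download rather than anything specified in this overview:

import glob
import json
from collections import Counter

# Hypothetical dataset root; adjust to your local argoverse-tracking layout.
label_files = glob.glob("argoverse-tracking/*/*/per_sweep_annotations_amodal/*.json")

class_counts = Counter()
for path in label_files:
    with open(path) as f:
        for obj in json.load(f):
            class_counts[obj["label_class"]] += 1

for label, count in class_counts.most_common():
    print(f"{label}: {count}")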
Citation
Please use the following citation when referencing the dataset:
@INPROCEEDINGS { Argoverse,
author = {Ming-Fang Chang and John W Lambert and Patsorn Sangkloy and Jagjeet Singh
and Slawomir Bak and Andrew Hartnett and De Wang and Peter Carr
and Simon Lucey and Deva Ramanan and James Hays},
title = {Argoverse: 3D Tracking and Forecasting with Rich Maps},
booktitle = {Conference on Computer Vision and Pattern Recognition (CVPR)},
year = {2019}
}