2D Instance Segmentation
2D Semantic Segmentation
Autonomous Driving
License: Custom


The Cityscapes Dataset focuses on semantic understanding of urban street scenes. The following is an overview of the design choices made to serve that focus.


Polygonal annotations

  • Dense semantic segmentation
  • Instance segmentation for vehicles and people



  • 50 cities
  • Several months (spring, summer, fall)
  • Daytime
  • Good/medium weather conditions
  • Manually selected frames
    • Large number of dynamic objects
    • Varying scene layout
    • Varying background


  • 5 000 annotated images with fine annotations
  • 20 000 annotated images with coarse annotations


  • Preceding and trailing video frames. Each annotated image is the 20th image of a 30-frame video snippet (1.8 s)
  • Corresponding right stereo views
  • GPS coordinates
  • Ego-motion data from vehicle odometry
  • Outside temperature from vehicle sensor
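The snippet figures in the list above imply an approximate frame rate and a fixed amount of temporal context around each annotated image. The following is a minimal sketch of that arithmetic. It assumes evenly spaced frames and that "20th image" is a 1-based position; the constant and function names are illustrative and not part of the dataset tooling. Because 1.8 s is a rounded value, the frame rate is only approximate.

```python
# Figures taken from the metadata description above.
FRAMES_PER_SNIPPET = 30
SNIPPET_SECONDS = 1.8
ANNOTATED_FRAME = 20  # assumed to be a 1-based position within the snippet


def approx_frame_rate(frames: int = FRAMES_PER_SNIPPET,
                      seconds: float = SNIPPET_SECONDS) -> float:
    """Approximate frames per second, assuming evenly spaced frames."""
    return frames / seconds


def context_frames(annotated: int = ANNOTATED_FRAME,
                   frames: int = FRAMES_PER_SNIPPET) -> tuple[int, int]:
    """Number of frames before and after the annotated frame."""
    return annotated - 1, frames - annotated


preceding, trailing = context_frames()
print(f"~{approx_frame_rate():.1f} fps, "
      f"{preceding} preceding and {trailing} trailing frames")
```

Under these assumptions each annotated image comes with 19 preceding and 10 trailing frames.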

Extensions by other researchers

  • Bounding box annotations of people
  • Images augmented with fog and rain

Benchmark suite and evaluation server

  • Pixel-level semantic labeling
  • Instance-level semantic labeling
  • Panoptic semantic labeling

Labeling Policy

Labeled foreground objects must never have holes, i.e. if there is some background visible ‘through’ some foreground object, it is considered to be part of the foreground. This also applies to regions that are highly mixed with two or more classes: they are labeled with the foreground class. Examples: tree leaves in front of house or sky (everything tree), transparent car windows (everything car).
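To make the no-holes rule concrete, here is a minimal sketch that applies it to a rasterized binary mask. This is an illustration only: the dataset annotations are polygonal, and the function name `fill_foreground_holes`, the list-of-rows mask format, and the 4-connectivity are all assumptions made for this example. Background pixels that cannot be reached from the image border are treated as seen "through" the object and relabeled as foreground.

```python
from collections import deque


def fill_foreground_holes(mask):
    """Return a copy of a binary mask (list of rows of 0/1) with holes filled.

    Background pixels (0) that cannot be reached from the image border
    through other background pixels are treated as holes and set to 1.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    reachable = [[False] * width for _ in range(height)]
    queue = deque()

    # Seed the flood fill with every background pixel on the border.
    for y in range(height):
        for x in range(width):
            on_border = y in (0, height - 1) or x in (0, width - 1)
            if on_border and mask[y][x] == 0:
                reachable[y][x] = True
                queue.append((y, x))

    # 4-connected breadth-first search over background pixels.
    while queue:
        y, x = queue.popleft()
        for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
            if 0 <= ny < height and 0 <= nx < width:
                if mask[ny][nx] == 0 and not reachable[ny][nx]:
                    reachable[ny][nx] = True
                    queue.append((ny, nx))

    # Anything that is not reachable background becomes foreground.
    return [[0 if reachable[y][x] else 1 for x in range(width)]
            for y in range(height)]


car = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 0, 1, 0],  # the 0 in the middle plays the role of a window
    [0, 1, 1, 1, 0],
    [0, 0, 0, 0, 0],
]
filled = fill_foreground_holes(car)
```

In `filled`, the enclosed center pixel becomes foreground (everything car), while the surrounding background is left unchanged.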

Class Definitions

Group         Classes
flat          road · sidewalk · parking+ · rail track+
human         person* · rider*
vehicle       car* · truck* · bus* · on rails* · motorcycle* · bicycle* · caravan*+ · trailer*+
construction  building · wall · fence · guard rail+ · bridge+ · tunnel+
object        pole · pole group+ · traffic sign · traffic light
nature        vegetation · terrain
sky           sky
void          ground+ · dynamic+ · static+

* Single instance annotations are available. However, if the boundary between such instances cannot be clearly seen, the whole crowd/group is labeled together and annotated as group, e.g. car group.

+ This label is not included in any evaluation and is treated as void (or, in the case of license plate, as part of the vehicle it is mounted on).
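The two markers in the class table can be read mechanically. The sketch below copies the table into a dictionary and derives which classes are evaluated (no `+`) and which evaluated classes carry instance annotations (`*` without `+`). The names `CLASS_TABLE` and `parse_entry` are hypothetical and chosen for this example; they do not correspond to any official tooling.

```python
# Group -> class entries, copied from the table above.
# "*" marks classes with instance annotations; "+" marks classes that are
# excluded from evaluation and treated as void.
CLASS_TABLE = {
    "flat": ["road", "sidewalk", "parking+", "rail track+"],
    "human": ["person*", "rider*"],
    "vehicle": ["car*", "truck*", "bus*", "on rails*", "motorcycle*",
                "bicycle*", "caravan*+", "trailer*+"],
    "construction": ["building", "wall", "fence", "guard rail+",
                     "bridge+", "tunnel+"],
    "object": ["pole", "pole group+", "traffic sign", "traffic light"],
    "nature": ["vegetation", "terrain"],
    "sky": ["sky"],
    "void": ["ground+", "dynamic+", "static+"],
}


def parse_entry(entry):
    """Split e.g. 'caravan*+' into ('caravan', has_instances, is_evaluated)."""
    name = entry.rstrip("*+")
    markers = entry[len(name):]
    return name, "*" in markers, "+" not in markers


parsed = [parse_entry(e) for entries in CLASS_TABLE.values() for e in entries]
all_classes = [name for name, _, _ in parsed]
evaluated = [name for name, _, is_eval in parsed if is_eval]
instance_evaluated = [name for name, has_inst, is_eval in parsed
                      if has_inst and is_eval]

print(len(all_classes), len(evaluated), len(instance_evaluated))  # 30 19 8
```

Read this way, the table lists 30 classes, of which 19 are evaluated and 8 are evaluated with instance annotations.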


Please use the following citation when referencing the dataset:

@inproceedings{Cordts2016Cityscapes,
  title={The cityscapes dataset for semantic urban scene understanding},
  author={Cordts, Marius and Omran, Mohamed and Ramos, Sebastian and Rehfeld, Timo and Enzweiler, Markus and Benenson, Rodrigo and Franke, Uwe and Roth, Stefan and Schiele, Bernt},
  booktitle={Proceedings of the IEEE conference on computer vision and pattern recognition},
  year={2016}
}



Data Summary
Provided by
Daimler AG R&D
We are one of the biggest producers of premium cars and the world's biggest manufacturer of commercial vehicles with a global reach. We provide financing, leasing, fleet management, insurance and innovative mobility services.