Overview
Realism: from sunlight to sensor
Synscapes is built with an end-to-end approach to realism, accurately capturing the effects of everything from illumination by sun and sky, to the scene's geometric and material composition, to the optics, sensor and processing of the camera system. Synscapes was created in collaboration between 7DLabs Inc. and researchers at Linköping University.
25,000 procedural & unique images
The images in the dataset do not follow a driven path through a single virtual world. Instead, an entirely unique scene was procedurally generated for each of the twenty-five thousand images. As a result, the dataset contains a wide range of variations and unique combinations of features.
Physically based rendering, lights and materials
Synscapes was created using the same physically based rendering techniques that power high-end visual effects in the film industry. Unbiased path tracing tracks the propagation of light, using radiometric properties of the sun and sky, and models its interaction with surfaces using physically based reflectance models, ensuring that each image is representative of the real world.
Optical simulation
No optical system is perfect, and the effects of light scattering in a camera's lens can have a large impact on an image's appearance. In particular, when the sun is visible either directly in view, or through bright specular highlights, the image's contrast is significantly reduced. Synscapes models this effect using a long-tail point spread function (PSF).
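As an illustration of how such a PSF can be applied, the following Python (NumPy) sketch convolves an HDR image with a long-tail kernel via the FFT. The inverse-power-law kernel shape and its `alpha` parameter are assumptions chosen for illustration; the actual PSF used by Synscapes is not reproduced here.

```python
import numpy as np

def apply_long_tail_psf(image, alpha=2.5, eps=1e-3):
    """Convolve a scene-linear HDR image (H x W x C float array) with an
    illustrative long-tail PSF. The r**-alpha kernel is an assumption,
    not the PSF actually used by Synscapes."""
    h, w = image.shape[:2]
    x = np.arange(w) - w / 2
    y = np.arange(h) - h / 2
    r = np.hypot(*np.meshgrid(x, y))   # radial distance from the center
    psf = 1.0 / (r + eps) ** alpha     # sharp core with a slowly decaying tail
    psf /= psf.sum()                   # conserve total energy
    # FFT-based convolution; ifftshift moves the kernel origin to (0, 0).
    psf_f = np.fft.rfft2(np.fft.ifftshift(psf))
    out = np.empty_like(image)
    for c in range(image.shape[2]):
        out[..., c] = np.fft.irfft2(np.fft.rfft2(image[..., c]) * psf_f, s=(h, w))
    return out
```

Because the kernel's tail never quite reaches zero, any bright source (the sun, or a strong specular highlight) spreads a small amount of energy across the entire frame, which is exactly the contrast reduction described above.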
Sensor simulation and processing
As light strikes a digital sensor, photons are converted into a current, which is then processed into an image. Synscapes models this process in detail, providing an accurate representation of the following:
- Motion blur is present in each image, due both to the speed of the ego vehicle on which the camera is mounted and to that of surrounding vehicles.
- Auto-exposure ensures that each image is well exposed, but can also be "tricked" in high-contrast scenes, much as a real sensor system can.
- A 10-bit sensor is simulated, with physically plausible shot noise.
- The simulated sensor output is produced by applying a camera response curve, whose result is subsequently quantized to 8-bit PNG-format images (a minimal sketch of these steps follows this list).
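The sketch below illustrates the last three steps (shot noise, response curve, quantization). The exposure scaling and the gamma-style response curve are illustrative assumptions, not the dataset's actual pipeline.

```python
import numpy as np

def simulate_sensor(radiance, exposure=1.0, full_well=1023, rng=None):
    """Illustrative sensor model: Poisson shot noise on a 10-bit sensor,
    followed by an assumed gamma-style response curve and 8-bit output.
    `radiance` is a scene-linear array, assumed normalized so that
    radiance * exposure falls in [0, 1]."""
    rng = rng or np.random.default_rng()
    # Scale radiance into expected counts on a 10-bit sensor (0..1023).
    signal = np.clip(radiance * exposure, 0.0, 1.0) * full_well
    # Shot noise is Poisson-distributed in the photon/electron count.
    noisy = rng.poisson(signal).astype(np.float64)
    linear = np.clip(noisy / full_well, 0.0, 1.0)
    # Placeholder response curve; Synscapes' actual curve is not published here.
    response = linear ** (1.0 / 2.2)
    # Quantize to 8-bit, as for the PNG output described above.
    return np.round(response * 255.0).astype(np.uint8)
```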
Multi-dimensional distribution
Synscapes was constructed in such a way that each parameter varies independently, providing a broad distribution across all dimensions of variation. As explored further in our white paper, this property allows for analysis of a neural network's performance by evaluating it selectively on different parts of the dataset, for example the 10% nearest to sunrise or the 5% with the narrowest sidewalk.
When the dataset is segmented into 10 subsets according to the sun_height metadata property, each subset still exhibits a consistent distribution across all other dimensions, with ego_speed and sidewalk_width as examples.
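As a minimal sketch of this kind of selective evaluation, the snippet below picks out the fraction of frames with the lowest (or highest) value of a scalar scene-metadata key, for example the 10% nearest to sunrise/sunset. It assumes each meta/<id>.json stores the scene properties under a "scene" key; adjust the lookup to the actual JSON structure.

```python
import json
from pathlib import Path

def select_fraction(root, key, fraction=0.1, lowest=True):
    """Return frame ids for the `fraction` of frames with the lowest
    (or highest) value of a scalar scene-metadata key.

    Assumes each meta/<id>.json nests scene properties under a "scene"
    key; this nesting is an assumption to verify against the files."""
    values = {}
    for path in Path(root, "meta").glob("*.json"):
        with open(path) as f:
            meta = json.load(f)
        values[int(path.stem)] = meta["scene"][key]
    ranked = sorted(values, key=values.get, reverse=not lowest)
    return ranked[: max(1, int(len(ranked) * fraction))]

# e.g. the 10% of images nearest sunrise/sunset:
near_sunrise = select_fraction("/data/synscapes", "sun_height", 0.1)
```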
Data annotation
RGB data
<img src="https://tutu.s3.cn-northwest-1.amazonaws.com.cn/open+dataset/md_images/178.jpg" alt="img" style="zoom:33%;" />
Class as single-channel PNG
The class annotations follow the Cityscapes convention.
Instance as PNG
The instance id can be found as R + G * 256 + B * 256^2.
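For example, with NumPy and Pillow the per-pixel instance ids can be decoded directly from the PNG:

```python
import numpy as np
from PIL import Image

def load_instance_ids(path):
    """Decode per-pixel instance ids from an RGB instance PNG,
    using id = R + G * 256 + B * 256**2 as described above."""
    rgb = np.asarray(Image.open(path), dtype=np.uint32)
    return rgb[..., 0] + rgb[..., 1] * 256 + rgb[..., 2] * 256**2

ids = load_instance_ids("img/instance/1.png")
print(np.unique(ids))  # instance ids present in this frame
```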
Depth as floating point OpenEXR
Stores the planar depth (not the Euclidean distance to the camera) in meters.
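The depth files can be read with the OpenEXR Python bindings, as in the sketch below. The channel name is an assumption (single-channel EXRs commonly use "R" or "Z"), so inspect `exr.header()["channels"]` if it differs.

```python
import numpy as np
import OpenEXR
import Imath

def load_depth(path, channel="R"):
    """Read planar depth in meters from an OpenEXR file.

    The channel name is an assumption; check the file header to see
    which channels the dataset's EXR files actually contain."""
    exr = OpenEXR.InputFile(path)
    dw = exr.header()["dataWindow"]
    w = dw.max.x - dw.min.x + 1
    h = dw.max.y - dw.min.y + 1
    raw = exr.channel(channel, Imath.PixelType(Imath.PixelType.FLOAT))
    return np.frombuffer(raw, dtype=np.float32).reshape(h, w)
```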
Instructions
Dataset Layout
Synscapes is organized into the following directories:
```
|-- img
|   |-- class     [1-25000].png
|   |-- depth     [1-25000].exr
|   |-- instance  [1-25000].png
|   |-- rgb       [1-25000].png
|   |-- rgb-2k    [1-25000].png
|-- meta          [1-25000].json
```
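A small helper can map a frame index to its files, following the layout above. The per-frame file naming (`<index>.png` and so on) is inferred from the `[1-25000]` notation and should be verified against an actual download.

```python
from pathlib import Path

def frame_paths(root, idx):
    """Map a frame index (1-25000) to its files, per the layout above."""
    root = Path(root)
    return {
        "rgb":      root / "img" / "rgb" / f"{idx}.png",
        "rgb_2k":   root / "img" / "rgb-2k" / f"{idx}.png",
        "class":    root / "img" / "class" / f"{idx}.png",
        "instance": root / "img" / "instance" / f"{idx}.png",
        "depth":    root / "img" / "depth" / f"{idx}.exr",
        "meta":     root / "meta" / f"{idx}.json",
    }
```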
Image resolution
Synscapes' native resolution is 1440x720, stored in the img/rgb folder. In order to best support training with architectures designed for Cityscapes, we also include an up-scaled version at 2048x1024 resolution in img/rgb-2k. Note that this up-scaling precedes the sensor simulation stage, ensuring pixel noise is present at the appropriate scale.
Camera metadata
The camera's position and field of view are as follows:
"camera": {
"extrinsic": {
"pitch": 0.038,
"roll": -0.0,
"x": 1.7,
"y": 0.1,
"yaw": -0.0195,
"z": 1.22
},
"intrinsic": {
"fx": 1590.83437,
"fy": 1592.79032,
"resx": 1440,
"resy": 720,
"u0": 771.31406,
"v0": 360.79945
}
}
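To make these parameters concrete, here is a hedged sketch of projecting a point from ego-vehicle coordinates into pixel coordinates. The Euler-angle order and the camera-frame axis conventions are assumptions (they are not specified above), so verify them against the 2D/3D bounding-box metadata before relying on the result.

```python
import numpy as np

def project(point_ego, cam):
    """Project a 3D point in ego-vehicle coordinates to pixel coordinates.

    Assumes x forward, y left, z up, with yaw about z, pitch about y and
    roll about x applied in that order; these conventions are guesses."""
    e, i = cam["extrinsic"], cam["intrinsic"]
    cy, sy = np.cos(e["yaw"]), np.sin(e["yaw"])
    cp, sp = np.cos(e["pitch"]), np.sin(e["pitch"])
    cr, sr = np.cos(e["roll"]), np.sin(e["roll"])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    R = Rz @ Ry @ Rx                            # camera-to-ego rotation (assumed)
    t = np.array([e["x"], e["y"], e["z"]])
    x, y, z = R.T @ (np.asarray(point_ego) - t)  # ego -> camera frame
    # Map the assumed (x fwd, y left, z up) frame onto the usual pinhole
    # axes (x right, y down, z fwd); requires x > 0 (point in front).
    u = i["u0"] + i["fx"] * (-y / x)
    v = i["v0"] + i["fy"] * (-z / x)
    return u, v
```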
Instance metadata
- 2D bounding boxes
- 3D bounding boxes in ego-vehicle coordinates
- Occlusion (fraction of the object hidden behind other objects)
- Truncation (fraction of the object outside the field of view)
Note: 'class' is also recorded in the JSON file, to facilitate instance-to-class mapping without having to refer to the PNG file.
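For example, one might filter each frame's instances to those that are mostly visible. The sketch below assumes the per-instance records live under an "instance" key with "occluded" and "truncated" fields; these key names are assumptions to check against the actual JSON schema.

```python
import json

def visible_instances(meta_path, max_occlusion=0.5, max_truncation=0.3):
    """Yield (instance_id, record) pairs for sufficiently visible objects.

    The "instance", "occluded" and "truncated" key names are assumptions;
    adjust them to the actual JSON schema."""
    with open(meta_path) as f:
        meta = json.load(f)
    for inst_id, rec in meta["instance"].items():
        if rec["occluded"] <= max_occlusion and rec["truncated"] <= max_truncation:
            yield inst_id, rec
```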
Scene Metadata
- altitude_variation: The largest altitude difference in the scene, in meters.
- curb_height: The height of the sidewalk curb, in meters.
- dist_*_{mean,stddev}: For each actor class, the mean and standard deviation of the distance to all visible instances.
- ego_speed: The speed of the ego vehicle at the time of image capture, in m/s.
- fence_{presence,height}: Whether fences are present in the image (note that due to occlusion, a fence may be hidden behind another object), and their height in meters.
- median_presence: Whether the road median is present.
- num_*: For each actor class, the number of visible instances.
- parking_{presence,angle}: Whether a parking lane is present, and whether cars park at 0 (parallel), 45 or 90 degrees.
- rel_dist_to_isect: Relative distance to the nearest intersection. 0.0 indicates the ego vehicle is inside the intersection; 1.0 indicates it is a full city block away from the next intersection.
- road_material_type: Integer identifying the material used for the road surface.
- sidewalk_width: The width of the sidewalk, in meters.
- sky_contrast: The logarithm of the sky's contrast, measured as max/mean. Values around 2-3 indicate a fully overcast sky; 5-6 indicate direct sunlight.
- sun_height: The normalized angular height of the sun. 0.0 indicates sunset/sunrise; 1.0 indicates zenith.
- wall_{presence,height}: Whether the wall class is present, with height in meters.
Citation
Please use the following citation when referencing the dataset:
```bibtex
@article{wrenninge2018synscapes,
  title={Synscapes: A photorealistic synthetic dataset for street scene parsing},
  author={Wrenninge, Magnus and Unger, Jonas},
  journal={arXiv preprint arXiv:1810.08705},
  year={2018}
}
```