
[GENERAL] A Review on 2D Semantic Segmentation Terms

Published at 2022-02-18
A2D2 Datasets

Semantic segmentation, alongside object detection, is one of the classic topics in computer vision. Unlike object detection, semantic segmentation aims to label each pixel in the image with its corresponding class. There are three major criteria for assessing a semantic segmentation algorithm: memory usage, accuracy, and execution time. Here we walk you through some of the most popular metrics for evaluating accuracy.

Before we proceed to the terms, let's assume that there are $k+1$ classes (labeled $0$ to $k$), where $p_{ij}$ indicates the number of pixels that are from class $i$ but are classified into class $j$. With that in mind, for a given class $i$ we can further define $p_{ii}$ to be the number of true positives, $p_{ij}$ (with $j \neq i$) to be the number of false negatives, and $p_{ji}$ (with $j \neq i$) to be the number of false positives.
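The counts described above are the entries of a confusion matrix. The following is a minimal sketch of how such a matrix could be built with NumPy, assuming the ground truth and the prediction are integer label maps of the same shape; the function name `confusion_matrix` and the toy arrays are made up for illustration.

```python
import numpy as np

def confusion_matrix(gt, pred, num_classes):
    """Return P where P[i, j] counts pixels of true class i predicted as class j."""
    gt = np.asarray(gt).ravel()
    pred = np.asarray(pred).ravel()
    # Encode each (true, predicted) pair as one integer, then count occurrences.
    flat = gt * num_classes + pred
    counts = np.bincount(flat, minlength=num_classes * num_classes)
    return counts.reshape(num_classes, num_classes)

# Toy 2x3 "image" with three classes (hypothetical data).
gt = np.array([[0, 0, 1],
               [1, 2, 2]])
pred = np.array([[0, 1, 1],
                 [1, 2, 0]])

P = confusion_matrix(gt, pred, num_classes=3)

true_positives = np.diag(P)                    # diagonal entries
false_negatives = P.sum(axis=1) - np.diag(P)   # rest of each row
false_positives = P.sum(axis=0) - np.diag(P)   # rest of each column
```

Rows index the true class and columns the predicted class, so the off-diagonal entries of row `i` are the false negatives of class `i` and the off-diagonal entries of column `i` are its false positives.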

Pixel Accuracy

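Pixel accuracy (PA) is perhaps the most straightforward criterion for evaluating how well an algorithm works. It is defined as the ratio of the number of correctly classified pixels to the total number of pixels:

$$PA = \frac{\sum_{i=0}^{k} p_{ii}}{\sum_{i=0}^{k} \sum_{j=0}^{k} p_{ij}}$$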
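As a rough sketch, pixel accuracy can be computed from a confusion matrix as the sum of the diagonal divided by the sum of all entries. The matrix `P` below is a hypothetical example with rows as true classes and columns as predicted classes.

```python
import numpy as np

def pixel_accuracy(P):
    """Correctly classified pixels divided by all pixels."""
    P = np.asarray(P, dtype=np.float64)
    return np.trace(P) / P.sum()

# Hypothetical confusion matrix: 4 of 6 pixels lie on the diagonal.
P = np.array([[1, 1, 0],
              [0, 2, 0],
              [1, 0, 1]])

pa = pixel_accuracy(P)
```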

Mean Pixel Accuracy

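MPA, short for Mean Pixel Accuracy, computes the pixel accuracy of each class separately and then averages over all classes, namely

$$MPA = \frac{1}{k+1} \sum_{i=0}^{k} \frac{p_{ii}}{\sum_{j=0}^{k} p_{ij}}$$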
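A minimal sketch of this per-class averaging, assuming a confusion matrix with rows as true classes and every class present in the ground truth at least once (otherwise a row sum would be zero). The matrix values are invented for illustration.

```python
import numpy as np

def mean_pixel_accuracy(P):
    """Return (MPA, per-class accuracies) for confusion matrix P."""
    P = np.asarray(P, dtype=np.float64)
    # Per-class accuracy: true positives over all pixels of that true class.
    # Assumes no row sums to zero, i.e. every class occurs in the ground truth.
    per_class = np.diag(P) / P.sum(axis=1)
    return per_class.mean(), per_class

# Hypothetical confusion matrix for three classes.
P = np.array([[8, 2, 0],
              [1, 1, 0],
              [0, 0, 3]])

mpa, per_class = mean_pixel_accuracy(P)
```

Here the per-class accuracies are 0.8, 0.5 and 1.0, so the small second class pulls MPA below the plain pixel accuracy of 12/15.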

Mean Intersection over Union

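Before we define MIoU, we first define the IoU of a class $i$: the ratio between the intersection and the union of the predicted and ground-truth pixel sets of that class, i.e. true positives over the sum of true positives, false negatives and false positives,

$$IoU_i = \frac{p_{ii}}{\sum_{j=0}^{k} p_{ij} + \sum_{j=0}^{k} p_{ji} - p_{ii}}$$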

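MIoU is then simply the average of the per-class IoU values:

$$MIoU = \frac{1}{k+1} \sum_{i=0}^{k} \frac{p_{ii}}{\sum_{j=0}^{k} p_{ij} + \sum_{j=0}^{k} p_{ji} - p_{ii}}$$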
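The per-class IoU and its mean could be sketched as follows. This is an illustrative implementation, assuming a confusion matrix with rows as true classes and columns as predicted classes, and assuming every class appears in either the ground truth or the prediction so that no union is empty.

```python
import numpy as np

def iou_per_class(P):
    """IoU of each class: TP / (TP + FN + FP)."""
    P = np.asarray(P, dtype=np.float64)
    tp = np.diag(P)
    # Row sum = TP + FN, column sum = TP + FP, so subtract TP once.
    union = P.sum(axis=1) + P.sum(axis=0) - tp
    return tp / union

def mean_iou(P):
    """Unweighted average of the per-class IoU values."""
    return iou_per_class(P).mean()

# Hypothetical confusion matrix for three classes.
P = np.array([[8, 2, 0],
              [1, 1, 0],
              [0, 0, 3]])

iou = iou_per_class(P)
miou = mean_iou(P)
```

For this matrix the unions are 11, 4 and 3 pixels, giving IoU values of 8/11, 1/4 and 1.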

Frequency Weighted Intersection over Union

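As the name suggests, if we weight the IoU of every class by its frequency (the share of ground-truth pixels that belong to the class), we get the Frequency Weighted IoU (FWIoU):

$$FWIoU = \frac{1}{\sum_{i=0}^{k} \sum_{j=0}^{k} p_{ij}} \sum_{i=0}^{k} \frac{\left(\sum_{j=0}^{k} p_{ij}\right) p_{ii}}{\sum_{j=0}^{k} p_{ij} + \sum_{j=0}^{k} p_{ji} - p_{ii}}$$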
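A small sketch of the frequency weighting, under the same assumptions as before (rows of the confusion matrix are true classes, no class has an empty union). The matrix is a made-up example.

```python
import numpy as np

def frequency_weighted_iou(P):
    """Sum of per-class IoU weighted by each class's ground-truth frequency."""
    P = np.asarray(P, dtype=np.float64)
    tp = np.diag(P)
    union = P.sum(axis=1) + P.sum(axis=0) - tp
    iou = tp / union
    # Frequency of a class = its ground-truth pixel count over all pixels.
    freq = P.sum(axis=1) / P.sum()
    return (freq * iou).sum()

# Hypothetical confusion matrix for three classes.
P = np.array([[8, 2, 0],
              [1, 1, 0],
              [0, 0, 3]])

fwiou = frequency_weighted_iou(P)
```

Because the weights sum to one, FWIoU leans toward the IoU of the classes that cover the most pixels, while MIoU treats every class equally.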


Among all the metrics defined above, MIoU is the most widely used due to its representativeness and simplicity. It takes all the classes into account and is thus less biased. Other metrics, like Pixel Accuracy, are also popular because of their simplicity, but they are easily biased when a few classes occupy a large proportion of the image. When every class has roughly the same share of the image, however, Pixel Accuracy can be a perfectly good criterion to choose.
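To make the bias concrete, consider a hypothetical 100-pixel image in which 95 pixels are background and 5 belong to an object, and a degenerate model that predicts background everywhere. The sketch below, with invented numbers, compares the two metrics on that case.

```python
import numpy as np

def pixel_accuracy(P):
    return np.trace(P) / P.sum()

def mean_iou(P):
    tp = np.diag(P)
    union = P.sum(axis=1) + P.sum(axis=0) - tp
    return (tp / union).mean()

# Rows: true class (background, object); columns: predicted class.
# The model labels every pixel as background, so the object column is empty.
P = np.array([[95.0, 0.0],
              [5.0, 0.0]])

pa = pixel_accuracy(P)   # 0.95
miou = mean_iou(P)       # 0.475
```

Pixel Accuracy rewards the model for the dominant class alone, while MIoU drops sharply because the object class has an IoU of zero.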

Meanwhile, we also have to keep in mind that our algorithm can only be as good as our datasets. If the annotations are not accurate, the algorithm cannot work properly. Be sure to check your datasets and make sure their quality is good.

Semantic segmentation is built on a large amount of data, so its training process relies on large datasets. The Graviti open dataset platform provides many well-known datasets for free. Training on the cloud eliminates the hassle of slow dataset downloads. Graviti also provides data version control and extraordinary visualization modules. To manage large datasets, you can use data hosting on Graviti.