This article was originally published by Ta-Ying Cheng on Towards Data Science.
Data augmentation is a trick almost every machine learning engineer uses to boost their results. Simple techniques such as flipping an image can add a few percentage points of accuracy to an image classification model with little to no fine-tuning.
However, straightforward techniques such as flipping, rotating, and jittering images are not what we are going to talk about today. Instead, we will look at four unorthodox techniques introduced in the deep learning era that have proven more than promising for image-related tasks such as classification, detection, and segmentation. Be prepared, as some of these augmentation techniques may seem too odd to be true!
Cutout is perhaps the most intuitive of the four techniques in this article. The goal is to ‘cut out’ a part of the image and use the remainder, with the same label, as a new training sample (see Figure 1). Since augmentations that make images more challenging, such as jittering and colour perturbation, have proven beneficial, it is reasonable to expect cutout to work as well.
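Cutout is only a few lines to implement. Here is a minimal sketch in PyTorch; the function name, patch size, and the choice to zero out the patch (rather than fill it with a dataset mean) are my own assumptions, not taken from the original paper:

```python
import torch

def cutout(img: torch.Tensor, size: int = 8) -> torch.Tensor:
    """Zero out a random square patch of an image tensor of shape (C, H, W).

    The patch is centred at a random pixel, so it may be partially
    clipped at the image border, exactly as in the common formulation.
    """
    _, h, w = img.shape
    cy = torch.randint(h, (1,)).item()
    cx = torch.randint(w, (1,)).item()
    top, bot = max(0, cy - size // 2), min(h, cy + size // 2)
    left, right = max(0, cx - size // 2), min(w, cx + size // 2)
    out = img.clone()          # leave the original image untouched
    out[:, top:bot, left:right] = 0.0
    return out
```

Because the label is unchanged, cutout can be dropped into any existing classification pipeline as a per-image transform.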
This data augmentation strategy, introduced in 2017, is so simple that one often doubts it at first glance. Zhang et al. first proposed mixup in their paper: we interpolate two images pixel-wise, and interpolate the corresponding labels in the same proportion to form the new label.
Across numerous tests on different datasets, this simple technique boosted performance for various model backbones. The gain is conjectured to come from the soft labels mixup creates, which expose the model to a wider distribution of data during training.
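The interpolation above is one line of arithmetic once the mixing weight is sampled. A minimal sketch, assuming one-hot label tensors and the Beta-distributed weight used in the mixup paper (the function name and the default alpha are my own):

```python
import torch

def mixup(img_a, lab_a, img_b, lab_b, alpha: float = 0.2):
    """Blend two images and their one-hot labels with a Beta(alpha, alpha) weight."""
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    img = lam * img_a + (1 - lam) * img_b
    lab = lam * lab_a + (1 - lam) * lab_b
    return img, lab
```

Note that because the labels are soft, the model must be trained with a loss that accepts label distributions (e.g. cross-entropy against soft targets) rather than integer class indices.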
Now, if you are surprised that mixup works, you will be in awe that CutMix works even better! Instead of mixing every pixel, Yun et al. cut a patch from one image and paste it onto another, and the labels are mixed in proportion to the area of the pasted patch (see Figure 1). Again, this technique is simple and easy to implement, yet powerful for image classification tasks.
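A sketch of the cut-and-paste step, assuming one-hot labels; following the CutMix paper, the box side lengths scale with the square root of the sampled area fraction, and the mixing weight is recomputed from the actual pasted area in case the box is clipped at the border (the function name and defaults are my own):

```python
import torch

def cutmix(img_a, lab_a, img_b, lab_b, alpha: float = 1.0):
    """Paste a random box from img_b into img_a; mix labels by area ratio."""
    _, h, w = img_a.shape
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    # Box dimensions proportional to sqrt(1 - lam), as in the paper.
    cut_h, cut_w = int(h * (1 - lam) ** 0.5), int(w * (1 - lam) ** 0.5)
    cy = torch.randint(h, (1,)).item()
    cx = torch.randint(w, (1,)).item()
    top, bot = max(0, cy - cut_h // 2), min(h, cy + cut_h // 2)
    left, right = max(0, cx - cut_w // 2), min(w, cx + cut_w // 2)
    out = img_a.clone()
    out[:, top:bot, left:right] = img_b[:, top:bot, left:right]
    # Recompute the label weight from the area actually pasted.
    lam = 1 - (bot - top) * (right - left) / (h * w)
    return out, lam * lab_a + (1 - lam) * lab_b
```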
On a side note, the recent paper Attentive CutMix uses an attentive feature map to decide where to cut and paste, and reports even better results.
Ghiasi et al. transferred the success of CutMix to instance segmentation by randomly copying an instance from one image into another, so the model learns to segment the pasted instance properly. Likewise, the results show this augmentation to be powerful in boosting performance.
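At its core, the copy-paste step only needs the source image and a binary mask of the instance to transplant. A minimal sketch (ignoring the jittering and mask updates of the full method; the function name is my own):

```python
import torch

def copy_paste(dst_img: torch.Tensor, src_img: torch.Tensor,
               src_mask: torch.Tensor) -> torch.Tensor:
    """Paste the masked instance from src_img onto dst_img.

    dst_img, src_img: (C, H, W) tensors of equal size.
    src_mask: (H, W) boolean tensor marking the instance's pixels.
    """
    out = dst_img.clone()
    out[:, src_mask] = src_img[:, src_mask]  # overwrite only masked pixels
    return out
```

In the full method, the pasted instance's mask is also added to the target image's annotations, and occluded regions of existing instances are removed from theirs.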
Testing all these data augmentation techniques is straightforward with frameworks such as PyTorch. One can simply alter the images and corresponding labels inside the dataset or data loader to implement any of the techniques above.
However, since we have to dive into the data pipeline, we cannot use PyTorch's built-in data loaders entirely off the shelf. One platform I found useful is the Graviti Open Dataset platform, which connects to numerous academically renowned datasets (e.g., CIFAR10, ImageNet) and saves the trouble of researching which datasets are commonly used for particular tasks. I would also recommend PyTorch's own tutorial on data loaders when adding your augmentations.
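One convenient pattern is to wrap an existing classification dataset so that every `__getitem__` call mixes the requested sample with a second, randomly drawn one. A sketch using mixup, with a hypothetical wrapper class of my own design; `base` can be any dataset yielding `(image_tensor, int_label)` pairs:

```python
import torch
from torch.utils.data import Dataset

class MixupDataset(Dataset):
    """Wrap a classification dataset to return mixup-ed samples with soft labels."""

    def __init__(self, base, num_classes: int, alpha: float = 0.2):
        self.base = base
        self.num_classes = num_classes
        self.alpha = alpha

    def __len__(self):
        return len(self.base)

    def __getitem__(self, i):
        # Draw a random partner sample to mix with.
        j = torch.randint(len(self.base), (1,)).item()
        img_a, lab_a = self.base[i]
        img_b, lab_b = self.base[j]
        lam = torch.distributions.Beta(self.alpha, self.alpha).sample().item()
        onehot = torch.eye(self.num_classes)
        img = lam * img_a + (1 - lam) * img_b
        lab = lam * onehot[lab_a] + (1 - lam) * onehot[lab_b]
        return img, lab
```

The wrapped dataset plugs directly into a standard `DataLoader`, so the training loop only needs a loss that accepts soft targets.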
So there you have it! Hopefully these give you a few more tricks to play with to boost your image models to the next level!
Thank you for making it this far 🙏! I will be posting more on different areas of computer vision/deep learning, so join and subscribe if you are interested to know more!