Author: Vladimir Haltakov
Imagine we want to detect all pixels belonging to a traffic light in images from a self-driving car's camera. We train a model that achieves 99.88% accuracy. Pretty cool, right?
Actually, this model is useless ❌
Let me explain 👇

The problem is that the data is severely imbalanced: the ratio between background pixels and traffic light pixels is 800:1.
If we don't take any measures, our model will learn to classify every pixel as background, giving us 99.88% accuracy. But it's useless!
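To see where that 99.88% comes from, here is a quick back-of-the-envelope check in Python (the pixel counts are hypothetical, chosen to match the 800:1 ratio):

```python
# Hypothetical pixel counts matching the 800:1 imbalance
background = 800_000
traffic_light = 1_000

# A degenerate "model" that predicts background for every pixel
# is correct on all background pixels and wrong on every
# traffic light pixel.
correct = background
total = background + traffic_light
accuracy = correct / total
print(f"{accuracy:.2%}")  # 99.88%
```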
What can we do? 👇
Let me tell you about 4 ways of dealing with imbalanced data:
▪️ Choosing the right evaluation metric
▪️ Undersampling your dataset
▪️ Oversampling your dataset
▪️ Adapting the loss
Let's dive in 👇
1️⃣ Evaluation metrics
Looking at the overall accuracy is a very bad idea when dealing with imbalanced data. There are other measures that are much better suited:
▪️ Precision
▪️ Recall
▪️ F1 score
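As a quick illustration of why these metrics expose the problem, here is a minimal pure-Python sketch (the counts are hypothetical) that scores the "always background" model:

```python
# Hypothetical confusion matrix counts for the "always background" model:
tp = 0        # traffic light pixels correctly detected
fp = 0        # background pixels wrongly flagged as traffic light
fn = 1_000    # traffic light pixels missed
tn = 800_000  # background pixels correctly ignored

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Accuracy looks great, but precision, recall and F1 are all 0.
print(f"accuracy={accuracy:.2%}, precision={precision:.2f}, "
      f"recall={recall:.2f}, F1={f1:.2f}")
```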
I wrote a whole thread on this topic:
How to evaluate your ML model? 📏
— Vladimir Haltakov (@haltakov) August 31, 2021
Your accuracy is 97%, so this is pretty good, right? Right? No! ❌
Just looking at the model accuracy is not enough. Let me tell you about some other metrics:
▪️ Recall
▪️ Precision
▪️ F1 score
▪️ Confusion matrix
Let's start 👇
2️⃣ Undersampling
The idea is to throw away samples of the overrepresented classes.
One way to do this is to randomly throw away samples. Ideally, though, we only want to throw away samples that are similar to other samples we keep, so that little information is lost.
Here is a strategy to achieve that 👇
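The similarity-aware variant aside, the simplest baseline is plain random undersampling. A minimal sketch (the function name and ratio are illustrative, not a specific library API):

```python
import random

def undersample(samples, labels, majority_label, keep_ratio, seed=0):
    """Keep all minority samples; keep each majority sample with probability keep_ratio."""
    rng = random.Random(seed)
    kept = [(s, y) for s, y in zip(samples, labels)
            if y != majority_label or rng.random() < keep_ratio]
    return [s for s, _ in kept], [y for _, y in kept]

# 800:1 imbalance: 800 background samples, 1 traffic light sample
X = list(range(801))
y = ["background"] * 800 + ["traffic_light"]

# Keep roughly 1 in 800 background samples to balance the classes
X_bal, y_bal = undersample(X, y, "background", keep_ratio=1 / 800)
```

In practice you would undersample only the training split, never the evaluation data, so your metrics still reflect the real class distribution.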

Check out this thread for short reviews of some interesting Machine Learning and Computer Vision papers. I explain the basic ideas and main takeaways of each paper in a Twitter thread.
I'm adding new reviews all the time!
AlexNet - the paper that started the deep learning revolution in Computer Vision!
It's finally time for some paper review! 📜🔍🧐
— Vladimir Haltakov (@haltakov) September 28, 2020
I promised the other day to start posting threads with summaries of papers that had a big impact on the field of ML and CV.
Here is the first one - the AlexNet paper!
DenseNet - reducing the size and complexity of CNNs by adding dense connections between layers.
ML paper review time - DenseNet! 🕸️
— Vladimir Haltakov (@haltakov) October 15, 2020
This paper won the Best Paper Award at the 2017 Conference on Computer Vision and Pattern Recognition (CVPR), the top conference for computer vision.
It introduces a new CNN architecture where the layers are densely connected.
Playing for data - generating synthetic ground truth from a video game (GTA V) and using it to improve semantic segmentation models.
Time for another ML paper review - generating synthetic ground truth data from video games! 🎮
— Vladimir Haltakov (@haltakov) October 5, 2020
I love this paper because it pushes the boundaries of creating realistic synthetic ground truth data and shows that you can use it for training to improve your model.
Details 👇
Transformers for image recognition - a new paper with the potential to replace convolutions with a transformer.
Another paper review, but a little different this time... 🤷‍♂️
— Vladimir Haltakov (@haltakov) October 5, 2020
The paper is not published yet, but is submitted for review at ICLR 2021. It is getting a lot of attention from the CV/ML community, though, and many speculate that it is the end of CNNs... 👇 https://t.co/bh6wUxYfxu