
Do you know what's better than a machine learning model?

Two models.

More than one model working together to solve a problem is called "an ensemble." A simple way to build one is to have each model vote for an answer.

But there's a problem with this approach: ↓

I'm gonna focus here on image classification.

Let's assume you built two different models:

• Model 1: A ResNet model.
• Model 2: A one-shot model (Siamese network.)

They both solve the same problem, so you want to combine their results to pick the right answer.
The problem is that you have two models, so voting is not trivial.

What happens in this case?

• Model 1's answer: Class A
• Model 2's answer: Class B

Which one do you select?
Notice that this problem is not limited to an even number of models.

You could have 3 models, each giving you a different answer.
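
To make the tie concrete, here is a minimal sketch of plain majority voting. The function name `majority_vote` and the labels are made up for illustration; it returns `None` whenever the top spot is shared, which is exactly the case where voting alone cannot decide.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the most common label, or None when the top spot is tied."""
    counts = Counter(predictions).most_common()
    if not counts:
        return None  # no predictions at all
    if len(counts) > 1 and counts[0][1] == counts[1][1]:
        return None  # tie: plain voting cannot pick a winner
    return counts[0][0]


print(majority_vote(["A", "B"]))       # two models that disagree
print(majority_vote(["A", "B", "C"]))  # three models, three answers
print(majority_vote(["A", "A", "B"]))  # a clear majority
```

The first two calls end in a tie; only the last one produces a winner.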

How do you decide which answer to choose?
There are multiple ways to approach this problem. I'll mention a few different ideas on this thread.

Important: Some of these ideas might not be feasible depending on your context. They have worked for me before in different situations, but every problem is different.

Here is a solution:

• Take 6 months' worth of data
• Compute the prior probability of every class
• Run the data through your ensemble
• Track the results of the models
• Use performance and priors to weight these results

Let's try to break these down.
The prior probability of each class tells us how likely we are to get one specific result from a model.

If I tell you that I saw a plane, you would believe me. But how about if I tell you I saw a UFO?

Planes have a higher chance of being the correct answer.
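
As a rough sketch, priors can be estimated by counting how often each class shows up in the historical labels. The `class_priors` helper and the plane/UFO counts below are hypothetical, not real data.

```python
from collections import Counter


def class_priors(labels):
    """Estimate the prior of each class as its share of the labels."""
    counts = Counter(labels)
    total = len(labels)
    return {label: n / total for label, n in counts.items()}


# Hypothetical history: planes are common, UFOs are rare.
history = ["plane"] * 98 + ["ufo"] * 2
priors = class_priors(history)
print(priors)
```

With these made-up counts, "plane" gets a prior of about 0.98 and "ufo" about 0.02.
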
The second component is the performance of each model on every class.

For example, Model 1 might be really good at identifying planes, but Model 2 may constantly make mistakes.

This should tell us how much we should believe the results from each model.
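
One possible way to measure this is per-class precision: of all the times a model answered with a given class, how often was it right? The sketch below assumes you have ground-truth labels for the tracked data; the helper name and the tiny sample are invented for illustration.

```python
from collections import defaultdict


def per_class_precision(y_true, y_pred):
    """For each predicted class, the fraction of those predictions that were correct."""
    predicted = defaultdict(int)
    correct = defaultdict(int)
    for truth, guess in zip(y_true, y_pred):
        predicted[guess] += 1
        if truth == guess:
            correct[guess] += 1
    return {label: correct[label] / n for label, n in predicted.items()}


# Hypothetical ground truth and answers from two models.
y_true = ["plane", "plane", "plane", "plane", "ufo", "ufo"]
model_1 = ["plane", "plane", "plane", "plane", "plane", "ufo"]
model_2 = ["plane", "ufo", "ufo", "plane", "plane", "plane"]

print(per_class_precision(y_true, model_1))
print(per_class_precision(y_true, model_2))
```

On this sample, Model 1 is right 4 out of 5 times when it says "plane", while Model 2 is right only 2 out of 4 times.
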
A third component may be the score assigned by the model.

In the case of the ResNet model, the softmax probability. In the case of the one-shot model, the similarity score.
These three different features can help us evaluate each answer and decide which one is more likely to be correct.

The ensemble then becomes:

• Model 1
• Model 2
• Model 3 ← This one is the new model deciding which answer to pick.
Keep in mind that introducing a third model adds complexity to the system.

Sometimes, a simple heuristic might be a good enough solution.
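
Here is one such heuristic as a sketch, not a recommendation: weight each answer by prior × per-class performance × model score and keep the label with the largest total. Everything here is an assumption for illustration: the `pick_answer` name, the numbers, and the idea that a softmax probability and a similarity score are comparable on a 0 to 1 scale (in practice they would likely need calibration first).

```python
from collections import defaultdict


def pick_answer(candidates, priors, precision):
    """candidates: list of (model_name, label, score) with score in [0, 1].

    Each answer is weighted by prior * per-class precision * score;
    weights for the same label are summed and the largest total wins.
    """
    totals = defaultdict(float)
    for model, label, score in candidates:
        weight = priors.get(label, 0.0) * precision[model].get(label, 0.0) * score
        totals[label] += weight
    return max(totals, key=totals.get)


# Hypothetical numbers for the two models in this thread.
priors = {"plane": 0.98, "ufo": 0.02}
precision = {
    "resnet": {"plane": 0.8, "ufo": 0.6},
    "siamese": {"plane": 0.5, "ufo": 0.4},
}
candidates = [("resnet", "plane", 0.70), ("siamese", "ufo", 0.95)]

print(pick_answer(candidates, priors, precision))
```

Even though the one-shot model is more confident in its answer, the low prior for "ufo" and its weaker track record make "plane" the pick here.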

It's our job to weigh the pros and cons. Better performance is just one side of the equation.
If you enjoy these threads, follow me @svpino as I help you deconstruct machine learning and turn it into Your Next Big Thing™.

Do you have any experience dealing with ensemble voting? Any other ideas that come to mind on how to tackle this problem?

More from Santiago

10 machine learning YouTube videos.

On libraries, algorithms, and tools.

(If you want to start with machine learning, having a comprehensive set of hands-on tutorials you can always refer to is fundamental.)

🧵👇

1⃣ Notebooks are a fantastic way to code, experiment, and communicate your results.

Take a look at @CoreyMSchafer's fantastic 30-minute tutorial on Jupyter Notebooks.

https://t.co/HqE9yt8TkB


2⃣ The Pandas library is the gold standard for manipulating structured data.

Check out @joejamesusa's "Pandas Tutorial. Intro to DataFrames."

https://t.co/aOLh0dcGF5


3⃣ Data visualization is key for anyone practicing machine learning.

Check out @blondiebytes's "Learn Matplotlib in 6 minutes" tutorial.

https://t.co/QxjsODI1HB


4⃣ Another trendy data visualization library is Seaborn.

@NewThinkTank put together "Seaborn Tutorial 2020," which I highly recommend.

https://t.co/eAU5NBucbm
You gotta think about this one carefully!

Imagine you go to the doctor and get tested for a rare disease (only 1 in 10,000 people get it.)

The test is 99% effective in detecting both sick and healthy people.

Your test comes back positive.

Are you really sick? Explain below 👇

The most complete answer from every reply so far is from Dr. Lena. Thanks for taking the time and going through


You can get the answer using Bayes' theorem, but let's try to come up with it in a different —maybe more intuitive— way.

👇


Here is what we know:

- Out of 10,000 people, 1 is sick
- Out of 100 sick people, 99 test positive
- Out of 100 healthy people, 99 test negative

Assuming 1 million people take the test (including you):

- 100 of them are sick
- 999,900 of them are healthy

👇

Let's now test both groups, starting with the 100 sick people:

▫️ 99 of them will be diagnosed (correctly) as sick (99%)

▫️ 1 of them is going to be diagnosed (incorrectly) as healthy (1%)

👇
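
The same counting can be carried through both groups in a few lines. This is a sketch that uses only the numbers stated above (1 in 10,000 sick, a test that is 99% accurate for both groups, 1 million people tested); the variable names are mine.

```python
population = 1_000_000

sick = population // 10_000           # 1 in 10,000 people is sick -> 100
healthy = population - sick           # 999,900

true_positives = sick * 99 // 100     # sick people who test positive -> 99
false_positives = healthy * 1 // 100  # healthy people who test positive -> 9,999

# Of everyone who tests positive, what share is actually sick?
p_sick_given_positive = true_positives / (true_positives + false_positives)

print(true_positives, false_positives)
print(f"{p_sick_given_positive:.2%}")
```

Out of 10,098 positive results only 99 belong to sick people, so a positive test means you have a bit less than a 1% chance of being sick.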

More from Ds

1/

Get a cup of coffee.

In this thread, I'll walk you through 2 probability concepts: Standard Deviation (SD) and Mean Absolute Deviation (MAD).

This will give you insight into Fat Tails -- which are super useful in investing and in many other fields.


2/

Recently, I watched 2 probability "mini-lectures" on YouTube by Nassim Taleb.

One ~10 min lecture covered SD and MAD. The other ~6 min lecture covered Fat Tails.

In these ~16 mins, @nntaleb shared so many useful nuggets that I had to write this thread to unpack them.

3/

For those curious, here are the YouTube links to the lectures:

SD and MAD (~10 min):
https://t.co/0TwubymdE6

Fat Tails (~6 min):

4/

The first thing to understand is the concept of a Random Variable.

In essence, a Random Variable is a number that depends on a random event.

For example, when we roll a die, we get a Random Variable -- a number from the set {1, 2, 3, 4, 5, 6}.

5/

Every Random Variable has a Probability Distribution.

This tells us all the possible values the Random Variable can take, and their respective probabilities.

For example, when we roll a fair die, we get a Random Variable with this Probability Distribution:
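
For a fair die, every face is equally likely, so each value has probability 1/6. A minimal sketch that writes this distribution out and checks that the probabilities add up to 1:

```python
from fractions import Fraction

# Fair die: each of the six faces has the same probability.
distribution = {face: Fraction(1, 6) for face in range(1, 7)}

for face, probability in distribution.items():
    print(face, probability)

# A valid probability distribution must sum to exactly 1.
total = sum(distribution.values())
print(total)
```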

You May Also Like

I'm going to do two history threads on Ethiopia, one on its ancient history, one on its modern story (1800 to today). 🇪🇹

I'll begin with the ancient history ... and it goes way back. Because modern humans - and before that, the ancestors of humans - almost certainly originated in Ethiopia. 🇪🇹 (sub-thread):


The first likely historical reference to Ethiopia is ancient Egyptian records of trade expeditions to the "Land of Punt" in search of gold, ebony, ivory, incense, and wild animals, starting in c 2500 BC 🇪🇹


Ethiopians themselves believe that the Queen of Sheba, who visited Israel's King Solomon in the Bible (c 950 BC), came from Ethiopia (not Yemen, as others believe). Here she is meeting Solomon in a stained-glass window in Addis Ababa's Holy Trinity Church. 🇪🇹


References to the Queen of Sheba are everywhere in Ethiopia. The national airline's frequent flier miles are even called "ShebaMiles". 🇪🇹
🌺How Garuda became Lord Vishnu's mount, and why serpents' tongues are split in two🌺

Maharishi Kashyapa had thirteen wives. But he was especially fond of two of them, Vinata and Kadru. One day, as the sage sat in a blissful mood, the two came to him and began massaging his feet.


Pleased, Maharishi Kashyapa said, "I am especially fond of you both, so if you have any particular wish, tell me. I will surely fulfill it."

Kadru said, "My lord! My wish is to become the mother of a thousand sons."
Vinata said, "My lord! I want to be the mother of just one son, one so strong that he is more than a match for Kadru's thousand sons."
The sage said, "I will soon perform a yajna, and after the yajna both of your wishes will surely be fulfilled."


The sage performed the yajna, blessed Vinata and Kadru, and left to practice austerities. Some time later, Kadru gave birth to black serpents from a thousand eggs, and Vinata gave birth from a single egg to a radiant boy, who was named Garuda. As time passed, Garuda grew stronger and began to overpower Kadru's sons.


As a result, the bitterness between Kadru and Vinata grew day by day. One day, while the two were out walking, Kadru saw a white horse standing in the distance and said, "Can you tell me, Vinata, what color is that horse standing far away?"
Vinata said, "White."
Then Kadru said, "Will you bet on it? Its tail is black."