
The two main machine learning techniques used in the industry today:

1. Gradient Boosted Trees
2. Deep Learning

Focus on learning Scikit-Learn, XGBoost, and a Deep Learning library like Keras or PyTorch, and you'll get the most out of your time.

If you need to deal with structured data: Scikit-Learn + XGBoost.

If you need to deal with unstructured data (perceptual tasks): Keras or PyTorch.

Of course, there are many more techniques that will be helpful, but if you focus on these you'll be maximizing your ability to deliver value in the short term.

More from Santiago

You gotta think about this one carefully!

Imagine you go to the doctor and get tested for a rare disease (only 1 in 10,000 people get it).

The test is 99% effective in detecting both sick and healthy people.

Your test comes back positive.

Are you really sick? Explain below 👇

The most complete answer of all the replies so far is from Dr. Lena. Thanks for taking the time and going through


You can get the answer using Bayes' theorem, but let's try to come up with it in a different —maybe more intuitive— way.

👇


Here is what we know:

- Out of 10,000 people, 1 is sick
- Out of 100 sick people, 99 test positive
- Out of 100 healthy people, 99 test negative

Assuming 1 million people take the test (including you):

- 100 of them are sick
- 999,900 of them are healthy

👇

Let's now test both groups, starting with the 100 sick people:

▫️ 99 of them will be diagnosed (correctly) as sick (99%)

▫️ 1 of them is going to be diagnosed (incorrectly) as healthy (1%)

👇
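The counting approach can be sketched in a few lines of Python. This is only a rough sketch: the thread stops after the sick group here, so the healthy-group step is an assumption that applies the same 99% figure ("out of 100 healthy people, 99 test negative") to the 999,900 healthy people. All variable names are made up for illustration.

```python
# Counting approach for the rare-disease test, using the thread's numbers.
population = 1_000_000
prevalence_denominator = 10_000   # 1 in 10,000 people is sick
accuracy_percent = 99             # the test is right 99% of the time for both groups

sick = population // prevalence_denominator          # 100
healthy = population - sick                          # 999,900

# Sick group: 99% are correctly flagged, 1% are missed.
true_positives = sick * accuracy_percent // 100      # 99
false_negatives = sick - true_positives              # 1

# Healthy group (assumed step): 99% are correctly cleared, 1% are wrongly flagged.
true_negatives = healthy * accuracy_percent // 100   # 989,901
false_positives = healthy - true_negatives           # 9,999

# Everyone holding a positive result, sick or not.
all_positives = true_positives + false_positives     # 10,098
p_sick_given_positive = true_positives / all_positives

print(f"Positive tests: {all_positives}")
print(f"Chance of being sick after a positive test: {p_sick_given_positive:.2%}")
```

Under these assumptions, only 99 of the 10,098 positive results belong to people who are actually sick, which is a bit under 1%. The healthy group is so much larger that its 1% of false alarms swamps the true positives.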

More from Ds

1/

Get a cup of coffee.

In this thread, I'll walk you through 2 probability concepts: Standard Deviation (SD) and Mean Absolute Deviation (MAD).

This will give you insight into Fat Tails -- which are super useful in investing and in many other fields.
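For a quick feel of the two measures before the details, here is a minimal sketch. It assumes the usual population definitions (SD as the square root of the mean squared deviation from the mean, MAD as the mean absolute deviation from the mean); the two data sets are made up for illustration and are not from the lectures.

```python
from statistics import fmean, pstdev

def mean_absolute_deviation(values):
    """Average distance of each value from the mean."""
    center = fmean(values)
    return fmean([abs(v - center) for v in values])

calm = [1, 2, 3, 4, 5]     # hypothetical well-behaved data
wild = [1, 2, 3, 4, 100]   # same data with one extreme value

for name, data in (("calm", calm), ("wild", wild)):
    sd = pstdev(data)                     # population standard deviation
    mad = mean_absolute_deviation(data)
    print(f"{name}: SD={sd:.2f}  MAD={mad:.2f}  SD/MAD={sd / mad:.2f}")
```

Because SD squares each deviation, a single extreme value pulls it up more than it pulls up MAD, so the SD/MAD ratio is higher for the second data set.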


2/

Recently, I watched 2 probability "mini-lectures" on YouTube by Nassim Taleb.

One ~10 min lecture covered SD and MAD. The other ~6 min lecture covered Fat Tails.

In these ~16 mins, @nntaleb shared so many useful nuggets that I had to write this thread to unpack them.

3/

For those curious, here are the YouTube links to the lectures:

SD and MAD (~10 min):
https://t.co/0TwubymdE6

Fat Tails (~6 min):

4/

The first thing to understand is the concept of a Random Variable.

In essence, a Random Variable is a number that depends on a random event.

For example, when we roll a die, we get a Random Variable -- a number from the set {1, 2, 3, 4, 5, 6}.
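As a small illustration, a die roll can be simulated with Python's standard `random` module. The function name `roll_die` and the fixed seed are assumptions chosen just to keep the sketch repeatable.

```python
import random

def roll_die(rng: random.Random) -> int:
    """Return the outcome of one roll of a six-sided die."""
    return rng.randint(1, 6)  # randint includes both endpoints

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
rolls = [roll_die(rng) for _ in range(10)]
print(rolls)  # ten numbers, each from the set {1, 2, 3, 4, 5, 6}
```

Each call gives one realization of the Random Variable: we can't say which number will come up, only which numbers are possible.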

5/

Every Random Variable has a Probability Distribution.

This tells us all the possible values the Random Variable can take, and their respective probabilities.

For example, when we roll a fair die, we get a Random Variable with this Probability Distribution: each of the 6 values {1, 2, 3, 4, 5, 6} has probability 1/6.
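Written out as code, a fair die's Probability Distribution is just a mapping from each face to 1/6. This is a sketch assuming a standard six-sided fair die; `distribution` is a made-up name.

```python
from fractions import Fraction

# Each of the six faces is equally likely on a fair die.
faces = range(1, 7)
distribution = {face: Fraction(1, 6) for face in faces}

for face, prob in distribution.items():
    print(f"P(X = {face}) = {prob}")

# The probabilities of all possible values must add up to 1.
total = sum(distribution.values())
print(f"Total probability: {total}")
```

Using `Fraction` keeps the probabilities exact, so the total comes out to exactly 1 with no floating-point rounding.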
