The two main machine learning techniques used in the industry today:

1. Gradient Boosted Trees
2. Deep Learning

Focus on learning Scikit-Learn, XGBoost, and a Deep Learning library like Keras or PyTorch, and you'll get the most out of your time.

If you need to deal with structured data: Scikit-Learn + XGBoost.

If you need to deal with unstructured data (perceptual tasks): Keras or PyTorch.

Of course, there are many more techniques that will be helpful, but if you focus on these you'll be maximizing your ability to deliver value in the short term.

More from Ds

1/

Get a cup of coffee.

In this thread, I'll walk you through 2 probability concepts: Standard Deviation (SD) and Mean Absolute Deviation (MAD).

This will give you insight into Fat Tails -- which are super useful in investing and in many other fields.
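
To make the two measures concrete before diving in, here's a minimal Python sketch. It assumes the usual textbook definitions: SD is the square root of the average squared deviation from the mean, and MAD is the average absolute deviation from the mean (both in their "population" form, dividing by n). The data sets and function names are made up for illustration.

```python
import math


def mean(values):
    """Arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)


def standard_deviation(values):
    """Population SD: square root of the average squared deviation from the mean."""
    mu = mean(values)
    return math.sqrt(sum((x - mu) ** 2 for x in values) / len(values))


def mean_absolute_deviation(values):
    """MAD: average absolute deviation from the mean."""
    mu = mean(values)
    return sum(abs(x - mu) for x in values) / len(values)


# Hypothetical data: a "calm" sample, and the same sample plus one extreme value.
calm = [2, 4, 4, 4, 5, 5, 7, 9]
wild = calm + [100]

for name, data in (("calm", calm), ("wild", wild)):
    sd = standard_deviation(data)
    mad = mean_absolute_deviation(data)
    print(f"{name}: SD={sd:.2f}  MAD={mad:.2f}  SD/MAD={sd / mad:.2f}")
```

Because SD squares each deviation, a single extreme value pulls it up more than it pulls up MAD, so the SD/MAD ratio grows when the outlier is added. That sensitivity to extremes is where the connection to Fat Tails comes from.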


2/

Recently, I watched 2 probability "mini-lectures" on YouTube by Nassim Taleb.

One ~10 min lecture covered SD and MAD. The other ~6 min lecture covered Fat Tails.

In these ~16 mins, @nntaleb shared so many useful nuggets that I had to write this thread to unpack them.

3/

For those curious, here are the YouTube links to the lectures:

SD and MAD (~10 min):
https://t.co/0TwubymdE6

Fat Tails (~6 min):

4/

The first thing to understand is the concept of a Random Variable.

In essence, a Random Variable is a number that depends on a random event.

For example, when we roll a die, we get a Random Variable -- a number from the set {1, 2, 3, 4, 5, 6}.
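
If you like to see things in code, here's a small Python sketch of the die example. The function name `roll_die` and the seed are my own choices, not anything from the lectures; the seed is only there so the run is repeatable.

```python
import random

# Fixed seed (arbitrary) so repeated runs give the same sequence of rolls.
rng = random.Random(42)


def roll_die(rng):
    """One roll of a six-sided die: an integer from 1 to 6, inclusive."""
    return rng.randint(1, 6)


# Each call is one realization of the Random Variable.
rolls = [roll_die(rng) for _ in range(10)]
print(rolls)
```

Every number in the list is one outcome of the same Random Variable: we can't know any single value in advance, but we know it will always come from {1, 2, 3, 4, 5, 6}.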

5/

Every Random Variable has a Probability Distribution.

This tells us all the possible values the Random Variable can take, and their respective probabilities.

For example, when we roll a fair die, we get a Random Variable with this Probability Distribution: each of the values 1, 2, 3, 4, 5, and 6 has probability 1/6.
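
Here's the fair die's Probability Distribution written out as a minimal Python sketch. A fair die gives each of its six faces the same probability, 1/6. The variable names are mine, and `Fraction` is used only to keep the arithmetic exact.

```python
from fractions import Fraction

# The six possible values of the Random Variable.
faces = range(1, 7)

# A fair die assigns the same probability, 1/6, to every face.
distribution = {face: Fraction(1, 6) for face in faces}

for face, probability in distribution.items():
    print(f"P(X = {face}) = {probability}")

# Sanity check: the probabilities of all possible values must add up to 1.
total = sum(distribution.values())
print(f"Total probability = {total}")
```

The dictionary is the whole distribution: the keys are the possible values and the entries are their probabilities.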
