by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong

https://t.co/zSpp67kJSg

Note: this is probably the place you want to start. Start slowly and work on some examples. Pay close attention to the notation and get comfortable with it.

Here is a compilation of resources (books, videos & papers) to get you going.

(Note: It's not an exhaustive list but I have carefully curated it based on my experience and observations)

by Christopher Bishop

Note: Prior to the book above, this is the book that I used to recommend to get familiar with math-related concepts used in machine learning. A very solid book in my view and it's heavily referenced in academia.

by Jerome H. Friedman, Robert Tibshirani, and Trevor Hastie

Note: machine learning deals with data and, in turn, uncertainty, which is what statistics teaches. Get comfortable with topics like estimators, statistical significance, etc.

by E. T. Jaynes

Note: In machine learning, we are interested in building probabilistic models and thus you will come across concepts from probability theory like conditional probability and different probability distributions.

by Dr. Sam Cooper & Dr. David Dye

https://t.co/OYaqzlXmJG

Note: backpropagation is a key algorithm for training deep neural nets, and it relies on calculus. Get familiar with concepts like the chain rule, the Jacobian, gradient descent, etc.
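To make the chain rule and gradient descent concrete, here is a minimal sketch. It is not taken from the course: the one-neuron tanh model, the toy data, and the names `forward_backward` and `train` are all assumptions made for illustration.

```python
import numpy as np

def forward_backward(w, b, x, y):
    """One-neuron model a = tanh(w*x + b) with a mean squared error loss.

    Returns the loss and its gradients w.r.t. w and b, computed by applying
    the chain rule by hand (this is what backpropagation automates).
    """
    z = w * x + b                       # pre-activation
    a = np.tanh(z)                      # activation
    loss = 0.5 * np.mean((a - y) ** 2)

    dloss_da = (a - y) / x.size         # d loss / d a
    da_dz = 1.0 - a ** 2                # derivative of tanh
    dloss_dz = dloss_da * da_dz         # chain rule: d loss / d z
    grad_w = np.sum(dloss_dz * x)       # dz/dw = x
    grad_b = np.sum(dloss_dz)           # dz/db = 1
    return loss, grad_w, grad_b

def train(x, y, lr=0.5, steps=200):
    """Plain gradient descent from a fixed, arbitrary starting point."""
    w, b = 0.1, 0.0
    losses = []
    for _ in range(steps):
        loss, grad_w, grad_b = forward_backward(w, b, x, y)
        losses.append(loss)
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b, losses

# Toy data (hypothetical): targets generated by a known neuron.
x_data = np.linspace(-1.0, 1.0, 20)
y_data = np.tanh(2.0 * x_data - 0.5)

w_fit, b_fit, losses = train(x_data, y_data)
print("first loss:", losses[0], "last loss:", losses[-1])
```

A good habit while learning this material is to compare hand-derived gradients like `grad_w` against a finite-difference estimate of the loss.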

by Terence Parr & Jeremy Howard

https://t.co/Gk96dRsX5t

Note: In deep learning, you need to understand a bunch of fundamental matrix operations. If you want to dive deep into the math of matrix calculus this is your guide.

by Dr. Sam Cooper & Dr. David Dye

https://t.co/lNYLiMKLma

Note: a great companion to the previous video lectures. Neural networks perform transformations on data and you need linear algebra to get better intuitions.

As I tidy the notes, I need to figure out how to best publish them. Here are the topics covered so far:

I know there are a lot of you interested in these from what I gathered 1 month ago. I want to make sure they are high quality before publishing, so I will spend some time working on that. Stay

I've been writing notes for the latest Deep Learning for NLP course by Stanford.

— elvis (@omarsar0) January 14, 2022

For fun, I also started to add my own code snippets into the notes. I think this is a more efficient way to study: theory + code.

Plan to share these notes soon. Stay tuned! pic.twitter.com/hWzZDORbl6

Below is the course I've been auditing. My advice is to take it slow; there are some advanced concepts in the lectures. It took me 1 month (~3 hrs a day) to take rough notes for the first 15 lectures. Note that this is one semester of

I'm super excited about this project because my plan is to make the content more accessible so that a beginner can consume it more easily. It's tiring but I will keep at it because I know many of you will enjoy the notes and find them useful. More announcements coming soon!

NLP is evolving fast, so one idea with these notes is to create a living document that could be easily maintained by the community. Something like what we did before with NLP Overview: https://t.co/Y8Z1Svjn24

Let me know if you have any thoughts on this.

2/ In this GIF, narrow ReLU networks have a high probability of initializing near the 0 function (because of ReLU) and getting stuck. This causes the function distribution to become multi-modal over time. However, for wide ReLU networks this is not an issue.

3/ This time-evolving GP depends on two kernels: the kernel describing the GP at init, and the kernel describing the linear evolution of this GP. The former is the NNGP kernel, and the latter is the Neural Tangent Kernel (NTK).

4/ Once we have these two kernels, we can derive the GP mean and covariance at any time t via straightforward linear algebra.

5/ So it remains to calculate the NNGP kernel and the NTK for any given architecture. The first is described in https://t.co/cFWfNC5ALC and in this thread
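To illustrate the "straightforward linear algebra" in 4/, here is a hedged sketch. It assumes squared loss and continuous-time gradient descent with learning rate `eta`, for which the closed form commonly given in the wide-network literature is: mean Θ(x,X) Θ⁻¹ (I − e^{−ηΘt}) Y, with a covariance built from the same factor and the NNGP kernel K. The RBF kernels below are placeholders, not real NNGP or NTK kernels, and the names `rbf` and `gp_at_time` are hypothetical.

```python
import numpy as np

def rbf(a, b, scale):
    """Placeholder kernel for illustration; NOT an actual NNGP or NTK kernel."""
    return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / scale ** 2)

def gp_at_time(t, K_tt, K_tx, K_xx, T_tx, T_xx, y, eta=1.0):
    """GP mean and covariance on test points at training time t.

    K_* are NNGP kernel blocks, T_* are NTK blocks; the suffix tt means
    test/test, tx means test/train, xx means train/train.
    Assumes squared loss, gradient flow, and a positive definite T_xx.
    """
    lam, V = np.linalg.eigh(T_xx)
    # Eigenvalues of Theta^{-1} (I - exp(-eta * Theta * t)).
    decay = -np.expm1(-eta * lam * t) / lam
    A = T_tx @ (V * decay) @ V.T        # Theta(x, X) Theta^{-1} (I - e^{-eta Theta t})
    mean = A @ y
    cross = A @ K_tx.T                  # A K(X, x')
    cov = K_tt + A @ K_xx @ A.T - cross - cross.T
    return mean, cov

# Toy 1-D data (hypothetical).
x_train = np.array([-2.0, -1.0, 0.0, 1.0, 2.0])
y_train = np.sin(x_train)
x_test = np.linspace(-2.5, 2.5, 7)

# Stand-ins for the NNGP kernel (K) and the NTK (T).
K_tt, K_tx, K_xx = rbf(x_test, x_test, 0.7), rbf(x_test, x_train, 0.7), rbf(x_train, x_train, 0.7)
T_tx, T_xx = rbf(x_test, x_train, 1.0), rbf(x_train, x_train, 1.0)

for t in (0.0, 1.0, 100.0):
    mean, cov = gp_at_time(t, K_tt, K_tx, K_xx, T_tx, T_xx, y_train)
    print(t, np.round(mean, 3), np.round(np.diag(cov), 3))
```

Two sanity checks follow from the formula: at t = 0 the mean is zero and the covariance is just the NNGP kernel, and as t grows the mean on the training inputs approaches the training targets while the covariance there shrinks to zero.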

As a dean of a major academic institution, I could not have said this. But I will now. Requiring such statements in applications for appointments and promotions is an affront to academic freedom, and diminishes the true value of diversity, equity of inclusion by trivializing it. https://t.co/NfcI5VLODi

— Jeffrey Flier (@jflier) November 10, 2018

We know that elite institutions like the one Flier was in (partial) charge of rely on irrelevant status markers like private school education, whiteness, legacy, and ability to charm an old white guy at an interview.

Harvard's discriminatory policies are becoming increasingly well known across the political spectrum (see, e.g., the recent lawsuit on discrimination against East Asian applicants).

It's refreshing to hear a senior administrator admit to personally opposing policies that attempt to remedy these basic flaws. These are flaws that harm his institution's ability to do cutting-edge research and to serve the public.

Harvard is being eclipsed by institutions that have different ideas about how to run a 21st-century institution. Stanford, for one; the UC system; the "public Ivies".