1/ An ∞-wide NN of *any architecture* is a Gaussian process (GP) at init. In fact, the NN evolves linearly in function space under SGD, so it is a GP at *any time* during training. https://t.co/v1b6kndqCk With Tensor Programs, we can calculate this time-evolving GP w/o training any NN
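To unpack "evolves linearly": in the infinite-width limit, the training-set outputs follow a linear ODE driven by the NTK, which is deterministic and frozen at init. A sketch of the standard derivation, assuming gradient flow on squared loss with learning rate η (Θ is the NTK on the training inputs X, Y the labels):

```latex
\frac{d f_t(X)}{dt} = -\eta\,\Theta(X,X)\,\bigl(f_t(X) - Y\bigr)
\;\;\Longrightarrow\;\;
f_t(X) = Y + e^{-\eta\,\Theta(X,X)\,t}\,\bigl(f_0(X) - Y\bigr)
```

Since f_t is affine in the Gaussian-distributed f_0, it stays Gaussian at every time t.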

2/ In this gif, narrow relu networks have a high probability of initializing near the zero function (because relu zeroes out negative preactivations) and getting stuck there. This causes the function distribution to become multi-modal over time. For wide relu networks, however, this is not an issue.
3/ This time-evolving GP depends on two kernels: the kernel describing the GP at init, and the kernel describing the linear evolution of this GP. The former is the NNGP kernel, and the latter is the Neural Tangent Kernel (NTK).
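Both kernels have closed forms for simple architectures. Here's a minimal NumPy sketch of the standard NNGP/NTK recursions for a fully-connected relu net (the arc-cosine formulas); the depth and variance parameters are illustrative assumptions, not values from the papers:

```python
import numpy as np

def relu_nngp_ntk(X, Z, depth=3, sw2=2.0, sb2=0.0):
    """NNGP kernel K and NTK Theta between rows of X and Z for a
    fully-connected relu net, via the arc-cosine recursions.
    depth / sw2 / sb2 are illustrative choices, not from the thread."""
    d = X.shape[1]
    K = sw2 * (X @ Z.T) / d + sb2                  # layer-0 (input) kernel
    Kxx = sw2 * np.sum(X * X, axis=1) / d + sb2    # diagonal K(x, x)
    Kzz = sw2 * np.sum(Z * Z, axis=1) / d + sb2
    Theta = K.copy()
    for _ in range(depth):
        norm = np.sqrt(np.outer(Kxx, Kzz))
        c = np.clip(K / norm, -1.0, 1.0)           # correlation at prev layer
        ang = np.arccos(c)
        # E[relu(u)relu(v)] and E[relu'(u)relu'(v)], (u,v) ~ N(0, prev kernel)
        K = sw2 * norm * (np.sqrt(1 - c**2) + (np.pi - ang) * c) / (2 * np.pi) + sb2
        K_dot = sw2 * (np.pi - ang) / (2 * np.pi)
        Theta = K + K_dot * Theta                  # NTK recursion (Jacot et al.)
        Kxx = sw2 * Kxx / 2 + sb2                  # diagonal recursion (c = 1 case)
        Kzz = sw2 * Kzz / 2 + sb2
    return K, Theta
```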
4/ Once we have these two kernels, we can derive the GP mean and covariance at any time t via straightforward linear algebra.
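As a sketch of that linear algebra: assuming continuous-time gradient flow on MSE loss, the mean and covariance at time t have the closed forms given in Lee et al. 2019 ("Wide Neural Networks of Any Depth Evolve as Linear Models Under Gradient Descent"). The block names below (K_* for NNGP, T_* for NTK; tr = train, te = test) are my own:

```python
import numpy as np
from scipy.linalg import expm

def gp_mean_cov_at_t(t, Y, K_tr, K_te_tr, K_te, T_tr, T_te_tr, lr=1.0):
    """Mean / covariance of the output GP at training time t under gradient
    flow on MSE loss (closed forms from Lee et al. 2019)."""
    n = K_tr.shape[0]
    B = np.eye(n) - expm(-lr * t * T_tr)      # I - exp(-eta * Theta * t)
    A = T_te_tr @ np.linalg.solve(T_tr, B)    # Theta(x,X) Theta^{-1} B
    mean = A @ Y                              # -> Theta(x,X) Theta^{-1} Y as t -> inf
    cross = A @ K_te_tr.T                     # K(X, x') enters the cross term
    cov = K_te + A @ K_tr @ A.T - (cross + cross.T)
    return mean, cov
```

In practice one adds a small jitter (e.g. 1e-6 * I) to T_tr before the solve to keep it well-conditioned.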
5/ So it remains to calculate the NNGP kernel and NT kernel for any given architecture. The first is described in https://t.co/cFWfNC5ALC and in this thread
https://t.co/6RO7VZDQNZ
6/ The NTK for any architecture is calculated in https://t.co/v1b6kndqCk and in this thread
https://t.co/OOoOMdPOsR

