This post is pretty bizarre, but it manages to hit on so many false beliefs that I've seen hurt junior data scientists that it deserves some explicit
1/ An ∞-wide NN of *any architecture* is a Gaussian process (GP) at init. The NN in fact evolves linearly in function space under SGD, so is a GP at *any time* during training. https://t.co/v1b6kndqCk With Tensor Programs, we can calculate this time-evolving GP w/o training any NN
2/ In this gif, narrow relu networks have high probability of initializing near the 0 function (because of relu) and getting stuck. This causes the function distribution to become multi-modal over time. However, for wide relu networks this is not an issue.
3/ This time-evolving GP depends on two kernels: the kernel describing the GP at init, and the kernel describing the linear evolution of this GP. The former is the NNGP kernel, and the latter is the Neural Tangent Kernel (NTK).
4/ Once we have these two kernels, we can derive the GP mean and covariance at any time t via straightforward linear algebra.
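To make tweet 4 concrete, here is a minimal NumPy sketch of that linear algebra, assuming gradient flow on an MSE loss (the linearized-network setting of Lee et al. 2019); the function and variable names are illustrative, not from the thread, and the kernel blocks would come from the NNGP/NTK computation described in the next tweet.

```python
# Minimal sketch (not the author's code): mean and covariance of an
# infinite-width network's test outputs after time t of gradient flow on
# MSE loss, given the NNGP kernel K and the NTK Theta on train/test inputs.
import numpy as np
from scipy.linalg import expm, solve

def gp_at_time_t(K_tt, K_tx, K_xx, Th_tx, Th_xx, y, t, eta=1.0):
    """K_*: NNGP blocks (test-test, test-train, train-train);
    Th_*: NTK blocks (test-train, train-train); y: targets (n_train, n_out)."""
    n = Th_xx.shape[0]
    decay = np.eye(n) - expm(-eta * t * Th_xx)   # I - exp(-eta * Theta * t)
    A = Th_tx @ solve(Th_xx, decay)              # Theta(x*,X) Theta^-1 (I - e^{-eta*Theta*t})
    mean = A @ y                                 # E[f_t(x*)] over random initializations
    cov = (K_tt                                  # covariance of f_0(x*) at init
           + A @ K_xx @ A.T                      # contribution of f_0(X) pushed through training
           - A @ K_tx.T - K_tx @ A.T)            # cross terms between f_0(x*) and f_0(X)
    return mean, cov
```

As t → ∞ the decay factor approaches the identity and the mean reduces to the familiar NTK kernel-regression predictor Θ(x*, X) Θ(X, X)⁻¹ y.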
5/ So it remains to calculate the NNGP kernel and NT kernel for any given architecture. The first is described in https://t.co/cFWfNC5ALC and in this thread
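The shortened links above are opaque, so as a hedge: one widely used tool that computes both kernels for a concrete architecture is the neural-tangents library (my example, not necessarily the tool the thread links to). A rough sketch of how the kernel blocks used above could be obtained:

```python
# Sketch using neural-tangents (an assumption; the thread's t.co links don't
# show which tool it refers to). stax.serial specifies the architecture once
# and returns, alongside init/apply functions, a kernel_fn that evaluates the
# infinite-width NNGP kernel and NTK of that architecture.
import numpy as np
from neural_tangents import stax

init_fn, apply_fn, kernel_fn = stax.serial(
    stax.Dense(512), stax.Relu(),
    stax.Dense(512), stax.Relu(),
    stax.Dense(1),
)

x_train = np.random.randn(20, 10)   # toy data: 20 training points, 10 features
x_test = np.random.randn(5, 10)     # 5 test points

k = kernel_fn(x_test, x_train, ('nngp', 'ntk'))
K_tx, Th_tx = k.nngp, k.ntk                     # test-train blocks, shape (5, 20)
K_xx = kernel_fn(x_train, x_train, 'nngp')      # train-train NNGP block
Th_xx = kernel_fn(x_train, x_train, 'ntk')      # train-train NTK block
K_tt = kernel_fn(x_test, x_test, 'nngp')        # test-test NNGP block
```

Feeding these blocks and the training targets into the gp_at_time_t sketch above gives the mean and covariance of the trained infinite-width network at any time t, without ever instantiating or training a finite network.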

I have always emphasized the importance of mathematics in machine learning.
Here is a compilation of resources (books, videos & papers) to get you going.
(Note: It's not an exhaustive list but I have carefully curated it based on my experience and observations)
📘 Mathematics for Machine Learning
by Marc Peter Deisenroth, A. Aldo Faisal, and Cheng Soon Ong
https://t.co/zSpp67kJSg
Note: This is probably the place you want to start. Start slowly and work through some examples. Pay close attention to the notation and get comfortable with it.
📘 Pattern Recognition and Machine Learning
by Christopher Bishop
Note: Before the book above, this is the book I used to recommend for getting familiar with the math-related concepts used in machine learning. A very solid book in my view, and it's heavily referenced in academia.
📘 The Elements of Statistical Learning
by Jerome H. Friedman, Robert Tibshirani, and Trevor Hastie
Note: Machine learning deals with data, and in turn with uncertainty, which is what statistics teaches. Get comfortable with topics like estimators, statistical significance, ...
📘 Probability Theory: The Logic of Science
by E. T. Jaynes
Note: In machine learning, we are interested in building probabilistic models and thus you will come across concepts from probability theory like conditional probability and different probability distributions.
THREAD: 12 Things Everyone Should Know About IQ
1. IQ is one of the most heritable psychological traits – that is, individual differences in IQ are strongly associated with individual differences in genes (at least in fairly typical modern environments). https://t.co/3XxzW9bxLE
2. The heritability of IQ *increases* from childhood to adulthood. Meanwhile, the effect of the shared environment largely fades away. In other words, when it comes to IQ, nature becomes more important as we get older, nurture less. https://t.co/UqtS1lpw3n
3. IQ scores have been increasing for the last century or so, a phenomenon known as the Flynn effect. https://t.co/sCZvCst3hw (N ≈ 4 million)
(Note that the Flynn effect shows that IQ isn't 100% genetic; it doesn't show that it's 100% environmental.)
4. IQ predicts many important real world outcomes.
For example, though far from perfect, IQ is the single-best predictor of job performance we have – much better than Emotional Intelligence, the Big Five, Grit, etc. https://t.co/rKUgKDAAVx https://t.co/DWbVI8QSU3
5. Higher IQ is associated with a lower risk of death from most causes, including cardiovascular disease, respiratory disease, most forms of cancer, homicide, suicide, and accident. https://t.co/PJjGNyeQRA (N = 728,160)

Funny, before the election I recall lefties muttering the caravan must have been a Trump setup because it made the open borders crowd look so bad. Why would the pro-migrant crowd engineer a crisis that played into Trump's hands? THIS is why. THESE are the "optics" they wanted.
This media manipulation effort was inspired by the success of the "kids in cages" freakout, a 100% Stalinist propaganda drive that required people to forget about Obama putting migrant children in cells. It worked, so now they want pics of Trump "gassing children on the border."
There's a heavy air of Pallywood around the whole thing as well. If the Palestinians can stage huge theatrical performances of victimhood with the willing cooperation of Western media, why shouldn't the migrant caravan organizers expect the same?
It's business as usual for Anarchy, Inc. - the worldwide shredding of national sovereignty to increase the power of transnational organizations and left-wing ideology. Many in the media are true believers. Others just cannot resist the narrative of "change" and "social justice."
The product sold by Anarchy, Inc. is victimhood. It always boils down to the same formula: once the existing order can be painted as oppressors and children as their victims, chaos wins and order loses. Look at the lefties shrieking in unison about "Trump gassing children" today.
Funny there are those who think these migrant caravans were a FANTASTIC idea that's going to take the immigration issue away from you.
Like several weeks watching a rampaging horde storm the fences & throw rocks at our border patrol agents & getting gassed = great optics!
— Brian Cates (@drawandstrike) November 26, 2018