Ever heard of Autoencoders?

The first time I saw a neural network with more output neurons than hidden-layer neurons, I couldn't figure out how it would work!

#DeepLearning #MachineLearning
Here's a little something about them: 🧵👇

Autoencoders are unsupervised neural networks whose architecture you can picture as two funnels connected at their narrow ends.

These networks are mainly used for data compression tasks in Machine Learning.
We feed them data so that they can learn its most important features: a smaller representation that still preserves the integrity of the data.

Later, anyone who needs the data can take that small representation and recreate the original, much like unzipping a zip file.📥

Being unsupervised, they require no labels.
Our inputs and target outputs are the same, and a simple Euclidean distance between them can be used as the loss function for measuring the reconstruction error.

Of course, we wouldn't expect a perfect reconstruction.
We can think of an autoencoder as having two components, an encoder and a decoder, represented by the equations below:
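
A standard way to write the encoder/decoder pair, as a sketch: the symbols φ (encoder), ψ (decoder), 𝒳 (input space) and 𝓕 (code space) are assumed here and may differ from the ones in the original figure. L is the squared Euclidean reconstruction error mentioned earlier.

```latex
\phi : \mathcal{X} \to \mathcal{F}, \qquad \psi : \mathcal{F} \to \mathcal{X}

L(x) = \lVert x - \psi(\phi(x)) \rVert^{2}

\phi^{*}, \psi^{*} = \arg\min_{\phi,\, \psi} \; L(x)
```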

We are just trying to minimize the loss L here. All the usual backpropagation rules still hold.
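
To make "minimize L with backpropagation" concrete, here is a minimal sketch in plain Python. It is a toy of my own, not the implementation linked later in the thread: the data, the 2 → 1 → 2 layer sizes, the linear (activation-free) layers, the learning rate and the step count are all assumptions chosen to keep it short.

```python
# Toy linear autoencoder: 2-D input -> 1-D code -> 2-D reconstruction.
# All numbers below (data, init, lr, steps) are illustrative assumptions.
DATA = [(t, 2.0 * t) for t in (-1.0, -0.5, 0.0, 0.5, 1.0)]


def encode(x, enc):
    """Encoder: squeeze a 2-D point into a single number (the bottleneck)."""
    return enc[0] * x[0] + enc[1] * x[1]


def decode(z, dec):
    """Decoder: expand the 1-D code back into a 2-D point."""
    return (dec[0] * z, dec[1] * z)


def mean_loss(enc, dec):
    """Mean squared Euclidean distance between inputs and reconstructions."""
    total = 0.0
    for x in DATA:
        x_hat = decode(encode(x, enc), dec)
        total += (x_hat[0] - x[0]) ** 2 + (x_hat[1] - x[1]) ** 2
    return total / len(DATA)


def train(steps=500, lr=0.05):
    """Full-batch gradient descent on the reconstruction loss."""
    enc = [0.1, 0.1]
    dec = [0.1, 0.1]
    n = len(DATA)
    for _ in range(steps):
        g_enc = [0.0, 0.0]
        g_dec = [0.0, 0.0]
        for x in DATA:
            z = encode(x, enc)
            x_hat = decode(z, dec)
            # Residual between reconstruction and input (the target is the input).
            r = (x_hat[0] - x[0], x_hat[1] - x[1])
            # Chain rule: gradient of the loss with respect to the code z.
            dz = 2.0 * (r[0] * dec[0] + r[1] * dec[1])
            for i in range(2):
                g_dec[i] += 2.0 * r[i] * z / n
                g_enc[i] += dz * x[i] / n
        for i in range(2):
            enc[i] -= lr * g_enc[i]
            dec[i] -= lr * g_dec[i]
    return enc, dec


initial_loss = mean_loss([0.1, 0.1], [0.1, 0.1])
enc, dec = train()
final_loss = mean_loss(enc, dec)
print(initial_loss, final_loss)
```

The toy data lies on a line, so a single code value is enough to reconstruct it almost exactly. Real data rarely compresses that cleanly, which is why a perfect reconstruction isn't expected in general.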
Advantages over PCA:

▫️ Can learn non-linear transformations, with non-linear activation functions and multiple layers.

▫️ Isn't limited to dense layers; can use convolutional layers too, which suit images and video better.

▫️ More efficient to learn several layers with autoencoders than one huge transformation with PCA.

▫️ Can reuse pre-trained layers from another model, applying transfer learning to enhance the encoder/decoder.
Some Common Applications:

🔸 Image Colouring
🔸 Feature Variation
🔸 Dimensionality Reduction
🔸 Image Denoising
🔸 Watermark Removal
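
For the denoising application in the list above, the usual recipe (not spelled out in this thread, so treat it as an assumption) is to feed the network a corrupted input while still scoring the reconstruction against the clean original. A minimal sketch of building such training pairs, where `make_denoising_pairs`, `noise_std` and `seed` are hypothetical names:

```python
import random


def make_denoising_pairs(clean_samples, noise_std=0.1, seed=0):
    """Return (noisy_input, clean_target) pairs for denoising training."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    pairs = []
    for sample in clean_samples:
        # Corrupt each value with Gaussian noise; the target stays clean.
        noisy = [value + rng.gauss(0.0, noise_std) for value in sample]
        pairs.append((noisy, list(sample)))
    return pairs


clean = [[0.0, 0.5, 1.0], [1.0, 0.5, 0.0]]
pairs = make_denoising_pairs(clean)
```

The network itself is unchanged; only the input side of each pair is corrupted.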
Some famous types of autoencoders:

🔹 Convolutional Autoencoders
🔹 Sparse Autoencoders
🔹 Deep Autoencoders
🔹 Contractive Autoencoders
Here's the first implementation that I did for dimensionality reduction a couple of years ago, with minimal code.
🔗https://t.co/AfAdbA6zMi

More from Machine learning

Really enjoyed digging into recent innovations in the football analytics industry.

>10 hours of interviews for this w/ a dozen or so of the top firms in the game. Really grateful to everyone who gave up time & insights, even those that didn't make the final cut 🙇‍♂️ https://t.co/9YOSrl8TdN


For avoidance of doubt, leading tracking analytics firms are now well beyond Voronoi diagrams, using more granular measures to assess control and value of space.

This @JaviOnData & @LukeBornn paper from 2018 referenced in the piece demonstrates one method
https://t.co/Hx8XTUMpJ5


Bit of this that I nerded out on the most is "ghosting" — technique used by @counterattack9 & co @stats_insights, among others.

Deep learning models predict how specific players — operating w/in specific setups — will move & execute actions. A paper here: https://t.co/9qrKvJ70EN


So many use-cases:
1/ Quickly & automatically spot situations where opponent's defence is abnormally vulnerable. Drill those to death in training.
2/ Swap target player B in for current player A, and simulate. How does target player strengthen/weaken team? In specific situations?

This is a Twitter series on #FoundationsOfML.

❓ Today, I want to start discussing the different flavors of Machine Learning we can find.

This is a very high-level overview. In later threads, we'll dive deeper into each paradigm... 👇🧵

Last time we talked about how Machine Learning works.

Basically, it's about having some source of experience E for solving a given task T, that allows us to find a program P which is (hopefully) optimal w.r.t. some metric


According to the nature of that experience, we can define different formulations, or flavors, of the learning process.

A useful distinction is whether we have an explicit goal or desired output, which gives rise to the definitions of 1️⃣ Supervised and 2️⃣ Unsupervised Learning 👇

1️⃣ Supervised Learning

In this formulation, the experience E is a collection of input/output pairs, and the task T is defined as a function that produces the right output for any given input.

👉 The underlying assumption is that there is some correlation (or, in general, a computable relation) between the structure of an input and its corresponding output and that it is possible to infer that function or mapping from a sufficiently large number of examples.
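
As a small illustration of inferring a mapping from examples, here is a sketch that fits a straight line to input/output pairs with ordinary least squares. The pairs, the choice of a linear model, and the names `fit_line`, `experience` and `program` are assumptions made for the example; the thread does not prescribe any particular method.

```python
def fit_line(pairs):
    """Least-squares fit of y = slope * x + intercept to (x, y) pairs."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = cov / var
    intercept = mean_y - slope * mean_x
    # The returned function plays the role of the learned "program" P.
    return lambda x: slope * x + intercept


# Experience E: input/output pairs generated by y = 2x + 1.
experience = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
program = fit_line(experience)
print(program(10.0))  # an input that never appeared in the experience
```

Here the relation between input and output is exactly linear, so the fitted function generalizes to unseen inputs. With noisy or non-linear data, the same idea holds only approximately.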
