Ever heard of Autoencoders?

The first time I saw a Neural Network with more output neurons than hidden-layer neurons, I couldn't figure out how it would work?!

#DeepLearning #MachineLearning
Here's a little something about them: 🧵👇

Autoencoders are unsupervised neural networks whose architecture you can picture as two funnels connected at their narrow ends.

These networks are primarily used for data compression tasks in Machine Learning.
We feed them data so that they can learn the most important features: a smaller representation that keeps the integrity of the data.

Later, whenever someone needs the data, they can just take that small representation and recreate the original, just like unzipping a file. 📥
Being unsupervised, they require no labels.
Our inputs and outputs are the same, and a simple Euclidean distance can be used as the loss function to measure the reconstruction.

Of course, we wouldn't expect a perfect reconstruction.
We can think of an autoencoder as having two components, an encoder f and a decoder g:

encoder: z = f(x)
decoder: x' = g(z)
loss: L(x, x') = ||x - x'||²

We are just trying to minimize the loss L here. All the backpropagation rules still hold.
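Here's a minimal sketch of that in Keras. The 784-dim input (think flattened MNIST) and the layer sizes are my assumptions for illustration, not a canonical recipe:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

inputs = layers.Input(shape=(784,))  # assumed: flattened 28x28 images

# Encoder f: funnel the input down to a small bottleneck z = f(x)
h = layers.Dense(128, activation="relu")(inputs)
z = layers.Dense(32, activation="relu")(h)

# Decoder g: funnel z back up to a reconstruction x' = g(z)
h = layers.Dense(128, activation="relu")(z)
x_hat = layers.Dense(784, activation="sigmoid")(h)

autoencoder = Model(inputs, x_hat)

# MSE is the squared Euclidean distance between x and x'
autoencoder.compile(optimizer="adam", loss="mse")

# Unsupervised: no labels, the input is also the target
# autoencoder.fit(x_train, x_train, epochs=10, batch_size=256)
```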
Advantages over PCA:

▫️ Can learn non-linear transformations, with non-linear activation functions and multiple layers.

▫️ Doesn't have to learn from dense layers only; it can learn from convolutional layers too, which are better for images and videos, right? (see the sketch after this list)
▫️ More efficient to learn several layers with an autoencoder than one huge transformation with PCA

▫️ Can make use of pre-trained layers from another model, applying transfer learning to enhance the encoder/decoder
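A hedged sketch of that convolutional variant in Keras; the 28x28x1 input shape and filter counts are assumptions for illustration:

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

inputs = layers.Input(shape=(28, 28, 1))  # assumed: grayscale 28x28 images

# Encoder: convolutions + downsampling instead of dense layers
x = layers.Conv2D(16, 3, activation="relu", padding="same")(inputs)
x = layers.MaxPooling2D(2)(x)                       # 14x14x16
x = layers.Conv2D(8, 3, activation="relu", padding="same")(x)
encoded = layers.MaxPooling2D(2)(x)                 # 7x7x8 bottleneck

# Decoder: upsample back to the original resolution
x = layers.Conv2D(8, 3, activation="relu", padding="same")(encoded)
x = layers.UpSampling2D(2)(x)                       # 14x14x8
x = layers.Conv2D(16, 3, activation="relu", padding="same")(x)
x = layers.UpSampling2D(2)(x)                       # 28x28x16
decoded = layers.Conv2D(1, 3, activation="sigmoid", padding="same")(x)

conv_autoencoder = Model(inputs, decoded)
conv_autoencoder.compile(optimizer="adam", loss="mse")
```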
Some Common Applications:

🔸 Image Colouring
🔸 Feature Variation
🔸 Dimensionality Reduction
🔸 Image Denoising (sketch below)
🔸 Watermark Removal
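For denoising, the trick is in the training data rather than the architecture: corrupt the inputs, keep the clean images as targets. A minimal sketch, assuming x_train holds images scaled to [0, 1] and reusing the autoencoder from the earlier sketch:

```python
import numpy as np

# Assumption: x_train holds clean images scaled to [0, 1]
noise_factor = 0.3  # illustrative value, tune for your data
x_noisy = x_train + noise_factor * np.random.normal(size=x_train.shape)
x_noisy = np.clip(x_noisy, 0.0, 1.0)  # keep pixels in a valid range

# Same model as before, but it now learns to map noisy -> clean
# autoencoder.fit(x_noisy, x_train, epochs=10, batch_size=256)
```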
Some famous types of autoencoders:

🔹 Convolutional Autoencoders
🔹 Sparse Autoencoders (sketch below)
🔹 Deep Autoencoders
🔹 Contractive Autoencoders
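One common way to sketch the sparse variant in Keras is an L1 activity penalty on the bottleneck, which pushes most of its activations toward zero. The penalty weight 1e-5 is an assumption, not a recommendation:

```python
from tensorflow.keras import layers, regularizers, Model

inputs = layers.Input(shape=(784,))
# L1 activity penalty encourages a sparse bottleneck code
z = layers.Dense(32, activation="relu",
                 activity_regularizer=regularizers.l1(1e-5))(inputs)
x_hat = layers.Dense(784, activation="sigmoid")(z)

sparse_autoencoder = Model(inputs, x_hat)
sparse_autoencoder.compile(optimizer="adam", loss="mse")
```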
Here's the first implementation I did for dimensionality reduction a couple of years ago, with minimal code.
🔗https://t.co/AfAdbA6zMi
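The link has the real code; as a rough sketch of the idea, you can slice the encoder out of the dense autoencoder from the first example (z was its 32-dim bottleneck tensor) and use the bottleneck as your reduced representation:

```python
# Rough sketch of dimensionality reduction with a trained autoencoder
# (reuses Model, autoencoder, and z from the first code sketch above)
encoder = Model(autoencoder.input, z)

# Assumption: x_test is your data; codes has shape (n_samples, 32)
codes = encoder.predict(x_test)
```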
