THREAD: How is it possible to train a well-performing, advanced Computer Vision model 𝗼𝗻 𝘁𝗵𝗲 𝗖𝗣𝗨? 🤔

At the heart of this lies the most important technique in modern deep learning - transfer learning.

Let's analyze how it works.

2/ For starters, let's look at what a neural network (NN for short) does.

An NN is like a stack of pancakes, with computation flowing up when we make predictions.

How does it all work?

3/ We show an image to our model.

An image is a collection of pixels. Each pixel is just a bunch of numbers describing its color.

Here is what it might look like for a black and white image
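As a rough illustration, here is a tiny made-up 5x5 black and white image written out as plain numbers. The pixel values (0 for black, 255 for white) and the plus-sign pattern are assumptions chosen for this sketch, not data from the thread.

```python
# A hypothetical 5x5 grayscale image: 0 = black, 255 = white.
image = [
    [0,   0,   255, 0,   0],
    [0,   0,   255, 0,   0],
    [255, 255, 255, 255, 255],
    [0,   0,   255, 0,   0],
    [0,   0,   255, 0,   0],
]

height = len(image)
width = len(image[0])

# Print the image as text so the "plus" shape is visible.
for row in image:
    print("".join("#" if pixel > 127 else "." for pixel in row))

# A color image would store three numbers (red, green, blue) per pixel
# instead of one.
```

Real photos work the same way, just with far more pixels.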
4/ The picture goes into the layer at the bottom.

Each layer performs computation on the image, transforming it and passing it upwards.

5/ By the time the image reaches the uppermost layer, it has been transformed to the point that it now consists of two numbers only.

The outputs of a layer are called activations, and the outputs of the last layer have a special meaning... they are the predictions!
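The stack of pancakes can be sketched in a few lines of plain Python. This is a toy under stated assumptions: the layer sizes (4, 8, 8, 2), the random weights, and the helper names `make_layer` and `apply_layer` are all made up for illustration, and a real vision model would use convolutional layers with trained weights.

```python
import random

random.seed(0)  # make the made-up weights reproducible

def make_layer(n_inputs, n_outputs):
    """Create a layer as a list of weight rows, one row per output neuron."""
    return [[random.uniform(-1, 1) for _ in range(n_inputs)]
            for _ in range(n_outputs)]

def apply_layer(weights, inputs, use_relu=True):
    """Compute each neuron's weighted sum of the inputs, then optionally apply ReLU."""
    outputs = [sum(w * x for w, x in zip(row, inputs)) for row in weights]
    return [max(0.0, o) for o in outputs] if use_relu else outputs

# A flattened 4-pixel "image" goes in at the bottom of the stack.
pixels = [0.0, 1.0, 1.0, 0.0]

# Three pancakes: 4 -> 8 -> 8 -> 2 numbers.
layers = [make_layer(4, 8), make_layer(8, 8), make_layer(8, 2)]

activations = pixels
for i, weights in enumerate(layers):
    is_last = i == len(layers) - 1
    # The last layer skips ReLU so its raw outputs can serve as predictions.
    activations = apply_layer(weights, activations, use_relu=not is_last)
    print(f"layer {i}: {len(activations)} activations")

predictions = activations  # the outputs of the last layer
print(predictions)
```

With random weights the two final numbers are meaningless. Training is what makes them useful.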
6/ For an NN distinguishing between cats and dogs, when presented with an image of a cat we want the 𝚌𝚊𝚝 neuron to light up!

We would like for it to have a high value, and for other activations in the last layer to be small...
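One common way to read the last layer is to squash its activations into probabilities with a softmax and pick the largest. The thread does not say which function the model uses, so softmax is an assumption here, and the activation values are made up.

```python
import math

def softmax(activations):
    """Turn raw activations into probabilities that sum to 1."""
    largest = max(activations)  # subtract the max for numerical stability
    exps = [math.exp(a - largest) for a in activations]
    total = sum(exps)
    return [e / total for e in exps]

classes = ["cat", "dog"]
last_layer = [3.2, 0.4]  # made-up activations for a picture of a cat

probabilities = softmax(last_layer)
predicted = classes[probabilities.index(max(probabilities))]
print(predicted, probabilities)
```

The cat neuron has the higher activation, so it ends up with most of the probability.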

So far so good! But what about transfer learning?

7/ Consider the lower levels of our stack of pancakes! This is where the bulk of the computation happens.

We know that these layers evolve during training to become feature detectors.

What do we mean by that?

8/ One layer may have tiny sliding windows that are good at detecting lines.

A layer above might have windows that construct shapes from these lines.

We might have a window light up when it sees a square, another when it sees a colorful blob.
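To make the sliding window idea concrete, here is a minimal sketch of a single 3x3 window that responds to vertical lines. The window values are hand-written for this example and the function name `slide_window` is hypothetical. In a trained network the values are learned rather than chosen by hand.

```python
def slide_window(image, kernel):
    """Slide the kernel over the image and record how strongly each patch matches."""
    k = len(kernel)
    out_h = len(image) - k + 1
    out_w = len(image[0]) - k + 1
    result = []
    for top in range(out_h):
        row = []
        for left in range(out_w):
            total = 0
            for i in range(k):
                for j in range(k):
                    total += image[top + i][left + j] * kernel[i][j]
            row.append(total)
        result.append(row)
    return result

# Made-up 5x5 image with a bright vertical line in the middle column.
image = [[0, 0, 1, 0, 0] for _ in range(5)]

# Hand-written 3x3 window that responds to vertical lines.
vertical_line = [
    [-1, 2, -1],
    [-1, 2, -1],
    [-1, 2, -1],
]

response = slide_window(image, vertical_line)
for row in response:
    print(row)  # each row prints [-3, 6, -3]
```

The window "lights up" (value 6) exactly where it sits on top of the line and stays low elsewhere.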
9/ As we move up the stack, the features that windows can detect become more complex, building on the work of the layers below.

Maybe one sliding window will combine lines and detect text... maybe another one will learn to detect faces.

Does all of this sound like a hard task?

10/ Absolutely! A network needs to see a lot of pictures to learn all of that.

But, presumably, once we detect all these lower-level features, we can combine them in a plethora of interesting ways? 🤔

11/ We can take all the lines, and the blobs, and the faces, or whatever the lower layers of the network can see, and combine them to predict cats and dogs!

Or trains, planes, and ships. Or blood cell boundaries. Or aneurysms in X-rays. The possibilities are endless!

12/ This is precisely what transfer learning is!

We let researchers and large corporations spend millions of dollars to train very complex models.

And then we get to build on top of their work! 😇

But so much for the theory. How does it all work in practice?

13/ In our example, we took a pretrained model that was trained on a subset of ImageNet consisting of 1.2 million images across 1000 classes!

The @fastdotai framework downloaded the model for us and removed the top of it (the part responsible for predicting 1 of 1000 classes).

14/ It created a new head for our model, one tailored to the classes in the new dataset.

During training, we kept nearly the entire model frozen, and only trained the uppermost part, making use of all the lower level features that were being detected.
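The freeze-the-body, train-the-head idea can be sketched without any framework. Everything below is a toy under stated assumptions: `BODY_WEIGHTS` stands in for a pretrained model, the four-pixel dataset is made up, and the single sigmoid neuron is just one possible head. It is not how fastai implements this internally.

```python
import math

# Stand-in for a pretrained body: these weights are made up for the sketch
# and are never updated below (they are "frozen").
BODY_WEIGHTS = [
    [1.0, 1.0, -1.0, -1.0],   # lights up when the left half is bright
    [-1.0, -1.0, 1.0, 1.0],   # lights up when the right half is bright
]

def body(pixels):
    """Frozen feature detectors: weighted sum followed by ReLU."""
    return [max(0.0, sum(w * x for w, x in zip(row, pixels)))
            for row in BODY_WEIGHTS]

def head(features, weights, bias):
    """New trainable head: a single sigmoid neuron for a two-class task."""
    z = sum(w * f for w, f in zip(weights, features)) + bias
    return 1.0 / (1.0 + math.exp(-z))

# Tiny made-up dataset: label 0 = left half bright, label 1 = right half bright.
data = [
    ([1.0, 0.9, 0.0, 0.1], 0),
    ([0.8, 1.0, 0.1, 0.0], 0),
    ([0.0, 0.1, 1.0, 0.9], 1),
    ([0.1, 0.0, 0.8, 1.0], 1),
]

frozen_copy = [row[:] for row in BODY_WEIGHTS]  # to confirm the body never changes
head_weights, head_bias = [0.0, 0.0], 0.0
learning_rate = 0.5

for epoch in range(200):
    for pixels, label in data:
        features = body(pixels)  # the body is only used, never changed
        error = head(features, head_weights, head_bias) - label
        # Gradient step on the head only (log-loss gradient for a sigmoid).
        head_weights = [w - learning_rate * error * f
                        for w, f in zip(head_weights, features)]
        head_bias -= learning_rate * error

predictions = [round(head(body(p), head_weights, head_bias)) for p, _ in data]
print(predictions)
```

Only two weights and one bias are trained here, which hints at why this kind of training can be cheap enough for a CPU: the expensive part of the model is reused as-is.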

Ingenious! 😁

15/ The concept of transfer learning, of utilizing a model trained on one task to perform another one, applies to other scenarios as well, including NLP (models that act on text).

We will hopefully get a chance to explore all of them 🙂

16/ I plan to explain all the concepts in modern AI in a similar fashion, assuming people find this useful 🙂

If you enjoyed this thread, let me know please and help me reach others who might also be interested 😊🙏

And the visualizations of what the layers can detect?

17/ They come from this seminal paper - Visualizing and Understanding Convolutional Networks

Next stop - deciphering how it all works in code and finding ways to further improve our model!

Stay tuned for more 🙂
