Important paper from Google on large batch optimization. They do impressively careful experiments measuring # iterations needed to achieve target validation error at various batch sizes. The main "surprise" is the lack of surprises. [thread]

https://t.co/7QIx5CFdfJ

The paper is a good example of many elements of sound experimental design. They validate their metric by showing that lots of variants give consistent results. They tune hyperparameters separately for each condition, check that the optimum isn't at the endpoints, and measure sensitivity.
They have separate experiments where they hold the # of iterations and the # of epochs fixed, which (as they explain) measure very different things. They avoid confounds, such as batch norm's artificial dependence between batch size and regularization strength.
When the experiments are done carefully enough, the results are remarkably consistent between different datasets and architectures. Qualitatively, MNIST behaves just like ImageNet.
Importantly, they don't find any evidence for a "sharp/flat optima" effect whereby better optimization leads to worse final results. They have a good discussion of experimental artifacts/confounds in past papers where such effects were reported.
The time to reach target validation error is explained purely by optimization considerations. There's a regime where gradient variance dominates, and you get linear speedups w/ batch size. Then there's a regime where curvature dominates and larger batches don't help. As theory would predict.
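To make those two regimes concrete, here's a toy sketch (my own illustration, not the paper's model; the constants C_NOISE and C_CURV are made up). Steps-to-target behaves roughly like the max of a noise-limited term and a curvature-limited floor:

```python
# Toy model of the two regimes (illustrative constants, not from the paper).
C_NOISE = 1_000_000  # gradient-noise-dominated: steps ~ C_NOISE / batch_size
C_CURV = 200         # curvature-dominated floor: bigger batches stop helping

for batch_size in [32, 128, 512, 2048, 8192, 32768]:
    # Linear speedup until the curvature floor kicks in, then a plateau.
    steps = max(C_NOISE / batch_size, C_CURV)
    print(f"batch={batch_size:>6}  steps-to-target≈{steps:>8.0f}")
```

Where the crossover between the two regimes lands depends on the model and dataset; the constants above are arbitrary.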
Incidentally, this paper must have been absurdly expensive, even by Google's standards. Doing careful empirical work on optimizers requires many, many runs of the algorithm. (I think surprising phenomena on ImageNet are often due to the difficulty of running proper experiments.)

This is a Twitter series on #FoundationsOfML.

❓ Today, I want to start discussing the different flavors of Machine Learning we can find.

This is a very high-level overview. In later threads, we'll dive deeper into each paradigm... 👇🧵

Last time we talked about how Machine Learning works.

Basically, it's about having some source of experience E for solving a given task T, which allows us to find a program P that is (hopefully) optimal w.r.t. some metric.
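As a rough sketch of that framing (illustrative code; names like `learn` are hypothetical, not a real library): experience in, program out, with the metric deciding what "optimal" means.

```python
from typing import Callable, Iterable, Tuple

Example = Tuple[float, float]       # experience E: observations of task T
Program = Callable[[float], float]  # program P: maps task inputs to outputs

def learn(experience: Iterable[Example]) -> Program:
    """Return a program P that is (hopefully) optimal w.r.t. some metric.

    Here the 'metric' is squared error and P is a line through the origin.
    """
    pairs = list(experience)
    w = sum(x * y for x, y in pairs) / sum(x * x for x, _ in pairs)
    return lambda x: w * x

program = learn([(1.0, 2.1), (2.0, 3.9), (3.0, 6.2)])
print(program(4.0))  # roughly 8: the learned program handles unseen inputs
```

Everything that follows is a variation on what E looks like and how the metric is defined.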


According to the nature of that experience, we can define different formulations, or flavors, of the learning process.

A useful distinction is whether we have an explicit goal or desired output, which gives rise to the definitions of 1️⃣ Supervised and 2️⃣ Unsupervised Learning 👇

1️⃣ Supervised Learning

In this formulation, the experience E is a collection of input/output pairs, and the task T is defined as a function that produces the right output for any given input.

👉 The underlying assumption is that there is some correlation (or, in general, a computable relation) between the structure of an input and its corresponding output, and that it is possible to infer that function or mapping from a sufficiently large number of examples.
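A minimal sketch of that idea (hypothetical data, with 1-nearest-neighbour standing in for whatever model you'd actually use): the experience is input/output pairs, and the inferred mapping answers for inputs it never saw.

```python
# Experience E: labeled input/output pairs (hours studied -> passed exam?).
pairs = [(1.0, 0), (2.0, 0), (3.0, 0), (6.0, 1), (7.0, 1), (8.0, 1)]

def predict(x: float) -> int:
    """1-nearest-neighbour: return the output of the closest known input."""
    _, label = min(pairs, key=lambda p: abs(p[0] - x))
    return label

print(predict(5.5))  # -> 1, inferred from nearby examples, never seen directly
```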
With hard work and determination, anyone can learn to code.

Here’s a list of my favorite resources if you’re learning to code in 2021.

👇

1. freeCodeCamp.

I’d suggest picking one of the projects in the curriculum to tackle and then completing the lessons on syntax when you get stuck. This way you know *why* you’re learning what you’re learning, and you're building things.

2. https://t.co/7XC50GlIaa is a hidden gem. Things I love about it:

1) You can see the most upvoted solutions so you can read really good code

2) You can ask questions in the discussion section if you're stuck, and people often answer, for free.

3. https://t.co/V9gcXqqLN6 and https://t.co/KbEYGL21iE

On Stack Overflow you can find answers to almost every problem you encounter. On GitHub you can read so much great code. You can build so much just from using these two resources and a blank text editor.

4. https://t.co/xX2J00fSrT @eggheadio specifically for frontend dev.

Their tutorials are designed to make the most of your time, so you never feel overwhelmed by a 14-hour course. Also, the amount of prep they put into making great courses is unlike anything I've seen from other online course platforms.
