Important paper from Google on large batch optimization. They do impressively careful experiments measuring # iterations needed to achieve target validation error at various batch sizes. The main "surprise" is the lack of surprises. [thread]

https://t.co/7QIx5CFdfJ

The paper is a good example of many elements of good experimental design. They validate their metric by showing that many variants of it give consistent results. They tune hyperparameters separately for each condition, check that the optimum isn't at the endpoints of the search range, and measure sensitivity.
They run separate experiments where they hold fixed the # iterations and the # epochs, which (as they explain) measure very different things. They avoid confounds, such as batch norm's artificial dependence between batch size and regularization strength.
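To see why the two budgets measure different things, here is a small arithmetic sketch. The dataset size, batch sizes, and budgets are made-up numbers for illustration, not values from the paper, and both helper functions are hypothetical.

```python
import math

def steps_for_epoch_budget(num_examples, batch_size, epochs):
    """Parameter updates performed when the budget is a fixed number of epochs."""
    return epochs * math.ceil(num_examples / batch_size)

def epochs_for_step_budget(num_examples, batch_size, steps):
    """Passes over the data consumed when the budget is a fixed number of updates."""
    return steps * batch_size / num_examples

NUM_EXAMPLES = 50_000  # hypothetical dataset size
EPOCH_BUDGET = 10      # hypothetical epoch budget
STEP_BUDGET = 1_000    # hypothetical step budget

for batch_size in (64, 512, 4096):
    updates = steps_for_epoch_budget(NUM_EXAMPLES, batch_size, EPOCH_BUDGET)
    passes = epochs_for_step_budget(NUM_EXAMPLES, batch_size, STEP_BUDGET)
    print(f"batch={batch_size:5d}  updates in {EPOCH_BUDGET} epochs: {updates:5d}  "
          f"epochs in {STEP_BUDGET} steps: {passes:6.2f}")
```

Under an epoch budget, bigger batches get far fewer updates; under a step budget, bigger batches get to see far more data. So the same batch size can look good under one budget and bad under the other.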
When the experiments are done carefully enough, the results are remarkably consistent between different datasets and architectures. Qualitatively, MNIST behaves just like ImageNet.
Importantly, they don't find any evidence for a "sharp/flat optima" effect whereby better optimization leads to worse final results. They have a good discussion of experimental artifacts/confounds in past papers where such effects were reported.
The time-to-target-validation is explained purely by optimization considerations. There's a regime where variance dominates, and you get linear speedups w/ batch size. Then there's a regime where curvature dominates and larger batches don't help. As theory would predict.
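The two regimes show up even in a toy model. The sketch below is my own illustration, not the paper's setup: it tracks the expected loss of SGD on a two-dimensional noisy quadratic, tunes the learning rate separately for each batch size over a grid, and counts steps to a target loss. The curvatures, noise variance, target, and grid are all hypothetical choices.

```python
import math

CURVATURES = (1.0, 0.1)  # hypothetical Hessian eigenvalues (condition number 10)
NOISE_VAR = 0.1          # hypothetical per-example gradient noise variance
TARGET = 1e-3            # target expected loss (the initial loss is 0.55)
LEARNING_RATES = [1.99 * 0.8 ** k for k in range(30)]  # log-spaced grid
BATCH_SIZES = [2 ** k for k in range(13)]              # 1, 2, ..., 4096

def steps_to_target(batch_size, lr, max_steps=100_000):
    """Steps until the expected loss falls below TARGET, or inf if it never does.

    Each coordinate follows x <- x - lr * (h * x + noise), with noise variance
    NOISE_VAR / batch_size, so E[x^2] <- (1 - lr*h)^2 * E[x^2] + lr^2 * var / B.
    """
    if any(lr * h >= 2 for h in CURVATURES):
        return math.inf  # step size is unstable for the sharpest direction
    # Steady-state expected loss: the noise floor for this (lr, batch size).
    floor = sum(0.5 * lr * NOISE_VAR / (batch_size * (2 - lr * h))
                for h in CURVATURES)
    if floor >= TARGET:
        return math.inf  # noise floor sits above the target
    second_moments = [1.0 for _ in CURVATURES]  # start at x = 1 in every direction
    for step in range(1, max_steps + 1):
        second_moments = [(1 - lr * h) ** 2 * m + lr ** 2 * NOISE_VAR / batch_size
                          for h, m in zip(CURVATURES, second_moments)]
        loss = 0.5 * sum(h * m for h, m in zip(CURVATURES, second_moments))
        if loss < TARGET:
            return step
    return math.inf

def tuned_steps(batch_size):
    """Best steps over the grid, the winning lr, and whether it sits on a grid edge."""
    best_steps, best_lr = min((steps_to_target(batch_size, lr), lr)
                              for lr in LEARNING_RATES)
    at_edge = best_lr in (LEARNING_RATES[0], LEARNING_RATES[-1])
    return best_steps, best_lr, at_edge

results = {b: tuned_steps(b) for b in BATCH_SIZES}
for b, (steps, lr, at_edge) in results.items():
    print(f"batch={b:5d}  steps={steps:6}  best_lr={lr:.4f}  lr_at_grid_edge={at_edge}")
```

At small batch sizes the noise floor caps the usable learning rate, so doubling the batch roughly doubles the best learning rate and halves the step count. At large batch sizes the cap comes from the sharpest curvature direction instead, and the step count stops improving.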
Incidentally, this paper must have been absurdly expensive, even by Google's standards. Doing careful empirical work on optimizers requires many, many runs of the algorithm. (I think surprising phenomena on ImageNet are often due to the difficulty of running proper experiments.)
