NIPS 2016

The Conference and Workshop on Neural Information Processing Systems (NIPS) is probably the biggest conference with machine learning / deep learning as a main topic. This year, about 6000 people attended it. My friend Marvin and I were supported by the Begabtenstiftung Informatik Karlsruhe.

The complete program can be found in the Conference Book, but I would like to point out some of my highlights.

Hot Topics

To get an idea of what NIPS 2016 was about, I generated a word cloud from the titles of the accepted papers:

Wordcloud of NIPS titles of 2016
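
A word cloud is essentially a word-frequency count over the titles. Here is a minimal sketch of that counting step in Python. The sample titles are just a few papers mentioned in this post (not the full list of accepted papers), and the stop word list and the function name `title_word_counts` are assumptions made for illustration:

```python
import re
from collections import Counter

# Illustrative stop words; a real word cloud would use a longer list.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def title_word_counts(titles):
    """Count how often each non-stop word occurs across all titles."""
    counts = Counter()
    for title in titles:
        # Lowercase the title and keep alphabetic words (hyphens allowed).
        words = re.findall(r"[a-z][a-z\-]*", title.lower())
        counts.update(w for w in words if w not in STOP_WORDS)
    return counts


# A few titles mentioned in this post, used as sample input.
sample_titles = [
    "Deep Learning without Poor Local Minima",
    "Learning the Number of Neurons in Deep Networks",
    "Matrix Completion has No Spurious Local Minimum",
    "Swapout: Learning an ensemble of deep architectures",
]

counts = title_word_counts(sample_titles)
print(counts.most_common(3))
```

The resulting counts can then be handed to any word cloud renderer, which scales each word by its frequency.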

The organizing team made something similar, but they had access to the information on how the papers were tagged:

NIPS 2016 areas submitted

(Image source: www.tml.cs.uni-tuebingen.de)

The top 10 topics were:

01: Deep Learning or Neural Networks
42: (Application) Computer Vision
02: Large Scale Learning and Big Data
05: Learning Theory
53: (Other) Optimization
08: Sparsity and Feature Selection
51: (Other) Classification
03: Convex Optimization
54: (Other) Probabilistic Models and Methods
56: (Other) Unsupervised Learning Methods

I would say the top five hot topics (first is hottest) are:

GANs
Reinforcement Learning: Look for "bandit" in the paper titles
unsupervised learning
alternative ways to train DNNs
reducing the need for data (transfer learning, domain adaptation, semi-supervised learning)

GANs

Generative Adversarial Networks (GANs) were one hot topic at NIPS. The idea is to train two networks: a generator \(G\) and a discriminator \(D\). The generator creates content (e.g. images) and the discriminator has to decide whether the content comes from the natural distribution (the training set) or was made by the generator.
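
For reference, the original GAN formulation (Goodfellow et al., 2014) writes this as a two-player minimax game. The symbols \(p_{\text{data}}\) (data distribution), \(p_z\) (distribution of the generator's input noise \(z\)) and \(V\) (value function) are notation added here, not used elsewhere in this post:

\[
\min_G \max_D V(D, G) = \mathbb{E}_{x \sim p_{\text{data}}}\left[\log D(x)\right] + \mathbb{E}_{z \sim p_z}\left[\log\left(1 - D(G(z))\right)\right]
\]

The discriminator \(D\) tries to assign high probability to real samples and low probability to generated ones, while the generator \(G\) tries to make \(D(G(z))\) as large as possible.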

An introduction can be found at blog.evjang.com

Noteworthy papers and ideas are:

Learning What and Where to Draw
InfoGAN: Get more control over properties of the generated content.
How to Train a GAN? Tips and tricks to make GANs work
Generative Visual Manipulation on the Natural Image Manifold: Allow interactive generation of images (see also: Reddit)

Applications of GANs are (according to Eric Jang):

reinforcement learning: paper
domain adaptation
security ML
compression

Bayesian Deep Learning

Combining deep learning with graphical models like CRFs / Markov Random Fields has been done for semantic segmentation for a while now. It seems like the combination of those two is called "Bayesian deep learning". Papers with the keyword "variational" in the title seem to belong in this category.

I don't really know this area, so I leave it to Eric Jang to point out important papers.

Nuts and Bolts of ML

Andrew Ng gave a talk in which he summarized what he thinks are some of the most important topics when training machine learning systems. Most of it is probably also in his book Machine Learning Yearning or in his Coursera course.

Here are some of the things he talked about:

When you design a speech recognition system, you can measure three types of error: human error, training set error and test set error. The difference between the human error and the training error is "avoidable error" (bias), the difference between training and test error is "variance".
Human level performance is ambiguous: In a medical application, is an amateur, a doctor, an experienced doctor or a team of (experienced) doctors the "human level performance"?
Role of an "AI Product Manager"
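
The error decomposition from the first point above is simple arithmetic. Here is a minimal sketch in Python. The function name `diagnose`, the example numbers, and the rule of thumb of working on the larger gap first are assumptions for illustration, not taken from the talk:

```python
def diagnose(human_error, train_error, test_error):
    """Split the errors into avoidable bias and variance (values as fractions)."""
    # Gap between human-level and training performance.
    avoidable_bias = train_error - human_error
    # Gap between training and test performance.
    variance = test_error - train_error
    # Assumed rule of thumb: work on the larger gap first.
    focus = "bias" if avoidable_bias > variance else "variance"
    return avoidable_bias, variance, focus


# Hypothetical numbers: 1% human error, 5% training error, 6% test error.
bias, variance, focus = diagnose(0.01, 0.05, 0.06)
print(f"avoidable bias: {bias:.2f}, variance: {variance:.2f}, focus on: {focus}")
```

With these made-up numbers, the gap to human performance is larger than the gap between training and test error, so the sketch suggests looking at bias first.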

More Papers

Clustering:

Fast and Provably Good Seedings for k-Means

Optimization:

MetaGrad: Multiple Learning Rates in Online Learning
- https://bitbucket.org/wmkoolen/metagrad
Optimal Learning for Multi-pass Stochastic Gradient Methods
Layer Normalization: A successor to Batch Normalization?

Theory:

Deep Learning without Poor Local Minima: Local minima are global minima in "typical" networks
Matrix Completion has No Spurious Local Minimum
Understanding the Effective Receptive Field in Deep Convolutional Neural Networks

Topology learning:

Learning the Number of Neurons in Deep Networks
Graph generation models from another talk. Most add nodes / edges over time, starting from an initial seed network:
- 1960, Erdős & Rényi: Random graphs
- 1998, Watts & Strogatz: Small-world graph
- 1999, Barabási & Albert: Preferential attachment
- 1999, Kleinberg et al.: Copying model
- 2003, Vázquez et al.: Duplication-divergence
- 2007, Leskovec et al.: Forest fire
- 2008, Clauset et al.: Hierarchical random graphs
- 2010, Leskovec et al.: Kronecker graphs
Network compression
- Dynamic Network Surgery for Efficient DNNs
- PerforatedCNNs: Acceleration through Elimination of Redundant Convolutions
Swapout: Learning an ensemble of deep architectures
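
To illustrate the "grow from a seed network" idea behind most of the graph models listed above, here is a minimal sketch of preferential attachment (the 1999 Barabási & Albert entry) in pure Python. The function name, its parameters, and the choice of a fully connected seed network are assumptions for this sketch, not details from the talk:

```python
import random


def preferential_attachment(n_nodes, m_edges, seed=0):
    """Grow a graph: each new node attaches to m_edges distinct existing
    nodes, chosen with probability proportional to their current degree."""
    rng = random.Random(seed)
    # Seed network (assumed): fully connected graph on m_edges + 1 nodes.
    edges = [(i, j) for i in range(m_edges + 1) for j in range(i)]
    # Every node appears here once per incident edge, so sampling uniformly
    # from this list is sampling proportionally to degree.
    endpoints = [node for edge in edges for node in edge]
    for new_node in range(m_edges + 1, n_nodes):
        targets = set()
        while len(targets) < m_edges:
            targets.add(rng.choice(endpoints))
        for target in targets:
            edges.append((new_node, target))
            endpoints.extend((new_node, target))
    return edges


edges = preferential_attachment(n_nodes=100, m_edges=2)
print(len(edges))
```

Nodes that already have many edges are more likely to receive new ones, which is what produces the few highly connected "hubs" typical for this model.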

Analysis of ML models:

Content creation:

Visual Dynamics: Probabilistic Future Frame Synthesis via Cross Convolutional Networks

Labeling:

Content-based Image Retrieval (CBIR):

Misc:

Learnable Visual Markers: Visual markers are something like barcodes
Neurally-Guided Procedural Models: Amortized Inference for Procedural Graphics Programs using Neural Networks
Universal Correspondence Network: Find semantically meaningful similar points in two images. For example, given two frontal images of different humans, the network finds eyes, nose, chin and lips in both images.
Scene Recognition Demo
TorontoCity: Seeing the World with a Million Eyes: A new benchmark dataset by Raquel Urtasun
What makes ImageNet good for Transfer Learning? by Jacob Huh (UC Berkeley), Pulkit Agrawal (UC Berkeley), and Alexei Efros (UC Berkeley)

Lessons learned for conferences

Bring a camera: The information comes very fast. Too fast to take notes, but you can take a photo of the slides. In fact, quite a lot of people do so.
Take a photo of the first slide, so that you know what the talk was about when you look at your photos later.
If you give a talk / poster...
- ... let the first slide be there long enough, so that people can take a photo of it.
- ... or have a URL / the title of the paper on every single slide
- ... let every slide be visible long enough, so that people can take photos
- ... don't use QR-codes only, but also (shortened) URLs
- ... make it available online as PDF
- ... answer key questions: (1) Which problem did you tackle? (2) How did you test your results? (3) What is your "solution" similar to?
As a session organizer...
- ... make sure there is a schedule at the door (outside)
- ... make sure the schedule is online
- ... make sure the schedule is changed everywhere, if it is actually changed

Miscellaneous

Advances in Neural Information Processing Systems 29 (NIPS 2016) pre-proceedings
Jobs
- unify.id
- bluevisionlabs.com
NIPS Papers Kaggle Dataset
blog.aylien.com: Highlights of NIPS 2016: Adversarial Learning, Meta-learning and more
salon-des-refuses.org is meant to collect papers which were rejected, but are still of high quality. However, the website seems to be down.
Martin Zinkevich: Rules of Machine Learning: Best Practices for ML Engineering
Implementations for NIPS 2016 papers
YouTube:
- GAN Workshop playlist
- Yann LeCun: Energy-based GANs & other adversarial training
