
Machine Learning Glossary

Contents

  • See also

The following is a list of short explanations of different terms in machine learning. The aim is to keep things simple and brief, not to explain the terms in full detail.

Active Learning
The learning algorithm selects an unlabeled pattern and asks an oracle (e.g. a human annotator) for its label.
Backpropagation
A clever implementation of gradient descent for neural networks.
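As a minimal, hedged illustration (not taken from any particular source), the sketch below runs one forward and one backward pass by hand through a tiny one-hidden-layer network; the sizes and the squared-error loss are arbitrary choices.

```python
import numpy as np

# Tiny network: x -> hidden (tanh) -> scalar output, squared-error loss.
rng = np.random.default_rng(0)
x = rng.normal(size=3)          # input
t = 1.0                         # target
W1 = rng.normal(size=(4, 3))    # hidden-layer weights
W2 = rng.normal(size=4)         # output weights
lr = 0.1

# Forward pass
h_pre = W1 @ x                  # pre-activation of the hidden layer
h = np.tanh(h_pre)              # hidden activation
y = W2 @ h                      # network output
loss = 0.5 * (y - t) ** 2

# Backward pass: apply the chain rule layer by layer
dy = y - t                      # dL/dy
dW2 = dy * h                    # dL/dW2
dh = dy * W2                    # dL/dh
dh_pre = dh * (1 - h ** 2)      # tanh'(z) = 1 - tanh(z)^2
dW1 = np.outer(dh_pre, x)       # dL/dW1

# One gradient-descent update
W1 -= lr * dW1
W2 -= lr * dW2
```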
Bias
Bias is a concept which describes a systematic error. A classifier with a high bias tends to give one answer more often, no matter what the input is. This concept is related to variance and well described with the images here.
BLSTM, BiLSTM
Bidirectional long short-term memory (see paper and poster).
Co-Training
A form of semi-supervised learning. Two independent classifiers are trained on different views (feature sets) of the labeled data. The classifiers are then applied to the unlabeled data, and predictions with high confidence are added to the other classifier's training data.
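A minimal sketch of one co-training variant, assuming scikit-learn is available; the synthetic data, the two "views", and the 0.95 confidence threshold are made up for illustration.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(550, 4))
y = (X[:, 0] + X[:, 2] > 0).astype(int)
view_a, view_b = X[:, :2], X[:, 2:]   # two feature views, each informative on its own

L_a = list(range(50))                 # labeled indices seen by classifier A
L_b = list(range(50))                 # labeled indices seen by classifier B
U = list(range(50, 550))              # unlabeled pool
pseudo = y.copy()                     # labels; only the first 50 are treated as known

for _ in range(5):
    clf_a = LogisticRegression().fit(view_a[L_a], pseudo[L_a])
    clf_b = LogisticRegression().fit(view_b[L_b], pseudo[L_b])
    still_unlabeled = []
    for i in U:
        p_a = clf_a.predict_proba(view_a[[i]])[0]
        p_b = clf_b.predict_proba(view_b[[i]])[0]
        if p_a.max() > 0.95:          # A is confident -> give the point to B
            pseudo[i] = p_a.argmax()
            L_b.append(i)
        elif p_b.max() > 0.95:        # B is confident -> give the point to A
            pseudo[i] = p_b.argmax()
            L_a.append(i)
        else:
            still_unlabeled.append(i)
    U = still_unlabeled
```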
Collaborative Filtering
You have users and items which are rated. No user rated everything. You want to fill the gaps (see article).
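One common way to fill the gaps is low-rank matrix factorization. The sketch below is a minimal, hand-rolled version (not necessarily what the linked article describes); the rating matrix, the number of latent factors and the learning rate are arbitrary.

```python
import numpy as np

# User x item rating matrix R; 0 marks a missing rating.
# We fit R ~= U @ V.T on the observed entries only.
R = np.array([[5, 3, 0, 1],
              [4, 0, 0, 1],
              [1, 1, 0, 5],
              [0, 1, 5, 4]], dtype=float)
observed = R > 0

rng = np.random.default_rng(0)
k = 2                                   # number of latent factors
U = rng.normal(scale=0.1, size=(R.shape[0], k))
V = rng.normal(scale=0.1, size=(R.shape[1], k))

lr, reg = 0.01, 0.02
for _ in range(5000):
    E = (R - U @ V.T) * observed        # error on observed entries only
    U += lr * (E @ V - reg * U)         # gradient step on the squared error
    V += lr * (E.T @ U - reg * V)

print(np.round(U @ V.T, 1))             # predicted ratings, gaps filled in
```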
Computer Vision
The academic discipline which deals with how to gain high-level understanding from digital images or videos. Common tasks include image classification, semantic segmentation, detection and localization.
Curriculum learning
A method for pre-training. First optimize a smoothed objective and gradually reduce the smoothing, so a curriculum is a sequence of training criteria. One might, for example, show gradually more difficult training examples. See Curriculum Learning by Bengio, Louradour, Collobert and Weston for details.
Curse of dimensionality
Various problems of high-dimensional spaces that do not occur in low-dimensional spaces. "High-dimensional" often means several hundred dimensions.
DCGAN (Deep Convolutional Generative Adversarial Networks)
TODO
DCIGN (Deep Convolutional Inverse Graphics Network)
TODO
DCNN (Doubly Convolutional Neural Network)
Introduced in this paper (summary). Note: some people also call deep convolutional neural networks DCNNs.
DNN
Deep Neural Network. The meaning of "deep" differs. Sometimes it means at least one hidden layer, sometimes it means at least 12 hidden layers.
Domain adaptation
A model is trained on dataset $A$. How does it have to be changed to work on dataset $B$?
Detection in Computer Vision (Object detection)
Object detection in an image is a computer vision task. The input is an image and the output is a list of rectangles which contain objects of the given type. Face detection is one well-studied example: a photo could contain no face or hundreds of them, and the rectangles can overlap.
Deep Learning
Buzzword. The meaning depends on who you ask / in which year you asked. Sometimes it means multi-layer perceptrons with more than $N$ layers (some say $N = 2$ is already deep learning, others want $N > 20$ or nowadays $N > 100$).
Discriminative Model
The model gives a conditional probability of the classes $k$, given the feature vector $x$: $P(k | x)$. This kind of model is often used for prediction.
FC7-Features
Features of an image which was run through a trained neural network. In AlexNet, FC7 is the last fully connected layer before the classification layer. However, FC7 features are not necessarily created by AlexNet.
FMLLR
Feature-Space Maximum Likelihood Linear Regression
Feature Map
A feature map is the result of a single filter of a convolutional layer being applied. So it is the activation of that filter over the given input.
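A minimal sketch, assuming SciPy is available: one 3×3 filter is slid over a small image, and the resulting grid of responses is the feature map. The filter values are made up.

```python
import numpy as np
from scipy.signal import correlate2d

# A 5x5 "image" and a single 3x3 filter (a vertical edge detector here).
image = np.arange(25, dtype=float).reshape(5, 5)
filt = np.array([[1, 0, -1],
                 [1, 0, -1],
                 [1, 0, -1]], dtype=float)

# The feature map is the filter's response at every valid position
# (cross-correlation, which is what convolutional layers actually compute).
feature_map = correlate2d(image, filt, mode="valid")
print(feature_map.shape)   # (3, 3)
```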
Fine-tuning
See pre-training
GMM
Gaussian Mixture Model
GEMM (GEneral Matrix to Matrix Multiplication)
General Matrix to Matrix Multiplication is the problem of calculating the result of $C = A \cdot B$ with $A \in \mathbb{R}^{n \times m}, B \in \mathbb{R}^{m \times k}, C \in \mathbb{R}^{n \times k}$.
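A trivial NumPy sketch of the shape constraint (BLAS implementations compute the slightly more general $C \leftarrow \alpha A B + \beta C$):

```python
import numpy as np

# GEMM: C = A @ B with (n x m) @ (m x k) -> (n x k).
n, m, k = 2, 3, 4
A = np.arange(n * m).reshape(n, m)
B = np.arange(m * k).reshape(m, k)
C = A @ B
print(C.shape)   # (2, 4)
```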
Generative model
The model gives the relationship of variables: $P(x, y)$. This kind of model can be used for prediction, too.
Gradient Descent
An iterative optimization algorithm for differentiable functions.
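A minimal sketch on a one-dimensional quadratic; the objective, starting point and step size are chosen only for illustration.

```python
# Gradient descent on f(x) = (x - 3)^2, whose gradient is 2 * (x - 3).
x = 0.0          # starting point (arbitrary)
lr = 0.1         # learning rate / step size
for _ in range(100):
    grad = 2 * (x - 3)
    x -= lr * grad
print(x)         # close to the minimizer x = 3
```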
HMM
Hidden Markov Model
i-vector
Speaker identity vector. See Front-End Factor Analysis for Speaker Verification.
MANN
Memory-Augmented Neural Networks (see Blog post)
Machine Vision
Computer vision applied for industrial applications.
Matrix Completion
See collaborative filtering.
MLLR
Maximum Likelihood Linear Regression
MMD (Maximum Mean Discrepancy)
MMD is a measure of the difference between a distribution $p$ and a distribution $q$: $$\operatorname{MMD}(F, p, q) = \sup_{f \in F} \left( \mathbb{E}_{x \sim p} [f(x)] - \mathbb{E}_{y \sim q} [f(y)] \right)$$
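A hedged sketch of the common special case where $F$ is the unit ball of an RBF-kernel RKHS, using the biased empirical estimator; the bandwidth and sample sizes are arbitrary.

```python
import numpy as np

def mmd2_rbf(X, Y, sigma=1.0):
    """Biased empirical estimate of MMD^2 with an RBF kernel."""
    def k(A, B):
        d2 = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
        return np.exp(-d2 / (2 * sigma ** 2))
    return k(X, X).mean() + k(Y, Y).mean() - 2 * k(X, Y).mean()

rng = np.random.default_rng(0)
same = mmd2_rbf(rng.normal(size=(200, 2)), rng.normal(size=(200, 2)))
diff = mmd2_rbf(rng.normal(size=(200, 2)), rng.normal(2.0, 1.0, size=(200, 2)))
print(same, diff)   # the second value should be clearly larger
```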
Multi-Task learning
Train a model which does multiple tasks at the same time, e.g. segmentation and detection (see MultiNet).
NEAT
Neuroevolution of Augmenting Topologies (see Blogpost).
Object recognition
Classification on images. The task is to decide in which class a given image falls, judging by the content. This can be cat, dog, plane or similar.
One-Shot learning
Learn only with one or very few examples per class. See One-Shot Learning of Object Categories.
Optical Flow
Optical flow is defined for a pair of images. It describes how each point of the first image moves to its position in the second image.
PCA
Principal component analysis (short: PCA) is a linear transformation which projects $n$ points with $s$ features each, collected in a matrix $X \in \mathbb{R}^{n \times s}$, onto a lower-dimensional subspace in such a way that the projection error is minimal. Hence it is an unsupervised method for dimensionality reduction. It works by finding a projection matrix $P \in \mathbb{R}^{s \times m}$, where $m \leq s$ can be chosen as small as desired.
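A minimal NumPy sketch via the eigendecomposition of the covariance matrix; the data and the number of kept components are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(100, 5))            # n = 100 points with s = 5 features
X_centered = X - X.mean(axis=0)

cov = np.cov(X_centered, rowvar=False)   # s x s covariance matrix
eigvals, eigvecs = np.linalg.eigh(cov)   # eigh returns ascending eigenvalues
order = np.argsort(eigvals)[::-1]        # sort components by explained variance

m = 2                                    # number of components to keep
P = eigvecs[:, order[:m]]                # projection matrix, shape (s, m)
X_reduced = X_centered @ P               # projected data, shape (n, m)
```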
Pre-training
  1. You have a machine learning model $m$.
  2. Pre-training: You have a dataset $A$ on which you train $m$.
  3. You have a dataset $B$. Before you start training the model, you initialize some of the parameters of $m$ with the model which is trained on $A$.
  4. Fine-tuning: You train $m$ on $B$ (a minimal sketch of this workflow follows below).
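A schematic sketch of the pre-train/fine-tune workflow with a stand-in logistic-regression "model"; the datasets, learning rates and epoch counts are placeholders, and real setups usually re-initialize some layers instead of reusing all parameters.

```python
import numpy as np

def train(W, X, y, lr=0.1, epochs=200):
    """Logistic-regression training loop, standing in for 'train model m'."""
    for _ in range(epochs):
        p = 1 / (1 + np.exp(-X @ W))       # predictions
        W -= lr * X.T @ (p - y) / len(y)   # gradient step on the log loss
    return W

rng = np.random.default_rng(0)
X_a = rng.normal(size=(500, 10))                       # large dataset A
y_a = (X_a[:, 0] > 0).astype(float)
X_b = rng.normal(size=(50, 10))                        # small, related dataset B
y_b = (X_b[:, 0] + X_b[:, 1] > 0).astype(float)

W = np.zeros(10)
W = train(W, X_a, y_a)            # pre-training on A
W = train(W, X_b, y_b, lr=0.01)   # fine-tuning on B, starting from A's weights
```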
Regularization
Regularization refers to techniques that make the fitted function smoother. This helps to prevent overfitting.
Examples: L1, L2, Dropout, Weight Decay in Neural Networks. Parameter $C$ in SVMs.
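A minimal sketch of an L2 penalty ("weight decay") added to a linear-regression loss; the penalty weight lam is an arbitrary illustration value and controls how strongly large weights are punished.

```python
import numpy as np

def l2_regularized_loss(w, X, y, lam=0.1):
    """Squared-error loss plus an L2 penalty that pulls the weights towards zero."""
    residual = X @ w - y
    return 0.5 * residual @ residual + 0.5 * lam * w @ w

def l2_regularized_gradient(w, X, y, lam=0.1):
    """The penalty simply adds lam * w to the plain gradient."""
    return X.T @ (X @ w - y) + lam * w
```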
Reinforcement Learning
Reinforcement learning is a sub-field of machine learning which focuses on the question of how to find actions that lead to higher rewards. See German lecture notes.
Self-Learning
One form of semi-supervised learning, where you train an initial system on the labeled data, then label the unlabeled data where the classifier is 'sure enough'. After that, you train a new system on all data and re-label the unlabeled data. This is iterated.
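A minimal sketch of one self-training variant (confident pseudo-labels are kept), assuming scikit-learn is available; the synthetic data and the 0.95 confidence threshold are made up.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(600, 2))
y_true = (X[:, 0] > 0).astype(int)
labeled = np.zeros(len(X), dtype=bool)
labeled[:30] = True                       # only the first 30 labels are known
y = np.where(labeled, y_true, -1)         # -1 marks "unlabeled"

for _ in range(5):
    clf = LogisticRegression().fit(X[labeled], y[labeled])
    if labeled.all():
        break                             # nothing left to pseudo-label
    proba = clf.predict_proba(X[~labeled])
    sure = np.max(proba, axis=1) > 0.95   # the 'sure enough' threshold
    idx = np.flatnonzero(~labeled)[sure]
    y[idx] = clf.predict(X[idx])          # pseudo-label the confident points
    labeled[idx] = True
```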
Semi-supervised learning
Some training data has labels, but most has no labels.
Supervised learning
All training data has labels.
Spatial Pyramid Pooling (SPP)
SPP is the idea of dividing the image into a grid with a fixed number of cells and a variable size, depending on the input. Each cell computes one feature and hence leads to a fixed-size representation of a variable-sized input.
See paper and summary
TF-IDF
TF-IDF (short for Term frequency–inverse document frequency) is a measure that reflects how important a word is to a document in a collection or corpus.
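A minimal sketch of the plain tf·idf weighting (many smoothed variants exist); the toy documents are made up.

```python
import math

docs = [["the", "cat", "sat", "on", "the", "mat"],
        ["the", "dog", "sat", "on", "the", "log"],
        ["the", "cats", "and", "dogs", "are", "pets"]]

def tf_idf(term, doc, docs):
    tf = doc.count(term) / len(doc)              # term frequency within the document
    df = sum(1 for d in docs if term in d)       # number of documents containing the term
    idf = math.log(len(docs) / df)               # inverse document frequency
    return tf * idf

print(tf_idf("cat", docs[0], docs))   # > 0: 'cat' only appears in the first document
print(tf_idf("the", docs[0], docs))   # 0.0: 'the' appears in every document
```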
Transductive learning
Label the given unlabeled data (the aim here is NOT to find a general hypothesis).
Unsupervised learning
No training data has labels.
VC-Dimension
A theoretical natural number assigned to any classifier. The higher the VC dimension of a classifier, the more situations it is able to capture (see longer explanation, German explanation).
VLAD
Vector of Locally Aggregated Descriptors
VTLN
Vocal tract length normalization
WRN
Wide residual network
Zero-Shot learning
Learning to predict classes of which no example has been seen during training. For example, Flickr gets several new tags each day, and one wants to predict tags for new images. One idea is to use WordNet and ImageNet to generate a common embedding. This way, new words of WordNet already have an embedding, and thus new image categories can automatically be classified the right way. See Zero-Shot Learning with Semantic Output Codes as well as this YouTube video.

See also

  • Lectures:
    • Analysetechniken großer Datenbestände
    • Informationsfusion
    • Machine Learning 1
    • Machine Learning 2
    • Mustererkennung
    • Neuronale Netze
    • Lokalisierung Mobiler Agenten
    • Probabilistische Planung
  • Wikipedia
  • scholarpedia
  • Other
    • alumni.media.mit.edu
    • robotics.stanford.edu
    • ee.columbia.edu
    • The Machine Learning Dictionary
    • 37steps.com
    • asimovinstitute.org: The Neural Network Zoo
