
ZCA Whitening

Contents

  • How to do it
  • See also

Whitening transforms data so that the covariance matrix of the transformed data is the identity matrix. Hence whitening decorrelates the features and gives each of them unit variance. It is used as a preprocessing method.

When you have $N$ data points in $\mathbb{R}^n$, then the covariance matrix $\Sigma \in \mathbb{R}^{n \times n}$ is estimated to be:

$$\hat{\Sigma}_{jk} = \frac{1}{N-1} \sum_{i=1}^N (x_{ij} - \bar{x}_j) \cdot (x_{ik} - \bar{x}_k)$$

where $\bar{x}_j$ denotes the $j$-th component of the estimated mean of the samples $x$.
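To make the estimator concrete, here is a minimal NumPy sketch. The random toy data and the variable names are my own assumptions; the only point is that the formula above agrees with what np.cov computes by default.

```python
import numpy as np

rng = np.random.default_rng(0)
N, n = 200, 3                       # N data points with n features (arbitrary sizes)
x = rng.normal(size=(N, n))         # row i is the data point x_i

x_bar = x.mean(axis=0)              # estimated mean, one entry per feature
centered = x - x_bar
# Entry (j, k) is the sum over i of (x_ij - x_bar_j) * (x_ik - x_bar_k), divided by N - 1
sigma_hat = centered.T @ centered / (N - 1)

# np.cov treats rows as variables by default, hence rowvar=False
sigma_np = np.cov(x, rowvar=False)
print(np.allclose(sigma_hat, sigma_np))
```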

Any matrix $W \in \mathbb{R}^{n \times n}$ which satisfies the condition

$$W^T W = \Sigma^{-1}$$

whitens the data. ZCA whitening is the choice $W = \Sigma^{- \frac{1}{2}}$. PCA whitening is another choice. According to "Neural Networks: Tricks of the Trade", PCA and ZCA whitening differ only by a rotation.
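The following sketch checks these claims numerically on toy data. The mixing matrix, the sizes, and the names W_pca, W_zca and R are assumptions for illustration. It builds both whitening matrices from the eigendecomposition of the covariance matrix and verifies that they satisfy the condition above and differ only by an orthogonal matrix (a rotation, possibly combined with a reflection).

```python
import numpy as np

rng = np.random.default_rng(1)
# Hypothetical toy data: independent Gaussians mixed by a fixed matrix
mixing = np.eye(4) + 0.5 * np.triu(np.ones((4, 4)), k=1)
x = rng.normal(size=(1000, 4)) @ mixing
x = x - x.mean(axis=0)
sigma = np.cov(x, rowvar=False)

# Eigendecomposition of the symmetric matrix: sigma = U diag(lam) U^T
lam, U = np.linalg.eigh(sigma)

W_pca = np.diag(lam ** -0.5) @ U.T   # project onto the eigenvectors, then rescale
W_zca = U @ W_pca                    # sigma^(-1/2): additionally rotate back

sigma_inv = np.linalg.inv(sigma)
for name, W in (("PCA", W_pca), ("ZCA", W_zca)):
    satisfies_condition = np.allclose(W.T @ W, sigma_inv)
    # Rows of x are samples, so applying W to every sample is x @ W.T
    is_white = np.allclose(np.cov(x @ W.T, rowvar=False), np.eye(4))
    print(name, satisfies_condition, is_white)

# The matrix that maps the PCA solution to the ZCA solution is orthogonal
R = W_zca @ np.linalg.inv(W_pca)
print(np.allclose(R @ R.T, np.eye(4)))
```

One property that is specific to the ZCA choice: W_zca is symmetric, so it does not matter whether you multiply by W_zca or by its transpose.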

How to do it

When you look at the Keras code, you can see the following:

# Calculate principal components
sigma = np.dot(flat_x.T, flat_x) / flat_x.shape[0]
u, s, _ = linalg.svd(sigma)
principal_components = np.dot(np.dot(u, np.diag(1.0 / np.sqrt(s + 10e-7))), u.T)

# Apply ZCA whitening
whitex = np.dot(flat_x, principal_components)

So, at first you compute the covariance matrix $\Sigma$. Note that this expression is only the covariance if flat_x has already been mean-centered. I'm not quite sure, but I think they should divide by flat_x.shape[0] - 1 to get the unbiased estimator.
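How much does the choice of divisor matter? A small sketch on assumed toy data (the array flat_x here is random, not image data) shows that the two estimates differ only by the constant factor $N / (N - 1)$, so the resulting whitening matrices differ only by a scalar that is close to 1 for large $N$.

```python
import numpy as np

rng = np.random.default_rng(4)
flat_x = rng.normal(size=(50, 3))        # toy stand-in for the flattened images
flat_x = flat_x - flat_x.mean(axis=0)    # center first
N = flat_x.shape[0]

sigma_biased = flat_x.T @ flat_x / N          # what the snippet above computes
sigma_unbiased = flat_x.T @ flat_x / (N - 1)  # unbiased estimator

# Every entry differs by the same factor N / (N - 1)
ratio = sigma_unbiased / sigma_biased
print(np.allclose(ratio, N / (N - 1)))

# np.cov exposes the same choice through its bias flag
print(np.allclose(sigma_biased, np.cov(flat_x, rowvar=False, bias=True)))
```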

Then you apply singular value decomposition to the estimated covariance matrix. The matrix $u \in \mathbb{R}^{n \times n}$ is orthogonal (the real-valued case of unitary) and $s \in \mathbb{R}^{n}$ is a vector of non-negative real numbers; np.diag turns it into a diagonal matrix. Those numbers are the singular values of $\Sigma$. Because $\Sigma$ is symmetric and positive semi-definite, they are also its eigenvalues.
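These properties are easy to check. The sketch below uses np.linalg.svd on assumed random data (the Keras snippet calls linalg.svd, which returns values in the same order); the variable names after the SVD call are mine.

```python
import numpy as np

rng = np.random.default_rng(2)
flat_x = rng.normal(size=(500, 5))       # toy data, 5 features
flat_x = flat_x - flat_x.mean(axis=0)
sigma = flat_x.T @ flat_x / flat_x.shape[0]

u, s, _ = np.linalg.svd(sigma)

# u is a square orthogonal matrix, s is a 1-D array of singular values
u_is_orthogonal = np.allclose(u.T @ u, np.eye(5))

# For a symmetric positive definite matrix, u alone reconstructs sigma
reconstructs = np.allclose(u @ np.diag(s) @ u.T, sigma)

# Singular values (descending) equal the eigenvalues (eigvalsh returns ascending)
eigvals = np.linalg.eigvalsh(sigma)[::-1]
same_as_eigvals = np.allclose(s, eigvals)

print(u.shape, s.shape, u_is_orthogonal, reconstructs, same_as_eigvals)
```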

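Next, the principal components are calculated:

$$u \cdot \operatorname{diag}\left(\frac{1}{\sqrt{s + 10^{-6}}}\right) \cdot u^T$$

Here the square root and the division are applied element-wise to the singular values. Adding 10e-7 prevents division by zero when a singular value is zero. Note that 10e-7 means $10 \cdot 10^{-7} = 10^{-6}$, not $10^{-7}$.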
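A tiny sketch of what the constant does. The singular values in s are made up for illustration, with one of them exactly zero.

```python
import numpy as np

# Python reads 10e-7 as 10 * 10**-7, which is the same float as 1e-6
print(10e-7 == 1e-6)

s = np.array([4.0, 1.0, 0.0])    # hypothetical singular values, the last one is zero
eps = 10e-7

with np.errstate(divide="ignore"):
    no_eps = 1.0 / np.sqrt(s)        # last entry becomes inf
with_eps = 1.0 / np.sqrt(s + eps)    # last entry is finite, about 1 / sqrt(1e-6)

print(no_eps)
print(with_eps)
```

The constant also caps how strongly directions with almost no variance get amplified, which is why a larger value is sometimes used as a regularizer.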

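Whitening is then simply the multiplication of the data with this matrix. The variable name principal_components is a bit misleading: the matrix is the ZCA whitening matrix $\Sigma^{-\frac{1}{2}}$ (up to the small constant), not the principal components themselves.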
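Putting the steps together, a self-contained version could look like the sketch below. The function name zca_whiten, its eps parameter and the toy data are my own choices, not part of Keras. Unlike the snippet above, it centers the data itself.

```python
import numpy as np


def zca_whiten(x, eps=1e-6):
    """Return the ZCA-whitened data (rows are samples) and the whitening matrix."""
    x = np.asarray(x, dtype=float)
    x_centered = x - x.mean(axis=0)
    # Covariance estimate, dividing by N as in the Keras snippet
    sigma = x_centered.T @ x_centered / x_centered.shape[0]
    u, s, _ = np.linalg.svd(sigma)
    # ZCA whitening matrix: u * diag(1 / sqrt(s + eps)) * u^T
    w = u @ np.diag(1.0 / np.sqrt(s + eps)) @ u.T
    return x_centered @ w, w


rng = np.random.default_rng(3)
# Hypothetical correlated data: independent Gaussians mixed by a fixed matrix
mixing = np.array([[2.0, 0.0, 0.0],
                   [1.0, 1.0, 0.0],
                   [0.5, 0.3, 1.0]])
data = rng.normal(size=(2000, 3)) @ mixing

white, w = zca_whiten(data)
cov_white = white.T @ white / white.shape[0]
print(np.round(cov_white, 3))   # should be close to the identity matrix
```

The covariance of the result is only approximately the identity, because eps slightly shrinks every direction.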

See also

  • Alex Krizhevsky and Geoffrey Hinton: Learning multiple layers of features from tiny images
  • Optimal whitening and decorrelation

Published

Mar 29, 2017
by Martin Thoma

