
Softmax

Contents

  • Python implementation
  • Short analysis

Softmax is an activation function for multi-layer perceptrons (MLPs). It maps a vector $\mathbf{x} \in \mathbb{R}^K$ to a vector in $[0, 1]^K$ whose elements sum to 1:

$$\varphi(\mathbf{x})_j = \frac{e^{x_j}}{\sum_{k=1}^K e^{x_k}} \quad\text{ for } j=1, \dots, K$$
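For example, for $\mathbf{x} = (0.1, 0.2)$ (the first doctest below), the two components work out to

$$\varphi(\mathbf{x})_1 = \frac{e^{0.1}}{e^{0.1} + e^{0.2}} \approx \frac{1.10517}{2.32657} \approx 0.47502, \qquad \varphi(\mathbf{x})_2 = \frac{e^{0.2}}{e^{0.1} + e^{0.2}} \approx 0.52498$$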

Python implementation

The implementation is straightforward:

#! /usr/bin/env python

import numpy


def softmax(w):
    """Calculate the softmax of a list of numbers w.

    Parameters
    ----------
    w : list of numbers

    Returns
    -------
    a numpy array of the same length as w, with non-negative entries that sum to 1

    Examples
    --------
    >>> softmax([0.1, 0.2])
    array([ 0.47502081,  0.52497919])
    >>> softmax([-0.1, 0.2])
    array([ 0.42555748,  0.57444252])
    >>> softmax([0.9, -10])
    array([  9.99981542e-01,   1.84578933e-05])
    >>> softmax([0, 10])
    array([  4.53978687e-05,   9.99954602e-01])
    """
    # Exponentiate each element, then normalize so the outputs sum to 1.
    e = numpy.exp(numpy.array(w))
    dist = e / numpy.sum(e)
    return dist


if __name__ == "__main__":
    import doctest

    doctest.testmod()
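One caveat worth noting: numpy.exp overflows for large inputs, so the naive version above produces nan for something like softmax([1000, 0]). Because softmax is invariant under shifting all inputs by the same constant (the shift cancels in numerator and denominator), a common remedy is to subtract the maximum first. A minimal sketch of that variant (softmax_stable is my name for it, not part of the original post):

import numpy


def softmax_stable(w):
    """Numerically stable softmax.

    Shifting all inputs by their maximum leaves the result unchanged,
    but the largest exponent becomes 0, so numpy.exp cannot overflow.
    """
    w = numpy.array(w)
    e = numpy.exp(w - numpy.max(w))
    return e / numpy.sum(e)

With this variant, softmax_stable([1000, 0]) returns array([ 1., 0.]) instead of nan.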

Short analysis

One obvious property of the softmax function is that its output values sum to one, due to the normalization in the denominator.
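A quick sanity check, reusing the softmax defined above (numpy.isclose instead of exact equality, since floating-point rounding can leave the sum a hair off 1.0):

import numpy

for w in [[0.1, 0.2], [-0.1, 0.2], [0.9, -10], [0, 10]]:
    assert numpy.isclose(softmax(w).sum(), 1.0)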

By printing the following, you can see that elements below 1 move closer together while elements above 1 move farther apart, measured as ratios relative to the smallest element.

def percentage(before):
    """Print each element relative to the smallest (last) element,
    before and after applying softmax."""
    before = numpy.array(before)
    after = softmax(before)
    print("Before: %s" % str(before / before[-1]))
    print("After:  %s" % str(after / after[-1]))
    print("-" * 60)


experiments = []
experiments.append([1, 1.1, 1.1])
experiments.append([1, 2, 3])
experiments.append([0.6, 0.7, 0.8])

for a in experiments:
    a = sorted(a, reverse=True)
    percentage(a)

gives

Before: [ 1.1  1.1  1. ]
After:  [ 1.10517092  1.10517092  1.        ]
------------------------------------------------------------
Before: [3 2 1]
After:  [ 7.3890561   2.71828183  1.        ]
------------------------------------------------------------
Before: [ 1.33333333  1.16666667  1.        ]
After:  [ 1.22140276  1.10517092  1.        ]
------------------------------------------------------------
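The "After" ratios have a simple closed form: the common denominator cancels when dividing two softmax outputs, so

$$\frac{\varphi(\mathbf{x})_i}{\varphi(\mathbf{x})_j} = \frac{e^{x_i}}{e^{x_j}} = e^{x_i - x_j}$$

This matches the output above: for [1, 2, 3] the differences to the smallest element are 2 and 1, giving $e^2 \approx 7.389$ and $e^1 \approx 2.718$; for [0.6, 0.7, 0.8] the differences 0.2 and 0.1 give $e^{0.2} \approx 1.221$ and $e^{0.1} \approx 1.105$.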

Published

Feb 9, 2016
by Martin Thoma

Category

Machine Learning

Tags

  • Activation Functions
  • Machine Learning
  • Neural Networks
