What is the SoftMax function in Neural Networks? How can we use the SoftMax function in an ANN? Where can we use SoftMax in AI technologies? Let's answer these questions.
What is the Softmax function?
The SoftMax function is a generalization of the logistic function to multiple dimensions. It is also known as softargmax or the normalized exponential function. It is used in multinomial logistic regression and is often applied as the last stage of a neural network to normalize the network's output; thus, it turns a vector of predicted outputs into a probability distribution. SoftMax is not used as an element-wise activation function in the usual sense; rather, it is applied as a final step after all outputs from the activation functions are available, normalizing that vector (or array). To put it another way, SoftMax tells us the relative importance of each value in the given output vector or array.
SoftMax applies the standard exponential function to each element of the activation outputs, and the SoftMax output for each value lies between 0 and 1. It normalizes these values by dividing each by the sum of all the exponentials; this normalization ensures that the components of the output vector sum to 1.
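For instance, for the input vector (1, 2, 3), the exponentials are roughly e^1 ≈ 2.718, e^2 ≈ 7.389 and e^3 ≈ 20.086; their sum is about 30.193, so the SoftMax output is approximately (0.090, 0.245, 0.665), which sums to 1.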
What does the Softmax function do?
In neural networks, the SoftMax function is often used in the final layer of a neural-network-based classifier. Such networks are generally trained under a log loss (cross-entropy) objective, giving a non-linear variant of multinomial logistic regression.
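As a small illustration of that training objective, the sketch below computes the cross-entropy loss of a SoftMax output against a one-hot label. The function name crossEntropy and the epsilon guard are illustrative assumptions for this article, not part of any particular framework.

#include <cassert>
#include <cmath>
#include <cstddef>

// Minimal sketch: cross-entropy loss for a single sample.
// `probs` holds SoftMax outputs; `target` is the index of the true class.
static double crossEntropy(const double *probs, size_t n, size_t target)
{
    assert(probs && target < n);
    // With a one-hot label, only the true-class term of the sum survives;
    // the small epsilon guards against log(0).
    const double eps = 1e-12;
    return -std::log(probs[target] + eps);
}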
For a vector (or array) x with n members, the Softmax of each member can be written as below,

σ(x)_i = exp(x_i) / Σ_{j=1}^{n} exp(x_j), for i = 1, …, n
This function may overflow, because the exponential of a large input exceeds the range of floating-point numbers and becomes infinite. To avoid this, we can shift the x values by subtracting the maximum value m before exponentiating.
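Because the common factor exp(−m) cancels between numerator and denominator, this shift is mathematically exact:

σ(x)_i = exp(x_i − m) / Σ_{j=1}^{n} exp(x_j − m), where m = max_j x_j

After the shift, the largest exponent is exp(0) = 1, so the sum can no longer overflow.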
How can I write a Softmax function in C++?
This SoftMax function can be written in C++ as below,
// Requires <cmath> (for std::exp, INFINITY) and <algorithm> (for std::max)
static void softmax(double *input, double *output, size_t n)
{
    // Find the maximum input value for numerical stability
    double m = -INFINITY;
    for (size_t i = 0; i < n; i++) {
        m = std::max(m, input[i]);
    }

    // Sum the shifted exponentials
    double sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += std::exp(input[i] - m);
    }

    // Divide each shifted exponential by the sum so the outputs add up to 1
    for (size_t i = 0; i < n; i++) {
        output[i] = std::exp(input[i] - m) / sum;
    }
}
We can also use an offset to calculate softmax as below,
// Requires <cmath> (for std::exp, std::log, INFINITY) and <cassert>
static void softmax2(double *input, double *output, size_t input_len)
{
    assert(input);

    // Find the maximum input value for numerical stability
    double m = -INFINITY;
    for (size_t i = 0; i < input_len; i++) {
        if (input[i] > m) {
            m = input[i];
        }
    }

    // Sum the shifted exponentials
    double sum = 0.0;
    for (size_t i = 0; i < input_len; i++) {
        sum += std::exp(input[i] - m);
    }

    // Fold the maximum and the log of the sum into a single offset
    double offset = m + std::log(sum);
    for (size_t i = 0; i < input_len; i++) {
        output[i] = std::exp(input[i] - offset);
    }
}
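The quantity m + log(sum) is the log-sum-exp of the inputs, so softmax2() computes each output as exp(x_i − logsumexp(x)) and avoids the final division. A side benefit is that log-probabilities become cheap: log(output[i]) is simply input[i] − offset, which is exactly the quantity a cross-entropy loss needs.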
Is there a simple C++ SoftMax example?
In the example below, both the softmax() and softmax2() functions are used.
#include <iostream>
#include <cassert>
#include <algorithm>
#include <cmath>
#include <cstdio>

static void softmax(double *input, double *output, size_t n)
{
    // Find the maximum input value for numerical stability
    double m = -INFINITY;
    for (size_t i = 0; i < n; i++) {
        m = std::max(m, input[i]);
    }

    // Sum the shifted exponentials
    double sum = 0;
    for (size_t i = 0; i < n; i++) {
        sum += std::exp(input[i] - m);
    }

    // Divide each shifted exponential by the sum so the outputs add up to 1
    for (size_t i = 0; i < n; i++) {
        output[i] = std::exp(input[i] - m) / sum;
    }
}

static void softmax2(double *input, double *output, size_t input_len)
{
    assert(input);

    // Find the maximum input value for numerical stability
    double m = -INFINITY;
    for (size_t i = 0; i < input_len; i++) {
        if (input[i] > m) {
            m = input[i];
        }
    }

    // Sum the shifted exponentials
    double sum = 0.0;
    for (size_t i = 0; i < input_len; i++) {
        sum += std::exp(input[i] - m);
    }

    // Fold the maximum and the log of the sum into a single offset
    double offset = m + std::log(sum);
    for (size_t i = 0; i < input_len; i++) {
        output[i] = std::exp(input[i] - offset);
    }
}

#define N 7

int main()
{
    double inp[] = {1.0, 2.0, 400.0, 4000.0, 1.0, 2.0, 3.0};
    double out[N];

    std::cout << "Inputs:\n";
    for (int i = 0; i < N; i++) {
        std::cout << inp[i] << ',';
    }
    std::cout << '\n';

    softmax(inp, out, N);
    double tot = 0;
    std::cout << "Softmax Output:\n";
    for (int i = 0; i < N; i++) {
        std::cout << out[i] << ',';
        tot += out[i];
    }
    std::cout << '\n';
    std::cout << "total of softmax output:" << tot << '\n';

    softmax2(inp, out, N);
    tot = 0;
    std::cout << "Softmax2 Output:\n";
    for (int i = 0; i < N; i++) {
        std::cout << out[i] << ',';
        tot += out[i];
    }
    std::cout << '\n';
    std::cout << "total of softmax2 output:" << tot << '\n';

    std::getchar();
    return 0;
}
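With these inputs, exp(4000 − 4000) = 1 dominates while every other shifted exponential underflows to roughly zero, so both versions should print an output vector that is almost exactly (0, 0, 0, 1, 0, 0, 0) with a total of 1. Note that the naive formulation, without subtracting the maximum, would have overflowed on exp(4000).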