What is an SELU Activation Function? How can we use the Scaled Exponential Linear Unit in an artificial neural network (ANN)? How can I use SELU activation functions in my own C++ app?
Convolutional Neural Networks (CNNs) created a revolution in visual analysis, and Recurrent Neural Networks (RNNs) were similarly revolutionary in Natural Language Processing; both are among the leading AI technologies that we use in Deep Learning. Success stories of Deep Learning with standard Feed-Forward Neural Networks (FFNs), by contrast, are rare. Many different activation functions are used in these methods, so let’s refresh our memory about activation functions and explain these terms.
What is an Activation Function in AI?
An Activation Function ( phi() ), also called a transfer function or threshold function, determines the activation value ( a = phi(sum) ) from a given value (sum) produced by the Net Input Function. The Net Input Function, here, is the sum of the input signals multiplied by their weights, and the activation function maps this sum to a new value by applying a given function or condition.
In other words, the activation function is a way to transform the sum of all weighted signals into a new activation value for that neuron. There are different activation functions; mostly Linear (Identity), bipolar, and Logistic (Sigmoid) functions are used. The activation function and its types are explained well here.
In C++ (and in most programming languages in general) you can create your own activation function. Note that sum is the result of the Net Input Function, which calculates the sum of all weighted signals. We will use sum as the result of the input function. Here, the activation value of an artificial neuron (its output value) can be written with the activation function as a = phi(sum).
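As a minimal sketch of this idea (the function and variable names below are only illustrative, and the identity function is used as a simple stand-in for phi()), the weighted sum and the activation value could be computed like this:

#include <cstdio>

// Net Input Function: the sum of the input signals multiplied by their weights
double net_input(const double inputs[], const double weights[], int n)
{
   double sum = 0;
   for (int i = 0; i < n; i++) sum += inputs[i] * weights[i];
   return sum;
}

// Activation Function phi(): identity (linear) activation used here as a simple placeholder
double phi(double sum)
{
   return sum;                                       // a = phi(sum)
}

int main()
{
   double inputs[2]  = { 0.0, 1.0 };                 // input signals
   double weights[2] = { 0.6, 0.4 };                 // weights of the links
   double a = phi(net_input(inputs, weights, 2));    // activation value of the neuron
   printf("%10.6f\n", a);                            // prints 0.400000
   return 0;
}

Replacing the body of phi() with another formula (Sigmoid, ELU, SELU, and so on) changes the behavior of the neuron without touching the rest of the code.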
Using this Net Input Function value (sum) and the phi() activation function, let’s now see how we can use the SELU activation function in C++.
What is a Scaled Exponential Linear Unit (SELU)?
The Scaled Exponential Linear Unit (SELU) is another activation function; it is a scaled version of ELU that uses a λ parameter. SELU was introduced in the 2017 paper “Self-Normalizing Neural Networks” by Günter Klambauer, Thomas Unterthiner, Andreas Mayr, and Sepp Hochreiter. They introduced self-normalizing neural networks (SNNs) to enable high-level abstract representations. Neuron activations of SNNs automatically converge towards zero mean and unit variance, whereas batch normalization requires explicit normalization.
SELU is a scaled version of the ELU activation function, obtained by multiplying it by the λ parameter, so we can simply say SELU(x) = λ * ELU(x).
The SELU Activation Function can be written as follows,

SELU(x) = λ * x for x > 0
SELU(x) = λ * α * (exp(x) − 1) for x ≤ 0
They solved for α and λ and obtained the solutions α01 ≈ 1.6733 and λ01 ≈ 1.0507, where the subscript 01 indicates that these are the parameters for the fixed point (0, 1). Following this explanation, each node may have its own α and λ parameters, so we can define alpha and lambda members in the neuron structure and calculate SELU as below.
double phi(double sum)
{
   return ( sum > 0 ? lambda * sum
                    : lambda * alpha * (std::exp(sum) - 1) );   // SELU function
}
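As a quick sanity check of this formula: with α ≈ 1.6733 and λ ≈ 1.0507, an input of 0.5 returns roughly 0.52535 (positive inputs are simply scaled by λ), while an input of −0.5 returns roughly −0.6918, and very negative inputs saturate towards −λα ≈ −1.7581.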
A simple ANN example with a Scaled Exponential Linear Unit (SELU)
We can use this SELU function in our Tneuron class as below,
#include <iostream>
#include <cstdio>    // for printf() and getchar()
#include <cmath>     // for std::exp()

#define NN 2         // number of input neurons

class Tneuron        // neuron class
{
public:
   double a;                 // activity of the neuron
   double w[NN+1];           // weights of the links between neurons
   double alpha  = 1.6733;   // SELU alpha parameter
   double lambda = 1.0507;   // SELU lambda parameter

   Tneuron()
   {
      a = 0;
      for (int i = 0; i <= NN; i++) w[i] = -1;   // if a weight is negative there is no link
   }

   // let's define an activation function (or threshold) for the output neuron
   double phi(double sum)
   {
      return ( sum > 0 ? lambda * sum
                       : lambda * alpha * (std::exp(sum) - 1) );   // SELU function
   }
};

Tneuron ne[NN+1];    // neuron objects

void fire(int nn)
{
   double sum = 0;
   for (int j = 0; j <= NN; j++)
   {
      if (ne[j].w[nn] >= 0) sum += ne[j].a * ne[j].w[nn];
   }
   ne[nn].a = ne[nn].phi(sum);
}

int main()
{
   // let's define the activity of two input neurons (a0, a1) and one output neuron (a2)
   ne[0].a = 0.0;
   ne[1].a = 1.0;
   ne[2].a = 0;

   // let's define the weights of the signals coming from the two input neurons to the output neuron (0 to 2 and 1 to 2)
   ne[0].w[2] = 0.6;
   ne[1].w[2] = 0.4;

   // let's fire our artificial neuron activity and print the output
   fire(2);
   printf("%10.6f\n", ne[2].a);

   getchar();
   return 0;
}
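In this example the weighted sum reaching the output neuron is 0.0 * 0.6 + 1.0 * 0.4 = 0.4; since that value is positive, the SELU output is λ * 0.4 ≈ 0.420280, which is the value the program should print.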
Why not download a free trial of C++ Builder today and see the kind of future you can build?