In this post, we will explain what an Exponential Linear Unit, or ELU, is. How can we make use of an ELU Activation Function? By learning all of these, you will be able to create C++ applications using C++ software.
What do we need to know about activation functions?
An Activation Function ( phi() ) also called as transfer function, or threshold function, determines the activation value ( a = phi(sum) ) from a given value (sum) from the Net Input Function . Net Input Function, here the sum is a sum of signals in their weights, and the activation function is a new value of this sum with a given function or conditions. In other terms. The activation function is a way to transfer the sum of all weighted signals to a new activation value of that signal. There are different activation functions, mostly Linear (Identity), bipolar and logistic (sigmoid) functions are used. The activation function and its types are explained well here.
In C++ (in general in most Programming Languages) you can create activation functions. Note that sum is the result of Net Input Function which calculates the sum of all weighted signals. We will use some as a result of the input function. Here activation value of an artificial neuron (output value) can be written by the activation function as below,
By using this sum Net Input Function Value and phi() activation functions.
What is an Exponential Linear Unit or ELU?
An Exponential Linear Unit (ELU) is another activation function which is developed and published by Djork-Arne Clevert, Thomas Unterthiner & Sepp Hochreiter with the title “FAST AND ACCURATE DEEP NETWORK LEARNING BY EXPONENTIAL LINEAR UNITS (ELUS)”. You can find the text of the paper by clicking here.
According to their study, they introduce the “exponential linear unit” (ELU) that it speeds up learning in deep neural networks and leads to higher classification accuracies. ELU activation function alleviate the vanishing gradient problem via the identity for positive values, like rectified linear units (ReLUs), leaky ReLUs (LReLUs) and parametrized ReLUs (PReLUs),. They also proof that ELUs have improved learning characteristics compared to the units with other activation functions. In contrast to ReLUs,
Exponential Linear Unit (ELU) can be written as below,
and derivative of this function can be written as,
In C & C++ Programming language, simply Exponential Linear Unit function can be written as below
1 2 3 4 5 6 7 8 |
double alpha = 0.1; // ranges from 0 to 1.0 double phi(double sum) { return( sum>0? sum : alpha*( std::exp(sum)-1) ); // ELU Function } |
Is there a simple Artificial Neural Network example which uses an Exponential Linear Unit (ELU) in C++?
We can use this given ELU function in our Tneuron
class as below,
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 |
#include <iostream> #define NN 2 // number of neurons double alpha = 0.1; // ranges from 0 to 1.0, can be defined in neuron class if each neuron has different alpha class Tneuron // neuron class { public: double a; // activity of each neurons double w[NN+1]; // weight of links between each neurons Tneuron() { a=0; for(int i=0; i<=NN; i++) w[i]=-1; // if weight is negative there is no link } // let's define an activation function (or threshold) for the output neuron double phi(double sum) { return( sum>0? sum : alpha*( std::exp(sum)-1) ); // ELU Function } }; Tneuron ne[NN+1]; // neuron objects void fire(int nn) { float sum = 0; for ( int j=0; j<=NN; j++ ) { if( ne[j].w[nn]>=0 ) sum += ne[j].a*ne[j].w[nn]; } ne[nn].a = ne[nn].activation_function(sum); } int main() { //let's define activity of two input neurons (a0, a1) and one output neuron (a2) ne[0].a = 0.0; ne[1].a = 1.0; ne[2].a = 0; //let's define weights of signals comes from two input neurons to output neuron (0 to 2 and 1 to 2) ne[0].w[2] = 0.3; ne[1].w[2] = 0.2; // Let's fire our artificial neuron activity, output will be fire(2); printf("%10.6f\n", ne[2].a); getchar(); return 0; } |