
What You Need To Know About C++ Gaussian Error Linear Units

What is a Gaussian Error Linear Unit? How do we use the GELU function in an ANN? Where can we use GELU in AI technologies? Let’s review activation functions and explain these terms.

What is an activation function?

An activation function, phi(), also called a transfer function or threshold function, determines the activation value ( a = phi(sum) ) from a given value (sum) produced by the Net Input Function. The Net Input Function here means that sum is the sum of the input signals multiplied by their weights, and the activation function maps this sum to a new value according to a given function or set of conditions. In other words, the activation function is a way to transfer the sum of all weighted signals to a new activation value of that signal. There are different activation functions; the most commonly used are the linear (identity), bipolar, and logistic (sigmoid) functions. The activation function and its types are explained well here.

In C++ (as in most programming languages) you can create your own activation function. Note that sum is the result of the Net Input Function, which calculates the sum of all weighted signals. We will use sum as the result of the input function. The activation value of an artificial neuron (its output value) can then be written with the activation function as below,
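A minimal sketch of such a function, assuming a logistic sigmoid as a placeholder activation (the choice of sigmoid here is an illustration, not taken from the original listing):

```cpp
#include <cmath>

// Activation function phi(): maps the net input (sum) to an activation value.
// Assumption: the logistic sigmoid is used only as a stand-in activation.
double phi(double sum) {
    return 1.0 / (1.0 + std::exp(-sum));
}
```

With this in place, the activation value of a neuron would be obtained as `double a = phi(sum);`.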

What is a Gaussian Error Linear Unit or GELU?

A Gaussian Error Linear Unit is an alternative to the ReLU and ELU functions, defined and published by Dan Hendrycks and Kevin Gimpel in 2016. It is used to smooth ReLU and ELU activations (the full paper can be found here).

The Gaussian Error Linear Unit (GELU) is a high-performing neural network activation function. The GELU activation function is xΦ(x), where Φ(x) is the standard Gaussian cumulative distribution function. The GELU nonlinearity weights inputs by their value, rather than gating inputs by their sign as ReLUs do (x·1_{x>0}). In an empirical evaluation of the GELU nonlinearity against the ReLU and ELU activations, the authors report performance improvements across all considered computer vision, natural language processing, and speech tasks.

The GELU function can be written as:
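Expressing the standard Gaussian CDF Φ(x) through the error function erf, the definition xΦ(x) given above reads:

$$
\mathrm{GELU}(x) = x\,\Phi(x) = x \cdot \frac{1}{2}\left[1 + \operatorname{erf}\!\left(\frac{x}{\sqrt{2}}\right)\right]
$$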

We can approximate the GELU with the following formula:
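The tanh-based approximation given in the Hendrycks and Gimpel paper is:

$$
\mathrm{GELU}(x) \approx 0.5\,x\left(1 + \tanh\!\left[\sqrt{\frac{2}{\pi}}\left(x + 0.044715\,x^{3}\right)\right]\right)
$$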

Or, if greater feedforward speed is worth the cost of exactness, we can use the approximation below:
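The faster, less exact form from the same paper replaces Φ(x) with a scaled logistic sigmoid σ:

$$
\mathrm{GELU}(x) \approx x\,\sigma(1.702\,x), \qquad \sigma(z) = \frac{1}{1 + e^{-z}}
$$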

We can also use other CDFs. For example, using the logistic cumulative distribution function σ(x) to obtain the activation value gives xσ(x), which is called the Sigmoid Linear Unit (SiLU).
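As a small sketch, the SiLU could be coded like this; the function names `sigmoid` and `silu` are chosen here for illustration:

```cpp
#include <cmath>

// Logistic sigmoid: the CDF of the standard logistic distribution.
double sigmoid(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}

// Sigmoid Linear Unit (SiLU): x * sigma(x).
double silu(double x) {
    return x * sigmoid(x);
}
```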

From the second formula we can code our phi() activation function with GELU as below,
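A minimal sketch, assuming the "second formula" is the tanh approximation 0.5x(1 + tanh[√(2/π)(x + 0.044715x³)]) from the GELU paper; the constant name is illustrative:

```cpp
#include <cmath>

// GELU activation, tanh approximation:
// 0.5 * x * (1 + tanh( sqrt(2/pi) * (x + 0.044715 * x^3) ))
double phi(double sum) {
    const double sqrt_2_over_pi = 0.7978845608028654; // sqrt(2 / pi)
    return 0.5 * sum *
           (1.0 + std::tanh(sqrt_2_over_pi * (sum + 0.044715 * sum * sum * sum)));
}
```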

From the third formula we can use a sigmoid function and we can code our phi() activation function as below,
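A minimal sketch, assuming the "third formula" is the sigmoid approximation xσ(1.702x) from the GELU paper; the helper name `sigmoid` is illustrative:

```cpp
#include <cmath>

// Logistic sigmoid helper.
double sigmoid(double x) {
    return 1.0 / (1.0 + std::exp(-x));
}

// GELU activation, sigmoid approximation: x * sigma(1.702 * x).
double phi(double sum) {
    return sum * sigmoid(1.702 * sum);
}
```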

These formulas can both be tested in this example below,
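One way to compare them is sketched below. The function names (`gelu_exact`, `gelu_tanh`, `gelu_sigmoid`, `print_gelu`) are hypothetical, and the erf-based reference version is an addition for comparison, relying on `std::erf` from `<cmath>` (C++11 and later):

```cpp
#include <cmath>
#include <iostream>

// Reference GELU: x * Phi(x), with Phi written through the error function.
double gelu_exact(double x) {
    return 0.5 * x * (1.0 + std::erf(x / std::sqrt(2.0)));
}

// tanh approximation of GELU.
double gelu_tanh(double x) {
    const double sqrt_2_over_pi = 0.7978845608028654; // sqrt(2 / pi)
    return 0.5 * x * (1.0 + std::tanh(sqrt_2_over_pi * (x + 0.044715 * x * x * x)));
}

// Sigmoid approximation of GELU: x * sigma(1.702 * x).
double gelu_sigmoid(double x) {
    return x / (1.0 + std::exp(-1.702 * x));
}

// Writes the three values for one input, one per line.
void print_gelu(std::ostream& out, double x) {
    out << "exact   : " << gelu_exact(x) << '\n'
        << "tanh    : " << gelu_tanh(x) << '\n'
        << "sigmoid : " << gelu_sigmoid(x) << '\n';
}
```

Calling `print_gelu(std::cout, 0.5);` from `main()` prints the three values side by side, which makes the trade-off between speed and exactness visible.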

and the output for the phi(0.5) is;
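The exact digits printed depend on the output formatting, but the values can be checked by hand. Assuming phi() is evaluated at 0.5 with each of the two approximations, and rounding to four decimals:

$$
\begin{aligned}
0.5 \cdot 0.5\left[1 + \tanh\!\left(\sqrt{2/\pi}\,\left(0.5 + 0.044715 \cdot 0.5^{3}\right)\right)\right] &\approx 0.25\,(1 + 0.3829) \approx 0.3457 \\
0.5\,\sigma(1.702 \cdot 0.5) = 0.5\,\sigma(0.851) &\approx 0.5 \cdot 0.7008 \approx 0.3504
\end{aligned}
$$

For comparison, the exact value is 0.5·Φ(0.5) ≈ 0.5 · 0.6915 ≈ 0.3457, so the tanh form is the closer of the two at this input.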

Is there a simple C++ ANN example with a GELU Activation Function?
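A minimal sketch of a single artificial neuron with a GELU activation is shown below. The `Neuron` struct, its members, and the function names are hypothetical and only illustrate how the net input function and phi() fit together; the tanh approximation is assumed for the activation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// GELU activation (tanh approximation).
double gelu(double sum) {
    const double sqrt_2_over_pi = 0.7978845608028654; // sqrt(2 / pi)
    return 0.5 * sum *
           (1.0 + std::tanh(sqrt_2_over_pi * (sum + 0.044715 * sum * sum * sum)));
}

// Hypothetical minimal neuron: one weight per input signal, plus a bias.
struct Neuron {
    std::vector<double> weights;
    double bias;

    // Net input function: bias plus the sum of all weighted signals.
    double net_input(const std::vector<double>& inputs) const {
        double sum = bias;
        for (std::size_t i = 0; i < weights.size() && i < inputs.size(); ++i) {
            sum += weights[i] * inputs[i];
        }
        return sum;
    }

    // Activation value (output) of the neuron: a = phi(sum), with phi = GELU.
    double output(const std::vector<double>& inputs) const {
        return gelu(net_input(inputs));
    }
};
```

For example, `Neuron n{{0.5, -0.25}, 0.1};` followed by `n.output({1.0, 2.0})` computes the net input 0.1 + 0.5·1.0 − 0.25·2.0 = 0.1 and then passes it through GELU.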



About author

33+ years of coding with 30+ programming languages, mostly C++ on Windows, Android, Mac-OS, iOS, Linux and some other operating systems. Dr. Yilmaz Yoru was born in 1974 in Eskisehir, Turkey. He graduated from the Department of Mechanical Engineering of Eskisehir Osmangazi University in 1997. One year later he started to work at the same university as an assistant. He received his MSc and PhD degrees from the same department of the same university. He is married and is the father of a son. Some of his interests are Programming, Thermodynamics, Fluid Mechanics and Artificial Intelligence. He also likes graphical 2D & 3D design and high-end innovations.