Variable Activation Functions in Neuroevolution
Neural networks, and the Artificial Intelligence (AI) applications driven by them, are beginning to see widespread deployment in areas such as autonomous driving, voice-driven user interfaces, and control of complex systems. With recent advances in convolutional neural networks and deep learning (so named for the many layers in such architectures), neural network research has seen a resurgence. In a traditional neural network, a single activation function, which sets the triggering behavior of the artificial neuron, is fixed for each layer of neurons at design time, prior to training. Most commonly, supervised learning is used: a training algorithm adjusts the network to fit a specific data set given a collection of input-output pairings, such as images and their labels. All such training algorithms attempt to optimize the network to the training set by adjusting the connection weights between neurons; if no optimum is reached, a close approximation is accepted. This paper proposes the Variable Activation Function Neural Network (VAFNN), an architecture in which activation functions are varied on a per-neuron basis. This method may have the potential to model behavior similar to that of deep neural networks with fewer layers, making the network more efficient. In addition, the proposed architecture admits activation functions that need not be monotonic, continuous, or differentiable, whereas traditional training algorithms typically require smooth activation functions for optimization. Instead of traditional training, a form of neuroevolution is used to vary the weights and activation functions simultaneously. The evolution algorithm mutates only a single candidate network at a time, as opposed to a population of networks.
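The per-neuron idea above can be made concrete with a minimal sketch. The names (`Neuron`, `ACTIVATIONS`) and the particular pool of activation functions are illustrative assumptions, not the paper's implementation; the point is simply that each neuron carries its own activation, including non-differentiable choices such as a step function that gradient-based training could not use.

```python
import math
import random

# Hypothetical pool of candidate activations. Note that step() is neither
# continuous nor differentiable, which gradient-based training would reject
# but a neuroevolutionary search can still use.
ACTIVATIONS = {
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "tanh": math.tanh,
    "relu": lambda x: max(0.0, x),
    "step": lambda x: 1.0 if x >= 0.0 else 0.0,
}

class Neuron:
    """A neuron that carries its own activation function (per-neuron,
    rather than one activation fixed per layer)."""

    def __init__(self, n_inputs, act="sigmoid"):
        self.weights = [random.uniform(-1.0, 1.0) for _ in range(n_inputs)]
        self.bias = random.uniform(-1.0, 1.0)
        self.act = act  # varied per neuron by the evolutionary search

    def forward(self, inputs):
        s = sum(w * x for w, x in zip(self.weights, inputs)) + self.bias
        return ACTIVATIONS[self.act](s)
```

For example, a step-activated neuron with weights `[1, 1]` and bias `-1.5` computes logical AND of two binary inputs; swapping only its activation name changes its behavior without touching the weights.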
While the local minima problem remains an issue, this neuroevolutionary approach uses significantly less memory than population-based neuroevolution. Finally, the results of VAFNN are compared to the traditional fixed-activation-function approach on an XOR network, and it is shown that the VAFNN approach uncovers a more efficient implementation than has previously been reported.
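A single-candidate search of this kind can be sketched as a (1+1)-style hill climber: mutate the one stored network, and keep the child only if it is no worse than the parent, so only two networks ever exist in memory. The topology (2-2-1), mutation rates, and activation pool below are assumptions for illustration, not the paper's actual parameters.

```python
import math
import random

random.seed(0)

# Candidate activations, including a non-differentiable step function.
ACTS = {
    "sigmoid": lambda x: 1.0 / (1.0 + math.exp(-x)),
    "tanh": math.tanh,
    "step": lambda x: 1.0 if x >= 0.0 else 0.0,
}

# The XOR truth table used as the training set.
XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def random_neuron(n_inputs):
    # A neuron is (weights, bias, activation name); activation is per-neuron.
    return ([random.uniform(-2, 2) for _ in range(n_inputs)],
            random.uniform(-2, 2),
            random.choice(list(ACTS)))

def make_net():
    # Assumed 2-2-1 topology: one hidden layer of two neurons, one output.
    return [[random_neuron(2), random_neuron(2)], [random_neuron(2)]]

def forward(net, x):
    for layer in net:
        x = [ACTS[a](sum(w * i for w, i in zip(ws, x)) + b)
             for ws, b, a in layer]
    return x[0]

def error(net):
    # Sum of squared errors over the XOR training set.
    return sum((forward(net, x) - y) ** 2 for x, y in XOR)

def mutate(net):
    # Perturb weights and biases; occasionally swap a neuron's activation.
    child = []
    for layer in net:
        new_layer = []
        for ws, b, a in layer:
            ws = [w + random.gauss(0, 0.3) for w in ws]
            b = b + random.gauss(0, 0.3)
            if random.random() < 0.1:
                a = random.choice(list(ACTS))
            new_layer.append((ws, b, a))
        child.append(new_layer)
    return child

# (1+1) evolution: a single champion, replaced only by a non-worse child.
champion = make_net()
start_err = error(champion)
for _ in range(5000):
    challenger = mutate(champion)
    if error(challenger) <= error(champion):
        champion = challenger
```

Because the acceptance rule never keeps a worse candidate, the error is non-increasing over the run, though, as the abstract notes, the search can still stall in a local minimum.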