Deep Neural Network with Adaptive Parametric Rectified Linear Units and its Fast Learning
Keywords: deep neural network, convolutional neural network, adaptive parametric rectified linear unit, activation function
The adaptive parametric rectified linear unit (AdPReLU) is proposed in this article as an activation function for deep neural networks. The main benefit of the proposed system is an adjustable activation function whose parameters are tuned in parallel with the synaptic weights in online mode. An algorithm for the simultaneous learning of all neuron parameters, including those of AdPReLU, is introduced, together with a modified backpropagation procedure based on it. The approach under consideration reduces the volume of training data required and increases the tuning speed of a DNN with AdPReLU. It can be applied in deep convolutional neural networks (CNNs) under conditions of small training data sets and additional requirements on system performance. The main feature of the DNN under consideration is the possibility of tuning not only the synaptic weights but also the parameters of the activation function. The effectiveness of this approach is confirmed by experimental modeling.
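The idea of tuning the activation-function parameters in parallel with the synaptic weights can be illustrated with a minimal sketch. The exact form of AdPReLU and its learning rules are given in the paper itself; the piecewise-linear form below (separate trainable slopes `a` for negative and `b` for non-negative arguments) and the plain gradient step are assumptions for illustration only, not the authors' algorithm.

```python
import numpy as np

def adprelu(x, a, b):
    """AdPReLU-style activation (assumed form): slope b for x >= 0, slope a for x < 0."""
    return np.where(x >= 0.0, b * x, a * x)

def adprelu_grads(x, a, b):
    """Partial derivatives needed to tune the slopes alongside the weights."""
    dx = np.where(x >= 0.0, b, a)    # dy/dx
    da = np.where(x >= 0.0, 0.0, x)  # dy/da (active only on the negative branch)
    db = np.where(x >= 0.0, x, 0.0)  # dy/db (active only on the non-negative branch)
    return dx, da, db

# One online gradient step for a single neuron y = adprelu(w @ u, a, b):
# weights and both activation slopes are updated in the same pass.
rng = np.random.default_rng(0)
u = rng.normal(size=4)        # input vector
w = rng.normal(size=4)        # synaptic weights
a, b, lr = 0.25, 1.0, 0.01    # activation slopes and learning rate
target = 0.5

s = w @ u
y = adprelu(s, a, b)
e = y - target                # error signal
dx, da, db = adprelu_grads(s, a, b)
w -= lr * e * dx * u          # synaptic weight update
a -= lr * e * da              # slope updates happen simultaneously,
b -= lr * e * db              # not in a separate training phase
```

In a full network this single-neuron step would be embedded in backpropagation, with the slope gradients accumulated per layer just like the weight gradients.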
License
International Journal of Computing is an open access journal. Authors who publish with this journal agree to the following terms:
• Authors retain copyright and grant the journal right of first publication with the work simultaneously licensed under a Creative Commons Attribution License that allows others to share the work with an acknowledgement of the work's authorship and initial publication in this journal.
• Authors are able to enter into separate, additional contractual arrangements for the non-exclusive distribution of the journal's published version of the work (e.g., post it to an institutional repository or publish it in a book), with an acknowledgement of its initial publication in this journal.
• Authors are permitted and encouraged to post their work online (e.g., in institutional repositories or on their website) prior to and during the submission process, as it can lead to productive exchanges, as well as earlier and greater citation of published work.