Article title (Persian): تحلیل پراکنده در آموزش اولیه شبکه‌های عصبی عمیق (Sparseness Analysis in the Pretraining of Deep Neural Networks)
Article title (English): Sparseness Analysis in the Pretraining of Deep Neural Networks
Authors: Jun Li; Tong Zhang; Wei Luo; Jian Yang; Xiao-Tong Yuan; Jian Zhang
Number of pages: 13
Year of publication: 2017
Language: English
Abstract:
A major advance in deep multilayer neural networks (DNNs) is the invention of various unsupervised pretraining methods that initialize the network parameters and lead to good prediction accuracy. This paper presents a sparseness analysis of the hidden units in the pretraining process. In particular, we use the L1-norm to measure sparseness and provide sufficient conditions under which pretraining leads to sparseness for popular pretraining models such as denoising autoencoders (DAEs) and restricted Boltzmann machines (RBMs). Our experimental results demonstrate that when the sufficient conditions are satisfied, these pretraining models do lead to sparseness. Our experiments also reveal that with sigmoid activation functions, pretraining plays an important sparseness-inducing role in DNNs with sigmoid units (Dsigm), whereas with rectifier linear unit (ReLU) activation functions, pretraining becomes less effective for DNNs with ReLU units (Drelu). Fortunately, Drelu can reach higher recognition accuracy than DNNs with pretraining (DAEs and RBMs), as it captures the main benefit of pretraining in Dsigm, namely the encouragement of sparseness. However, ReLU is not adapted to the different firing rates of biological neurons, because the firing rate actually changes with the varying membrane resistance. To address this problem, we further propose a family of rectifier piecewise linear units (RePLUs) to fit the different firing rates. The experimental results show that RePLU performs better than ReLU and is comparable with pretraining techniques such as RBMs and DAEs.
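As a rough illustration of the sparseness measure discussed in the abstract, the sketch below computes the mean L1-norm of hidden-unit activations for a randomly initialized layer under sigmoid and ReLU activations. This is not the paper's code; the layer sizes, random weights, and inputs are hypothetical and chosen only to show how the L1-norm can be used to compare the sparseness of the two activation functions.

```python
# Illustrative sketch (not from the paper): measuring hidden-unit sparseness
# with the L1-norm for sigmoid vs. ReLU hidden layers on random data.
import numpy as np

rng = np.random.default_rng(0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    return np.maximum(0.0, z)

# Hypothetical single hidden layer: 784 inputs -> 256 hidden units.
W = rng.normal(scale=0.1, size=(784, 256))
b = np.zeros(256)
X = rng.normal(size=(100, 784))          # a batch of 100 random inputs

for name, act in [("sigmoid", sigmoid), ("relu", relu)]:
    H = act(X @ W + b)                   # hidden-unit activations
    mean_l1 = np.abs(H).sum(axis=1).mean()   # average L1-norm per example
    print(f"{name}: mean L1-norm of hidden activations = {mean_l1:.2f}")
```

A smaller mean L1-norm indicates sparser hidden representations; the paper's analysis gives conditions under which pretraining (DAEs, RBMs) drives this quantity down for sigmoid networks.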