A method for training the committee machine with an arbitrary logic is described. First, an expression for the discriminant function realized by the committee machine is introduced. Using this expression, an error-correction procedure for training the committee machine is proposed. The perceptron training procedure is shown to be a special case of the proposed procedure. Experimental results show that the procedure is effective.
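As a point of reference, the sketch below shows a perceptron-style error-correction update for the majority-vote special case of a committee machine: when the vote is wrong, only the minimum number of incorrectly voting units, those closest to their thresholds, are corrected. The abstract covers an arbitrary output logic, so the specific selection and update rule here are illustrative assumptions, not the paper's procedure.

```python
import numpy as np

def train_committee(X, y, n_units=5, lr=1.0, epochs=50, seed=0):
    """X: (n_samples, n_features), y: labels in {-1, +1}. Majority-vote committee (assumed)."""
    rng = np.random.default_rng(seed)
    W = rng.normal(size=(n_units, X.shape[1]))       # one weight vector per committee member
    for _ in range(epochs):
        for x, t in zip(X, y):
            s = W @ x                                 # weighted sums of all members
            votes = np.sign(s)
            if np.sign(votes.sum()) != t:             # committee vote is wrong
                # minimum number of members that must change their vote to flip the majority
                need = int((abs(votes.sum()) + 2) // 2) if votes.sum() != 0 else 1
                # correct the wrong-voting members whose weighted sums are closest to zero
                wrong = np.where(votes != t)[0]
                order = wrong[np.argsort(np.abs(s[wrong]))]
                for i in order[:need]:
                    W[i] += lr * t * x                # perceptron-style error correction
    return W
```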
Training algorithms for natural speech recognition require very large amounts of transcribed speech data. Commercially distributed books on tape constitute an abundant source of such data, but it is difficult to exploit with current training algorithms, which require the data to be hand-segmented into chunks that can be comfortably processed in memory. To address this problem we have developed a training algorithm that can handle unsegmented data files of arbitrary length; its computational requirements are linear in the amount of data to be processed and its memory requirements are constant.
ISBN:
(Print) 9781450376426
A graph autoencoder can map graph data into a low-dimensional space. It is a powerful graph embedding method used in graph analytics to reduce computational cost. The training algorithm of a graph autoencoder searches for a weight setting that preserves most of the information in the graph data at the reduced dimensionality. This paper presents a simple training strategy that improves training performance without significantly increasing time complexity. The strategy can be flexibly combined with many existing training algorithms. Experimental results confirm the effectiveness of the strategy.
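The abstract does not describe the training strategy itself, so the sketch below only illustrates the generic graph-autoencoder objective it builds on: node features are encoded into low-dimensional embeddings and the adjacency matrix is reconstructed from their inner products. The single-layer encoder and the function name gae_loss are assumptions made for illustration.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gae_loss(A, X, W):
    """A: binary adjacency (n, n), X: node features (n, d), W: encoder weights (d, k)."""
    Z = np.tanh(X @ W)                    # low-dimensional node embeddings
    A_hat = sigmoid(Z @ Z.T)              # reconstructed edge probabilities
    eps = 1e-9                            # numerical guard for the log terms
    return -np.mean(A * np.log(A_hat + eps) + (1 - A) * np.log(1 - A_hat + eps))
```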
ISBN:
(Print) 9783030299118; 9783030299101
This paper describes a novel acceleration technique for the quasi-Newton method (QN) using momentum terms in neural network training. Recently, Nesterov's accelerated quasi-Newton method (NAQ) has shown that the momentum term is effective in reducing the number of iterations and accelerating convergence. However, NAQ requires the gradient to be computed twice per iteration, which increases the computation time of a training loop compared with the conventional QN. In this research, NAQ is improved by approximating the Nesterov accelerated gradient used in NAQ as a linear combination of the current and previous gradients, so that the gradient is computed only once per iteration, as in QN. The performance of the proposed algorithm is evaluated through computer simulations on a benchmark function-modeling problem and on real-world microwave circuit modeling problems. The results show a significant reduction in computation time compared with conventional training algorithms.
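A minimal sketch of the idea stated in the abstract: the Nesterov accelerated gradient at the look-ahead point is approximated by a linear combination of the current and previous gradients, so only one gradient evaluation is needed per iteration. The specific combination (1 + mu)*g_k - mu*g_{k-1} and the plain momentum update below are assumptions for illustration; the paper applies the approximation inside a quasi-Newton loop, not shown here.

```python
import numpy as np

def train_momentum_approx(grad, w0, mu=0.9, lr=0.01, iters=100):
    """grad: callable returning the gradient at w; w0: initial parameter vector."""
    w = np.asarray(w0, dtype=float)
    v = np.zeros_like(w)
    g_prev = grad(w)
    for _ in range(iters):
        g = grad(w)
        g_nesterov = (1.0 + mu) * g - mu * g_prev   # approximated accelerated gradient
        v = mu * v - lr * g_nesterov                # momentum update
        w = w + v
        g_prev = g
    return w

# Example: minimize the quadratic E(w) = 0.5 * ||w||^2, whose gradient is w
w_opt = train_momentum_approx(lambda w: w, w0=np.ones(3))
```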
ISBN:
(Print) 9783319638591; 9783319638584
This paper proposes a Gaussian-Cauchy Particle Swarm Optimization (PSO) algorithm to provide optimized parameters for a feed-forward neural network. The improved PSO trains the neural network by optimizing its weights and biases. In comparison with the back-propagation neural network, the Gaussian-Cauchy PSO neural network converges faster and is immune to local minima.
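The sketch below shows one plausible reading of the approach: a standard PSO velocity/position update over the flattened network weights, with Gaussian and Cauchy perturbations applied to the global best to help escape local minima. How the paper actually combines the two distributions is not stated in the abstract, so the perturbation scheme, coefficients, and the function name pso_gaussian_cauchy are assumptions.

```python
import numpy as np

def pso_gaussian_cauchy(loss, dim, n_particles=20, iters=200, seed=0):
    """loss: callable on a flattened weight vector; returns the best vector found and its loss."""
    rng = np.random.default_rng(seed)
    x = rng.uniform(-1, 1, size=(n_particles, dim))     # particle positions = candidate weight vectors
    v = np.zeros_like(x)
    pbest, pbest_val = x.copy(), np.array([loss(p) for p in x])
    g_idx = pbest_val.argmin()
    gbest, gbest_val = pbest[g_idx].copy(), pbest_val[g_idx]
    for _ in range(iters):
        r1, r2 = rng.random((2, n_particles, dim))
        v = 0.7 * v + 1.5 * r1 * (pbest - x) + 1.5 * r2 * (gbest - x)
        x = x + v
        vals = np.array([loss(p) for p in x])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = x[better], vals[better]
        # Perturb the global best with Gaussian and Cauchy noise; keep a candidate only if it improves.
        for noise in (rng.normal(scale=0.1, size=dim), 0.1 * rng.standard_cauchy(dim)):
            cand = gbest + noise
            cand_val = loss(cand)
            if cand_val < gbest_val:
                gbest, gbest_val = cand, cand_val
        if pbest_val.min() < gbest_val:                  # sync with the swarm's best particle
            g_idx = pbest_val.argmin()
            gbest, gbest_val = pbest[g_idx].copy(), pbest_val[g_idx]
    return gbest, gbest_val
```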
ISBN:
(Print) 9781467375658
Several big data applications focus particularly on classification and clustering tasks. The robustness of such systems depends on how well they can extract important features from raw data. For big data processing we are interested in a generic feature extraction mechanism for different applications. The autoencoder is a popular unsupervised training algorithm for dimensionality reduction and feature extraction. In this work we examine a memristor crossbar based implementation of the autoencoder, which consumes very little power. We have designed on-chip training circuitry for the unsupervised training scheme.
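For context, here is a software-level sketch of the unsupervised autoencoder training that such crossbar circuitry implements: minimizing reconstruction error with tied encoder/decoder weights. The circuit-level details (conductance updates, analog peripherals) are outside the abstract, so only the generic objective is shown; the tied-weight choice and hyperparameters are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def train_autoencoder(X, n_hidden, lr=0.1, epochs=20, seed=0):
    """X: (n_samples, n_visible) with values in [0, 1]; returns tied weights and biases."""
    rng = np.random.default_rng(seed)
    n_vis = X.shape[1]
    W = rng.normal(scale=0.1, size=(n_vis, n_hidden))   # tied encoder/decoder weights
    b_h, b_v = np.zeros(n_hidden), np.zeros(n_vis)
    for _ in range(epochs):
        for x in X:
            h = sigmoid(x @ W + b_h)                     # encode
            x_hat = sigmoid(h @ W.T + b_v)               # decode
            d_v = (x_hat - x) * x_hat * (1 - x_hat)      # output-layer error signal
            d_h = (d_v @ W) * h * (1 - h)                # hidden-layer error signal
            W -= lr * (np.outer(x, d_h) + np.outer(d_v, h))  # gradient of both paths (tied weights)
            b_v -= lr * d_v
            b_h -= lr * d_h
    return W, b_h, b_v
```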
ISBN:
(Print) 9781509025978
This paper describes a novel quasi-Newton (QN) based acceleration technique for training neural networks. Recently, Nesterov's accelerated gradient method has been utilized for training neural networks. In this paper, the QN training algorithm is accelerated using the momentum term from Nesterov's method. It is shown that the proposed algorithm has convergence properties similar to those of the conventional QN method. Neural network training for a simple function approximation problem and for microwave circuit modeling is presented to demonstrate the proposed algorithm. The proposed algorithm drastically improves the convergence speed of the QN algorithm.
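A hedged sketch of one quasi-Newton iteration with a Nesterov-style momentum term, in the spirit of this abstract: the gradient is evaluated at a look-ahead point and the step combines the momentum term with a quasi-Newton direction. The BFGS-style inverse-Hessian update, the fixed step size, and the function name naq_step are standard choices assumed here, not details taken from the abstract.

```python
import numpy as np

def naq_step(grad, w, v, H, mu=0.85, alpha=1.0):
    """One iteration; returns updated (w, v, H). grad: gradient callable, H: inverse Hessian estimate."""
    g_ahead = grad(w + mu * v)              # gradient at the momentum look-ahead point
    d = -H @ g_ahead                        # quasi-Newton search direction
    w_new = w + mu * v + alpha * d          # momentum term plus quasi-Newton step
    v_new = w_new - w
    # BFGS-style update of the inverse Hessian approximation
    s = w_new - w
    y = grad(w_new) - g_ahead
    sy = float(s @ y)
    if sy > 1e-10:                          # skip the update if curvature is not positive
        I = np.eye(len(w))
        rho = 1.0 / sy
        V = I - rho * np.outer(s, y)
        H = V @ H @ V.T + rho * np.outer(s, s)
    return w_new, v_new, H
```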
ISBN:
(Print) 9780738143378
Today's deep learning models are primarily trained on CPUs and GPUs. Although these models tend to have low error, they consume high power and use large amounts of memory owing to double-precision floating-point learning parameters. Beyond Moore's law, a significant portion of deep learning tasks will run on edge computing systems, which will form an indispensable part of the entire computation fabric. Consequently, training deep learning models for such systems will have to be tailored and adapted to produce models with the following desirable characteristics: low error, low memory, and low power. We believe that deep neural networks (DNNs) whose learning parameters are constrained to a set of finite discrete values, running on neuromorphic computing systems, would be instrumental for intelligent edge computing systems with these characteristics. To this end, we propose the Combinatorial Neural Network Training Algorithm (CoNNTrA), which leverages a coordinate gradient descent-based approach for training deep learning models with finite discrete learning parameters. We elaborate on the theoretical underpinnings and evaluate the computational complexity of CoNNTrA. As a proof of concept, we use CoNNTrA to train deep learning models with ternary learning parameters on the MNIST, Iris and ImageNet data sets and compare their performance to the same models trained using backpropagation, using four performance metrics: (i) training error, (ii) validation error, (iii) memory usage, and (iv) training time. Our results indicate that CoNNTrA models use 32x less memory and have errors on par with the backpropagation models.
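An illustrative sketch of coordinate-wise training over a finite discrete parameter set, the general idea the abstract describes: each parameter in turn is set to whichever discrete value lowers the loss most. The ternary set {-1, 0, +1}, the sweep order, the stopping rule, and the name coordinate_descent_ternary are assumptions; CoNNTrA's actual coordinate gradient descent procedure is not specified in the abstract.

```python
import numpy as np

def coordinate_descent_ternary(loss, w_init, values=(-1, 0, 1), sweeps=10):
    """loss: callable on a full parameter vector; w_init: starting vector of discrete values."""
    w = np.array(w_init, dtype=float)
    best = loss(w)
    for _ in range(sweeps):
        improved = False
        for i in range(len(w)):                 # one coordinate (learning parameter) at a time
            keep = w[i]
            for v in values:                    # try every allowed discrete value
                w[i] = v
                val = loss(w)
                if val < best:
                    best, keep = val, v
                    improved = True
            w[i] = keep                         # keep the best value found for this coordinate
        if not improved:                        # no coordinate changed in a full sweep: stop
            break
    return w, best
```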
ISBN:
(Print) 9781509061822
This paper presents on-chip training circuits for memristor based deep neural networks that use unsupervised and supervised learning methods. Memristor crossbar circuits allow neural algorithms to be implemented very efficiently, but can be prone to device variations and faults. On-chip training circuits allow the training algorithm to account for device variability and faults in these circuits. We utilize autoencoders for layer-wise pre-training of the deep networks and the back-propagation algorithm for supervised fine-tuning. Our design uses two memristors per synapse for higher weight precision. We demonstrate successful training of memristor based deep networks on the MNIST digit classification and KDD intrusion detection datasets.
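A minimal sketch of the differential two-memristors-per-synapse representation mentioned in the abstract: each signed weight is the scaled difference of two conductances, and a weight update is split across the pair. The conductance range, scaling factor, and helper names are illustrative assumptions, not values from the paper.

```python
import numpy as np

G_MIN, G_MAX = 1e-6, 1e-4          # assumed device conductance range (siemens)
SCALE = 1.0 / (G_MAX - G_MIN)      # maps the conductance difference to a weight in [-1, 1]

def weight_from_pair(g_pos, g_neg):
    """Effective signed synaptic weight from a (G+, G-) conductance pair."""
    return SCALE * (g_pos - g_neg)

def apply_update(g_pos, g_neg, delta_w):
    """Split a desired weight change across the pair, respecting device bounds."""
    dg = delta_w / (2 * SCALE)                       # half the change on each device
    g_pos = np.clip(g_pos + dg, G_MIN, G_MAX)
    g_neg = np.clip(g_neg - dg, G_MIN, G_MAX)
    return g_pos, g_neg
```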
ISBN:
(Print) 9781479972524
We investigated batch and stochastic Manhattan Rule algorithms for training multilayer perceptron classifiers implemented with memristive crossbar circuits. In Manhattan Rule training, the weights are updated using only the sign information of the classical backpropagation algorithm. The main advantage of the Manhattan Rule is its simplicity, which leads to a more compact hardware implementation and faster training. Additionally, in the case of stochastic training, the Manhattan Rule allows all weight updates to be performed in parallel, which further speeds up the training procedure. The tradeoff for this simplicity is slightly worse classification performance. For example, simulation results showed that classification fidelity on the Proben1 benchmark for a memristor-based implementation trained with batch Manhattan Rule was comparable to that of the classical backpropagation algorithm, and about 2.8 percent worse than the best reported results.
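The Manhattan Rule update itself is simple enough to state directly: every weight takes a fixed-magnitude step whose direction is given by the sign of the backpropagated gradient, which is what makes it attractive for memristive hardware. The step size and the batch averaging below are illustrative assumptions.

```python
import numpy as np

def manhattan_update(W, grad_W, step=0.01):
    """Stochastic variant: one fixed-magnitude step per weight, direction from the gradient sign."""
    return W - step * np.sign(grad_W)

def manhattan_update_batch(W, grads, step=0.01):
    """Batch variant: accumulate gradients over the batch, then take one signed step."""
    return W - step * np.sign(np.mean(grads, axis=0))
```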