ISBN (print): 9781713829546
Many recent breakthroughs in deep learning were achieved by training increasingly larger models on massive datasets. However, training such models can be prohibitively expensive. For instance, the cluster used to train GPT-3 costs over $250 million [2]. As a result, most researchers cannot afford to train state-of-the-art models and contribute to their development. Hypothetically, a researcher could crowdsource the training of large neural networks with thousands of regular PCs provided by volunteers. The raw computing power of a hundred thousand $2500 desktops dwarfs that of a $250M server pod, but one cannot utilize that power efficiently with conventional distributed training methods. In this work, we propose Learning@home: a novel neural network training paradigm designed to handle large numbers of poorly connected participants. We analyze the performance, reliability, and architectural constraints of this paradigm and compare it against existing distributed training techniques.
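As a rough illustration of what training over intermittently available volunteers implies, the following sketch averages model parameters within random groups of currently online peers. The failure model, group size, and update rule are assumptions for illustration, not the authors' Learning@home protocol:

```python
# Hypothetical sketch: gossip-style parameter averaging among unreliable
# volunteer peers. Illustrative only, not the Learning@home system.
import random
import numpy as np

def gossip_round(params, online_prob=0.7, group_size=4):
    """Average parameters within random groups of currently online peers."""
    online = [p for p in params if random.random() < online_prob]
    random.shuffle(online)
    for i in range(0, len(online), group_size):
        group = online[i:i + group_size]
        mean = np.mean(group, axis=0)
        for p in group:
            p[:] = mean  # in-place, so offline peers simply keep stale params

peers = [np.random.randn(10) for _ in range(100)]  # one weight vector per PC
for _ in range(50):
    gossip_round(peers)
print("spread after gossip:", np.std([p[0] for p in peers]))
```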
ISBN (print): 9780769537887
In this paper, the main interest is the fusion and control of data obtained from a set of sensors. This task requires a computational model that is both effective and versatile. The chosen architecture is the cellular neural network (CNN), already known for its suitability. This model offers significant features such as continuous-time dynamics, local interconnection, reliability, simple implementation, low power consumption, and great behavioral flexibility. Furthermore, the required network dimension may vary depending on the application. To address this problem, a methodology is proposed for the automatic generation of CNNs of variable dimensions. This is achieved by developing an algorithm that combines the basic CNN circuit counterparts to produce the desired network dimensions.
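For context, the cell dynamics such a generator would instantiate follow the standard Chua-Yang CNN state equation; the sketch below integrates it with explicit Euler steps for an arbitrary network dimension n. The templates A, B and bias z are illustrative, not the circuits produced by the paper's algorithm:

```python
# Minimal sketch of a Chua-Yang cellular neural network of arbitrary
# dimension: dx/dt = -x + A*y + B*u + z, with piecewise-linear output y.
import numpy as np
from scipy.signal import convolve2d

def cnn_run(u, A, B, z, steps=200, dt=0.05):
    x = np.zeros_like(u)                    # one state per cell
    Bu = convolve2d(u, B, mode="same")      # input term is constant over time
    for _ in range(steps):
        y = 0.5 * (np.abs(x + 1) - np.abs(x - 1))  # piecewise-linear output
        x += dt * (-x + convolve2d(y, A, mode="same") + Bu + z)
    return y

n = 32                                      # network dimension is a parameter
u = np.sign(np.random.default_rng(0).standard_normal((n, n)))
A = np.array([[0, 0, 0], [0, 2, 0], [0, 0, 0]], float)   # illustrative templates
B = np.array([[-1, -1, -1], [-1, 8, -1], [-1, -1, -1]], float)
print(cnn_run(u, A, B, z=-0.5).shape)
```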
It is pointed out that recurrent lateral connectivity in a layer of processing units gives rise to a rich variety of nonlinear response properties, such as overall gain control, emergent periodic response on a preferred spatial scale (collective excitations), and distributed winner-take-all response. This diversity of response properties is observed in several different classes of simple network architectures, including the additive linear network, the additive sigmoidal network, and the nonlinear shunting network. When Hebbian learning is coupled with network dynamics, these models have been shown to support the development of modular connectivity structures analogous to cortical columns.
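A minimal sketch of the kind of recurrent lateral dynamics described above: an additive sigmoidal layer with local excitation and global inhibition, relaxed with leaky updates until a distributed winner-take-all pattern emerges. All weights and the nonlinearity are assumptions chosen for illustration, not parameters from the paper:

```python
# Illustrative additive sigmoidal network with lateral connectivity.
import numpy as np

n = 50
W = -0.1 * np.ones((n, n))                   # global lateral inhibition
for i in range(n):
    for d in (-1, 0, 1):
        W[i, (i + d) % n] = 0.4              # local excitatory neighborhood

rng = np.random.default_rng(0)
x = rng.random(n)                            # external input pattern
r = np.zeros(n)
for _ in range(300):
    drive = np.clip(x + W @ r, 0.0, None)    # rectified recurrent input
    r += 0.2 * (-r + np.tanh(drive))         # leaky additive dynamics
print("winners:", np.flatnonzero(r > 0.9 * r.max()))
```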
ISBN (print): 0780314212
This paper presents a new parallel distributed processing (PDP) approach to solve the job-shop scheduling problem, which is NP-complete. In this approach, a stochastic model and a controlled external energy are used to improve the scheduling solution iteratively. Unlike the processing element (PE) of the Hopfield neural network model, each PE of our model represents an operation of a certain job, so the functions of each PE are somewhat more complicated than those of a Hopfield PE. Under this model, each PE is designed to perform stochastic, collective computations. Experimental results show that the solutions can be improved toward optimal ones much faster than with other methods. Instead of the polynomial number of variables needed in the neural network approach, the number of variables needed to formulate a job-shop problem in our model is only a linear function of the number of operations in the given problem.
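Since the paper's PE-per-operation circuit is not reproduced here, the following sketch shows only the generic pattern it relies on: iteratively improving a schedule under a controlled, decaying external energy, demonstrated on a toy single-machine sequencing cost rather than a full job-shop instance:

```python
# Hedged sketch: stochastic iterative improvement with a controlled,
# annealed external energy (temperature). Toy cost, not the paper's model.
import math
import random

random.seed(0)
times = [random.randint(1, 9) for _ in range(20)]   # toy processing times
due   = [random.randint(5, 60) for _ in range(20)]  # toy due dates

def tardiness(order):
    t, cost = 0, 0
    for j in order:
        t += times[j]
        cost += max(0, t - due[j])
    return cost

order, temp = list(range(20)), 10.0
for _ in range(5000):
    i, j = random.sample(range(20), 2)
    cand = order[:]
    cand[i], cand[j] = cand[j], cand[i]
    delta = tardiness(cand) - tardiness(order)
    if delta < 0 or random.random() < math.exp(-delta / temp):
        order = cand                 # accept improving or occasional uphill move
    temp *= 0.999                    # decrease the external energy
print("final tardiness:", tardiness(order))
```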
ISBN (print): 9781713871088
We develop an eigenspace estimation algorithm for distributed environments with arbitrary node failures, where a subset of computing nodes can return structurally valid but otherwise arbitrarily chosen responses. Notably, this setting encompasses several important scenarios that arise in distributed computing and data-collection environments, such as silent/soft errors, outliers or corrupted data at certain nodes, and adversarial responses. Our estimator builds upon and matches the performance of a recently proposed non-robust estimator up to an additive Õ(σ√α) error, where σ² is the variance of the existing estimator and α is the fraction of corrupted nodes.
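The sketch below illustrates the general flavor of robust distributed eigenspace estimation, not the authors' estimator: each node reports a structurally valid top-k projector, the server aggregates with an entrywise median so that a small corrupted fraction cannot dominate, and a k-dimensional subspace is re-extracted:

```python
# Illustrative robust aggregation of local eigenspace projectors.
import numpy as np

def local_projector(X, k):
    _, _, Vt = np.linalg.svd(X, full_matrices=False)
    V = Vt[:k].T
    return V @ V.T                        # structurally valid response

rng = np.random.default_rng(0)
d, k, m = 20, 3, 10
basis = rng.standard_normal((d, k))
nodes = [rng.standard_normal((200, k)) @ basis.T
         + 0.1 * rng.standard_normal((200, d)) for _ in range(m)]
projs = [local_projector(X, k) for X in nodes]
projs[0] = rng.standard_normal((d, d))    # one node replies arbitrarily

agg = np.median(np.stack(projs), axis=0)  # robust entrywise aggregation
agg = (agg + agg.T) / 2                   # symmetrize before eigendecomposition
vals, vecs = np.linalg.eigh(agg)
estimate = vecs[:, -k:]                   # top-k eigenvectors of the aggregate
print(estimate.shape)
```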
ISBN (print): 9798350343557
Modern distributed networks such as smartphones, wearable devices, and self-driving vehicles generate a wealth of data. As the computation, storage, and battery capabilities of these devices grow, local data storage and processing become easier and more secure. This has led to growing interest in federated learning, which trains deep learning models while keeping the training data decentralized. During training, the performance of deep learning models can be improved by filtering redundant, malicious, and abnormal samples with data valuation methods. In this work, the aim is to improve federated learning models by integrating data valuation methods into the training process. For this, a two-layer convolutional neural network is trained on the MNIST dataset inside a small federated learning network. The Shapley- and leave-one-out-based data valuation methods developed in this study yielded an 8.81% accuracy improvement in experiments on the MNIST dataset with corrupted labels.
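As an illustration of leave-one-out valuation of client updates in one federated round, the toy sketch below scores a linear model in place of the paper's two-layer MNIST CNN; all names, data, and the scoring rule are stand-ins:

```python
# Toy leave-one-out (LOO) valuation of federated client updates.
import numpy as np

def score(w, X, y):
    """Accuracy of a linear classifier as a cheap validation score."""
    return np.mean((X @ w > 0).astype(int) == y)

rng = np.random.default_rng(1)
X_val = rng.standard_normal((200, 10))
y_val = (X_val @ np.ones(10) > 0).astype(int)

updates = [np.ones(10) + 0.3 * rng.standard_normal(10) for _ in range(5)]
updates[2] = -np.ones(10)                      # a malicious/abnormal client

full = score(np.mean(updates, axis=0), X_val, y_val)
for i in range(len(updates)):
    rest = [u for j, u in enumerate(updates) if j != i]
    loo = score(np.mean(rest, axis=0), X_val, y_val)
    print(f"client {i}: LOO value = {full - loo:+.3f}")  # negative => harmful
```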
ISBN (print): 9781713871088
We study distributed learning of nonparametric conditional quantiles with Tikhonov regularization in a reproducing kernel Hilbert space (RKHS). Although distributed parametric quantile regression has been investigated in several existing works, the nonparametric quantile setting poses different challenges and is still unexplored. The difficulty lies in the elusiveness of an explicit bias-variance decomposition in the quantile RKHS setting, unlike in regularized least squares regression. For the simple divide-and-conquer approach that partitions the data set into multiple parts and then takes an arithmetic average of the individual outputs, we establish risk bounds using a novel second-order empirical process for the quantile risk.
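A minimal sketch of this divide-and-conquer scheme, assuming an RBF kernel and subgradient descent on the pinball (quantile) loss with Tikhonov regularization; the kernel, optimizer, and hyperparameters are illustrative choices, not the paper's analysis setup:

```python
# Divide-and-conquer kernel quantile regression with an arithmetic average.
import numpy as np

def rbf(A, B, gamma=2.0):
    d = ((A[:, None, :] - B[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d)

def fit_kqr(X, y, tau=0.5, lam=1e-3, lr=0.1, steps=500):
    K, a = rbf(X, X), np.zeros(len(X))
    for _ in range(steps):
        r = y - K @ a
        g = np.where(r > 0, -tau, 1 - tau)     # pinball subgradient w.r.t. f
        a -= lr * (K @ g / len(X) + lam * K @ a)
    return lambda Z, X=X, a=a: rbf(Z, X) @ a

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, (400, 1))
y = np.sin(3 * X[:, 0]) + 0.3 * rng.standard_normal(400)

parts = np.array_split(rng.permutation(400), 4)      # divide into 4 parts
models = [fit_kqr(X[p], y[p], tau=0.9) for p in parts]
Z = np.linspace(-1, 1, 5)[:, None]
pred = np.mean([m(Z) for m in models], axis=0)       # arithmetic average
print(pred)
```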
ISBN (print): 9781950737901
One way to reduce network traffic in multi-node data-parallel stochastic gradient descent is to exchange only the largest gradients. However, doing so damages the gradient and degrades the model's performance. Transformer models degrade dramatically, while the impact on RNNs is smaller. We restore gradient quality by combining the compressed global gradient with the node's locally computed uncompressed gradient. Neural machine translation experiments show that Transformer convergence is restored while RNNs converge faster. With our method, training on 4 nodes converges up to 1.5x as fast as with uncompressed gradients and scales 3.5x relative to single-node training.
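The core idea can be sketched as follows: exchange only the top-k gradient entries, then fill the unsent coordinates of the averaged sparse gradient with the node's own dense local gradient. The sizes and the blending rule below are illustrative assumptions, not the paper's exact recipe:

```python
# Top-k gradient sparsification plus local restoration of unsent entries.
import numpy as np

def top_k(g, k):
    s = np.zeros_like(g)
    idx = np.argpartition(np.abs(g), -k)[-k:]   # largest-magnitude entries
    s[idx] = g[idx]
    return s

rng = np.random.default_rng(0)
local = [rng.standard_normal(1000) for _ in range(4)]        # dense per-node grads
sparse_avg = np.mean([top_k(g, 10) for g in local], axis=0)  # what the wire carries

node = 0
mask = sparse_avg != 0
combined = np.where(mask, sparse_avg, local[node])  # fill gaps with local grad
print("nonzero exchanged:", mask.sum(), "of", combined.size)
```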
ISBN (print): 9798350359329; 9798350359312
With advancements in distributed communications for the Internet of Vehicles (IoV), security threats pose significant challenges. While current IoV intrusion detection systems demonstrate high accuracy, they rely heavily on private or easily forged data. Moreover, the training process incurs increased network communication costs, fails to protect user privacy, and distorted data degrades detection performance. To address these limitations, we propose a distributed federated-learning-based intrusion detection model for IoV using non-private behavior features. First, we design a data processing algorithm that groups and slices IoV communication messages into time series. Then, behavior vectors are extracted using an attention-based time series model designed in this work, and attacks are detected by spatially transforming the residuals with a neural network. Finally, we use a federated learning algorithm for data processing and model training, effectively reducing the communication burden and protecting private training data on the vehicle side. Extensive experiments on two datasets validate the proposed model, achieving F1 scores of 91.66% and 90.25%, respectively, outperforming state-of-the-art methods. We publicly release the model and algorithms to improve the reproducibility and accessibility of effective IoV intrusion detection solutions.
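The first step, grouping and slicing communication messages into time series, might look like the following sketch; the field names, grouping key, and window length are assumptions, since the message format is not specified in the abstract:

```python
# Hypothetical grouping and slicing of vehicle network messages into windows.
from collections import defaultdict

def slice_messages(messages, window=8):
    """messages: list of (timestamp, msg_id, payload) tuples."""
    groups = defaultdict(list)
    for ts, msg_id, payload in sorted(messages):   # order by timestamp
        groups[msg_id].append(payload)
    windows = []
    for msg_id, seq in groups.items():
        for i in range(0, len(seq) - window + 1, window):
            windows.append((msg_id, seq[i:i + window]))  # fixed-length slice
    return windows

msgs = [(t, t % 3, float(t * 7 % 11)) for t in range(50)]  # toy traffic
print(len(slice_messages(msgs)), "windows")
```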
ISBN (print): 0780302273
A simulator for neural network models is presented which is based on the X Window System programming environment and allows interactive construction of many different network models. The visualization of net topologies and parameters is handled by graphic workstations. Furthermore, the simulator can use a multiprocessing system to perform time-consuming net control algorithms, which have been specially designed to use the underlying hardware efficiently. Several tools are being implemented for controlling the distribution of computational tasks, performed on a neural network, among the available processors.