Accurate quantitative characterization of crack length, location, and orientation is critical for the safety assessment of load-bearing structures to avoid catastrophic structural failures. Ultrasound non-destructive evaluation is one of the key methods for detecting and evaluating embedded flaws inside a material during fabrication or operation. Although significant progress has been made in developing advanced ultrasound sensors and signal and data processing methods, current practice relies on human expertise to evaluate the ultrasound measurements, which leads to high uncertainty and errors in the predictions. Here we demonstrate that the ultrasound time signal reflected from an embedded crack contains complete information about the key characteristics of the crack, which can be accurately quantified using an optimally trained machine learning model. A lack of sufficiently large, well-distributed, and suitably labeled datasets for training machine learning models continues to be a significant obstacle to evaluating non-visible cracks. To overcome this limitation, we demonstrate that our finite element simulation-trained convolutional neural network (CNN) is able to accurately predict all three crack characteristics from experimentally measured ultrasound non-destructive test signals. We created a moderate-size A-scan time signal simulation dataset (1200 scans) for three-dimensional (3D) elliptical penny-shaped cracks inside rectangular cuboid steel to train our CNN. Independent validation experiments were performed by conducting 21 ultrasound tests on 3D-printed steel specimens containing a variety of embedded crack geometries. We show that our purely finite element simulation-trained CNN accurately predicts crack length, crack location, and crack orientation from experimentally measured signals, with average errors of 5.7%, 5.6%, and 8.4% for length, location, and orientation, respectively. This approach of utilizing simulation-based training of a neural network can be used in other...
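A minimal sketch of how such a simulation-trained regression network could look, assuming a PyTorch implementation, a 1024-sample A-scan input, and three normalized regression targets (length, location, orientation). The class name, layer sizes, and training loop are illustrative assumptions, not the architecture reported in the paper.

    # Hypothetical 1D CNN that regresses crack length, location, and orientation
    # from a single A-scan time signal (sizes are placeholders, not the paper's).
    import torch
    import torch.nn as nn

    class AScanCNN(nn.Module):
        def __init__(self, signal_len=1024):
            super().__init__()
            self.features = nn.Sequential(
                nn.Conv1d(1, 16, kernel_size=15, padding=7), nn.ReLU(), nn.MaxPool1d(4),
                nn.Conv1d(16, 32, kernel_size=9, padding=4), nn.ReLU(), nn.MaxPool1d(4),
                nn.Conv1d(32, 64, kernel_size=5, padding=2), nn.ReLU(), nn.AdaptiveAvgPool1d(8),
            )
            self.head = nn.Sequential(
                nn.Flatten(), nn.Linear(64 * 8, 128), nn.ReLU(),
                nn.Linear(128, 3),  # [length, location, orientation]
            )

        def forward(self, x):  # x: (batch, 1, signal_len)
            return self.head(self.features(x))

    model = AScanCNN()
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()

    # One training step on a placeholder batch standing in for FE-simulated A-scans.
    signals = torch.randn(16, 1, 1024)
    targets = torch.randn(16, 3)  # normalized crack length / location / orientation
    opt.zero_grad()
    loss_fn(model(signals), targets).backward()
    opt.step()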
Graph embedding training models access parameters sparsely, in a "one-hot" manner. Currently, distributed graph embedding neural networks are trained with data parallelism using a parameter server, which suffers from significant performance and scalability problems. In this article, we analyze the problems and characteristics of training this kind of model on distributed GPU clusters for the first time, and find that fixed model parameters scattered among different machine nodes are a major limiting factor for efficiency. Based on this observation, we develop an efficient distributed graph embedding system called EDGES, which can utilize GPU clusters to train large graph models with billions of nodes and trillions of edges using data and model parallelism. Within the system, we propose a novel dynamic partition architecture for training these models, reducing communication by at least half compared to existing training systems. According to our evaluations on real-world networks, our system delivers competitive accuracy for the trained embeddings and significantly accelerates the training of graph node embedding neural networks, achieving speedups of 7.23x and 18.6x over the fastest existing training system on a single node and on multiple nodes, respectively. As for scalability, our experiments show that EDGES obtains a nearly linear speedup.
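The "one-hot" access pattern mentioned above can be pictured with a toy skip-gram-style update in which one SGD step only reads and writes the embedding rows of the sampled source, target, and negative nodes; partitioning those rows across machines, rather than pinning them, is what a dynamic partition scheme exploits. All names, sizes, and the objective below are illustrative assumptions, not the EDGES implementation.

    # Toy sparse embedding update: only the rows of the sampled nodes are touched.
    import numpy as np

    num_nodes, dim, lr = 100_000, 128, 0.025
    emb_in = np.random.uniform(-0.5, 0.5, (num_nodes, dim)).astype(np.float32)
    emb_out = np.zeros((num_nodes, dim), dtype=np.float32)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def sgd_step(src, dst, negatives):
        """One skip-gram-with-negative-sampling step for a single (src, dst) edge."""
        grad_src = np.zeros(dim, dtype=np.float32)
        for node, label in [(dst, 1.0)] + [(n, 0.0) for n in negatives]:
            score = sigmoid(emb_in[src] @ emb_out[node])
            g = lr * (label - score)
            grad_src += g * emb_out[node]
            emb_out[node] += g * emb_in[src]  # sparse write: a single row
        emb_in[src] += grad_src               # sparse write: a single row

    sgd_step(src=42, dst=17, negatives=[5, 993, 4407])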
Deep learning has been one of the trendiest research topics. However, as data quantities rise exponentially, training large neural networks with billions of parameters can become prohibitively expensive. Fortunately, recent research has discovered that not all of the computations in traditional network training are necessary: by selectively sparsifying the majority of the neurons during training, we can still obtain acceptable accuracy. SLIDE, a C++ OpenMP-based sub-linear deep learning engine, was developed in this context. SLIDE uses locality sensitive hashing (LSH) to query neurons with high activation in sub-linear time. It achieves a remarkable speedup in training large fully-connected networks by exploiting network sparsity as well as multi-core parallelism. However, SLIDE is limited to CPUs, ignoring the popular GPU devices with greater parallel potential and computational capability. In this article, we propose G-SLIDE, a GPU-based sub-linear deep learning engine, which combines the benefits of SLIDE's adaptive sparsification algorithms with GPUs' high performance. The main challenges in developing G-SLIDE are efficiently using LSH to sparsify networks and training the resulting special sparse neural networks on the GPU. To address these challenges, we propose several novel solutions, such as specific data formats and appropriate workload partitioning for threads, to fully utilize the GPU resources. We evaluate G-SLIDE on two extremely sparse datasets with a 2080 Ti GPU, and the results demonstrate that for one training epoch, G-SLIDE achieves more than a 16.4x speedup over SLIDE on a 32-core/64-thread CPU. Furthermore, on the same platform, G-SLIDE achieves an average speedup of 16.2x over TensorFlow-GPU and 30.8x over TensorFlow-CPU.
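A rough sketch of the LSH idea behind this kind of sparsification, assuming a single SimHash table built over a layer's weight vectors: only the neurons hashed to the same bucket as the input are evaluated. SLIDE and G-SLIDE use several tables, union their buckets, and rehash periodically; the hash family, table size, and names here are placeholders rather than their actual data structures.

    # Select "active" neurons by hashing the input into the same SimHash bucket
    # as the neurons' weight vectors, then compute only those neurons.
    import numpy as np

    rng = np.random.default_rng(0)
    in_dim, num_neurons, num_bits = 256, 4096, 12

    weights = rng.standard_normal((num_neurons, in_dim)).astype(np.float32)
    planes = rng.standard_normal((num_bits, in_dim)).astype(np.float32)  # SimHash projections

    def simhash(v):
        """Bucket id from the signs of random projections."""
        bits = (planes @ v) > 0
        return int(bits.dot(1 << np.arange(num_bits)))

    # Hash table: bucket id -> neuron ids (rebuilt whenever weights drift too far).
    table = {}
    for j in range(num_neurons):
        table.setdefault(simhash(weights[j]), []).append(j)

    def forward_sparse(x):
        """ReLU outputs for the active neurons only; the rest stay zero."""
        active = table.get(simhash(x), [])
        out = np.zeros(num_neurons, dtype=np.float32)
        if active:
            out[active] = np.maximum(weights[active] @ x, 0.0)
        return out, active

    y, active_ids = forward_sparse(rng.standard_normal(in_dim).astype(np.float32))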
In many applications using wireless sensor networks, the reliability of the monitored data is crucial for analyzing situations and making decisions. Compressed sensing methods are effective in ensuring the durability of a wireless s...
ISBN (digital): 9798331509712
ISBN (print): 9798331509729
Fully connected neural networks (FCNNs) are widely used in image recognition and natural language processing. However, the time cost of training them on large datasets is high. Optical network-on-chip (ONoC) has been proposed to accelerate the parallel computation of FCNNs because of its advantages. Therefore, this paper proposes an accelerated FCNN model based on ONoC. We first design an FCNN-aware mapping strategy, and then propose a group-based inter-core communication scheme with low wavelength requirements according to the distribution of the mapped cores. The optimal number of cores in each period is obtained by balancing communication time against computation time. The simulation results show that the proposed scheme has the advantages of low wavelength requirements, short training time, and good scalability.
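The communication/computation trade-off can be made concrete with a toy cost model in which computation time shrinks as 1/p with the number of cores p while inter-core communication grows with p; the functional form and all constants below are placeholder assumptions, not the paper's analytical model.

    # Pick the core count that minimizes total per-period time under an assumed cost model.
    def total_time(p, work=1.0e9, flops_per_core=2.0e8, comm_per_core=1.5e-3, base_comm=2.0e-3):
        compute = work / (p * flops_per_core)  # computation shrinks with more cores
        comm = base_comm + comm_per_core * p   # communication grows with more cores
        return compute + comm

    best_p = min(range(1, 257), key=total_time)
    print(best_p, total_time(best_p))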
Graph networks are naturally suitable for modeling the multi-channel features of EEG signals. However, existing studies that attempt to utilize graph-based neural networks for EEG-based emotion recognition do not take into account the spatio-temporal redundancy of EEG features or differences in brain topology. In this paper, we propose EEG-GCN, a paradigm that adopts spatio-temporal and self-adaptive graph convolutional networks for single- and multi-view EEG-based emotion recognition. With a spatio-temporal attention mechanism, EEG-GCN can adaptively capture significant sequential segments and spatial location information in EEG signals. Meanwhile, a self-adaptive brain network adjacency matrix is designed to quantify the connection strength between channels, thereby representing the diverse activation patterns under different emotion scenarios. Additionally, we propose a multi-view EEG-based emotion recognition method, which effectively integrates the diverse features of EEG signals. Extensive experiments conducted on two benchmark datasets, SEED and DEAP, demonstrate that our proposed method outperforms other representative methods from both single and multiple views.
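A minimal sketch of the self-adaptive adjacency idea, assuming a PyTorch graph convolution over a 62-channel EEG montage in which the adjacency matrix is itself a trainable parameter, so connection strengths between electrodes are learned from data. The normalization, feature sizes, and names are illustrative assumptions rather than the EEG-GCN architecture.

    # Graph convolution with a learnable ("self-adaptive") adjacency over EEG channels.
    import torch
    import torch.nn as nn

    class AdaptiveGraphConv(nn.Module):
        def __init__(self, num_channels=62, in_feats=5, out_feats=32):
            super().__init__()
            self.adj = nn.Parameter(torch.randn(num_channels, num_channels) * 0.01)
            self.lin = nn.Linear(in_feats, out_feats)

        def forward(self, x):                    # x: (batch, channels, in_feats)
            a = torch.softmax(self.adj, dim=-1)  # row-normalized learned adjacency
            return torch.relu(a @ self.lin(x))   # aggregate across channels, then nonlinearity

    layer = AdaptiveGraphConv()
    x = torch.randn(8, 62, 5)                    # e.g. per-band features for 62 electrodes
    h = layer(x)                                 # (8, 62, 32)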
In vehicular ad hoc networks, few existing works on task offloading focus on co-offloading at the intra-vehicle and inter-vehicle levels for deep neural network (DNN) inference. Moreover, they ignore the decentralize...
Understanding how neurons behave when they are organized in interacting networks is key to understanding how the brain performs complex functions. Different models that approximate the behavior of interconnected neurons have been proposed in the literature. Implementing these models to simulate neuron behavior at an appropriately detailed level to observe collective phenomena is computationally intensive. In this study we analyze the coupled Leaky Integrate-and-Fire model and report on the issues that affect performance when the model is implemented on a GPU. We conclude that the problem is heavily memory-bound: advances in memory technology at the hardware level seem to be the deciding factor for achieving better performance on the GPU. Our results show that using an NVIDIA K40 GPU, a modest 2x speedup can be achieved compared to a parallel implementation running on a modern multi-core CPU. However, a substantial speedup of 11.1x can be achieved using an NVIDIA V100 GPU, mainly due to the improvements in its memory subsystem. (C) 2022 Elsevier Inc. All rights reserved.
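One Euler step of a coupled leaky integrate-and-fire network, sketched in NumPy with assumed parameter values and dense coupling, makes the memory-bound character visible: each step streams the full coupling matrix and state vectors while performing only a few arithmetic operations per value loaded.

    # One simulation step of a coupled LIF network (parameters are illustrative).
    import numpy as np

    n, dt = 2000, 0.1                                       # neurons, time step in ms
    tau, v_rest, v_thresh, v_reset = 20.0, -65.0, -50.0, -65.0
    rng = np.random.default_rng(1)
    w = rng.normal(0.0, 0.1, (n, n)).astype(np.float32)     # coupling weights
    v = np.full(n, v_rest, dtype=np.float32)                # membrane potentials
    i_ext = rng.uniform(0.0, 2.0, n).astype(np.float32)     # external drive

    def step(v, spiked_prev):
        syn = w @ spiked_prev.astype(np.float32)            # input from last step's spikes
        v = v + (-(v - v_rest) + i_ext + syn) * (dt / tau)  # leaky integration
        spiked = v >= v_thresh
        v[spiked] = v_reset                                 # reset neurons that fired
        return v, spiked

    spiked = np.zeros(n, dtype=bool)
    for _ in range(100):                                    # 10 ms of simulated time
        v, spiked = step(v, spiked)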
In this paper, the distributed form of the zeroing neural network for solving time-varying optimization problems is put forward. Compared with traditional centralized algorithms, distributed algorithms possess better priva...
This article studies the leader-follower cooperative tracking problem for a class of multi-agent systems with unknown nonlinear dynamics. Since the load of a following agent may change throughout the work process, we consider its control coefficient to be time-varying and nonlinear rather than constant, which is more practical. All agents are connected by a weighted, directed communication graph. The followers can have unknown, nonidentical nonlinear dynamics and external disturbances. The nonautonomous leader provides the reference trajectory to only a subset of the followers, while the others can only receive information from their neighbors. To achieve ultimate synchronization of all following agents to the leader, novel cooperative adaptive control protocols are designed based on neural approximation and an adaptive updating mechanism. A novel singularity-avoided adaptive updating law is proposed to estimate the control coefficient and compensate for the unknown dynamics online. Lyapunov theory is used to prove the ultimate boundedness of the synchronization tracking error. The correctness and effectiveness of the presented control scheme are demonstrated by two simulations, in the SISO and MIMO cases, respectively.
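For intuition only, a toy single-agent version of neural-approximation-based adaptive tracking is sketched below: an RBF network estimates the unknown drift online while a standard sigma-modified adaptive law keeps the weight estimate bounded. This is a generic textbook construction under assumed gains and dynamics, not the paper's multi-agent protocol or its singularity-avoided updating law.

    # Adaptive tracking of a reference xd(t) for the plant x_dot = f(x) + u with f unknown.
    import numpy as np

    dt, k, gamma, sigma = 0.001, 5.0, 50.0, 0.01
    centers, width = np.linspace(-2.0, 2.0, 11), 0.5         # RBF centers over the state range

    def phi(x):
        return np.exp(-((x - centers) ** 2) / (2 * width ** 2))

    def f_true(x):                                           # unknown drift, used only to simulate the plant
        return 0.5 * np.sin(2.0 * x) + 0.2 * x

    x, w_hat = 0.5, np.zeros_like(centers)
    for t in np.arange(0.0, 10.0, dt):
        xd, xd_dot = np.sin(t), np.cos(t)                    # reference trajectory and its derivative
        e = x - xd
        u = -k * e - w_hat @ phi(x) + xd_dot                 # feedback + NN compensation + feedforward
        x += (f_true(x) + u) * dt                            # plant integration (Euler)
        w_hat += (gamma * e * phi(x) - sigma * w_hat) * dt   # adaptive law with sigma-modification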