ISBN (print): 9781450397339
As the field of deep learning progresses and neural networks become larger, training them has become a demanding and time-consuming task. To tackle this problem, distributed deep learning must be used to scale the training of deep neural networks to many workers. Synchronous algorithms, commonly used for distributing the training, are susceptible to faulty or straggling workers. Asynchronous algorithms do not suffer from the problems of synchronization, but they introduce a new problem known as staleness. Staleness is caused by applying out-of-date gradients, and it can greatly hinder the convergence process. Furthermore, asynchronous algorithms that incorporate momentum often require keeping a separate momentum buffer for each worker, which costs additional memory proportional to the number of workers. We introduce a new asynchronous method, SMEGA2, which requires a single momentum buffer regardless of the number of workers. Our method works in a way that lets us estimate the future position of the parameters, thereby minimizing the staleness effect. We evaluate our method on the CIFAR and ImageNet datasets and show that SMEGA2 outperforms existing methods in terms of final test accuracy while scaling up to as many as 64 asynchronous workers. Open-Source Code: https://***/rafi-cohen/SMEGA2
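The core idea lends itself to a short sketch. Below is a minimal, hypothetical parameter-server loop in Python illustrating a single momentum buffer shared by all workers, together with a momentum-based lookahead that estimates the parameters' future position; the class name, the lookahead rule, and all hyperparameters are assumptions for illustration, not the authors' implementation.

```python
import numpy as np

class Smega2LikeServer:
    """Minimal sketch of an asynchronous parameter server with a single
    shared momentum buffer. Names, the lookahead rule, and hyperparameters
    are illustrative assumptions, not the authors' implementation."""

    def __init__(self, params, lr=0.1, momentum=0.9, num_workers=8):
        self.params = params.astype(np.float64)
        self.velocity = np.zeros_like(self.params)  # one buffer for all workers
        self.lr = lr
        self.momentum = momentum
        self.num_workers = num_workers

    def pull(self):
        # Hand the worker an estimated *future* position of the parameters:
        # extrapolate the shared momentum over the updates expected to land
        # from the other in-flight workers, to counteract staleness.
        horizon = sum(self.momentum ** k for k in range(1, self.num_workers))
        return self.params + horizon * self.velocity

    def push(self, grad):
        # Fold a (possibly stale) worker gradient into the shared buffer.
        self.velocity = self.momentum * self.velocity - self.lr * grad
        self.params = self.params + self.velocity
```

Because the buffer is shared, the extra memory cost is one velocity vector in total, rather than one per worker.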
ISBN (print): 9781665405409
A distributed array consisting of multiple subarrays is attractive for high-resolution direction-of-arrival (DOA) estimation when a large-scale array is infeasible. To achieve effective distributed DOA estimation, the information observed at the subarrays must be transmitted to the fusion center, where DOA estimation is performed. For noncoherent data fusion, covariance matrices are used for subarray fusion. To address the complexity associated with a large array size, we propose a compression framework consisting of multiple parallel encoders and a classifier. The parallel encoders at the distributed subarrays are trained to compress their respective covariance matrices. The compressed results are sent to the fusion center, where the signal DOAs are estimated by a classifier operating on the compressed covariance matrices.
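As a sketch of this pipeline, the following hypothetical PyTorch modules pair one encoder per subarray with a fusion-center classifier over a discretized DOA grid; the layer sizes, names, and code dimension are assumptions, not the paper's architecture.

```python
import torch
import torch.nn as nn

class SubarrayEncoder(nn.Module):
    """Encoder kept at one subarray: compresses that subarray's complex
    covariance matrix into a short code (illustrative layer sizes)."""
    def __init__(self, n_sensors: int, code_dim: int):
        super().__init__()
        in_dim = 2 * n_sensors * n_sensors  # real and imaginary parts
        self.net = nn.Sequential(nn.Linear(in_dim, 128), nn.ReLU(),
                                 nn.Linear(128, code_dim))

    def forward(self, cov):  # cov: (batch, n_sensors, n_sensors), complex
        x = torch.cat([cov.real, cov.imag], dim=-1).flatten(1)
        return self.net(x)

class FusionClassifier(nn.Module):
    """Fusion-center classifier over the concatenated subarray codes;
    each output class corresponds to one cell of a DOA grid."""
    def __init__(self, n_subarrays: int, code_dim: int, n_grid: int):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(n_subarrays * code_dim, 256),
                                 nn.ReLU(), nn.Linear(256, n_grid))

    def forward(self, codes):  # codes: list of (batch, code_dim) tensors
        return self.net(torch.cat(codes, dim=-1))
```

Only the short codes cross the network, so the per-subarray uplink cost scales with the code dimension rather than with the covariance matrix size.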
Drowsiness Detection (DD) is the process of identifying signs of drowsiness in individuals, especially in critical situations such as driving, operating heavy machinery, or piloting aircraft. Hybrid Bi-directional Long...
Long-term memory (LTM) is generally considered the unlimited and permanent storage of information in the human brain. This concept has spurred extensive research on associative memory models, such as ...
ISBN (print): 9781728112466
In this paper, we present the design, implementation, and evaluation of a novel predictive control framework for reliable distributed stream data processing, which features a Deep Recurrent Neural Network (DRNN) model for performance prediction and dynamic grouping for flexible control. Specifically, we present a novel DRNN model that makes accurate performance predictions from multi-level runtime statistics, carefully accounting for interference from co-located worker processes. Moreover, we design a new grouping method, dynamic grouping, which can distribute or re-distribute data tuples to downstream tasks according to any given split ratio on the fly, so it can be used to re-direct data tuples to bypass misbehaving workers. We implemented the proposed framework on Storm, a widely used Distributed Stream Data Processing System (DSDPS). For validation and performance evaluation, we developed two representative stream data processing applications: Windowed URL Count and Continuous Queries. Extensive experimental results show that: 1) the proposed DRNN model outperforms widely used baselines, ARIMA and SVR, in terms of prediction accuracy; 2) dynamic grouping works as expected; and 3) the proposed framework enhances reliability, incurring only minor performance degradation in the presence of misbehaving workers.
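The dynamic-grouping idea can be sketched as a weighted router whose split ratio can be changed at runtime. The interface below is hypothetical (it is not Storm's actual grouping API); it only illustrates how a controller could steer tuples away from a misbehaving downstream task.

```python
import random

class DynamicGrouping:
    """Sketch of dynamic grouping: routes tuples to downstream tasks
    according to a split ratio that can be changed on the fly.
    Hypothetical interface, not Storm's grouping API."""

    def __init__(self, task_ids):
        self.task_ids = list(task_ids)
        self.weights = [1.0] * len(self.task_ids)  # even split initially

    def set_split_ratio(self, weights):
        # Controller call: setting a misbehaving worker's weight to 0
        # re-directs its share of tuples to the remaining tasks.
        assert len(weights) == len(self.task_ids) and sum(weights) > 0
        self.weights = list(weights)

    def route(self, data_tuple):
        # Weighted random choice realizes the split ratio in expectation.
        return random.choices(self.task_ids, weights=self.weights, k=1)[0]

grouping = DynamicGrouping(task_ids=[0, 1, 2])
grouping.set_split_ratio([0.5, 0.5, 0.0])  # bypass misbehaving task 2
target = grouping.route({"url": "example.com"})
```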
Spiking neural network (SNN) simulation is important for studying brain function and validating neuroscience hypotheses, and it can also be used in artificial intelligence. GPU-based simulators have been developed to support real-time simulation of SNNs, but their simulation performance and scale are severely limited due to the random memory access pattern and the global communication between nodes. To address these problems, we propose SWsnn, an efficient distributed heterogeneous SNN simulator based on the Sunway accelerators (including SW26010 and SW26010pro), which supports accurate simulation with a small time step (1/16 ms), random delay sizes for synapses, and larger-scale network simulation. Compared with existing GPUs, the Local Dynamic Memory (LDM) in Sunway (similar to a cache) is much bigger (4 MB or 16 MB in each core group). To improve simulation performance, we redesign the network data storage structure and the synaptic plasticity flow so that most random accesses occur in the LDM. SWsnn hides Message Passing Interface (MPI)-related operations to reduce communication costs by separating them from the general SNN computation. Moreover, SWsnn relies on the parallel Compute Processing Elements (CPEs) rather than the serial Management Processing Element (MPE) to control the communication buffers, using Register-Level Communication (RLC) and Direct Memory Access (DMA). In addition, SWsnn is further optimized using vectorization and DMA-hiding techniques. Results show that SWsnn runs 1.4-2.2 times faster than the state-of-the-art GPU-based SNN simulator GeNN (GPU-enhanced Neuronal Networks) and supports much larger-scale real-time simulation.
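The scratchpad-friendly layout can be illustrated with a simple delay ring that bins spike events by arrival slot, so each time step drains one small contiguous buffer of exactly the kind that can live in a cache or LDM. This is a plain-Python sketch of the general idea, not SWsnn's Sunway data structures; all names are illustrative.

```python
import numpy as np

class DelayRing:
    """Sketch of a delay-ring event queue for SNN simulation: synaptic
    events are binned by arrival slot so each time step touches one small
    contiguous buffer. Illustrative only, not SWsnn's data layout."""

    def __init__(self, num_neurons, max_delay):
        self.num_neurons = num_neurons
        self.max_delay = max_delay
        self.slots = [[] for _ in range(max_delay)]

    def push(self, t, post, weight, delay):
        # Queue a synaptic event for delivery `delay` steps in the future.
        self.slots[(t + delay) % self.max_delay].append((post, weight))

    def deliver(self, t):
        # Drain only the slot that is due at time t.
        current = np.zeros(self.num_neurons)
        slot = self.slots[t % self.max_delay]
        for post, w in slot:
            current[post] += w
        slot.clear()
        return current
```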
Scalable data management is essential for processing large scientific datasets on HPC platforms for distributed deep learning. In-memory distributed storage is preferred for its speed, enabling rapid, random, and frequ...
ISBN (print): 9789819648559
Wide dielectric barrier discharge (DBD) has broad application prospects for the modification of insulating materials, but electrode aging directly affects the modification effect during application. As the size of the DBD device increases, real-time evaluation of its modification effect becomes more complicated. Therefore, this paper proposes a real-time prediction and evaluation method for the modification effect of wide-DBD-treated insulating materials based on distributed current measurement and neural network models. Operating parameters such as the DBD excitation voltage amplitude, repetition frequency, discharge working-gas flow rate, and reaction-medium flow rate were varied. The discharge current at different positions was measured with a self-made current coil, and the water contact angle and flashover voltage at the corresponding positions were tested experimentally as the evaluation criteria for the modification effect. Features of the distributed current are extracted by manual and image-recognition methods, and prediction and evaluation models are established with a BP neural network and a convolutional neural network (CNN), respectively; the accuracy and generalization ability of the two models are compared. The results show that the CNN model based on image recognition achieves higher accuracy and generalization ability than the BP neural network model based on manual feature extraction when predicting the water contact angle and flashover voltage of the material surface. Compared with the BP neural network, the CNN model reduces the mean absolute error (MAE) by 41.3% and the root mean square error (RMSE) by 36.1% when predicting the water contact angle, and reduces the MAE by 47.7% and the RMSE by 40.2% for the flashover voltage. Experimental results at different processing distances are used to examine the generalization ability of the two models, and the results show that...
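The image-based path can be sketched as a small CNN regressor mapping an image of the distributed current waveform to a scalar target such as water contact angle. The architecture below is an assumption for illustration, not the paper's model; MAE and RMSE on a held-out set then follow as `(pred - y).abs().mean()` and `((pred - y) ** 2).mean().sqrt()`.

```python
import torch
import torch.nn as nn

class CurrentImageCNN(nn.Module):
    """Sketch of a CNN regressor from a discharge-current waveform image
    to a modification metric (e.g., water contact angle). Illustrative
    layer sizes, not the paper's architecture."""
    def __init__(self):
        super().__init__()
        self.features = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(), nn.MaxPool2d(2),
            nn.Conv2d(16, 32, 3, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(4))              # -> (batch, 32, 4, 4)
        self.head = nn.Sequential(nn.Flatten(), nn.Linear(32 * 16, 64),
                                  nn.ReLU(), nn.Linear(64, 1))

    def forward(self, x):  # x: (batch, 1, H, W) current-waveform image
        return self.head(self.features(x))
```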
The continued development of neural network architectures drives ever-increasing demand for computing power. While data center scaling continues, inference away from the cloud will increasingly rely on distributed inferen...
ISBN (print): 9798400717932
Graph neural networks (GNNs) are a computationally efficient method to learn embeddings and classifications on graph data. However, GNN training has low computational intensity, making communication costs the bottleneck for scalability. Sparse-matrix dense-matrix multiplication (SpMM) is the core computational operation in full-graph training of GNNs. Previous work parallelizing this operation focused on sparsity-oblivious algorithms, in which matrix elements are communicated regardless of the sparsity pattern. This leads to a predictable communication pattern that can be overlapped with computation and enables the use of collective communication operations, at the expense of wasting significant bandwidth on unnecessary data. We develop sparsity-aware algorithms that tackle the communication bottlenecks in GNN training with three novel approaches. First, we communicate only the necessary matrix elements. Second, we utilize a graph partitioning model to reorder the matrix and drastically reduce the number of communicated elements. Finally, we address the high load imbalance in communication with a tailored partitioning model that minimizes both the total communication volume and the maximum sending volume. We further couple these sparsity-exploiting approaches with a communication-avoiding approach (1.5D parallel SpMM) in which submatrices are replicated to reduce communication. We explore the tradeoffs of these combined optimizations and show up to a 14x improvement on 256 GPUs; on some instances, communication is reduced to almost zero, yielding effectively communication-free parallel training relative to a popular GNN framework based on sparsity-oblivious SpMM.
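The first approach, communicating only the necessary elements, can be demonstrated in a few lines: from the local sparse block's column indices, each rank derives exactly which remote rows of the dense matrix it needs, instead of shipping whole row blocks. The SciPy sketch below simulates two ranks in one process; the names, sizes, and two-rank layout are illustrative assumptions, not the paper's implementation.

```python
import numpy as np
from scipy.sparse import random as sprand

def needed_remote_rows(block, lo, hi):
    """Columns the local sparse block actually references that fall
    outside this rank's own row range [lo, hi)."""
    cols = np.unique(block.tocsr().indices)
    return cols[(cols < lo) | (cols >= hi)]

# Two "ranks", each owning half the rows of sparse A and dense H.
n, f = 8, 4
A = sprand(n, n, density=0.2, format="csr", random_state=0)
H = np.random.rand(n, f)
for rank, (lo, hi) in enumerate([(0, n // 2), (n // 2, n)]):
    block = A[lo:hi, :]
    remote = needed_remote_rows(block, lo, hi)
    # A sparsity-oblivious algorithm would ship all n - (hi - lo) remote
    # rows of H to this rank; only `remote` rows are actually required.
    print(f"rank {rank}: needs {len(remote)} of {n - (hi - lo)} remote rows")
    Y_local = block @ H  # with the needed rows gathered, a local SpMM
```

In a real distributed run the `remote` index sets would drive point-to-point exchanges (e.g., over MPI) rather than reads from a shared array.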