Recently, the data-parallel pipeline approach has been widely used in training DNN models on commodity GPU servers. However, there are still three challenges for hybrid parallelism on commodity GPU servers: i) a balanced model partition is crucial for efficiency, whereas prior works lack a sound solution for generating a balanced partition automatically; ii) an orchestrated device mapping is essential to reduce communication contention, yet prior works ignore server heterogeneity, exacerbating that contention; iii) startup overhead is inevitable and especially significant for deep pipelines, making it a major source of pipeline bubbles that severely limits pipeline scalability. We propose AutoPipe-H to address these three problems. It comprises i) a pipeline partitioner that automatically and quickly generates a balanced sub-block partition scheme; ii) a device mapping component that assigns pipeline stages to devices while accounting for server heterogeneity to reduce communication contention; and iii) a distributed training runtime that reduces pipeline startup overhead by splitting the micro-batch evenly. The experimental results show that AutoPipe-H can accelerate training by up to 1.26x over the hybrid parallelism frameworks DAPPLE and Piper, with a 2.73x-12.7x improvement in partition balance and an order-of-magnitude reduction in partition-scheme search time.
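As an illustration of the balanced-partition problem such a partitioner addresses, the sketch below splits a profile of per-layer compute costs into contiguous pipeline stages so that the slowest stage is as fast as possible. It is a minimal min-max dynamic program over hypothetical layer costs, not AutoPipe-H's actual sub-block partitioning algorithm.

```python
# Minimal sketch (not AutoPipe-H itself): balanced pipeline partitioning as a
# min-max dynamic program. `layer_costs` are hypothetical per-layer compute
# times; the goal is to split consecutive layers into `num_stages` stages so
# that the slowest (bottleneck) stage is as fast as possible.
from functools import lru_cache

def balanced_partition(layer_costs, num_stages):
    n = len(layer_costs)
    prefix = [0.0]
    for c in layer_costs:
        prefix.append(prefix[-1] + c)

    @lru_cache(maxsize=None)
    def best(i, k):
        # Minimal achievable bottleneck cost for layers i..n-1 using k stages.
        if k == 1:
            return prefix[n] - prefix[i]
        result = float("inf")
        for j in range(i + 1, n - k + 2):  # first stage takes layers i..j-1
            stage_cost = prefix[j] - prefix[i]
            result = min(result, max(stage_cost, best(j, k - 1)))
        return result

    return best(0, num_stages)

if __name__ == "__main__":
    costs = [4.0, 2.0, 3.0, 7.0, 1.0, 5.0, 2.0, 6.0]
    print(balanced_partition(costs, num_stages=4))  # bottleneck stage time
```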
Mobile edge computing (MEC) brings storage, cloud computing, and analysis capabilities close to users in 5G communication systems. MEC and deep learning (DL) are combined in 5G networks to enable automated network management that provides resource allocation (RA), energy efficiency (EE), and adaptive security, thereby reducing computational costs and enhancing user services. This study presents a hybrid quantum-classical convolutional neural network (HQCCNN) with a simplicial attention network (SAN) that allocates appropriate resources to the various users in the network. First, the green anaconda optimization (GAO) algorithm is used to optimize the objective function for effective RA. The neural network then receives the optimized objective functions and allocates resources. The proposed HQCCNN-GAO model assesses each user's degree of need and, based on those needs, allots resources to every user in the 5G network while maintaining high throughput and EE. Throughput, latency, mean square error, processing time, bit error rate, and EE are used to measure the proposed model's efficiency, and the results are compared with several existing RA models. The results show that the proposed model provides a low latency of 0.08 s and a high throughput of 790 kbps for a range of network users.
The Hopfield network is an example of an artificial neural network used to implement associative memories. In a traditional Hopfield neural network, a neuron's state is represented by a single binary digit. Inspired by the human brain's ability to cope simultaneously with multiple sensorial inputs, this paper presents three multi-modal Hopfield-type neural networks that treat multi-dimensional data as a single entity. In the first model, called the vector-valued Hopfield neural network, the neuron's state is a vector of binary digits. Synaptic weights are modeled as finite impulse response (FIR) filters in the second model, yielding the so-called convolutional associative memory. Finally, the synaptic weights are modeled by linear time-varying (LTV) filters in the third model. Besides their potential applications in multi-modal intelligence, the new associative memories may also be used for signal and image processing and for solving optimization and classification tasks.
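For reference, the sketch below shows the classical bipolar Hopfield associative memory that these multi-modal models generalize: Hebbian storage followed by iterated sign updates. The stored patterns and noisy probe are illustrative only, not taken from the paper.

```python
# Minimal sketch of the classical bipolar Hopfield associative memory that the
# paper's vector-valued and filter-based models generalize.
import numpy as np

def store(patterns):
    # Hebbian outer-product rule with zeroed self-connections.
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(W, 0.0)
    return W / n

def recall(W, state, steps=10):
    # Synchronous sign updates until a fixed point (or step limit) is reached.
    for _ in range(steps):
        new_state = np.where(W @ state >= 0, 1, -1)
        if np.array_equal(new_state, state):
            break
        state = new_state
    return state

if __name__ == "__main__":
    patterns = np.array([[1, -1, 1, -1, 1, -1], [1, 1, -1, -1, 1, 1]])
    W = store(patterns)
    probe = np.array([1, -1, 1, -1, -1, -1])   # corrupted copy of pattern 0
    print(recall(W, probe))                    # converges back to pattern 0
```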
Artificial neural networks (ANNs) represent a fundamentally connectionist and distributed approach to computing, and as such they differ from classical computers that utilize the von Neumann architecture. This has revived research interest in new unconventional hardware for more efficient ANNs rather than emulating them on traditional machines. To fully leverage ANNs, optimization algorithms must account for hardware limitations and imperfections. Photonics offers a promising platform with scalability, speed, energy efficiency, and parallel processing capabilities. However, fully autonomous optical neural networks (ONNs) with in-situ learning are scarce. In this work, we propose and demonstrate a ternary-weight high-dimensional semiconductor laser-based ONN and introduce a method for achieving ternary weights using Boolean hardware, enhancing the ONN's information processing capabilities. Furthermore, we design an in-situ optimization algorithm that is compatible with both Boolean and ternary weights. Our algorithm yields benefits in both convergence speed and performance. Our experimental results show the ONN's long-term inference stability, with a consistency above 99% for over 10 h. Our work is of particular relevance in the context of in-situ learning under restricted hardware resources, especially since minimizing the power consumption of auxiliary hardware is crucial to preserving the efficiency gains achieved by non-von Neumann ANN implementations.
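One common way to realize ternary weights {-1, 0, +1} with purely Boolean (on/off) elements is as the difference of two Boolean masks applied in separate passes; whether this matches the paper's laser-based scheme is an assumption, and the sketch below only illustrates the arithmetic equivalence.

```python
# Minimal sketch of ternary weights built from Boolean masks: w = w_plus - w_minus.
# Mapping to the paper's laser-based hardware is an assumption.
import numpy as np

def ternary_readout(x, w_plus, w_minus):
    # x: input vector; w_plus / w_minus: Boolean {0,1} masks with disjoint support.
    assert not np.any(w_plus & w_minus), "a weight cannot be +1 and -1 at once"
    return x @ w_plus - x @ w_minus     # equivalent to x @ (w_plus - w_minus)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=8)
    ternary = rng.choice([-1, 0, 1], size=8)
    w_plus = (ternary == 1).astype(int)
    w_minus = (ternary == -1).astype(int)
    print(np.isclose(ternary_readout(x, w_plus, w_minus), x @ ternary))  # True
```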
Edge nodes, which are expected to grow into a multi-billion-dollar market, are essential for detecting a variety of cyber threats on Internet-of-Things endpoints. Adopting current network intrusion detection systems with deep learning models (DLMs) based on FedACNN is constrained by the resource limitations of this network equipment layer. We address this issue by creating a unique, lightweight, fast, and accurate DLM-based edge detection model to identify distributed denial-of-service attacks on edge nodes. Our approach can generate results at a relevant pace even with limited resources, such as low power, memory, and processing capability. The federated convolutional neural network (FedACNN) deep learning method uses attention mechanisms to minimise communication delay. The developed model uses a recent cybersecurity dataset (UNSW 2015) and is deployed on an edge node simulated by a Raspberry Pi. Our findings show that, compared to traditional DLM methodologies, our model retains a high accuracy rate of about 99%, even with reduced CPU and memory use. It is also about three times smaller in volume than the most advanced model while requiring far less testing time.
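The sketch below shows the server-side aggregation step a FedACNN-style setup builds on: clients train locally and the server combines their parameters, here weighted by hypothetical attention scores. The paper's actual attention mechanism is not described in the abstract, so the weighting scheme is only an assumption.

```python
# Minimal sketch of federated aggregation with hypothetical attention weights.
import numpy as np

def aggregate(client_params, attention_scores):
    # client_params: list of dicts {layer_name: ndarray}, one dict per client.
    weights = np.asarray(attention_scores, dtype=float)
    weights = weights / weights.sum()
    aggregated = {}
    for name in client_params[0]:
        aggregated[name] = sum(w * params[name]
                               for w, params in zip(weights, client_params))
    return aggregated

if __name__ == "__main__":
    clients = [{"conv1": np.full((3, 3), float(i))} for i in range(1, 4)]
    print(aggregate(clients, attention_scores=[0.5, 0.3, 0.2])["conv1"][0, 0])
```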
We designed efficient signal processing, implemented with artificial intelligence using a deep neural network, for image monitoring of underwater laser cutting for nuclear power plant dismantling. Monitoring images of underwater laser cutting with intense flames in turbid water are characterized by low visibility, while the pixel values of an image are distributed over the entire dynamic range. The visibility for underwater laser cutting operations was improved by widely stretching the pixel value distribution to the full possible dynamic range after removing excessively dark or bright pixels that lie far from the dominant pixel intensity distribution. Areas of intense flame, where pixel values are close to saturation, are preserved. In addition, an efficiently designed look-up table increases contrast in cutting areas with intense flames, and an image acquisition method using the lowest pixel values in the latest frames reduces intermittent monitoring interference caused by flames erupting in irregular patterns and by flowing bubbles. A deep learning neural network trained with the designed signal processing datasets effectively improved image monitoring performance in underwater laser cutting experiments.
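The two preprocessing ideas described above translate naturally into a few lines of array code; the sketch below shows percentile-based contrast stretching that ignores outlier pixels and a temporal minimum over the latest frames. The percentile cutoffs and window length are illustrative assumptions, not the paper's tuned values.

```python
# Minimal sketch of outlier-aware contrast stretching and a temporal-minimum
# image over recent frames to suppress erupting flames and passing bubbles.
import numpy as np

def stretch_contrast(image, low_pct=1.0, high_pct=99.0):
    # Clip pixels far outside the dominant intensity range, then rescale to 0..255.
    lo, hi = np.percentile(image, [low_pct, high_pct])
    stretched = np.clip((image.astype(float) - lo) / max(hi - lo, 1e-6), 0.0, 1.0)
    return (stretched * 255).astype(np.uint8)

def temporal_minimum(frames):
    # Pixel-wise minimum over the most recent frames; bright, short-lived flame
    # bursts and bubbles rarely persist across the whole window.
    return np.min(np.stack(frames, axis=0), axis=0)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    frames = [rng.integers(0, 256, size=(64, 64), dtype=np.uint8) for _ in range(5)]
    clean = stretch_contrast(temporal_minimum(frames))
    print(clean.shape, clean.dtype)
```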
In computational fluid dynamics (CFD), mesh-smoothing methods are widely used to refine mesh quality for achieving high-precision numerical results. In particular, optimization-based smoothing is used for high-quality mesh smoothing, but it incurs significant computational costs. Prior works have improved its smoothing efficiency by adopting supervised learning to learn smoothing methods from high-quality meshes. However, they pose difficulties in smoothing mesh nodes with varying degrees and require data augmentation to address the node input sequence problem. Moreover, the required labeled high-quality meshes further limit the applicability of the proposed methods. In this paper, we present graph-based smoothing mesh net (GMSNet), a lightweight neural network model for intelligent mesh smoothing. GMSNet adopts graph neural networks (GNNs) to extract features of a node's neighbors and outputs the optimal node position. During smoothing, we also introduce a fault-tolerance mechanism to prevent GMSNet from generating negative-volume elements. As a lightweight model, GMSNet can effectively smooth mesh nodes with varying degrees and remains unaffected by the order of the input data. A novel loss function, MetricLoss, is developed to eliminate the need for high-quality meshes and provides stable and rapid convergence during training. We compare GMSNet with commonly used mesh-smoothing methods on two-dimensional (2D) triangle meshes. The results show that GMSNet achieves outstanding mesh-smoothing performance with 5% of the model parameters of the previous model, while offering a speedup of 13.56 times over optimization-based smoothing.
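To make the idea of a label-free, quality-driven objective concrete, the sketch below scores the triangles around a movable node with a standard area-to-edge-length quality measure and turns the mean quality into a loss. The exact metric behind MetricLoss is not given in the abstract, so this is an illustrative stand-in.

```python
# Minimal sketch of a mesh-quality-driven loss: triangle quality scaled so an
# equilateral triangle scores 1 and degenerate triangles approach 0.
import numpy as np

def triangle_quality(a, b, c):
    a, b, c = (np.asarray(p, dtype=float) for p in (a, b, c))
    cross = (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])
    area = 0.5 * abs(cross)
    edges_sq = np.sum((b - a) ** 2) + np.sum((c - b) ** 2) + np.sum((a - c) ** 2)
    return 4.0 * np.sqrt(3.0) * area / max(edges_sq, 1e-12)

def metric_loss(node, neighbors_by_triangle):
    # Lower loss = higher mean quality of the triangles incident to the moved node.
    qualities = [triangle_quality(node, p, q) for p, q in neighbors_by_triangle]
    return 1.0 - float(np.mean(qualities))

if __name__ == "__main__":
    node = np.array([0.2, 0.1])
    ring = [([1.0, 0.0], [0.5, 0.9]), ([0.5, 0.9], [-0.5, 0.9]), ([-0.5, 0.9], [-1.0, 0.0])]
    print(metric_loss(node, ring))
```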
ISBN (print): 9798350344820; 9798350344813
In this paper we develop a novel learning-based approach for mobile distributed beamforming without channel state information. We consider narrowband beamforming between a mobile UAV group and a base station under limited feedback, and propose a graph recurrent neural network (GRNN) approach to leverage local collaboration among the UAVs. The GRNN method is shown to be robust to variations in UAV speed and group heading, and scales with the UAV group size. We compare against codebook and binary feedback methods and show that the proposed GRNN method achieves better performance.
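The sketch below illustrates the generic graph recurrent update such a policy can be built on: each UAV refreshes a hidden state from its own observation and an aggregate of its neighbours' states over the communication graph. The weight shapes, mean aggregation, and tanh nonlinearity are assumptions rather than the paper's exact architecture.

```python
# Minimal sketch of a generic graph recurrent cell over a UAV communication graph.
import numpy as np

def grnn_step(hidden, obs, adjacency, W_h, W_n, W_x):
    # hidden: (N, d) per-UAV states; obs: (N, k) local observations;
    # adjacency: (N, N) 0/1 communication graph, aggregated by neighbour mean.
    deg = np.maximum(adjacency.sum(axis=1, keepdims=True), 1.0)
    neighbor_mean = (adjacency @ hidden) / deg
    return np.tanh(hidden @ W_h + neighbor_mean @ W_n + obs @ W_x)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    N, d, k = 4, 8, 3
    hidden, obs = np.zeros((N, d)), rng.normal(size=(N, k))
    A = np.array([[0, 1, 1, 0], [1, 0, 1, 0], [1, 1, 0, 1], [0, 0, 1, 0]], dtype=float)
    W_h, W_n, W_x = (rng.normal(scale=0.1, size=s) for s in [(d, d), (d, d), (k, d)])
    print(grnn_step(hidden, obs, A, W_h, W_n, W_x).shape)  # (4, 8)
```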
ISBN (digital): 9798350368499
ISBN (print): 9798350368505
The continued development of neural network architectures drives ever-growing demand for computing power. While data center scaling continues, inference away from the cloud will increasingly rely on distributed inference across multiple devices. Most prior efforts have focused on optimizing single-device inference or partitioning models to enhance inference throughput, while energy consumption keeps growing in importance as a design consideration. This work proposes a framework that searches for optimal model splits and distributes the partitions across a combination of devices, taking both throughput and energy into account. Participating devices are strategically grouped into homogeneous and heterogeneous clusters consisting of general-purpose CPU and GPU architectures, as well as emerging Compute-In-Memory (CIM) accelerators. The framework simultaneously optimizes inference throughput and energy consumption. It demonstrates up to a 4x speedup with approximately a 4x per-device energy reduction in a heterogeneous setup compared to single-GPU inference. The algorithm also finds a smooth Pareto-like curve in the energy-throughput space for CIM devices.
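As a toy version of such a search, the sketch below enumerates contiguous splits of a per-layer cost profile across an ordered set of devices and keeps the points that are Pareto-optimal in (latency, energy). The per-device speed and energy factors are hypothetical placeholders, not measured values from the paper.

```python
# Minimal sketch of a model-split search over devices, scored on bottleneck
# latency and total energy, returning the Pareto-optimal split points.
from itertools import combinations

def evaluate(split_points, layer_costs, devices):
    bounds = [0, *split_points, len(layer_costs)]
    stage_work = [sum(layer_costs[a:b]) for a, b in zip(bounds, bounds[1:])]
    latency = max(w / d["speed"] for w, d in zip(stage_work, devices))       # bottleneck stage
    energy = sum(w * d["energy_per_unit"] for w, d in zip(stage_work, devices))
    return latency, energy

def pareto_splits(layer_costs, devices):
    n, k = len(layer_costs), len(devices)
    candidates = [(evaluate(split, layer_costs, devices), split)
                  for split in combinations(range(1, n), k - 1)]
    # Keep points not dominated in both latency and energy.
    front = [c for c in candidates
             if not any(o[0][0] <= c[0][0] and o[0][1] <= c[0][1] and o[0] != c[0]
                        for o in candidates)]
    return sorted(front)

if __name__ == "__main__":
    layers = [3.0, 5.0, 2.0, 4.0, 6.0, 1.0]
    devices = [{"speed": 1.0, "energy_per_unit": 1.0},   # CPU-like
               {"speed": 4.0, "energy_per_unit": 2.5},   # GPU-like
               {"speed": 2.0, "energy_per_unit": 0.4}]   # CIM-like
    for (latency, energy), split in pareto_splits(layers, devices):
        print(f"split={split} latency={latency:.2f} energy={energy:.1f}")
```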
Traffic management systems have primarily relied on live traffic sensors for real-time traffic guidance. However, this dependence often results in uneven service delivery due to the limited scope of sensor coverage or potential sensor failures. This research introduces a novel approach to overcome this limitation by synergistically integrating a Physics-Informed Neural Network-based Traffic State Estimator (PINN-TSE) with a powerful natural language processing model, GPT-4. The purpose of this integration is to provide a seamless and personalized user experience, while ensuring accurate traffic density prediction even in areas with limited data availability. The innovative PINN-TSE model was developed and tested, demonstrating a promising level of precision with a mean absolute error of less than four vehicles per mile in traffic density estimation. This performance underlines the model's ability to provide dependable traffic information, even in regions where conventional traffic sensors may be sparsely distributed or data communication is likely to be interrupted. Furthermore, the incorporation of GPT-4 enhances user interactions by understanding and responding to inquiries in a manner akin to human conversation. This not only provides precise traffic updates but also interprets user intentions for a tailored experience. The results of this research showcase an AI-integrated traffic guidance system that outperforms traditional methods in terms of traffic estimation, personalization, and reliability. While the study primarily focuses on a single road segment, the methodology shows promising potential for expansion to network-level traffic guidance, offering even greater accuracy and usability. This paves the way for a smarter and more efficient approach to traffic management in the future.
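To make the physics-informed part concrete, the sketch below penalizes the residual of the LWR conservation law with a Greenshields speed law on a grid of predicted densities, alongside a data term on sparse observations. Using LWR/Greenshields and this loss weighting is an assumption; the paper's exact physics model and training setup are not stated in the abstract.

```python
# Minimal sketch of a PINN-style loss for traffic density: data misfit plus the
# finite-difference residual of d(rho)/dt + d(rho * v(rho))/dx = 0.
import numpy as np

def greenshields_flux(rho, v_free=60.0, rho_max=120.0):
    # Flow = density * speed, with speed decreasing linearly in density.
    return rho * v_free * (1.0 - rho / rho_max)

def physics_residual(rho_grid, dt, dx):
    # rho_grid: (T, X) predicted densities on a space-time grid.
    drho_dt = np.gradient(rho_grid, dt, axis=0)
    dflux_dx = np.gradient(greenshields_flux(rho_grid), dx, axis=1)
    return drho_dt + dflux_dx

def pinn_loss(rho_pred, rho_obs, obs_mask, dt, dx, weight=0.1):
    data_term = np.mean((rho_pred[obs_mask] - rho_obs[obs_mask]) ** 2)
    physics_term = np.mean(physics_residual(rho_pred, dt, dx) ** 2)
    return data_term + weight * physics_term

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rho_pred = rng.uniform(10, 100, size=(20, 50))
    rho_obs = rho_pred + rng.normal(scale=2.0, size=rho_pred.shape)
    mask = rng.random(rho_pred.shape) < 0.2          # sparse sensor coverage
    print(pinn_loss(rho_pred, rho_obs, mask, dt=1.0, dx=0.1))
```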