In this research, we propose a distributed Search Engine Query Optimization (DSEQO)-based sensor network concept for instantaneous forest fire exposure. The sensor network can identify and predict forest fires more sha...
Cell-free massive multiple-input multiple-output (MIMO) can resolve the inter-cell interference issue in cellular networks through cooperative beamforming of the distributed access points (APs). This paper focuses on an uplink cell-free massive MIMO network and investigates novel methods to train the central processing unit (CPU), the APs, and the users in the network. To reduce the communication burden on the fronthaul, each AP applies receive beamforming to compress its vector signal into a scalar one before passing it to the CPU for centralized processing. By drawing analogies between an uplink cell-free network and a quasi-neural network and borrowing the idea of the backpropagation algorithm, we propose a novel scheme named distributed learning for uplink cell-free massive MIMO beamforming (DLCB), which achieves multi-AP cooperation without explicit estimation of the channel state information (CSI). DLCB has low computational complexity and is applicable to various objective functions, such as the minimum mean squared error criterion and the maximum sum rate criterion. Extensive simulations verify that the proposed scheme achieves superior performance over state-of-the-art methods.
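As a rough illustration of the quasi-neural-network analogy above, the real-valued sketch below treats the per-AP receive beamformers and the CPU combining weights as trainable parameters and updates them by backpropagating an MSE loss against known pilot symbols, with no explicit CSI estimation. The network sizes, noise level, and the Adam optimizer are assumptions for this toy example, not the DLCB algorithm itself.

```python
# Real-valued toy: APs compress their received vectors to scalars with beamformers w,
# the CPU combines the scalars with weights v, and both are trained end-to-end by
# backpropagating an MSE loss against known pilot symbols (no explicit CSI estimate).
import torch

M, K, N = 16, 4, 8                                   # assumed: APs, users, antennas per AP
H = torch.randn(M, N, K)                             # true channels, used only to simulate signals
w = torch.randn(M, N, requires_grad=True)            # per-AP receive beamformers ("layer 1 weights")
v = torch.randn(M, K, requires_grad=True)            # CPU combining weights ("layer 2 weights")

opt = torch.optim.Adam([w, v], lr=1e-2)
for step in range(500):
    s = torch.randn(K)                               # pilot symbols transmitted by the K users
    y = torch.einsum('mnk,k->mn', H, s) + 0.05 * torch.randn(M, N)   # per-AP received vectors + noise
    z = torch.einsum('mn,mn->m', w, y)               # each AP forwards a single scalar to the CPU
    s_hat = torch.einsum('mk,m->k', v, z)            # CPU recovers the K user symbols
    loss = (s_hat - s).pow(2).mean()                 # MMSE-style training criterion
    opt.zero_grad(); loss.backward(); opt.step()     # "backpropagation" through the quasi-network
```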
This paper proposes a distributed learning-based framework to tackle the sum ergodic rate maximization problem in cell-free massive multiple-input multiple-output (MIMO) systems by utilizing the graph neural network (...
Solar photovoltaic (PV) power prediction is easily affected by weather factors. To reduce the PV power prediction deviation and improve the prediction accuracy, a distributed solar photov...
In this article, we propose a complementary deep-neural-network (C-DNN) processor that combines a convolutional neural network (CNN) and a spiking neural network (SNN) to take advantage of both. The C-DNN processor can support both complementary inference and training with its heterogeneous CNN and SNN core architecture. In addition, the C-DNN processor is the first DNN accelerator application-specific integrated circuit (ASIC) that can support CNN-SNN workload division by exploiting their magnitude-energy tradeoff. The C-DNN processor integrates a CNN-SNN workload allocator and an attention module to find the more energy-efficient network domain for each workload in the DNN, enabling the processor to operate at the energy-optimal point. Moreover, the SNN processing element (PE) array with a distributed L1 cache reduces redundant memory accesses for SNN processing by 42.2%-49.1%. For highly energy-efficient DNN training, the C-DNN processor integrates a global counter and a local delta-weight (LDW) unit to eliminate power-consuming counters for forward delta-weight generation. Furthermore, forward delta-weight-based sparsity generation (FDWSG) is proposed to reduce the number of training operations by 31%-79%. The C-DNN processor achieves an energy efficiency of 85.8 and 79.9 TOPS/W for inference with CIFAR-10 and CIFAR-100, respectively (VGG-16). Moreover, it achieves ImageNet classification with a state-of-the-art energy efficiency of 24.5 TOPS/W (ResNet-50). For training, the C-DNN processor achieves state-of-the-art energy efficiencies of 84.5 and 17.2 TOPS/W for CIFAR-10 and ImageNet, respectively, and reaches 77.1% accuracy for ImageNet training with ResNet-50.
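The CNN-SNN workload division rests on a magnitude-energy tradeoff. The toy allocator below illustrates one way such a split could look in software: activation-map tiles with low average magnitude (spike-friendly) are labeled for an "SNN" domain, the rest for a "CNN" domain. The tile size and threshold are invented for illustration and do not reflect the chip's actual allocator or attention module.

```python
# Toy magnitude-based allocator: low-magnitude tiles of an activation map are routed
# to an "snn" domain, high-magnitude tiles to a "cnn" domain. Tiling and threshold
# are illustrative only.
import numpy as np

def allocate_tiles(feature_map, tile=8, thr=0.3):
    """Label each tile of a HxW activation map as 'cnn' or 'snn'."""
    H, W = feature_map.shape
    plan = []
    for i in range(0, H, tile):
        for j in range(0, W, tile):
            block = feature_map[i:i + tile, j:j + tile]
            domain = "cnn" if np.abs(block).mean() > thr else "snn"
            plan.append(((i, j), domain))
    return plan

fmap = np.maximum(np.random.randn(32, 32), 0)        # ReLU-like activations (about half are zero)
plan = allocate_tiles(fmap)
print(sum(d == "snn" for _, d in plan), "of", len(plan), "tiles routed to the SNN domain")
```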
Hyperspectral (HS) pansharpening refers to fusing low-spatial-resolution HS (LRHS) images with the corresponding panchromatic (PAN) images to create high-spatial-resolution HS (HRHS) images. Most existing HS pansharpening methods overlook the spatial and spectral imbalance of ground objects of different types in the observed scenes. To address this issue, in this article we develop a novel tree-structured neural network (Tree-SNet) that forms an adaptive spatial-spectral processing pipeline for HS pansharpening. The Tree-SNet method maps a convolutional neural network (CNN) onto a hierarchical tree structure, where routing nodes automatically tune the data distributed to the tree paths, adapting to the local characteristics of the data, while spatial enhancement (SpatE) and spectral enhancement (SpecE) modules are dynamically applied along the tree paths to further strengthen the adaptive processing. The proposed Tree-SNet is evaluated on several datasets, and the experimental results verify its superiority.
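A minimal PyTorch sketch of the routing idea: a routing node produces a per-pixel gate that splits features between a spatial-enhancement-like branch and a spectral-enhancement-like branch. Layer shapes and module names are assumptions for illustration, not the Tree-SNet implementation.

```python
# Soft routing node: a learned per-pixel gate splits features between a spatial
# (3x3 conv) branch and a spectral (1x1 conv) branch, a simplified stand-in for
# the SpatE/SpecE paths in a tree-structured network.
import torch
import torch.nn as nn

class RoutingNode(nn.Module):
    def __init__(self, ch):
        super().__init__()
        self.gate = nn.Sequential(nn.Conv2d(ch, 1, 1), nn.Sigmoid())  # routing weight per pixel
        self.spat = nn.Conv2d(ch, ch, 3, padding=1)                   # spatial-enhancement-like path
        self.spec = nn.Conv2d(ch, ch, 1)                              # spectral-enhancement-like path

    def forward(self, x):
        g = self.gate(x)                                              # in [0, 1]: how data is split
        return g * self.spat(x) + (1 - g) * self.spec(x)

x = torch.randn(1, 31, 64, 64)                                        # e.g. a 31-band HS feature map
print(RoutingNode(31)(x).shape)                                       # torch.Size([1, 31, 64, 64])
```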
Modern deep learning has significantly improved performance and is used in a wide variety of applications. Since the inference process of a neural network requires a large amount of computation, it is typically processed not at the data acquisition location, such as a surveillance camera, but on a server with abundant computing power installed in a data center. Edge computing is receiving considerable attention as a solution to this problem; however, edge devices can provide only limited computation resources. Therefore, we assume a divided/distributed neural network model that uses both the edge device and the server. By processing part of the convolutional layers on the edge, the amount of communication becomes smaller than that of the raw sensor data. In this paper, we evaluate AlexNet and eight other models in the distributed environment and estimate FPS values with Wi-Fi, 3G, and 5G communication. To reduce communication costs, we also introduce a compression step before communication, which may degrade object recognition accuracy. As necessary conditions, we set FPS to 30 or faster and object recognition accuracy to 69.7% or higher; this value is determined based on that of an approximation model that binarizes the activations of the neural network. We construct performance and energy models to find the optimal configuration that consumes the minimum energy while satisfying the necessary conditions. Through a comprehensive evaluation, we find the optimal configurations for all nine models. For small models, such as AlexNet, processing the entire model on the edge was best; for huge models, such as VGG16, processing the entire model on the server was best; for medium-size models, the distributed configurations were good candidates. We confirm that our model finds the most energy-efficient configuration while satisfying the FPS and accuracy requirements, and the distributed models successfully reduce the energy consumption by up to 48.6%, and
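The configuration search described above can be pictured as enumerating edge/server split points and keeping the lowest-energy one that still meets the FPS target. The sketch below uses made-up per-layer times, output sizes, energies, and bandwidth; it only illustrates the shape of such a performance/energy model, not the paper's measured values.

```python
# Enumerate edge/server split points and keep the minimum-energy configuration that
# still meets the FPS target. All numbers are invented for the sketch.
layers = [  # (name, edge_time_s, server_time_s, output_bytes, edge_energy_J)
    ("conv1", 0.004, 0.0005, 290_000, 0.08),
    ("conv2", 0.006, 0.0007, 190_000, 0.12),
    ("conv3", 0.005, 0.0006,  65_000, 0.10),
    ("fc",    0.002, 0.0003,   4_000, 0.05),
]
RAW_FRAME_BYTES = 500_000     # assumed size of an uncompressed input frame
BANDWIDTH_BPS = 20e6          # assumed uplink bandwidth (e.g. Wi-Fi)
TX_POWER_W = 1.2              # assumed radio power while transmitting
FPS_TARGET = 30

best = None
for split in range(len(layers) + 1):                  # split = number of layers run on the edge
    edge_t = sum(l[1] for l in layers[:split])
    serv_t = sum(l[2] for l in layers[split:])
    tx_bytes = layers[split - 1][3] if split > 0 else RAW_FRAME_BYTES
    tx_t = tx_bytes * 8 / BANDWIDTH_BPS
    latency = edge_t + tx_t + serv_t
    energy = sum(l[4] for l in layers[:split]) + TX_POWER_W * tx_t
    if 1 / latency >= FPS_TARGET and (best is None or energy < best[1]):
        best = (split, energy, latency)
print("best split (layers on edge), energy [J], latency [s]:", best)
```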
ISBN (digital): 9798350351255
ISBN (print): 9798350351262
Distributed DNN inference is becoming increasingly important as the demand for intelligent services at the network edge grows. By leveraging the power of distributed computing, edge devices can perform complicated and resource-hungry inference tasks that were previously only possible on powerful servers, enabling new applications in areas such as autonomous vehicles, industrial automation, and smart homes. However, it is challenging to achieve accurate and efficient distributed edge inference because of the fluctuating nature of the devices' actual resources and the varying processing difficulty of the input data. In this work, we propose DistrEE, a distributed DNN inference framework that can exit model inference early to meet specific quality-of-service requirements. In particular, the framework first integrates model early exit and distributed inference for multi-node collaborative inference scenarios. It further designs an early-exit policy to control when model inference terminates. Extensive simulation results demonstrate that DistrEE realizes efficient collaborative inference, achieving an effective trade-off between inference latency and accuracy.
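A hedged sketch of a confidence-threshold early-exit policy over a chain of model partitions, in the spirit of the multi-node collaborative inference described above. The blocks, exit heads, and the 0.9 threshold are placeholders, not DistrEE's actual policy.

```python
# Confidence-threshold early exit over a chain of model partitions: each partition
# has its own exit head; inference stops as soon as the softmax confidence clears
# the threshold (or at the final partition).
import torch
import torch.nn as nn
import torch.nn.functional as F

blocks = nn.ModuleList([nn.Sequential(nn.Linear(64, 64), nn.ReLU()) for _ in range(3)])
exits = nn.ModuleList([nn.Linear(64, 10) for _ in range(3)])          # one exit head per partition

def infer_with_early_exit(x, threshold=0.9):
    for depth, (block, head) in enumerate(zip(blocks, exits)):
        x = block(x)                                  # would run on node `depth` in a distributed setup
        conf, pred = F.softmax(head(x), dim=-1).max(dim=-1)
        if conf.item() >= threshold or depth == len(blocks) - 1:
            return pred.item(), depth                 # exit early when confident enough

print(infer_with_early_exit(torch.randn(1, 64)))      # (predicted class, exit depth)
```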
Mobile augmented reality (MAR) has gained increased attention thanks to its potential to transform applications in different domains. One of the challenges in realizing MAR systems is processing video frames efficiently. MAR user devices are often resource-constrained and unsuitable for real-time object detection and recognition from video streams. Edge computing has tremendous potential to enable MAR systems, where processing instances (e.g., serverless functions, containers, or virtual machines) can implement and manage the execution of convolutional neural networks (CNNs) for processing offloaded MAR video frames. One of the challenges is how to balance the video frames across the edge servers and processing instances. In this article, we propose the LAOS orchestrator for resource management and load balancing of distributed edge servers for MAR systems. The LAOS orchestrator balances incoming video frames among the processing instances at the edge servers and determines when to spawn new instances of the CNN functions so as to ensure a predefined latency threshold for the processing of video frames. In addition, we devise a novel queuing-based framework for modeling the resource management problem of distributed edge servers for MAR systems. The obtained numerical results show that the proposed LAOS orchestrator reduces latency and efficiently manages the edge computing resources when dynamic workload peaks are considered.
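The orchestration logic can be illustrated with a toy dispatcher: frames go to the least-loaded processing instance, and a new instance is spawned whenever the estimated queueing delay would exceed the latency threshold. Service times and the threshold below are illustrative values, not LAOS parameters.

```python
# Toy dispatcher: send each frame to the least-loaded instance; spawn a new instance
# if the expected queueing delay would break the latency SLO.
import random

SERVICE_TIME_S = 0.040        # assumed per-frame CNN processing time
LATENCY_SLO_S = 0.200         # assumed predefined latency threshold

queues = [0]                  # outstanding frames per processing instance

def dispatch(frame_id):
    i = min(range(len(queues)), key=lambda k: queues[k])   # least-loaded instance
    if (queues[i] + 1) * SERVICE_TIME_S > LATENCY_SLO_S:
        queues.append(0)                                   # spawn a new CNN instance
        i = len(queues) - 1
    queues[i] += 1
    return i

for frame in range(50):
    dispatch(frame)
    if random.random() < 0.6:                              # some frames finish in the meantime
        busiest = max(range(len(queues)), key=lambda k: queues[k])
        queues[busiest] = max(0, queues[busiest] - 1)
print("instances:", len(queues), "| queue lengths:", queues)
```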
Processing-in-memory (PIM) is the most promising paradigm for addressing the bandwidth bottleneck in deep neural network (DNN) accelerators. However, the algorithmic and dataflow structure of DNNs still necessitates moving a large amount of data across banks inside the memory device to bring input data and their corresponding model parameters together, shifting part of the bandwidth bottleneck to the in-memory data communication infrastructure. To alleviate this bottleneck, we present Smart Memory, a highly parallel in-memory DNN accelerator for 3D memories that benefits from a scalable high-bandwidth in-memory network. Whereas existing PIM designs implement the compute units and network-on-chip on the logic die of the underlying 3D memory, in Smart Memory the computation and data transmission tasks are distributed across the memory banks. To this end, each memory bank is equipped with (1) a very simple processing unit to run neural networks and (2) a circuit-switched router that interconnects the memory banks through a 3D network-on-memory. Our evaluation shows a 44% average performance improvement over state-of-the-art in-memory DNN accelerators.
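A simplified software model of the bank-level distribution: each bank holds a slice of a layer's weights, its processing unit computes the partial product locally, and a hop count stands in for transfers over the 3D network-on-memory. The bank count and the choice of bank 0 as the reduction point are assumptions for the sketch.

```python
# Bank-distributed matrix-vector product: each bank's PE computes the rows it stores
# locally; a hop count approximates traffic on the 3D network-on-memory when partials
# are collected at bank 0.
import numpy as np

BANKS = 16
W = np.random.randn(256, 256)                     # layer weights, striped across banks by rows
x = np.random.randn(256)                          # input activations, broadcast to all banks

row_slices = np.array_split(np.arange(256), BANKS)
partials, hops = [], 0
for bank, rows in enumerate(row_slices):
    partials.append(W[rows] @ x)                  # computed by the bank's simple processing unit
    hops += bank                                  # hops to ship the partial result to bank 0

y = np.concatenate(partials)
print("matches centralized compute:", np.allclose(y, W @ x), "| total hops:", hops)
```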