Search Results - Inner Mongolia University Library

Harnessing the Power of Local Supervision in Federated Learning

IEEE Transactions on Big Data, 2024, pp. 1-12

Authors: Wang, Fei; Li, Baochun. Department of Electrical and Computer Engineering, University of Toronto, Canada

Federated learning is widely accepted as a privacy-preserving paradigm for training a shared global model across multiple client devices in a collaborative fashion. However, in practice, the significantly limited computational power on client devices has been a major barrier when we wish to train large models with potentially hundreds of millions of parameters. In this paper, we propose a new architecture, referred to as Infocomm, that incorporates locally supervised learning in federated learning. With locally supervised learning, the disadvantages of split learning can be avoided by using a more flexible way to offload training from resource-constrained clients to a more capable server. Infocomm enables parallel training of different modules of the neural network in both the server and clients in a gradient-isolated fashion. The efficacy in reducing both training time and communication time is supported by our theoretical analysis and empirical results. In the scenario involving larger models and less available local data, Infocomm has been observed to reduce the elapsed time per round by over 37% without sacrificing accuracy compared to both conventional federated learning and directly combining federated learning and split learning, which showcases the advantages of Infocomm under power-constrained IoT scenarios.

Keywords: Internet of things

Distributed Global Nash Equilibrium of Interactive Adversarial Graphical Games

Journal of Systems Science & Complexity, 2025, Vol. 38, No. 2, pp. 613-632

Authors: ZHANG Yizhong; LIAN Bosen; LEWIS Frank L. Department of Electrical and Computer Engineering, Auburn University; UTA Research Institute, University of Texas at Arlington

This article formulates interactive adversarial differential graphical games for synchronization control of multiagent systems (MASs) subject to adversarial inputs interacting with the systems through topology communications. Local control and interactive adversarial inputs affect each agent's local synchronization error via local networks. The distributed global Nash equilibrium (NE) solutions are guaranteed in the games by solving the optimal control input of each agent and the worst-case adversarial input based solely on local states and communications. The asymptotic stability of the local synchronization error dynamics and the NE are guaranteed. Furthermore, the authors devise a data-driven online reinforcement learning (RL) algorithm that only computes the distributed Nash control online using system trajectory data, eliminating the need for explicit system dynamics. A simulation-based example validates the game and algorithm.

Keywords: adversarial inputs; differential graphical games; Nash equilibrium; reinforcement learning

Optimizing BERT for Bengali Emotion Classification: Evaluating Knowledge Distillation, Pruning, and Quantization

Computer Modeling in Engineering & Sciences, 2025, Vol. 142, No. 2, pp. 1637-1666

Authors: Md Hasibur Rahman; Mohammed Arif Uddin; Zinnat Fowzia Ria; Rashedur M. Rahman. Department of Electrical and Computer Engineering, North South University, Dhaka 1229, Bangladesh

The rapid growth of digital data necessitates advanced natural language processing (NLP) models like BERT (Bidirectional Encoder Representations from Transformers), known for its superior performance in text ***, BERT’s size and computational demands limit its practicality, especially in resource-constrained *** research compresses the BERT base model for Bengali emotion classification through knowledge distillation (KD), pruning, and quantization *** Bengali being the sixth most spoken language globally, NLP research in this area is *** approach addresses this gap by creating an efficient BERT-based model for Bengali *** have explored 20 combinations for KD, quantization, and pruning, resulting in improved speedup, fewer parameters, and reduced memory *** best results demonstrate significant improvements in both speed and *** instance, in the case of mBERT, we achieved a 3.87× speedup and 4× compression ratio with a combination of Distil+Prune+Quant that reduced parameters from 178 to 46 M, while the memory size decreased from 711 to 178 *** results offer scalable solutions for NLP tasks in various languages and advance the field of model compression, making these models suitable for real-world applications in resource-limited environments.

Keywords: Bengali NLP; black-box distillation; emotion classification; model compression; post-training quantization; unstructured pruning

One model, two skills: active vision and action learning model for robotic manipulation

Science China (Information Sciences), 2025, Vol. 68, No. 6, pp. 331-349

Authors: Guokang WANG; Yanhong LIU; Huaping LIU. School of Electrical and Information Engineering, Zhengzhou University; Department of Computer Science and Technology, Tsinghua University

The perception in most existing vision-based reinforcement learning (RL) models for robotic manipulation relies heavily on static third-person or hand-mounted first-person cameras. In scenarios with occlusions and limited maneuvering space, these carefully positioned cameras often struggle to provide effective visual observations during manipulation. Taking inspiration from human capabilities, we introduce a novel RL-based dual-arm active visual-guided manipulation model (DAVMM), which simultaneously infers “eye” actions and “hand” actions for two separate robotic arms (referred to as the vision-arm and the worker-arm) based on current observations, empowering the robot with the ability to actively perceive and interact with its environment. To handle the extensive redundant observation-action space, we propose a decouplable target-centric reward paradigm to offer stable guidance for the training process. For making fine-grained manipulation action decisions, alongside a global scene image encoder, we utilize an independent encoder to extract local target texture features, enabling the simultaneous acquisition of both global and detailed local information. Additionally, we employ residual-RL and curriculum learning techniques to further enhance our model's sample efficiency and training stability. We conducted comparative experiments and analyses of DAVMM against a set of strong baselines on three occluded and narrow-space manipulation tasks. DAVMM notably improves the success rates across all manipulation tasks and showcases rapid learning capabilities.

Keywords: robotic manipulation; visual learning; reinforcement learning; active sensing; machine vision

TSMS-InceptionNeXt: A Framework for Image-Based Combustion State Recognition in Counterflow Burners via Feature Extraction Optimization

Computers, Materials & Continua, 2025, Vol. 83, No. 6, pp. 4329-4352

Authors: Huiling Yu; Xibei Jia; Yongfeng Niu; Yizhuo Zhang. Software Engineering, Department of Computer Science, Changzhou University, Changzhou 213146, China; Electrical Engineering, Department of Computer Science, Changzhou University, Changzhou 213146, China

The counterflow burner is a combustion device used for research on *** utilizing deep convolutional models to identify the combustion state of a counterflow burner through visible flame images, it facilitates the optimization of the combustion process and enhances combustion *** existing deep convolutional models, InceptionNeXt is a deep learning architecture that integrates the ideas of the Inception series and *** has garnered significant attention for its computational efficiency, remarkable model accuracy, and exceptional feature extraction ***, since this model still has limitations in the combustion state recognition task, we propose a Triple-Scale Multi-Stage InceptionNeXt (TSMS-InceptionNeXt) combustion state recognition method based on feature extraction ***, to address the InceptionNeXt model’s limited ability to capture dynamic features in flame images, we introduce Triplet Attention, which applies attention to the width, height, and Red Green Blue (RGB) dimensions of the flame images to enhance its ability to model dynamic ***, to address the issue of key information loss in the Inception deep convolution layers, we propose a Similarity-based Feature Concentration (SimC) mechanism to enhance the model’s capability to concentrate on critical ***, to address the insufficient receptive field of the model, we propose a Multi-Scale Dilated Channel Parallel Integration (MDCPI) mechanism to enhance the model’s ability to extract multi-scale contextual ***, to address the issue of the model’s Multi-Layer Perceptron Head (MlpHead) neglecting channel interactions, we propose a Channel Shuffle-Guided Channel-Spatial Attention (ShuffleCS) mechanism, which integrates information from different channels to further enhance the representational power of the input *** validate the effectiveness of the method, experiments are conducted on the counterflow burner flame visible light image dataset

Keywords: counterflow burner; combustion state recognition; InceptionNeXt; dilated convolution; channel shuffling

DALTON - Deep Local Learning in SNNs via local Weights and Surrogate-Derivative Transfer

IEEE Transactions on Emerging Topics in Computing, 2024, pp. 1-12

Authors: Gaurav, Ramashish; Do, Duy Anh; Doan, Thinh; Yi, Yang. Department of Electrical and Computer Engineering, Virginia Tech, USA

Direct training of Spiking Neural Networks (SNNs) is a challenging task because of their inherent temporality. In addition, vanilla Back-propagation based methods are not applicable either, due to the non-differentiability of the spikes in SNNs. Surrogate-Derivative based methods with Backpropagation Through Time (BPTT) address these direct training challenges quite well; however, such methods are not neuromorphic-hardware friendly for the On-chip training of SNNs. Recently formalized Three-Factor based Rules (TFR) for direct local-training of SNNs are neuromorphic-hardware friendly; however, they do not effectively leverage the depth of the SNN architectures (we show it empirically here) and are thus limited. In this work, we present an improved version of a conventional three-factor rule for local learning in SNNs, which effectively leverages depth in the context of learning features hierarchically. Taking inspiration from the Back-propagation algorithm, we theoretically derive our improved, local, three-factor based learning method, named DALTON (Deep LocAl Learning via local WeighTs and SurrOgate-Derivative TraNsfer), which employs weights and surrogate-derivative transfer from the local layers. Along the lines of TFR, our proposed method DALTON is also amenable to the neuromorphic-hardware implementation. Through extensive experiments on static (MNIST, FMNIST, & CIFAR10) and event-based (N-MNIST, DVS128-Gesture, & DVSCIFAR10) datasets, we show that our proposed local-learning method DALTON makes effective use of the depth in Convolutional SNNs, compared to the vanilla TFR implementation.

Keywords: System-on-chip

Video captioning using transformer-based GAN

Multimedia Tools and Applications, 2025, Vol. 84, No. 10, pp. 7091-7113

Authors: Babavalian, Mohammad Reza; Kiani, Kourosh. Electrical and Computer Engineering Department, Semnan University, Semnan, Iran

Video captioning is the process of automatically generating natural language descriptions of video content. Historically, most video captioning methods have relied on extending Sequence-to-Sequence (Seq2Seq) models. However, such approaches possess limitations due to the sequential nature of the captions, which leads to less accurate captions. To address this limitation, this paper introduces a novel end-to-end architecture for video captioning that combines conditional Wasserstein Generative Adversarial Networks (cWGAN) with a transformer model. The proposed architecture consists of two modules: feature extraction and caption generation. The feature extraction module aims to obtain an encoded feature vector representing the video contents, while the caption generation module generates human-readable captions from the encoded feature vector. To the best of our knowledge, this is the first architecture for generative video captioning that integrates a transformer model with a GAN. The results of the proposed model based on the BLEU-4, METEOR, ROUGE-L, and CIDEr metrics, on two datasets, MSVD (BLEU-4 = 61.2, METEOR = 41.6) and MSR-VTT (BLEU-4 = 61.2, METEOR = 31.1), compared to state-of-the-art approaches, demonstrate the effectiveness of the transformer with generative model in generating accurate and human-readable captions. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.

Keywords: Generative adversarial networks

Performance Analysis of ZF and RZF in Low-Resolution ADC/DAC Massive MIMO Systems

China Communications, 2024, Vol. 21, No. 8, pp. 115-126

Authors: Talha Younas; Shen Jin; Muluneh Mekonnen; Gao Mingliang; Saqib Saleem; Sohaib Tahir; Mahrukh Liaqat. School of Electrical and Electronic Engineering, Shandong University of Technology, Zibo 255000, China; Department of Electrical and Computer Engineering, COMSATS University Islamabad, Sahiwal Campus; Department of Electrical and Computer Engineering, College of Engineering, Dhofar University, Salalah 211, Oman; Department of Electrical and Computer Engineering, Addis Ababa Science and Technology University, Addis Ababa, Ethiopia; Department of Electrical Engineering, College of Electrical and Mechanical Engineering, NUST, Islamabad, Pakistan

Large number of antennas and higher bandwidth usage in massive multiple-input-multiple-output (MIMO) systems create immense burden on receiver in terms of higher power *** power consumption at the receiver radio frequency (RF) circuits can be significantly reduced by the application of analog-to-digital converter (ADC) of low *** this paper we investigate bandwidth efficiency (BE) of massive MIMO with perfect channel state information (CSI) by applying low resolution ADCs with Rician *** start our analysis by deriving the additive quantization noise model, which helps to understand the effects of ADC resolution on BE by keeping the power constraint at the receiver in *** also investigate deeply the effects of using higher bit rates and the number of BS antennas on bandwidth efficiency (BE) of the *** emphasize that good bandwidth efficiency can be achieved by even using low resolution ADC by using regularized zero-forcing (RZF) combining *** also provide a generic analysis of energy efficiency (EE) with different options of bits by calculating the energy efficiencies (EE) using the achievable *** emphasize that satisfactory BE can be achieved by even using low-resolution ADC/DAC in massive MIMO.

Keywords: low-bit analog-digital converter; massive multiple-input-multiple-output (MIMO); minimum mean square error (MMSE); regularized zero forcing; zero forcing

Stronger Polarization for the Deletion Channel

IEEE Transactions on Information Theory, 2025, Vol. 71, No. 7, pp. 5192-5214

Authors: Arava, Dar; Tal, Ido. Technion, Department of Electrical and Computer Engineering, Haifa 32000, Israel

In this paper we show a polar coding scheme for the deletion channel with a probability of error that decays roughly like 2^(-√Λ), where Λ is the length of the codeword. That is, the same decay rate as that of seminal polar codes for memoryless channels. This is stronger than prior art in which the square root is replaced by a cube root. Our coding scheme is similar yet distinct from prior art. The main differences are: 1) Guard-bands are placed in almost all polarization levels; 2) Trellis decoding is applied to the whole received word, and not to segments of it. As before, the scheme is capacity-achieving. The price we pay for this improvement is a higher decoding complexity, which is nonetheless still polynomial, O(Λ^4). © 1963-2012 IEEE.

Keywords: Polarization

Realtime and Integrated Framework for LiDAR-based Object Tracking

Journal of Institute of Control, Robotics and Systems, 2025, Vol. 31, No. 3, pp. 196-205

Authors: Lee, Gyuseok; Kim, Kana; Lee, Jejun; Kim, Hakil. Department of Electrical and Computer Engineering, Inha University, Republic of Korea

This study proposes a real-time integrated framework for LiDAR-based object tracking in autonomous driving environments. Advancements in LiDAR sensors are increasing point cloud data collection, leading to a demand for reliable real-time processing methods. The proposed framework applies voxelization and ground removal techniques to reduce computational load and integrates clustering and deep learning-based object recognition to ensure stability. Combining the point cloud data from LiDAR and the IMU data corrects distortions and refines real-time object movement, enabling accurate tracking in dynamic environments. This framework supports a maximum detection range of 100 m, with a computation time of 52 ms, a positional error of 1.06 m, a heading error of 3.79°, a relative velocity error of 1.46 m/s, and an average tracking frame count of 101, thereby improving object recognition accuracy and tracking performance while fulfilling real-time processing requirements. © ICROS 2025.

Keywords: 3D object detection; autonomous driving; LiDAR; object tracking
