Search Results - Inner Mongolia University Library

IEEE International Conference on Data Mining Workshops (ICDM Workshops)

Authors: Mengxin Wang, Liming Fang, Kuiqi Chen. Affiliations: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing, China; Nanjing University of Aeronautics and Astronautics Shenzhen Research Institute, Shenzhen, China; Science and Technology on Parallel and Distributed Processing Laboratory (PDL), Changsha, China

Federated learning (FL) is a decentralized machine learning framework that prioritizes privacy by allowing clients to train statistical models without sharing their private data, thus eliminating the impact of data fortresses. However, Byzantine attacks, such as data poisoning and backdoor attacks, threaten the robustness of FL schemes. Existing mainstream defense methods are susceptible to multiple adaptive attacks, and some of these defenses even violate the privacy principle of FL. Furthermore, these defense schemes become less robust when subjected to targeted poisoning attacks with highly non-IID data distributions. In this work, we propose FedNAT, a novel Byzantine-robust FL framework that addresses these limitations. Specifically, FedNAT first performs a privacy-respecting attention refinement on the activation layer outputs of the local uploads. Then, the server scores the local attentions by calculating their Wasserstein distances and clusters them with the k-median algorithm for global attention aggregation, thus rejecting poisoned local attentions for untargeted attacks. After this process, the global attention is transferred to local attention through the FedNAT loss function, which erases backdoors through the distillation concept. We conduct a comprehensive experimental evaluation to demonstrate that FedNAT significantly outperforms existing robust FL schemes in defending against Byzantine poisoning attacks under both IID and highly non-IID data distributions.
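The scoring-and-clustering step described in the abstract can be illustrated with a toy sketch. This is not the FedNAT implementation: it assumes each client's attention is flattened to a short list of numbers, uses the one-dimensional Wasserstein-1 distance between equal-size samples, scores each client by its median distance to the others, and splits the scores with a two-center k-medians loop. The function names and the sample data are hypothetical.

```python
from statistics import median

def wasserstein_1d(a, b):
    """W1 distance between two equal-size 1D empirical samples."""
    if len(a) != len(b):
        raise ValueError("samples must have the same length")
    return sum(abs(x - y) for x, y in zip(sorted(a), sorted(b))) / len(a)

def score_clients(attentions):
    """Score each client by its median W1 distance to every other client."""
    return [
        median(wasserstein_1d(a, b) for j, b in enumerate(attentions) if j != i)
        for i, a in enumerate(attentions)
    ]

def two_medians_split(scores, iters=10):
    """1D k-medians with k=2; label 0 is the low-score cluster, 1 the high-score one."""
    lo, hi = min(scores), max(scores)
    labels = [0] * len(scores)
    for _ in range(iters):
        labels = [0 if abs(s - lo) <= abs(s - hi) else 1 for s in scores]
        low = [s for s, lab in zip(scores, labels) if lab == 0]
        high = [s for s, lab in zip(scores, labels) if lab == 1]
        lo = median(low)  # the minimum score always stays in the low cluster
        if high:
            hi = median(high)
    return labels

# Hypothetical flattened attentions: three similar clients and one outlier.
attentions = [
    [0.1, 0.2, 0.3, 0.4],
    [0.1, 0.2, 0.3, 0.5],
    [0.0, 0.2, 0.3, 0.4],
    [0.9, 0.9, 1.0, 1.0],
]
labels = two_medians_split(score_clients(attentions))
print(labels)  # [0, 0, 0, 1]
```

In this sketch, treating the high-score cluster as suspicious is an assumption made for illustration; the abstract does not state the rejection rule.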


Multi-Outputs Is All You Need For Deblur


arXiv, 2022

Authors: Liu, Sidun; Qiao, Peng; Dou, Yong. Affiliations: Science and Technology on Parallel and Distributed Laboratory, National University of Defense Technology, Hunan, China

Image deblurring is an ill-posed task: infinitely many feasible solutions exist for a given blurry image. Modern deep learning approaches usually discard the learning of blur kernels and directly employ end-to-end supervised learning. Popular deblurring datasets define the label as one of the feasible solutions. However, we argue that it is not reasonable to specify a label directly, especially when the label is sampled from a random distribution. Therefore, we propose to make the network learn the distribution of feasible solutions and, based on this consideration, design a novel multi-head output architecture and a corresponding loss function for distribution learning. Our approach enables the model to output multiple feasible solutions to approximate the target distribution. We further propose a novel parameter multiplexing method that reduces the number of parameters and computational effort while improving performance. We evaluated our approach on multiple image-deblur models, including the current state-of-the-art NAFNet. The best-overall PSNR (picking the highest score among the heads for each validation image) improves on the compared baselines by up to 0.11∼0.18 dB. The best-single-head PSNR (picking the best-performing head on the validation set) improves on the compared baselines by up to 0.04∼0.08 dB. The code is available at https://***/Liu-SD/multi-output-deblur. © 2022, CC BY.
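The two evaluation metrics named in the abstract differ only in where the maximum is taken. The following minimal sketch shows that difference, assuming images are flat lists of pixel values in [0, 1]; the function names and the toy data are hypothetical and not taken from the paper's code.

```python
import math

def psnr(pred, target, max_val=1.0):
    """PSNR in dB between two equal-length flat pixel lists."""
    mse = sum((p - t) ** 2 for p, t in zip(pred, target)) / len(target)
    return math.inf if mse == 0 else 10.0 * math.log10(max_val ** 2 / mse)

def best_overall_psnr(outputs, targets):
    """Mean over images of the best head's PSNR, chosen per image."""
    per_image = [max(psnr(o, t) for o in heads) for heads, t in zip(outputs, targets)]
    return sum(per_image) / len(per_image)

def best_single_head_psnr(outputs, targets):
    """Mean PSNR of the one head that is best on the whole set."""
    n_heads = len(outputs[0])
    per_head = [
        sum(psnr(heads[h], t) for heads, t in zip(outputs, targets)) / len(targets)
        for h in range(n_heads)
    ]
    return max(per_head)

# Toy data: outputs[i][h] is head h's prediction for image i.
targets = [[0.5, 0.5], [0.2, 0.8]]
outputs = [
    [[0.5, 0.6], [0.9, 0.1]],  # image 0: head 0 is closer to the target
    [[0.6, 0.4], [0.2, 0.7]],  # image 1: head 1 is closer to the target
]
overall = best_overall_psnr(outputs, targets)
single = best_single_head_psnr(outputs, targets)
```

Because the per-image maximum can switch heads from one image to the next, the best-overall value is never lower than the best-single-head value, which is consistent with the larger gain the abstract reports for it.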

Keywords: Image enhancement


Hard Contrastive Learning for Video Captioning


IEEE International Conference on Electronics and Communication Engineering (ICECE)

Authors: Lilei Wu, Jie Liu. Affiliations: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology; Laboratory of Software Engineering for Complex Systems, National University of Defense Technology, Changsha, China

ISBN (print): 9781665487900

Maximum likelihood estimation has been widely adopted along with the encoder-decoder framework for video captioning. However, it ignores the structure of sentences and restrains the diversity and distinctiveness of generated captions. To address this issue, we propose a hard contrastive learning (HCL) method for video captioning. Specifically, built on the encoder-decoder framework, we introduce mismatched pairs to learn a reference distribution of video descriptions. The target model on the matched pairs is learned on top of the reference model, which improves the distinctiveness of generated captions. In addition, we further boost the distinctiveness of the captions by developing a hard mining technique to select the hardest mismatched pairs within the contrastive learning framework. Finally, the relationships among multiple relevant captions for each video are considered to encourage the diversity of generated captions. The proposed method generates high-quality captions that effectively capture the specialties of individual videos. Extensive experiments on two benchmark datasets, i.e., MSVD and MSR-VTT, show that our approach outperforms state-of-the-art methods.

Keywords: Maximum likelihood estimation; Visualization; Video description; Benchmark testing


A Counterfactual Ultrasound Anti-Interference Self-Supervised Network for B-mode Ultrasound Tongue Extraction


International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Yan Jia, Yuqing Cheng, Kele Xu, Yong Dou, Peng Qiao, Zhouyu He. Affiliations: National Key Laboratory of Parallel and Distributed Computing, College of Computer Science and Technology, National University of Defense Technology, Changsha, China; College of Systems Engineering, National University of Defense Technology, Changsha, China

ISBN (digital): 9798350368741

ISBN (print): 9798350368758

B-mode ultrasound tongue imaging is a non-invasive and real-time method for visualizing vocal tract deformation. However, accurately extracting the tongue’s surface contour remains a significant challenge due to the low signal-to-noise ratio (SNR) and prevalent speckle noise in ultrasound images. Traditional supervised learning models often require large labeled datasets, which are labor-intensive to produce and susceptible to noise interference. To address these limitations, we present a novel Counterfactual Ultrasound Anti-Interference Self-Supervised Network (CUAI-SSN), which integrates self-supervised learning (SSL) with counterfactual data augmentation and progressively disentangles confounding factors, so that the model generalizes well across varied ultrasound conditions. Our approach leverages causal reasoning to decouple noise from relevant features, enabling the model to learn robust representations that focus on essential tongue structures. By generating counterfactual image-label pairs, our method introduces alternative, noise-independent scenarios that enhance model training. Furthermore, we introduce attention mechanisms to enhance the network’s ability to capture fine-grained details even in noisy conditions. Extensive experiments on real ultrasound tongue images demonstrate that CUAI-SSN outperforms existing methods, setting a new benchmark for automated contour extraction in ultrasound tongue imaging. Our code is publicly available at https://***/inexhaustible419/CounterfactualultrasoundAI.

Keywords: Training; Ultrasonic imaging; Tongue; Self-supervised learning; Data augmentation; Data models; Cognition; Data mining; Noise measurement; Signal to noise ratio


A Class of Fast and Accurate Multi-layer Block Summation and Dot Product Algorithms


18th IFIP WG 10.3 International Conference on Network and Parallel Computing, NPC 2021

Authors: He, Kang; Barrio, Roberto; Chen, Lin; Jiang, Hao; Liu, Jie; Gu, Tongxiang; Qi, Jin. Affiliations: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China; Department of Applied Mathematics, University of Zaragoza, Zaragoza E50009, Spain; College of Computer, National University of Defense Technology, Changsha 410073, China; Institute of Applied Physics and Computational Mathematics, Beijing 100000, China

ISBN (print): 9783030935702

Basic recursive summation and the common dot product algorithm have a backward error bound that grows linearly with the vector dimension. Blanchard [1] proposed a class of fast and accurate summation and dot product algorithms, called FABsum and FABdot respectively, which trade off calculation accuracy against speed through the block size. Castaldo [2] proposed multi-layer block summation and dot product algorithms, called SuperBlocksum and SuperBlockdot, that can increase the accuracy while adding almost no additional calculations. We combine the idea of [1] with the multi-layer block structure to propose SuperFABsum (for "super fast and accurate block summation") and SuperFABdot (for "super fast and accurate block dot product"). Our algorithms have two variants: SuperFAB(within) and SuperFAB(outside). Our algorithms further improve accuracy and speed compared with FAB and SuperBlock. We conducted accuracy and speed tests on the high-performance FT2000+ processor. Experimental results show that the SuperFABdot(within) algorithm is more accurate than FABdot and SuperBlockdot. Compared with FABdot, the SuperFABdot(outside) algorithm can achieve up to 1.2× performance speedup while ensuring similar accuracy. © 2022, IFIP International Federation for Information Processing.
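The block-size trade-off mentioned in the abstract can be seen in a small sketch of single-layer blocked summation. It assumes that each block is summed with plain recursive summation (the fast part) and that the block sums are combined with Kahan compensated summation (the accurate part, chosen here as an illustration). It is not the SuperFABsum algorithm, and the function names are hypothetical.

```python
import math

def fabsum_sketch(values, block_size):
    """Blocked summation: plain sums inside blocks, Kahan summation across blocks."""
    total = 0.0
    comp = 0.0  # running compensation for lost low-order bits
    for start in range(0, len(values), block_size):
        block_sum = 0.0
        for x in values[start:start + block_size]:
            block_sum += x  # fast: ordinary recursive summation
        # accurate: one Kahan step per block
        y = block_sum - comp
        t = total + y
        comp = (t - total) - y
        total = t
    return total

def fabdot_sketch(xs, ys, block_size):
    """Dot product built on the same blocked scheme."""
    return fabsum_sketch([x * y for x, y in zip(xs, ys)], block_size)

values = [0.1] * 100000
approx = fabsum_sketch(values, 64)
reference = math.fsum(values)  # correctly rounded sum from the standard library
print(abs(approx - reference))
```

A smaller block size spends more compensated steps and so leans toward accuracy, while a larger block size leans toward speed.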

Keywords: Artificial intelligence


OLM2: Automatic Optimal Strategy Generating for Large-Scale Model Training with Limited-Memory


IEEE International Conference on Joint Cloud Computing (JCC)

Authors: Zhilin Yang, Yu Tang, Linbo Qiao, Xi Yang, Zhen Huang. Affiliations: National Key Laboratory of Parallel and Distributed Computing, College of Computer Science, National University of Defense Technology, Changsha 410073, China

The scale of model parameters and the amount of training data are increasing exponentially, and GPU memory requirements grow with the number of model parameters. Recomputation and swapping are two main memory optimization methods that have been extensively studied, and there are also optimization strategies that combine the two methods. However, most of them are based on heuristic search strategies, which do not explore the complete solution space and cannot guarantee the optimality of the results. An optimal search strategy with tensor-level recomputation and swapping is needed for large-scale model training. In this paper, we propose an optimal strategy searching algorithm combining tensor-based recomputation and swapping. Specifically, the memory swapping strategy is reformulated as an optimization problem, which converts the memory constraints into mixed integer programming, to find the optimal memory optimization strategy. By leveraging the advantages of both recomputation and swapping, this approach minimizes computation consumption without exceeding the available memory limit. Experimental results show that our method achieves about a 60% reduction in memory requirements during the training process. Furthermore, our method reduces the overall training time relative to existing algorithms. Compared to Checkmate, our approach achieves about a 0.3–0.9% reduction in computation cost per iteration.
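The optimization problem sketched in the abstract (pick a per-tensor action so that extra cost is minimal and memory stays within budget) can be shown on a toy instance. The sketch below substitutes exhaustive enumeration for mixed integer programming and uses a deliberately simplified memory model in which only kept tensors occupy memory. The cost figures, the three action names, and the function name are hypothetical.

```python
from itertools import product

ACTIONS = ("keep", "recompute", "swap")

def best_plan(tensors, memory_budget):
    """tensors: list of (size, recompute_cost, swap_cost) tuples.

    Returns (extra_cost, plan) for the cheapest plan whose kept tensors
    fit in memory_budget, or None if no plan is feasible.
    """
    best = None
    for plan in product(ACTIONS, repeat=len(tensors)):
        resident = sum(size for (size, _, _), act in zip(tensors, plan) if act == "keep")
        if resident > memory_budget:
            continue  # violates the memory constraint
        cost = 0
        for (_, recompute_cost, swap_cost), act in zip(tensors, plan):
            if act == "recompute":
                cost += recompute_cost
            elif act == "swap":
                cost += swap_cost
        if best is None or cost < best[0]:
            best = (cost, plan)
    return best

# Hypothetical tensors: (size, recompute cost, swap cost).
tensors = [(4, 3, 5), (2, 1, 2), (6, 8, 5)]
result = best_plan(tensors, memory_budget=6)
print(result)  # (4, ('recompute', 'recompute', 'keep'))
```

Enumeration is exponential in the number of tensors, which is why a solver-based formulation such as the one the abstract describes is needed at realistic scale.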


A Novel Interactive Recurrent Attention Network for Emotion-Cause Pair Extraction


3rd International Conference on Algorithms, Computing and Artificial Intelligence, ACAI 2020

Authors: Jia, Xiangyu; Chen, Xinhai; Wan, Qian; Liu, Jie. Affiliations: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha, China

ISBN (print): 9781450388115

Unlike the emotion cause extraction (ECE) task, whose input consists of pre-annotated emotions and a passage, emotion-cause pair extraction (ECPE) aims at extracting potential emotions and their corresponding causes from a document without the need for pre-annotations. Traditional ECPE solutions divide the extraction of emotions and causes into two separate parts. However, breaking the bidirectional dependence between emotion and cause may lose a lot of potentially useful information. In this paper, we propose a novel interactive recurrent attention network (IRAN). Our approach focuses on the bidirectional impact between emotions and causes, and extracts emotions and causes simultaneously. The information in the document can be fully exploited through multiple modeling and information extraction. Our emotion-specific transformation and distance fusion correlation can adaptively focus on the emotions and the distance, and gracefully incorporate them into a distinguishable neural network attention framework. The experimental results show that our proposed model achieves better performance than other widely-used models on the ECPE corpus. © 2020 ACM.

Keywords: Sentiment analysis


A Connectivity-Enhanced Multi-Task Learning based on Anatomical Priors for 3D Class-Balanced Pulmonary Airway Segmentation


IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Authors: Yan Jia, Yong Dou, Peng Qiao, Yuqing Cheng, Kele Xu, Zhouyu He. Affiliations: National Key Laboratory of Parallel and Distributed Computing, College of Computer Science and Technology, National University of Defense Technology, Changsha, China; College of Systems Engineering, National University of Defense Technology, Changsha, China

ISBN (digital): 9798350386226

ISBN (print): 9798350386233

Accurate and efficient airway segmentation is essential for evaluating pulmonary diseases, aiding diagnosis, reducing the preoperative burden of airway identification, and minimizing patient discomfort during prolonged surgeries. However, current pulmonary airway reconstruction techniques are hindered by two major challenges: difficulty in accurately reconstructing fine airway branches due to the tendency to overlook small targets, and insufficient structural connectivity leading to frequent branch discontinuities within the airway tree. These limitations directly affect the clinical applicability of reconstructed airways. To overcome these challenges, a novel 3D pulmonary airway segmentation multi-task framework is proposed, designed to enhance the performance of existing backbone models. This approach integrates Anatomical Prior-Based Multi-Task Learning (AP-MTL) through the use of Gaussian-constructed connectivity-enhanced isosurfaces, significantly improving the network’s ability to maintain airway continuity. Additionally, a Class-Balanced CT Density Distribution Reconstruction mechanism (DDR-CB) is introduced, further refining the model’s capability to detect and segment fine airway branches. As a result of these enhancements, the model demonstrates an 11.5% average improvement in segmentation accuracy and connectivity compared to the baseline. The source code is publicly accessible at https://***/inexhaustible419/APMTLAirwaySegment.

Keywords: Image segmentation; Three-dimensional displays; Accuracy; Lungs; Atmospheric modeling; Supervised learning; Surgery; Multitasking; Image reconstruction; Biomedical imaging


Merak: An Efficient Distributed DNN Training Framework with Automated 3D Parallelism for Giant Foundation Models


arXiv, 2022

Authors: Lai, Zhiquan; Li, Shengwei; Tang, Xudong; Ge, Keshi; Liu, Weijie; Duan, Yabo; Qiao, Linbo; Li, Dongsheng. Affiliations: The National Laboratory for Parallel and Distributed Processing, College of Computer, National University of Defense Technology, Changsha, Hunan, China

Foundation models are becoming the dominant deep learning technology. Pretraining a foundation model is always time-consuming due to the large scale of both the model parameters and the training dataset. Besides being computing-intensive, the pretraining process is extremely memory- and communication-intensive. These challenges make it necessary to apply 3D parallelism, which integrates data parallelism, pipeline model parallelism, and tensor model parallelism, to achieve high training efficiency. However, current 3D parallelism frameworks still encounter two issues: i) they are not transparent to model developers, requiring manual model modification to parallelize training, and ii) their utilization of computation resources, GPU memory, and network bandwidth is insufficient. We propose Merak, an automated 3D parallelism deep learning training framework with high resource utilization. Merak automatically deploys 3D parallelism with an automatic model partitioner, which includes a graph-sharding algorithm and a proxy node-based model graph. Merak also offers a non-intrusive API to scale out foundation model training with minimal code modification. In addition, we design a high-performance 3D parallel runtime engine that employs several techniques to exploit available training resources, including a shifted critical path pipeline schedule that increases computation utilization, stage-aware recomputation that makes use of idle worker memory, and sub-pipelined tensor model parallelism that overlaps communication and computation. Experiments on 64 GPUs show that Merak speeds up training over state-of-the-art 3D parallelism frameworks by up to 1.42×, 1.39×, 1.43×, and 1.61× for models with 1.5, 2.5, 8.3, and 20 billion parameters, respectively. The code for Merak has been open-sourced at https://***/hpdl-group/Merak. Copyright © 2022, The Authors. All rights reserved.
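To make the term "3D parallelism" concrete, the sketch below maps a flat worker rank onto data, pipeline, and tensor coordinates. It does not use Merak's API and says nothing about Merak's actual layout: the 4 × 4 × 4 split of 64 workers and the choice of the tensor dimension as the fastest-varying one are assumptions made for illustration.

```python
def rank_to_coords(rank, dp, pp, tp):
    """Map a flat rank to (data, pipeline, tensor) indices, tensor index innermost."""
    if not 0 <= rank < dp * pp * tp:
        raise ValueError("rank out of range for the given degrees")
    tensor = rank % tp
    pipeline = (rank // tp) % pp
    data = rank // (tp * pp)
    return data, pipeline, tensor

def tensor_group(rank, dp, pp, tp):
    """Ranks sharing the same data and pipeline indices as `rank`."""
    data, pipeline, _ = rank_to_coords(rank, dp, pp, tp)
    base = (data * pp + pipeline) * tp
    return list(range(base, base + tp))

# Hypothetical split of 64 workers into 4-way data, pipeline, and tensor parallelism.
coords = [rank_to_coords(r, 4, 4, 4) for r in range(64)]
print(rank_to_coords(27, 4, 4, 4))  # (1, 2, 3)
print(tensor_group(27, 4, 4, 4))    # [24, 25, 26, 27]
```

Placing the tensor-parallel group on consecutive ranks is a common convention because those workers communicate most often, but other orderings are equally valid.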

Keywords: Pipelines


SCGraph: Accelerating Sample-based GNN Training by Staged Caching of Features on GPUs


IEEE International Conference on Big Data and Cloud Computing (BdCloud)

Authors: Yuqi He, Zhiquan Lai, Zhejiang Ran, Lizhi Zhang, Dongsheng Li. Affiliations: National Key Laboratory of Parallel and Distributed Processing, College of Computer, National University of Defense Technology, Changsha, China

Graph neural networks (GNNs) have become important tools for processing structured graph data and have been successfully applied to multiple graph-based application scenarios. Existing GNN systems adopt sample-based training on large-scale graphs over multiple GPUs. Although they support large-scale graph training, the large data loading overhead of transferring vertex features between CPUs and GPUs is still a bottleneck. In this work, we propose SCGraph, a method that supports high-speed feature caching on GPUs. SCGraph classifies the graph vertices after sorting them by out-degree. For high out-degree vertices, SCGraph sets up graded caches across different GPUs to increase the overall cache content through NVLink high-speed data transmission between them. For low out-degree vertices, SCGraph expands the training vertices' neighborhood in advance to regenerate the cache. We evaluate SCGraph against two state-of-the-art industrial GNN frameworks, i.e., DGL and PaGraph, on various benchmarks. Experimental results show that SCGraph improves the cache hit rate on GPUs by up to 23.6% and achieves up to 1.71× performance speedup over the state-of-the-art baselines while leaving convergence almost unchanged.
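The idea of caching features for high out-degree vertices, and the hit-rate metric the abstract reports, can be illustrated with a single-cache toy. This sketch does not model SCGraph's staged multi-GPU caches or its neighborhood expansion. The degree table, the access trace, and the function names are hypothetical.

```python
def build_degree_cache(out_degree, capacity):
    """Pick the `capacity` vertices with the highest out-degree (ties broken by id)."""
    ranked = sorted(out_degree, key=lambda v: (-out_degree[v], v))
    return set(ranked[:capacity])

def hit_rate(cache, accesses):
    """Fraction of feature lookups that are served from the cache."""
    if not accesses:
        return 0.0
    hits = sum(1 for v in accesses if v in cache)
    return hits / len(accesses)

# Hypothetical out-degrees and a sampled access trace of vertex ids.
out_degree = {0: 50, 1: 3, 2: 40, 3: 1, 4: 7}
accesses = [0, 2, 0, 4, 2, 1, 0, 3]

cache = build_degree_cache(out_degree, capacity=2)
rate = hit_rate(cache, accesses)
print(sorted(cache), rate)  # [0, 2] 0.625
```

Every miss in such a scheme corresponds to a feature transfer from CPU to GPU memory, which is the loading overhead the abstract identifies as the bottleneck.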

Keywords: Training; Memory management; Loading; Graphics processing units; Benchmark testing; Graph neural networks; Data communication
