Search Results - Inner Mongolia University Library

FMCC-RT: a scalable and fine-grained all-reduce algorithm for large-scale SMP clusters

Science China (Information Sciences), 2025, Vol. 68, No. 5, pp. 362-379

Authors: Jintao PENG, Jie LIU, Jianbin FANG, Min XIE, Yi DAI, Zhiquan LAI, Bo YANG, Chunye GONG, Xinjun MAO, Guo MAO, Jie REN. Affiliations: School of Computer Science and Technology, National University of Defense Technology; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology; Laboratory of Digitizing Software for Frontier Equipment, National University of Defense Technology; National Supercomputer Center in Tianjin; School of Computer Science, Shaanxi Normal University

All-reduce is a widely used communication technique for distributed and parallel applications, typically implemented using either a tree-based or ring-based scheme. Each of these approaches has its own limitations: tree-based schemes struggle with efficiently exchanging large messages, while ring-based solutions assume constant communication throughput, an unrealistic expectation in modern network communication infrastructures. We present FMCC-RT, an all-reduce approach that combines the advantages of tree- and ring-based implementations while mitigating their drawbacks. FMCC-RT dynamically switches between tree- and ring-based implementations depending on the size of the message being processed. It utilizes an analytical model to assess the impact of message sizes on the achieved throughput, enabling the derivation of optimal work partitioning parameters. Furthermore, FMCC-RT is designed with an Open MPI-compatible API, requiring no modification to user code. We evaluated FMCC-RT through micro-benchmarks and real-world application tests. Experimental results show that FMCC-RT outperforms state-of-the-art tree- and ring-based methods, achieving speedups of up to 5.6×.

Keywords: all-reduce; collective communication; MPI; scalability
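
The idea in the abstract above, choosing between a tree and a ring scheme by message size, can be illustrated with the textbook latency-bandwidth (alpha-beta) cost model. This is a minimal sketch and not FMCC-RT's analytical model, which the abstract does not detail. The function names and the default `alpha` (per-message latency in seconds) and `beta` (seconds per byte) values are assumptions chosen for illustration.

```python
import math

def tree_allreduce_cost(n_bytes, p, alpha, beta):
    """Binomial-tree reduce followed by broadcast.

    Takes 2 * ceil(log2 p) steps, and every step moves the full message.
    """
    steps = 2 * math.ceil(math.log2(p))
    return steps * (alpha + n_bytes * beta)

def ring_allreduce_cost(n_bytes, p, alpha, beta):
    """Ring reduce-scatter followed by all-gather.

    Takes 2 * (p - 1) steps, and every step moves only n_bytes / p.
    """
    steps = 2 * (p - 1)
    return steps * (alpha + (n_bytes / p) * beta)

def choose_scheme(n_bytes, p, alpha=1e-6, beta=1e-10):
    """Pick the cheaper scheme under the assumed alpha-beta model."""
    tree = tree_allreduce_cost(n_bytes, p, alpha, beta)
    ring = ring_allreduce_cost(n_bytes, p, alpha, beta)
    return "tree" if tree <= ring else "ring"

if __name__ == "__main__":
    # Sweep message sizes on a hypothetical 64-process job.
    for size in (1024, 64 * 1024, 64 * 1024 * 1024):
        print(size, choose_scheme(size, p=64))
```

Under these assumed parameters the latency term dominates for small messages, which favours the tree, while the per-step volume of `n_bytes / p` favours the ring for large messages.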


Isolate Sets Based Parallel Louvain Method for Community Detection

Journal of Computer Science & Technology, 2023, Vol. 38, No. 2, pp. 373-390

Authors: 郄航, 窦勇, 黄震, 熊运生. Affiliation: Science and Technology on Parallel and Distributed Laboratory, School of Computer, National University of Defense Technology, Changsha 410073, China

Community detection is a vital task in many fields, such as social networks and financial analysis, to name a *** Louvain method, the main workhorse of community detection, is a popular heuristic *** apply it to large-scale graph networks, researchers have proposed several parallel Louvain methods (PLMs), which suffer from two challenges: the latency in the information synchronization, and the community *** tackle these two challenges, we propose an isolate sets based parallel Louvain method (IPLM) and a fusion IPLM with the hashtables based Louvain method (FIPLM), which are based on a novel graph partition *** graph partition algorithm divides the graph network into subgraphs called isolate sets, in which the vertices are relatively decoupled from *** first describe the concepts and properties of the isolate *** we propose an algorithm to divide the graph network into isolate sets, which enjoys the same computation complexity as the breadth-first ***, we propose IPLM, which can efficiently calculate and update vertices information in parallel without latency or community ***, we achieve further acceleration by FIPLM, which maintains a high quality of community detection with a faster speedup than *** two methods are for shared-memory architecture, and we implement our methods on an 8-core PC; the experiments show that IPLM achieves a maximum speedup of 4.62x and outputs higher modularity (maximum 4.76%) than the serial Louvain method on 14 of 18 ***, FIPLM achieves a maximum speedup of 7.26x.

Keywords: parallel computing; isolate set; graph partition; Louvain method; community detection


YFLM: An Improved Levenberg-Marquardt Algorithm for Global Bundle Adjustment

41st Computer Graphics International Conference, CGI 2024

Authors: Peng, Jiaxin; Li, Tao; Jiang, Qin; Liu, Jie; Wang, Ruibo. Affiliations: Laboratory of Software Engineering for Complex Systems, School of Computer Science, National University of Defense Technology, Hunan, Changsha 410073, China; Parallel and Distributed Processing Laboratory, School of Computer Science, National University of Defense Technology, Hunan, Changsha 410073, China

ISBN: (print) 9783031820205

The conventional Levenberg-Marquardt (LM) algorithm is a state-of-the-art trust-region optimization method for solving bundle adjustment problems in the Structure-from-Motion community, which not only takes advantage of the fast convergence of the Gauss-Newton method, but also the stability of the gradient descent method when approaching optimal solutions. However, the damping ratio of LM is simply provided by trial-and-error, which causes slow convergence rate for large-scale problems. This paper proposes the Yamashita-Fukushima LM (YFLM) algorithm to reduce the time complexity for global bundle adjustment, where the damping factor is determined by Yamashita and Fukushima’s method. YFLM dynamically calculates a more reasonable and optimal damping ratio according to the newest reprojection error. The experimental results show that the YFLM algorithm outperforms the conventional LM algorithm for most public bundle adjustment datasets. Besides this, the convergence of the YFLM algorithm is also evaluated with different σ∈(0,2]. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: Damping
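
The abstract above says the damping factor is recomputed from the newest reprojection error rather than tuned by trial and error. A minimal sketch of that idea follows, assuming the damping takes the form mu = ||r||^sigma (Yamashita and Fukushima's rule is commonly cited with exponent 2; the general exponent is an assumption suggested by the abstract's σ∈(0,2]). The one-parameter curve-fitting problem, the function names, and the absence of a step-acceptance test are all illustrative simplifications, not the paper's bundle adjustment solver.

```python
import math

def lm_fit(residuals, jacobian, x0, sigma=1.0, tol=1e-12, max_iter=50):
    """Levenberg-Marquardt for one scalar parameter.

    The damping is mu = ||r|| ** sigma, recomputed at every iteration
    from the current residual norm (assumed form, see lead-in).
    """
    x = x0
    for _ in range(max_iter):
        r = residuals(x)
        j = jacobian(x)
        norm_r = math.sqrt(sum(ri * ri for ri in r))
        if norm_r < tol:
            break
        mu = norm_r ** sigma
        jtj = sum(ji * ji for ji in j)                # J^T J (a scalar here)
        jtr = sum(ji * ri for ji, ri in zip(j, r))    # J^T r
        x -= jtr / (jtj + mu)                         # (J^T J + mu) dx = -J^T r
    return x

# Hypothetical zero-residual problem: recover a in y = exp(a * t), true a = 0.5.
T = [0.0, 1.0, 2.0, 3.0]
Y = [math.exp(0.5 * t) for t in T]

def res(a):
    return [math.exp(a * t) - y for t, y in zip(T, Y)]

def jac(a):
    return [t * math.exp(a * t) for t in T]

if __name__ == "__main__":
    for sigma in (1.0, 2.0):
        print(sigma, lm_fit(res, jac, x0=0.0, sigma=sigma))
```

Because the damping shrinks with the residual, the update approaches a Gauss-Newton step near the solution and stays more conservative while the error is still large.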


Smoothing Point Adjustment-Based Evaluation of Time Series Anomaly Detection

48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023

Authors: Liu, Mingyu; Wang, Yijie; Xu, Hongzuo; Zhou, Xiaohui; Li, Bin; Wang, Yongjun. Affiliation: National University of Defense Technology, Science and Technology on Parallel and Distributed Processing Laboratory, College of Computer, Changsha, China

ISBN: (print) 9781728163277

Anomalies in time series appear consecutively, forming anomaly segments. Applying the classical point-based evaluation metrics to evaluate the detection performance of segments leads to considerable underestimation, so most related studies resort to point adjustment. This operation treats all points as true positives within a segment equally when only one individual point alarms, resulting in significant overestimation and creating an illusion of superior performance. This paper proposes smoothing point adjustment, a novel range-based evaluation protocol for time series anomaly detection. Our protocol reflects detection performance impartially by carefully considering the specific location and frequency of alarms in the raw results. It is achieved by smoothly determining the adjustment range and rewarding early detection via a ranging function and a rewarding function. Compared with other evaluation metrics, experiments on different datasets show that our protocol can yield a performance ranking of various methods more consistent with the desired situation. © 2023 IEEE.

Keywords: Anomaly Detection; Evaluation Protocol; Point Adjustment; Time Series
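
The classical point adjustment that the abstract above criticizes can be sketched in a few lines. This shows only the baseline operation (one alarm inside a ground-truth segment marks the whole segment as detected), not the paper's smoothing variant, whose ranging and rewarding functions the abstract does not specify. The function names and the toy labels are illustrative assumptions.

```python
def point_adjust(labels, preds):
    """Classical point adjustment.

    If any point inside a ground-truth anomaly segment is flagged,
    every point of that segment is counted as detected.
    """
    adjusted = list(preds)
    i, n = 0, len(labels)
    while i < n:
        if labels[i] == 1:
            j = i
            while j < n and labels[j] == 1:   # find the end of the segment
                j += 1
            if any(preds[i:j]):               # a single alarm is enough
                adjusted[i:j] = [1] * (j - i)
            i = j
        else:
            i += 1
    return adjusted

def recall(labels, preds):
    """Point-wise recall: detected anomalous points / all anomalous points."""
    tp = sum(1 for l, p in zip(labels, preds) if l == 1 and p == 1)
    return tp / sum(labels)

if __name__ == "__main__":
    labels = [0, 1, 1, 1, 1, 0, 0, 1, 1, 0]
    preds = [0, 0, 1, 0, 0, 0, 1, 0, 0, 0]
    adjusted = point_adjust(labels, preds)
    print(adjusted)                                  # [0, 1, 1, 1, 1, 0, 1, 0, 0, 0]
    print(recall(labels, preds), recall(labels, adjusted))
```

In this toy case a single alarm inside a four-point segment lifts point-wise recall from 1/6 to 4/6, which is the kind of overestimation the abstract describes.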


Area-NeRF: Area-based Neural Radiance Fields

2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023

Authors: Ye, Zonxin; Li, Wenyu; Qiao, Peng; Dou, Yong. Affiliation: National University of Defense Technology, National Key Laboratory of Parallel and Distributed Computing, School of Computer, Changsha, China

ISBN: (print) 9798350331417

Neural Radiance Field (NeRF) has received widespread attention for its photo-realistic novel view synthesis quality. Current methods mainly represent the scene based on point sampling of ray casting, ignoring the influence of the observed area changing with distance. In addition, current sampling strategies all focus on the distribution of sampling points along the ray, without paying attention to the sampling of the rays themselves. We found that the current ray sampling strategy severely reduces the convergence speed for scenes in which the camera moves forward. In this work, we extend the point representation to an area representation by using relative positional encoding, and propose a ray sampling strategy suited to forward-moving camera trajectories. We validated the effectiveness of our method on multiple public datasets. © 2023 IEEE.

Keywords: NeRF; neural radiance field; neural rendering; novel view synthesis


Optimizing Yinyang K-Means Algorithm on ARMv8 Many-Core CPUs

22nd International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2022

Authors: Zhou, Tianyang; Wang, Qinglin; Yin, Shangfei; Hao, Ruochen; Liu, Jie. Affiliations: Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China; School of Computer Science, National University of Defense Technology, Changsha 410073, China

ISBN: (print) 9783031226762

The K-Means algorithm is one of the most common clustering algorithms and is widely applied in data analysis applications. The Yinyang K-Means algorithm is a popular enhanced K-Means algorithm that avoids most unnecessary calculations by using the triangle inequality. However, the Yinyang K-Means algorithm is time-consuming when the problem size is large. Owing to their performance and energy efficiency, ARM CPUs have appeared in high-performance computing. It is therefore worthwhile to accelerate the Yinyang K-Means algorithm on ARM CPUs. In this paper, we propose an efficient parallel implementation of the Yinyang K-Means algorithm on ARMv8 many-core CPUs by means of vectorization, NUMA affinity memory optimization and data layout optimization. Experiments on two ARMv8 many-core CPUs show that our implementation runs up to 5.6 times faster than the open-source multi-threaded implementation of the Yinyang K-Means algorithm. To the best of our knowledge, this is the first work that studies the optimization of Yinyang K-Means algorithms on ARMv8 CPUs. © 2023, Springer Nature Switzerland AG.

Keywords: K-means clustering


A data representation method using distance correlation

Frontiers of Computer Science, 2025, Vol. 19, No. 1, pp. 1-14

Authors: Xinyan LIANG, Yuhua QIAN, Qian GUO, Keyin ZHENG. Affiliations: Institute of Big Data Science and Industry, Shanxi University, Taiyuan 030006, China; Key Laboratory of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China; School of Computer Science and Technology, Taiyuan University of Science and Technology, Taiyuan 030024, China; Shanxi Key Laboratory of Big Data Analysis and Parallel Computing, Taiyuan University of Science and Technology, Taiyuan 030024, China

Association between features has been demonstrated to improve the representation ability of data. However, the original association data reconstruction method may face two issues: the dimension of the reconstructed data is higher than that of the original data, and the adopted association measure does not balance effectiveness and efficiency well. To address these two issues, this paper proposes a novel association-based representation improvement method, named AssoRep. AssoRep first obtains the association between features via the distance correlation method, which has some advantages over Pearson’s correlation coefficient. Then an enhancement matrix is formed by stacking the association values of every pair of features. Next, an improved feature representation is obtained by aggregating the original features with the enhancement matrix. Finally, the improved feature representation is mapped to a low-dimensional space via principal component analysis. The effectiveness of AssoRep is validated on 120 datasets, and the results further perfect our previous work on association data reconstruction.

Keywords: association representation; distance correlation; classification
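
The abstract above relies on distance correlation as the association measure between features. The standard sample distance correlation for two one-dimensional features can be sketched as follows. This covers only the measure itself, not AssoRep's stacking, aggregation, or PCA steps, and the function names are illustrative assumptions.

```python
import math

def _centered_distances(v):
    """Pairwise |v_i - v_j| matrix, double-centered by row, column and grand means."""
    n = len(v)
    d = [[abs(a - b) for b in v] for a in v]
    row_mean = [sum(r) / n for r in d]        # equals the column means (d is symmetric)
    grand = sum(row_mean) / n
    return [[d[i][j] - row_mean[i] - row_mean[j] + grand for j in range(n)]
            for i in range(n)]

def distance_correlation(x, y):
    """Sample distance correlation between two equal-length 1-D features."""
    n = len(x)
    a, b = _centered_distances(x), _centered_distances(y)
    dcov2 = sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)) / n ** 2
    dvar_x = sum(v * v for row in a for v in row) / n ** 2
    dvar_y = sum(v * v for row in b for v in row) / n ** 2
    denom = math.sqrt(dvar_x * dvar_y)
    if denom == 0:
        return 0.0                            # a constant feature carries no association
    return math.sqrt(max(dcov2, 0.0) / denom)

if __name__ == "__main__":
    x = [-2.0, -1.0, 0.0, 1.0, 2.0]
    print(distance_correlation(x, [2 * v + 1 for v in x]))  # linear dependence
    print(distance_correlation(x, [v * v for v in x]))      # nonlinear dependence
```

The second call illustrates one advantage over Pearson's coefficient: for y = x² on a symmetric sample the Pearson correlation is zero, while the distance correlation stays clearly positive.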


DMSA: Decentralized and Multi-keyword Selective Data Sharing and Acquisition

22nd IEEE International Symposium on Parallel and Distributed Processing with Applications, ISPA 2024

Authors: Lin, Moheng; Shi, Peichang; Fu, Xiang; Jiang, Feng; Yi, Guodong. Affiliations: National University of Defense Technology, National Key Laboratory of Parallel and Distributed Computing, College of Computer Science, Changsha 410073, China; Xiangjiang Lab, Changsha 410073, China

ISBN: (print) 9798331509712

Blockchain technology has been extensively utilized in decentralized data-sharing applications, with the immutability of blockchain providing a witness for the circulation of data. However, current blockchain data-sharing solutions still fail to address the simultaneous screening needs of both the sender and receiver with multi-keywords. Without the capability to support bilateral simultaneous filtering, the disclosure of reasons for matching failures could inadvertently expose sensitive user data. Therefore, the challenge lies in enabling ciphertexts with multiple keywords and receivers with multiple interests to achieve mutual and simultaneous matching. Based on the technical foundations of SE (Searchable Encryption), MABE (Multi-Attribute Based Encryption), and polynomial fitting, this paper proposes a scheme called DMSA (Decentralized and Multi-keyword selective Sharing and selective Acquisition). This scheme can satisfy soundness, enabling ciphertexts carrying multiple keywords and receivers representing multiple interests to match each other simultaneously. We conducted a security analysis that confirms the security of DMSA against chosen-plaintext attacks. Our experimental results demonstrate a significant efficiency improvement, with a 67% increase over single-keyword data-sharing schemes and a 16% enhancement compared to the existing multi-keyword data-sharing solution. © 2024 IEEE.

Keywords: Ciphertext


FedEAE: Federated Learning Based Privacy-Preserving Event Argument Extraction

12th National CCF Conference on Natural Language Processing and Chinese Computing, NLPCC 2023

Authors: Hu, Fei; Dong, Shenpo; Chang, Tao; Zhou, Jie; Li, Haili; Wang, Jingnan; Chen, Rui; Liu, Haijiao; Wang, Xiaodong. Affiliation: National Key Laboratory of Parallel and Distributed Computing, College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China

ISBN: (print) 9783031446955

Benefiting from Pre-trained Language Models (PLMs), Event Argument Extraction (EAE) methods have achieved SOTA performance in general scenarios of Event Extraction (EE). However, with increasing concerns and regulations on data privacy, aggregating distributed data among different institutions in some privacy-sensitive domains (e.g., medical record analysis and financial statement analysis) becomes very difficult, and it is hard to train an accurate EAE model with limited local data. Federated Learning (FL) provides promising methods for a large number of clients to collaboratively learn a shared global model without the need to exchange privacy-sensitive data. Therefore, we propose a privacy-preserving EAE method named FedEAE based on FL to solve the current difficulties. To better adapt to federated scenarios, we design a dataset named FedACE generated from the ACE2005 dataset under IID and Non-IID settings for our experiments. Extensive experiments show that FedEAE achieves promising performance compared to existing baselines, thus validating the effectiveness of our method. To the best of our knowledge, FedEAE is the first to apply FL in the EAE task. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2023.

Keywords: Extraction


Deep Time Series Anomaly Detection with Local Temporal Pattern Learning

2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025

Authors: Li, Yizhou; Wang, Yijie; Xu, Hongzuo; Zhou, Xiaohui. Affiliations: National Key Laboratory of Parallel and Distributed Computing, College of Computer Science and Technology, National University of Defense Technology, Changsha 410073, China; Beijing 100091, China

ISBN: (print) 9798350368741

Self-supervised time series anomaly detection (TSAD) demonstrates remarkable performance improvement by extracting high-level data semantics through proxy tasks. Nonetheless, most existing self-supervised TSAD techniques rely on manual- or neural-based transformations when designing proxy tasks, overlooking the intrinsic temporal patterns of time series. This paper proposes a local temporal pattern learning-based time series anomaly detection (LTPAD). LTPAD first generates sub-sequences. Pairwise sub-sequences naturally manifest proximity relationships along the time axis, and such correlations can be used to construct supervision and train neural networks to facilitate the learning of temporal patterns. Time intervals between two sub-sequences serve as labels for sub-sequence pairs. By classifying these labeled data pairs, our model captures the local temporal patterns of time series, thereby modeling the temporal pattern-aware "normality". Abnormal scores of testing data are acquired by evaluating their conformity to these learned patterns shared in training data. Extensive experiments show that LTPAD significantly outperforms state-of-the-art competitors. © 2025 IEEE.

Keywords: Local Temporal Pattern; Self-supervised Learning; Time Series Anomaly Detection
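
The proxy task in the abstract above, labelling pairs of sub-sequences by the time interval between them, can be sketched as a data-preparation step. This is only an illustration under stated assumptions: the abstract does not give the window length, the sampling scheme, or how intervals map to classes, so `win`, `max_gap`, uniform random sampling, and the label `gap - 1` are all hypothetical choices, and the classifier itself is omitted.

```python
import random

def make_interval_pairs(series, win, max_gap, n_pairs, seed=0):
    """Sample (sub_a, sub_b, label) triples from one time series.

    sub_b starts `gap` steps after sub_a, with 1 <= gap <= max_gap,
    and the class label is gap - 1 (assumed mapping).
    """
    if len(series) < win + max_gap:
        raise ValueError("series too short for the requested window and gap")
    rng = random.Random(seed)
    last_start = len(series) - win            # last valid window start index
    pairs = []
    for _ in range(n_pairs):
        gap = rng.randint(1, max_gap)
        start = rng.randint(0, last_start - gap)
        sub_a = series[start:start + win]
        sub_b = series[start + gap:start + gap + win]
        pairs.append((sub_a, sub_b, gap - 1))
    return pairs

if __name__ == "__main__":
    toy = list(range(100))                    # stand-in for a real series
    for sub_a, sub_b, label in make_interval_pairs(toy, win=10, max_gap=5, n_pairs=3):
        print(sub_a[0], sub_b[0], label)
```

A classifier trained to predict `label` from the two windows is the kind of supervision the abstract describes; at test time, low conformity to the learned interval patterns would serve as the anomaly score.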

