Search results - Inner Mongolia University Library

arXiv, 2024

Authors: Gao, Fei; Hu, Ming; Xie, Zhiyu; Shi, Peichang; Xie, Xiaofei; Yi, Guodong; Wang, Huaimin. Affiliations: National Key Lab. of Parallel and Distributed Processing, National University of Defense Technology, Changsha, China; School of Computing and Information Systems, Singapore Management University, Singapore; Xiangjiang Lab, Changsha, China

With advancements in AI infrastructure and Trusted Execution Environment (TEE) technology, Federated Learning as a Service (FLaaS) through JointCloud Computing (JCC) is promising to break through the resource constraints caused by heterogeneous edge devices in the traditional Federated Learning (FL) paradigm. Specifically, with the protection from TEE, data owners can achieve efficient model training with high-performance AI services in the cloud. By providing additional FL services, cloud service providers can achieve collaborative learning among data owners. However, FLaaS still faces three challenges: i) low training performance caused by heterogeneous data among data owners, ii) high communication overhead among different clouds (i.e., data centers), and iii) lack of efficient resource scheduling strategies to balance training time and cost. To address these challenges, this paper presents a novel asynchronous FL approach named NebulaFL for collaborative model training among multiple clouds. To address data heterogeneity issues, NebulaFL adopts a version control-based asynchronous FL training scheme in each data center to balance training time among data owners. To reduce communication overhead, NebulaFL adopts a decentralized model rotation mechanism to achieve effective knowledge sharing among data centers. To balance training time and cost, NebulaFL integrates a reward-guided strategy for data owner selection and resource scheduling. The experimental results demonstrate that, compared to the state-of-the-art FL methods, NebulaFL can achieve up to 5.71% accuracy improvement. In addition, NebulaFL can reduce communication overhead by up to 50% and costs by up to 61.94% under a target accuracy. Copyright © 2024, The Authors. All rights reserved.

Keywords: Costs

Trust-worth multi-representation learning for audio classification with uncertainty estimation

The Journal of the Acoustical Society of America, 2023, Vol. 153, No. 3_Supplement, p. A125

Authors: Kele Xu, Kang You, Ming Feng, Boqing Zhu. Affiliations: National Key Lab. of Parallel and Distributed Processing (PDL), 107 Yanwachi, Changsha 410073, China, kelele.xu@***; Tongji Univ., Shanghai, China; National Key Lab. of Parallel and Distributed Processing (PDL), Changsha, China

Multi-view learning has been explored for audio classification tasks, exploiting different representations of audio signals, ranging from MFCC and CQT to raw signals. The quality of each view may vary for different audio signals, and appropriate uncertainty quantification for each view has not been fully explored. In this work, we explore a trusted multi-view learning framework for classification tasks in order to fully incorporate different views. Our framework consists of three parallel Transformer branches (Gammatone spectrogram, log-Mel and CQT), which are combined using the uncertainty estimates of the individual branches. In addition to computing the classification probabilities, the framework also yields the uncertainty of each representation. We first calculate the evidence based on feature vectors to obtain the probabilities and the uncertainty of the classification for the Gammatone, log-Mel and CQT branches. By integrating the confidence from each of the different representations using the Dempster–Shafer theory, the classification framework can provide higher accuracy and confidence. To demonstrate the effectiveness of the proposed framework, we conduct experiments on the GTZAN dataset. The obtained results show that our method reaches an accuracy of 83.0%, which significantly outperforms single representation-based methods while providing uncertainty estimation for different views.
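
The abstract does not give the fusion formulas, so the following is only a minimal sketch of how evidence-based opinions are commonly combined with a reduced form of Dempster's rule in trusted multi-view classification. The function names and the example evidence values are hypothetical, and the mapping from evidence to belief (Dirichlet strength S = sum of evidence + number of classes) is an assumption rather than the authors' confirmed formulation.

```python
def opinion_from_evidence(evidence):
    """Map non-negative per-class evidence to (beliefs, uncertainty).

    Assumed convention: Dirichlet strength S = sum(e_k + 1), belief
    b_k = e_k / S, uncertainty u = K / S, so sum(b) + u == 1.
    """
    num_classes = len(evidence)
    strength = sum(e + 1.0 for e in evidence)
    beliefs = [e / strength for e in evidence]
    uncertainty = num_classes / strength
    return beliefs, uncertainty


def combine(op1, op2):
    """Fuse two opinions over the same classes (reduced Dempster's rule)."""
    b1, u1 = op1
    b2, u2 = op2
    # Conflict: belief mass the two views assign to different classes.
    conflict = sum(b1[i] * b2[j]
                   for i in range(len(b1))
                   for j in range(len(b2)) if i != j)
    # scale > 0 whenever both uncertainties are positive, which holds
    # for opinions produced by opinion_from_evidence.
    scale = 1.0 - conflict
    beliefs = [(b1[k] * b2[k] + b1[k] * u2 + b2[k] * u1) / scale
               for k in range(len(b1))]
    uncertainty = (u1 * u2) / scale
    return beliefs, uncertainty


# Hypothetical evidence from three branches for a 3-class problem.
gammatone = opinion_from_evidence([9.0, 1.0, 0.0])
log_mel = opinion_from_evidence([4.0, 2.0, 0.0])
cqt = opinion_from_evidence([0.5, 0.5, 0.5])  # a low-quality, unsure view
fused = combine(combine(gammatone, log_mel), cqt)
```

Under this rule a view with high uncertainty (here the hypothetical `cqt` opinion) contributes little to the fused beliefs, and the fused uncertainty never exceeds that of the most confident input.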

Underwater acoustic classification using masked modeling-based swin transformer

The Journal of the Acoustical Society of America, 2022, Vol. 152, No. 4_Supplement, p. A296

Authors: Kang You, Kele Xu, Ming Feng, Boqing Zhu. Affiliations: Tongji Univ., Shanghai, China; National Key Lab. of Parallel and Distributed Processing (PDL), 107 Yanwachi, Changsha 410073, China, kelele.xu@***; National Key Lab. of Parallel and Distributed Processing (PDL), Changsha, China

Underwater acoustic classification is a challenging task due to complex background noise and complicated sound propagation patterns. How the signals are represented is important for the classification task. In this paper, we propose a novel representation learning method for underwater acoustic signals, leveraging the masked modeling-based self-supervised learning paradigm. Specifically, we first explore modifying the Swin Transformer architecture to learn a general representation of the audio signals, accompanied by random masking of the log-mel spectrogram. The main goal of the pretext task is to predict the masked parts of the log-mel spectrogram and the Gammatone spectrogram, so that the model can learn not only local and global features but also complementary information. For the downstream task, we use the labelled datasets to fine-tune the pre-trained model. On the DeepShip dataset, which consists of 47 h and 4 min of ship sounds in four categories, our model achieves state-of-the-art performance compared with competitive approaches. Our method obtains a classification accuracy of 78.03%, which is better than the separable convolution autoencoder (SCAE) and than using the constant-Q transform spectrogram. This work demonstrates the potential of masked modeling-based self-supervised learning for understanding and interpreting underwater acoustic signals.
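
To make the random-masking step concrete, here is a small sketch of choosing which spectrogram patches to hide before the model is asked to reconstruct them. The patch grid size, the 75% mask ratio, and the function name are illustrative assumptions; the abstract does not state the values the authors used.

```python
import random


def random_patch_mask(n_freq_patches, n_time_patches, mask_ratio, seed=None):
    """Return a 2-D boolean grid; True marks a patch hidden from the encoder.

    The spectrogram is assumed to be cut into a grid of
    n_freq_patches x n_time_patches non-overlapping patches.
    """
    rng = random.Random(seed)
    total = n_freq_patches * n_time_patches
    n_masked = int(total * mask_ratio)
    # Sample patch indices without replacement.
    masked = set(rng.sample(range(total), n_masked))
    return [[(f * n_time_patches + t) in masked
             for t in range(n_time_patches)]
            for f in range(n_freq_patches)]


# Hypothetical 4 x 8 patch grid with three quarters of the patches masked.
mask = random_patch_mask(4, 8, mask_ratio=0.75, seed=0)
n_hidden = sum(cell for row in mask for cell in row)
```

In a masked-modeling pretext task, the reconstruction loss would typically be computed only on the patches where the mask is True.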

Load Balancing a Multi-Block Grids-based Application on Heterogeneous Platform

IEEE International Conference on Computational Science and Engineering, CSE

Authors: Yonggang Che, Chuanfu Xu, Zhenghua Wang. Affiliations: Institute for Quantum Information & State Key Lab. of High Performance Computing, College of Computer, National University of Defense Technology, Changsha, China; Science and Technology on Parallel and Distributed Processing Lab, National University of Defense Technology, Changsha, China

ISBN (digital): 9781665403986

ISBN (print): 9781665403993

This paper presents a load balancing method for a multi-block grids-based CFD (Computational Fluid Dynamics) application on a heterogeneous platform. The method includes an asymmetric task scheduling scheme and a load balancing model. The idea is to balance the computing speed between the CPU and the coprocessor by adjusting the workload and the number of threads on both sides. Optimal load balance parameters are empirically selected, guided by a performance model. Performance evaluation is conducted on a computer server consisting of two Intel Xeon E5-2670 v3 CPUs and two MIC coprocessors (Xeon Phi 5110P and Xeon Phi 7120P) for the simulation of turbulent combustion in a supersonic combustor. The results show that the performance is highly sensitive to the load balance parameters. With the optimal parameters, heterogeneous computing achieves a maximum speedup of 2.30× for a 6-block mesh and a maximum speedup of 2.66× for an 8-block mesh over CPU-only computing.

Keywords: Computational modeling, Computational fluid dynamics, Load management, Servers, Task analysis, Load modeling, Coprocessors
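
The idea of balancing computing speed between the CPU and the coprocessor can be illustrated with a simple proportional split. This is a sketch under strong assumptions, not the paper's performance model: the work is treated as finely divisible grid cells, each device has a fixed measured throughput, and data transfer and synchronization costs are ignored. All names and numbers are hypothetical.

```python
def split_workload(total_cells, cpu_rate, cop_rate):
    """Split cells so both devices finish at about the same time.

    cpu_rate and cop_rate are measured throughputs (cells per second).
    """
    if cpu_rate <= 0 or cop_rate <= 0:
        raise ValueError("rates must be positive")
    cpu_cells = round(total_cells * cpu_rate / (cpu_rate + cop_rate))
    return cpu_cells, total_cells - cpu_cells


def predicted_speedup(total_cells, cpu_rate, cop_rate):
    """Estimated speedup over CPU-only execution for the balanced split."""
    cpu_cells, cop_cells = split_workload(total_cells, cpu_rate, cop_rate)
    # The slower side determines the heterogeneous run time.
    hetero_time = max(cpu_cells / cpu_rate, cop_cells / cop_rate)
    cpu_only_time = total_cells / cpu_rate
    return cpu_only_time / hetero_time


# Hypothetical case: the CPU is three times faster than the coprocessor.
cpu_share, cop_share = split_workload(1000, cpu_rate=3.0, cop_rate=1.0)
speedup = predicted_speedup(1000, cpu_rate=3.0, cop_rate=1.0)
```

In practice the throughputs themselves depend on the thread counts on each side, which is one reason the abstract reports that performance is highly sensitive to the load balance parameters.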

Deep learning-based age estimation using B-mode ultrasound tongue imaging

The Journal of the Acoustical Society of America, 2021, Vol. 150, No. 4_Supplement, p. A190

Authors: Kele Xu, Tamás G. Csapó, Ming Feng. Affiliations: National Key Lab. of Parallel and Distributed Processing (PDL), 107 Yanwachi, Changsha 410073, China, kelele.xu@***; Budapest Univ. of Technol. and Economics, Budapest, Hungary; Tongji Univ., Shanghai, China

The feasibility of age estimation is explored using ultrasound tongue images of speakers. Motivated by the success of deep learning, a deep convolutional neural network model is trained on the UltraSuite dataset. The deep model achieves a mean absolute error (MAE) of 2.03 years for data from typically developing children, while the MAE is 4.87 years for data from children with speech sound disorders, which suggests that age estimation using ultrasound is more challenging for children with speech sound disorders. We also explore visualizing what the deep model learns for the age estimation task. We first visualize the convolutional layers in the learned convolutional neural network. We observe that the deep model not only focuses on the contour in the ultrasound tongue image, but also pays more attention to the regions corresponding to the tendon and the tongue root, which may provide guidance for future ultrasound tongue imaging interpretation tasks. The developed method can be used as a tool to evaluate the performance of speech therapy sessions.

Low-latency last-level cache structure based on grouped cores in Chip Multi-Processor

IEEE International Conference on Performance, Computing and Communications (IPCCC)

Authors: Jinbo Xu, Weixia Xu, Kefei Wang, Zhengbin Pang. Affiliations: School of Computer, National University of Defense Technology, Changsha, China; National Lab. for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China

ISBN (print): 9781479975761

The Last-Level Cache (LLC) plays an important role in a Chip Multi-Processor (CMP). The objective of this work is to optimize the structure and management strategy of the LLC. For an 8-core CMP, an LLC structure based on grouped cores is proposed, in which the 8 cores are divided into 4 groups. All LLC resources are classified into three types: fixed private cache, dynamic private cache and dynamic shared cache. The layout of the LLC structure and the corresponding dynamic partitioning strategy are designed to achieve low access latency and high efficiency. Experimental results on a full-system simulator suggest that the proposed structure and method reduce the access latency by 2% to 12% compared with previous works, such as the tiled structure, cache-centered structure and core-centered structure. Consequently, performance measured by IPC is improved by up to 7%. The contribution of this paper benefits CMP performance and applies not only to 8-core CMPs but to all small-scale CMPs.

Keywords: Dynamic scheduling, Multicore processing, Educational institutions, Organizations, Delays, Semiconductor device measurement, Research and development

Skyline query processing on interval uncertain data

Authors: Li, Xiaoyong; Wang, Yijie; Li, Xiaoling; Wang, Guangdong. Affiliations: National Lab. for Parallel and Distributed Processing, School of Computer Science, National University of Defense Technology, Changsha, China

ISBN (print): 9780769546698

Many recent applications involve processing and analyzing uncertain data. Recently, several research efforts have addressed answering skyline queries efficiently on massive uncertain datasets. However, the research lacks methods to compute these queries on uncertain data where each dimension of the uncertain object is represented as an interval or an exact value. In this paper, we extensively study the problem of skyline queries on these interval-based uncertain objects, which has never been studied before. We first model the problem of querying the skylines on interval datasets. We then present two efficient, I/O-optimal algorithms for conventional interval skyline queries and constrained interval skyline queries, respectively. Extensive experiments demonstrate the efficiency of all our proposed algorithms. © 2012 IEEE.

Keywords: Query processing
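
The abstract does not define dominance between interval objects, so the sketch below uses one plausible and deliberately conservative semantics as an assumption: object a dominates object b only if it does so for every possible value inside the intervals, with smaller values preferred. The function names and the naive quadratic scan are illustrative and are not the paper's I/O-optimal algorithms.

```python
def certainly_dominates(a, b):
    """True if a dominates b for every instantiation of the intervals.

    Each object is a list of (low, high) pairs, one per dimension;
    an exact value v is written as (v, v). Smaller is better.
    """
    # a's worst case must be no worse than b's best case everywhere...
    no_worse = all(a_hi <= b_lo for (_, a_hi), (b_lo, _) in zip(a, b))
    # ...and strictly better in at least one dimension.
    strictly = any(a_hi < b_lo for (_, a_hi), (b_lo, _) in zip(a, b))
    return no_worse and strictly


def interval_skyline(objects):
    """Objects that no other object certainly dominates (naive O(n^2))."""
    return [o for o in objects
            if not any(certainly_dominates(p, o)
                       for p in objects if p is not o)]


# Hypothetical 2-D objects; the second mixes an interval and an exact value.
a = [(1, 2), (1, 2)]
b = [(3, 4), (5, 5)]
c = [(0, 5), (0, 1)]
skyline = interval_skyline([a, b, c])
```

Here `a` certainly dominates `b`, while `a` and `c` overlap and neither certainly dominates the other, so both remain in the result.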

Improving cluster computing performance based on job futurity prediction

2010 3rd International Conference on Advanced Computer Theory and Engineering, ICACTE 2010

Authors: Salami, Hossein; Saadatfar, Hamid; Fard, Farhad Rahmani; Shekofteh, S. Kazem; Deldari, Hossein. Affiliation: Parallel and Distributed Processing Lab., Engineering Department, Ferdowsi University of Mashhad, Mashhad, Iran

ISBN (print): 9781424465408

Today's large-scale, fault-prone distributed systems need preventative and proactive management, and recent research has increasingly turned to such mechanisms. Since this view emerged, event prediction has been regarded as an effective approach to managing errors preventively. In this work, we attempt to improve system performance by using a job futurity predictor as an advisor for the system scheduler. We take a broad view of failure sources and therefore use a comprehensive data set covering the system, user and job domains. Using the predictor's results leads to a significant improvement in performance metrics. © 2010 IEEE.

Keywords: Forecasting

JSM: Job submission manager for large-scale distributed systems based on game theory

2010 3rd International Conference on Advanced Computer Theory and Engineering, ICACTE 2010

Authors: Saadatfar, Hamid; Salami, Hossein; Deldari, Hossein; Mashhadi, Habib Rajabi; Fard, Farhad Rahmani. Affiliation: Parallel and Distributed Processing Lab., Engineering Department, Ferdowsi University of Mashhad, Mashhad, Iran

ISBN (print): 9781424465408

As prediction methods become richer and more information about system behavior becomes accessible, proactive strategies play an increasingly crucial role in developing more reliable and efficient systems. However, the goals of prediction and the ways its results can be used to improve a system are still topics that draw researchers' attention. In this work, we attempt to decrease job wait time and failure rate by using the results of a job futurity predictor. To achieve this goal, a system component called JSM is proposed. JSM consults the predictor and employs a game theory based model to probabilistically reject jobs that are likely to fail. Furthermore, to avoid mistakenly rejecting safe jobs, JSM intelligently adapts its decisions to the system's situation. Experimental results show a significant reduction in job wait time and failure rate in comparison with other related work. © 2010 IEEE.

Keywords: Forecasting

Implementing Sparse Matrix-Vector multiplication using CUDA based on a hybrid sparse matrix format

2010 International Conference on Computer Application and System Modeling, ICCASM 2010

Authors: Cao, Wei; Lu, Yao; Li, Zongzhe; Wang, Yongxian; Wang, Zhenghua. Affiliation: National Key Lab. for Parallel and Distributed Processing, National University of Defense Technology, Changsha, China

ISBN (print): 9781424472369

The Sparse Matrix-Vector product (SpMV) is a key operation in engineering and scientific computing. Methods for efficiently implementing it in parallel are critical to the performance of many applications. Modern Graphics Processing Units (GPUs), coupled with the advent of general purpose programming environments like NVIDIA's CUDA, have gained interest as a viable architecture for data-parallel general purpose computations. SpMV implementations using CUDA based on common sparse matrix formats have already appeared. Among them, the implementation based on the ELLPACK-R format performs best. However, in this implementation, when the maximum number of nonzeros per row differs substantially from the average, the threads suffer from load imbalance. This paper proposes a new matrix storage format called ELLPACK-RP, which combines the ELLPACK-R format with the JAD format, and implements SpMV using CUDA based on it. The results show that it can decrease the load imbalance and improve SpMV performance efficiently. © 2010 IEEE.

Keywords: Graphics processing unit
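
For readers unfamiliar with the baseline format, the following sketch shows SpMV on ELLPACK-R as it is usually described: padded value and column-index arrays plus an array of true row lengths. It covers only ELLPACK-R, not the proposed ELLPACK-RP, whose layout the abstract does not specify. The sketch is plain sequential Python with row-major storage for readability; a CUDA version would typically store the arrays column-major and assign one thread per row. Function names are hypothetical.

```python
def to_ellpack_r(dense):
    """Convert a dense row list to (values, cols, row_len) in ELLPACK-R."""
    row_len = [sum(1 for v in row if v != 0) for row in dense]
    width = max(row_len, default=0)  # every row is padded to this width
    values, cols = [], []
    for row in dense:
        nonzeros = [(v, j) for j, v in enumerate(row) if v != 0]
        pad = width - len(nonzeros)
        values.append([float(v) for v, _ in nonzeros] + [0.0] * pad)
        cols.append([j for _, j in nonzeros] + [0] * pad)
    return values, cols, row_len


def spmv_ellpack_r(values, cols, row_len, x):
    """Compute y = A @ x; each outer iteration is one GPU thread's work."""
    y = []
    for i in range(len(row_len)):
        acc = 0.0
        # row_len lets the loop stop before the padding entries.
        for k in range(row_len[i]):
            acc += values[i][k] * x[cols[i][k]]
        y.append(acc)
    return y


# Small example with uneven rows: lengths 2, 1 and 4.
dense = [[4, 0, 0, 1],
         [0, 0, 2, 0],
         [1, 2, 3, 4]]
values, cols, row_len = to_ellpack_r(dense)
y = spmv_ellpack_r(values, cols, row_len, [1.0, 2.0, 3.0, 4.0])
```

The uneven `row_len` values illustrate the imbalance the abstract mentions: when one thread handles each row, the thread for the longest row does four multiply-adds while the others do one or two.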
