Search Results - Inner Mongolia University Library

Detailed and clock-driven simulation for HPC interconnection network

Frontiers of Computer Science, 2016, Vol. 10, No. 5, pp. 797-811

Authors: Wenhao ZHOU, Juan CHEN, Chen CUI, Qian WANG, Dezun DONG, Yuhua TANG. State Key Laboratory of High Performance Computing, School of Computer, National University of Defense Technology, Changsha 410073, China; Science and Technology on Parallel and Distributed Processing Laboratory, National University of Defense Technology, Changsha 410073, China

The performance and energy consumption of high performance computing (HPC) interconnection networks are of great significance to the whole supercomputer, and building an HPC interconnection network simulation platform is very important for research on HPC software and hardware technologies. To effectively evaluate the performance and energy consumption of HPC interconnection networks, this article designs and implements a detailed and clock-driven HPC interconnection network simulation platform, called HPC-NetSim. HPC-NetSim uses application-driven workloads and inherits the characteristics of the detailed and flexible cycle-accurate network simulator. Besides, it offers a large set of configurable network parameters in terms of topology and routing, and supports the router's on/off states. We compare the simulated execution time with the real execution time of a Tianhe-2 subsystem, and the mean error is only 2.7%. In addition, we simulate the network behaviors with different network structures and low-power modes. The results are also consistent with the theoretical analyses.

Keywords: high performance computing; clock-driven simulation; interconnection network; BookSim

Detecting Duplicate Contributions in Pull-Based Model Combining Textual and Change Similarities

Journal of Computer Science & Technology, 2021, Vol. 36, No. 1, pp. 191-206

Authors: Zhi-Xing Li, Yue Yu, Tao Wang, Gang Yin, Xin-Jun Mao, Huai-Min Wang. Key Laboratory of Parallel and Distributed Computing, College of Computer, National University of Defense Technology, Changsha 410073, China; Laboratory of Software Engineering for Complex Systems, College of Computer, National University of Defense Technology, Changsha 410073, China

Communication and coordination between OSS developers who do not work physically in the same location have always been the challenging *** pull-based development model, as the state-of-art collaborative development mechanism, provides high openness and transparency to improve the visibility of contributors' ***, duplicate contributions may still be submitted by more than one contributor to solve the same problem due to the parallel and uncoordinated nature of this *** not detected in time, duplicate pull-requests can cause contributors and reviewers to waste time and energy on redundant *** this paper, we propose an approach combining textual and change similarities to automatically detect duplicate contributions in pull-based model at submission *** a new-arriving contribution, we first compute textual similarity and change similarity between it and other existing *** then our method returns a list of candidate duplicate contributions that are most similar to the new contribution in terms of the combined textual and change *** evaluation shows that 83.4% of the duplicates can be found on average when we use the combined textual and change similarity, compared to 54.8% using only textual similarity and 78.2% using only change similarity.

Keywords: pull-request; duplicate detection; textual similarity; change similarity
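
As a rough illustration of the idea in the abstract above (score a new contribution against existing ones on both text and changes, then return the most similar candidates), the following sketch uses Jaccard overlap as a stand-in for both measures. The record fields (`id`, `title`, `files`), the equal weighting, and the function names are assumptions made for this example; they are not taken from the paper, whose actual similarity measures are not given in this listing.

```python
def jaccard(a, b):
    """Jaccard overlap of two collections, 0.0 when both are empty."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0
    return len(a & b) / len(a | b)


def combined_similarity(new_pr, old_pr, text_weight=0.5):
    """Weighted mix of a textual score and a change score (assumed weighting)."""
    # Textual similarity: overlap of lower-cased title tokens.
    text = jaccard(new_pr["title"].lower().split(),
                   old_pr["title"].lower().split())
    # Change similarity: overlap of the sets of touched file paths.
    change = jaccard(new_pr["files"], old_pr["files"])
    return text_weight * text + (1 - text_weight) * change


def top_candidates(new_pr, existing, k=3):
    """Return ids of the k existing pull-requests most similar to new_pr."""
    ranked = sorted(existing,
                    key=lambda pr: combined_similarity(new_pr, pr),
                    reverse=True)
    return [pr["id"] for pr in ranked[:k]]


new_pr = {"id": 9, "title": "Fix null pointer in parser",
          "files": ["src/parser.py"]}
existing = [
    {"id": 1, "title": "Fix null pointer crash in parser",
     "files": ["src/parser.py", "tests/test_parser.py"]},
    {"id": 2, "title": "Update README", "files": ["README.md"]},
    {"id": 3, "title": "Refactor parser", "files": ["src/parser.py"]},
]
print(top_candidates(new_pr, existing, k=2))
```

Keeping the two scores separate until the final weighted sum makes it easy to compare text-only, change-only, and combined rankings, which is the comparison the abstract reports.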

Exploiting a depth context model in visual tracking with correlation filter

Frontiers of Information Technology & Electronic Engineering, 2017, Vol. 18, No. 5, pp. 667-679

Authors: Zhao-yun CHEN, Lei LUO, Da-fei HUANG, Mei WEN, Chun-yuan ZHANG. College of Computer, National University of Defense Technology, Changsha 410073, China; National Key Laboratory of Parallel and Distributed Processing, Changsha 410073, China

Recently correlation filter based trackers have attracted considerable attention for their high computational efficiency. However, they cannot handle occlusion and scale variation well enough. This paper aims at preventing the tracker from failure in these two situations by integrating the depth information into a correlation filter based tracker. By using RGB-D data, we construct a depth context model to reveal the spatial correlation between the target and its surrounding regions. Furthermore, we adopt a region growing method to make our tracker robust to occlusion and scale variation. Additional optimizations such as a model updating scheme are applied to improve the performance for longer video sequences. Both qualitative and quantitative evaluations on challenging benchmark image sequences demonstrate that the proposed tracker performs favourably against state-of-the-art algorithms.

Keywords: visual tracking; depth context model; correlation filter; region growing

Bidirectional Influence and Interaction for Multiagent Reinforcement Learning

IEEE Transactions on Artificial Intelligence, 2024, Vol. 5, No. 10, pp. 4984-4995

Authors: Sun, Shaoqi; Xu, Kele; Feng, Dawei; Ding, Bo. National University of Defense Technology, National Key Laboratory of Parallel and Distributed Processing, Changsha 410003, China

In recent years, multiagent reinforcement learning (MARL) has demonstrated considerable potential across diverse applications. However, in reinforcement learning environments characterized by sparse rewards, the scarcity of reward signals may give rise to reward conflicts among agents. In these scenarios, each agent tends to compete to obtain limited rewards, deviating from collaborative efforts aimed at achieving collective team objectives. This not only amplifies the learning challenge but also imposes constraints on the overall learning performance of agents, ultimately compromising the attainment of team goals. To mitigate the conflicting competition for rewards among agents in MARL, we introduce the bidirectional influence and interaction (BDII) MARL framework. This innovative approach draws inspiration from the collaborative ethos observed in human social cooperation, specifically the concept of "sharing joys and sorrows." The fundamental concept behind BDII is to empower agents to share their individual rewards with collaborators, fostering a cooperative rather than competitive behavioral paradigm. This strategic shift aims to resolve the pervasive issue of reward conflicts among agents operating in sparse-reward environments. BDII incorporates two key factors, namely the Gaussian kernel distance between agents (physical distance) and policy diversity among agents (logical distance). These two factors collectively contribute to the dynamic adjustment of reward allocation coefficients, culminating in the formation of reward distribution weights. The incorporation of these weights facilitates the equitable sharing of agents’ contributions to rewards, promoting a cooperative learning environment. Through extensive experimental evaluations, we substantiate the efficacy of BDII in addressing the challenge of reward conflicts in MARL. Our research findings affirm that BDII significantly mitigates reward conflicts, ensuring that agents consistently align with the origi

Keywords: reinforcement learning

U-shaped Dual Attention Transformer: An Efficient Transformer Based on Channel and Spatial Attention

4th International Conference on Artificial Intelligence, Robotics, and Communication, ICAIRC 2024

Authors: Zhai, Zhaoyuan; Qiao, Peng; Li, Rongchun; Zhou, Zhen. National University of Defense Technology, National Key Laboratory of Parallel and Distributed Computing, Changsha, China

ISBN: (print) 9798331531225

Transformer-based methods have demonstrated remarkable performance on image super-resolution tasks. Due to their high computational complexity, researchers have been working to achieve a balance between computation costs and performance. Restormer has achieved a commendable balance by utilizing global channel attention. However, its performance is limited by insufficient local pixel reconstruction. In this paper, we propose a U-shaped Dual Attention Transformer (UDAT) with a local-global receptive field, addressing the limitation of Restormer in local pixel reconstruction. We propose a dense window channel attention that enhances the local feature representation and is more efficient in computational complexity. Experiments demonstrate that our UDAT achieves superior performance compared with Restormer on benchmark datasets, surpassing Restormer by 0.57 dB on the Urban100 dataset. On par with SwinIR, our method reduces computational complexity by 3.2 times and improves inference speed by 2 times, achieving a better balance between computational costs and performance. © 2024 IEEE.

Keywords: pixels

Jammer Localization for Wireless Sensor Networks

Chinese Journal of Electronics (电子学报(英文版)), 2011, Vol. 20, No. 4, pp. 735-738

Authors: SUN Yanqiang, WANG Xiaodong, ZHOU Xingming. National Key Laboratory for Parallel and Distributed Processing, College of Computer Science, National University of Defense Technology, Changsha, China

Jamming attacks can severely affect the performance of wireless sensor networks (WSNs) due to the broadcast nature of the wireless medium. In order to localize the source of the attack, in this paper we propose a jammer localization algorithm named Minimum-circle-covering based localization (MCCL). Compared with existing solutions that rely on wireless propagation parameters, MCCL depends only on the location information of sensor nodes at the border of the jammed region. MCCL uses plane geometry knowledge, especially the minimum circle covering technique, to form an approximate jammed region, and the center of that region is then treated as the estimated position of the jammer. Simulation results show that MCCL achieves higher accuracy than other existing solutions in terms of the jammer's transmission range and sensitivity to node density.

Keywords: wireless sensor networks; jammer localization; sensor nodes; location information; covering technique; localization algorithm; wireless propagation; geometry knowledge
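
The geometric step named in the abstract above, covering the border nodes of the jammed region with a minimum circle and taking its center as the jammer estimate, can be sketched as follows. This is a generic incremental minimum-enclosing-circle routine, not the authors' implementation; the function names and the sample coordinates are hypothetical, and how MCCL selects the border nodes is not described in this listing.

```python
import math
import random


def circle_from_two(a, b):
    """Smallest circle through two points: centered on their midpoint."""
    return ((a[0] + b[0]) / 2, (a[1] + b[1]) / 2, math.dist(a, b) / 2)


def circle_from_three(a, b, c):
    """Circumcircle of three points, or None if they are collinear."""
    d = 2 * (a[0] * (b[1] - c[1]) + b[0] * (c[1] - a[1]) + c[0] * (a[1] - b[1]))
    if abs(d) < 1e-12:
        return None
    sa, sb, sc = (p[0] ** 2 + p[1] ** 2 for p in (a, b, c))
    ux = (sa * (b[1] - c[1]) + sb * (c[1] - a[1]) + sc * (a[1] - b[1])) / d
    uy = (sa * (c[0] - b[0]) + sb * (a[0] - c[0]) + sc * (b[0] - a[0])) / d
    return (ux, uy, math.dist((ux, uy), a))


def contains(circle, p, eps=1e-9):
    """True if point p lies inside or on the circle (with tolerance)."""
    return math.dist((circle[0], circle[1]), p) <= circle[2] + eps


def min_enclosing_circle(points):
    """Incremental minimum enclosing circle; returns (cx, cy, radius)."""
    pts = list(points)
    random.Random(0).shuffle(pts)  # fixed seed keeps the run reproducible
    circle = None
    for i, p in enumerate(pts):
        if circle is not None and contains(circle, p):
            continue
        circle = (p[0], p[1], 0.0)  # p must lie on the new boundary
        for j in range(i):
            q = pts[j]
            if contains(circle, q):
                continue
            circle = circle_from_two(p, q)  # p and q on the boundary
            for k in range(j):
                r = pts[k]
                if contains(circle, r):
                    continue
                three = circle_from_three(p, q, r)
                if three is not None:
                    circle = three
    return circle


def estimate_jammer_position(border_nodes):
    """Jammer estimate = center of the circle covering the border nodes."""
    cx, cy, _ = min_enclosing_circle(border_nodes)
    return (cx, cy)


# Hypothetical border-node coordinates (metres).
border = [(0.0, 0.0), (2.0, 0.0), (2.0, 2.0), (0.0, 2.0), (1.0, 0.5)]
print(estimate_jammer_position(border))
```

Only node coordinates enter the computation, which matches the abstract's point that no wireless propagation parameters are needed.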

A fast successive over-relaxation algorithm for force-directed network graph drawing

Science China (Information Sciences), 2012, Vol. 55, No. 3, pp. 677-688

Authors: WANG YongXian, WANG ZhengHua. National Key Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China

The force-directed approach is one of the most widely used methods in graph drawing research. There are two main problems with traditional force-directed algorithms. First, there is no mature theory to ensure the convergence of the iteration sequence used in the algorithm, and it is hard to estimate the rate of convergence even when convergence holds. Second, the running time increases intolerably when drawing large-scale graphs, so the advantages of the force-directed approach are limited in practice. This paper focuses on these problems and presents a sufficient condition for ensuring the convergence of the iterations. We then develop a practical heuristic algorithm for speeding up the iteration in the force-directed approach using a successive over-relaxation (SOR) strategy. The results of computational tests on several benchmark graph datasets widely used in graph drawing research show that our algorithm can dramatically improve the performance of the force-directed approach by decreasing both the number of iterations and the running time, and is 1.5 times faster than the latter on average.

Keywords: graph drawing; graph layout; successive over-relaxation; force-directed algorithm
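
To make the SOR idea in the abstract above concrete, here is a minimal sketch on a deliberately simple force model: a spring-only (barycentric) layout in which some nodes are pinned and every free node is pulled toward the average of its neighbours. Each sweep moves a node by `omega` times the plain Gauss-Seidel step. The model, the function name `sor_barycentric_layout`, and the default `omega=1.2` are assumptions chosen for illustration; the paper's actual force model and heuristic are not given in this listing.

```python
def sor_barycentric_layout(adj, fixed, omega=1.2, tol=1e-9, max_iter=10_000):
    """Spring-only layout solved by successive over-relaxation.

    adj   : dict node -> list of neighbour nodes (connected graph)
    fixed : dict node -> (x, y) for pinned nodes
    omega : relaxation factor; 1.0 is plain Gauss-Seidel, values between
            1 and 2 over-relax each step
    Returns (positions, number_of_sweeps).
    """
    pos = {v: fixed.get(v, (0.0, 0.0)) for v in adj}
    for sweep in range(1, max_iter + 1):
        max_move = 0.0
        for v in adj:
            if v in fixed:
                continue
            # Gauss-Seidel target: barycentre of the current neighbour positions.
            tx = sum(pos[u][0] for u in adj[v]) / len(adj[v])
            ty = sum(pos[u][1] for u in adj[v]) / len(adj[v])
            x, y = pos[v]
            # SOR update: step past (omega > 1) or short of (omega < 1) the target.
            nx, ny = x + omega * (tx - x), y + omega * (ty - y)
            max_move = max(max_move, abs(nx - x), abs(ny - y))
            pos[v] = (nx, ny)
        if max_move < tol:
            return pos, sweep
    return pos, max_iter


# A five-node path with both ends pinned; free nodes should spread evenly.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
fixed = {0: (0.0, 0.0), 4: (4.0, 0.0)}
positions, sweeps = sor_barycentric_layout(adj, fixed)
print(positions, sweeps)
```

For this linear model the iteration converges for any `omega` strictly between 0 and 2; the best value depends on the graph, which is why a practical method needs a heuristic for choosing it.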

Providing Virtual Cloud for Special Purposes on Demand in JointCloud Computing Environment

Journal of Computer Science & Technology, 2017, Vol. 32, No. 2, pp. 211-218

Authors: Dong-Gang Cao (Member, CCF, IEEE), Bo An, Pei-Chang Shi, Huai-Min Wang. Key Laboratory of High Confidence Software Technologies (Peking University), Ministry of Education, Beijing 100871, China; National Key Laboratory for Parallel and Distributed Processing, National University of Defense Technology, Changsha 410073, China

Cloud computing has been widely adopted by enterprises because of its on-demand and elastic resource usage paradigm. Currently most cloud applications run on a single cloud. However, more and more applications need to run across several clouds to satisfy requirements such as best cost efficiency, avoidance of vendor lock-in, and geolocation-sensitive service. JointCloud computing is a new research effort initiated by Chinese institutes to address the computing issues concerned with multiple clouds. In JointCloud, users' diverse and dynamic requirements on cloud resources are satisfied by providing users with a virtual cloud (VC) for special purposes. A virtual cloud for special purposes is in essence a user's specific cloud working environment with customized software stacks, configurations and computing resources readily available. This paper first introduces what JointCloud computing is and then describes the design rationales, motivating examples, mechanisms and enabling technologies of VC in JointCloud.

Keywords: cloud computing; JointCloud; virtual cloud (VC); cloud working environment

A multidimensional approach of evaluating developers

2nd International Conference on Big Data Engineering, BDE 2020

Authors: Zhang, Changqiang; Chen, Ming. Key Laboratory of Parallel and Distributed Computing, College of Computer, National University of Defense Technology, China

ISBN: (print) 9781450377225

In this paper, we propose an approach to assess the ability of developers based on their behavior data from OSS. Specifically, we classify developers' ability into code ability, project management ability, and social ability. Code efficiency is related to the developer's commit record and pull-request record. The developer's project management ability is obtained by tracking the developer's commit record. We use regular-expression matching to map commit behavior to project management behavior and calculate the developer's project management ability according to the proportion of different behaviors. The social ability of developers is related to the data produced when developers interact in the open-source community. We mined developer reviews on commits, issues, and gist fragments. By calculating the proportion of positive emotions in developer reviews and the proportion of developers interacting with others in the reviews, the social ability of developers is obtained. We collect behavioral data from 50 random developers. Twitter data is used to test the effect of different machine learning algorithms on the accuracy of developer comment polarity judgments. It is found that the combination of SVM, XGBoost and random forest has the highest prediction accuracy. Finally, we select 5 students to score the results on a Likert scale. The scores show that the results are basically in line with expectations. © 2020 ACM.

Keywords: decision trees

Communication Analysis for Multidimensional Parallel Training of Large-scale DNN Models

25th IEEE International Conferences on High Performance Computing and Communications, 9th International Conference on Data Science and Systems, 21st IEEE International Conference on Smart City and 9th IEEE International Conference on Dependability in Sensor, Cloud and Big Data Systems and Applications, HPCC/DSS/SmartCity/DependSys 2023

Authors: Lai, Zhiquan; Hao, Yanqi; Li, Shengwei; Li, Dongsheng. College of Computer, National University of Defense Technology, National Key Laboratory of Parallel and Distributed Computing, Changsha, China

ISBN: (print) 9798350330014

Multidimensional parallel training has been widely applied to train large-scale deep learning models like GPT-3. The efficiency of parameter communication among training devices/processes is often the performance bottleneck of large model training. Analysis of parameter communication mode and traffic has important reference significance for the research of interconnection network design and computing task scheduling to improve the training performance. In this paper, we analyze the parametric communication modes in typical 3D parallel training (data parallelism, pipeline parallelism, and tensor parallelism), and model the traffic in different communication modes. Finally, taking GPT-3 as an example, we present the communication in its 3D parallel training. © 2023 IEEE.

Keywords: deep neural networks
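
As one small example of the kind of traffic modelling the abstract above refers to, the sketch below computes per-device traffic for gradient synchronization in data parallelism, assuming a ring all-reduce, in which each of n devices sends 2(n-1)/n times the payload size. The choice of ring all-reduce and the function name are assumptions for illustration; the paper's own traffic models are not reproduced in this listing.

```python
def ring_allreduce_bytes_per_device(payload_bytes, num_devices):
    """Bytes each device sends in one ring all-reduce of payload_bytes.

    A ring all-reduce runs a reduce-scatter followed by an all-gather;
    each phase sends (n - 1) chunks of size payload / n per device.
    """
    if num_devices < 1:
        raise ValueError("num_devices must be at least 1")
    return 2 * (num_devices - 1) / num_devices * payload_bytes


# Hypothetical example: 1000 bytes of gradients across 4 data-parallel replicas.
print(ring_allreduce_bytes_per_device(1000, 4))
```

The per-device volume approaches twice the payload as the number of replicas grows, so under this assumption the gradient size, not the replica count, dominates data-parallel traffic.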
