Community detection is a vital task in many fields,such as social networks and financial analysis,to name a *** Louvain method,the main workhorse of community detection,is a popular heuristic *** apply it to large-sca...
详细信息
Community detection is a vital task in many fields,such as social networks and financial analysis,to name a *** Louvain method,the main workhorse of community detection,is a popular heuristic *** apply it to large-scale graph networks,researchers have proposed several parallel Louvain methods(PLMs),which suffer from two challenges:the latency in the information synchronization,and the community *** tackle these two challenges,we propose an isolate sets based parallel Louvain method(IPLM)and a fusion IPLM with the hashtables based Louvain method(FIPLM),which are based on a novel graph partition *** graph partition algorithm divides the graph network into subgraphs called isolate sets,in which the vertices are relatively decoupled from *** first describe the concepts and properties of the isolate *** we propose an algorithm to divide the graph network into isolate sets,which enjoys the same computation complexity as the breadth-first ***,we propose IPLM,which can efficiently calculate and update vertices information in parallel without latency or community ***,we achieve further acceleration by FIPLM,which maintains a high quality of community detection with a faster speedup than *** two methods are for shared-memory architecture,and we implement our methods on an 8-core PC;the experiments show that IPLM achieves a maximum speedup of 4.62x and outputs higher modularity(maximum 4.76%)than the serial Louvain method on 14 of 18 ***,FIPLM achieves a maximum speedup of 7.26x.
Broadcast authentication is a critical security service in wireless sensor networks. A protocol named μTESLA[1] has been proposed to provide efficient authentication service for such networks. However, when applied t...
详细信息
Broadcast authentication is a critical security service in wireless sensor networks. A protocol named μTESLA[1] has been proposed to provide efficient authentication service for such networks. However, when applied to applications such as time synchronization and fire alarm in which broadcast messages are sent infrequently, μTESLA encounters problems of wasted key resources and slow message verification. This paper presents a new protocol named GBA (Generalized broadcast authentication), for efficient broadcast authentication in these applications. GBA utilises the one-way key chain mechanism of μTESLA, but modifies the keys and time intervals association, and changes the key disclosure mechanism according to the message transmission model in these applications. The proposed technique can take full use of key resources, and shorten the message verification time to an acceptable level. The analysis and experiments show that GBA is more efficient and practical than μESLA in appli ations with various message transmission models.
On the 41st Top500 list announced in June 2013, the MilkyWay-2 system produced by National University of Defense technology (NUDT) in China won the first place with a LINPACK test result of 33.86 PFLOPS. It has been...
On the 41st Top500 list announced in June 2013, the MilkyWay-2 system produced by National University of Defense technology (NUDT) in China won the first place with a LINPACK test result of 33.86 PFLOPS. It has been one and a half year since its predecessor, MilkyWay-1 (TH-1), reached the same place for the first time. On the newest Top500 list published in November 2013, MilkyWay-2 continued to win the champion.
Reconfigurable computing tries to achieve the balance between high efficiency of custom computing and flexibility of general-purpose computing. This paper presents the implementation techniques in LEAP, a coarse-grain...
详细信息
Reconfigurable computing tries to achieve the balance between high efficiency of custom computing and flexibility of general-purpose computing. This paper presents the implementation techniques in LEAP, a coarse-grained reconfigurable array, and proposes a speculative execution mechanism for dynamic loop scheduling with the goal of one iteration per cycle and implementation techniques to support decoupling synchronization between the token generator and the collector. This paper also in- troduces the techniques of exploiting both data dependences of intra- and inter-iteration, with the help of two instructions for special data reuses in the loop-carried dependences. The experimental results show that the number of memory accesses reaches on average 3% of an RISC processor simulator with no memory optimization. In a practical image matching application, LEAP architecture achieves about 34 times of speedup in execution cycles, compared with general-purpose processors.
Many proposed P2P networks are based on traditional interconnection topologies. Given a static topology, the maintenance mechanism for node join/departure is critical to designing an efficient P2P network. Kautz graph...
详细信息
Many proposed P2P networks are based on traditional interconnection topologies. Given a static topology, the maintenance mechanism for node join/departure is critical to designing an efficient P2P network. Kautz graphs have many good properties such as constant degree, low congestion and optimal diameter. Due to the complexity in topology maintenance, however, to date there have been no effective P2P networks that are proposed based on Kautz graphs with base ~ 2. To address this problem, this paper presents the "distributed Kautz (D-Kautz) graphs", which adapt Kautz graphs to the characteristics of P2P networks. Using the D-Kautz graphs we further propose SKY, the first effective P2P network based on Kautz graphs with arbitrary base. The effectiveness of SKY is demonstrated through analysis and simulations.
On virtualization platforms, peak memory de- mand caused by hotspot applications often triggers page swapping in guest OS, causing performance degradation in- side and outside of this virtual machine (VM). Even thou...
详细信息
On virtualization platforms, peak memory de- mand caused by hotspot applications often triggers page swapping in guest OS, causing performance degradation in- side and outside of this virtual machine (VM). Even though host holds sufficient memory pages, guest OS is unable to utilize free pages in host directly due to the semantic gap between virtual machine monitor (MM) and guest operat- ing system (OS). Our work aims at utilizing the free memory scattered in multiple hosts in a virtualization environment to improve the performance of guest swapping in a transparent and implicit way. Based on the insightful analysis of behav- ioral characteristics of guest swapping, we design and im- plement a distributed and scalable framework HybridSwap. It dynamically constructs virtual swap pools using various policies, and builds up a synthetic swapping mechanism in a peer-to-peer way, which can adaptively choose different vir- tual swap pools. We implement the prototype of HybridSwap and evaluate it with some benchmarks in different scenar- ios. The evaluation results demonstrate that our solution has the ability to promote the guest swapping efficiency indeed and shows a double performance promotion in some cases. Even in the worst case, the system overhead brought by Hy- bridSwap is acceptable.
In recent years, multiagent reinforcement learning (MARL) has demonstrated considerable potential across diverse applications. However, in reinforcement learning environments characterized by sparse rewards, the scarc...
详细信息
In recent years, multiagent reinforcement learning (MARL) has demonstrated considerable potential across diverse applications. However, in reinforcement learning environments characterized by sparse rewards, the scarcity of reward signals may give rise to reward conflicts among agents. In these scenarios, each agent tends to compete to obtain limited rewards, deviating from collaborative efforts aimed at achieving collective team objectives. This not only amplifies the learning challenge but also imposes constraints on the overall learning performance of agents, ultimately compromising the attainment of team goals. To mitigate the conflicting competition for rewards among agents in MARL, we introduce the bidirectional influence and interaction (BDII) MARL framework. This innovative approach draws inspiration from the collaborative ethos observed in human social cooperation, specifically the concept of "sharing joys and sorrows." The fundamental concept behind BDII is to empower agents to share their individual rewards with collaborators, fostering a cooperative rather than competitive behavioral paradigm. This strategic shift aims to resolve the pervasive issue of reward conflicts among agents operating in sparse-reward environments. BDII incorporates two key factors—namely, the Gaussian kernel distance between agents (physical distance) and policy diversity among agents (logical distance). The two factor collectively contribute to the dynamic adjustment of reward allocation coefficients, culminating in the formation of reward distribution weights. The incorporation of these weights facilitates the equitable sharing of agents’ contributions to rewards, promoting a cooperative learning environment. Through extensive experimental evaluations, we substantiate the efficacy of BDII in addressing the challenge of reward conflicts in MARL. Our research findings affirm that BDII significantly mitigates reward conflicts, ensuring that agents consistently align with the origi
Performance and energy consumption of high performance computing (HPC) interconnection networks have a great significance in the whole supercomputer, and building up HPC interconnection network simulation plat- form...
详细信息
Performance and energy consumption of high performance computing (HPC) interconnection networks have a great significance in the whole supercomputer, and building up HPC interconnection network simulation plat- form is very important for the research on HPC software and hardware technologies. To effectively evaluate the per- formance and energy consumption of HPC interconnection networks, this article designs and implements a detailed and clock-driven HPC interconnection network simulation plat- form, called HPC-NetSim. HPC-NetSim uses application- driven workloads and inherits the characteristics of the de- tailed and flexible cycle-accurate network simulator. Besides, it offers a large set of configurable network parameters in terms of topology and routing, and supports router's on/off states. We compare the simulated execution time with the real execution time of Tianhe-2 subsystem and the mean error is only 2.7%. In addition, we simulate the network behaviors with different network structures and low-power modes. The results are also consistent with the theoretical analyses.
Recently correlation filter based trackers have attracted considerable attention for their high computational efficiency. However, they cannot handle occlusion and scale variation well enough. This paper aims at preve...
详细信息
Recently correlation filter based trackers have attracted considerable attention for their high computational efficiency. However, they cannot handle occlusion and scale variation well enough. This paper aims at preventing the tracker from failure in these two situations by integrating the depth information into a correlation filter based tracker. By using RGB-D data, we construct a depth context model to reveal the spatial correlation between the target and its surrounding regions. Furthermore, we adopt a region growing method to make our tracker robust to occlusion and scale variation. Additional optimizations such as a model updating scheme are applied to improve the performance for longer video sequences. Both qualitative and quantitative evaluations on challenging benchmark image sequences demonstrate that the proposed tracker performs favourably against state-of-the-art algorithms.
Adaptivity is the capacity of software to adjust itself to changes in its environment. A common approach to achieving adaptivity is to introduce dedicated code during software development stage. However,since those co...
详细信息
Adaptivity is the capacity of software to adjust itself to changes in its environment. A common approach to achieving adaptivity is to introduce dedicated code during software development stage. However,since those code fragments are designed a priori, self-adaptive software cannot handle situations adequately when the contextual changes go beyond those that are originally anticipated. In this case, the original builtin adaptivity should be tuned. For example, new code should be added to provide the capacity to sense the unexpected environment or to replace outdated adaptation decision logic. The technical challenges in this process, especially that of tuning software adaptivity at runtime, cannot be understated. In this paper,we propose an architecture-centric application framework for self-adaptive software named Auxo. Similar to existing work, our framework supports the development and running of self-adaptive software. Furthermore,our framework supports the tuning of software adaptivity without requiring the running self-adaptive software to be terminated. In short, the architecture style that we are introducing can encapsulate not only general functional logic but also the concerns in the self-adaptation loop(such as sensing, decision, and execution)as architecture elements. As a result, a third party, potentially the operator or an augmented software entity equipped with explicit domain knowledge, is able to dynamically and flexibly adjust the self-adaptation concerns through modifying the runtime software architecture. To truly exercise, validate, and evaluate our approach,we describe a self-adaptive application that was deployed on the framework, and conducted several experiments involving self-adaptation and the online tuning of software adaptivity.
暂无评论