Fault resilience has become a major issue for HPC systems, particularly in view of future E-scale systems, which will consist of millions of CPU cores and other components. MPI-level fault tolerant constru...
Dangling pointer errors are pervasive in C/C++ programs and are very hard to detect. This paper introduces an efficient detector for dangling pointer errors in C/C++ programs. By selectively leaving some memory accesses unmonitored, our method reduces the memory monitoring overhead and thus achieves better performance than previous methods. Experiments show that our method achieves an average speedup of 9% over the previous compiler-instrumentation-based method and more than 50% over the previous page-protection-based method.
Evidence- and protocol-based medicine decreases complexity and at the same time standardizes the healing process. Intervention descriptions are only moderately open to the public, and they differ more or less at every medical service provider. Patients are normally not very familiar with the steps of the intervention process. Patients have expressed a clear need to view the whole healing process through intervention plans, so that they can prepare themselves in advance for the coming medical interventions. Intervention plan tracking is a game changer for practitioners too: they can follow the clinical pathway of their patients and receive objective feedback from various sources about the impact of the services. Resource planning (with time, cost and other important parameters) and resource pre-allocation become feasible tasks in the healthcare sector. The evolution of consensus protocols developed by medical professionals and practitioners requires accurate measurement of the difference between plans and real-world scenarios. To support these comparisons we have developed the Intervention Process Analyzer and Explorer software solution. It enables practitioners and healthcare managers to review objectively the effectiveness of interventions targeted at health care professionals and aimed at improving the process of care and patient outcomes.
ISBN: 9781509029181 (print)
In this research, we apply Green's theory to convert a partial differential equation into a boundary integral equation for geometric transformation. Green's theory is designed specifically for integral equations. It is efficient in detecting the singularity points of the geometric transformation, which has been verified. Experimental results show that Green's theory performs well.
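The abstract does not state which equation or identity is used. As a hedged illustration only, and assuming "Green's theory" refers to Green's identities, the standard second identity relates a volume integral over a domain $\Omega$ to an integral over its boundary $\partial\Omega$, where $u$ and $v$ are twice-differentiable functions and $n$ is the outward unit normal:

```latex
\int_{\Omega} \left( u \, \nabla^{2} v - v \, \nabla^{2} u \right) dV
  = \oint_{\partial\Omega} \left( u \, \frac{\partial v}{\partial n}
      - v \, \frac{\partial u}{\partial n} \right) dS
```

Choosing $v$ as a fundamental solution of the differential operator is the usual way such an identity reduces a partial differential equation to an equation involving only boundary integrals.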
Natural graphs with skewed distributions pose unique challenges for graph computation and partitioning. Existing graph-parallel systems usually use a "one size fits all" design that uniformly processes all ve...
ISBN: 9781450332057 (print)
Graph-structured analytics has been widely adopted in a number of big data applications such as social computation, web search and recommendation systems. Though much prior research focuses on scaling graph analytics in distributed environments, the strong desire for performance per core, dollar and joule has generated considerable interest in processing large-scale graphs on a single server-class machine, which may have several terabytes of RAM and 80 or more cores. However, prior graph-analytics systems are largely neutral to NUMA characteristics and thus have suboptimal performance. This paper presents a detailed study of NUMA characteristics and their impact on the efficiency of graph analytics. Our study uncovers two insights: 1) either random or interleaved allocation of graph data will significantly hamper data locality and parallelism; 2) sequential inter-node (i.e., remote) memory accesses have much higher bandwidth than both intra- and inter-node random ones. Based on them, this paper describes Polymer, a NUMA-aware graph-analytics system on multicore with two key design decisions. First, Polymer differentially allocates and places topology data, application-defined data and mutable runtime states of a graph system according to their access patterns to minimize remote accesses. Second, for some remaining random accesses, Polymer carefully converts random remote accesses into sequential remote accesses by using lightweight replication of vertices across NUMA nodes. To improve load balance and vertex convergence, Polymer is further built with a hierarchical barrier to boost parallelism and locality, an edge-oriented balanced partitioning for skewed graphs, and adaptive data structures according to the proportion of active vertices. A detailed evaluation on an 80-core machine shows that Polymer often outperforms the state-of-the-art single-machine graph-analytics systems, including Ligra, X-Stream and Galois, for a set of popular real-world and synthetic grap
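The abstract names an edge-oriented balanced partitioning for skewed graphs but gives no algorithm. The following is a minimal sketch of the general idea only, not Polymer's implementation: it assumes vertices are numbered 0..n-1, that each partition is a contiguous vertex range, and that balance is measured by summed vertex degree. The function name `edge_balanced_ranges` and its parameters are hypothetical.

```python
def edge_balanced_ranges(degrees, num_parts):
    """Split vertices 0..n-1 into contiguous ranges with similar edge counts.

    Hypothetical sketch: `degrees[v]` is the number of edges of vertex v.
    For extremely skewed input the result may contain fewer than
    `num_parts` non-empty ranges.
    """
    total = sum(degrees)
    ranges = []
    start = 0
    running = 0
    for v, d in enumerate(degrees):
        running += d
        # Cumulative edge count at which the current range should end.
        target = total * (len(ranges) + 1) / num_parts
        if running >= target and len(ranges) < num_parts - 1:
            ranges.append((start, v + 1))
            start = v + 1
    # The last range takes whatever vertices remain.
    ranges.append((start, len(degrees)))
    return ranges


# Illustrative input: one high-degree vertex followed by low-degree ones.
degs = [8, 1, 1, 1, 1, 2, 2]
parts = edge_balanced_ranges(degs, 2)
edge_counts = [sum(degs[a:b]) for a, b in parts]
```

Splitting by vertex count would give the first partition far more edges than the second on this input; splitting by cumulative degree gives each partition the same amount of edge work.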
In this paper, we present the Tianhe-2 interconnect network and message passing services. We describe the architecture of the router and network interface chips, and highlight a set of hardware and software features that effectively support high-performance communication, ranging over remote direct memory access, collective optimization, hardware-enabled reliable end-to-end communication, user-level message passing services, etc. Measured hardware performance results are also presented.
ISBN: 9781450332057 (print)
Large-scale graph-structured computation usually exhibits an iterative and convergence-oriented nature, where input data is computed iteratively until a convergence condition is reached. Such features have led to the development of two different computation modes for graph-structured programs, namely synchronous (Sync) and asynchronous (Async) modes. Unfortunately, there is currently no in-depth study of their execution properties, and thus programmers have to choose a mode manually, which either requires a deep understanding of the underlying graph engines or results in suboptimal performance. This paper makes the first comprehensive characterization of the performance of the two modes on a set of typical graph-parallel applications. Our study shows that the performance of the two modes varies significantly with different graph algorithms, partitioning methods, execution stages, input graphs and cluster scales, and no single mode consistently outperforms the other. To this end, this paper proposes Hsync, a hybrid graph computation mode that adaptively switches a graph-parallel program between the two modes for optimal performance. Hsync constantly collects execution statistics on the fly and leverages a set of heuristics to predict future performance and determine when a mode switch could be profitable. We have built online sampling and offline profiling approaches combined with a set of heuristics to accurately predict future performance in the two modes. A prototype called PowerSwitch has been built based on PowerGraph, a state-of-the-art distributed graph-parallel system, to support adaptive execution of graph algorithms. On a 48-node EC2-like cluster, PowerSwitch consistently outperforms the best of both modes, with a speedup ranging from 9% to 73% due to timely switching between the two modes. Copyright 2015 ACM.
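The abstract does not describe Hsync's heuristics, so the following is only a sketch of what a "switch when profitable" decision could look like. Everything here is an assumption made for illustration: the function `choose_mode`, the idea of predicting a per-mode throughput, and the fixed `switch_cost` are hypothetical and are not taken from PowerSwitch or PowerGraph.

```python
def choose_mode(current, predicted_throughput, switch_cost, remaining_work):
    """Return the mode ("sync" or "async") expected to finish sooner.

    Hypothetical sketch: `predicted_throughput` maps each mode to a
    predicted rate (work units per second), `switch_cost` is the assumed
    one-off cost in seconds of changing modes, and `remaining_work` is
    the estimated work left.
    """
    other = "async" if current == "sync" else "sync"
    time_if_stay = remaining_work / predicted_throughput[current]
    time_if_switch = remaining_work / predicted_throughput[other] + switch_cost
    # Switch only when the predicted saving outweighs the switching cost.
    return other if time_if_switch < time_if_stay else current


# Illustrative numbers only.
rates = {"sync": 100.0, "async": 200.0}
late_stage = choose_mode("sync", rates, switch_cost=1.0, remaining_work=100)
early_stage = choose_mode("sync", rates, switch_cost=1.0, remaining_work=1000)
```

With much work remaining, the faster mode pays back the switching cost; near convergence it does not, so the sketch stays in the current mode.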
The wide adoption of smart devices has stimulated a fast shift of security-critical data from desktop to mobile devices. However, recurrent device theft and loss expose mobile devices to various security threats and e...
Software reuse is critical in open-source-based software development, but it is very difficult to find an excellent reusable candidate among the large number of similar software projects in communities. Currently, many research works evaluate software by analyzing artifacts created by software developers, but few of them reveal the power of feedback generated by software users, which we believe is very valuable for software ranking. In this paper, we connect open source software from different communities with user feedback on Stack Overflow, and explore the correlation between the popularity of posts and time. Finally, we rank open source software using information from the connected posts on Stack Overflow and compare our ranking with several influential rankings such as DB-Engines and personal blogs. The comparison shows that our approach can give rankings remarkably similar to those given by experienced professionals or commercial ranking systems.
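The abstract does not specify which post information is used or how it is combined. As a minimal sketch under the assumption that each connected post contributes a single numeric popularity value (for example a view count) and that projects are ranked by the sum, the idea could look as follows. The function `rank_by_feedback`, the project names and the numbers are hypothetical placeholders, not data from the paper.

```python
from collections import defaultdict


def rank_by_feedback(posts):
    """Rank software names by total popularity of their connected posts.

    Hypothetical sketch: `posts` is an iterable of
    (software_name, popularity) pairs. Ties are broken alphabetically.
    """
    totals = defaultdict(int)
    for name, popularity in posts:
        totals[name] += popularity
    return sorted(totals, key=lambda name: (-totals[name], name))


# Made-up placeholder posts for illustration.
posts = [
    ("project_a", 10),
    ("project_b", 25),
    ("project_a", 20),
    ("project_c", 5),
]
ranking = rank_by_feedback(posts)
```

A time-aware variant, in line with the correlation between post popularity and time that the abstract mentions, would weight each post by its age before summing; the abstract does not say how the authors handle this.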