According to Moore's law, the complexity of VLSI circuits has doubled approximately every two years, making simulation the major bottleneck in the circuit design process. Parallel and distributed sim...
Reputation systems provide a promising way to build trust relationships between users in distributed cooperation systems, such as file sharing, streaming, distributed computing and social networks, through which a user...
Encryption has become an important mechanism for securing data stored in outsourced databases. However, it is difficult to query encrypted data efficiently, and many researchers take it into conside...
Event-driven programming has been a relatively hot topic in distributed systems development. Having worked on these systems for years, we now believe that it is not the best choice. Besides the well-known "stack ...
With the wide application of multi-core processor architectures in high performance computing, fault tolerance for shared-memory parallel programs has become a research hot spot. For years, checkpointing ha...
The force-directed approach is one of the most widely used methods in graph drawing research. However, its running time grows intolerably as the graph size increases, which restricts the algorithm's practicality. With a GPU (graphics processing unit) computing platform, graph layout can be sped up at low cost, but existing GPU implementations mainly employ a "one-by-one" style to update the vertex coordinates in each iteration, which converges more slowly than the "batch" style commonly used in traditional CPU implementations. As a result, the aesthetic quality of the layout decreases when the total running time is restricted. It is hard to achieve both a high speedup of GPU over CPU and a high convergence rate in existing GPU implementations. To partially solve this problem, this paper presents two new strategies for implementing large-scale graph layout on a CPU+GPU heterogeneous platform to accelerate force-directed layout for graph drawing. The numerical results show that our GPU implementation dramatically improves the performance of force-directed layout: on an NVIDIA GeForce 9800 GT GPU at 1.44 GHz it is 20 times faster than the implementation on a single core of an Intel Pentium 4 PC at 3.0 GHz for graph layouts of moderate size (typically 1000 vertices).
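The abstract does not specify the force model or the two new strategies, so the following is only a minimal CPU sketch of one force-directed iteration, assuming Fruchterman-Reingold-style forces (repulsion k^2/d between all vertex pairs, attraction d^2/k along edges). The function name `layout_step` and the parameters `k` and `max_move` are hypothetical. All displacements are computed from the same snapshot of positions before any vertex moves.

```python
import math

def layout_step(pos, edges, k=1.0, max_move=0.1):
    """One iteration of a force-directed layout (illustrative sketch only).

    pos   -- list of (x, y) tuples, one per vertex
    edges -- list of (u, v) vertex-index pairs
    k     -- assumed ideal edge length
    """
    n = len(pos)
    disp = [[0.0, 0.0] for _ in range(n)]

    # Repulsion between every pair of vertices, magnitude k^2 / d.
    # This O(n^2) loop is the part that dominates the running time.
    for i in range(n):
        for j in range(i + 1, n):
            dx = pos[i][0] - pos[j][0]
            dy = pos[i][1] - pos[j][1]
            d = max(math.hypot(dx, dy), 1e-9)
            f = k * k / d
            disp[i][0] += dx / d * f
            disp[i][1] += dy / d * f
            disp[j][0] -= dx / d * f
            disp[j][1] -= dy / d * f

    # Attraction along each edge, magnitude d^2 / k.
    for u, v in edges:
        dx = pos[u][0] - pos[v][0]
        dy = pos[u][1] - pos[v][1]
        d = max(math.hypot(dx, dy), 1e-9)
        f = d * d / k
        disp[u][0] -= dx / d * f
        disp[u][1] -= dy / d * f
        disp[v][0] += dx / d * f
        disp[v][1] += dy / d * f

    # Move every vertex along its net force, capping the step length.
    new_pos = []
    for (x, y), (fx, fy) in zip(pos, disp):
        length = math.hypot(fx, fy)
        scale = min(length, max_move) / length if length > 0 else 0.0
        new_pos.append((x + fx * scale, y + fy * scale))
    return new_pos
```

The all-pairs repulsion loop is independent per vertex pair, which is why this kind of computation is a natural candidate for GPU acceleration.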
In this paper, we introduce a generic model to deal with the event matching problem of content-based publish/subscribe systems over structured P2P overlays. In this model, we claim that there are three methods (event...
ISBN: (print) 9781424477548; 9781424477555
Building distributed applications is difficult, mostly because of concurrency management. Existing approaches primarily include events and threads. Researchers and developers have been debating for decades over which is superior. Although the conclusion is far from obvious, this long debate clearly shows that neither of them is perfect. One of the problems is that they are both complex and error-prone. Both events and threads require programmers to manage concurrency explicitly, and we believe this is the source of the difficulties. In this paper, we propose a novel approach: automatic concurrency management by the runtime system. It dynamically analyzes programs to discover potential concurrency opportunities, and it dynamically schedules communication and computation tasks, resulting in automatic concurrent execution. This approach is inspired by the instruction scheduling technologies used in modern microprocessors, which dynamically exploit instruction-level parallelism. However, hardware scheduling algorithms do not fit software in many respects, so we have to design a new scheme completely from scratch. Automatic concurrency management is a runtime technique that requires no modification to the language, compiler or byte code, so it preserves backward compatibility. It is essentially a dynamic optimization for networking programs.
The efficiency of communication is a key factor in the performance of networking applications, and concurrent communication is an important way to improve it. However, many concurrency opportunities are very difficult to exploit because they depend on nondeterministic conditions. If these conditions are highly predictable, speculative execution can be a very effective way to cope with the uncertainty. Existing research on speculation seldom targets networking systems, and none of it can handle the event-driven model that is very popular in such systems. In this paper, we propose Nexus, a novel speculation scheme that supports event-driven networking applications. Nexus analyzes the dependence relationships among events, and performs speculation according to the duality of events and threads. Evaluation of a prototype implementation of Nexus shows that this approach can significantly reduce the time needed to complete an event-driven program.
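The abstract gives no details of how Nexus works internally, so the following is not Nexus. It is only a generic sketch of the speculative-execution idea the abstract describes: a second operation depends on the result of a first, the result is predicted, both are started concurrently, and the speculative result is kept only if the prediction turns out to be right. The function `speculate` and its parameters are hypothetical, and the sketch assumes the speculated operation has no side effects that would need to be undone.

```python
from concurrent.futures import ThreadPoolExecutor

def speculate(first, second, predict):
    """Run second(result_of_first) speculatively on a predicted result.

    first   -- callable with no arguments, e.g. a slow lookup
    second  -- callable taking the result of first()
    predict -- callable returning a guess for the result of first()
    Returns (value, hit), where hit tells whether the prediction held.
    Assumption: second() is side-effect free, so a wrong guess can be dropped.
    """
    guess = predict()
    with ThreadPoolExecutor(max_workers=2) as pool:
        actual_future = pool.submit(first)
        spec_future = pool.submit(second, guess)   # starts before first() is known
        actual = actual_future.result()
        if actual == guess:
            # Prediction was right: the dependent work is already under way.
            return spec_future.result(), True
    # Prediction was wrong: discard the speculative result and redo the work.
    return second(actual), False
```

For example, with `first = lambda: "server-a"` and `second = lambda host: "data from " + host`, a correct prediction of `"server-a"` lets both operations overlap, while a wrong prediction falls back to running them in sequence.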
In light of its powerful computing capacity and high energy efficiency, the GPU (graphics processing unit) has become a focus in the research field of HPC (High Performance Computing). CPU-GPU heterogeneous parallel systems have become a new development trend for supercomputers. However, the inherent unreliability of GPU hardware degrades the reliability of supercomputers. We have studied fault-tolerance (FT) techniques for CPU-GPU heterogeneous parallel systems and introduced a new checkpointing mechanism for such systems, namely hierarchical application-level checkpointing. The basic idea of this mechanism is to checkpoint at two independent levels, the CPU level and the GPU level, to tolerate CPU and GPU faults respectively. Based on this idea, we have also designed and implemented a hierarchical application-level checkpointing tool, "HiAL-Ckpt". Using this tool, programmers can insert two kinds of directives, CPU directives and GPU directives, into a program, and the compiler transforms the directives into CPU or GPU checkpointing code according to their kind. Through a case study of SWIM, a benchmark from the SPEC2000 benchmark suite, we have demonstrated the validity of the hierarchical application-level checkpointing technique. The experimental results show that the fault-tolerance time overhead of HiAL-Ckpt for SWIM is only 2.25% of the execution time of SWIM without any FT work.
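HiAL-Ckpt works through compiler directives whose syntax the abstract does not show, so the following is not the tool's interface. It is only a small simulation of the two-level idea under stated assumptions: a plain Python function `kernel` stands in for a GPU kernel, faults are injected from a hypothetical `faults` schedule, a GPU fault is handled by rerunning the kernel from its saved inputs, and a CPU fault rolls the host state back to the last CPU-level checkpoint.

```python
import copy

def kernel(data):
    """Stand-in for a GPU kernel (assumption): doubles every element."""
    return [2 * x for x in data]

def run(steps, faults, cpu_interval=2):
    """Simulate `steps` iterations with two independent checkpoint levels.

    faults -- dict mapping iteration number to "cpu" or "gpu"; each injected
              fault fires once (hypothetical fault model for illustration).
    """
    faults = dict(faults)                      # local copy, consumed as faults fire
    state = {"data": [1, 2], "iter": 0}        # host-side application state
    cpu_ckpt = copy.deepcopy(state)
    log = []
    while state["iter"] < steps:
        it = state["iter"]
        if it % cpu_interval == 0:
            cpu_ckpt = copy.deepcopy(state)    # CPU-level checkpoint
        gpu_ckpt = list(state["data"])         # GPU-level checkpoint: kernel inputs
        fault = faults.pop(it, None)
        if fault == "cpu":
            state = copy.deepcopy(cpu_ckpt)    # roll back the whole host state
            log.append(f"cpu-rollback@{it}->{state['iter']}")
            continue
        result = kernel(gpu_ckpt)
        if fault == "gpu":
            log.append(f"gpu-rerun@{it}")
            result = kernel(gpu_ckpt)          # redo only the kernel
        state["data"] = result
        state["iter"] = it + 1
    return state["data"], log
```

In this sketch a GPU fault costs one extra kernel run, while a CPU fault repeats every iteration since the last CPU-level checkpoint, which illustrates why keeping the two levels independent can limit recovery cost.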