&ACE is a high-performance parallel Prolog system developed at the Laboratory for Logic, Databases, and Advanced Programming that exploits and-parallelism in Prolog programs. &ACE was developed to exploit MIMD parallelism. However, SPMD parallelism also arises naturally in many Prolog programs. In this paper we develop runtime techniques that allow systems designed primarily to exploit MIMD parallelism (such as &ACE) to also exploit SPMD parallelism efficiently. These runtime techniques have been incorporated in the &ACE system. The performance of &ACE augmented with these techniques on programs containing SPMD parallelism is presented.
We present a general framework with which we can evaluate the flexibility and efficiency of various replay systems for parallel programs. In our approach, program monitoring is modeled by building a virtual dataflow program graph, referred to as a VDG, that includes all the instructions executed by the program. The behavior of program replay is modeled as the parallel interpretation of a VDG under two basic parallel execution models for dataflow program graphs: a data-driven model and a demand-driven model. Previous approaches to replaying parallel programs, known as Instant Replay and P-Sequence, are also modeled as variations of data-driven replay, i.e., the data-driven interpretation of a VDG. We show that demand-driven replay, i.e., the demand-driven interpretation of a VDG, is more flexible than data-driven replay since it allows better control of parallelism and more selective replay. We also show that we can implement a demand-driven replay that requires almost the same amount of data to be saved during program monitoring as data-driven replay does, and which eliminates any centralized bottleneck during program monitoring by optimizing the demand propagation and using an effective data structure.
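As a rough illustration of the distinction the paper draws, the Python sketch below evaluates a toy dataflow graph both ways: data-driven (every node fires as soon as its inputs are ready) and demand-driven (only nodes transitively demanded by the requested output are evaluated). The graph, node names, and interpreters are illustrative assumptions, not the paper's VDG machinery.

# Illustrative sketch only: a toy dataflow graph evaluated two ways, loosely
# mirroring the data-driven vs. demand-driven interpretations described above.
# The graph structure and node names are hypothetical examples.

# Each node: (operation, list of input node names)
graph = {
    "a": (lambda: 2, []),
    "b": (lambda: 3, []),
    "c": (lambda x, y: x + y, ["a", "b"]),   # c = a + b
    "d": (lambda x: x * 10, ["a"]),          # d = a * 10 (never demanded below)
    "e": (lambda x: x - 1, ["c"]),           # e = c - 1
}

def data_driven(graph):
    """Eager: evaluate every node as soon as all of its inputs are available."""
    values = {}
    pending = dict(graph)
    while pending:
        for name, (op, inputs) in list(pending.items()):
            if all(i in values for i in inputs):
                values[name] = op(*(values[i] for i in inputs))
                del pending[name]
    return values

def demand_driven(graph, wanted, values=None):
    """Lazy: evaluate only the nodes transitively demanded by `wanted`."""
    if values is None:
        values = {}
    if wanted not in values:
        op, inputs = graph[wanted]
        args = [demand_driven(graph, i, values) for i in inputs]
        values[wanted] = op(*args)
    return values[wanted]

print(data_driven(graph))          # evaluates all five nodes
print(demand_driven(graph, "e"))   # evaluates only a, b, c, e

The selective behavior of demand_driven is the property the paper exploits for selective replay: nodes that are never demanded (here, "d") are never re-executed.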
ISBN (print): 9781479980062
In recent years, the Hadoop Distributed File System (HDFS) has been deployed as the bedrock for many parallel big data processing systems, such as graph processing systems, MPI-based parallel programs, and Scala/Java-based Spark frameworks, which can efficiently support iterative and interactive data analysis in memory. The first part of my dissertation focuses on studying parallel data access on distributed file systems such as HDFS. Since the distributed I/O resources and global data distribution are often not taken into consideration, the data requests from parallel processes/executors will unfortunately be served in a remote or imbalanced fashion on the storage servers. To address these problems, we develop I/O middleware systems and matching-based algorithms that map parallel data requests to storage servers such that local and balanced data access can be achieved. The last part of my dissertation presents our plans to improve the performance of interactive data access in big data analysis. Specifically, most interactive analysis programs scan through the entire data set regardless of which data is actually required. We plan to develop a content-aware method to quickly access the required data without this laborious scanning process.
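A hedged sketch of the general idea of mapping requests to replica-holding servers: the greedy least-loaded choice below is only a stand-in for the dissertation's matching-based algorithm, and the block/server names are invented for illustration; this is not the HDFS API.

from collections import defaultdict

# block id -> servers holding a replica of that block (illustrative layout)
replicas = {
    "blk_1": ["s1", "s2"],
    "blk_2": ["s2", "s3"],
    "blk_3": ["s1", "s3"],
    "blk_4": ["s2", "s3"],
}

# one entry per parallel process/executor: the block it wants to read
requests = ["blk_1", "blk_2", "blk_3", "blk_4", "blk_1", "blk_2"]

def assign(requests, replicas):
    load = defaultdict(int)   # requests currently routed to each server
    plan = []
    for blk in requests:
        # among servers that can serve the block locally, pick the least loaded
        server = min(replicas[blk], key=lambda s: load[s])
        load[server] += 1
        plan.append((blk, server))
    return plan, dict(load)

plan, load = assign(requests, replicas)
print(plan)   # each request is served from a replica holder (local access)
print(load)   # per-server request counts stay roughly balanced

Every request is served by a server that actually holds the block, and ties are broken toward the least-loaded server, capturing the "local and balanced" goal stated above in miniature.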
This paper describes a new approach to the scheduling problem of assigning the tasks of a parallel program, described as a task graph, onto parallel machines. The approach handles interprocessor communication and heterogeneity, building on both previously developed theoretical results and a lookahead scheduling strategy. Experimental results on randomly generated task graphs demonstrate the effectiveness of this scheduling heuristic.
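The following Python sketch illustrates the general flavor of heterogeneous list scheduling with a one-step lookahead; the task graph, cost tables, and scoring rule are assumptions made for illustration and do not reproduce the paper's heuristic.

# task -> per-machine execution cost (heterogeneous machines)
cost = {"t1": {"m1": 4, "m2": 6},
        "t2": {"m1": 3, "m2": 2},
        "t3": {"m1": 5, "m2": 3}}
# (parent, child) -> communication cost, paid only across different machines
comm = {("t1", "t2"): 2, ("t1", "t3"): 4}
deps = {"t1": [], "t2": ["t1"], "t3": ["t1"]}
children = {"t1": ["t2", "t3"], "t2": [], "t3": []}

def finish_time(task, machine, placed, ready):
    """Earliest finish of `task` on `machine`, given the parents placed so far."""
    start = ready.get(machine, 0)
    for p in deps[task]:
        if p not in placed:        # unplaced parents are ignored in lookahead estimates
            continue
        p_machine, p_finish = placed[p]
        delay = 0 if p_machine == machine else comm.get((p, task), 0)
        start = max(start, p_finish + delay)
    return start + cost[task][machine]

def schedule(order):
    placed, ready = {}, {}         # task -> (machine, finish); machine -> free time
    for t in order:
        def score(m):
            f = finish_time(t, m, placed, ready)
            # one-step lookahead: add the best finish each child could still achieve
            look = 0
            for c in children[t]:
                trial = dict(placed); trial[t] = (m, f)
                look += min(finish_time(c, m2, trial, {**ready, m: f})
                            for m2 in cost[c])
            return f + look
        best = min(cost[t], key=score)
        placed[t] = (best, finish_time(t, best, placed, ready))
        ready[best] = placed[t][1]
    return placed

print(schedule(["t1", "t2", "t3"]))   # task -> (machine, finish time)

The lookahead term penalizes placements that finish a task early but strand its children behind expensive communication, which is the intuition behind lookahead scheduling on heterogeneous machines.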
The proceedings contain 35 papers. Topics discussed include databases, parallel processing systems, distributed computer systems, data processing, large scale systems, data transfer, data storage, storage allocation, information management, computer architectures, computer operating systems, and data structures.
ISBN (print): 078031915X
This paper proposes and analyzes a parallel implementation of the matrix product algorithm for the all-pairs shortest path problem on a distributed-memory MIMD model. The results of experiments conducted on a 128-processor hypercube machine show that the parallel implementation achieves the performance predicted by the analysis.
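For reference, the sequential min-plus (matrix product) formulation being parallelized looks roughly like the sketch below; the example graph is arbitrary, and a distributed-memory version would partition the matrix rows across processors before each product.

INF = float("inf")

def min_plus(a, b):
    """Min-plus product: c[i][j] = min over k of (a[i][k] + b[k][j])."""
    n = len(a)
    return [[min(a[i][k] + b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def all_pairs_shortest_paths(d):
    """Repeated squaring: after ceil(log2(n-1)) products, d holds all shortest path lengths."""
    n = len(d)
    power = 1
    while power < n - 1:
        d = min_plus(d, d)
        power *= 2
    return d

# adjacency matrix with edge weights (INF = no edge, 0 on the diagonal)
d = [[0,   3,   INF, 7],
     [8,   0,   2,   INF],
     [5,   INF, 0,   1],
     [2,   INF, INF, 0]]

for row in all_pairs_shortest_paths(d):
    print(row)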
Because a massively parallel computer processes vast amounts of data and generates many access requests from multiple processors simultaneously, parallel secondary storage requires large capacity and high concurrency. One effective way to implement such secondary storage is to use disk arrays, in which multiple disks are connected in parallel. In this paper, we propose a parallel file access method named DECODE (dynamic express changing of data entry), in which the load on each disk is balanced by dynamically determining the write data position. To resolve the data fragmentation caused by relocating data during writes, the concept of an "Equivalent Area" is introduced. We performed a preliminary performance evaluation using software simulation under various access conditions, varying the access pattern, access size, and stripe size, and confirmed the effectiveness of this method's load balancing.
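A minimal sketch of the underlying idea of choosing the write position dynamically by current disk load; the queue model, disk count, and names below are assumptions and do not reproduce DECODE or its Equivalent Area mechanism.

import heapq

NUM_DISKS = 4

# priority queue of (outstanding_bytes, disk_id): the least-loaded disk pops first
disk_load = [(0, d) for d in range(NUM_DISKS)]
heapq.heapify(disk_load)
placement = {}   # block id -> disk it was written to

def write_block(block_id, size):
    """Route the write to the disk with the least outstanding work right now."""
    load, disk = heapq.heappop(disk_load)
    placement[block_id] = disk
    heapq.heappush(disk_load, (load + size, disk))
    return disk

for i, size in enumerate([64, 8, 8, 64, 32, 8, 64, 32]):
    print(f"block {i} ({size} KB) -> disk {write_block(i, size)}")

The placement map stands in for the metadata a real system must keep once data is no longer written to a fixed position, which is exactly the fragmentation concern the Equivalent Area concept addresses in the paper.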
ISBN (print): 9780889866379
Current synchronization engines are mainly designed to reconcile data repositories between multiple clients and a central server on a star-like topology. A different approach is needed to achieve synchronization on peer-to-peer topologies, where any node can be both client and server and updates may happen independently. Version vectors are one solution to the problem, ensuring global convergence of the datasets and providing straightforward conflict detection, while letting applications control the conflict-resolution semantics in their specific domain. In this paper, an implementation of a synchronization engine for contact data on mobile devices using version vectors is presented. The engine is capable of optimistically synchronizing databases among many nodes in a peer-to-peer fashion.
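A small sketch of the version-vector comparison that underlies this kind of conflict detection; the record contents and node names are illustrative and unrelated to the engine's actual data model.

def compare(vv_a, vv_b):
    """Return 'equal', 'a_dominates', 'b_dominates', or 'conflict'."""
    nodes = set(vv_a) | set(vv_b)
    a_ahead = any(vv_a.get(n, 0) > vv_b.get(n, 0) for n in nodes)
    b_ahead = any(vv_b.get(n, 0) > vv_a.get(n, 0) for n in nodes)
    if a_ahead and b_ahead:
        return "conflict"          # concurrent updates: the application must resolve
    if a_ahead:
        return "a_dominates"       # replica A is strictly newer: propagate A to B
    if b_ahead:
        return "b_dominates"
    return "equal"

# two replicas of the same contact field, each updated on a different peer
phone_on_a = {"value": "555-0100", "vv": {"nodeA": 2, "nodeB": 1}}
phone_on_b = {"value": "555-0199", "vv": {"nodeA": 1, "nodeB": 2}}

print(compare(phone_on_a["vv"], phone_on_b["vv"]))   # -> "conflict"

When compare returns "conflict", both values are surfaced so the application can apply its domain-specific resolution, which is the division of responsibility the abstract describes.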
ISBN (print): 9781479976157
Due to the exponential growth of biological DNA sequence databases, several parallel gene prediction solutions on different high performance platforms have been proposed. Nevertheless, few exact parallel solutions to the spliced alignment problem applied to gene prediction in eukaryotic organisms have been proposed, and none of these solutions use GPUs as the target platform. In this paper, we present the development of two GPU accelerators for an exact solution to the spliced alignment problem applied to gene prediction. Our main contributions are: (a) the identification of two ways to exploit parallelism in the spliced alignment algorithm; (b) two GPU accelerators that achieve speedups of up to 52.62 and 90.86, respectively, compared to a sequential implementation, whose performance scales with input data size and outperforms related work; (c) a particular organization of the accelerators' data structures to optimize their efficiency; (d) a potential-parallelism analysis of the biological data set, measuring the amount of parallelism that would in fact be available to a parallel implementation; and (e) an accurate performance estimation model that enabled us to estimate the accelerators' performance before implementing them.
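One common way to expose parallelism in alignment-style dynamic programming is the anti-diagonal (wavefront) decomposition sketched below; the recurrence used here is plain global alignment, chosen only as a stand-in, and is not the spliced alignment recurrence or either of the paper's two accelerators.

def wavefront_alignment(s, t, gap=1, mismatch=1):
    """Global alignment cost, computed anti-diagonal by anti-diagonal."""
    n, m = len(s), len(t)
    D = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        D[i][0] = i * gap
    for j in range(m + 1):
        D[0][j] = j * gap
    # cells on the same anti-diagonal (i + j == d) have no mutual dependencies,
    # so a GPU kernel could compute all cells of a diagonal in parallel
    for d in range(2, n + m + 1):
        for i in range(max(1, d - m), min(n, d - 1) + 1):
            j = d - i
            sub = D[i - 1][j - 1] + (0 if s[i - 1] == t[j - 1] else mismatch)
            D[i][j] = min(sub, D[i - 1][j] + gap, D[i][j - 1] + gap)
    return D[n][m]

print(wavefront_alignment("GTCGACGCA", "GATTACA"))

Each diagonal depends only on the two previous diagonals, so the sequential outer loop over d bounds the span while the inner loop over i is embarrassingly parallel.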
ISBN (print): 9781538608623
The large software applications of today provide abstractions of the real-life systems that they support. A digital model of the system, and of the changes that occur within it, is maintained and updated as triggered by real-life events. Morphologically, such applications contain several distinct architectural entities: databases holding the state, central components describing how the system reacts to external events, and mechanisms through which the user can view the current state and issue new commands. Each of these entities may use distinct paradigms and employ different technologies. A production-ready software application thus ends up assembling a relatively tall technology stack and provides the final abstractions for both the problem and its solution. In this paper we propose a short-circuit for the long chain of technologies usually employed in large, production-ready software applications. The resulting architecture is a distributed, message-based system which behaves as a hybrid between a database and a runtime environment. The system operates with persistent and live entities that encapsulate both state and operations and are therefore easily assimilated to OOP classes.
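A minimal sketch, assuming an event-journal style of persistence, of what a "persistent and live entity" could look like: an object that holds state, reacts to messages, and can be rebuilt by replaying its journal. Class and method names are illustrative, not the proposed system's API.

import json

class Entity:
    def __init__(self, entity_id, journal=None):
        self.entity_id = entity_id
        self.state = {}
        self.journal = journal if journal is not None else []
        for msg in self.journal:             # rebuild live state from past messages
            self._apply(msg)

    def handle(self, message):
        """React to an external event: persist it, then update the live state."""
        self.journal.append(message)
        self._apply(message)

    def _apply(self, message):
        if message["type"] == "set":
            self.state[message["field"]] = message["value"]

    def snapshot(self):
        return json.dumps({"id": self.entity_id, "state": self.state})

# the same entity behaves as both a record (state) and an object (behavior)
order = Entity("order-42")
order.handle({"type": "set", "field": "status", "value": "shipped"})
print(order.snapshot())

# another node can reconstruct the live entity from the persisted journal
replica = Entity("order-42", journal=list(order.journal))
print(replica.state)

In this toy form, the journal plays the role of the database and the entity object plays the role of the runtime, hinting at the database/runtime hybrid the abstract describes.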