A dynamic task distribution and scheduling technique is presented for message middleware that delivers data in parallel across massive database systems. Computer cluster and parallel processing techniques are used to implement task definition and distribution, processor scheduling, and parallel data delivery for massive databases of differing structure. Using a database synchronization mechanism and a configuration technique, the middleware exchanges massive data in parallel between heterogeneous databases across large-scale and multi-layer networks.
In this paper we present a software tool for the simulation of distributed real-time embedded systems. Our tool is based on the popular NS-2 package for simulating the networking aspects, and on the RTSim package for the real-time operating system aspects. By reusing much of the existing code, our simulator covers a very wide range of network protocols and real-time mechanisms. After describing the architecture of our tool, we test it in a simple wireless sensor network scenario and measure the latency in transmitting and receiving messages caused by the concurrent activities in the nodes. These effects are evaluated under two node scheduling policies and under different CPU load conditions in the nodes.
We propose a generic algorithmic model called STAMP (synchronous, transactional, and asynchronous multi-processing) as a universal performance and power complexity model for multithreaded algorithms and systems. We provide examples to illustrate how to design and analyze algorithms using STAMP and how to apply the complexity estimates to better utilize CMP (chip multiprocessor)-based machines within given constraints such as power.
Compositional performance analysis iteratively alternates local scheduling analysis techniques and output event model propagation between system components to enable performance analysis of heterogeneous distributed systems. In spite of its high scalability and adaptability, the compositional approach may suffer from overestimated results compared with other system performance verification techniques. The main reason is an incomplete consideration of event sequence correlations. In this paper we present a new technique that improves the output jitter calculation by correlating jitter and response times and offers significantly tighter analysis bounds.
OpenMP has emerged as an important model and language extension for shared-memory parallel programming. On shared-memory platforms, OpenMP offers an intuitive, incremental approach to parallel programming. In this paper, we present techniques that extend the ease of shared-memory parallel programming in OpenMP to distributed-memory platforms as well. First, we describe a combined compile-time/runtime system that uses an underlying software distributed shared memory system and exploits repetitive data access behavior in both regular and irregular program sections. We present a compiler algorithm to detect such repetitive data references and an API to an underlying software distributed shared memory system to orchestrate the learning and proactive reuse of communication patterns. Second, we introduce a direct translation of standard OpenMP into MPI message-passing programs for execution on distributed memory systems. We present key concepts and describe techniques to analyze and efficiently handle both regular and irregular accesses to shared data. Finally, we evaluate the performance achieved by our approaches on representative OpenMP applications.
We present the first experimental results on the implementation of a multi-core model checking algorithm for the SPIN model checker. These algorithms specifically target shared-memory systems, and are initially restricted to dual-core systems. The extensions we have made require only small changes in the SPIN source code, and preserve virtually all existing verification modes and optimization techniques supported by SPIN, including the verification of both safety and liveness properties and the verification of SPIN models with embedded C code fragments.
Association rule mining is one of the most important techniques in data mining. It extracts significant patterns from transaction databases and generates rules used in many decision support applications. Many organizations, such as industrial, commercial, or even scientific sites, may produce large numbers of transactions and attributes. Mining effective rules from such large volumes of data requires much time and computing resources. In this paper, we propose a parallel FP-growth association rule mining algorithm for rapid extraction of frequent itemsets from large dense databases. We also show that this algorithm can be parallelized efficiently in a cluster computing environment. The preliminary experiments provide quite promising results, with nearly ideal scaling on small clusters and about half of ideal (15-fold speedup) on a thirty-two-processor cluster.
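To make the term "frequent itemset" concrete, the following is a minimal brute-force sketch of support counting. It is not the parallel algorithm proposed in the abstract; the function name `frequent_itemsets`, the `max_size` cap, and the sample transactions are assumptions introduced purely for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_itemsets(transactions, min_support, max_size=2):
    """Count itemsets of up to max_size items and keep the frequent ones.

    Brute-force illustration only: every candidate itemset in every
    transaction is enumerated, which a tree-based miner avoids.
    """
    counts = Counter()
    for transaction in transactions:
        items = sorted(set(transaction))  # deduplicate, fix a canonical order
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    # An itemset is frequent when it occurs in at least min_support transactions.
    return {itemset: n for itemset, n in counts.items() if n >= min_support}


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the paper.
    sample = [["a", "b"], ["a", "c"], ["a", "b", "c"]]
    print(frequent_itemsets(sample, min_support=2))
```

The exhaustive enumeration grows combinatorially with transaction width, which is the cost that motivates more compact mining structures and parallel execution on dense databases.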
Targeted optimization of program segments can provide an additional program speedup over the highest default optimization level, such as -O3 in GCC. The key challenge is how to automatically search a given code for performance-sensitive program segments to which a customized set of compiler optimization options could be applied. In this paper we propose a method for automatic detection of performance-sensitive program segments based on program segment similarity. First we create a proxy segment template database trained over a set of random input programs. The compiler identifies program segments by correlating them with the pre-built proxy segment templates using similarity in syntax structure and architecture-dependent behavior. We argue that the identified program segments can be custom-optimized to improve the overall program performance. The method is evaluated on the Intel XScale PXA255 platform using randomly selected benchmarks. The experimental results show that our method can provide additional speedups over the highest optimization level in GCC 3.3 (-O3) for an arbitrary set of applications.
ISBN: (print) 9780889866560
This paper revisits the multi-point range query (MPRQ) for 2-D spatial databases. In a previous paper, we introduced an efficient algorithm, PRQ, to answer the query for the case where the database resides in main memory. This paper extends the algorithm to the general case in which the database is large and has to reside on disk. The MPRQ is defined as follows: given a set of query points, P = {p1, p2, ..., pn}, and a search distance d, report all points in the spatial database that are within a distance d of some point pi in P. The simple method of performing a Repeated Range Query (RRQ), i.e. running the standard range query for each query point pi (1 ≤ i ≤ n) and combining the results, is inefficient because it involves multiple searches on the database. We show that PRQ-Disk still achieves better results and outperforms RRQ-Disk, as in the main-memory case. Extensive experiments using various real-life datasets, different R-tree variants (including bulk-loaded ones), different query paths P, search distances d, and LRU buffering show that PRQ-Disk outperforms RRQ-Disk in terms of both query time and I/Os.
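The MPRQ definition and the RRQ baseline described above can be sketched in a few lines. This is only an in-memory illustration under stated assumptions: a linear scan stands in for the index-based range query, Euclidean distance is assumed, and the function names `mprq_naive` and `rrq` are hypothetical. It is not the PRQ or PRQ-Disk algorithm.

```python
import math


def mprq_naive(points, query_points, d):
    """Report every database point within distance d of at least one query point."""
    return [pt for pt in points
            if any(math.dist(pt, q) <= d for q in query_points)]


def rrq(points, query_points, d):
    """Repeated Range Query: one range query per query point, results merged.

    The inner loop is a linear scan standing in for a range search on a
    spatial index; the database is searched once per query point.
    """
    seen = set()
    merged = []
    for q in query_points:
        for pt in points:
            if math.dist(pt, q) <= d and pt not in seen:
                seen.add(pt)       # drop duplicates from overlapping ranges
                merged.append(pt)
    return merged


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the paper.
    db = [(0, 0), (1, 1), (5, 5), (10, 10)]
    path = [(0, 0), (5, 4)]
    print(mprq_naive(db, path, 1.5))
    print(rrq(db, path, 1.5))
```

Both functions return the same set of points; the difference is that `rrq` searches the database once per query point and must deduplicate overlapping results, which is the repeated work the abstract identifies as inefficient.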
ISBN: (print) 9780769529172; 0769529178
Process replication is used to provide highly available and fault-tolerant systems. Traditionally, for simplicity, such systems have assumed the crash-stop failure model. This paper instead encourages the use of the crash-recovery with partial amnesia failure model when managing large amounts of state, presents the problems that arise from this assumption, and outlines how they can be managed. Finally, an overhead analysis is presented.