ISBN: 0769521975 (print)
This paper presents the StreamGen load generator, which targets distributed information flow applications. These include the event streaming services used in wide-area publish/subscribe systems or in operational information systems, the data streaming services used in remote visualization or collaboration, and the continuous data streams occurring in download services. Running across heterogeneous distributed platforms, these services are implemented by computational components that capture, manipulate, and produce information streams and are linked via overlay topologies. StreamGen can be used to produce the distributed computational and communication loads imposed by these applications. Dynamic application behaviors can be created with mathematical specifications or with traces collected at the application level. An interesting set of traces presented in this paper is derived from long-term observations of the FTP download patterns at the Linux mirror site run by the CERCS research center at the Georgia Institute of Technology. Two different flow-based applications are created and evaluated with StreamGen. The first emulates the data streaming behavior in a distributed scientific collaboration, where a scientific simulation (i.e., a molecular dynamics code) produces simulation data that is sent to and displayed for multiple interactive remote users. The second emulates portions of the event-streaming behavior of an operational information system used by a large U.S. corporation. Parametric studies with StreamGen's FTP traces applied to these applications are used to evaluate different load balancing strategies for the cluster machines manipulating these applications' data streams.
ISBN: 1601320841 (print)
Many high-dimensional index structures have been proposed, but they suffer from the so-called 'dimensional curse' problem, i.e., retrieval performance degrades increasingly as the dimensionality grows. To solve this problem, the cell-based filtering (CBF) scheme has been proposed, but it shows a linear decrease in performance as the dimensionality increases. In this paper, we propose a parallel CBF scheme for indexing high-dimensional vector data, so as to cope with the linear decrease in retrieval performance. In addition, we devise data insertion, range query, and k-NN query processing algorithms that are suitable for a parallel architecture. Finally, we show that our parallel CBF scheme achieves good retrieval performance in proportion to the number of servers in the parallel architecture and that it outperforms a parallel version of the VA-File when the dimensionality is over 10.
Traditional software distributed shared memory (SDSM) systems modify the semantics of a real hardware shared memory system by relaxing the coherence semantics and by limiting the memory regions that are actually shared. These semantic modifications are made to improve the performance of the applications using them. In this paper, we show that an SDSM system that behaves like a real shared memory system (without the aforementioned relaxations) can also be used to execute OpenMP applications and achieve speedups similar to those obtained by traditional SDSM systems. This performance can be achieved by encouraging cooperation between the SDSM and the OpenMP runtime instead of relaxing the semantics of the shared memory. In addition, techniques such as boundary alignment and page presend are shown to be very useful for overcoming the limitations of current SDSM systems. (c) 2005 Elsevier Inc. All rights reserved.
ISBN: 1892512459 (print)
Many real-time control systems in industry are designed today for single-processor architectures. At the same time, more functionality needs to be integrated into the software system. To enable correct and timely execution of the control and protection applications, designers may need to optimize application code aggressively. Unwanted simplifications of algorithms or low sampling frequencies of the environment may be the result. Functionality in a system that already has a degree of concurrency may enable the system to scale onto a multiprocessor environment. This paper discusses and presents results from a study that separates a substation automation real-time I/O communication system from application-level threads in order to exploit existing concurrency. Within the system model described here, as well as in many other system models, it is possible to execute communication mechanisms and applications in parallel. The motivation for this work is to let parallel execution of the I/O system and the application enable higher performance for application functionality. The result is more flexibility for the application designers. By describing a model of the real-time substation automation I/O system and extending that model with a mechanism to enable execution on a multiprocessor architecture, we contribute to the understanding of both the composition and the performance issues concerning parallel execution in such industrial systems. Measurements and results originate from execution in an existing system and from the multiprocessor system created.
ISBN: 9781538637906 (print)
Skyline queries help users handle the huge amount of available data by finding a set of interesting points. As dataset sizes are constantly increasing and skyline queries are computationally expensive, it is critical to compute such queries by utilizing parallelism. Existing works deal exclusively with totally ordered attribute domains. In this paper, we present a framework, named PSLP, for parallel skyline evaluation over data with both totally and partially ordered domains. We introduce a new partial-to-order mapping scheme that guarantees the correctness of the mapping by preserving incomparability and preference at low mapping cost. We also propose a novel logical partitioning for parallel processing in which the data space is partitioned according to incomparability and preference relationships by using a pivot point. The logical partitioning can prune partitions that contain no skyline point during the partitioning step. An extensive performance evaluation confirms the efficiency and effectiveness of the proposed approach.
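To make the notion of a skyline concrete, the following is a minimal sequential sketch for totally ordered attributes only, assuming that smaller values are preferred in every dimension. It is not the PSLP framework: it neither handles partially ordered domains nor partitions the data, and the names `dominates` and `skyline` are illustrative.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere.

    Assumption: smaller values are preferred in every attribute.
    """
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def skyline(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates (simple nested-loop scan)."""
    result: List[Point] = []
    for p in points:
        # Skip p if a point already kept dominates it.
        if any(dominates(q, p) for q in result):
            continue
        # Drop previously kept points that p dominates, then keep p.
        result = [q for q in result if not dominates(p, q)]
        result.append(p)
    return result


if __name__ == "__main__":
    # Hypothetical (price, distance) pairs; lower is better in both.
    hotels = [(50, 8), (60, 3), (40, 9), (70, 2), (65, 5)]
    print(skyline(hotels))
```

In this example, `(65, 5)` is dropped because `(60, 3)` is better in both attributes, while the remaining four points are mutually incomparable. The pairwise dominance checks are what make the query expensive on large datasets, which is the motivation for parallel evaluation.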
Given an n × n binary image of white and black pixels, we present an optimal parallel algorithm for computing the distance transform and the nearest feature transform using the Euclidean metric. The algorithm employs systolic computation to achieve O(n) running time on a linear array of n processors.
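As a reference for what the two transforms compute, here is a brute-force sequential sketch. It is not the systolic algorithm from the abstract and makes no attempt at efficiency. It assumes that a value of 1 marks a feature pixel, which is a convention chosen here for illustration, and the function name `distance_transform` is likewise hypothetical.

```python
import math
from typing import List, Optional, Tuple

Pixel = Tuple[int, int]


def distance_transform(
    image: List[List[int]],
) -> Tuple[List[List[float]], List[List[Optional[Pixel]]]]:
    """Brute-force Euclidean distance and nearest feature transforms.

    Assumption: image[r][c] == 1 marks a feature pixel, 0 a background pixel.
    Returns (dist, nearest), where dist[r][c] is the distance from (r, c) to
    the closest feature pixel and nearest[r][c] is that pixel's coordinates.
    """
    n = len(image)
    features = [(r, c) for r in range(n) for c in range(n) if image[r][c] == 1]
    dist = [[math.inf] * n for _ in range(n)]
    nearest: List[List[Optional[Pixel]]] = [[None] * n for _ in range(n)]
    for r in range(n):
        for c in range(n):
            # Compare against every feature pixel and keep the closest one.
            for fr, fc in features:
                d = math.hypot(r - fr, c - fc)
                if d < dist[r][c]:
                    dist[r][c] = d
                    nearest[r][c] = (fr, fc)
    return dist, nearest


if __name__ == "__main__":
    img = [[0, 0, 0],
           [0, 1, 0],
           [0, 0, 0]]
    dist, nearest = distance_transform(img)
    print(dist[0][0], nearest[0][0])
```

This version compares every pixel with every feature pixel, so its cost grows much faster than the O(n) time on n processors reported in the abstract. It is useful mainly as a correctness oracle for faster implementations.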
ISBN: 1601320841 (print)
This article presents an approach for converting a sequential program to a distributed program. An Architecture Description Language (ADL) is used as an interface model between the sequential code and the distributed code. The implementation is derived directly from this description, which leads to a better understanding of the system. First, the information required to create the ADL description is extracted from the sequential code. Then, the ADL description is produced from this information, and the implementation framework is established according to it. Features of this environment include a behavioral description of each component in terms of its communication protocol with other components, and a procedure for implementing asynchronous communication between components in the architecture.
ISBN: 9781538637906 (print)
Parallel software systems are now common and practical. However, parallel software is difficult to test because its state space is very large. Therefore, a parallel model simplification method based on CPN (Colored Petri Net) is proposed. Based on the original CPN, a CPN model for the tested behavior (Tested Behavior of CPN, TBoCPN) is proposed. The target of the test is described as the tested behavior, and the relevant behavior is described as the behavior related to the tested behavior; the homogeneous concurrent branch groups and the selection branch sets are then identified. Finally, the branches of the concurrent branch groups and the selection branch sets that satisfy the condition of the algorithm are processed sequentially by means of inhibitor arcs. The experiment shows that the reduction rate is at least 60% and that the full-coverage test paths generated for the tested behavior are unaffected by the reduction, which indicates that the method is effective.
ISBN: 9781728116440 (print)
Software applications for biological network analysis rely on graphs to model the structure of interactions. Many of them require searching for subgraphs in a target graph or in collections of graphs. Even though very efficient algorithms have been defined to solve this subgraph isomorphism problem, the complexity of current real biological networks makes their sequential execution time prohibitive. On the other hand, parallel architectures, from multi-core to many-core, have become pervasive for dealing with the problem of data size. Nevertheless, the sequential nature of graph searching algorithms makes their implementation on parallel architectures very challenging. This paper presents three different parallel solutions for the graph searching problem. The first two target exact search on multi-core CPUs and many-core GPUs, respectively. The third targets approximate search on GPUs and handles node, edge, and node label mismatches. The paper shows how different techniques have been developed in all the solutions to reduce the search space complexity, and it reports the performance of the proposed solutions on representative biological networks containing antiviral chemical compounds and protein interaction networks.
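The sequential nature of exact subgraph search can be seen in a small backtracking sketch. This is a generic textbook-style search, not any of the three parallel solutions in the paper. It assumes undirected, unlabeled graphs stored as adjacency sets and looks for a non-induced match (extra edges in the target are allowed). The name `find_subgraph` is illustrative.

```python
from typing import Dict, Optional, Set

Graph = Dict[int, Set[int]]  # undirected graph as adjacency sets


def find_subgraph(pattern: Graph, target: Graph) -> Optional[Dict[int, int]]:
    """Return one mapping pattern-node -> target-node, or None if none exists.

    Every pattern edge must map to a target edge, and the mapping is injective.
    """
    # Match high-degree pattern nodes first to fail early.
    order = sorted(pattern, key=lambda v: -len(pattern[v]))
    mapping: Dict[int, int] = {}
    used: Set[int] = set()

    def extend(i: int) -> bool:
        if i == len(order):
            return True
        u = order[i]
        for cand in target:
            # Prune: candidate already taken or has too few neighbours.
            if cand in used or len(target[cand]) < len(pattern[u]):
                continue
            # Already-mapped neighbours of u must be adjacent to cand.
            if all(mapping[w] in target[cand] for w in pattern[u] if w in mapping):
                mapping[u] = cand
                used.add(cand)
                if extend(i + 1):
                    return True
                # Undo the choice and try the next candidate.
                del mapping[u]
                used.discard(cand)
        return False

    return dict(mapping) if extend(0) else None


if __name__ == "__main__":
    triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    target = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3}, 3: {2}}
    print(find_subgraph(triangle, target))
```

Each recursive step depends on the partial mapping built by the previous steps, which is why the search does not split naturally into independent pieces for parallel hardware.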
ISBN: 1892512459 (print)
The real-time scheduling schemes proposed for RT CORBA are mostly priority-based, soft real-time scheduling schemes. The problem with previous schemes is that priority assignment and request allocation are treated as two separate procedures. In the worst case, tasks with imminent deadlines can be allocated to the same server, and successive deadline violations can occur. In real-time systems generally, meeting deadlines matters more than task throughput. Therefore, a modified scheduling algorithm is required that takes the priority distribution into account when allocating a request. Our scheduling scheme, Priority-based RR, tries to distribute task priorities evenly across local servers by controlling the Round-Robin scheduling order according to task urgency. Simulation shows that Priority-based RR distribution is cost-effective when the system load is not too high.
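The worst case described above, and one way an urgency-aware order might avoid it, can be illustrated with a toy sketch. This is not the authors' Priority-based RR algorithm, whose details the abstract does not give. It assumes a batch of requests known in advance, uses the deadline as the only urgency measure, and both function names are hypothetical.

```python
from typing import List


def plain_round_robin(deadlines: List[int], n_servers: int) -> List[List[int]]:
    """Assign requests to servers in arrival order, ignoring urgency."""
    queues: List[List[int]] = [[] for _ in range(n_servers)]
    for i, deadline in enumerate(deadlines):
        queues[i % n_servers].append(deadline)
    return queues


def urgency_aware_round_robin(deadlines: List[int], n_servers: int) -> List[List[int]]:
    """Assign requests round-robin in order of urgency (earliest deadline first).

    Assumption: dealing out the sorted batch spreads the most urgent
    requests across different servers instead of stacking them on one.
    """
    queues: List[List[int]] = [[] for _ in range(n_servers)]
    for i, deadline in enumerate(sorted(deadlines)):
        queues[i % n_servers].append(deadline)
    return queues


if __name__ == "__main__":
    arrivals = [1, 50, 2, 60]  # hypothetical deadlines in arrival order
    print(plain_round_robin(arrivals, 2))
    print(urgency_aware_round_robin(arrivals, 2))
```

With these arrivals, plain round-robin places both urgent requests (deadlines 1 and 2) on the first server, while the urgency-aware order gives each server one urgent and one relaxed request.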