ISBN: 0780320182 (print)
The shared memory paradigm offers a well-known programming model for parallel systems. However, conventional implementations perform poorly in large-grain or page-based systems. The main problems are (1) the transparent view at the system level, (2) false sharing, caused by placing several consistency units in the same transportation unit, and (3) high-level software implementations that are not integrated into the system architecture. The first problem is addressed by annotating programming objects and deriving a specific configuration of system functionality. The second is solved by Game, the General and Autonomous Merging Environment, which allows a multiple-reader, multiple-writer approach. The third is addressed by three implementation models of Game. Both a hardware-based implementation and even a software-based implementation are able to hide the cost of the local activities needed to perform Game behind the network latency.
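The abstract does not describe how Game merges concurrent writes, so the following is only a minimal sketch of one common way a multiple-writer scheme can sidestep false sharing: each writer compares its modified copy of a page against an unmodified "twin" and ships only the changed bytes, which are then merged. The names `make_diff` and `merge` are hypothetical and are not taken from the paper.

```python
def make_diff(twin, modified):
    """Return {offset: new_byte} for every byte that differs from the twin."""
    return {i: b for i, (a, b) in enumerate(zip(twin, modified)) if a != b}


def merge(page, diffs):
    """Apply each writer's diff to a copy of the original page."""
    merged = bytearray(page)
    for diff in diffs:
        for offset, value in diff.items():
            merged[offset] = value
    return bytes(merged)


# Two writers update disjoint halves of the same 8-byte page (false sharing:
# they share a transport unit but not a consistency unit).
page = bytes(8)
writer_a = bytearray(page)
writer_a[0:4] = b"AAAA"
writer_b = bytearray(page)
writer_b[4:8] = b"BBBB"

merged = merge(page, [make_diff(page, writer_a), make_diff(page, writer_b)])
```

Because the two diffs touch disjoint offsets, the merge order does not matter here; writes to the same offset would need a consistency rule that this sketch leaves out.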
A one-shot semi-join reduction approach was recently proposed to execute all semi-joins on the same relation simultaneously, so that the relation needs to be scanned only once. The approach was applied to reducing distributed query response time under the assumption that one copy of each referenced relation has been chosen before an execution plan is produced. The estimates of both semi-join reduction effect and local join cost employed in previous work were restricted to a special case. In this paper, we extend the previous work in three ways: 1) we remove the requirement for copy selection before the production of a semi-join reduction program, 2) we allow the choice of redundant copies for the execution of semi-joins, and 3) we employ a general cost model that covers a large class of possible estimates of semi-join reduction effect and local join cost. We then present an algorithm that produces an optimal parallel one-shot semi-join reduction program for minimizing response time, addressing the above three aspects.
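To illustrate the one-shot idea only (not the paper's optimization algorithm or cost model), the sketch below applies several semi-join filters to a relation during a single scan. The function name, the row representation as dictionaries, and the sample data are assumptions made for this example.

```python
def one_shot_semijoin(relation, filters):
    """Scan `relation` once, keeping rows that pass every semi-join filter.

    relation: list of dict rows.
    filters:  list of (attribute, set_of_join_values) pairs, one per semi-join.
    """
    reduced = []
    for row in relation:  # the single scan of the relation
        if all(row[attr] in values for attr, values in filters):
            reduced.append(row)
    return reduced


# Hypothetical relation and the join-attribute values shipped from two others.
orders = [
    {"order_id": 1, "cust": "a", "part": "p1"},
    {"order_id": 2, "cust": "b", "part": "p2"},
    {"order_id": 3, "cust": "a", "part": "p3"},
]
cust_keys = {"a"}
part_keys = {"p1", "p2"}

reduced = one_shot_semijoin(orders, [("cust", cust_keys), ("part", part_keys)])
```

Applying the two semi-joins one after another would scan `orders` twice; here both membership tests run inside the same pass.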
ISBN: 0819418455 (print)
Compact, portable systems capable of quickly identifying contaminants in the field are of great importance when monitoring the environment. In this paper, we examine the effectiveness of using artificial neural networks for real-time data analysis of a sensor array. Analyzing the sensor data in parallel may allow for rapid identification of contaminants in the field without requiring highly selective individual sensors. We use a prototype sensor array which consists of nine tin-oxide Taguchi-type sensors, a temperature sensor, and a humidity sensor. We illustrate that by using neural network based analysis of the sensor data, the selectivity of the sensor array may be significantly improved, especially when some (or all) of the sensors are not highly selective.
This paper presents a performance analysis of median filtering on a distributed multiprocessor system. The results give a good indication of the performance gain of a multiprocessor over a uniprocessor for median filtering. This gain is proportional to the problem size, as shown by varying the size of the image. The analysis also makes clear that computation time and inter-processor communication scale well with the number of processors in the system. Overall system performance does not scale the same way, however, because the initialization overhead dominates the computation time once the number of processors increases beyond a certain point. Because of this relationship, optimal performance is achieved with a particular number of processors, and this number varies with the problem size. In addition, the subimage model is found to be an acceptable approach for this type of processing, as only the necessary parts of the image are sent to the other processors. The master and slave scheme proves easy to use for programming, control and data manipulation. As a whole, this type of non-linear processing seems to fit well into the MIMD architecture.
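The subimage idea can be sketched as follows, assuming a 3x3 window, edge clamping at the image border, and row-wise strips; none of these choices are stated in the abstract. The master cuts the image into strips that carry one extra "halo" row on each interior side, so each worker receives only the rows it needs. Workers are run sequentially here, and all function names are hypothetical.

```python
def median3x3(img, row_start, row_end):
    """3x3 median of rows [row_start, row_end) of img, clamping at the edges."""
    h, w = len(img), len(img[0])
    out = []
    for r in range(row_start, row_end):
        out_row = []
        for c in range(w):
            window = [
                img[min(max(r + dr, 0), h - 1)][min(max(c + dc, 0), w - 1)]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
            ]
            out_row.append(sorted(window)[4])  # middle of the 9 sorted values
        out.append(out_row)
    return out


def split_with_halo(img, n_workers):
    """Master side: cut img into strips, each with one halo row per interior side."""
    h = len(img)
    tasks = []
    for k in range(n_workers):
        start, end = k * h // n_workers, (k + 1) * h // n_workers
        lo, hi = max(start - 1, 0), min(end + 1, h)
        tasks.append((img[lo:hi], start - lo, end - lo))
    return tasks


def worker(task):
    """Slave side: filter only the rows this strip owns."""
    sub, first, last = task
    return median3x3(sub, first, last)


# 5x5 test image: constant background with one impulse-noise pixel.
img = [[10] * 5 for _ in range(5)]
img[2][2] = 255

filtered = [row for task in split_with_halo(img, 3) for row in worker(task)]
```

In a real master and slave setup the `worker` calls would run on separate processors; the halo rows are what let each strip be filtered independently while still matching the whole-image result.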
Multistage interconnection networks are used in a number of application areas such as parallel computers and high-speed communication systems. As the performance of these systems depends on an efficient design of the interconnection network, a thorough analysis of the network's performance is important. Mathematical analysis has so far provided inadequate results, and simulation on a uniprocessor usually requires extremely long run times to evaluate large networks. This paper addresses the use of parallel simulation techniques to speed up the simulation of multistage interconnection networks. The conventional null-message approach for resolving the deadlock problem in conservative simulation may cause livelock if lookahead is not guaranteed. We propose a deadlock- and livelock-free scheme that uses null messages, but without guaranteed lookahead, to coordinate the simulation, together with different partitioning techniques for mapping the simulation program onto multicomputers. A flushing mechanism is also used to resolve the null-message explosion problem. Our analysis shows that the proposed flushing mechanism effectively reduces the number of null messages from exponential to linear.
This paper is about the definition of deadlocks in asynchronous message communication systems. The system model considered covers unspecified receptions, non-FIFO channels, and general resource (message) requests including, among others, AND, OR, AND-OR and k-out-of-n requests.
This paper studies properties of message communication modes in distributed systems. It establishes a simple, hierarchical and homogeneous characterization of logically instantaneous, causally ordered and first-in-first-out communications. It is shown that a distributed computation obeys one of these communication modes iff a communication graph of messages does not include a cycle. This characterization plays a key role when one is interested in designing, analyzing, testing or debugging asynchronous distributed computations. This graph-based approach shows there is some unity in the characterization of deadlock, concurrency control, memory consistency and communication modes.
This paper analyzes the ability of several bounded-degree networks that are commonly used for parallel computation to tolerate faults. Among other things, it is shown that an N-node butterfly containing $N^{1-\epsilon}$ worst-case faults (for any constant $\epsilon > 0$) can emulate a fault-free butterfly of the same size with only constant slowdown. Similar results are proven for the shuffle-exchange graph. Hence, these networks become the first connected bounded-degree networks known to be able to sustain more than a constant number of worst-case faults without suffering more than a constant-factor slowdown in performance.
In massively parallel SIMD machines, communication bottlenecks have been a major problem due to the limitations of the available topologies. In particular, these topologies are not well suited to broadcast-type communications. Some suggested approaches are not practical, even though they are asymptotically fast, because they incur a large minimum latency. In this paper, a simple and practical linear broadcast-type communication algorithm is presented; it is based on associative computing and does not use interconnection networks at all.
This paper presents the results for the verification of the *** cache coherence protocol. The *** protocol uses a distributed directory with limited number of pointers and hardware supported overflow handling that kee...