A model for virtual memory in a distributed memory parallel computer is proposed. It uses a novel parallel computing operating system framework and leads to the definition of two strategies for implementing parallel virtual memory. Careful analysis and simulation results indicate that dynamic page allocation performs better for applications that exhibit some locality of reference of public data and for applications whose data space does not fit in the physical memory available. Static page allocation is more efficient in cases of poor locality and small data space (no virtual memory needed).
ISBN: 081864222X (print)
Highways is a distributed-programming system that we are building with high performance as a major goal. The suite of send primitives implemented in Highways, called Global-Flush Primitives, has three notable aspects. (1) Global-Flush Primitives permit making an assertion about messages sent in the past of sending m, in the future of sending m, about both, or about neither. (2) The past and the future of an event are defined using the "happened before" relation. (3) A message can be sent to any subgroup of processes specified as a parameter.
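As an illustration of the "happened before" relation that the abstract uses to define the past and future of an event, the following is a minimal sketch based on vector clocks. Vector clocks are an assumption made here for illustration; the abstract does not say how Highways tracks causality, and the names `Process`, `happened_before`, and `concurrent` are hypothetical. The sketch does not implement Global-Flush Primitives themselves.

```python
class Process:
    """Hypothetical process that stamps its events with a vector clock."""

    def __init__(self, pid, n):
        self.pid = pid          # index of this process, 0 <= pid < n
        self.clock = [0] * n    # one counter per process

    def local_event(self):
        # Advance this process's own counter and return the event's timestamp.
        self.clock[self.pid] += 1
        return tuple(self.clock)

    def send(self):
        # Sending is an event; its timestamp travels with the message.
        return self.local_event()

    def receive(self, stamp):
        # Merge the sender's knowledge, then count the receive as an event.
        self.clock = [max(a, b) for a, b in zip(self.clock, stamp)]
        return self.local_event()


def happened_before(a, b):
    """True if the event stamped a happened before the event stamped b."""
    return a != b and all(x <= y for x, y in zip(a, b))


def concurrent(a, b):
    """True if neither event happened before the other."""
    return a != b and not happened_before(a, b) and not happened_before(b, a)
```

With two processes, a send on one and an unrelated local event on the other are concurrent, while the send happens before the matching receive. The "past of sending m" would then be the set of events e with `happened_before(e, stamp_of_m)`.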
ISBN: 9781479941162 (print)
A distributed algorithm is proposed to control block motion on a reconfigurable micro-electro-mechanical modular surface. The modular surface is designed to convey fragile and tiny micro-parts. The distributed algorithm solves a discrete trajectory optimization problem. In particular, the algorithm computes the shortest path between two points of the modular surface using a strategy based on minimum hop count. The proposed method, based on distributed asynchronous iterative elections, is scalable.
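To make the minimum-hop-count criterion concrete, here is a minimal centralized sketch that finds a fewest-hops path on a rectangular grid with breadth-first search. It is not the distributed asynchronous election algorithm described in the abstract; the grid model, the 4-neighbour connectivity, and the names `min_hop_path` and `blocked` are assumptions made for illustration.

```python
from collections import deque


def min_hop_path(width, height, blocked, start, goal):
    """Return a fewest-hops path from start to goal as a list of (x, y)
    cells, or None if the goal is unreachable. Cells in `blocked` cannot
    be traversed. Assumes 4-neighbour moves on a width x height grid."""
    parent = {start: None}      # also serves as the visited set
    queue = deque([start])
    while queue:
        cell = queue.popleft()
        if cell == goal:
            # Walk the parent links back to the start.
            path = []
            while cell is not None:
                path.append(cell)
                cell = parent[cell]
            return path[::-1]
        x, y = cell
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            nx, ny = nxt
            inside = 0 <= nx < width and 0 <= ny < height
            if inside and nxt not in blocked and nxt not in parent:
                parent[nxt] = cell
                queue.append(nxt)
    return None
```

Because breadth-first search explores cells in order of increasing hop distance, the first time the goal is dequeued its path has the minimum number of hops.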
ISBN: 0818672552 (print)
We study parallel algorithms for the minimum spanning tree problem, based on the sequential algorithm of Boruvka. The target architectures for our algorithm are asynchronous, distributed-memory machines. Analysis of our parallel algorithm, on a simple model that is reminiscent of the LogP model, shows that in principle a speedup proportional to the number of processors can be achieved, but that communication costs can be significant. To reduce these costs, we develop a new randomized linear-work pointer-jumping scheme that performs better than previous linear-work algorithms. We also consider empirically the effects of data imbalance on the running time. For the graphs used in our experiments, load-balancing schemes result in little improvement in running times. Our implementations on sparse graphs with 64,000 vertices on the Thinking Machines CM-5 achieve a speedup factor of about 4 on 16 processors. In this environment, packaging of messages turns out to be the most effective way to reduce communication costs.
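For reference, the sequential algorithm of Boruvka that the abstract builds on can be sketched as follows. This is a plain single-threaded version using a union-find structure, not the parallel distributed-memory algorithm or the pointer-jumping scheme from the paper. The function name `boruvka_mst`, the edge format, and the tie-breaking by edge index are assumptions made for illustration.

```python
def boruvka_mst(n, edges):
    """Sequential Boruvka. `edges` is a list of (weight, u, v) with
    vertices numbered 0..n-1. Returns (total_weight, chosen_edges).
    For a disconnected graph this yields a minimum spanning forest."""
    parent = list(range(n))

    def find(x):
        # Union-find root lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    chosen = []
    components = n
    while components > 1:
        # For each component root, find its cheapest outgoing edge.
        # Ties are broken by edge index so that no cycle can form.
        cheapest = [None] * n
        for idx, (w, u, v) in enumerate(edges):
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            key = (w, idx)
            for r in (ru, rv):
                if cheapest[r] is None or key < cheapest[r][0]:
                    cheapest[r] = (key, u, v)

        merged = False
        for entry in cheapest:
            if entry is None:
                continue
            (w, _), u, v = entry
            ru, rv = find(u), find(v)
            if ru != rv:        # may already be merged in this round
                parent[ru] = rv
                chosen.append((w, u, v))
                components -= 1
                merged = True
        if not merged:
            break               # no edges left between components
    return sum(w for w, _, _ in chosen), chosen
```

Each round at least halves the number of components in a connected graph, which is what makes the per-component "cheapest edge" step a natural unit of parallel work.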
ISBN: 0818675829 (print)
The performance of I/O-intensive applications on a multiprocessor system depends mostly on the variety of disk access delays encountered in the I/O system. Over the years, disk performance has improved more slowly than processor speeds. It is therefore necessary to model I/O delays and evaluate the performance benefits of moving an application to a better multiprocessor system. In this work, we perform such an analysis by measuring I/O delays for a synthesized application that uses a Parallel Distributed File System. The aim of this study was to evaluate the performance benefits of better disks in a multiprocessor system that was designed a few years ago. We report how I/O performance would be affected if an application were run on a system with better disks and communication links. We show a substantial improvement in the performance of the I/O system with better disks and communication links relative to the existing system.
ISBN: 9780769534718 (print)
Relay transmission is a promising technology for improving the throughput and energy efficiency in multi-rate wireless personal area networks (WPANs). In this paper, we propose a distributed relay MAC (DR-MAC) protocol for WiMedia WPANs. DR-MAC extends the distributed reservation protocol (DRP) in WiMedia MAC, and neighbor information for relay transmission can be collected during the beacon period. Therefore, DR-MAC can minimize control overhead for relay transmission and is compatible with the standard WiMedia MAC protocol. We also introduce a medium access slot (MAS) allocation procedure for maximizing the efficiency of DR-MAC. Extensive simulation results demonstrate that, compared with direct transmission, DR-MAC can improve the throughput by 10% and reduce the energy consumption by 26% when the number of devices is 20.
ISBN: 9781538616796 (print)
State Machine Replication (SMR) is a well-known technique to implement fault-tolerant systems. In SMR, servers are replicated and client requests are deterministically executed in the same order by all replicas. To improve performance in multi-processor systems, some approaches have proposed to parallelize the execution of non-conflicting requests. Such approaches perform remarkably well in workloads dominated by non-conflicting requests. Conflicting requests introduce expensive synchronization and result in considerable performance loss. Current approaches to parallel SMR define the degree of parallelism statically. However, it is often difficult to predict the best degree of parallelism for a workload and workloads experience variations that change their best degree of parallelism. This paper proposes a protocol to reconfigure the degree of parallelism in parallel SMR on-the-fly. Experiments show the gains due to reconfiguration and shed some light on the behavior of parallel and reconfigurable SMR.
ISBN: 0769523129 (print)
This paper describes a parallel algorithm for correlating or "fusing" streams of data from sensors and other sources of information. The algorithm is useful for applications where composite conditions over multiple data streams must be detected rapidly, such as intrusion detection or crisis management. The implementation of this algorithm on a multithreaded system and the performance of this implementation are also briefly described.
ISBN: 9780769537474 (print)
Most traditional Distributed Shared Memory (DSM) systems support data sharing in multi-process applications. This paper proposes a Multi-threaded Multi-home DSM system (MM-DSM) to support both data sharing and computation synchronization in multi-threaded applications whose threads are grouped into bundles and distributed across multiple computers for parallel execution. Globally shared data are rearranged and assigned to different thread bundles based on their access patterns. As thread bundles move around, their hosting nodes act as the homes of the associated data blocks to reduce communication cost. Programmers can still use the shared-memory programming paradigm, while data consistency, distributed locking, false sharing, and multiple writes are handled by MM-DSM. Experimental results demonstrate its effectiveness and correctness.
Designing distributed and parallel applications is an important issue in the context of programming and execution environments. Designing applications as independently and as transparently as possible of the distribut...