In the group mutual exclusion problem, each critical section has a type or a group associated with it. Processes requesting critical sections belonging to the same group (that is, of the same type) may execute their c...
详细信息
In the group mutual exclusion problem, each critical section has a type or a group associated with it. Processes requesting critical sections belonging to the same group (that is, of the same type) may execute their critical sections concurrently. However, processes requesting critical sections belonging to different groups (that is, of different types) must execute their critical sections in a mutually exclusive manner. Most algorithms for solving the group mutual exclusion problem that have been proposed so far in the literature treat all groups equally. this is quite acceptable if a process, at the time of making a request for critical section, selects a group for the critical section uniformly. However, if some groups are more likely to be selected than others, then better performance can be achieved by treating different groups in a different manner. In this paper, we propose an efficient algorithm for solving the group mutual exclusion problem when group selection probabilities are nonuniformly distributed. Our algorithm has a message complexity of 2n - 1 per request for critical section, where n is the number of processes in the system. It has low synchronization delay of one message hop and low waiting time of two message hops. the maximum concurrency of our algorithm is n, which implies that if all processes have requested critical sections of the same type then all of them may execute their critical sections concurrently. Finally, the amortized message-size complexity of our algorithm is O(1). Our experimental results indicate that our algorithm outperforms the existing algorithms, whose complexity measures are comparable to that of ours, by as much as 50% in some cases. (C) 2007 Elsevier Inc. All rights reserved.
the proceedings contain 51 papers. the topics discussed include: using an updating of DHCP in mobile ad-hoc networks;searching and detecting spatial LSB steganographic images on the Internet;development of parallel di...
详细信息
ISBN:
(纸本)0889865701
the proceedings contain 51 papers. the topics discussed include: using an updating of DHCP in mobile ad-hoc networks;searching and detecting spatial LSB steganographic images on the Internet;development of parallel direct sparse linear solvers within a parallel finite element code;checkpointing and rollback-recovery protocol integrated with VsSG protocol for RYW session guarantee;achieving realtime capabilities in Ethernet networks by edge-coloring of communication conflict-multigraphs;cross-layer designs for mitigating range attacks in ad hoc networks;grid load balancing using an echo system of intelligent ants;critical path routing (CPR) protocol for mobile ad hoc networks;ADPROC: an adaptive routing framework to provide QoS in wireless sensor networks;evaluating the use of Motes and Tinyos for a mobile sensor platform;and a comparison study of optical MIN networks withparallel planes.
Next generation multimedia mobile phones that use the high bandwidth 3G cellular radio network consume more power. Multimedia algorithms such as speech, video transcodecs have very large instruction foot prints and co...
详细信息
ISBN:
(纸本)9781604233926
Next generation multimedia mobile phones that use the high bandwidth 3G cellular radio network consume more power. Multimedia algorithms such as speech, video transcodecs have very large instruction foot prints and consequently stalled due to instruction cache misses. the conflicts in on-chip caches contribute a large fraction of the CPU cycle penalty and hence increase in power consumption. Many commercial tools are developed to minimize such cache misses by adequately placing the frequently called procedures in an application. Followed by profile extraction, these tools use cache line coloring and post compilation techniques for cache hit optimization. the major impediment in the optimal performance of such tools is their static layout profile, which does not consider the dynamic behavior of the application. We propose a methodology called DCP (dynamic code placement) for positioning code at run time for good instruction cache performance and have implemented in high end processors. the dynamic application profile is completely transparent to the developer’s code. this technique optimizes the code footprint in memory layout of a program. It improves i-cache mapping to increase the number of cache hits and eventually reduce the CPU stalls. Our optimization is powered with static as well as detail run time profile information that extracts the relevant, temporal behavior of the applications. Moreover, while mapping code in instruction cache, the effect of inter-procedural code positioning is also considered. Improvement over the Pettis and Hansen approach (PH) is also shown in results. though majority of multimedia applications can be optimized by our framework, application dominated withthe function pointers do not work correctly. the technique incurs low overheads and enhances the cache hits architecture correlation. For a range of applications we show that instruction miss rates can be reduced by 19-68%. Using a simple model this corresponds to execution time re
the proceedings contain 45 papers. the topics discussed include: on the number of spanning trees in a mesh structure;polylogarihmic gap between meshes with dynamically separable buses and meshes with statically partit...
ISBN:
(纸本)9781604236446
the proceedings contain 45 papers. the topics discussed include: on the number of spanning trees in a mesh structure;polylogarihmic gap between meshes with dynamically separable buses and meshes with statically partitioned buses;a bus-based recursive mesh for high-performance on-chip systems;designing parallel algorithms on the star graph using orthogonal decomposition;a virtual-channel free mapping for application-specific on-chip torus networks;using abstract state machines to specify a protocol that implements the GWO consistency model;optimizing timing and code size using maximum direct loop fusion;a constraint-based framework for concurrent and distributed programming;and topology and power management routing for reliability in ad-hoc networks.
the co-operation of parallel simulated annealing processes to solve the vehicle routing problem with time windows (VRPTW) is considered. the objective is to investigate how the number of parallel processes and the fre...
详细信息
ISBN:
(纸本)9780889866386
the co-operation of parallel simulated annealing processes to solve the vehicle routing problem with time windows (VRPTW) is considered. the objective is to investigate how the number of parallel processes and the frequency of processes co-operation influence the accuracy of solutions to the VRPTW. the accuracy of solutions is measured by their proximity to the best known solution.
In this paper, we present a new method of static load balancing for parallel mining of all frequent itemsets on a distributed-memory (DM) parallel machine. the method partitions the space of all frequent itemsets into...
详细信息
ISBN:
(纸本)9780889866386
In this paper, we present a new method of static load balancing for parallel mining of all frequent itemsets on a distributed-memory (DM) parallel machine. the method partitions the space of all frequent itemsets into subspaces of approximately the same size. Hence, it allows to balance the computational load for an arbitrary frequent itemset mining algorithm.
To keep up withthe pace of fast development of Internet, cluster architecture has been proposed for next generation core routers. In a cluster router, parallel computation is expected. computing shortest path tree (S...
详细信息
ISBN:
(纸本)9780889866386
To keep up withthe pace of fast development of Internet, cluster architecture has been proposed for next generation core routers. In a cluster router, parallel computation is expected. computing shortest path tree (SPT) is a fundamental problem implementing OSPF, which is one of the most popular routing protocols. this paper presents a parallel algorithm BPA (Branching parallel Algorithm) for computing SPT, analyzes the performance of BPA, and finally validates the BPA performance by experiments
In this paper, we discuss parallelization of a high-level computer vision application in medical imaging, namely, multi-scale active shape description of MR (magnetic resonance) brain images of epileptic patients usin...
详细信息
ISBN:
(纸本)9780889866386
In this paper, we discuss parallelization of a high-level computer vision application in medical imaging, namely, multi-scale active shape description of MR (magnetic resonance) brain images of epileptic patients using active contour models, on a cluster of workstations. the paper gives a comparative study and analysis of three different approaches of parallel implementation using corresponding parallelcomputing patterns such as Temporal Multiplexing, Pipeline, and Composite Pipeline. the outcome of the cluster-based parallel implementations has shown encouraging results.
distributed testing is often hard to implement. this is due to difficulties in handling heterogeneous environments, complex configurations, synchronization, error probing, result maintenance and automation in distribu...
详细信息
ISBN:
(纸本)9780889866386
distributed testing is often hard to implement. this is due to difficulties in handling heterogeneous environments, complex configurations, synchronization, error probing, result maintenance and automation in distributed testing. this paper describes a practical testing framework that permits automated distributed testing of distributedsystems and applications. the framework extends the capability of JUnit to support test execution over heterogeneous environments and complex configurations.
Memory consistency model is crucial to the performance of shared-memory multiprocessors, and in current architectures several different models are adopted. In this paper, using graph algorithms for illustrative purpos...
详细信息
ISBN:
(纸本)9780889866386
Memory consistency model is crucial to the performance of shared-memory multiprocessors, and in current architectures several different models are adopted. In this paper, using graph algorithms for illustrative purposes, we consider the impact of memory model on the implementation and performance of parallel algorithms on shared-memory multiprocessors. We show that the implementation of PRAM algorithm's is largely "oblivious" of the underlying memory model, and has good performance on relaxed models. More importantly, we show that different memory models can favor drastically different algorithm designs.
暂无评论