ISBN: 0769523129 (print)
Consistency is an important issue in Distributed Shared Memory (DSM) systems. These systems share a set of objects or virtual memory pages. Data sharing lets the applications in a workload access the data concurrently, but these concurrent accesses can leave the shared data in an inconsistent state. Consistency models are responsible for managing the consistency of shared data for the workloads. In this work, we propose, present, and analyze a reconfigurable consistency model for object-based DSMs, which we call ROCoM (Reconfigurable Object Consistency Model). ROCoM's behavior is represented using a reconfigurable algorithm (RA), and its analysis was carried out using a simulation tool, ClusterSim (Cluster Simulation Tool). Our results show that ROCoM performed, on average, 55% better than traditional consistency models.
ISBN: 0769523129 (print)
Many of the world's most critical systems are distributed real-time embedded (DRE) systems with mission-critical quality of service (QoS) requirements. However, because of their nature - heterogeneous nodes and links, shared and constrained resources, and deployment in dynamic environments - providing QoS requires coordinated QoS management across the multiple end-to-end application streams that compete for shared resources throughout the system. It requires dynamically allocating resources to these end-to-end application streams based on potentially changing mission requirements, and shaping application behavior to use the allocated resources effectively. In this paper, we describe the issues involved in providing end-to-end QoS management in DRE systems, an architecture we have designed to support system-wide end-to-end QoS management, and a multi-UAV surveillance and target tracking application we are using to evaluate these technologies.
ISBN: 0769523129 (print)
With the heavy reliance of modern scientific applications upon the MPI Standard, it has become critical for implementations of MPI to be as capable and as fast as possible. This has led some of the fastest modern networks to introduce the capability to offload aspects of MPI processing to an embedded processor on the network interface. With this important capability have come significant performance implications. Most notably, the time to process long queues of posted receives or unexpected messages is substantially longer on embedded processors. This paper presents an associative list matching structure to accelerate the processing of moderate-length queues in MPI. Simulations are used to compare the performance of an embedded processor augmented with this capability to a baseline implementation. The proposed enhancement significantly reduces latency for moderate-length queues while adding virtually no overhead for extremely short queues.
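To make the queue-processing cost concrete, here is a minimal Python sketch of the baseline that such a structure would accelerate: a linear search of the posted-receive queue. It follows the standard MPI matching rule (a message matches the earliest posted receive with the same communicator context and a matching or wildcard source and tag). It is not the paper's associative structure; the class name, function names, and the `-1` wildcard values are illustrative assumptions.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Optional

# Hypothetical stand-ins for MPI_ANY_SOURCE and MPI_ANY_TAG.
ANY_SOURCE = -1
ANY_TAG = -1


@dataclass
class PostedReceive:
    context: int  # communicator context id
    source: int   # sender rank, or ANY_SOURCE
    tag: int      # message tag, or ANY_TAG


def matches(recv: PostedReceive, context: int, source: int, tag: int) -> bool:
    """Return True if an incoming message envelope satisfies a posted receive."""
    return (
        recv.context == context
        and recv.source in (ANY_SOURCE, source)
        and recv.tag in (ANY_TAG, tag)
    )


def match_incoming(
    queue: list[PostedReceive], context: int, source: int, tag: int
) -> Optional[PostedReceive]:
    """Remove and return the earliest-posted matching receive, or None.

    The loop visits entries in posting order, so the cost grows linearly
    with the number of entries ahead of the match.
    """
    for i, recv in enumerate(queue):
        if matches(recv, context, source, tag):
            return queue.pop(i)
    return None
```

Because wildcards are allowed and posting order must be respected, a plain hash lookup on (source, tag) is not a drop-in replacement for this loop, which is one reason a hardware-assisted parallel comparison is attractive.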
ISBN: 0769523129 (print)
A low-power, dynamically reconfigurable FFT fabric is proposed in this paper. The architecture serves as a scalable IP core suitable for System on Chip applications. The system can be configured as a 16-, 32-, 64-, 128-, 256-, 512-, or 1024-point FFT. Compared with a conventional ASIC FFT processor, this FFT fabric offers dynamic reconfigurability while incurring only a 12-19% increase in energy consumption and requiring 14% more area than a 1024-point non-reconfigurable FFT fabric. On the other hand, compared with an FFT processor mapped onto a general-purpose reconfigurable architecture, it consumes 30-94% less energy.
ISBN: 0769523129 (print)
Reconfigurable computing offers the promise of performing computations in hardware to increase performance and efficiency while retaining much of the flexibility of a software solution. Recently, the capacities of reconfigurable computing devices, such as field-programmable gate arrays, have risen to levels that make it possible to execute 64-bit floating-point operations. SRC Computers has designed the SRC-6 MAP station to blend the benefits of commodity processors with the benefits of reconfigurable computing. In this paper, we describe our effort to accelerate the performance of several scientific applications on the SRC-6. We describe our methodology, analysis, and results. Our early evaluation demonstrates that the SRC-6 provides a unique software stack that is applicable to many scientific solutions, and our experiments reveal the performance benefits of the system.
ISBN: 0769523129 (print)
This paper investigates the overhead of a dynamic load balancing library for large irregular data-parallel scientific applications on general-purpose clusters. The library is based on an integrated approach that combines the advantages of novel dynamic loop scheduling strategies, used as data migration policies, with the advances in resource management and task migration capabilities offered by a recently developed parallel runtime system. The paper focuses on the contribution of the runtime system software layer to the total overhead of the library. Experiments comparing the performance of two applications using the library, N-body simulations and the profiling of a quadrature routine, with the performance of the same applications using an MPI-only implementation of the dynamic scheduling techniques indicate only a slight decrease in performance due to the overhead of the runtime system software layer. The results validate the suitability of the runtime system as an implementation platform for dynamic load balancing schemes, and underscore the significance of the integrated approach as well as the benefits of using the library, especially in cluster applications characterized by irregular and unpredictable behavior.
ISBN: 0769523129 (print)
The frequency and intensity of Internet attacks are rising at an alarming pace. Several technologies and concepts have been proposed for fighting distributed denial of service (DDoS) attacks: traceback, pushback, i3, SOS, and Mayday. This paper shows that in the case of DDoS reflector attacks they are either ineffective or even counterproductive. We then propose a novel concept and system that extends network users' control over their network traffic into the Internet using adaptive traffic processing devices. We safely delegate partial network management capabilities from network operators to network users. All network packets with a source or destination address owned by a network user can now also be controlled within the Internet instead of only at the network user's Internet uplink. By limiting the traffic control features and by restricting the realm of control to the "owner" of the traffic, we can rule out misuse of this system. Applications of our system are manifold: prevention of source address spoofing, DDoS attack mitigation, distributed firewall-like filtering, new ways of collecting traffic statistics, traceback, distributed network debugging, support for forensic analyses, and many more.
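The ownership restriction described above can be illustrated with a small Python sketch using the standard `ipaddress` module. It only shows the rule "a user may control a packet if its source or destination address lies in a prefix that user owns"; the function name, the prefix list, and the example addresses (taken from the reserved documentation ranges) are assumptions for illustration and do not describe the authors' system.

```python
import ipaddress


def user_may_control(owned_prefixes, src, dst):
    """Return True if the packet's source or destination address
    falls inside one of the prefixes owned by the network user."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    return any(src_ip in net or dst_ip in net for net in owned_prefixes)


# Hypothetical user owning one documentation prefix (TEST-NET-1).
owned = [ipaddress.ip_network("192.0.2.0/24")]

# Traffic towards the user's prefix: the user may apply its rules.
inbound = user_may_control(owned, "198.51.100.7", "192.0.2.10")

# Traffic between two third parties: outside the user's realm of control.
foreign = user_may_control(owned, "198.51.100.7", "203.0.113.5")
```

Here `inbound` is `True` and `foreign` is `False`. A traffic processing device that runs a check of this kind before applying any user-supplied rule would confine each user to its own traffic.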
ISBN: 0769523129 (print)
A large emerging class of interactive multimedia streaming applications that are highly parallel can be represented as a coarse-grain, pipelined, data-flow graph. One common characteristic of these applications is their use of current data: a task obtains the latest data from preceding stages, skipping over older data items if necessary, to perform its computation. When parallelized, such applications waste resources because they process and keep in memory data that is eventually dropped from the application pipeline. To overcome this problem, we have designed and implemented an Adaptive Resource Utilization (ARU) mechanism that uses feedback to dynamically adjust the resources each running task thread utilizes so as to minimize wasted resource use by the entire application. A color-based people tracker application is used to explore the performance benefits of the proposed mechanism. We show that ARU reduces the application's memory footprint by two-thirds compared to our previously published results, while also improving the latency and throughput of the application.
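The "current data" access pattern, and the waste it can cause, can be sketched in a few lines of Python. The class below is a hypothetical single-slot channel between two pipeline stages: a consumer always receives the newest item, and the `skipped` counter records items that were produced but overwritten before anyone read them. It illustrates the problem only and is not the ARU mechanism; all names are assumptions.

```python
import threading


class LatestSlot:
    """Single-slot channel: readers see only the most recent item."""

    def __init__(self):
        self._lock = threading.Lock()
        self._item = None
        self._consumed = True  # nothing is waiting to be read yet
        self.skipped = 0       # items overwritten before being read

    def put(self, item):
        """Publish a new item, replacing any unread older one."""
        with self._lock:
            if not self._consumed:
                self.skipped += 1  # the older item was produced in vain
            self._item = item
            self._consumed = False

    def get_latest(self):
        """Return the newest unread item, or None if there is none."""
        with self._lock:
            if self._consumed:
                return None
            self._consumed = True
            return self._item


slot = LatestSlot()
for frame in ("frame1", "frame2", "frame3"):
    slot.put(frame)           # the producer runs ahead of the consumer
latest = slot.get_latest()    # the consumer sees only "frame3"
```

After this run `slot.skipped` is 2: two frames consumed producer time and memory without ever being used downstream, which is the kind of wasted work the abstract describes.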
ISBN: 0769523129 (print)
Grid applications typically deal with huge amounts of data, and often the same data have to be transferred to and processed on many resources. Nevertheless, the majority of existing middleware platforms for Grid computing do not provide suitable programming and communication models to ease software development and to improve communication performance when a large set of receivers is involved. Some middleware platforms for wide-area network computing, such as ProActive, provide a group abstraction to deal transparently with a number of similar receivers. We propose an extension of this mechanism to improve its features for Grid environments. In particular, ProActive native groups have been extended at both the programming and communication levels to support both different internal behaviors and high-performance communication based on IP multicast. A case study shows the effectiveness of the new mechanism and its efficiency compared with the original one.
ISBN: 0769523129 (print)
Sparse and irregular computations constitute a large fraction of applications in the data-intensive scientific domain. While every effort is made to balance the computational workload in such computations across parallel processors, achieving sustained near machine-peak performance with a close-to-ideal, load-balanced computation-to-processor mapping is inherently difficult. As a result, the loads assigned to parallel processors often exhibit significant variations. While numerous past efforts have studied this imbalance from the performance viewpoint, to our knowledge no prior study has considered exploiting the imbalance to reduce power consumption during execution. Power consumption in large-scale clusters of workstations is becoming a critical issue, as noted by several recent research papers from both industry and academia. Focusing on sparse matrix computations in which the underlying parallel computations and data dependencies can be represented by trees, this paper proposes schemes that save power through voltage/frequency scaling. Our goal is to reduce overall energy consumption by scaling the voltages/frequencies of those processors that are not on the critical path; i.e., our approach is oriented towards saving power without incurring performance penalties.
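The idea of slowing down processors that are off the critical path can be illustrated with a short Python sketch. It rests on textbook simplifications that are assumptions here, not the paper's model: execution time is work divided by frequency, voltage scales linearly with frequency, and dynamic energy is therefore proportional to work times frequency squared. It also ignores the tree-structured dependencies the paper considers and treats each processor's load as one independent block. Function names, loads, and frequency levels are hypothetical.

```python
def scaled_frequencies(loads, levels):
    """For each processor, pick the lowest frequency level that still
    finishes no later than the most heavily loaded processor does
    when running at the maximum level."""
    f_max = max(levels)
    w_max = max(loads)
    chosen = []
    for w in loads:
        # Finish in time: w / f <= w_max / f_max, written without division.
        feasible = [f for f in levels if f * w_max >= w * f_max]
        chosen.append(min(feasible))
    return chosen


def relative_energy(loads, freqs, f_max):
    """Energy relative to running everything at f_max, assuming
    energy is proportional to work * frequency**2 (assumed model)."""
    scaled = sum(w * f ** 2 for w, f in zip(loads, freqs))
    baseline = sum(w * f_max ** 2 for w in loads)
    return scaled / baseline


loads = [100, 50, 80]      # hypothetical work units per processor
levels = [0.5, 0.75, 1.0]  # hypothetical normalized frequency levels
freqs = scaled_frequencies(loads, levels)  # [1.0, 0.5, 1.0]
ratio = relative_energy(loads, freqs, max(levels))
```

In this example only the second processor has enough slack to drop to a lower level; the third would need a frequency of 0.8, which lies between the available levels, so it stays at 1.0. Under the assumed model, total energy falls to roughly 84% of the baseline while the finishing time is unchanged.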