In this paper we describe DREAM, a distributed environment that provides run-time support for parallel computations on asynchronous multiprocessors. The system supports global distributed arrays as collections of subarrays in the local memories of the intervening processors. Nodes allocate and deallocate array portions dynamically, and access external array sections without the intervention of the user program running in the remote node. Remote accesses can be performed while the program continues its execution, thus overlapping communication and computation. This feature allows the user to implement dynamic communication patterns by accessing external array elements on demand without incurring a heavy performance penalty. DREAM also vectorizes requests into larger network messages for efficiency. We report performance results for an application running on top of a prototype of the system, showing good scalability and masking the network latency with computation.
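The request vectorization mentioned in the abstract can be illustrated with a minimal sketch. It assumes a simple block distribution in which each node owns `block_size` consecutive elements; the abstract does not state DREAM's actual distribution or message format, so `owner_of` and `vectorize_requests` are hypothetical names used only for illustration.

```python
from collections import defaultdict


def owner_of(index, block_size):
    """Node that owns a global index (assumed block distribution)."""
    return index // block_size


def vectorize_requests(indices, block_size):
    """Group element requests by owning node.

    Each resulting list would travel as one network message instead of
    one message per element.
    """
    batches = defaultdict(list)
    for i in indices:
        batches[owner_of(i, block_size)].append(i)
    return dict(batches)
```

With `block_size = 4`, five element requests for indices 0, 9, 3, 10 and 4 collapse into three messages, one per owning node.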
ISBN: 9780769528335 (print)
Recently there has been a massive increase in computing requirements for parallel applications. These parallel applications and supporting cluster services often need to share system-wide resources. The coordination of these applications is typically managed by a distributed lock manager, whose performance is critical for application performance. Researchers have shown that the use of two-sided communication protocols such as TCP/IP (used by current-generation lock managers) can have a significant impact on the scalability of distributed lock managers. In addition, existing locking designs based on one-sided communication support exclusive access mode only, and can pose significant scalability limitations on applications that need both shared and exclusive access modes, such as cooperative or file-system caching. Hence the utility of these existing designs in high-performance scenarios can be limited. In this paper we present a novel protocol for distributed locking services that uses the advanced network-level one-sided atomic operations provided by InfiniBand. Our approach augments existing approaches by eliminating the need for two-sided communication protocols in the critical locking path. We also demonstrate that our approach provides significantly higher performance in scenarios needing both shared and exclusive mode access to resources. Our experimental results show a 39% improvement in basic locking latencies over traditional send/receive-based implementations. We also observe a significant improvement (up to 317% for 16 nodes) over existing RDMA-based distributed queuing schemes for shared mode locking scenarios.
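To make the idea of shared and exclusive locking over one-sided atomics concrete, here is a minimal local simulation. InfiniBand offers atomic compare-and-swap and fetch-and-add on 64-bit words; everything else below is an assumption for illustration, not the protocol from the paper: the word layout (upper 32 bits hold the exclusive holder's ID, lower 32 bits count shared holders), the class `LockWord`, and the try-style functions, which simply fail rather than queue or retry.

```python
import threading

MASK64 = 0xFFFFFFFFFFFFFFFF


class LockWord:
    """Local stand-in for a 64-bit word in remotely accessible memory."""

    def __init__(self):
        self._value = 0
        self._mutex = threading.Lock()  # models the atomicity of the NIC operation

    def compare_and_swap(self, expected, new):
        """Write `new` if the word equals `expected`; return the old value."""
        with self._mutex:
            old = self._value
            if old == expected:
                self._value = new
            return old

    def fetch_and_add(self, delta):
        """Add `delta` with 64-bit wraparound; return the old value."""
        with self._mutex:
            old = self._value
            self._value = (old + delta) & MASK64
            return old


def try_exclusive(word, node_id):
    # Succeeds only when there is no exclusive holder and no shared holder.
    return word.compare_and_swap(0, node_id << 32) == 0


def release_exclusive(word, node_id):
    # Subtract the holder ID so a concurrent shared attempt is not clobbered.
    word.fetch_and_add(-(node_id << 32))


def try_shared(word):
    old = word.fetch_and_add(1)
    if old >> 32 == 0:
        return True
    word.fetch_and_add(-1)  # an exclusive holder exists: back out
    return False


def release_shared(word):
    word.fetch_and_add(-1)
```

Every operation touches only the lock word, so the node that hosts it never has to run lock-manager code on the critical path. A real design would also need queuing or back-off for failed attempts, which this sketch omits.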
ISBN: 9781457706608 (print)
Image processing is becoming more and more present in our everyday life. Given the requirements of miniaturization, low power, and performance needed to provide intelligent processing directly in the camera, embedded cameras will dominate the image processing landscape in the future. While the common approach to developing such embedded systems is to use sequentially operating processors, image processing algorithms are inherently parallel, so hardware devices like FPGAs are a natural match for developing highly efficient systems. Unfortunately, hardware development is more difficult, and fewer experts are available than for software. Automating the design process will leverage the existing infrastructure, thus providing faster time to market and quick investigation of new algorithms. We exploit ASP (answer set programming) for system synthesis with the goal of generating, from an image processing application, an optimal hardware/software partitioning, a viable communication structure, and the corresponding scheduling.
ISBN: 9781665497992 (print)
The proceedings contain 25 papers. The topics discussed include: Blue Danube: a large-scale, end-to-end synchronous, distributed data stream processing architecture for time-sensitive applications; a pedestrian movement model for 3D visualization in a driving simulation environment; enabling simulation interoperability between international standards in the space domain; the use of the IEEE HLA standard to tackle interoperability issues between heterogeneous components; a distributed digital twin implementation of a hemodialysis unit aimed at helping prevent the spread of the Omicron COVID-19 variant; Cell-DEVS CO
MapReduce is a partition-based parallel programming model and framework enabling easy development of scalable parallel programs on clusters of commodity machines. In order to make time-intensive applications benefit f...
ISBN: 0780383796 (print)
This paper proposes two distributed algorithms for the heuristic solution of the Steiner Tree Problem in Networks (SPN). The problem has a practical application in the construction of a minimum-cost distribution tree for multicast transmission. Multicast transmission is a necessary lower-level network service for the wide diffusion of new multimedia network applications. Currently, given the lack of efficient distributed methods, existing protocols build the multicast distribution tree using a selected central node. The proposed distributed algorithms allow the construction of effective distribution trees using a coordination protocol among the network nodes. The algorithms have been implemented and tested both in simulation and on real networks, and their performance values are presented.
ISBN: 0769523129 (print)
Processing-in-Memory (PIM) technology encompasses a range of research leveraging a tight coupling of memory and processing. The most distinctive features of the technology are extremely wide paths to memory, extremely low memory latency, and wide functional units. Many PIM researchers are also exploring extremely fine-grained multi-threading capabilities. This paper explores a mechanism for leveraging these features of PIM technology to enhance commodity architectures in a seemingly mundane way: accelerating MPI. Modern network interfaces leverage simple processors to offload portions of the MPI semantics, particularly the management of posted receive and unexpected message queues. Without adding cost or increasing clock frequency, using PIMs in the network interface can enhance performance. The results are a significant decrease in latency and an increase in small-message bandwidth, particularly when long queues are present.
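The two queues named in the abstract follow standard MPI matching rules, which the sketch below reproduces in simplified form. It is not the paper's PIM implementation: `MatchEngine`, `ANY`, and the tuple layouts are hypothetical, and communicators and message ordering details beyond first-match are left out.

```python
from collections import deque

ANY = -1  # wildcard, standing in for MPI_ANY_SOURCE / MPI_ANY_TAG


def matches(recv, msg):
    """True if a posted receive (source, tag) accepts a message (source, tag, ...)."""
    rsrc, rtag = recv
    return rsrc in (ANY, msg[0]) and rtag in (ANY, msg[1])


class MatchEngine:
    def __init__(self):
        self.posted = deque()      # receives waiting for a message: (source, tag)
        self.unexpected = deque()  # messages waiting for a receive: (source, tag, payload)

    def post_receive(self, source, tag):
        """Search the unexpected queue first; otherwise enqueue the receive."""
        for i, msg in enumerate(self.unexpected):
            if matches((source, tag), msg):
                del self.unexpected[i]
                return msg[2]
        self.posted.append((source, tag))
        return None

    def message_arrived(self, source, tag, payload):
        """Search the posted queue first; otherwise enqueue as unexpected."""
        for i, recv in enumerate(self.posted):
            if matches(recv, (source, tag, payload)):
                del self.posted[i]
                return True
        self.unexpected.append((source, tag, payload))
        return False
```

Both searches are linear in the queue length, which is why long queues dominate the cost of small messages and why offloading the traversal is attractive.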
ISBN: 0769526381 (print)
This paper presents a high-level architectural specification of MediOGRID, a research project aiming at implementing a real-time satellite image processing system for extracting relevant environmental and meteorological parameters on a Grid system. The presentation focuses on the key architectural decisions of the GRID-aware satellite image processing system, highlighting the technologies for each of the major components. An essential part of managing a global Data Grid is a monitoring system that is able to monitor and track all the site facilities, networks, and tasks in progress, all in real time. Considering this issue, the paper analyzes the possible grid monitoring approaches, proposes a solution, and presents a set of monitoring results for the MediOGRID data management subsystem.
The proceedings contain 72 papers. Topics discussed include theoretical issues in parallel processing, distributed computing, software tools for parallelism, performance analysis, scheduling and debugging, fault tolerance and security, applications in electromagnetics and neural networks, performance analysis in communications, and parallel algorithms.
ISBN: 0769525083 (print)
Parallel invocation of edge servers within a Web infrastructure has been shown to provide benefits, in terms of system responsiveness, for both content delivery applications and non-transactional Web services. This is achieved by exploiting the path diversity inherent in multi-hop networks over the Internet, which typically reduces the likelihood of client-perceived link congestion. In this work we address parallel invocation of geographically spread edge servers in the context of transactional Web-based applications. Parallel invocation protocols are not trivial for this type of application, since we need to deal with a set of issues not present in classical content delivery applications, such as (i) non-idempotent business logic and (ii) increased workload on the data centers. In this paper we propose a simple and lightweight parallel invocation protocol for distributed transactions over the Web, which addresses all those issues in a scalable manner by requiring no form of coordination among (geographically spread) edge servers. The results of a simulation study are also reported to show the advantages of our protocol in terms of user-perceived system responsiveness.