We have developed a combined network and service management and diagnostics solution for our in-house remote patient monitoring system. The solution has been integrated into the ALPHA eHealth/remote patient monitoring system and was used successfully in our large Living Lab infrastructure, which operates in three Hungarian regions with 40 patients. In this paper we identify the key elements of the combined network and service management solution of the remote patient monitoring system.
The main aim of this work is to show how GPGPUs can facilitate certain types of image processing methods. The software used in this paper detects a specific tissue component, the nuclei, in images of hematoxylin-eosin (HE) stained colon tissue samples. Since pathologists work with large numbers of high-resolution images, which require significant storage space, one feasible way to achieve reasonable processing times is to use GPGPUs. The CUDA software development kit was used to develop the processing algorithms for NVIDIA GPUs. Our work focuses on how to achieve better performance with coalesced global memory access when working with three-channel RGB tissue images, and how to use the on-die shared memory efficiently.
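The coalescing concern mentioned in this abstract can be illustrated with a small counting model. The sketch below is not the authors' kernel. It assumes 8-bit channels, a warp of 32 threads, and global memory served in aligned 32-byte sectors; the function names are hypothetical. It counts how many sectors one warp touches when each thread reads the red value of consecutive pixels, first from an interleaved RGBRGB... buffer and then from a planar buffer that stores each channel contiguously.

```python
SECTOR_BYTES = 32  # assumption: global memory is fetched in aligned 32-byte sectors
WARP_SIZE = 32     # threads per warp on NVIDIA GPUs


def sectors_touched(addresses, sector_bytes=SECTOR_BYTES):
    """Count the distinct aligned sectors that a list of byte addresses falls into."""
    return len({addr // sector_bytes for addr in addresses})


def interleaved_red_addresses(warp_size=WARP_SIZE):
    """Thread t reads the red byte of pixel t in an RGBRGB... buffer (stride of 3 bytes)."""
    return [3 * t for t in range(warp_size)]


def planar_red_addresses(warp_size=WARP_SIZE):
    """Thread t reads the red byte of pixel t in a planar RRR...GGG...BBB buffer (stride of 1 byte)."""
    return [t for t in range(warp_size)]


interleaved_sectors = sectors_touched(interleaved_red_addresses())
planar_sectors = sectors_touched(planar_red_addresses())
print(interleaved_sectors, planar_sectors)
```

Under these assumptions the interleaved read spans three sectors while the planar read fits in one, so two thirds of the bytes fetched for the interleaved layout are unused when only one channel is needed. Real hardware behaviour depends on the GPU generation and on the access width, so this model is only a first approximation.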
This paper presents the design and implementation of a generic cyclic convolution architecture for imaging applications on a field programmable gate array (FPGA). Two main architectures are implemented: a parallel architecture using distributed arithmetic (DA) and a sequential architecture using FPGA digital signal processor (DSP) resources. Both were implemented in VHSIC hardware description language (VHDL) and synthesised on a Xilinx Virtex-5 FPGA. Experimental results, comparisons and a performance analysis of area, power consumption, maximum frequency and throughput are presented. The performance of a software implementation is also compared with that of the FPGA implementation. Finally, an evaluation of the generic cyclic convolution reveals a significant trade-off between throughput and maximum frequency.
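For readers unfamiliar with the operation this abstract refers to, the N-point cyclic convolution of two length-N sequences is y[n] = sum over k of x[k] * h[(n - k) mod N]. The following is a minimal software sketch of that definition only. It does not reproduce the paper's DA or DSP architectures, and the function name is an illustrative assumption.

```python
def cyclic_convolution(x, h):
    """Return the N-point cyclic convolution y[n] = sum_k x[k] * h[(n - k) mod N]."""
    if len(x) != len(h):
        raise ValueError("both sequences must have the same length N")
    n_points = len(x)
    return [
        # Indices wrap around modulo N, which is what makes the convolution cyclic.
        sum(x[k] * h[(n - k) % n_points] for k in range(n_points))
        for n in range(n_points)
    ]


# Convolving with a unit impulse delayed by one sample rotates the input by one position.
shifted = cyclic_convolution([1, 2, 3, 4], [0, 1, 0, 0])
print(shifted)
```

The direct form costs N multiplications per output sample. A hardware design typically trades this cost between parallel and sequential resources, which is the kind of trade-off the abstract describes.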
The Blue Gene/P (BG/P) supercomputer consists of thousands of compute nodes interconnected by multiple networks. Of these, a 3D torus equipped with a direct memory access (DMA) engine is the primary network. BG/P also features a collective network that supports hardware-accelerated collective operations such as broadcast and allreduce. One of the operating modes on BG/P is the virtual node mode, in which all four cores can be active MPI tasks performing inter-node and intra-node communication. This paper proposes software techniques to enhance the MPI collective communication primitives MPI_Bcast and MPI_Allreduce in virtual node mode by using the cache-coherent memory subsystem as the communication method within the node. The paper describes techniques that leverage atomic operations to design concurrent data structures such as broadcast-FIFOs to enable efficient collectives. Such mechanisms are important because we expect core counts to rise in the future, and such data structures make programming easier and more efficient. We also demonstrate the utility of shared address space techniques for MPI collectives, in which a process can access a peer's memory through specialized system calls. Apart from cutting down copy costs, such techniques allow seamless integration of network protocols with intra-node communication methods. We propose intra-node extensions to multi-color network algorithms for collectives using lightweight synchronizing structures and atomic operations. Further, we demonstrate that shared address techniques allow good load balancing and are critical for using the hardware collective network on BG/P efficiently. Compared with current approaches on the 3D torus, our optimizations improve performance by up to almost threefold for MPI_Bcast and by 33% for MPI_Allreduce (in virtual node mode). We also see improvements of up to 44% for MPI_Bcast using the collective tree network.
ISBN: 9780769545530 (print)
Simulation-based decision support is an important tool in business, science, engineering, and many other areas. Although traditional simulation analysis can be used to generate and test possible plans, it suffers from a long cycle time for model update, analysis and verification. It is thus very difficult to carry out prompt "what-if" analysis to respond to abrupt changes in the physical systems being modeled. Symbiotic simulation has been proposed as a way of solving this problem by having the simulation system and the physical system interact in a mutually beneficial manner. The simulation system benefits from real-time input data, which is used to adapt the model, and the physical system benefits from the optimized performance obtained from the analysis of simulation results. This talk will present a classification of symbiotic simulation systems with examples of applications from the literature. An analysis of these applications reveals some common aspects and issues that are important for symbiotic simulation systems. From this analysis, we have specified an agent-based generic framework for symbiotic simulation. We show that it is possible to identify a few basic functionalities that can be provided by corresponding agents in our framework. These can then be composed by a specific workflow to form a particular symbiotic simulation system. Finally, the talk will discuss the use of symbiotic simulation as a decision support tool in understanding and steering complex adaptive systems. Some examples of current applications being developed at Nanyang Technological University will be described.
ISBN: 9780769541907 (print)
The proceedings contain 93 papers. The topics discussed include: earliest start time estimation for advance reservation-based resource brokering within computational grids; a web-based parallel file transferring system on grid and cloud environments; scalable hierarchical scheduling for multiprocessor systems using adaptive feedback-driven policies; parallel numerical computing of finite element model of conductors and floating potentials; energy-efficient sink location service protocol to support mobile sinks in wireless sensor networks; depth balancing and multipath transmission algorithms for P2P streaming media using PeerCast; direct mapping OFDM-based transmission scheme for underwater acoustic multimedia; experimental analysis of coordination strategies to support wireless sensor networks composed by static ground sensors and UAV-carried sensors; and implement a RFID-based indoor location sensing system using virtual signal mechanism.
ISBN: 9781424464432 (print)
The proceedings contain 128 papers. The topics discussed include: distributed advance network reservation with delay guarantees; a general algorithm for detecting faults under the comparison diagnosis model; broadcasting on large scale heterogeneous platforms under the bounded multi-port model; on the importance of bandwidth control mechanisms for scheduling on large scale heterogeneous platforms; scalable failure recovery for high-performance data aggregation; high performance comparison-based sorting algorithm on many-core GPUs; improving the performance of Uintah: a large-scale adaptive meshing computational framework; optimizing and tuning the fast multipole method for state-of-the-art multicore architectures; first experiences with congestion control in InfiniBand hardware; power-aware MPI task aggregation prediction for high-end computing systems; and a hybrid interest management mechanism for peer-to-peer networked virtual environments.
ISBN: 9781617828409 (print)
Components can be used to implement coarse-grained parallelism on large parallel systems. A parallel component is a piece of parallel code that can be executed on a set of processors or cores and has a predefined interface for coupling with other components. Depending on the internal programming and memory model, a component may consist of computation and communication phases or, alternatively, of shared memory code. The interfaces are used for data exchange. More complex parallel programs are built from parallel components and a flexible component interaction structure. This article discusses programming with parallel components to design efficient programs for parallel execution platforms with a distributed address space, and presents a mechanism for specifying parallel components with communication interfaces. The execution of these components can be adapted to the architectural characteristics of multicore clusters with their specific communication structure. The approach is applied to application programs for the solution of large systems of ordinary differential equations.
As the complexity of computing and communication devices increases, fault tolerance will gain more and more importance. Wireless sensor networks (WSNs) are exceptionally complex distributed systems in which a variety of components interact in complex ways; they should therefore help narrow down failures and diagnose their causes, as far as possible, with minimal physical access and interactivity. In this paper, we present an algorithm for isolating malfunctioning nodes in WSNs and provide two parallel variants of it: Naïve and Greedy. The algorithm is based on the idea that a covered node can be turned off and that turning off a malfunctioning node causes the WSN to function properly. The experiments we conducted show that the Naïve approach is very precise in locating malfunctioning nodes, whereas the Greedy approach is very fast in finding a cover free of such nodes.
ISBN: 9781605589428 (print)
The proceedings contain 106 papers. The topics discussed include: Horizon: efficient deadline-driven disk I/O management for distributed storage systems; run-time optimizations for replicated dataflows on heterogeneous environments; DataSpaces: an interaction and coordination framework for coupled simulation workflows; ParaTrac: a fine-grained profiler for data-intensive workflows; performance analysis of dynamic workflow scheduling in multicluster grids; software architecture definition for on-demand cloud provisioning; high occupancy resource allocation for grid and cloud systems, a study with DRIVE; MOON: MapReduce on opportunistic environments; MRAP: a novel MapReduce-based framework to support HPC analytics applications with access patterns; data centric highly parallel debugging; thermal aware server provisioning and workload distribution for internet data centers; and I/O scheduling model of virtual machine based on multi-core dynamic partitioning.