This paper describes a parallel algorithm for correlating or "fusing" streams of data from sensors and other sources of information. The algorithm is useful for applications where composite conditions over multiple data streams must be detected rapidly, such as intrusion detection or crisis management. The implementation of this algorithm on a multithreaded system and the performance of this implementation are also briefly described.
We introduce basic concepts and some of the applications of quantum computing and quantum information theory. We discuss first the physical limitations of solid state technology, then we present a few experiments which reveal quantum effects. We survey the basic principles of quantum mechanics necessary to understand the behaviour of quantum devices.
We propose in this paper an original design space exploration method for reconfigurable architectures adapted to fine and coarse grain resources. The exploration flow addresses the hierarchical distribution of communication and the utilization rate of processing resources for the architecture under exploration. With this information, a designer can explore the architectural design space to define a power-efficient architecture. Exploration results for image computing and cryptography applications are provided to demonstrate the efficiency of the method.
Large-scale scientific computing applications frequently make use of closely-coupled distributed parallel components. The performance of such scientific applications is therefore dependent on the component parts and their interaction at run-time. This paper describes a methodology for predictive performance modelling of parallel applications composed of multiple interacting components. In this paper, the fundamental steps and required operations involved in the modelling process are identified, including inter-component dataflow analysis, M×N communication performance evaluation and composite performance model evaluation. A case study is presented to illustrate the modelling process and the methodology is verified through experimental analysis.
We introduce a continuous convergence protocol for handling locally committed and possibly conflicting updates to replicated data. The protocol supports local consistency and predictability while allowing replicas to deterministically diverge and converge as updates are committed and replicated. We discuss how applications may exploit the protocol characteristics and describe an implementation where conflicting updates are detected, qualified by a partial update order, and resolved using application-specific forward conflict resolution.
This paper presents a compiler methodology for memory-aware mapping on 2-Dimensional coarse-grained reconfigurable architectures that aims to improve the performance of the mapped applications. By exploiting data reuse opportunities, the methodology tries to overcome the data memory bandwidth bottleneck, which negatively influences the applications' performance. This is achieved by using foreground memory in the architecture and by properly placing operations in the processing elements. The methodology considers a realistic 2-Dimensional coarse-grained reconfigurable architecture template, which can model the majority of the existing coarse-grained architectures. The experimental results show that the execution time and memory accesses are reduced.
The Parallel-Horus framework, developed at the University of Amsterdam, is a unique software architecture that allows non-expert parallel programmers to develop fully sequential multimedia applications for efficient execution on homogeneous Beowulf-type commodity clusters. Previously obtained results for realistic, but relatively small-sized applications have shown the feasibility of the Parallel-Horus approach, with parallel performance consistently being found to be optimal with respect to the abstraction level of message passing programs. In this paper we discuss the most serious challenge Parallel-Horus has had to deal with so far: the processing of over 184 hours of video included in the 2004 NIST TRECVID evaluation, i.e. the de facto international standard benchmark for content-based video retrieval. Our results and experiences confirm that Parallel-Horus is a very powerful support-tool for state-of-the-art research and applications in multimedia processing.
A task-parallel execution has been shown to be successful on homogeneous parallel systems for many applications providing a suitable degree of multiprocessor task parallelism. In this paper, we extend the model of task-parallel executions so that the same program can also be executed in heterogeneous systems and grid environments. The new model is particularly suited for large applications consisting of independent modules which can be mapped onto different parts of a distributed execution platform. We show that a suitable representation of the execution activities is crucial for combining a flexible multi-level specification with a dynamic scheduling that can be adapted to a dynamically changing execution environment. We also show how a collection of distributed task-managers organizes the distributed execution based on CORBA.
Summary form only given. In a few short years, computers capable of over one Petaflops performance will become a reality. The most likely approach for first successfully reaching this performance level will involve several thousands of parallel processing elements. What are the key considerations for building such systems? What are the software requirements and demands? How will applications scale? How reliable are they likely to be? What will they be good for? We will address these questions and more based on early experience with the BlueGene system.
ISBN: (print) 0769523129
Scheduling is a fundamental issue in achieving high performance on metacomputers and computational grids. For the first time, the job scheduling problem for grid computing on metacomputers is studied as a combinatorial optimization problem. It is proven that the list scheduling algorithm can achieve a reasonable worst-case performance bound in grid environments supporting distributed supercomputing with large applications. It is also observed that communication heterogeneity does have a significant impact on schedule lengths.
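For readers unfamiliar with list scheduling, the following is a minimal sketch of the classical greedy form of the technique, assuming independent jobs and identical machines. It is not the grid-specific variant analysed in the paper, which the abstract does not spell out; the function name `list_schedule` and its parameters are hypothetical and chosen only for illustration.

```python
import heapq


def list_schedule(durations, num_machines):
    """Classical list scheduling (illustrative sketch, not the paper's algorithm).

    Jobs are taken in list order; each job goes to whichever machine
    becomes free earliest. Returns (makespan, assignment), where
    assignment holds (job_index, machine_index, start_time) tuples.
    """
    # Min-heap of (time_at_which_machine_is_free, machine_index).
    free_at = [(0, m) for m in range(num_machines)]
    heapq.heapify(free_at)

    assignment = []
    for job, duration in enumerate(durations):
        start, machine = heapq.heappop(free_at)  # earliest-free machine
        assignment.append((job, machine, start))
        heapq.heappush(free_at, (start + duration, machine))

    # The schedule length is the time at which the last machine finishes.
    makespan = max(finish for finish, _ in free_at)
    return makespan, assignment
```

Calling `list_schedule([3, 2, 4, 1, 2], 2)` schedules five jobs on two machines. Communication costs and heterogeneity, which the abstract identifies as significant for schedule length, are deliberately left out of this sketch.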