ISBN: 0852965095 (print)
Explores the role of learning techniques motivated by approaches in artificial intelligence in the context of improving parallel program performance. The authors present an adaptive control model that improves parallel program performance through dynamic modification of scheduling parameters under various run-time environments. Optimal or improved scheduling strategies learned from previous program executions provide feedback to subsequent executions.
Cloud computing has gained significant traction in recent years. The Map-Reduce framework is currently the dominant programming model in cloud computing settings. In this paper, we describe Granules, a lightweight, streaming-based runtime for cloud computing which incorporates support for the Map-Reduce framework. Granules provides rich lifecycle support for developing scientific applications, with support for iterative, periodic, and data-driven semantics for individual computations and pipelines. We describe our support for variants of the Map-Reduce framework. The paper presents a survey of related work in this area. Finally, this paper describes our performance evaluation of various aspects of the system, including (where possible) comparisons with comparable systems.
The authors describe the design and implementation of C40PVM, a PVM runtime environment for TMS320C40 networks. With the C40PVM runtime environment, parallel applications can be developed easily on C40 systems and ported to other parallel computing platforms. The performance of the runtime environment is also analyzed using a DSP application on vector quantization.
ISBN: 9781665475075 (print)
The conventional model of parallel programming today involves either copying data across cores (and then having to track its most recent value), or not copying and requiring deep software stacks to perform even the simplest operation on data that is “remote”, i.e., out of the range of loads and stores from the current core. As application requirements grow to larger data sets, with more irregular access to them, both conventional approaches start to exhibit severe scaling limitations. This paper reviews some growing evidence of the potential value of a new model of computation that skirts between the two: data does not move (i.e., is not copied), but computation instead moves to the data. Several different applications involving large sparse computations, streaming of data, and complex mixed mode operations have been coded for a novel platform where thread movement is handled invisibly by the hardware. The evidence to date indicates that parallel scaling for this paradigm can be significantly better than any mix of conventional models.
In this paper, we describe PEGASUS, an open source peta graph mining library which performs typical graph mining tasks such as computing the diameter of the graph, computing the radius of each node, and finding the connected components. As the size of graphs reaches several giga-, tera- or peta-bytes, the necessity for such a library grows too. To the best of our knowledge, PEGASUS is the first such library, implemented on top of the HADOOP platform, the open source version of MAPREDUCE. Many graph mining operations (PageRank, spectral clustering, diameter estimation, connected components, etc.) are essentially a repeated matrix-vector multiplication. In this paper we describe a very important primitive for PEGASUS, called GIM-V (generalized iterated matrix-vector multiplication). GIM-V is highly optimized, achieving (a) good scale-up on the number of available machines, (b) linear running time on the number of edges, and (c) more than 5 times faster performance than the non-optimized version of GIM-V. Our experiments ran on M45, one of the top 50 supercomputers in the world. We report our findings on several real graphs, including one of the largest publicly available Web graphs, thanks to Yahoo!, with approximately 6.7 billion edges.
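To make the idea of a generalized matrix-vector multiplication in the PEGASUS abstract concrete, the following is a minimal single-machine sketch in Python. It assumes the commonly described decomposition of GIM-V into three user-supplied operations (here named `combine2`, `combine_all`, and `assign`); the abstract itself does not spell these out, and the function names, the edge-list representation, and the connected-components instance are illustrative assumptions rather than the library's API. The Hadoop/MapReduce distribution and the optimizations mentioned in the abstract are not modeled.

```python
from collections import defaultdict


def gim_v(edges, v, combine2, combine_all, assign):
    """One generalized matrix-vector step over a sparse edge list.

    edges: iterable of (i, j, m_ij) non-zero entries of the matrix M
    v:     dict mapping node id -> current vector value
    The three callables replace multiply, sum, and overwrite in an
    ordinary matrix-vector product (names are hypothetical).
    """
    partial = defaultdict(list)
    for i, j, m_ij in edges:
        # "Multiply" step: combine matrix entry m_ij with vector entry v_j.
        partial[i].append(combine2(m_ij, v[j]))
    # "Sum" and "write back" steps; rows with no entries keep their old value.
    return {
        i: assign(v[i], combine_all(partial[i])) if i in partial else v[i]
        for i in v
    }


def connected_components(undirected_edges, nodes):
    """Label each node with the smallest node id in its component."""
    # Store both directions so the adjacency matrix is symmetric.
    edges = [(a, b, 1) for a, b in undirected_edges]
    edges += [(b, a, 1) for a, b in undirected_edges]
    labels = {n: n for n in nodes}
    while True:
        new_labels = gim_v(
            edges,
            labels,
            combine2=lambda m_ij, v_j: v_j,  # pass the neighbor's label through
            combine_all=min,                 # smallest label among neighbors
            assign=min,                      # keep the smaller of old and new
        )
        if new_labels == labels:  # fixed point reached
            return labels
        labels = new_labels


components = connected_components([(1, 2), (2, 3), (4, 5)], nodes=range(1, 7))
```

Swapping in a different triple of operations (for example, multiplication, a damped sum, and overwrite) would turn the same loop into a PageRank-style iteration, which is the sense in which one primitive can cover several graph mining tasks.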
In this paper, we present "rules of thumb" for the efficient and straight-forward parallelization of cellular neural networks (CNNs) processing image data on cluster architectures. The rules result from the application and optimization of the simple but effective structural data parallel approach, which is based on the SPMD model. Digital gray-scale images were used to evaluate the optimized parallel cellular neural network program. The process of parallelizing the algorithm employs HPF to generate an MPI-based program.
Traditional software design methodologies have been shown to have drawbacks in designing and implementing software systems for robotics. A novel dual-hierarchical object-oriented design methodology is presented, which is well suited to problems of this type. A practical example of the application of this methodology is presented, utilizing CLOS as the implementation vehicle. The methodology developed is shown to facilitate the programming and planning of complex robot tasks, and the provision of generic recovery procedures for exception handling.
ISBN: 9780769520261 (print)
Typical grid computing scenarios involve many distributed hardware and software components. The more components that are involved, the more likely it is that one of them may fail. In order for grid computing to succeed, there must be a simple mechanism to determine which component failed and why. Instrumentation of all grid applications and middleware is an important part of the solution to this problem. However, it must be possible to control and adapt the amount of instrumentation data produced in order to not be flooded by this data. We describe a scalable, high-performance instrumentation activation mechanism that addresses this problem.
In this paper we present a parallel band selection approach, referred to as parallel simulated annealing band selection (PSABS), for hyperspectral imagery. The approach is based on the simulated annealing band selection (SABS) scheme. The SABS algorithm was originally designed to group highly correlated hyperspectral bands into a smaller subset of band modules regardless of their original order in terms of wavelength. SABS selects sets of non-correlated hyperspectral bands using the simulated annealing (SA) algorithm and utilizes the inherent separability of different classes in hyperspectral images to reduce dimensionality. The proposed PSABS improves the computational speed of SABS by using parallel computing techniques. It allows multiple Markov chains (MMC) to be traced simultaneously and fully utilizes the significant parallelism embedded in SABS to create a set of PSABS modules on each parallel node, implemented with the message passing interface (MPI) cluster-based library and the open multi-processing (OpenMP) multicore-based application programming interface. The effectiveness of the proposed PSABS is evaluated for hyperspectral band selection on MODIS/ASTER airborne simulator (MASTER) hyperspectral images acquired during the PACRIM II campaign. The experimental results demonstrate that PSABS can significantly reduce the computational load and provide a more reliable quality of solution than the original SABS method.
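The multiple-Markov-chain idea in the PSABS abstract can be illustrated with a small Python sketch: several independent simulated annealing chains search for a low-correlation band subset, and the best result across chains is kept. Everything specific here is an assumption made for illustration, not the authors' method: the cost function (sum of absolute pairwise correlations), the swap move, the geometric cooling schedule, the parameter defaults, and the use of a thread pool as a stand-in for the MPI and OpenMP implementation the abstract describes.

```python
import math
import random
from concurrent.futures import ThreadPoolExecutor
from itertools import combinations


def cost(subset, corr):
    """Hypothetical objective: total absolute correlation within the subset."""
    return sum(abs(corr[i][j]) for i, j in combinations(sorted(subset), 2))


def anneal_chain(corr, k, seed, steps=2000, t0=1.0, cooling=0.995):
    """Run one simulated annealing Markov chain; requires k < len(corr)."""
    rng = random.Random(seed)  # independent random stream per chain
    n = len(corr)
    current = set(rng.sample(range(n), k))
    current_cost = cost(current, corr)
    best, best_cost = set(current), current_cost
    t = t0
    for _ in range(steps):
        # Propose swapping one selected band for one unselected band.
        out_band = rng.choice(sorted(current))
        in_band = rng.choice([b for b in range(n) if b not in current])
        candidate = (current - {out_band}) | {in_band}
        cand_cost = cost(candidate, corr)
        delta = cand_cost - current_cost
        # Metropolis rule: always accept improvements, sometimes accept worse.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            current, current_cost = candidate, cand_cost
            if current_cost < best_cost:
                best, best_cost = set(current), current_cost
        t *= cooling  # geometric cooling schedule
    return best_cost, sorted(best)


def parallel_anneal(corr, k, n_chains=4):
    """Trace several chains concurrently and keep the lowest-cost result."""
    with ThreadPoolExecutor(max_workers=n_chains) as pool:
        results = list(pool.map(lambda s: anneal_chain(corr, k, s), range(n_chains)))
    return min(results)


# Toy correlation matrix: bands 0 and 1 are redundant, as are bands 2 and 3.
toy_corr = [
    [1.0, 0.9, 0.1, 0.1],
    [0.9, 1.0, 0.1, 0.1],
    [0.1, 0.1, 1.0, 0.9],
    [0.1, 0.1, 0.9, 1.0],
]
best_cost, best_bands = parallel_anneal(toy_corr, k=2)
```

Because the chains share no state, they map naturally onto separate processes or nodes; in CPython, threads mainly illustrate the structure rather than deliver a speedup for this CPU-bound loop.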