ISBN (print): 3540287000
The Master/Worker paradigm is one of those most commonly used by parallel/distributed application developers. This paradigm is easy to understand and is fairly close to the abstract concept of a wide range of applications. However, to obtain adequate performance, such a paradigm must be managed in a very precise way. Certain features, such as data distribution or the number of workers, must be tuned properly in order to reach that performance, and in most cases they cannot be tuned statically, since they depend on the particular conditions of each execution. In this context, dynamic tuning is a highly promising approach, since it provides the capability to change parameters during the execution of the application to improve performance. In this paper, we demonstrate the usage of a dynamic tuning environment that adapts the number of workers based on a theoretical model of Master/Worker behavior. The results show that such an approach significantly improves execution time when the application modifies its behavior during execution.
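The tuning decision described above can be illustrated with a toy analytical model (a sketch under assumed symbols, not the paper's actual model): predicted execution time falls as the total work is split across more workers, while per-worker communication overhead grows with the worker count, so a dynamic tuner re-evaluates the optimum whenever the measured workload changes.

```python
# Toy Master/Worker time model: compute shrinks as 1/n, communication grows
# with n. `total_work` and `comm_cost` are illustrative parameters, not the
# paper's measured quantities.
def predicted_time(n_workers, total_work, comm_cost):
    return total_work / n_workers + comm_cost * n_workers

def best_worker_count(total_work, comm_cost, max_workers):
    """Pick the worker count that minimizes the predicted execution time."""
    candidates = range(1, max_workers + 1)
    return min(candidates, key=lambda n: predicted_time(n, total_work, comm_cost))

# A dynamic tuner would re-run this whenever the measured workload changes:
n1 = best_worker_count(total_work=400.0, comm_cost=1.0, max_workers=32)  # 20
n2 = best_worker_count(total_work=100.0, comm_cost=1.0, max_workers=32)  # 10
```

In this model the optimum sits near sqrt(total_work / comm_cost), which is why a workload change mid-run shifts the best worker count and makes static tuning inadequate.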
To execute distributed joins in parallel on compute clusters, systems partition and exchange data records between workers. With large datasets, workers spend a considerable amount of time transferring data over the ne...
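The partition-and-exchange step this snippet mentions can be sketched as a hash partitioner that routes records with the same join key to the same worker (a minimal illustration; the names `partition_by_key` and `key_fn` are ours, not from the paper):

```python
# Hash-partition records so that rows sharing a join key land on the same
# worker, the precondition for executing a distributed join locally per worker.
def partition_by_key(records, key_fn, n_workers):
    parts = [[] for _ in range(n_workers)]
    for rec in records:
        parts[hash(key_fn(rec)) % n_workers].append(rec)
    return parts

orders = [(1, "book"), (2, "pen"), (1, "mug")]
parts = partition_by_key(orders, key_fn=lambda r: r[0], n_workers=4)
# Both records with key 1 end up in the same partition.
```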
ISBN (digital): 9789819723034
ISBN (print): 9789819723027; 9789819723034
The advent of sensor technologies has led to a sheer volume of remote sensing data containing fruitful spatial and spectral information. Insights into objects on Earth's surface are gained with the help of remote sensing processing methods and techniques, which are applied in various applications. Recently, deep-learning-based methods have been widely used in remote sensing data processing due to their ability to mine relationships using multiple layers. However, the time spent by deep-learning-based methods with numerous layers and large parameter sizes in processing remote sensing data with "big data" characteristics is unacceptable in real-time applications. Combining deep learning with distributed computing, namely distributed deep learning, has become an emerging topic in deep-learning-based remote sensing processing. This paper first surveys recent methods and open-source solutions for Apache Spark-based distributed deep learning. Then, the pros and cons of each distributed deep learning open-source solution in processing remote sensing data are summarized. Later, geological remote sensing interpretation is chosen as the case study by implementing the online training of a deep-learning-based interpretation model called D-AMSDFNet for geological environments on Apache Spark. Experiments on Landsat 8 and Sentinel 2 satellite images investigate the effectiveness of the proposed D-AMSDFNet, which also indicates the promising development of distributed deep learning in processing remote sensing data.
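The core pattern behind Spark-style distributed training can be shown with a toy synchronous data-parallel step: each worker computes a gradient on its data shard (the map phase) and the driver averages them (the reduce phase). This is a generic sketch of gradient averaging, not the paper's D-AMSDFNet or any specific Spark API:

```python
# Toy synchronous data-parallel step for a 1-D linear model y = w * x.
# Each "worker" holds one shard; the driver averages per-shard gradients.
def local_gradient(w, shard):
    # Derivative of mean squared error on this shard with respect to w.
    return sum(2 * (w * x - y) * x for x, y in shard) / len(shard)

def distributed_step(w, shards, lr=0.1):
    grads = [local_gradient(w, s) for s in shards]  # map: one gradient per worker
    avg = sum(grads) / len(grads)                   # reduce: average on the driver
    return w - lr * avg

shards = [[(1.0, 2.0), (2.0, 4.0)], [(3.0, 6.0), (4.0, 8.0)]]  # true w = 2
w = 0.0
for _ in range(100):
    w = distributed_step(w, shards)
# w converges to 2.0
```

Real Spark-based solutions replace the list comprehension with a cluster-wide map over partitions, but the communication pattern (broadcast weights, aggregate gradients) is the same.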
The impact of Compressed Sensing (CS) methods on health care systems is important. The acquisition and processing speed of many medical imaging applications is greatly improved using this technique. Orthogonal Matchin...
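The snippet breaks off at what appears to be Orthogonal Matching Pursuit (OMP), a standard greedy recovery algorithm in CS. A minimal NumPy sketch of generic OMP (not this paper's variant) follows; the orthonormal dictionary is a contrived example so recovery is exact by construction:

```python
import numpy as np

def omp(A, y, n_nonzero):
    """Orthogonal Matching Pursuit: greedily recover a sparse x with y ≈ A @ x."""
    residual = y.copy()
    support = []
    x = np.zeros(A.shape[1])
    coef = np.array([])
    for _ in range(n_nonzero):
        idx = int(np.argmax(np.abs(A.T @ residual)))  # atom most correlated with residual
        if idx not in support:
            support.append(idx)
        # Re-fit all chosen coefficients jointly by least squares.
        coef, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coef
    x[support] = coef
    return x

# Orthonormal dictionary (QR of a random matrix) so exact recovery is guaranteed.
rng = np.random.default_rng(0)
A, _ = np.linalg.qr(rng.standard_normal((10, 10)))
x_true = np.zeros(10)
x_true[[2, 7]] = [1.5, -2.0]
y = A @ x_true
x_hat = omp(A, y, n_nonzero=2)
```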
ISBN (print): 9781538630662
Hyperspectral remote sensing image data has been widely used in a variety of applications due to its continuous spectrum and high spectral resolution. However, reducing its huge dimensionality, with high data relevance, is time-consuming, and parallel processing is required to accelerate this process. In this paper, we studied KPCA (Kernel Principal Component Analysis), a nonlinear dimensionality reduction method, and proposed a parallel KPCA algorithm (KPCA_G) based on a CPU/GPU heterogeneous system. The experimental results show that our KPCA_G algorithm achieves up to a 173x speedup over the original serial KPCA. Moreover, to tackle the limitation of insufficient memory caused by the dimensionality reduction of large-scale hyperspectral data, we exploit intra-node parallelization using multi-core CPUs and many-core GPUs to improve the parallel hierarchy of distributed-storage KPCA. Finally, we designed and implemented a multilevel hybrid parallel KPCA algorithm that achieves a 2.56 to 9.03 times speedup compared to the traditional coarse-grained parallel KPCA method on MPI.
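For reference, the serial computation that KPCA_G parallelizes looks roughly like the following sketch: build an RBF Gram matrix, double-center it, and take the top eigenvectors. This is textbook KPCA in NumPy under assumed parameter names (`gamma`, `n_components`), not the paper's GPU implementation:

```python
import numpy as np

def kpca(X, n_components, gamma=1.0):
    """Serial Kernel PCA with an RBF kernel; the O(n^2) Gram matrix and the
    eigendecomposition are the steps a GPU version offloads."""
    n = X.shape[0]
    sq = np.sum(X ** 2, axis=1)
    K = np.exp(-gamma * (sq[:, None] + sq[None, :] - 2 * X @ X.T))  # Gram matrix
    J = np.eye(n) - np.ones((n, n)) / n
    Kc = J @ K @ J                      # double-center the kernel matrix
    vals, vecs = np.linalg.eigh(Kc)     # eigenvalues in ascending order
    order = np.argsort(vals)[::-1][:n_components]
    alphas = vecs[:, order] / np.sqrt(np.maximum(vals[order], 1e-12))
    return Kc @ alphas                  # projections of the training points

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 4))       # stand-in for hyperspectral pixels
Z = kpca(X, n_components=2)
```

The quadratic memory footprint of `K` is precisely why the paper resorts to distributed-storage KPCA for large-scale hyperspectral data.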
The conference materials contain 43 papers. Knowledge-based systems, model-based diagnosis, parallel architectures/distributed artificial intelligence, machine learning, natural language processing, model-based reasoning, applications of genetic algorithms, and engineering applications of artificial intelligence are the main topics covered.
ISBN (print): 9781467349093; 9781467349086
The increasing volume of relational data has led us to seek alternatives to cope with it. Recently, several hybrid approaches (e.g., HadoopDB and Hive) between parallel databases and Hadoop have been introduced to the database community. Although these hybrid approaches have gained wide popularity, they cannot avoid the choice of suboptimal execution strategies. We believe this problem is caused by the inherent limits of their architectures. In this demo, we present Tajo, a relational, distributed data warehouse system on shared-nothing clusters. It uses the Hadoop Distributed File System (HDFS) as the storage layer and has its own query execution engine, which we have developed instead of using the MapReduce framework. A Tajo cluster consists of one master node and a number of workers across cluster nodes. The master is mainly responsible for query planning and serves as the coordinator for workers. The master divides a query into small tasks and disseminates them to workers. Each worker has a local query engine that executes a directed acyclic graph (DAG) of physical operators. A DAG of operators can take two or more input sources and be pipelined within the local query engine. In addition, Tajo can control distributed data flow more flexibly than MapReduce and supports indexing techniques. By combining these features, Tajo can employ more optimized and efficient query processing, including existing methods that have been studied in traditional database research. To give a deep understanding of the Tajo architecture and its behavior during query processing, the demonstration will allow users to submit TPC-H queries to 32 Tajo cluster nodes. The web-based user interface will show (1) how the submitted queries are planned, (2) how the queries are distributed across nodes, (3) the cluster and node status, and (4) the details of relations and their physical information. We also provide a performance evaluation of Tajo compared with Hive.
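The idea of a worker pipelining a DAG of physical operators, including operators with two or more input sources, can be sketched with pull-based generators. The operator names below are illustrative, not Tajo's actual classes:

```python
# Toy pull-based pipeline of physical operators: each operator consumes rows
# from its child lazily, so the whole DAG is pipelined without materializing
# intermediate results.
def scan(table):
    for row in table:
        yield row

def select(child, predicate):
    for row in child:
        if predicate(row):
            yield row

def project(child, columns):
    for row in child:
        yield tuple(row[c] for c in columns)

def union(left, right):
    """An operator taking two input sources, as a DAG node may in Tajo."""
    yield from left
    yield from right

t1 = [(1, "a", 10), (2, "b", 5)]
t2 = [(3, "c", 7)]
plan = project(union(select(scan(t1), lambda r: r[2] > 6), scan(t2)), columns=[0, 1])
result = list(plan)  # [(1, "a"), (3, "c")]
```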
An improved algorithm for digital image sampling and quantization in serial and parallel image processing is proposed. The algorithm improves upon existing methods with the aid of combini...
ISBN (print): 9780769536545
We explore the use of today's high-end graphics processing units on desktops to perform hierarchical agglomerative clustering with NVIDIA's Compute Unified Device Architecture (CUDA). Although the advancement in graphics cards has made the gaming industry flourish, there is a lot more to be gained in the fields of scientific computing and high performance computing and their applications. Previous works have illustrated considerable speed gains in computing pairwise Euclidean distances between vectors, which is the fundamental operation in hierarchical clustering. We have used CUDA to implement the complete hierarchical agglomerative clustering algorithm and show almost double the speed gain using a much cheaper desktop graphics card. In this paper we briefly explain the highly parallel and internally distributed programming structure of CUDA. We explore CUDA capabilities and propose methods to efficiently handle data within the graphics hardware for data-intensive, data-independent, iterative or repetitive general-purpose algorithms such as hierarchical clustering. We achieved results with speed gains of about 30 to 65 times over the CPU implementation using microarray gene expressions.
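The fundamental operation cited here, all-pairs Euclidean distances, is usually computed via the expansion ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a.b, which is data-parallel and maps naturally onto GPU threads. A CPU reference sketch in NumPy (our illustration, not the paper's CUDA kernel):

```python
import numpy as np

def pairwise_sq_dists(X):
    """All-pairs squared Euclidean distances between the rows of X, using the
    expansion ||a-b||^2 = ||a||^2 + ||b||^2 - 2 a.b; each output cell is
    independent, which is what makes the computation GPU-friendly."""
    sq = np.sum(X * X, axis=1)
    D = sq[:, None] + sq[None, :] - 2 * (X @ X.T)
    return np.maximum(D, 0.0)  # clamp tiny negatives caused by rounding

X = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 1.0]])
D = pairwise_sq_dists(X)  # D[0, 1] == 25.0 (the 3-4-5 triangle, squared)
```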
ISBN (print): 3540606971
This paper presents the development of a real-time system for recognition of textured objects. In contrast to current approaches, which mostly rely on specialized multiprocessor architectures for fast processing, we use a distributed network architecture to support parallelism and attain real-time performance. In this paper, a new approach to image matching is proposed as the basis of object localization and positioning, which involves dynamic texture feature extraction and hierarchical image matching. A mask-based stochastic method is introduced to extract feature points for matching. Our experimental results demonstrate that the combination of texture feature extraction and interest point detection provides a better solution to the search for the best match between two textured images. Furthermore, the algorithm is implemented on a low-cost heterogeneous PVM (Parallel Virtual Machine) network to speed up the processing without specific hardware requirements.
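The PVM-style parallelization amounts to a master/worker task farm: the master splits the image search into independent window-matching tasks and farms them out to workers. A minimal sketch with a thread pool standing in for PVM workers; `match_score` is a toy sum-of-squared-differences stand-in, not the paper's texture matcher:

```python
from concurrent.futures import ThreadPoolExecutor

def match_score(task):
    """Sum-of-squared-differences score for one candidate window (toy
    stand-in for the paper's texture-feature matching)."""
    window, template = task
    return sum((a - b) ** 2 for a, b in zip(window, template))

def find_best_match(windows, template, n_workers=2):
    """Master farms independent window-matching tasks out to a worker pool
    and keeps the index of the lowest-scoring (best) window."""
    tasks = [(w, template) for w in windows]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        scores = list(pool.map(match_score, tasks))
    return min(range(len(scores)), key=scores.__getitem__)

best = find_best_match([[0, 0, 0], [1, 2, 4], [9, 9, 9]], [1, 2, 3])  # index 1
```

Because each task is independent, the farm scales with worker count and tolerates the heterogeneous nodes the paper mentions.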