With the rapid development of computational sensing technologies, the volume of available sensing data has been increasing daily as sensor systems grow in scale. This is sometimes referred to as the "data deluge". Many physical computing applications must spend great effort meeting the challenges of this environment, which has prompted a need for rapid and efficient processing of massive datasets. Fortunately, many of the algorithms used in these applications can be decomposed and partially or fully cast into a parallel computing framework. This dissertation discusses three sensing models (gigapixel image formation, X-ray transmission, and X-ray scattering) and proposes methods to formulate each task as a scalable, distributed problem adapted to the massively parallel architecture of Graphics Processing Units (GPUs). For gigapixel images, this dissertation presents a scalable and flexible image formation pipeline based on the MapReduce framework. The implementation was developed to operate on the AWARE multiscale cameras, which consist of microcamera arrays imaging through a shared hemispherical objective. The microcamera fields of view overlap slightly, and the arrays can generate high-resolution, high-dynamic-range panoramic images and videos. The proposed GPU implementation takes advantage of prior knowledge of the alignment between microcameras and exploits the multiscale nature of AWARE image acquisition, enabling the rapid composition of panoramas ranging from display-scale views to full-resolution gigapixel images. On a desktop computer, a 1.6-gigapixel color panorama captured by the AWARE-10 can be delivered in less than a minute, while 720p and 1080p panoramas can be stitched at video frame rate. We next present a pipeline that rapidly simulates X-ray transmission imaging via ray tracing on the GPU. This pipeline was initially designed for statistical analysis of X-ray threat detection in the context of aviation…
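The composition step described here can be pictured as a tiny MapReduce: a map step tags each microcamera tile with its calibrated placement in the panorama, and a reduce step accumulates the tiles, averaging where neighboring fields of view overlap. The following is a minimal NumPy sketch under assumed integer pixel offsets; `map_tile`, `reduce_tiles`, and the offsets are illustrative names, not the dissertation's actual GPU pipeline, which additionally handles warping, color, and tone mapping.

```python
import numpy as np

def map_tile(tile, offset):
    """Map step: pair a microcamera tile with its (row, col) panorama offset,
    assumed known in advance from the inter-camera alignment calibration."""
    r, c = offset
    return (r, c, tile)

def reduce_tiles(placed_tiles, pano_shape):
    """Reduce step: accumulate tiles into the panorama canvas and average
    pixels where overlapping fields of view contribute more than once."""
    pano = np.zeros(pano_shape, dtype=np.float64)
    weight = np.zeros(pano_shape, dtype=np.float64)
    for r, c, tile in placed_tiles:
        h, w = tile.shape
        pano[r:r + h, c:c + w] += tile
        weight[r:r + h, c:c + w] += 1.0
    # Avoid division by zero where no tile covers the canvas.
    return pano / np.maximum(weight, 1.0)
```

Because each tile's placement is independent, the map step parallelizes trivially across GPU threads, which is what makes the prior-alignment assumption so valuable.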
ISBN (print): 9781538623268
We investigate efficient sensitivity analysis (SA) of algorithms that segment and classify image features in a large dataset of high-resolution images. Algorithm SA is the process of evaluating variations of methods and parameter values to quantify differences in the output. An SA can be very computationally demanding because it requires re-processing the input dataset several times with different parameters to assess variations in output. In this work, we introduce strategies to efficiently speed up SA via runtime optimizations targeting distributed hybrid systems and reuse of computations from runs with different parameters. We evaluate our approach using a cancer image analysis workflow on a hybrid cluster with 256 nodes, each with an Intel Phi and a dual-socket CPU. The SA attained a parallel efficiency of over 90% on 256 nodes. The cooperative execution using the CPUs and the Phi available in each node with smart task-assignment strategies resulted in an additional speedup of about 2x. Finally, multi-level computation reuse led to an additional speedup of up to 2.46x on the parallel version. The level of performance attained with the proposed optimizations will allow the use of SA in large-scale studies.
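The computation-reuse idea above can be sketched with memoization: when a workflow's stages depend only on a prefix of the parameter vector, runs that share that prefix can reuse the earlier stage's output instead of recomputing it. This is a minimal sketch assuming a hypothetical two-stage segment/classify pipeline; the stage names, parameters, and call counters are illustrative, not the paper's implementation.

```python
from functools import lru_cache

# Counters to observe how often each (expensive) stage actually executes.
CALLS = {"segment": 0, "classify": 0}

@lru_cache(maxsize=None)
def segment(image_id, seg_param):
    """Stage 1: depends only on the image and the segmentation parameter."""
    CALLS["segment"] += 1
    return (image_id, seg_param)  # stand-in for a computed mask

@lru_cache(maxsize=None)
def classify(mask, cls_param):
    """Stage 2: depends on stage 1's output and the classification parameter."""
    CALLS["classify"] += 1
    return hash((mask, cls_param)) % 100  # stand-in for classified features

def run(image_id, seg_param, cls_param):
    """One SA run; repeated runs sharing seg_param reuse the segmentation."""
    return classify(segment(image_id, seg_param), cls_param)
```

Sweeping `cls_param` while holding `seg_param` fixed then re-executes only stage 2, which is the essence of multi-level reuse across SA runs.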
On mobile devices, image sequences are widely used for multimedia applications such as computer vision, video enhancement, and augmented reality. However, real-time processing on mobile devices remains a challenge because of hardware constraints and the demand for higher-resolution images. Recently, heterogeneous computing methods that utilize both a central processing unit (CPU) and a graphics processing unit (GPU) have been researched to accelerate image sequence processing. This paper deals with various optimization techniques such as parallel processing by the CPU and GPU, distributed processing on the CPU, frame buffer objects, and double buffering for parallel and/or distributed tasks. Using the optimization techniques both individually and combined, several heterogeneous computing structures were implemented and their effectiveness was analyzed. The experimental results show that heterogeneous computing enables execution up to 3.5 times faster than CPU-only processing.
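The double-buffering technique mentioned above overlaps the two halves of the pipeline: while one frame is being consumed (e.g. a GPU filter), the next is being produced (e.g. CPU decode). The following is a minimal thread-based sketch; the function names and the depth-2 queue standing in for the two buffers are illustrative assumptions, not the paper's implementation.

```python
import queue
import threading

def double_buffered_pipeline(frames, produce, consume):
    """Run produce and consume stages concurrently: a bounded queue of
    depth 2 plays the role of the two swap buffers, so frame i+1 is
    produced while frame i is still being consumed."""
    q = queue.Queue(maxsize=2)  # two slots = double buffering
    results = []

    def producer():
        for f in frames:
            q.put(produce(f))   # blocks when both buffers are full
        q.put(None)             # sentinel: end of stream

    t = threading.Thread(target=producer)
    t.start()
    while (item := q.get()) is not None:
        results.append(consume(item))
    t.join()
    return results
```

With stages of comparable cost, the overlap hides roughly one stage's latency per frame, which is where the reported speedups over serial CPU-only execution come from.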
ISBN (print): 9781538630938
Neural networks hold a critical position among machine learning algorithms because of their self-adaptiveness and state-of-the-art performance. Before the testing (inference) phase in practical use, sophisticated training (learning) phases are required, calling for efficient training methods with higher accuracy and shorter convergence time. Many existing studies focus on training optimization on high-performance servers or computing clusters, e.g. GPU clusters. However, training neural networks on resource-constrained devices, e.g. mobile platforms, is an important research topic that has barely been touched. In this paper, we implement AdaLearner, an adaptive distributed mobile learning system for neural networks that trains a single network in parallel with heterogeneous mobile resources on the same local network. To exploit the potential of our system, we adapt the neural network training phase to each device's resources and aggressively reduce transmission overhead for better system scalability. On three representative neural network structures trained on two image classification datasets, AdaLearner accelerates the training phase significantly. For example, on LeNet, a 1.75-3.37x speedup is achieved when increasing the worker nodes from 2 to 8, thanks to the achieved high execution parallelism and excellent scalability.
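Adapting the training phase to heterogeneous devices can be sketched as two pieces: split each global batch in proportion to measured per-device throughput (so fast and slow workers finish together), then combine per-worker gradients weighted by the samples each processed. This is a minimal sketch under those assumptions; `partition_batch` and `weighted_average` are illustrative names, not AdaLearner's actual API.

```python
def partition_batch(batch_size, throughputs):
    """Split a global batch across workers in proportion to measured
    throughput (samples/sec), so heterogeneous devices finish together."""
    total = sum(throughputs)
    shares = [int(batch_size * t / total) for t in throughputs]
    shares[0] += batch_size - sum(shares)  # give rounding remainder to worker 0
    return shares

def weighted_average(grads, shares):
    """Combine per-worker gradient vectors, weighting each worker by the
    number of samples it processed."""
    total = sum(shares)
    return [sum(g[i] * s for g, s in zip(grads, shares)) / total
            for i in range(len(grads[0]))]
```

Proportional partitioning removes the straggler effect that otherwise makes the slowest device dictate iteration time, which is one plausible source of the scalability the abstract reports.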
The rapid growth of digital images has confronted traditional image retrieval technology with new challenges. In this paper we introduce a new approach for large-scale scene image retrieval to solve the problems of massive image processing using traditional image retrieval methods. First, we improved the traditional k-Means clustering algorithm, optimizing the selection of the initial cluster centers and the iteration procedure. Second, we presented a parallel design and implementation of the improved k-Means algorithm and applied it to feature clustering of scene images. Finally, a storage and retrieval scheme for large-scale scene images was put forward using the large storage capacity and powerful parallel computing ability of the Hadoop distributed platform. The experimental results demonstrated that the proposed method achieved good performance. Compared with traditional single-node algorithms and the parallel k-Means algorithm, the proposed method has obvious advantages for large-scale scene image retrieval in terms of retrieval accuracy, retrieval time overhead, and computational performance (speedup and efficiency, sizeup, and scaleup), which is a significant improvement from applying parallel processing to intelligent algorithms with large-scale datasets.
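One standard way to optimize the selection of initial cluster centers is k-means++-style seeding, where each new center is drawn with probability proportional to its squared distance from the nearest existing center; the assignment step is then the naturally parallel "map" of a Hadoop-style k-Means. The following 1-D sketch illustrates both ideas; it is an assumed, simplified variant, since the paper does not specify its exact initialization.

```python
import random

def kmeans_pp_init(points, k, rng):
    """k-means++-style seeding: each new center is sampled with probability
    proportional to squared distance from the nearest existing center."""
    centers = [rng.choice(points)]
    while len(centers) < k:
        d2 = [min((p - c) ** 2 for c in centers) for p in points]
        r, acc = rng.random() * sum(d2), 0.0
        for p, w in zip(points, d2):
            acc += w
            if acc >= r:
                centers.append(p)
                break
    return centers

def assign(points, centers):
    """The 'map' step of parallel k-Means: label each point with the index
    of its nearest center. Each point is independent, so this shards freely
    across Hadoop mappers."""
    return [min(range(len(centers)), key=lambda j: (p - centers[j]) ** 2)
            for p in points]
```

Spread-out seeding reduces the iterations Lloyd's algorithm needs to converge, which matters most when every iteration is a full pass over a distributed dataset.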
This paper introduces an effective processing framework named Image Cloud Processing (ICP) to powerfully cope with the data explosion in the image processing field. While most previous research focuses on optimizing t...
ISBN (print): 9781538637913
In recent decades, remote sensing data have been growing rapidly in size and variety, and are considered "big geo data" because of their huge volume, significant heterogeneity, and the challenge of fast analysis. In traditional remote sensing analysis workflows, transferring raw image files to local workstations often costs considerable time and slows down the analysis. Because the results of remote sensing analysis models are usually much smaller than the raw data to be processed, "on-demand processing", which uploads analysis models and executes them "near" where the data is stored, can significantly accelerate the execution of remote sensing analysis workflows. In this paper, a framework for on-demand remote sensing data analysis is proposed, based on a three-layered architecture, an XML/JSON-based runtime environment description, and on-demand model deployment methods. The evaluation on a prototype system shows that the on-demand processing framework accelerates the execution of analysis models by 2.8 to 12.7 times by reducing data transfers, especially for analysis workflows that transfer data over low-bandwidth Internet links. Through on-demand processing, classical remote sensing data service systems can evolve into remote sensing data processing infrastructures, which provide IaaS (Infrastructure-as-a-Service) and PaaS (Platform-as-a-Service) services and make it possible to exchange knowledge among scientists by sharing models. Furthermore, a remote sensing data analysis platform for carbon satellites has been designed based on the on-demand processing proposed in this paper and will soon be implemented with the support of Sunway TaihuLight, the world's most powerful supercomputer.
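The gain from shipping the model to the data is simple arithmetic: the raw download is replaced by a much smaller result download, while the compute cost stays roughly the same. The sketch below makes that trade-off explicit; the function names and the example sizes and bandwidth are assumptions for illustration, not the paper's measurements.

```python
def transfer_seconds(size_bytes, bandwidth_bps):
    """Time to move size_bytes over a link of bandwidth_bps (bits/sec)."""
    return size_bytes * 8 / bandwidth_bps

def speedup_from_on_demand(raw_bytes, result_bytes, bandwidth_bps, compute_s):
    """Compare 'download raw imagery, compute locally' against 'compute near
    the data, download only the (much smaller) result'."""
    download_first = transfer_seconds(raw_bytes, bandwidth_bps) + compute_s
    on_demand = compute_s + transfer_seconds(result_bytes, bandwidth_bps)
    return download_first / on_demand
```

For example, 1 GB of raw imagery reduced to a 1 MB product over a 100 Mbps link with 10 s of compute yields a speedup near 9x, consistent in spirit with the 2.8-12.7x range the abstract reports for low-bandwidth workflows.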
ISBN (print): 9781538621639
We study direct numerical computation of the stationary radiative transport equation for light propagation in tissue, using a three-dimensional MR image and bio-optical parameters. We employ an upwind finite difference scheme that is mathematically reliable. The computational times on three modern parallel architectures are compared. Although the discretization is a large-scale problem with several billion unknowns, our results show the practicality of direct numerical computation in terms of computational time and modern computing resources.
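The upwind idea is easiest to see in one dimension: differencing against the flow direction gives a stable marching scheme for the stationary transport equation. The sketch below solves dI/dx + a·I = 0 with a first-order upwind (backward) difference and checks it against the exact exponential decay; it is a toy reduction of the 3-D tissue problem, with illustrative names, not the paper's solver.

```python
def upwind_transport_1d(n, length, absorption, inflow):
    """First-order upwind sweep for the steady 1-D transport equation
    dI/dx + absorption * I = 0 with boundary value I(0) = inflow.
    Differencing against the flow direction yields, at each grid point j:
        (I_j - I_{j-1}) / dx + absorption * I_j = 0
    which solves explicitly for I_j, so the sweep marches left to right."""
    dx = length / n
    I = [inflow]
    for _ in range(n):
        I.append(I[-1] / (1.0 + absorption * dx))
    return I
```

The scheme is unconditionally stable (each step multiplies by a factor below 1), which is the "mathematical reliability" that matters when the 3-D discretization reaches billions of unknowns.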
ISBN (print): 9781450347860
In order to reduce the real-time processing gap of distributed algorithms on smart camera networks, we have evaluated a parallel processing method that exploits the processing capabilities of free cameras as auxiliaries for busy ones. Traditional evaluation methods such as datasets or even virtual-reality tools do not model the communication infrastructure or camera processing capabilities, so they are not suitable for analyzing the complexity of distributed algorithms. For this purpose, we have developed a modular framework named CAM-DIST, based on the OMNeT++ simulation environment, in which the velocity model of soccer players is emulated. Simulation results show that parallel processing improves overall efficiency but can have side effects on individual camera estimations; choosing an optimal sharing value and destinations will therefore enhance performance.
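The sharing-value idea above can be sketched as simple load balancing: move a chosen fraction of pending tasks from the busiest camera's queue to the least-loaded one. This is a minimal sketch under assumed queue-length loads; `balance_load` and the `share` parameter are illustrative stand-ins for CAM-DIST's sharing value, not its actual interface.

```python
def balance_load(queues, share):
    """Offload a fraction `share` of the busiest camera's pending tasks to
    the least-loaded camera; returns the new per-camera queue lengths."""
    loads = list(queues)
    busy = loads.index(max(loads))   # donor: most pending tasks
    free = loads.index(min(loads))   # recipient: fewest pending tasks
    moved = int(loads[busy] * share)
    loads[busy] -= moved
    loads[free] += moved
    return loads
```

Choosing `share` trades the donor's relief against the communication cost and the recipient's estimation accuracy, which matches the side effects on individual camera estimations the abstract observes.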