Dataflow accelerators feature simplicity, programmability, and energy efficiency, and are envisioned as a promising architecture for accelerating the perfectly nested loops that dominate several important applications, including image and media processing and deep learning. Although numerous accelerator designs have been proposed, discovering the most efficient way to execute an application's perfectly nested loop on the computational and memory resources of a given dataflow accelerator (the execution method) remains an essential and yet unsolved challenge. In this paper, we propose dMazeRunner to efficiently and accurately explore the vast space of the different ways to spatiotemporally execute a perfectly nested loop on dataflow accelerators (execution methods). The novelty of the dMazeRunner framework lies in: i) a holistic representation of the loop nests that can succinctly capture the various execution methods; ii) accurate energy and performance models that explicitly capture the computation and communication patterns, data movement, and data buffering of the different execution methods; and iii) drastic pruning of the vast search space by discarding invalid solutions and solutions that lead to the same cost. Our experiments on various convolution layers (perfectly nested loops) of popular deep learning applications demonstrate that the solutions discovered by dMazeRunner are on average 9.16x better in Energy-Delay-Product (EDP) and 5.83x better in execution time than prior approaches. With additional pruning heuristics, dMazeRunner reduces the search time from days to seconds with a mere 2.56% increase in EDP compared to the optimal solution.
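To illustrate the kind of search-space pruning the abstract describes, the following minimal sketch (not dMazeRunner's actual implementation; the function names and the toy cost model are assumptions) enumerates spatial/temporal tiling factorizations of one loop dimension and discards invalid candidates as well as candidates that duplicate an already-seen cost:

from itertools import product

def factors(n):
    """All integer factors of n (candidate tile sizes)."""
    return [f for f in range(1, n + 1) if n % f == 0]

def enumerate_tilings(loop_trip_count, pe_count, buffer_size):
    """Enumerate (spatial, temporal) tile pairs for one loop dimension,
    pruning invalid candidates and candidates with a duplicate cost."""
    seen_costs = set()
    valid = []
    for spatial, temporal in product(factors(loop_trip_count), repeat=2):
        # Prune invalid execution methods: the tiles must exactly
        # cover the loop, fit the PE array, and fit the scratchpad.
        if spatial * temporal != loop_trip_count:
            continue
        if spatial > pe_count or temporal > buffer_size:
            continue
        # Toy cost model (assumption): cycles ~ temporal steps,
        # energy ~ data moved per temporal step.
        cost = (temporal, temporal * spatial)
        if cost in seen_costs:      # same cost => redundant design point
            continue
        seen_costs.add(cost)
        valid.append((spatial, temporal))
    return valid

print(enumerate_tilings(loop_trip_count=64, pe_count=16, buffer_size=32))

The same factorize-then-prune pattern extends to all seven loops of a convolution layer, which is where the combinatorial explosion (and the payoff from pruning) comes from.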
ISBN:
(Print) 9781728111421; 9781728111414
Red blood cell segmentation in microscopic images is the first step for various clinical studies carried out on blood samples, such as cell counting and cell shape identification. Conventional methods, while often achieving high accuracy, depend heavily on the acquisition modality. Deep learning approaches have been shown to be more robust to such modalities while achieving comparable accuracy. In this paper, we first investigate the steps necessary to apply a specific type of deep learning method, namely fully convolutional networks, to red blood cell segmentation. Based on the given data and the constraints imposed by our partners, mainly regarding high data throughput, we then describe an exemplary application. First results show that, even with a focus on high performance, an accuracy above 90% can be reached.
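As a rough illustration of the fully convolutional approach the abstract refers to (a minimal sketch, not the authors' network; the layer sizes are assumptions), such a network replaces fully connected layers with convolutions so it can map an image of any size to a per-pixel mask:

import torch
import torch.nn as nn

class TinyFCN(nn.Module):
    """Minimal fully convolutional network for binary cell segmentation:
    downsample once, then upsample back to full resolution."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=3, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),                      # halve spatial size
            nn.Conv2d(16, 32, kernel_size=3, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(32, 16, kernel_size=2, stride=2),  # upsample
            nn.Conv2d(16, 1, kernel_size=1),      # per-pixel logit
        )

    def forward(self, x):
        return self.decoder(self.encoder(x))

# Any input size works because there are no fully connected layers.
logits = TinyFCN()(torch.randn(1, 1, 96, 128))
print(logits.shape)  # torch.Size([1, 1, 96, 128])

The absence of fully connected layers is also what makes this family of networks attractive for high-throughput settings: one forward pass labels every pixel at once.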
ISBN:
(Print) 9781538637906
The increasing complexity of Cloud architectures and the introduction of new paradigms such as the Internet of Things have raised the problem of creating Value Added Services by composition, not only of resources but also of services. In this work we describe an architectural solution for orchestration at all Cloud layers, together with a language for orchestrating both resources and services in the Cloud. The language manages the composition of services and resources in order to create composite services based on Cloud Design Patterns. It is built on a workflow language for describing compositions, and it enables the verification of composite services by means of Model Driven Engineering techniques, providing a valuable and easy-to-use tool for Cloud engineering.
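As a purely hypothetical illustration of what a composition description in such a workflow language might contain (the schema below is an assumption, not the paper's actual language), a composite service can be declared as an ordered set of steps bound to Cloud resources, with a simple structural check standing in for model-based verification:

# Hypothetical composite-service description (assumed schema, not the
# paper's language): each step names a service and the resource it runs on.
composite_service = {
    "name": "image-archive",
    "pattern": "pipes-and-filters",   # a common Cloud Design Pattern
    "steps": [
        {"service": "ingest", "resource": "vm-small", "next": "resize"},
        {"service": "resize", "resource": "vm-gpu", "next": "store"},
        {"service": "store", "resource": "object-storage", "next": None},
    ],
}

def verify(workflow):
    """Toy structural check: every 'next' must reference a declared step."""
    names = {s["service"] for s in workflow["steps"]}
    return all(s["next"] is None or s["next"] in names
               for s in workflow["steps"])

print(verify(composite_service))  # True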
Remote sensing (RS) image segmentation is an essential step in geographic object-based image analysis (GEOBIA) to ultimately derive "meaningful objects". While many segmentation methods exist, most of them are not efficient for large data sets. Thus, the goal of this research is to develop an efficient parallel multi-scale segmentation method for RS imagery by combining graph theory and the fractal net evolution approach (FNEA). Specifically, a minimum spanning tree (MST) algorithm from graph theory is combined with the minimum heterogeneity rule (MHR) algorithm used in FNEA. The MST algorithm is used for the initial segmentation, while the MHR algorithm is used for object merging. An efficient implementation of the segmentation strategy is presented using data partitioning and a "reverse searching-forward processing" chain based on message passing interface (MPI) parallel technology. Segmentation results of the proposed method on images from multiple sensors (airborne, SPECIM AISA EAGLE II, WorldView-2, RADARSAT-2) and different landscapes (residential/industrial, residential/agricultural) covering four test sites demonstrate its efficiency in both accuracy and speed. We conclude that the proposed method is applicable and efficient for the segmentation of a variety of RS imagery (airborne optical, satellite optical, SAR, hyperspectral), with accuracy comparable to that of the FNEA method.
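The MST-based initial segmentation can be illustrated with a Kruskal-style sketch (a serial simplification under an assumed threshold, not the paper's parallel MPI implementation): pixels form a 4-connected graph, and edges cheaper than a heterogeneity threshold are merged with a union-find structure:

import numpy as np

def mst_initial_segmentation(img, threshold=10.0):
    """Kruskal-style initial segmentation on a 4-connected pixel graph:
    sort edges by intensity difference and union pixels whose edge
    weight is below a heterogeneity threshold (a toy stand-in for MHR)."""
    h, w = img.shape
    parent = list(range(h * w))

    def find(x):                      # union-find with path compression
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    edges = []                        # (weight, pixel_a, pixel_b)
    for y in range(h):
        for x in range(w):
            if x + 1 < w:
                edges.append((abs(float(img[y, x]) - float(img[y, x + 1])),
                              y * w + x, y * w + x + 1))
            if y + 1 < h:
                edges.append((abs(float(img[y, x]) - float(img[y + 1, x])),
                              y * w + x, (y + 1) * w + x))
    edges.sort()                      # Kruskal: cheapest edges first
    for wgt, a, b in edges:
        ra, rb = find(a), find(b)
        if ra != rb and wgt <= threshold:
            parent[ra] = rb           # merge the two segments
    return np.array([find(i) for i in range(h * w)]).reshape(h, w)

img = np.array([[0, 0, 200], [0, 0, 200], [0, 0, 200]], dtype=np.uint8)
print(mst_initial_segmentation(img))  # two segments: left block, right column

In the paper's setting the image is partitioned across MPI ranks, so each rank runs such a merge on its tile and the ranks then reconcile segments along tile borders.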
Emerging user-centric graph applications such as route planning and personalized social network analysis have initiated a paradigm shift in modern graph processing systems towards multiquery analysis, i.e., process...
An unbiased estimator for the ellipticity of an object in a noisy image is given in terms of the image moments. Three assumptions are made: (i) the pixel noise is normally distributed, although with arbitrary covariance matrix; (ii) the image moments are taken about a fixed centre; and (iii) the point spread function is known. The relevant combinations of image moments are then jointly normal and their covariance matrix can be computed. A particular estimator for the ratio of the means of jointly normal variates is constructed and used to provide the unbiased estimator for the ellipticity. Furthermore, an unbiased estimate of the covariance of the new estimator is also given.
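For context, the moment-based ellipticity that such estimators target is conventionally defined from the second-order brightness moments about a fixed centre \bar{\mathbf{x}} (this is the standard weak-lensing convention; the paper's own notation may differ):

Q_{ij} = \frac{\int I(\mathbf{x})\,(x_i - \bar{x}_i)(x_j - \bar{x}_j)\,\mathrm{d}^2 x}{\int I(\mathbf{x})\,\mathrm{d}^2 x},
\qquad
\chi = \frac{Q_{11} - Q_{22} + 2\,\mathrm{i}\,Q_{12}}{Q_{11} + Q_{22}}.

The ellipticity is thus a ratio of linear combinations of moments; under assumption (i) these combinations are jointly normal, which is why an unbiased estimator for the ratio of the means of jointly normal variates directly yields an unbiased ellipticity estimator.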
Image denoising is a fundamental operation in image processing and holds considerable practical importance for various real-world applications. Arguably, several thousand papers are dedicated to image denoising. In the past decade, state-of-the-art denoising algorithms have been clearly dominated by nonlocal patch-based methods, which explicitly exploit patch self-similarity within the targeted image. However, in the past two years, discriminatively trained local approaches have started to outperform previous nonlocal models and have been attracting increasing attention due to the additional advantage of computational efficiency. Successful approaches include the cascade of shrinkage fields (CSF) and trainable nonlinear reaction diffusion (TNRD). These two methods are built on the responses of linear filters of small size using feed-forward architectures. Due to the locality inherent in local approaches, the CSF and TNRD models become less effective when the noise level is high and consequently introduce noise artifacts. To overcome this problem, in this paper we introduce a multiscale strategy. To be specific, we build on our newly developed TNRD model, adopting a multiscale pyramid image representation to devise a multiscale nonlinear diffusion process. As expected, all the parameters in the proposed multiscale diffusion model, including the filters and the influence functions across scales, are learned from training data through a loss-based approach. Numerical results on Gaussian and Poisson denoising substantiate that the exploited multiscale strategy can successfully boost the performance of the original single-scale TNRD model. As a consequence, the resulting multiscale diffusion models can significantly suppress the typical incorrect features in noisy images with heavy noise. It turns out that the multiscale TNRD variants achieve better performance than state-of-the-art denoising methods.
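For reference, the single-scale TNRD diffusion step that the multiscale model builds on takes the form (following the original TNRD formulation for Gaussian denoising; the multiscale variant applies such steps across pyramid levels):

u_t = u_{t-1} - \left( \sum_{i=1}^{N_k} \bar{k}_i^{\,t} * \phi_i^{t}\!\big(k_i^{t} * u_{t-1}\big) + \lambda^{t}\,(u_{t-1} - f) \right),

where f is the noisy input, k_i^t are the learned linear filters (\bar{k}_i^t denotes the kernel rotated by 180 degrees), \phi_i^t are the learned influence functions, and \lambda^t weights the data-fidelity term. The filters and influence functions at every scale are exactly the parameters the abstract says are learned through the loss-based approach.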
ISBN:
(Print) 9781538637906
In recent decades, remote sensing data have been rapidly growing in size and variety, and are considered "big geo data" because of their huge volume, significant heterogeneity, and the challenge of fast analysis. In traditional remote sensing analysis workflows, transferring raw image files to local workstations often costs a lot of time and slows down the analysis. Because the results of remote sensing analysis models are usually much smaller than the raw data to be processed, "on-demand processing", which uploads analysis models and executes them "near" where the data are stored, can significantly accelerate the execution of remote sensing analysis workflows. In this paper, a framework for on-demand remote sensing data analysis is proposed, based on a three-layered architecture, an XML/JSON-based runtime environment description, and on-demand model deployment methods. The evaluation on a prototype system shows that the on-demand processing framework accelerates the execution of analysis models by 2.8 to 12.7 times by reducing data transfers, especially for analysis workflows that transfer data over low-bandwidth Internet connections. Through on-demand processing, classical remote sensing data service systems can evolve into remote sensing data processing infrastructures, which provide IaaS (Infrastructure-as-a-Service) and PaaS (Platform-as-a-Service) services and make it possible to exchange knowledge among scientists by sharing models. Furthermore, a remote sensing data analysis platform for carbon satellites is designed based on the on-demand processing proposed in this paper and will soon be implemented with the support of Sunway TaihuLight, the world's most powerful supercomputer.
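As a hypothetical sketch of the runtime environment description the abstract mentions (the field names below are assumptions, not the paper's schema), a model upload might declare its dependencies and inputs so the data center can provision an execution environment next to the data:

import json

# Hypothetical runtime-environment description (assumed fields, not the
# paper's actual schema): the model is shipped to where the data live.
runtime_description = {
    "model": "ndvi_timeseries",
    "entrypoint": "run.py",
    "runtime": {"language": "python", "version": "3.8"},
    "dependencies": ["numpy", "gdal"],
    "inputs": [{"dataset": "carbon-sat-l2", "region": "50N,110E,45N,120E"}],
    "outputs": [{"name": "ndvi.tif", "format": "GeoTIFF"}],
}

# Serialize for upload; only this small description and the (small)
# results cross the network, not the raw imagery.
print(json.dumps(runtime_description, indent=2))

The asymmetry this exploits is exactly the one the abstract states: the description and results are tiny compared with the raw scenes, so moving the model beats moving the data.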
ISBN:
(Print) 9789811024719; 9789811024702
Synthetic aperture radar (SAR)-based platforms have to process an increasingly large number of complex floating-point operations and have to meet hard real-time deadlines. However, real-time use of SAR is severely restricted by the computation time taken for image formation. One of the classical methods of reducing this computation time to make it suitable for real-time application is multiprocessing. The authors have made a successful attempt to develop and test a parallel algorithm for SAR image formation, and the results are presented in this paper.
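One reason SAR image formation parallelizes well is that the range-compression stage (matched filtering of each received pulse) is independent across pulses. The following generic sketch (not the authors' algorithm; the chirp parameters and worker count are assumptions) distributes per-pulse FFT-based matched filtering over multiple processes:

import numpy as np
from multiprocessing import Pool

# Assumed toy chirp replica (linear FM, 256 samples).
CHIRP = np.exp(1j * np.pi * 0.5e11 * (np.arange(256) / 1e8) ** 2)

def compress_pulse(echo):
    """Matched-filter one received pulse against the chirp replica
    via FFT-based convolution."""
    n = len(echo) + len(CHIRP) - 1
    return np.fft.ifft(np.fft.fft(echo, n) * np.conj(np.fft.fft(CHIRP, n)))

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    echoes = [rng.standard_normal(1024) + 0j for _ in range(64)]  # 64 pulses
    with Pool(4) as pool:              # distribute pulses over 4 workers
        compressed = pool.map(compress_pulse, echoes)
    print(len(compressed), compressed[0].shape)  # 64 (1279,)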
ISBN:
(Print) 9781538637906
Ant colony optimization (ACO) can be used to solve complex optimization problems in engineering, economic management, and military strategy. Most of these are NP-hard problems, which are difficult to solve with traditional methods. An improved parallel ACO algorithm based on pattern learning is proposed in this paper. It extracts parameters automatically to reduce the solution space and enhance computational efficiency. The various parameters of the algorithm are analyzed, and a refining strategy is formed according to ACO's characteristics. The parallel ACO algorithm is implemented on the MIC/CPU architecture, where it significantly enhances performance.
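For readers unfamiliar with the baseline algorithm, here is a minimal ACO sketch on a tiny symmetric TSP (an illustration of plain ACO only; the paper's pattern learning and MIC/CPU parallelization are not reproduced, though the per-iteration tour construction marked below is the naturally parallel step):

import random

DIST = [[0, 2, 9, 10],
        [2, 0, 6, 4],
        [9, 6, 0, 3],
        [10, 4, 3, 0]]
N, ANTS, ITERS = 4, 8, 50
ALPHA, BETA, RHO = 1.0, 2.0, 0.5   # pheromone/heuristic weights, evaporation

def tour_length(tour):
    return sum(DIST[tour[i]][tour[(i + 1) % N]] for i in range(N))

def build_tour(tau):
    """One ant builds a tour city by city, sampling the next city with
    probability proportional to pheromone^ALPHA * (1/distance)^BETA."""
    tour = [random.randrange(N)]
    while len(tour) < N:
        cur = tour[-1]
        cand = [c for c in range(N) if c not in tour]
        weights = [tau[cur][c] ** ALPHA * (1.0 / DIST[cur][c]) ** BETA
                   for c in cand]
        tour.append(random.choices(cand, weights=weights)[0])
    return tour

random.seed(1)
tau = [[1.0] * N for _ in range(N)]            # initial pheromone
best = None
for _ in range(ITERS):
    tours = [build_tour(tau) for _ in range(ANTS)]   # parallelizable step
    for i in range(N):                          # evaporate pheromone
        for j in range(N):
            tau[i][j] *= (1.0 - RHO)
    for t in tours:                             # deposit on used edges
        d = 1.0 / tour_length(t)
        for i in range(N):
            a, b = t[i], t[(i + 1) % N]
            tau[a][b] += d
            tau[b][a] += d
    cand = min(tours, key=tour_length)
    if best is None or tour_length(cand) < tour_length(best):
        best = cand
print(best, tour_length(best))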