This paper introduces a new method to register images that are rotated and translated with respect to each other. The method works by transforming each image to a gradient distribution space. This space represents the likelihood of finding a particular gradient in the image and is invariant to translation. Once the images are transformed, the rotation between them is found efficiently using correlation. Unlike Fourier-based methods, the gradient distribution space retains phase information, so a larger class of images can be registered accurately. The method is computationally efficient and does not require nonlinear optimization or iterative methods. Furthermore, it easily handles large rotations and translations.
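The idea of recovering rotation from a translation-invariant gradient distribution can be illustrated with a simplified stand-in. The sketch below is not the authors' transform: it assumes a one-dimensional, magnitude-weighted histogram of gradient directions and finds the rotation as the circular shift that maximizes the correlation between two such histograms. The function names, the bin count, and the sign convention (which follows NumPy's row/column axes) are all assumptions made for illustration.

```python
import numpy as np

def orientation_histogram(img, bins=360):
    """Magnitude-weighted histogram of gradient directions.

    Shifting the image content does not change which gradients occur,
    only where, so the histogram is (approximately) translation invariant.
    """
    gy, gx = np.gradient(np.asarray(img, dtype=float))
    magnitude = np.hypot(gx, gy)
    angle = np.arctan2(gy, gx)  # direction in [-pi, pi]
    hist, _ = np.histogram(angle, bins=bins, range=(-np.pi, np.pi),
                           weights=magnitude)
    total = hist.sum()
    return hist / total if total > 0 else hist

def estimate_rotation(img_a, img_b, bins=360):
    """Angle in degrees (in array-axis convention) by which the direction
    histogram of img_b is circularly shifted relative to that of img_a."""
    ha = orientation_histogram(img_a, bins)
    hb = orientation_histogram(img_b, bins)
    # Circular cross-correlation computed with the FFT:
    # corr[k] = sum_n ha[n] * hb[(n + k) mod bins]
    corr = np.fft.ifft(np.conj(np.fft.fft(ha)) * np.fft.fft(hb)).real
    shift = int(np.argmax(corr))
    return shift * 360.0 / bins
```

A rotation estimated this way is only resolved to the histogram bin width, and no iterative optimization is involved, which matches the non-iterative character described in the abstract.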
Self-organizing maps can discover topological and multidimensional patterns using a variety of methods. We apply ParaSOM, a parallel algorithm proposed by the authors that yields closer and denser approximations than other methods in a fraction of the iterations, to a two-dimensional pattern in a parallel environment to demonstrate a high degree of neuron independence. In a second implementation, pieces of a two-dimensional input space are distributed over a network and processed by independent ParaSOM algorithms.
This paper presents a parallel implementation of connected component labeling algorithms for gray and binary images on a one-dimensional DSP array. The system is a distributed-memory MIMD machine, and all the algorithms are developed with this platform in mind. The performance of several parallel connected component labeling methods is evaluated. The multi-DSP system demonstrated viable performance.
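For readers unfamiliar with connected component labeling, the following is a minimal sequential sketch for binary images, assuming 4-connectivity and the classic two-pass scheme with a union-find table. It does not reproduce the parallel DSP-array algorithms evaluated in the paper; the function name and the choice of connectivity are assumptions.

```python
import numpy as np

def label_components(binary):
    """Two-pass, 4-connected labeling. Returns (label image, component count)."""
    binary = np.asarray(binary, dtype=bool)
    rows, cols = binary.shape
    labels = np.zeros((rows, cols), dtype=int)
    parent = [0]  # parent[i] is the union-find parent of provisional label i

    def find(x):
        # Follow parent links to the root, halving the path along the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    # First pass: assign provisional labels and record equivalences.
    for r in range(rows):
        for c in range(cols):
            if not binary[r, c]:
                continue
            up = int(labels[r - 1, c]) if r > 0 else 0
            left = int(labels[r, c - 1]) if c > 0 else 0
            if up == 0 and left == 0:
                parent.append(len(parent))      # new provisional label
                labels[r, c] = len(parent) - 1
            elif up != 0 and left != 0:
                root_up, root_left = find(up), find(left)
                root = min(root_up, root_left)
                parent[max(root_up, root_left)] = root  # merge the two sets
                labels[r, c] = root
            else:
                labels[r, c] = up if up != 0 else left

    # Second pass: replace provisional labels by consecutive final labels.
    final = {}
    for r in range(rows):
        for c in range(cols):
            if labels[r, c]:
                root = find(int(labels[r, c]))
                labels[r, c] = final.setdefault(root, len(final) + 1)
    return labels, len(final)
```

In a distributed-memory setting, a scheme like this would typically run on each image strip, with label equivalences reconciled across strip boundaries afterwards; that merging step is not shown here.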
Vision-language models such as CLIP are pretrained on large volumes of internet sourced image and text pairs, and have been shown to sometimes exhibit impressive zero- and low-shot image classification performance. Ho...
This paper proposes a new way of managing the cache by exploiting the difference in memory-system behavior between read-only data and read-write data. A division of the existing cache-based memory hierarchy is proposed in order to create a dedicated data path for read-only data. To justify this approach, an analysis performed on a set of benchmarks shows that read-only data account for a significant part of the working set and are reused less than read-write data. A transparent solution is proposed, based on specific compilation support, to automatically separate the memory accesses of read-only data at the L1 level. This organization exploits the properties of the different sub-workloads in order to increase the overall data locality and data reuse. Simulated in a multicore environment, the new memory organization reduces L1 misses by up to 28.5%. Moreover, the messages issued on the interconnection network can be reduced by up to 14.7% without any performance penalty.
The paper presents an analysis of the dynamics of an asynchronous motor performed by speculative parallel processing using 41 processors. A simulation of the motor dynamics showed a significant computational speed-up, which depends on the number of parallel processes organised in the analysed time sub-intervals.
The segmentation of tissue regions in high-resolution microscopy is a challenging problem due to both the size and appearance of digitized pathology sections. The two-point correlation function (TPCF) has proved to be an effective feature for describing the textural appearance of tissues. However, calculating the TPCF is computationally burdensome and often intractable for the gigapixel images produced by slide scanning devices in pathology applications. In this paper we present several approaches for accelerating the deterministic calculation of point correlation functions: using theory to reduce computation, parallelization on distributed systems, and parallelization on graphics processors. We previously showed that the correlation updating method of calculation offers an 8-35x speedup over frequency domain methods and decouples efficient computation from the select scales of Fourier methods. Here, distributed computation on 64 compute nodes provides a further 42x speedup. Finally, parallelization on graphics processors (GPUs) yields an additional 11-16x speedup using an implementation capable of running on a single desktop machine.
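To make the quantity concrete, here is a minimal sketch of the frequency-domain baseline that the abstract compares against, not the correlation updating method itself. It assumes the common definition of the two-point correlation for a binary indicator map (the probability that two pixels separated by a given offset both belong to the phase) and periodic image boundaries; both are assumptions, as is the function name.

```python
import numpy as np

def two_point_correlation(mask):
    """Two-point correlation of a binary mask for every pixel offset.

    Entry [dy, dx] is the fraction of pixel pairs separated by the offset
    (dy, dx), with wrap-around at the borders, in which both pixels are set.
    """
    mask = np.asarray(mask, dtype=float)
    spectrum = np.fft.fft2(mask)
    # Autocorrelation via the Wiener-Khinchin relation.
    autocorr = np.fft.ifft2(spectrum * np.conj(spectrum)).real
    return autocorr / mask.size
```

The value at offset (0, 0) equals the area fraction of the phase. The FFT computes all offsets at once, which is convenient for small tiles but costly in memory for very large images.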
Iterative methods, such as the conjugate gradient method, have long been used for the computation of optical flow along contours. For a contour with N points, the conjugate gradient method requires O(N) iterations (i.e., O(N^2) operations) to converge to a solution. We propose direct analytical methods for computing optical flow along both open contours and closed contours. These methods use O(N) operations, are numerically stable, and can be easily implemented on parallel hardware.
The paper concerns methods for using Extremal Optimization (EO) for processor load balancing during the execution of distributed programs. A load balancing algorithm for clusters of multicore processors is presented and discussed. In this algorithm the EO approach is used to periodically detect the best tasks as candidates for migration and to guide the selection of the best processors to receive the migrated tasks. To decrease the complexity of selection for migration, we propose a guided EO algorithm that uses a two-step stochastic selection during solution improvement, based on two separate fitness functions. The functions are based on specific program models that estimate relations between the programs and the executing hardware. The proposed load balancing algorithm is assessed by experiments with simulated load balancing of distributed program graphs. The algorithm is compared against an EO-based algorithm with random placement of migrated tasks and against a classic genetic algorithm.
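The stochastic selection at the heart of EO can be sketched generically. The code below assumes the widely used rank-based variant in which components are ordered from worst to best local fitness and rank k is chosen with probability proportional to k^(-tau). The abstract does not give the paper's two fitness functions or its guided two-step selection, so the fitness values, the parameter `tau`, and the function name are hypothetical.

```python
import numpy as np

def select_by_rank(fitness, tau=1.5, rng=None):
    """Pick one component index, favoring those with the worst fitness.

    Components are ranked from worst (rank 1) to best (rank n), and rank k
    is drawn with probability proportional to k ** (-tau).
    """
    rng = np.random.default_rng() if rng is None else rng
    fitness = np.asarray(fitness, dtype=float)
    order = np.argsort(fitness)                      # lowest fitness first
    ranks = np.arange(1, fitness.size + 1, dtype=float)
    probabilities = ranks ** (-float(tau))
    probabilities /= probabilities.sum()
    chosen_rank = rng.choice(fitness.size, p=probabilities)
    return int(order[chosen_rank])
```

In a load balancing setting, one such draw over hypothetical per-task fitness values could pick a task to migrate, and a second draw over per-processor fitness values could pick its destination. A large `tau` makes the choice nearly greedy, while `tau` close to zero makes it nearly uniform.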
ISBN: (print) 9783319587714; 9783319587707
This work studies the optimization problem of assigning multiple labels with minimum perimeter, namely the Potts model, in the spatially continuous setting. The model has been studied extensively in recent years and applied to many problems in image processing and computer vision, especially image segmentation. Existing convex relaxation approaches use total-variation functionals that directly encode perimeter costs; these lead to pixelwise simplex-constrained optimization problems, which can be solved efficiently from a primal-dual perspective. Among the most efficient approaches, these challenging simplex constraints are tackled either by extra projection steps onto the simplex at each pixel, which require intensive simplex-projection computations, or by introducing extra dual variables, which results in the dual optimization-based continuous max-flow formulation of the convex relaxed Potts model. However, handling such extra dual flow variables adds load in both computation and memory, particularly in cases with many labels. To this end, we propose a novel optimization approach based on the Bregman-Proximal Augmented Lagrangian Method (BPALM), in which the Bregman distance function, instead of the classical quadratic Euclidean distance function, is integrated into the algorithmic framework of augmented Lagrangian methods. The new optimization method has significant numerical advantages: it naturally avoids the extra computational and memory burden of enforcing the simplex constraints and allows parallel computation over different labels. Numerical experiments show competitive quality and significantly reduced memory load compared to state-of-the-art convex optimization methods for the convex relaxed Potts model.
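The contrast between a Euclidean and a Bregman treatment of the simplex constraint can be illustrated for the label vector of a single pixel. The sketch below assumes the standard sort-based Euclidean projection onto the probability simplex and, for comparison, a generic entropic (Kullback-Leibler) proximal step whose result lies on the simplex without any projection. It is not the BPALM algorithm from the paper; the function names, the gradient `grad`, and the step size `step` are hypothetical.

```python
import numpy as np

def project_to_simplex(v):
    """Euclidean projection of v onto {u : u >= 0, sum(u) = 1} (sort-based)."""
    v = np.asarray(v, dtype=float)
    sorted_desc = np.sort(v)[::-1]
    cumulative = np.cumsum(sorted_desc) - 1.0
    counts = np.arange(1, v.size + 1)
    # Largest index whose entry stays positive after the common shift.
    rho = np.nonzero(sorted_desc - cumulative / counts > 0)[0][-1]
    theta = cumulative[rho] / (rho + 1)
    return np.maximum(v - theta, 0.0)

def entropic_step(u, grad, step=1.0):
    """Proximal step with a KL (entropy) Bregman distance on the simplex.

    The update is multiplicative and is renormalized, so the result is
    positive and sums to one by construction.
    """
    u = np.asarray(u, dtype=float)
    grad = np.asarray(grad, dtype=float)
    scores = np.log(u) - step * grad
    scores -= scores.max()          # shift for numerical stability
    updated = np.exp(scores)
    return updated / updated.sum()
```

The projection needs a sort per pixel, whereas the entropic step only needs elementwise operations and one normalization, which hints at why a Bregman distance can remove the explicit simplex-projection cost mentioned in the abstract.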