We consider the computation of Bernoulli, Tangent (zag), and Secant (zig or Euler) numbers. In particular, we give asymptotically fast algorithms for computing the first n such numbers O(n(2)(logn)(2+o(1))). We also g...
详细信息
ISBN:
(数字)9781461476214
ISBN:
(纸本)9781461476214;9781461476207
We consider the computation of Bernoulli, Tangent (zag), and Secant (zig or Euler) numbers. In particular, we give asymptotically fast algorithms for computing the first n such numbers O(n(2)(logn)(2+o(1))). We also give very short in-place algorithms for computing the first n Tangent or Secant numbers in O(n(2)) integer operations. These algorithms are extremely simple and fast for moderate values of n. They are faster and use less space than the algorithms of Atkinson (for Tangent and Secant numbers) and Akiyama and Tanigawa (for Bernoulli numbers).
The shortest path problem is fundamental to many areas of image processing, and we present ways to solve it in environments where computation space is scarce. We propose two constant-work-space algorithms for solving ...
详细信息
ISBN:
(纸本)9783642419133;9783642419140
The shortest path problem is fundamental to many areas of image processing, and we present ways to solve it in environments where computation space is scarce. We propose two constant-work-space algorithms for solving the shortest path problem for cactus graphs and clique-cactus graphs of arbitrary size;both algorithms perform in polynomial time. We also present an in-place algorithm for finding the shortest path in cactus graphs, which performs in polynomial time of lower degree.
Sorting is a classic algorithmic problem and its importance has led to the design and implementation of various sorting algorithms on many-core graphics processing units (GPUs). CUDPP Radix sort is the most efficient ...
详细信息
Sorting is a classic algorithmic problem and its importance has led to the design and implementation of various sorting algorithms on many-core graphics processing units (GPUs). CUDPP Radix sort is the most efficient sorting on GPUs and GPU Sample sort is the best comparison-based sorting. Although the implementations of these algorithms are efficient, they either need an extra space for the data rearrangement or the atomic operation for the acceleration. Sorting applications usually deal with a large amount of data, thus the memory utilization is an important consideration. Furthermore, these sorting algorithms on GPUs without the atomic operation support can result in the performance degradation or fail to work. In this paper, an efficient implementation of a parallel shellsort algorithm, CUDA shellsort, is proposed for many-core GPUs with CUDA. Experimental results show that, on average, the performance of CUDA shellsort is nearly twice faster than GPU quicksort and 37% faster than Thrust mergesort under uniform distribution. Moreover, its performance is the same as GPU sample sort up to 32 million data elements, but only needs a constant space usage. CUDA shellsort is also robust over various data distributions and could be suitable for other many-core architectures.
In recent years image analysis has become a research field of exceptional significance, due to its relevance to real life problems in important societal and governmental sectors, Such as medicine, defense, and securit...
详细信息
In recent years image analysis has become a research field of exceptional significance, due to its relevance to real life problems in important societal and governmental sectors, Such as medicine, defense, and security. The explicit purpose of the present Perspective is to suggest a number of strategic objectives for theoretical research, with all emphasis oil the combinatorial approach in image analysis. Most of the proposed objectives relate to the need to make the theoretical foundations of combinatorial image analysis better integrated within a number of well-established subjects of theoretical computer science and discrete applied mathematics, such as the theory of algorithms and problem complexity, combinatorial optimization and polyhedral combinatorics, integer and linear programming, and computational geometry. Published by Elsevier B.V.
This paper presents an algorithm for rotating a subimage in place without using any extra working array. Due to this constraint, we have to overwrite pixel values by interpolated values. Key ideas are local reliabilit...
详细信息
This paper presents an algorithm for rotating a subimage in place without using any extra working array. Due to this constraint, we have to overwrite pixel values by interpolated values. Key ideas are local reliability test which determines whether interpolation at a pixel is carried out correctly without using interpolated values, and lazy interpolation which stores interpolated values in a region which is never used for output images and then fills in interpolated values after safety is guaranteed. It is shown that linear interpolation is always safely implemented. An extension to cubic interpolation is also discussed.
The optimal prefix-free code problem is to determine, for a given array p = [p(i) \i is an element of {1...n}] of n weights, an integer array I = [l(i) \i is an element of (1...n)] of n codeword lengths such that Sigm...
详细信息
ISBN:
(纸本)3540602208
The optimal prefix-free code problem is to determine, for a given array p = [p(i) \i is an element of {1...n}] of n weights, an integer array I = [l(i) \i is an element of (1...n)] of n codeword lengths such that Sigma(i=1)(n) 2(-li) less than or equal to 1 and Sigma(i=1)(n) p(i)l(i) is minimized. Huffman's famous greedy algorithm solves this problem in O(n log n) time, if p is unsorted;and can be implemented to. execute in O(n) time, if the input array p is sorted. Here we consider the space requirements of the greedy method. We show that if p is sorted then it is possible to calculate the array 1 in-place, with l(i) overwriting pi, in O(n) time and using O(1) additional space. The new implementation leads directly to an O(n log n)-time and n + O(1) words of extra space implementation for the case when p is not sorted. The proposed method is simple to implement and executes quickly.
This paper first presents a theory for rasterizing the class of two-dimensional problems which include signal/image processing, computer vision, and linear algebra. The rasterization theory is steered by an isomorphis...
详细信息
This paper first presents a theory for rasterizing the class of two-dimensional problems which include signal/image processing, computer vision, and linear algebra. The rasterization theory is steered by an isomorphism relationship between the two-dimensional shuffle exchange network (2DSE) and the two-dimensional butterfly network (2DBN). Since in real-time applications, data are often acquired in a raster scan format, it is important to develop architectures to support the raster data structure. algorithms are developed first by using 2DSE network, then transformed into 2DBN format. Rasterization architectures can be derived for the algorithms described by 2DBN format. In the PEACE project, we have been able to show that a single, fixed communication topology, namely 2DSE, provides solution times on a 2DSE parallel computing system that for many problems approach known theoretical lower bounds. Secondly, this paper presents the generic architectures and VLSI implementation examples for the rasterization structures.
暂无评论