This paper describes a new formulation of the problem of object recognition under a bounded-error noise model and an object recognition methodology called PERFORM that finds matches by establishing correspondences between model and image features using this formulation. PERFORM evaluates correspondences by intersecting error regions in the image space. The algorithm is analyzed with respect to theoretical complexity as well as actual running times. When a single solution to the matching problem is sought, the time complexity of the sequential matching algorithm for 2D-2D matching using point features is of the order O(I^3 N^2), where N is the number of model features and I is the number of image features. When line features are used, the sequential complexity is of the order O(I^2 N^2). When a single solution is sought, PERFORM runs faster than the fastest known algorithm [8] for solving the bounded-error matching problem. The PERFORM method, which was developed with parallelizability as an important requirement, is shown to be easily realizable on both SIMD and MIMD architectures. The parallel versions of PERFORM are scalable, achieving linear speedups on both the MasPar and the KSR machines. When implemented in parallel, PERFORM does not require a large number of processors or a large amount of memory, needs minimal to no inter-processor communication, requires no load balancing, and can produce either all solutions or just one solution to the matching problem. The communication-efficient version of PERFORM described in this paper has minimal memory requirements, since it only needs to store the model and image features and computes everything else on the fly.
A new algorithm with O(N) complexity for clipping lines against a convex polyhedron is given, together with a modification for non-convex polyhedra. For polyhedra with a larger number of facets, the suggested algorithm is faster than the traditional Cyrus-Beck algorithm. The principal results of a comparison of the algorithms are shown and suggest how the proposed algorithm could be used effectively.
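For orientation, the traditional Cyrus-Beck baseline mentioned in the abstract can be sketched as follows. This is only the baseline, not the proposed algorithm, which the abstract does not specify. The face representation (an outward normal plus a point on the face) and the names `clip_segment` and `UNIT_CUBE` are assumptions made for illustration.

```python
def clip_segment(a, b, faces):
    """Cyrus-Beck clipping of segment a->b against a convex polyhedron.

    faces: list of (outward_normal, point_on_face) pairs (assumed layout).
    Returns (t_min, t_max) with the visible part A + t*(B - A), or None.
    """
    d = tuple(bi - ai for ai, bi in zip(a, b))
    t_min, t_max = 0.0, 1.0
    for normal, point in faces:  # one pass over the N facets
        denom = sum(n * di for n, di in zip(normal, d))
        numer = sum(n * (pi - ai) for n, pi, ai in zip(normal, point, a))
        if denom == 0:
            # Segment is parallel to this face: reject if it lies outside.
            if numer < 0:
                return None
            continue
        t = numer / denom
        if denom > 0:
            t_max = min(t_max, t)  # segment leaves through this face
        else:
            t_min = max(t_min, t)  # segment enters through this face
        if t_min > t_max:
            return None
    return t_min, t_max


# Hypothetical test solid: the unit cube [0, 1]^3.
UNIT_CUBE = [
    ((1, 0, 0), (1, 0, 0)), ((-1, 0, 0), (0, 0, 0)),
    ((0, 1, 0), (0, 1, 0)), ((0, -1, 0), (0, 0, 0)),
    ((0, 0, 1), (0, 0, 1)), ((0, 0, -1), (0, 0, 0)),
]
```

Calling `clip_segment((-1, 0.5, 0.5), (2, 0.5, 0.5), UNIT_CUBE)` yields the parameter interval of the part of the segment inside the cube. A segment that misses the cube yields `None`.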
Pure Adaptive Search (PAS) is a stochastic algorithm which has been analyzed for continuous global optimization. When a uniform distribution is used in PAS, it has been shown to have complexity which is linear in dimension. We define strong and weak variations of PAS in the setting of finite global optimization and prove analogous results. In particular, for the n-dimensional lattice {1,...,k}^n, the expected number of iterations to find the global optimum is linear in n. Many discrete combinatorial optimization problems, although having intractably large domains, have quite small ranges. The strong version of PAS for all problems, and the weak version of PAS for a limited class of problems, have complexity on the order of the size of the range.
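A minimal sketch of the strong variant on a tiny lattice is given below. It assumes that "strong" means each iterate is drawn uniformly from the points that strictly improve on the current objective value. The improving set is found by brute-force enumeration, which is only feasible for toy domains. The name `strong_pas` and the test objective are hypothetical.

```python
import itertools
import random


def strong_pas(points, f, rng):
    """Strong PAS sketch: sample uniformly from the strictly improving set.

    Returns (best_point, iterations). Enumerating the improving set is an
    illustration device only; it is not how PAS would be realized in practice.
    """
    current = rng.choice(points)  # first sample: uniform over the whole domain
    iterations = 1
    while True:
        improving = [x for x in points if f(x) < f(current)]
        if not improving:
            return current, iterations
        current = rng.choice(improving)
        iterations += 1


# Hypothetical example: minimize squared distance to (2, 2, 2) on {1, 2, 3}^3.
lattice = list(itertools.product(range(1, 4), repeat=3))
objective = lambda x: sum((xi - 2) ** 2 for xi in x)
best, iters = strong_pas(lattice, objective, random.Random(0))
```

Because every step strictly improves the objective value, the iteration count can never exceed the number of distinct objective values. This matches the abstract's point that complexity is governed by the size of the range rather than the size of the domain.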
Proper distribution of operations among parallel processors in a large scientific computation executed on a distributed-memory machine can significantly reduce the total computation time. In this paper, we propose an operation called simultaneous parallel reduction (SPR) that is amenable to such optimization. SPR performs reduction operations in parallel, each operation reducing a one-dimensional consecutive section of a distributed array. Each element of the distributed array is used as an operand to many reductions executed concurrently over the overlapping array sections. SPR is distinct from the more commonly considered parallel reduction, which concurrently evaluates a single reduction. In this paper we consider SPR on Single Instruction Multiple Data (SIMD) machines with different interconnection networks. We focus on SPR over sections whose size is not a power of 2, with the result shifted relative to the arguments. Several algorithms achieving some of the lower bounds on SPR complexity are presented under various assumptions about the properties of the binary operator of the reduction and the communication cost of the target architectures.
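To make the operation concrete, the sketch below computes every reduction over a consecutive section of width w, where w need not be a power of 2. It uses a doubling scheme that assumes only an associative operator. This is a sequential emulation in which each list comprehension stands in for one data-parallel step. It is not one of the paper's algorithms, and it omits the shift of the result. The names `spr_naive` and `spr_doubling` are hypothetical.

```python
from functools import reduce


def spr_naive(a, w, op):
    """Reference: reduce every consecutive section a[i : i + w] independently."""
    return [reduce(op, a[i:i + w]) for i in range(len(a) - w + 1)]


def spr_doubling(a, w, op):
    """All width-w section reductions in O(log w) combining steps.

    Assumes 1 <= w <= len(a) and an associative op (commutativity not needed).
    """
    n = len(a)
    result = None      # result[i] reduces a[i : i + covered]
    covered = 0
    block = list(a)    # block[i] reduces a[i : i + size]
    size = 1
    while size <= w:
        if w & size:   # this power of two is part of w's binary expansion
            if result is None:
                result = block[:]
            else:
                result = [op(result[i], block[i + covered])
                          for i in range(n - covered - size + 1)]
            covered += size
        # Double the block width: combine each block with its right neighbour.
        block = [op(block[i], block[i + size])
                 for i in range(len(block) - size)]
        size *= 2
    return result
```

With string concatenation as a non-commutative operator, `spr_doubling(list("abcdefg"), 3, lambda x, y: x + y)` produces the same sections as `spr_naive`, which shows that operand order is preserved.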
Multiple-query processing has received a lot of attention recently. The problem arises in many areas, such as extended relational database systems and deductive systems. In this paper we describe a heuristic search algorithm for this problem. This algorithm uses an improved heuristic function that enables it to expand only a small fraction of the nodes expanded by an algorithm that has been proposed in the past. In addition, it handles implied relationships without increasing the size of the search space or the number of nodes generated in this space. We include both theoretical analysis and experimental results to demonstrate the utility of the algorithm.
The support of Boolean set operations in free-form solid modeling systems requires the repeated intersection of parametric surfaces. Present approaches to this problem are sequential and must make trade-offs between accuracy, robustness and efficiency. In this paper, we investigate a parallel approach to the surface intersection problem that shows, both theoretically and empirically, that with parallelism we can achieve both speed and precision simultaneously. We first develop a theoretical foundation for a subdivision method and derive complexity bounds. We show that the basic algorithm can be improved by parallelism. We then design two tolerance-based parallel subdivision algorithms, a macro-subdivision algorithm designed for MIMD shared memory machines and a lookahead-subdivision algorithm for pipelined MIMD machines. Empirical results on the Sequent Balance 21000, the Alliant FX/8, and the Cray-2 verify that significant speed-up is achievable.
Residue number systems (RNS) offer the advantage of fast addition and multiplication over other number systems. However, the problem of division by fixed divisors in RNS must be considered. Consequently, two division algorithms for fixed divisors that achieve time complexity of O(n) are presented. The first algorithm is based on the well-known division method of multiplying by the divisor reciprocal. The second algorithm is based on Chinese Remainder Theorem (CRT) decoding and table lookup and requires that the divisor D be relatively prime to all moduli. The latter requires more storage but is faster. Furthermore, the second algorithm leads to an efficient RSA implementation, with 4m/b steps per modular multiplication.
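The following sketch illustrates why addition and multiplication are fast in RNS (they act independently on each residue) and how CRT decoding recovers the integer. The last function shows only the special case of exact division by a divisor coprime to all moduli, done by multiplying each residue by a modular inverse. It is not either of the paper's general division algorithms. The moduli and function names are assumptions chosen for illustration.

```python
from math import prod

MODULI = (3, 5, 7)   # pairwise coprime moduli (example choice)
M = prod(MODULI)     # dynamic range: integers 0 .. 104


def to_rns(x):
    return tuple(x % m for m in MODULI)


def rns_add(a, b):
    # Carry-free: each residue channel is independent.
    return tuple((ai + bi) % m for ai, bi, m in zip(a, b, MODULI))


def rns_mul(a, b):
    return tuple((ai * bi) % m for ai, bi, m in zip(a, b, MODULI))


def from_rns(r):
    """CRT decoding: x = sum(r_i * M_i * (M_i^-1 mod m_i)) mod M."""
    total = 0
    for ri, m in zip(r, MODULI):
        Mi = M // m
        total += ri * Mi * pow(Mi, -1, m)  # modular inverse (Python 3.8+)
    return total % M


def rns_exact_div(a, d):
    """Exact division only: assumes d divides the value and gcd(d, m_i) = 1."""
    return tuple((ai * pow(d, -1, m)) % m for ai, m in zip(a, MODULI))
```

For example, `from_rns(rns_mul(to_rns(17), to_rns(4)))` recovers 68, and `from_rns(rns_exact_div(to_rns(68), 4))` recovers 17. Results are only meaningful while the true value stays below M.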
In this paper we present a new method for the efficient implementation of the fast transversal filter (FTF) algorithm. Reduction of the arithmetic complexity is obtained by making use of the redundancy in the successive computations of the forward prediction error and the filtering error in the joint process. The resulting algorithm is exactly equivalent to the original FTF algorithm, hence retaining the same theoretical convergence characteristics and offering the least squares (LS) estimate at each recursion step without delay. Furthermore, the algorithm can be numerically stabilized by using a simple and effective stabilization measure which needs only one additional multiplication per recursion step. The equivalence of the proposed algorithm to the original FTF algorithm is demonstrated by simulations of an acoustic room impulse response identification.
We present a new fast algorithm for Recursive Least-Squares (RLS) adaptive filtering that uses displacement structure and subsampled updating. The FSU FTF algorithm is based on the Fast Transversal Filter (FTF) algorithm, which exploits the shift invariance that is present in the RLS adaptation of an FIR filter. The FTF algorithm is in essence the application of a rotation matrix to a set of filters and in that respect resembles the Levinson algorithm. In the subsampled updating approach, we accumulate the rotation matrices over some time interval before applying them to the filters. It turns out that the successive rotation matrices themselves can be obtained from a Schur-type algorithm which, once properly initialized, does not require inner products. The various convolutions that thus appear in the algorithm are done using the Fast Fourier Transform (FFT). For relatively long filters, the computational complexity of the new algorithm is lower than that of the well-known LMS algorithm, rendering it especially suitable for applications such as acoustic echo cancellation.
This paper presents a novel logic synthesis method for optimizing ternary logic functions of any given number of input variables. A new optimization algorithm for synthesizing and minimizing an arbitrary ternary logic function of n input variables always leads the function to an optimal or very close to optimal solution, where [n(n + 1)/2] - 1 searches are necessary to achieve the optimal solution. The complexity of the algorithm is therefore greatly reduced, from O(3^n) to O(n^2). The advantages of this synthesis and optimization algorithm are: (1) The logic synthesis method is very easy. (2) The algorithm complexity is O(n^2). (3) The optimal solution can be obtained in a very short time. (4) The method can solve the interconnection problems (interconnection delay) of VLSI and ULSI processors, where very fast and parallel operations can be achieved. A transformation method between the operational and polynomial domains of ternary logic functions of n input variables is also discussed. This transformation method is very effective and simple. Circuit designs for the GF(3) operators, addition and multiplication mod 3, are proposed, where these circuits are composed of Josephson junction devices. The simulation results for these circuits and examples show the following advantages: very good performance, very low power consumption, and ultra-high-speed switching operation.
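As a small illustration of the quantities named in the abstract, the sketch below implements the two GF(3) operators (addition and multiplication mod 3) and compares the stated search count, [n(n + 1)/2] - 1, with the 3^n size of an exhaustive search. It does not reproduce the synthesis algorithm itself, which the abstract does not describe. The function names are hypothetical.

```python
def gf3_add(a, b):
    """Addition in GF(3): ordinary addition reduced mod 3."""
    return (a + b) % 3


def gf3_mul(a, b):
    """Multiplication in GF(3): ordinary multiplication reduced mod 3."""
    return (a * b) % 3


def search_count(n):
    """Number of searches stated in the abstract: n(n + 1)/2 - 1."""
    return n * (n + 1) // 2 - 1


# Operator tables, rows indexed by a and columns by b, for a, b in {0, 1, 2}.
ADD_TABLE = [[gf3_add(a, b) for b in range(3)] for a in range(3)]
MUL_TABLE = [[gf3_mul(a, b) for b in range(3)] for a in range(3)]

# Quadratic search count versus exponential exhaustive search.
for n in (2, 4, 8):
    print(n, search_count(n), 3 ** n)
```

For n = 8 the stated count is 35 searches, against 3^8 = 6561 candidates for exhaustive enumeration.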