In multiprocessor systems, crossbar scheduling networks are widely used for interconnection among processors or modules in an SoC (System on Chip). In this paper, a parallel SAR (Synthetic Aperture Radar) image processing system composed of five FPGA PEs (Process Elements) is taken as the basic research platform. A novel compact crossbar scheduling network for multiple distributed parallel PEs is then proposed for communication among the PEs. Different scheduling strategies are provided for the different kinds of source data streams, which come from external raw data and from the PEs. In addition, resynchronization makes the data streams of all PEs synchronous when they are not entirely consistent due to the efficiency of memory access and bus burst transfer. Simulation and synthesis show that the proposed crossbar scheduling network uses only a small amount of device resources while meeting the requirements of the system. The maximum throughput of the crossbar can exceed 20 Gbps at an operating frequency of 100 MHz, and the minimum latency is three clock cycles. The proposed crossbar scheduling network, as a sub-module, can be easily integrated into a SAR real-time imaging system.
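The throughput and latency figures quoted above can be put in per-cycle terms with a little arithmetic. The sketch below only restates the two numbers from the abstract (20 Gbps, 100 MHz, three cycles). The function names are hypothetical, and the abstract does not say how the aggregate width is divided among ports, so no per-port figure is derived.

```python
def bits_per_cycle(throughput_bps: float, clock_hz: float) -> float:
    """Aggregate number of bits the crossbar must move in each clock cycle."""
    return throughput_bps / clock_hz


def latency_ns(cycles: int, clock_hz: float) -> float:
    """Latency in nanoseconds for a given number of clock cycles."""
    return cycles * 1e9 / clock_hz


# Figures taken from the abstract: 20 Gbps at 100 MHz, minimum latency 3 cycles.
aggregate_bits = bits_per_cycle(20e9, 100e6)   # bits per cycle across the whole crossbar
min_latency = latency_ns(3, 100e6)             # nanoseconds

print(f"{aggregate_bits:.0f} bits per cycle, {min_latency:.0f} ns minimum latency")
```

Under these figures the crossbar moves roughly 200 bits per cycle in aggregate, and three cycles at 100 MHz correspond to about 30 ns.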
We present a new algorithm for facial expression recognition that is robust to occlusion. The facial image is divided into equal-sized regions, and a Sparse Representation Classifier (SRC) classifies the facial expression in each region. These classification decisions must be combined, and different voting methods were considered. A weighted voting method, in which the vote assigned to each class in a region is based on the class representation error, led to the best recognition results under a variety of occlusion conditions. The recognition rate of our algorithm remains very high for unoccluded images (95.3% success). With large occluded regions (≥25% of the image), it significantly outperforms an SRC algorithm based on the entire image and a Gabor-based algorithm. Since each subimage problem can be solved independently before the decisions are combined, processing can be done in parallel, leading to a fast SRC-based classification decision if implemented on a multi-core system.
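The idea of combining per-region decisions by error-weighted voting can be illustrated with a minimal sketch. The abstract does not give the weighting formula, so the inverse-error weights used here are an assumption, as are the function names and the toy residual values. The per-region representation errors are taken as given inputs rather than computed by an actual SRC.

```python
from typing import List, Sequence


def weighted_vote(residuals: Sequence[Sequence[float]], eps: float = 1e-9) -> int:
    """Combine per-region class residuals into one decision.

    residuals[r][c] is the representation error of class c in region r.
    Assumed weighting (not from the paper): each region spreads one vote
    over the classes in proportion to 1 / residual.
    """
    n_classes = len(residuals[0])
    totals: List[float] = [0.0] * n_classes
    for region in residuals:
        inverse = [1.0 / (r + eps) for r in region]
        norm = sum(inverse)
        for c, w in enumerate(inverse):
            totals[c] += w / norm
    return max(range(n_classes), key=totals.__getitem__)


def majority_vote(residuals: Sequence[Sequence[float]]) -> int:
    """Baseline: each region casts one full vote for its lowest-error class."""
    n_classes = len(residuals[0])
    counts = [0] * n_classes
    for region in residuals:
        counts[min(range(n_classes), key=region.__getitem__)] += 1
    return max(range(n_classes), key=counts.__getitem__)


# Toy data: one confident region for class 0, two nearly undecided regions
# that lean slightly towards class 1.
example = [[0.1, 0.9], [0.5, 0.45], [0.5, 0.45]]
print(weighted_vote(example), majority_vote(example))
```

In this toy case the two schemes disagree: the majority vote follows the two weakly decided regions, while the weighted vote follows the one region with a clear low-error class. Because each region is scored independently, the loop over regions is also the part that could be run in parallel.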
This paper presents an efficient stiffness identification technique for truss structures based on distributed local computation. Sensor nodes on each element are assumed to collect strain data and communicate only with sensors on neighboring elements. This can significantly reduce the energy demand for data transmission and the complexity of transmission protocols, thus enabling a simplified wireless implementation. Element stiffness parameters are identified by simple low-order matrix inversion at a local level, which reduces the computational energy, allows for distributed computation and makes parallel data processing possible. The proposed method also permits addressing the problem of missing data or faulty sensors. Numerical examples, with and without missing data, are presented and the element stiffness parameters are accurately identified. The computation efficiency of the proposed method is n^2 times higher than previously proposed global damage identification methods.
Graph cut methods are at the core of many state-of-the-art algorithms in computer vision due to their efficiency in computing globally optimal solutions. In this paper, we solve the maximum flow/minimum cut problem in parallel by splitting the graph into multiple parts, and hence further increase the computational efficiency of graph cuts. Optimality of the solution is guaranteed by dual decomposition; more specifically, the solutions to the subproblems are constrained to be equal on the overlap by means of dual variables. We demonstrate that our approach allows both (i) faster processing on multi-core computers and (ii) the capability to handle larger problems by splitting the graph across multiple computers on a distributed network. Even though our approach does not give a theoretical guarantee of speedup, an extensive empirical evaluation on several applications with many different data sets consistently shows good performance. An open source implementation of the dual decomposition method is also made publicly available.
Human skin color detection plays an important role in applications such as skin segmentation, face recognition, and tracking. Building a robust human skin color classifier is an essential step. This paper presents a classifier based on beta mixture models (BMM), which uses the pixel values in RGB space as the features. We propose a Bayesian estimation method based on the variational inference framework to approximate the posterior distribution of the parameters in the BMM and take the posterior mean as a point estimate of the parameters. The well-known Compaq image database is used to evaluate the performance of our BMM-based classifier. Compared to some other skin color detection methods, our BMM-based classifier shows better recognition performance.
Random number generation is the kernel of Monte Carlo methods and simulation, and it is sometimes necessary to generate a random vector from an unknown distribution described by a group of weighted samples. Based on the idea of partial approximation, a novel Weighted-Sample-Based Random Vector Generation (WSB-RVG) algorithm is proposed in this paper, which skips the estimation of the unknown density and requires few assumptions on the concealed distribution. Thus this method is particularly suitable for random vector generation, and can be used for resampling in the Particle Filter (PF) when the general Gaussian assumption deteriorates. Its validity and performance are verified in simulations, where the proposed algorithm is compared with regularization for approximating a Gaussian mixture model and for resampling in a nonlinear tracking problem.
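For orientation, the regularization baseline mentioned above can be sketched as follows. This is not the WSB-RVG algorithm, whose details the abstract does not give. It is a minimal regularized resampler, assuming a Gaussian kernel with a fixed, user-chosen bandwidth. The function name and parameters are hypothetical.

```python
import random
from typing import List, Sequence


def regularized_resample(
    samples: Sequence[Sequence[float]],
    weights: Sequence[float],
    n: int,
    bandwidth: float,
    rng: random.Random,
) -> List[List[float]]:
    """Draw n random vectors from a set of weighted samples.

    Step 1: pick samples with probability proportional to their weights.
    Step 2: jitter each pick with Gaussian noise (assumed kernel) so the
            output is not restricted to the original sample points.
    """
    picks = rng.choices(samples, weights=weights, k=n)
    return [[x + rng.gauss(0.0, bandwidth) for x in p] for p in picks]


rng = random.Random(0)
particles = [[0.0, 0.0], [1.0, 1.0], [4.0, -2.0]]
particle_weights = [0.2, 0.5, 0.3]
draws = regularized_resample(particles, particle_weights, n=5, bandwidth=0.1, rng=rng)
print(len(draws), len(draws[0]))
```

With `bandwidth=0.0` the function reduces to plain multinomial resampling, which only ever returns copies of the original samples. The jitter step is what regularization adds, at the cost of choosing a kernel and bandwidth.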
On-chip inductive effects are becoming predominant in deep submicron (DSM) interconnects due to increasing clock speed, circuit complexity and decreasing interconnect lengths. Inductance causes noise in the signal waveforms, which can adversely affect circuit performance and signal integrity. The traditional analysis of crosstalk in a transmission line begins with a lossless LC representation, yielding a wave equation governing the system response. This paper proposes a difference model approach to derive crosstalk in the transform domain. A closed-form solution for crosstalk is obtained by incorporating initial conditions using the difference modal approach for distributed RLCG interconnects. We have derived the crosstalk metric for two parallel lines when both are switching simultaneously. An imprecise evaluation of crosstalk can be the cause of a circuit malfunction. Crosstalk can be analyzed by computing the signal linkage between aggressor (or attacker) nets and victim nets. The attacker net carries a signal that couples to the victim net through the mutual inductance. To determine the effect that this crosstalk will have on circuit operation, the resulting voltage expressions at the victim and aggressor must be calculated. This paper proposes a difference model approach for the effective voltages at the victim and aggressor using the superposition theorem. The accuracy of our approach is supported by results obtained from SPICE simulation.
ISBN (print): 9781424458561
This paper presents a target tracking algorithm based on the synthesized information of the signal's time-of-arrival (TOA) in an asynchronous distributed wireless sensor network (WSN). The proposed algorithm combines the displacement estimate (DE) and the position-displacement estimate (PDE), and focuses mainly on the case of very slow target motion. By taking advantage of DE's efficiency and PDE's ability to eliminate accumulated error, the algorithm tracks fairly well and its validity is guaranteed. An adaptive threshold is introduced to switch from DE to PDE so as to eliminate the error accumulated over a series of consecutive DE estimates. The difference between the two methods' estimates of the current target position is used to adjust the threshold for activating the next switch. Simulations show that the proposed method has quite good accuracy and availability in the targeted cases as well as in other situations.
Image segmentation is one of the most widely used procedures in medical image processing applications. Due to the high resolution of medical images and the large computational load of the mathematical methods involved, the medical image segmentation process has excessive computational complexity. Recently, Field Programmable Gate Array (FPGA) implementations, which can perform many complex computations in parallel, have been applied in many areas that require high computation time. In this study, neighbour-pixel-intensity-based feature extraction methods are proposed for extracting textural features in medical images, together with a k-NN classifier for the segmentation process.
Error concealment restores the visual integrity of image content that has been damaged by poor network transmission. Best neighborhood matching (BNM) is an effective image recovery method that exploits the information redundancy in a block-coded broken image to find similar content, which it then uses to repair or conceal errors. On a high-definition image, BNM is traditionally implemented sequentially, which requires a relatively long time and so is not suitable for real-time or high-volume use. In this paper, we analyze the data access patterns of the BNM algorithm and exploit a GPU platform to speed up the execution through a parallel implementation. We compare and combine several different GPU optimization methods (coalesced global memory access, shared memory, register files, etc.), and propose an improvement to the parallel BNM algorithm. Experimental results show that our approach can speed up BNM twenty-one times over the sequential approach without any obvious loss of accuracy.