Sparse-coding-based approaches to image recognition have recently outperformed the traditional bag-of-features technique. Because the image descriptor space is high-dimensional, existing systems usually require a very large codebook to minimize coding error and reach satisfactory accuracy. While most research efforts address the problem by constructing a smaller codebook with stronger discriminative power, in this paper we introduce an alternative solution that enhances the quality of coding. In particular, we apply an idea similar to the Fisher kernel to the coding framework, using the image-dependent codebook derivative to represent the image. The proposed idea is generic across multiple coding criteria; in this paper it is applied to enhance locality-constrained linear coding (LLC). Experiments show that the new feature, called "LLC+", achieves significantly improved accuracy on several challenging datasets, even with a small codebook 1/20 the size reported for LLC. LLC+ therefore gains advantages in modeling accuracy, processing speed, and codebook training.
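As a point of reference for the coding step that the abstract builds on, the following is a minimal sketch of the commonly used approximated LLC solution (k nearest codebook atoms, then a sum-to-one least-squares fit). It shows plain LLC only, not the LLC+ codebook-derivative feature, whose details are not given here. The function name `llc_code` and the parameters `k` and `reg` are illustrative assumptions.

```python
import numpy as np

def llc_code(x, codebook, k=5, reg=1e-4):
    """Approximate LLC code for one descriptor.

    x: descriptor of shape (d,); codebook: atoms as rows, shape (m, d).
    Names and default values are illustrative assumptions.
    """
    # Locality: keep only the k atoms closest to x.
    dists = np.sum((codebook - x) ** 2, axis=1)
    nearest = np.argsort(dists)[:k]

    # Shift the selected atoms so that x becomes the origin.
    z = codebook[nearest] - x
    cov = z @ z.T  # local covariance, shape (k, k)

    # Regularize for numerical stability (tiny constant guards trace == 0).
    cov = cov + (reg * np.trace(cov) + 1e-12) * np.eye(k)

    # Solve the constrained least-squares problem, then enforce sum(w) == 1.
    w = np.linalg.solve(cov, np.ones(k))
    w /= w.sum()

    code = np.zeros(codebook.shape[0])
    code[nearest] = w
    return code
```

Each code has at most `k` nonzero entries and sums to one, so the cost per descriptor depends mainly on the nearest-neighbor search over the codebook. That is one reason a smaller codebook helps processing speed.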
ISBN: (print) 9789897581335
In classifying images, scenes, or objects, the most popular approach is based on the feature extraction-coding-pooling framework, which generates discriminative and robust image representations from densely extracted local patches, mainly SIFT/HOG ones. Most recent research focuses on improving the coding and pooling parts. In this work, we show that substantial improvements can also be obtained by coding information closer to the pixel level, in the same way that deep-learning architectures do. We introduce a two-layer, stacked coder-pooler architecture in which the first layer is dedicated to extracting, from our so-called Differential Vectors (DV) patches, local low-level features that are more discriminative and efficient than their classic handcrafted counterparts. This first layer can advantageously replace any classic dense SIFT/HOG patch extraction stage. We demonstrate the effectiveness of our approach on three datasets: UIUC-Sports, Scene 15, and Caltech 101. We achieve excellent performance with simple linear classification while using basic coding and pooling schemes for both layers, i.e., sparse coding (SC) and max-pooling (MP), respectively.
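The coder-pooler step named in the abstract (sparse coding followed by max-pooling) can be sketched for a single layer as below. This is a minimal illustration under stated assumptions: it uses ISTA, a standard solver for the L1-regularized least-squares problem, which the abstract does not specify. The Differential Vectors patch construction is not reproduced. The function names and the parameters `lam` and `n_iter` are hypothetical.

```python
import numpy as np

def soft_threshold(v, t):
    """Proximal operator of the L1 norm."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)

def sparse_codes(X, D, lam=0.1, n_iter=200):
    """Sparse-code patches with ISTA (solver choice is an assumption).

    X: patches as rows, shape (n, d); D: dictionary atoms as rows, shape (m, d).
    Minimizes 0.5 * ||X - A D||_F^2 + lam * ||A||_1 over codes A, shape (n, m).
    """
    # Lipschitz constant of the gradient: squared spectral norm of D.
    lipschitz = np.linalg.norm(D, 2) ** 2
    A = np.zeros((X.shape[0], D.shape[0]))
    for _ in range(n_iter):
        grad = (A @ D - X) @ D.T
        A = soft_threshold(A - grad / lipschitz, lam / lipschitz)
    return A

def max_pool(A):
    """Pool all patch codes of one image into a single vector of length m."""
    return np.max(np.abs(A), axis=0)
```

In a two-layer stack, the pooled outputs of the first layer over small regions would serve as the input descriptors of the second layer, and the final pooled vector would go to a linear classifier.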
ISBN: (print) 9781424496365
We propose a computational model of recognition in the cerebral cortex, based on an approximate belief revision algorithm. The algorithm calculates the MPE (most probable explanation) of Bayesian networks with a linear-sum CPT (conditional probability table) model. Although the proposed algorithm is simple enough to be implemented by a fixed circuit, the performance evaluation shows that its approximation accuracy is acceptable. The mean convergence time is not sensitive to the number of nodes if the depth of the network is constant. The amount of computation is linear in the number of nodes if the number of edges per node is constant. The proposed algorithm can be used as part of a learning algorithm for a kind of sparse coding that reproduces the orientation selectivity of the primary visual area. The circuit that executes the algorithm corresponds better to the anatomical structure of the cerebral cortex, namely its six-layer and columnar features, than the previously proposed approximate belief propagation algorithm. These results suggest that the proposed algorithm is a promising starting point for a model of the recognition mechanism of the cerebral cortex.
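To make the MPE target concrete, here is a toy sketch that finds it by exhaustive enumeration on a three-node network. It is not the approximate belief revision algorithm of the abstract. The network, all probability values, and the reading of "linear-sum CPT" as a weighted sum of per-parent conditional tables are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical binary network: parents A and B, child C (observed as C = 1).
prior_a = [0.7, 0.3]    # P(A=0), P(A=1)
prior_b = [0.4, 0.6]    # P(B=0), P(B=1)
c_given_a = [0.1, 0.9]  # P(C=1 | A=a), per-parent table
c_given_b = [0.2, 0.8]  # P(C=1 | B=b), per-parent table

def p_c1(a, b, w_a=0.5, w_b=0.5):
    """Assumed linear-sum CPT: weighted sum of the per-parent tables."""
    return w_a * c_given_a[a] + w_b * c_given_b[b]

def joint(a, b):
    """Joint probability P(A=a, B=b, C=1)."""
    return prior_a[a] * prior_b[b] * p_c1(a, b)

def mpe_given_c1():
    """Most probable explanation of the evidence C = 1, by brute force."""
    best = max(product((0, 1), repeat=2), key=lambda ab: joint(*ab))
    return best, joint(*best)
```

Enumeration grows exponentially with the number of hidden nodes, which is why an approximate algorithm with cost linear in the number of nodes, as the abstract describes, matters for larger networks.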
This work experimentally analyzes the learning and retrieval capabilities of the diluted metric attractor neural network when applied to collections of fingerprint images. The computational cost of the network decreases with the dilution, so the region of interest can be enlarged to cover almost the complete fingerprint. Network retrieval was successfully tested for different noisy configurations of the fingerprints and proved to be robust, with a large basin of attraction. We showed that network topologies with a 2D-Grid arrangement adapt better to the spatial structure of fingerprints, outperforming the typical 1D-Ring configuration. An optimal ratio of local connections to random shortcuts that best represents the intrinsic spatial structure of the fingerprints was found, and its influence on the retrieval quality was characterized in a phase diagram. Since the present model is a set of nonlinear equations, it is possible to go beyond the naive static solution (which consists of matching two fingerprints using a fixed distance threshold), and a crossing evolution of similarities was shown, leading to the retrieval of the right fingerprint from an apparently more distant candidate. This feature could be very useful in fingerprint verification for discriminating between fingerprint pairs.
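The mix of local connections and random shortcuts can be illustrated with a small Hopfield-style sketch on a 1D ring. It is a simplified stand-in under stated assumptions, not the model of the abstract: it uses random binary patterns instead of fingerprints, Hebbian weights, and synchronous sign updates. All function names and parameter values are hypothetical.

```python
import numpy as np

def small_world_mask(n, k_local, k_random, rng):
    """Connectivity mask: k_local ring neighbors plus k_random shortcuts per neuron."""
    mask = np.zeros((n, n), dtype=bool)
    idx = np.arange(n)
    for offset in range(1, k_local // 2 + 1):
        mask[idx, (idx + offset) % n] = True
        mask[idx, (idx - offset) % n] = True
    for i in range(n):
        # Shortcuts go to neurons that are neither i itself nor already linked.
        candidates = np.flatnonzero(~mask[i] & (idx != i))
        mask[i, rng.choice(candidates, size=k_random, replace=False)] = True
    return mask

def hebbian_weights(patterns, mask):
    """Hebbian weights restricted to the diluted connectivity; patterns are +/-1 rows."""
    w = patterns.T @ patterns / mask.sum(axis=1, keepdims=True)
    return np.where(mask, w, 0.0)

def retrieve(w, state, n_steps=20):
    """Synchronous sign dynamics starting from a noisy state."""
    for _ in range(n_steps):
        state = np.where(w @ state >= 0, 1, -1)
    return state

def overlap(a, b):
    """Similarity between two +/-1 states; 1.0 means identical."""
    return float(np.mean(a * b))
```

Each update costs one multiply-add per retained connection, so cost scales with the number of connections per neuron. This is the sense in which dilution lowers computational cost. Varying `k_local` against `k_random` at a fixed total is a simple way to explore the local-to-shortcut ratio.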