Robots in optoelectronic imaging applications typically require real-time and accurate recognition and localization of targets, especially in complex environments. Due to th...
ISBN: 0819459216 (print)
This article introduces a new theoretical framework to describe the behavior of Steinbuch's Lernmatrix. The properties of this early associative memory can be modeled using set theory and order relationships, analogously to morphological associative memories. The results obtained make the Lernmatrix, four decades after its creation, a good alternative for pattern classification and recognition.
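To make the associative-memory idea concrete, here is a minimal sketch of the Lernmatrix as it is commonly described in the literature, not of the set-theoretic framework the article proposes. It assumes binary input patterns, one class per pattern, a learning step that adds `eps` where an input bit is 1 and subtracts it where the bit is 0, and recall by maximum row response. The names `train_lernmatrix`, `recall` and `eps` are illustrative, not taken from the article.

```python
def train_lernmatrix(patterns, labels, n_classes, eps=1.0):
    """Build the memory matrix: one row per class, one column per input bit."""
    n_bits = len(patterns[0])
    memory = [[0.0] * n_bits for _ in range(n_classes)]
    for x, cls in zip(patterns, labels):
        for j, bit in enumerate(x):
            # Reinforce bits that are on, penalize bits that are off,
            # only in the row of the pattern's own class.
            memory[cls][j] += eps if bit == 1 else -eps
    return memory


def recall(memory, x):
    """Return the indices of all classes whose row response is maximal."""
    scores = [sum(m * b for m, b in zip(row, x)) for row in memory]
    best = max(scores)
    return [i for i, s in enumerate(scores) if s == best]


patterns = [
    [1, 0, 1, 0, 1],
    [1, 1, 0, 0, 1],
    [1, 0, 1, 1, 0],
]
labels = [0, 1, 2]
memory = train_lernmatrix(patterns, labels, n_classes=3)
for x, cls in zip(patterns, labels):
    print(x, "->", recall(memory, x))
```

Recall returns a list because several rows can tie for the maximum; an ambiguous answer of that kind is one of the situations a theoretical analysis of the memory has to account for.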
Learning comprehensive spatiotemporal features is crucial for human action recognition. Existing methods tend to model spatiotemporal feature blocks in an integrate-separate-integrate form, such as the appearance-and-relation network (ARTNet) and the spatiotemporal and motion network (STM). However, as blocks are stacked, the rear part of the network becomes difficult to interpret. To avoid this problem, we propose a novel architecture called the spatial temporal relation network (STRNet), which learns explicit information about appearance, motion and, in particular, temporal relations. Specifically, STRNet is built from three branches that separate the features into 1) an appearance pathway, to obtain spatial semantics, 2) a motion pathway, to reinforce the spatiotemporal feature representation, and 3) a relation pathway, to capture temporal relation details of successive frames and to explore long-term representation dependency. In addition, STRNet does not simply merge the multi-branch information; it applies a flexible and effective strategy to fuse the complementary information from the pathways. We evaluate our network on four major action recognition benchmarks: Kinetics-400, UCF-101, HMDB-51, and Something-Something v1. STRNet achieves state-of-the-art results on UCF-101 and HMDB-51, and accuracy comparable to the state-of-the-art method on Something-Something v1 and Kinetics-400.
In face recognition, the dimensionality of the raw data is very high, so dimension reduction (feature extraction) should be applied before classification. Several feature extraction methods exist; commonly used are Pri...
This paper introduces a novel high-performance algorithm and VLSI architectures for bit plane coding (BPC) in word-level sequential and parallel modes. The proposed BPC algorithm adopts coding pass prediction together with parallel and pipelined processing to reduce the number of memory accesses and to increase the system's capacity for concurrent processing, so that all the coefficient bits of a code block can be coded in a single scan. A new parallel bit plane architecture (PA) is proposed to achieve word-level sequential coding. Moreover, an efficient high-speed architecture (HA) is presented to achieve multi-word parallel coding. Compared with the state of the art, the proposed PA reduces hardware cost more efficiently, although its throughput remains one coefficient coded per clock. The proposed HA codes the 4 coefficients of a stripe column in one intra-clock cycle, so that an N×N code block can be coded in approximately N²/4 intra-clock cycles. Theoretical analysis and experimental results demonstrate that the proposed designs achieve a high throughput rate with a good speedup-to-cost ratio, making them good alternatives for low-power applications.
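The N²/4 figure can be checked with a short sketch. It assumes the stripe-oriented scan used in JPEG 2000 bit plane coding, where a code block is divided into stripes 4 rows high and each stripe is scanned column by column; the abstract mentions stripe columns but does not spell out the scan, and the helper name `stripe_columns` is illustrative only.

```python
def stripe_columns(n, stripe_height=4):
    """Yield the coefficient positions of each stripe column in an n x n block.

    Stripes are visited top to bottom, and the columns inside a stripe
    left to right. Each yielded list holds up to `stripe_height` positions.
    """
    for top in range(0, n, stripe_height):
        for col in range(n):
            yield [(row, col) for row in range(top, min(top + stripe_height, n))]


n = 64
columns_per_block = sum(1 for _ in stripe_columns(n))
# One stripe column per cycle (multi-word parallel coding).
print("cycles at 4 coefficients per cycle:", columns_per_block)
# One coefficient per cycle (word-level sequential coding).
print("cycles at 1 coefficient per cycle:", n * n)
```

For a 64×64 code block this gives 1024 stripe columns against 4096 coefficients, which is the factor of four between one coefficient per clock and one stripe column per cycle.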
SIFT (Scale Invariant Feature Transform) is one of the most popular approaches for feature detection and matching. Many parallelized algorithms have been proposed to accelerate SIFT for use in real-time systems. This pap...
This paper presents a texture segmentation approach based on the Markov random field (MRF) model and a feedforward neural network. Texture is modeled by the second-order Gauss MRF model, and least-squares error estimation is used to solve for the model parameters. To perform texture segmentation, we introduce an improved BP algorithm that learns faster. Experiments show that it yields better segmentation results than the traditional Euclidean distance method.
In automatic image annotation, low-level visual features are often extracted from the original image and mapped to high-level image semantic information. In this paper, we propose a novel method which i...
ISBN: 9780819469526 (print)
In this paper, a face recognition method using local qualitative representations is proposed to solve the problem of face recognition under varying lighting. Based on the observation that the ordinal relationship between the average brightness of a pair of image regions is invariant under lighting changes, Local Binary Mapping is defined as an illumination invariant for face recognition. It is based on the Local Binary Pattern descriptor, which extracts the local variance features of an image. For the resulting 'symbol' feature vector, Hamming distance is used as the similarity measure. The proposed method achieves an accuracy of 100 percent on subsets 2, 3 and 4 and 98.89 percent on subset 5 of the Yale Face Database B when all images in subset 1 are used as the gallery.
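The ordinal-invariance observation can be illustrated with a small sketch. It uses the plain 8-neighbor Local Binary Pattern on single pixels rather than the paper's Local Binary Mapping over region averages, and the function names `lbp_codes` and `hamming` are illustrative. Because each code only records whether a neighbor is at least as bright as the center, any monotonically increasing brightness change leaves the codes unchanged.

```python
def lbp_codes(img):
    """Return the 8-bit LBP code of every interior pixel, in row-major order."""
    height, width = len(img), len(img[0])
    # Neighbors visited clockwise, starting at the top-left corner.
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = []
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            code = 0
            for bit, (dr, dc) in enumerate(offsets):
                # Only the ordinal relation to the center pixel is kept.
                if img[r + dr][c + dc] >= img[r][c]:
                    code |= 1 << bit
            codes.append(code)
    return codes


def hamming(codes_a, codes_b):
    """Count the differing bits between two equally long code sequences."""
    return sum(bin(a ^ b).count("1") for a, b in zip(codes_a, codes_b))


face = [
    [10, 20, 30, 40],
    [50, 60, 70, 80],
    [90, 100, 110, 120],
    [130, 140, 150, 160],
]
# Same scene under a brighter, higher-contrast (monotonic) lighting change.
relit = [[2 * p + 10 for p in row] for row in face]
# A different pattern: every row mirrored.
other = [list(reversed(row)) for row in face]

print("same face, new lighting:", hamming(lbp_codes(face), lbp_codes(relit)))
print("different pattern:", hamming(lbp_codes(face), lbp_codes(other)))
```

Real lighting changes are not globally monotonic (shadows and highlights move), which is one reason to compare average brightness over regions, as the abstract describes, instead of single pixels.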
Multiple kernel learning (MKL) is a widely used kernel learning method, but kernel selection lacks theoretical guidance. The performance of MKL depends on the user's experience, which is difficult to ch...