ISBN (print): 9781509021758
The bag of visual words (BOW) model is widely used for image representation and classification. Spatial pyramid based feature pooling builds on the BOW model and is the most popular approach to capturing the spatial distribution (layout) of local image features. It assumes that the center of an object is aligned with the center of the image, which can lead to misalignment and performance degradation. In this paper, we propose a method that utilizes max-pooled features to estimate object centers and align the spatial pyramid accordingly. We also propose an image representation descriptor that is robust to misalignment and object deformation. The experimental results demonstrate that our spatial pyramid alignment method is simple yet effective in handling misalignment and achieves high object classification accuracy.
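A minimal sketch of the alignment idea, assuming the object center is estimated as the median location of the strongest-response (max-pooled) local feature per codeword; the paper's exact estimation rule is not given in the abstract, and all names and values below are illustrative.

```python
# Center-aligned spatial pyramid pooling (sketch). The object center is assumed
# to be the median location of the max-pooled feature for each codeword.
import numpy as np

def estimate_center(locations, codes):
    """locations: (N, 2) feature coordinates in [0, 1]^2,
       codes: (N, K) coding responses (e.g. soft-assignment or LLC codes)."""
    argmax_per_word = codes.argmax(axis=0)           # max-pooled feature index per codeword
    return np.median(locations[argmax_per_word], axis=0)

def aligned_pyramid_pool(locations, codes, center, levels=(1, 2, 4)):
    """Max-pool codes in a spatial pyramid whose grid is shifted so that its
       center coincides with the estimated object center."""
    shifted = np.clip(locations - center + 0.5, 0.0, 1.0 - 1e-9)
    pooled = []
    for l in levels:
        cell = (shifted * l).astype(int)             # cell index per feature at this level
        for cy in range(l):
            for cx in range(l):
                mask = (cell[:, 0] == cy) & (cell[:, 1] == cx)
                pooled.append(codes[mask].max(axis=0) if mask.any()
                              else np.zeros(codes.shape[1]))
    return np.concatenate(pooled)

# Usage with random data
rng = np.random.default_rng(0)
locs, cds = rng.random((500, 2)), rng.random((500, 64))
rep = aligned_pyramid_pool(locs, cds, estimate_center(locs, cds))
print(rep.shape)   # (1 + 4 + 16) cells x 64 codewords
```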
ISBN (print): 9781538604625
For real-time computer vision tasks, binary feature descriptors are an efficient alternative to their real-valued counterparts. While providing comparable results for many applications, the computational complexity of extracting and processing binary descriptors is significantly lower. In many application scenarios, the local features are transmitted over a channel with limited capacity and processed at a more powerful central processing unit, which requires efficient compression and transmission approaches. In this paper, we present a compression scheme for local binary features, which jointly encodes the descriptors and their respective Bag-of-Words representation using a shared vocabulary between client and server. By sending the visual word index and the entropy-coded residual vector containing the differences between the visual word and the descriptor, we are able to reduce ORB features to 60.62 % of their uncompressed size.
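A minimal sketch of the descriptor/residual encoding idea, assuming a shared binary vocabulary on client and server: each 256-bit ORB descriptor is mapped to its Hamming-nearest visual word, and only the word index plus the XOR residual (sparse in set bits, hence cheap to entropy-code) would be transmitted. The vocabulary, bit length, and entropy estimate are illustrative, not the paper's exact scheme.

```python
import numpy as np

BITS = 256                                             # ORB descriptor length in bits

def encode(descriptor, vocabulary):
    """descriptor: (BITS,) 0/1 array; vocabulary: (K, BITS) 0/1 array."""
    dists = (descriptor ^ vocabulary).sum(axis=1)      # Hamming distance to each word
    idx = int(dists.argmin())
    residual = descriptor ^ vocabulary[idx]            # sparse bit-difference vector
    return idx, residual

def residual_entropy_bits(residual):
    """Empirical per-descriptor cost of the residual under an i.i.d. bit model."""
    p = residual.mean()
    if p in (0.0, 1.0):
        return 0.0
    h = -(p * np.log2(p) + (1 - p) * np.log2(1 - p))   # bits per residual bit
    return h * residual.size

rng = np.random.default_rng(0)
vocab = rng.integers(0, 2, size=(1024, BITS), dtype=np.uint8)
desc = (vocab[7] ^ (rng.random(BITS) < 0.1)).astype(np.uint8)   # descriptor near word 7
idx, res = encode(desc, vocab)
print(idx, res.sum(), round(residual_entropy_bits(res) + np.log2(len(vocab)), 1))
```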
Bag of visual words is a popular model in human action recognition, but it usually suffers from the loss of spatial and temporal configuration information of local features and from large quantization error in its feature coding procedure. In this paper, to overcome these two deficiencies, we combine sparse coding with a spatio-temporal pyramid for human action recognition and regard this method as the baseline. More importantly, and this is the focus of this paper, we find that there is a hierarchical structure in the feature vector constructed by the baseline method. To exploit this hierarchical structure information for better recognition accuracy, we propose a tree regularized classifier that conveys the hierarchical structure information. The main contributions of this paper can be summarized as follows. First, we introduce a tree regularized classifier to encode the hierarchical structure information in the feature vector for human action recognition. Second, we present an optimization algorithm to learn the parameters of the proposed classifier. Third, the performance of the proposed classifier is evaluated on the YouTube, Hollywood2, and UCF50 datasets; the experimental results show that the proposed tree regularized classifier outperforms SVM and other popular classifiers and achieves promising results on the three datasets.
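A minimal sketch of a tree-structured penalty over a classifier weight vector whose blocks follow the spatio-temporal pyramid layout. The abstract does not spell out the exact regularizer, so a group-lasso-style penalty (sum over tree nodes of the l2 norm of that node's dimensions) is assumed here purely for illustration; the group layout and weights are hypothetical.

```python
import numpy as np

def pyramid_groups(n_words, cells_per_level=(1, 4, 8)):
    """Build tree-node groups: each node covers the dimensions of one pyramid
       cell; the root node covers all dimensions."""
    groups, start = [np.arange(n_words * sum(cells_per_level))], 0
    for n_cells in cells_per_level:
        for _ in range(n_cells):
            groups.append(np.arange(start, start + n_words))
            start += n_words
    return groups

def tree_penalty(w, groups, node_weight=1.0):
    """Sum of l2 norms of w restricted to each tree node's index group."""
    return node_weight * sum(np.linalg.norm(w[g]) for g in groups)

w = np.random.default_rng(0).standard_normal(4000 * (1 + 4 + 8))
print(round(tree_penalty(w, pyramid_groups(4000)), 2))
```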
Local methods based on spatio-temporal interest points (STIPs) have shown their effectiveness for human action recognition. The bag-of-words (BoW) model has been widely used and has dominated this field. Recently, a large number of techniques based on local features, including improved variants of the BoW model, sparse coding (SC), Fisher kernels (FK), the vector of locally aggregated descriptors (VLAD), and the naive Bayes nearest neighbor (NBNN) classifier, have been proposed and developed for visual recognition. However, some of them were proposed in the image domain and have not yet been applied to the video domain, and it is still unclear how effectively these techniques would perform on action recognition. In this paper, we provide a comprehensive study of these local methods for human action recognition. We implement these techniques and compare them under unified experimental settings on three widely used benchmarks, i.e., the KTH, UCF-YouTube, and HMDB51 datasets. We discuss the findings from the experimental results in depth and draw useful conclusions, which are expected to guide practical applications and future work in the action recognition community. (C) 2016 Elsevier B.V. All rights reserved.
This paper presents a feature encoding scheme for image classification that combines the salient coding method with category-specific codebooks, which are generated separately from the training images of each category. Different from the usual way of concatenating or merging the category codebooks into a global dictionary, we employ the category codebooks to calculate a type of category-sensitive saliency feature, and then encode the saliency features to form a representation of the image content. Compared to state-of-the-art methods such as LC-KSVD, the dictionary generation and feature encoding in our scheme are quite simple, and no complicated optimization is involved. Nevertheless, our scheme achieves better, and in some cases significantly better, classification accuracy than the state-of-the-art methods. Extensive experiments are carried out to show the effectiveness of our method in comparison with various image classification methods. (C) 2015 Elsevier B.V. All rights reserved.
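A minimal sketch of category-sensitive salient coding, assuming the standard salient-coding response (one minus the ratio of the distance to the nearest word over the mean distance to the next few words) computed separately against each category's codebook. The exact saliency definition and the subsequent encoding stage of the paper are not given in the abstract; codebook sizes, k, and the average pooling are illustrative.

```python
import numpy as np

def salient_response(x, codebook, k=5):
    """Saliency of local feature x w.r.t. one category codebook."""
    d = np.linalg.norm(codebook - x, axis=1)
    nearest = np.argsort(d)[:k]
    return max(0.0, 1.0 - d[nearest[0]] / d[nearest[1:]].mean())

def category_saliency_descriptor(features, category_codebooks, k=5):
    """(N, C) matrix of saliency scores, average-pooled into a C-dim image cue."""
    scores = np.array([[salient_response(x, cb, k) for cb in category_codebooks]
                       for x in features])
    return scores.mean(axis=0)

rng = np.random.default_rng(0)
codebooks = [rng.standard_normal((256, 128)) for _ in range(10)]   # 10 categories
feats = rng.standard_normal((200, 128))                            # dense SIFT-like features
print(category_saliency_descriptor(feats, codebooks).shape)        # (10,)
```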
Feature coding, which encodes local features extracted from an image with a codebook and generates a set of codes for efficient image representation, has shown very promising results in image classification. Vector quantization is the simplest and most widely used method for feature coding. However, it suffers from large quantization errors and leads to dissimilar codes for similar features. To alleviate these problems, we propose Laplacian Regularized Locality-constrained Coding (LapLLC), wherein a locality constraint is used to favor nearby bases for encoding, and Laplacian regularization is integrated to preserve the code consistency of similar features. By incorporating a set of template features, the objective function used by LapLLC can be decomposed, and each feature is encoded by solving a linear system. Additionally, the k-nearest-neighbor technique is employed to construct a much smaller linear system, so that fast approximate coding can be achieved. Therefore, LapLLC provides a novel way to perform efficient feature coding. Our experiments on a variety of image classification tasks demonstrate the effectiveness of the proposed approach. (C) 2015 Elsevier B.V. All rights reserved.
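A minimal sketch of the locality-constrained part of the coding step: each feature is encoded over its k nearest codewords by solving a small linear system (the standard LLC approximation). The Laplacian term of LapLLC, which couples the codes of similar features through the template features, is omitted here for brevity; parameter names and values are illustrative.

```python
import numpy as np

def knn_locality_code(x, codebook, k=5, lam=1e-4):
    """Return a sparse code for x supported on its k nearest codewords."""
    d = np.linalg.norm(codebook - x, axis=1)
    nn = np.argsort(d)[:k]
    B = codebook[nn] - x                       # shifted local bases, (k, D)
    C = B @ B.T + lam * np.eye(k)              # regularized local covariance
    w = np.linalg.solve(C, np.ones(k))         # solve the small linear system
    w /= w.sum()                               # enforce the sum-to-one constraint
    code = np.zeros(len(codebook))
    code[nn] = w
    return code

rng = np.random.default_rng(0)
codebook = rng.standard_normal((1024, 128))
x = rng.standard_normal(128)
c = knn_locality_code(x, codebook)
print(np.count_nonzero(c), round(c.sum(), 6))   # 5 nonzeros, sums to 1
```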
Land-use classification of very high spatial resolution remote sensing (VHSR) imagery is one of the most challenging tasks in the field of remote sensing image processing. However, land-use classification is hard to address with land-cover classification techniques, due to the complexity of land-use scenes. Scene classification is considered one of the most promising ways to address the land-use classification issue. The commonly used scene classification methods for VHSR imagery are all derived from the computer vision community and mainly deal with terrestrial image recognition. Differing from terrestrial images, VHSR images are taken by looking down with airborne and spaceborne sensors, which leads to distinct lighting conditions and a distinct spatial configuration of land cover in VHSR imagery. Considering these characteristics, two questions should be answered: (1) Which type or combination of information is suitable for VHSR imagery scene classification? (2) Which scene classification algorithm is best for VHSR imagery? In this paper, an efficient spectral-structural bag-of-features scene classifier (SSBFC) is proposed to combine the spectral and structural information of VHSR imagery. SSBFC utilizes first- and second-order statistics (the mean and standard deviation values, MeanStd) as the statistical spectral descriptor for the spectral information of the VHSR imagery, and uses dense scale-invariant feature transform (SIFT) as the structural feature descriptor. The experimental results show that the spectral information works better than the structural information, while the combination of spectral and structural information is better than either single type of information. Taking the characteristics of the spatial configuration into consideration, SSBFC uses the whole image scene as the scope of the pooling operator, instead of the scope generated by a spatial pyramid (SP) commonly used in terrestrial image classification. The experimental
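A minimal sketch of the MeanStd spectral descriptor: per-band mean and standard deviation computed over dense patches of a multispectral scene. The dense SIFT branch and the codebook/pooling stages of SSBFC are not shown; the patch size and stride below are illustrative choices.

```python
import numpy as np

def meanstd_descriptors(image, patch=16, stride=8):
    """image: (H, W, B) array. Returns (N, 2*B) descriptors, one per patch."""
    H, W, B = image.shape
    descs = []
    for y in range(0, H - patch + 1, stride):
        for x in range(0, W - patch + 1, stride):
            p = image[y:y + patch, x:x + patch].reshape(-1, B)
            descs.append(np.concatenate([p.mean(axis=0), p.std(axis=0)]))
    return np.array(descs)

scene = np.random.default_rng(0).random((256, 256, 4))   # 4-band VHSR-like tile
d = meanstd_descriptors(scene)
print(d.shape)   # (31 * 31, 8)
```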
ISBN (print): 9781479999880
Visual analysis algorithms have mostly been developed for a centralized scenario where all visual data is acquired and processed at a central location. However, in visual sensor networks (VSN), constraints on computational power, energy, and bandwidth require a radically different approach, notably a paradigm shift from centralized to distributed visual processing. In the new paradigm, visual data is acquired and features are extracted at the sensing node locations and then transmitted to enable further analysis at some central location. In such a scenario, one of the key challenges is to design suitable feature coding schemes that are able to exploit the correlation among the features corresponding to (partially) overlapping views of the same visual scene. To achieve efficient coding, it is proposed to employ the distributed source coding paradigm, as it does not require any communication between the sensing nodes (rather expensive in VSN) and is parsimonious in terms of computational resources. Experimental results show that significant accuracy and compression gains (up to 37.36%) can be achieved when coding features extracted from multiple views.
ISBN (print): 9781467372589
Recently, the latest advances in compact feature representation and feature learning have provided an efficient framework for several visual analysis tasks, such as object recognition. However, when multiple cameras with overlapping fields of view are employed, other visual analysis tasks such as depth estimation can be supported and object recognition accuracy can be improved. In this paper, the problem of distributed visual analysis from multiple views of a scene is addressed, considering that computational power and bandwidth at each camera sensor are rather limited. More specifically, an efficient coding technique for local binary features is proposed which exploits the correlation, at the decoder side, between each descriptor and its quantized representation. Moreover, considering that descriptors representing the same visual feature across different views are well correlated, a technique to avoid the transmission of redundant descriptors from multiple views is proposed. At the decoder, the joint statistics of all descriptors from all views are used to drive the selection of the best descriptors to be transmitted by each sensing node. The proposed multi-view feature coding and selection techniques allow bitrate reductions of up to 80%, with respect to the uncompressed descriptor rate, for a given task accuracy.
Learning efficient image representations is at the core of the scene classification task for remote sensing imagery. The existing methods for solving the scene classification task, based either on feature coding approaches with low-level hand-engineered features or on unsupervised feature learning, can only generate mid-level image features with limited representative ability, which essentially prevents them from achieving better performance. Recently, deep convolutional neural networks (CNNs), which are hierarchical architectures trained on large-scale datasets, have shown astounding performance in object recognition and detection. However, it is still not clear how to use these deep convolutional neural networks for high-resolution remote sensing (HRRS) scene classification. In this paper, we investigate how to transfer features from these successfully pre-trained CNNs for HRRS scene classification. We propose two scenarios for generating image features via extracting CNN features from different layers. In the first scenario, the activation vectors extracted from fully-connected layers are regarded as the final image features; in the second scenario, we extract dense features from the last convolutional layer at multiple scales and then encode the dense features into global image features through commonly used feature coding approaches. Extensive experiments on two public scene classification datasets demonstrate that the image features obtained by the two proposed scenarios, even with a simple linear classifier, can result in remarkable performance and improve the state-of-the-art by a significant margin. The results reveal that the features from pre-trained CNNs generalize well to HRRS datasets and are more expressive than the low- and mid-level features. Moreover, we tentatively combine features extracted from different CNN models for better performance.
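A minimal sketch of the two feature-extraction scenarios, assuming a recent torchvision with pretrained ImageNet weights. Scenario 1 takes the 4096-d activation of the penultimate fully-connected layer as the image feature; scenario 2 takes dense 512-d activations from the last convolutional layer, which would then be fed to a feature-coding stage (BoW/VLAD/FK, not shown). The choice of VGG-16 is illustrative, not necessarily the paper's model.

```python
import torch
from torchvision.models import vgg16, VGG16_Weights

model = vgg16(weights=VGG16_Weights.IMAGENET1K_V1).eval()
img = torch.rand(1, 3, 224, 224)                    # stand-in for a preprocessed HRRS tile

with torch.no_grad():
    conv = model.features(img)                      # (1, 512, 7, 7) last-conv activations
    pooled = torch.flatten(model.avgpool(conv), 1)  # (1, 25088)
    fc_feat = model.classifier[:-1](pooled)         # scenario 1: (1, 4096) global feature
    dense = conv.flatten(2).squeeze(0).T            # scenario 2: (49, 512) dense descriptors

print(fc_feat.shape, dense.shape)
```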