Search results - Inner Mongolia University Library

Conference on computer vision and pattern recognition (CVPR)

Authors: Shijie Yu, Shihua Li, Dapeng Chen, Rui Zhao, Junjie Yan, Yu Qiao

Affiliations: ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Science; University of Chinese Academy of Sciences, China; Institute of Microelectronics of the Chinese Academy of Sciences

ISBN: (digital) 9781728171685

ISBN: (print) 9781728171692

Recent years have witnessed great progress in person re-identification (re-id). Several academic benchmarks such as Market1501, CUHK03 and DukeMTMC play important roles to promote the re-id research. To our best knowledge, all the existing benchmarks assume the same person will have the same clothes. While in real-world scenarios, it is very often for a person to change clothes. To address the clothes changing person re-id problem, we construct a novel large-scale re-id benchmark named Clothes Changing Person Set (COCAS), which provides multiple images of the same identity with different clothes. COCAS totally contains 62,382 body images from 5,266 persons. Based on COCAS, we introduce a new person re-id setting for clothes changing problem, where the query includes both a clothes template and a person image taking another clothes. Moreover, we propose a two-branch network named Biometric-Clothes Network (BC-Net) which can effectively integrate biometric and clothes feature for re-id under our setting. Experiments show that it is feasible for clothes changing re-id with clothes templates.

Keywords: Face, Feature extraction, Benchmark testing, Protocols, Biological system modeling, Shape

Locating high-density clusters with noisy queries

International Conference on pattern recognition

Authors: Chen Cao, Shifeng Chen, Changqing Zou, Jianzhuang Liu

Affiliations: Shenzhen Key Laboratory for Computer Vision and Pattern Recognition, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, China; Department of Information Engineering, Chinese University of Hong Kong, China

ISBN: (print) 9781467322164

Semi-supervised learning (SSL) relies on a few labeled samples to explore data's intrinsic structure through pairwise smooth transduction. The performance of SSL mainly depends on two folds: (1) the accuracy of labeled queries, (2) the integrity of manifolds in data distribution. Both of these qualities would be poor in real applications as data often consist of several irrelevant clusters and discrete noise. In this paper we propose a novel framework to simultaneously remove discrete noise and locate the high-density clusters. Experiments demonstrate that our algorithm is quite effective to solve several problems such as non-feedback image re-ranking and image co-segmentation.

Keywords: Noise, Noise measurement, Clustering algorithms, Databases, Vectors, Manifolds, Semisupervised learning

Attention-driven dynamic graph convolutional network for multi-label image recognition

arXiv, 2020

Authors: Ye, Jin; He, Junjun; Peng, Xiaojiang; Wu, Wenhao; Qiao, Yu

Affiliations: ShenZhen Key Lab of Computer Vision and Pattern Recognition, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; School of Biomedical Engineering, Institute of Medical Robotics, Shanghai Jiao Tong University, Shanghai, China

Recent studies often exploit Graph Convolutional Network (GCN) to model label dependencies to improve recognition accuracy for multi-label image recognition. However, constructing a graph by counting the label co-occurrence possibilities of the training data may degrade model generalizability, especially when there exist occasional co-occurrence objects in test images. Our goal is to eliminate such bias and enhance the robustness of the learnt features. To this end, we propose an Attention-Driven Dynamic Graph Convolutional Network (ADD-GCN) to dynamically generate a specific graph for each image. ADD-GCN adopts a Dynamic Graph Convolutional Network (D-GCN) to model the relation of content-aware category representations that are generated by a Semantic Attention Module (SAM). Extensive experiments on public multi-label benchmarks demonstrate the effectiveness of our method, which achieves mAPs of 85.2%, 96.0%, and 95.5% on MS-COCO, VOC2007, and VOC2012, respectively, and outperforms current state-of-the-art methods with a clear margin. All codes can be found at https://***/Yejin0111/ADD-GCN. Copyright © 2020, The Authors. All rights reserved.

Keywords: Semantics

COCAS: A large-scale clothes changing person dataset for re-identification

arXiv, 2020

Authors: Yu, Shijie; Li, Shihua; Chen, Dapeng; Zhao, Rui; Yan, Junjie; Qiao, Yu

Affiliations: ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Science; University of Chinese Academy of Sciences, China; Institute of Microelectronics of the Chinese Academy of Sciences

Recent years have witnessed great progress in person re-identification (re-id). Several academic benchmarks such as Market1501, CUHK03 and DukeMTMC play important roles to promote the re-id research. To our best knowledge, all the existing benchmarks assume the same person will have the same clothes. While in real-world scenarios, it is very often for a person to change clothes. To address the clothes changing person re-id problem, we construct a novel large-scale re-id benchmark named ClOthes ChAnging Person Set (COCAS), which provides multiple images of the same identity with different clothes. COCAS totally contains 62,382 body images from 5,266 persons. Based on COCAS, we introduce a new person re-id setting for clothes changing problem, where the query includes both a clothes template and a person image taking another clothes. Moreover, we propose a two-branch network named Biometric-Clothes Network (BC-Net) which can effectively integrate biometric and clothes feature for re-id under our setting. Experiments show that it is feasible for clothes changing re-id with clothes templates. Copyright © 2020, The Authors. All rights reserved.

Keywords: Large dataset

The Equipment Nameplate Dataset for Scene Text Detection and recognition

IEEE International Conference on Robotics and Biomimetics

Authors: Xiaolong Chen, Zhengfu Zhang, Yu Qiao, Pu Zhang, Lanqing Guo, Wenrui Chen, Chen Chen, Bin Fu

Affiliations: Guangzhou Power Supply Bureau Co. Ltd., Guangzhou, China; ShenZhen Key Lab of Computer Vision and Pattern Recognition, SIAT-SenseTime Joint Lab, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences

In this paper, we introduce the Equipment Nameplate Dataset, a large dataset for scene text detection and recognition. Natural images in this dataset are taken in the wild and thus this dataset includes various intra-class inconsistency such as ill illumination conditions and partly occluded, which makes our dataset more challenging than other datasets. In order to make people train detection and recognition model separately, we annotate our dataset not only word instance, but also text region by using rectangle bounding boxes. Some detailed statistics information about our dataset will be given so that people can use them to analyse and develop their own models. Moreover, we use our dataset to test some famous detection and recognition models and present the corresponding results in order to make researcher compare them with their own models. Dataset will be publicly available on the website.

A New Forged Handwriting Detection Method Based on Fourier Spectral Density and Variation

5th Asian Conference on pattern recognition, ACPR 2019

Authors: Kundu, Sayani; Shivakumara, Palaiahnakote; Grouver, Anaica; Pal, Umapada; Lu, Tong; Blumenstein, Michael

Affiliations: Computer Vision and Pattern Recognition Unit, Indian Statistical Institute Kolkata, Kolkata, India; Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia; National Key Lab for Novel Software Technology, Nanjing University, Nanjing, China; Faculty of Engineering and Information Technology, University of Technology Sydney, Ultimo, Australia

ISBN: (print) 9783030414030

Use of handwriting words for person identification in contrast to biometric features is gaining importance in the field of forensic applications. As a result, forging handwriting is a part of crime applications and hence is challenging for the researchers. This paper presents a new work for detecting forged handwriting words because width and amplitude of spectral distributions have the ability to exhibit unique properties for forged handwriting words compared to blurred, noisy and normal handwriting words. The proposed method studies spectral density and variation of input handwriting images through clustering of high and low frequency coefficients. The extracted features, which are invariant to rotation and scaling, are passed to a neural network classifier for the classification for forged handwriting words from other types of handwriting words (like blurred, noisy and normal handwriting words). Experimental results on our own dataset, which consists of four handwriting word classes, and two benchmark datasets, namely, caption and scene text classification and forged IMEI number dataset, show that the proposed method outperforms the existing methods in terms of classification rate. © Springer Nature Switzerland AG 2020.

Keywords: Spectral density

PC-HMR: Pose calibration for 3d human mesh recovery from 2D images/videos

arXiv, 2021

Authors: Luan, Tianyu; Wang, Yali; Zhang, Junhao; Wang, Zhe; Zhou, Zhipeng; Qiao, Yu

Affiliations: ShenZhen Key Lab of Computer Vision and Pattern Recognition, Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences, Shenzhen, China; SIAT Branch, Shenzhen Institute of Artificial Intelligence and Robotics for Society, China; University of California, Irvine, United States

The end-to-end Human Mesh Recovery (HMR) approach (Kanazawa et al. 2018) has been successfully used for 3D body reconstruction. However, most HMR-based frameworks reconstruct human body by directly learning mesh parameters from images or videos, while lacking explicit guidance of 3D human pose in visual data. As a result, the generated mesh often exhibits incorrect pose for complex activities. To tackle this problem, we propose to exploit 3D pose to calibrate human mesh. Specifically, we develop two novel Pose Calibration frameworks, i.e., Serial PC-HMR and Parallel PC-HMR. By coupling advanced 3D pose estimators and HMR in a serial or parallel manner, these two frameworks can effectively correct human mesh with guidance of a concise pose calibration module. Furthermore, since the calibration module is designed via non-rigid pose transformation, our PCHMR frameworks can flexibly tackle bone length variations to alleviate misplacement in the calibrated mesh. Finally, our frameworks are based on generic and complementary integration of data-driven learning and geometrical modeling. Via plug-and-play modules, they can be efficiently adapted for both image/video-based human mesh recovery. Additionally, they have no requirement of extra 3D pose annotations in the testing phase, which releases inference difficulties in practice. We perform extensive experiments on the popular benchmarks, i.e., Human3.6M, 3DPW and SURREAL, where our PC-HMR frameworks achieve the SOTA results. Copyright © 2021, The Authors. All rights reserved.

Keywords: Mesh generation

Activating More Pixels in Image Super-Resolution Transformer

Conference on computer vision and pattern recognition (CVPR)

Authors: Xiangyu Chen, Xintao Wang, Jiantao Zhou, Yu Qiao, Chao Dong

Affiliations: State Key Laboratory of Internet of Things for Smart City, University of Macau; Shenzhen Key Lab of Computer Vision and Pattern Recognition, Shenzhen Institute of Advanced Technology, Chinese Academy of Sciences; Shanghai Artificial Intelligence Laboratory; ARC Lab, Tencent PCG

Transformer-based methods have shown impressive performance in low-level vision tasks, such as image super-resolution. However, we find that these networks can only utilize a limited spatial range of input information through attribution analysis. This implies that the potential of Transformer is still not fully exploited in existing networks. In order to activate more input pixels for better reconstruction, we propose a novel Hybrid Attention Transformer (HAT). It combines both channel attention and window-based self-attention schemes, thus making use of their complementary advantages of being able to utilize global statistics and strong local fitting capability. Moreover, to better aggregate the cross-window information, we introduce an overlapping cross-attention module to enhance the interaction between neighboring window features. In the training stage, we additionally adopt a same-task pre-training strategy to exploit the potential of the model for further improvement. Extensive experiments show the effectiveness of the proposed modules, and we further scale up the model to demonstrate that the performance of this task can be greatly improved. Our overall method significantly outperforms the state-of-the-art methods by more than 1dB.

A Spatial Density and Phase Angle Based Correlation for Multi-type Family Photo Identification

5th Asian Conference on pattern recognition, ACPR 2019

Authors: Grouver, Anaica; Shivakumara, Palaiahnakote; Kaljahi, Maryam Asadzadeh; Chetty, Bhaarat; Pal, Umapada; Lu, Tong; Hemantha Kumar, G.

Affiliations: Faculty of Computer Science and Information Technology, University of Malaya, Kuala Lumpur, Malaysia; Google Developers Group NASDAQ, Bangalore, India; Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata, India; National Key Lab for Novel Software Technology, Nanjing University, Nanjing, China; University of Mysore, Mysore, Karnataka, India

ISBN: (print) 9783030412982

Due to change in mindset and living style of humans, the numbers of diversified marriages are increasing all around the world irrespective of race, color, religion and culture. As a result, it is challenging for research community to identify multi type family photos, namely, normal family (family of the same race, religion or culture), multi-culture family (family of different culture, religion or race) from the family and non-family photos (images with friends, colleagues, etc.). In this work, we present a new method that combines spatial density information with phase angle for multi-type family photo classification. The proposed method uses three facial key points, namely, left-eye, right-eye and nose, for the features which are based on color, roughness and wrinkleless of faces, these are prominent for extracting unique cues for classification. The correlations between features of Left & Right Eyes, Left Eye & Nose and Right Eye & Nose are computed for all the faces in an image. This results in feature vectors for respective spatial density and phase angle information. Furthermore, the proposed method fuses the feature vectors and feeds them to the Convolutional Neural Network (CNN) for the classification of the above-three class problem. Experiments conducted on our database which contains three classes, namely, multi-cultural, normal and non-family images and the benchmark databases (due to Maryam et al. and Wang et al.) which contain two class-family and non-family images, show that the proposed method outperforms the existing methods in terms of classification rate for all the three databases. © 2020, Springer Nature Switzerland AG.

Keywords: Database systems

3D reconstruction based on light field information

International Conference on Information and Automation (ICIA)

Authors: Yan Zhou, Huiwen Guo, Ruiqing Fu, Guoyuan Liang, Can Wang, Xinyu Wu

Affiliations: Shenzhen Key Lab for Computer Vision and Pattern Recognition, Chinese Academy of Sciences; Shenzhen College of Advanced Technology, University of Chinese Academy of Sciences; Dept. Mechanical and Automation Engineering, The Chinese University of Hong Kong

ISBN: (print) 9781467391054

As an important branch of computational photography, light field photography combines the hardware design of optical system with key algorithm of signal processing quite well. Unlike traditional photography which can only record light ray's two-dimensional position, light field photography system can record four-dimensional position and direction. Therefore, much more image information can be obtained from light field photography. With the development of 3D display technology, light field based autofocus and 3D display technology is becoming more and more popular. In this paper, a light field based new 3D reconstruction algorithm for buildings and office environment is proposed by applying Wavelet Transform and SVM (Support Vector Machine) model to obtain the image focusing quality assessment, along with the Mean Shift Algorithm and Random Field Model to get the depth map of the scene. Firstly, light field image is captured by using a light field camera. Secondly, we use frequency domain digital refocus algorithm to manipulate light field image and obtain several serialized refocused images with different focus. Thirdly, wavelet features are extracted from each refocused image, and then an image focusing quality assessment is conducted by using RBF (Radial Basis Function) kernel based SVM model. Finally, we use Mean Shift algorithm to realize color clustering of the original light field image, and then build MRF (Markov Random Field) Model with color nodes. By iterating the likelihood depth result obtained from real scenario depth calibrations according to image focusing quality assessment, finally the depth map of the scene is reconstructed. Experiments are conducted to prove the feasibility of the proposed 3D reconstructed algorithm based on light field. And the experimental results on real datasets demonstrate good performance of this algorithm.

Keywords: Three-dimensional displays, Cameras, Photography, Lenses, Wavelet transforms, Focusing
