Search Results - Inner Mongolia University Library

arXiv, 2019

Authors: Zhang, Shifeng; Liu, Ajian; Wan, Jun; Liang, Yanyan; Guo, Guogong; Escalera, Sergio; Escalante, Hugo Jair; Li, Stan Z.

Affiliations: National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences; University of Chinese Academy of Sciences, Beijing, China; Macau University of Science and Technology, Macau, China; Institute of Deep Learning, Baidu Research and National Engineering Laboratory for Deep Learning Technology and Application; Universitat de Barcelona, Computer Vision Center, Barcelona, Catalonia; Instituto Nacional de Astrofísica, Óptica y Electrónica, Puebla 72840, Mexico

Face anti-spoofing is essential to protect face recognition systems from security breaches. Much of the recent progress has been enabled by the availability of face anti-spoofing benchmark datasets. However, existing face anti-spoofing benchmarks have a limited number of subjects (≤170) and modalities (≤2), which hinders the further development of the academic community. To facilitate face anti-spoofing research, we introduce a large-scale multi-modal dataset, namely CASIA-SURF, which is the largest publicly available dataset for face anti-spoofing in terms of both subjects and modalities. Specifically, it consists of 1,000 subjects with 21,000 videos, and each sample has 3 modalities (i.e., RGB, Depth and IR). We also provide comprehensive evaluation metrics, diverse evaluation protocols, training/validation/testing subsets and a measurement tool, developing a new benchmark for face anti-spoofing. Moreover, we present a novel multi-modal multi-scale fusion method as a strong baseline, which performs feature re-weighting to select the more informative channel features while suppressing the less useful ones for each modality across different scales. Extensive experiments have been conducted on the proposed dataset to verify its significance and generalization capability. The dataset is available at http://***/***/chalearnfacespoofingattackdete/. Copyright © 2019, The Authors. All rights reserved.

Keywords: Face recognition

ICDAR2019 Robust Reading Challenge on Multi-lingual Scene Text Detection and Recognition - RRC-MLT-2019

International Conference on Document Analysis and Recognition

Authors: Nibal Nayef; Yash Patel; Michal Busta; Pinaki Nath Chowdhury; Dimosthenis Karatzas; Wafa Khlif; Jiri Matas; Umapada Pal; Jean-Christophe Burie; Cheng-lin Liu; Jean-Marc Ogier

Affiliations: no affiliation; The Robotics Institute, Carnegie Mellon University, Pittsburgh, USA; Department of Cybernetics, Czech Technical University, Prague, Czech Republic; CVPR unit, Indian Statistical Institute, India; Computer Vision Center, Universitat Autònoma de Barcelona, Spain; L3i Laboratory, University of La Rochelle, France; National Laboratory of Pattern Recognition, Institute of Automation of Chinese Academy of Sciences, China

With the growing cosmopolitan culture of modern cities, the need for robust Multi-Lingual scene Text (MLT) detection and recognition systems has never been greater. With the goal of systematically benchmarking and pushing the state of the art forward, the proposed competition builds on RRC-MLT-2017 with an additional end-to-end task, an additional language in the real images dataset, a large-scale multi-lingual synthetic dataset to assist the training, and a baseline End-to-End recognition method. The real dataset consists of 20,000 images containing text from 10 languages. The challenge has 4 tasks covering various aspects of multi-lingual scene text: (a) text detection, (b) cropped word script classification, (c) joint text detection and script classification, and (d) end-to-end detection and recognition. In total, the competition received 60 submissions from the research and industrial communities. This paper presents the dataset, the tasks and the findings of the RRC-MLT-2019 challenge.

Keywords: Task analysis; Text recognition; Training; Benchmark testing; Protocols; Rendering (computer graphics)

ICDAR2019 Robust reading challenge on multi-lingual scene text detection and recognition – RRC-MLT-2019

arXiv, 2019

Authors: Nayef, Nibal; Patel, Yash; Busta, Michal; Chowdhury, Pinaki Nath; Karatzas, Dimosthenis; Khlif, Wafa; Matas, Jiri; Pal, Umapada; Burie, Jean-Christophe; Liu, Cheng-lin; Ogier, Jean-Marc

Affiliations: L3i Laboratory, University of La Rochelle, France; Computer Vision Center, Universitat Autònoma de Barcelona, Spain; CVPR unit, Indian Statistical Institute, India; Robotics Institute, Carnegie Mellon University, Pittsburgh, United States; Center for Machine Perception, Department of Cybernetics, Czech Technical University, Prague, Czech Republic; National Laboratory of Pattern Recognition, Institute of Automation of Chinese Academy of Sciences, China

Keywords: Large dataset

Deep convolution neural networks cascaded improved boosted forest for pedestrian detection

Journal of Computers (Taiwan), 2018, vol. 29, no. 5, pp. 15-28

Authors: Xu, Zhi-Tong; Luo, Yan-Min; Liu, Pei-Zhong; Du, Yong-Zhao

Affiliations: College of Computer Science and Technology, Huaqiao University, Xiamen 361021, China; Key Laboratory for Computer Vision and Pattern Recognition of Xiamen City, Huaqiao University, Xiamen 361021, China; College of Engineering, Huaqiao University, Quanzhou 362000, China

Because small pedestrians have relatively low resolution and hard negative backgrounds closely resemble people, detecting small pedestrians, or pedestrians against hard negative backgrounds, remains a challenging problem in computer vision. To address these problems effectively, we propose a novel deep convolutional neural network cascaded with an improved boosted forest classifier for pedestrian detection. First, a selective search method proposes pedestrian candidate boxes with confidence scores while retaining as much image resolution as possible; then, based on these confidence values, a convolutional neural network model extracts feature maps of the candidate regions; finally, we improve the boosted forest classifier and cascade it to classify the candidate boxes, achieving efficient pedestrian detection. Extensive experiments on the Caltech and KITTI benchmarks demonstrate that the proposed method outperforms the state of the art, achieving promising precision on KITTI and the lowest miss rate of 11.53% on Caltech, outperforming the second-best method (CompACT-Deep) by 0.17%. © 2018 Computer Society of the Republic of China. All rights reserved.

Keywords: Image resolution

Efficient Audio-Visual Speaker Recognition via Deep Heterogeneous Feature Fusion

12th Chinese Conference on Biometric Recognition, CCBR 2017

Authors: Liu, Yu-Hang; Liu, Xin; Fan, Wentao; Zhong, Bineng; Du, Ji-Xiang

Affiliations: Department of Computer Science, Huaqiao University, Xiamen 361021, China; Xiamen Key Laboratory of Computer Vision and Pattern Recognition, Huaqiao University, Xiamen 361021, China

ISBN: (print) 9783319699226

Audio-visual speaker recognition (AVSR) has long been an active research area, primarily because it provides complementary information for reliable access control in biometric systems, and it is a challenging problem mainly owing to its multimodal nature. In this paper, we present an efficient audio-visual speaker recognition approach via deep heterogeneous feature fusion. First, we exploit a dual-branch deep convolutional neural network (CNN) learning framework to extract and fuse the high-level semantic features of face and audio data. Further, by considering the temporal dependency of audio-visual data, we embed the fused features into a bidirectional Long Short-Term Memory (LSTM) network to produce the recognition result, through which speakers recorded under different challenging conditions can be well identified. The experimental results demonstrate the efficiency of our proposed approach in both audio-visual feature fusion and speaker recognition. © 2017, Springer International Publishing AG.

Keywords: Long short-term memory

ChaLearn looking at people: IsoGD and ConGD large-scale RGB-D gesture recognition

arXiv, 2019

Authors: Wan, Jun; Lin, Chi; Wen, Longyin; Li, Yunan; Miao, Qiguang; Escalera, Sergio; Anbarjafari, Gholamreza; Guyon, Isabelle; Guo, Guodong; Li, Stan Z.

Affiliations: National Laboratory of Pattern Recognition, Institute of Automation, Chinese Academy of Sciences, Beijing 100190, China; JD Finance, Mountain View, CA, United States; University of Southern California, Los Angeles, CA 90089-0911, United States; School of Computer Science and Technology, Xidian University & Xi'an Key Laboratory of Big Data and Intelligent Vision, 2nd South Taibai Road, Xi'an 710071, China; Universitat de Barcelona, Computer Vision Center, Spain; iCV Lab, Institute of Technology, University of Tartu, Estonia; Faculty of Engineering, Hasan Kalyoncu University, Gaziantep, Turkey; Institute of Digital Technologies, Loughborough University London, United Kingdom; ChaLearn, United States; University Paris-Saclay, France; Institute of Deep Learning, Baidu Research; National Engineering Laboratory for Deep Learning Technology and Application, China

The ChaLearn large-scale gesture recognition challenge has been run twice, in two workshops held in conjunction with the International Conference on Pattern Recognition (ICPR) 2016 and the International Conference on Computer Vision (ICCV) 2017, attracting more than 200 teams around the world. This challenge has two tracks, focusing on isolated and continuous gesture recognition, respectively. This paper describes the creation of both benchmark datasets and analyzes the advances in large-scale gesture recognition based on these two datasets. We discuss the challenges of collecting large-scale ground-truth annotations for gesture recognition, and provide a detailed analysis of the current state-of-the-art methods for large-scale isolated and continuous gesture recognition based on RGB-D video sequences. In addition to the recognition rate and mean Jaccard index (MJI) used as evaluation metrics in our previous challenges, we also introduce the corrected segmentation rate (CSR) metric to evaluate the performance of temporal segmentation for continuous gesture recognition. Furthermore, we propose a bidirectional long short-term memory (Bi-LSTM) baseline method, which determines the video division points based on the skeleton points extracted by a convolutional pose machine (CPM). Experiments demonstrate that the proposed Bi-LSTM outperforms the state-of-the-art methods with an absolute improvement of 8.1% (from 0.8917 to 0.9639) of CSR. Copyright © 2019, The Authors. All rights reserved.
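As a rough illustration of the mean Jaccard index mentioned in this abstract, the sketch below scores a predicted temporal segmentation against ground truth by frame overlap per gesture label. It is a minimal sketch under stated assumptions: the function names, the dictionary layout, and the plain per-label averaging are hypothetical choices for illustration, and the challenge's own definition may normalise differently.

```python
def jaccard_index(pred_frames, true_frames):
    """Overlap of two frame sets: |intersection| / |union|."""
    pred, true = set(pred_frames), set(true_frames)
    union = pred | true
    if not union:
        return 0.0
    return len(pred & true) / len(union)


def mean_jaccard_index(pred_by_label, true_by_label):
    """Average the per-label Jaccard index over every label that appears
    in either the prediction or the ground truth (an assumed convention)."""
    labels = set(pred_by_label) | set(true_by_label)
    if not labels:
        return 0.0
    scores = [
        jaccard_index(pred_by_label.get(label, ()), true_by_label.get(label, ()))
        for label in labels
    ]
    return sum(scores) / len(scores)


# Hypothetical 20-frame sequence containing two gestures.
truth = {"wave": range(0, 10), "point": range(10, 20)}
pred = {"wave": range(0, 8), "point": range(12, 20)}
score = mean_jaccard_index(pred, truth)
```

In this toy case each gesture overlaps on 8 of 10 frames, so both per-label scores, and therefore the mean, come out at 0.8.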

Keywords: Long short-term memory

Human motion correction and representation method from motion camera

Journal of Engineering, 2017, vol. 1, no. 1, pp. 370-375

Authors: Zhang, Hong-Bo; Guo, Feng; Zhang, Miaohui; Lin, Ying; Hsiao, Tsung-Chih

Affiliations: Department of Computer Science and Technology, Huaqiao University, Xiamen, China; Xiamen Key Laboratory of Computer Vision and Pattern Recognition, Huaqiao University, Xiamen, China; School of Information Science and Engineering, Xiamen University, Xiamen, China; Institute of Energy, Jiangxi Academy of Sciences, Jiangxi Province, China

Motion estimation is a basic issue for many computer vision tasks, such as human-computer interaction, moving object detection and intelligent robots. In many practical scenes, the object movement is accompanied by camera motion. Generally, motion descriptors based directly on optical flow are inaccurate and have low discriminative power. To this end, this study proposes a novel motion correction method and a novel motion feature descriptor, called the motion difference histogram (MDH), for recognising human action. Motion estimation results are corrected by background motion estimation, and MDH encodes the motion difference between the background and the objects. Experimental results on video shot with camera motion show that the proposed motion correction method is effective and that the recognition accuracy of MDH is better than that of the state-of-the-art motion descriptor.

Keywords: Motion estimation

FWLBP: A scale invariant descriptor for texture classification

arXiv, 2018

Authors: Roy, Swalpa Kumar; Bhattacharya, Nilavra; Chanda, Bhabatosh; Chaudhuri, Bidyut B.; Ghosh, Dipak Kumar

Affiliations: Optical Character Recognition Laboratory, Computer Vision and Pattern Recognition Unit, Indian Statistical Institute, Kolkata 700108, India; School of Information, University of Texas, Austin, TX 78712, United States; Image Processing Laboratory, Electronics and Communication Sciences Unit, Indian Statistical Institute, Kolkata 700108, India; Department of Electronics and Communication Engineering, National Institute of Technology Rourkela, Rourkela 769008, India

In this paper we propose a novel texture descriptor called Fractal Weighted Local Binary Pattern (FWLBP). The fractal dimension (FD) measure is relatively invariant to scale changes and correlates well with the human perception of surface roughness. We have utilized this property to construct a scale-invariant descriptor. Here, the input image is sampled using an augmented form of the local binary pattern (LBP) over three different radii, and an indexing operation is then used to assign FD weights to the collected samples. The final histogram of the descriptor has its features calculated using LBP, and its weights computed from the FD image. The proposed descriptor is scale invariant, robust to rotation or reflection, and partially tolerant to noise and illumination changes. In addition, the local fractal dimension is relatively insensitive to bi-Lipschitz transformations, whereas its extension is adequate to precisely discriminate the fundamental of texture primitives. Experimental results on standard texture databases show that the proposed descriptor achieves better classification rates than the state-of-the-art descriptors. Copyright © 2018, The Authors. All rights reserved.
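For readers unfamiliar with the local binary pattern that this abstract builds on, the following is a minimal sketch of the plain radius-1, 8-neighbour LBP and its histogram. It does not reproduce the paper's augmented multi-radius sampling or the fractal-dimension weighting; the function names, the neighbour ordering, and the `>=` comparison are assumptions made for illustration.

```python
# Neighbour offsets (row, col), clockwise from the top-left pixel.
# The ordering is an assumed convention; other orderings permute the bits.
NEIGHBOUR_OFFSETS = [
    (-1, -1), (-1, 0), (-1, 1), (0, 1),
    (1, 1), (1, 0), (1, -1), (0, -1),
]


def lbp_code(image, row, col):
    """Return the 8-bit LBP code of the pixel at (row, col).

    Each neighbour contributes one bit, set when the neighbour is at
    least as bright as the centre pixel.
    """
    center = image[row][col]
    code = 0
    for bit, (d_row, d_col) in enumerate(NEIGHBOUR_OFFSETS):
        if image[row + d_row][col + d_col] >= center:
            code |= 1 << bit
    return code


def lbp_histogram(image):
    """Count LBP codes over all interior pixels (unweighted histogram)."""
    histogram = [0] * 256
    for row in range(1, len(image) - 1):
        for col in range(1, len(image[0]) - 1):
            histogram[lbp_code(image, row, col)] += 1
    return histogram


# Hypothetical 3x3 grey-level patch with a single interior pixel.
patch = [
    [5, 9, 1],
    [4, 6, 7],
    [2, 3, 8],
]
code = lbp_code(patch, 1, 1)
histogram = lbp_histogram(patch)
```

A weighted variant in the spirit of the abstract would add a per-pixel weight to the histogram bin instead of 1; how those weights are derived from the FD image is specific to the paper and is not sketched here.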

Keywords: Fractal dimension

Author Correction: Development and clinical deployment of a smartphone-based visual field deep learning system for glaucoma detection

NPJ Digital Medicine, 2022, vol. 5, no. 1, p. 38

Authors: Fei Li; Diping Song; Han Chen; Jian Xiong; Xingyi Li; Hua Zhong; Guangxian Tang; Sujie Fan; Dennis S C Lam; Weihua Pan; Yajuan Zheng; Ying Li; Guoxiang Qu; Junjun He; Zhe Wang; Ling Jin; Rouxi Zhou; Yunhe Song; Yi Sun; Weijing Cheng; Chunman Yang; Yazhi Fan; Yingjie Li; Hengli Zhang; Ye Yuan; Yang Xu; Yunfan Xiong; Lingfei Jin; Aiguo Lv; Lingzhi Niu; Yuhong Liu; Shaoli Li; Jiani Zhang; Linda M Zangwill; Alejandro F Frangi; Tin Aung; Ching-Yu Cheng; Yu Qiao; Xiulan Zhang; Daniel S W Ting

Affiliations: State Key Laboratory of Ophthalmology, Zhongshan Ophthalmic Center, Sun Yat-Sen University, Guangzhou, People's Republic of China. ShenZhen Key Lab of Computer Vision and Pattern Recognition, Shenzhen Institutes of Advanced Technology, The Chinese Academy of Sciences, Shenzhen, People's Republic of China. University of Chinese Academy of Sciences, Beijing, People's Republic of China. Department of Ophthalmology, The First Affiliated Hospital of Kunming Medical University, Kunming, People's Republic of China. The First Hospital of Shijiazhuang City, Shijiazhuang, People's Republic of China. gxtykyy@***. Handan City Eye Hospital, Handan, People's Republic of China. C-MER (Shenzhen) Dennis Lam Eye Hospital, International Eye Research Institute of The Chinese University of Hong Kong (Shenzhen), Shenzhen, People's Republic of China. The Eye Hospital, WMU at Hangzhou, Hangzhou, People's Republic of China. Department of Ophthalmology, The Second Hospital of Jilin University, Changchun, People's Republic of China. SenseTime Group Limited, Hong Kong, People's Republic of China. Department of Ophthalmology, The Second Affiliated Hospital of Guizhou Medical University, Kaili, People's Republic of China. Department of Ophthalmology, The Second Affiliated Hospital of Xi'an Jiaotong University, Xi'an, People's Republic of China. Department of Ophthalmology, The Third Affiliated Hospital of Nanchang University, Nanchang, People's Republic of China. The First Hospital of Shijiazhuang City, Shijiazhuang, People's Republic of China. Hamilton Glaucoma Center, Shiley Eye Institute, Viterbi Family Department of Ophthalmology, UC San Diego, La Jolla, CA, United States. CISTIB Center for Computational Imaging and Simulation Technologies in Biomedicine, Schools of Computing and Medicine, University of Leeds, Leeds, UK. Singapore Eye Research Institute and Singapore National Eye Centre, Singapore, Singapore. ShenZhen Key Lab of Computer Vision and Pattern Recognition, Shenzhen Institutes of Advanced Technology, The Chinese Academy of Sci

Edgy salient local binary patterns in inter-plane relationship for image retrieval in Diabetic Retinopathy

Procedia Computer Science, 2017, vol. 115, pp. 440-447

Authors: Gajanan M. Galshetwar; Laxman M. Waghmare; Anil B. Gonde; Subrahmanyam Murala

Affiliations: Center of Excellence in Signal and Image Processing (COESIP), Department of ECE, SGGSIET, Nanded, Maharashtra 431606, India; Computer Vision and Pattern Recognition Laboratory, Department of Electrical Engineering, IIT Ropar, Rupnagar 140001, India

In this paper, a novel approach for content-based image retrieval (CBIR) in diabetic retinopathy (DR) is proposed. It combines salient point selection with an inter-plane relationship technique. Salient points are selected from the edgy image, and Local Binary Patterns (LBPs) are then calculated using the inter-plane relationship, with each salient point as the center pixel. The approach improves the results by using color features in combination with LBP features. Experiments on the MESSIDOR database of 1200 retinal images show that the proposed approach achieves an average precision of 57.82%, compared with 53.70% for the earlier approach.

Keywords: Content-Based Image Retrieval (CBIR); Diabetic Retinopathy (DR); Edgy Salient Points; Local Binary Patterns (LBPs)
