Search results - Inner Mongolia University Library

Object tracking within the framework of concept drift

11th Asian Conference on Computer Vision, ACCV 2012

Authors: Chen, Li; Zhou, Yue; Yang, Jie. Affiliations: Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai 200240, China; Key Laboratory of System Control and Information Processing, Ministry of Education of China, Shanghai Jiao Tong University, Shanghai 200240, China

ISBN: (print) 9783642374302

It is well known that backgrounds and targets constantly change in real scenes, which weakens the effectiveness of classical tracking algorithms because of frequent model mismatches. In this paper, an object tracking algorithm within the framework of concept drift is proposed to solve this problem. We detect the drift points using a simple message-passing algorithm based on a Bayesian approach. The analyzed probability distribution lays the foundation for the self-adaptation of our new model. Our tracking algorithm within the framework of concept drift improves tracking robustness and accuracy, as illustrated by two experiments on real-world changing scenes. © 2013 Springer-Verlag.

Keywords: Probability distributions

Pathology study for blood vessel of ocular fundus images by photoacoustic tomography

IEEE International Ultrasonics Symposium (IUS)

Authors: Jiayao Zhang; Kai Deng; Bin Chen; Hengrong Lan; Meng Zhou; Fei Gao. Affiliations: The Hybrid Imaging System Laboratory, ShanghaiTech University, Shanghai, China; The Pattern Recognition and Intelligent System Laboratory, Beijing University of Posts and Telecommunications, Beijing, China

Among the diabetic population, more than 50% of patients have diabetic retinopathy, and the longer the duration of diabetes, the higher the incidence of retinopathy and the rate of blindness. Moreover, the blood vessels of the ocular fundus are the only blood vessels that can be directly observed, which gives them excellent application value in medical diagnostics. Photoacoustic Tomography (PAT) is an emerging technique that can obtain high-resolution 3D in-vivo images of optical absorption by sensing laser-generated ultrasound. In this paper, we apply a U-Net neural network to segment the blood vessels in ocular fundus images, which opens up new methods for fundus medical image processing. We then use 2D time-reversal photoacoustic simulation based on the k-Wave MATLAB toolbox to convert the segmented vessel images into photoacoustic images. Finally, we use a ResNet network for the diagnosis of diabetes, with photoacoustic images of the segmented fundus vessels from healthy subjects and patients as input data. We achieved 85% accuracy with 158 training samples. These results demonstrate the power of deep learning for the analysis of diabetes through photoacoustic images of segmented fundus blood vessels.

Keywords: Economic indicators

Challenging the recognition of facial expression via deep learning

Authors: Hu, De Kun; Liu, Yong Hong; Zhang, Li; Duan, Gui Duo. Affiliations: Key Laboratory of Pattern Recognition and Intelligent Information Processing, Institutions of Higher Education of Sichuan Province, Chengdu University, Chengdu 610106, China; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China

ISBN: (print) 9783038351399

A deep neural network model was trained to classify facial expressions in unconstrained images. The model comprises nine layers, including input, convolutional, pooling, fully connected, and output layers. To optimize the model, rectified linear units for the nonlinear transformation, weight sharing for reducing complexity, "mean" and "max" pooling for subsampling, and "dropout" for sparsity are applied in the forward pass. With a large number of hard training faces, the model was trained via backpropagation with stochastic gradient descent. The results show that the proposed model achieves excellent performance. © (2014) Trans Tech Publications, Switzerland.

Keywords: Neural network models
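
The forward-pass operations named in the abstract above (rectified linear units, "max" and "mean" pooling, and dropout) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the 4x4 feature map, the 2x2 window, the 0.5 dropout rate, and the inverted-dropout rescaling are all assumptions chosen for illustration.

```python
import random

def relu(values):
    """Rectified linear unit: negative activations become zero."""
    return [max(0.0, v) for v in values]

def pool2d(image, size, mode="max"):
    """Non-overlapping size x size pooling over a 2D list.
    Assumes both dimensions are divisible by `size`."""
    rows, cols = len(image), len(image[0])
    out = []
    for r in range(0, rows, size):
        out_row = []
        for c in range(0, cols, size):
            window = [image[r + i][c + j] for i in range(size) for j in range(size)]
            out_row.append(max(window) if mode == "max" else sum(window) / len(window))
        out.append(out_row)
    return out

def dropout(values, rate, rng):
    """Inverted dropout (an assumption): zero each unit with probability
    `rate` and rescale the surviving units by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]

# Hypothetical 4x4 feature map used only for this example.
feature_map = [[1.0, -2.0, 3.0, 0.0],
               [4.0, 5.0, -6.0, 7.0],
               [0.0, 1.0, 2.0, 3.0],
               [-1.0, -1.0, 8.0, -4.0]]
activated = [relu(row) for row in feature_map]
max_pooled = pool2d(activated, 2, mode="max")    # [[5.0, 7.0], [1.0, 8.0]]
mean_pooled = pool2d(activated, 2, mode="mean")  # [[2.5, 2.5], [0.25, 3.25]]
dropped = dropout([1.0, 2.0, 3.0, 4.0], rate=0.5, rng=random.Random(0))
```

Dropout is applied only during training; at inference time the activations would pass through unchanged.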

Channel DropBlock: An improved regularization method for fine-grained visual classification

arXiv, 2021

Authors: Ding, Yifeng; Dong, Shuwei; Tong, Yujun; Ma, Zhanyu; Xiao, Bo; Ling, Haibin. Affiliations: Pattern Recognition and Intelligent System Laboratory, Beijing University of Posts and Telecommunications; Department of Computer Science, Stony Brook University

Classifying the sub-categories of an object from the same super-category (e.g., bird) in a fine-grained visual classification (FGVC) task relies heavily on mining multiple discriminative features. Existing approaches mainly tackle this problem by introducing attention mechanisms to locate the discriminative parts or feature encoding approaches to extract the highly parameterized features in a weakly-supervised fashion. In this work, we propose a lightweight yet effective regularization method named Channel DropBlock (CDB), in combination with two alternative correlation metrics, to address this problem. The key idea is to randomly mask out a group of correlated channels during training to keep features from co-adapting and thus enhance feature representations. Extensive experiments on three benchmark FGVC datasets show that CDB effectively improves the performance. Copyright © 2021, The Authors. All rights reserved.

Keywords: Benchmarking
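
The key idea stated in the Channel DropBlock abstract above, masking a randomly chosen group of correlated channels during training, can be sketched roughly as follows. The abstract does not specify the two correlation metrics, so cosine similarity between flattened channel activations is used here as a stand-in assumption; the function names and the toy activations are hypothetical.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity between two flattened channels (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm > 0 else 0.0

def channel_dropblock(channels, group_size, rng):
    """Zero out one randomly chosen channel plus the channels most similar to it.
    channels: list of flattened channel activations, one list per channel."""
    anchor = rng.randrange(len(channels))
    # Rank the remaining channels by similarity to the anchor channel.
    others = [k for k in range(len(channels)) if k != anchor]
    others.sort(key=lambda k: cosine(channels[anchor], channels[k]), reverse=True)
    dropped = {anchor, *others[:group_size - 1]}
    masked = [[0.0] * len(ch) if k in dropped else list(ch)
              for k, ch in enumerate(channels)]
    return masked, dropped

# Toy activations: channels 0 and 1 are nearly parallel, 2 and 3 are not.
channels = [[1.0, 2.0, 3.0],
            [2.0, 4.0, 6.1],
            [3.0, -1.0, 0.0],
            [-1.0, 3.0, 0.5]]
masked, dropped = channel_dropblock(channels, group_size=2, rng=random.Random(0))
```

As with ordinary dropout, a mask like this would be applied only during training and skipped at inference.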

Modeling discriminative representations for out-of-domain detection with supervised contrastive learning

arXiv, 2021

Authors: Zeng, Zhiyuan; He, Keqing; Yan, Yuanmeng; Liu, Zijun; Wu, Yanan; Xu, Hong; Jiang, Huixing; Xu, Weiran. Affiliations: Pattern Recognition & Intelligent System Laboratory, Beijing University of Posts and Telecommunications, Beijing, China; Meituan Group, Beijing, China

Detecting Out-of-Domain (OOD) or unknown intents from user queries is essential in a task-oriented dialog system. A key challenge of OOD detection is to learn discriminative semantic features. Traditional cross-entropy loss only focuses on whether a sample is correctly classified, and does not explicitly distinguish the margins between categories. In this paper, we propose a supervised contrastive learning objective to minimize intra-class variance by pulling together in-domain intents belonging to the same class and maximize inter-class variance by pushing apart samples from different classes. In addition, we employ an adversarial augmentation mechanism to obtain pseudo diverse views of a sample in the latent space. Experiments on two public datasets demonstrate the effectiveness of our method in capturing discriminative representations for OOD detection. © 2021, CC BY.

Keywords: Semantics
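
The pull-together, push-apart objective described in the abstract above can be illustrated with a commonly used form of the supervised contrastive loss. The abstract does not give the exact formula, so the form below, the temperature of 0.1, and the toy 2D features are assumptions, not the authors' definition.

```python
import math

def normalize(v):
    """Scale a vector to unit length."""
    n = math.sqrt(sum(x * x for x in v))
    return [x / n for x in v]

def supervised_contrastive_loss(features, labels, temperature=0.1):
    """Average over anchors of -mean_p log(exp(z_i.z_p / t) / sum_{a != i} exp(z_i.z_a / t)),
    where p ranges over the other samples sharing the anchor's label."""
    z = [normalize(f) for f in features]
    total, anchors = 0.0, 0
    for i in range(len(z)):
        positives = [p for p in range(len(z)) if p != i and labels[p] == labels[i]]
        if not positives:
            continue  # an anchor without same-class partners contributes nothing
        logits = {a: sum(x * y for x, y in zip(z[i], z[a])) / temperature
                  for a in range(len(z)) if a != i}
        log_denominator = math.log(sum(math.exp(s) for s in logits.values()))
        total += -sum(logits[p] - log_denominator for p in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0

labels = [0, 0, 1, 1]
# Same-class samples close together: low loss.
clustered = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
# Same-class samples far apart, different-class samples close: high loss.
mixed = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
loss_clustered = supervised_contrastive_loss(clustered, labels)
loss_mixed = supervised_contrastive_loss(mixed, labels)
```

The loss is small when same-class features already cluster and large when they are interleaved with other classes, which is the behavior the abstract attributes to the objective.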

Multi-View Active Fine-Grained Recognition

arXiv, 2022

Authors: Du, Ruoyi; Yu, Wenqing; Wang, Heqing; Chang, Dongliang; Lin, Ting-En; Li, Yongbin; Ma, Zhanyu. Affiliations: Pattern Recognition and Intelligent System Laboratory, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China

Fine-grained visual classification (FGVC) has been developed for decades, and related works have exposed a key direction: finding discriminative local regions and revealing subtle differences. However, unlike identifying visual contents within static images, for recognizing objects in the real physical world, discriminative information is not only present within seen local regions but also hides in other unseen perspectives. In other words, in addition to focusing on the distinguishable part from the whole, for efficient and accurate recognition, it is required to infer the key perspective with a few glances, e.g., people may recognize a "Benz AMG GT" with a glance of its front and then know that taking a look at its exhaust pipe can help to tell which year's model it is. In this paper, back to reality, we put forward the problem of active fine-grained recognition (AFGR) and complete this study in three steps: (i) a hierarchical, multi-view, fine-grained vehicle dataset is collected as the testbed, (ii) a simple experiment is designed to verify that different perspectives contribute differently to FGVC and that different categories have different discriminative perspectives, (iii) a policy-gradient-based framework is adopted to achieve efficient recognition with active view selection. Comprehensive experiments demonstrate that the proposed method delivers a better performance-efficiency trade-off than previous FGVC methods and advanced neural networks. Codes are available at: https://***/PRIS-CV/AFGR. © 2022, CC BY.

Keywords: Economic and social effects

A novel approach to edge detection of color image based on quaternion fractional directional differentiation

2011 International Conference on Automation and Robotics, ICAR 2011

Authors: Gao, Chaobang; Zhou, Jiliu; Lang, Fangnian; Pu, Qiang; Liu, Chang. Affiliations: College of Computer Science and Technology, Chengdu University, Chengdu 610106, China; School of Computer Science, Sichuan University, Chengdu 610064, China; Key Laboratory of Pattern Recognition and Intelligent Information Processing, Sichuan Province, Chengdu 610106, China

ISBN: (print) 9783642255526

In this paper, we represent a color image by a quaternion function, then find edge points by maximizing the norm of the quaternion fractional directional differentiation (QFDD). This method is called edge detection based on QFDD. Experiments indicate that the method has special advantages. Compared with Canny, LOG, Sobel, and general fractional differentiation, QFDD has fewer false negatives in textured regions and is also better at detecting edges that are partially defined by texture, which means that QFDD yields better results in the regions of interest and that these results are more consistent with the characteristics of the human visual system. © 2011 Springer-Verlag.

Keywords: Edge detection

Face transformer for recognition

arXiv, 2021

Authors: Zhong, Yaoyao; Deng, Weihong. Affiliations: Pattern Recognition and Intelligent System Laboratory, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China

Recently there has been a growing interest in Transformers not only in NLP but also in computer vision. We ask whether Transformers can be used in face recognition and whether they are better than CNNs. Therefore, we investigate the performance of Transformer models in face recognition. Considering that the original Transformer may neglect inter-patch information, we modify the patch generation process and build the tokens from sliding patches that overlap with each other. The models are trained on the CASIA-WebFace and MS-Celeb-1M databases, and evaluated on several mainstream benchmarks, including the LFW, SLLFW, CALFW, CPLFW, TALFW, CFP-FP, AGEDB and IJB-C databases. We demonstrate that Face Transformer models trained on a large-scale database, MS-Celeb-1M, achieve performance comparable to CNNs with a similar number of parameters and MACs. To facilitate further research, Face Transformer models and codes are available at https://***/zhongyy/Face-Transformer. Copyright © 2021, The Authors. All rights reserved.

Keywords: Face recognition
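
The overlapping sliding-patch tokenization mentioned in the abstract above can be illustrated with a small sketch: choosing a stride smaller than the patch size makes neighboring tokens share pixels. The 6x6 image, the patch size of 2, and the strides are hypothetical values chosen for illustration, not the settings used in the paper.

```python
def sliding_patches(image, patch, stride):
    """Return flattened patch tokens from a 2D list.
    A stride smaller than the patch size makes neighboring patches overlap."""
    rows, cols = len(image), len(image[0])
    tokens = []
    for r in range(0, rows - patch + 1, stride):
        for c in range(0, cols - patch + 1, stride):
            tokens.append([image[r + i][c + j]
                           for i in range(patch) for j in range(patch)])
    return tokens

# Hypothetical 6x6 "image" whose pixel values are their own flat indices.
image = [[r * 6 + c for c in range(6)] for r in range(6)]
disjoint = sliding_patches(image, patch=2, stride=2)     # 9 tokens, no shared pixels
overlapping = sliding_patches(image, patch=2, stride=1)  # 25 tokens, neighbors share pixels
```

With stride 1 the first two tokens are `[0, 1, 6, 7]` and `[1, 2, 7, 8]`, so pixels 1 and 7 appear in both. The cost of the overlap is a longer token sequence for the same image.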

Cycle Label-Consistent Networks for Unsupervised Domain Adaptation

arXiv, 2022

Authors: Wang, Mei; Deng, Weihong. Affiliations: The Pattern Recognition and Intelligent System Laboratory, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China

Domain adaptation aims to leverage a labeled source domain to learn a classifier for the unlabeled target domain with a different distribution. Previous methods mostly match the distribution between two domains by global or class alignment. However, global alignment methods cannot achieve a fine-grained class-to-class overlap; class alignment methods supervised by pseudo-labels cannot guarantee their reliability. In this paper, we propose a simple yet efficient domain adaptation method, i.e., the Cycle Label-Consistent Network (CLCN), by exploiting the cycle consistency of classification labels, which applies dual cross-domain nearest centroid classification procedures to generate a reliable self-supervised signal for the discrimination in the target domain. The cycle label-consistent loss reinforces the consistency between ground-truth labels and pseudo-labels of source samples, leading to statistically similar latent representations between source and target domains. This new loss can easily be added to any existing classification network with almost no computational overhead. We demonstrate the effectiveness of our approach on the MNIST-USPS-SVHN, Office-31, Office-Home and ImageCLEF-DA benchmarks. Results validate that the proposed method can alleviate the negative influence of falsely-labeled samples and learn more discriminative features, leading to absolute improvements over the source-only model of 9.4% on Office-31 and 6.3% on ImageCLEF-DA. © 2022, CC BY.

Keywords: Image enhancement
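
One possible reading of the "dual cross-domain nearest centroid classification" in the abstract above is a label cycle: source centroids assign pseudo-labels to target samples, centroids of the pseudo-labeled target samples then re-label the source samples, and the result is compared with the source ground truth. The sketch below follows that reading with hard assignments and an agreement rate instead of a differentiable loss. It is a simplification under stated assumptions, not the authors' CLCN implementation, and all names and data points are hypothetical.

```python
def centroids(points, labels):
    """Mean vector of the points assigned to each label."""
    sums, counts = {}, {}
    for p, y in zip(points, labels):
        acc = sums.setdefault(y, [0.0] * len(p))
        for d, x in enumerate(p):
            acc[d] += x
        counts[y] = counts.get(y, 0) + 1
    return {y: [s / counts[y] for s in acc] for y, acc in sums.items()}

def nearest_centroid(point, cents):
    """Label of the centroid closest to `point` in squared Euclidean distance."""
    return min(cents, key=lambda y: sum((a - b) ** 2 for a, b in zip(point, cents[y])))

def cycle_consistency(source, source_labels, target):
    # Step 1: source centroids assign pseudo-labels to the unlabeled target samples.
    src_cents = centroids(source, source_labels)
    pseudo = [nearest_centroid(t, src_cents) for t in target]
    # Step 2: centroids of the pseudo-labeled target samples label the source samples back.
    tgt_cents = centroids(target, pseudo)
    cycled = [nearest_centroid(s, tgt_cents) for s in source]
    # Agreement between cycled labels and ground truth serves as the consistency signal.
    agree = sum(c == y for c, y in zip(cycled, source_labels)) / len(source)
    return pseudo, cycled, agree

# Toy 2D features: the target domain is a slightly shifted copy of the source clusters.
source = [[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]]
source_labels = [0, 0, 1, 1]
target = [[1.0, 0.5], [0.5, 1.5], [6.0, 6.0], [5.5, 4.5]]
pseudo, cycled, agree = cycle_consistency(source, source_labels, target)
```

In this toy case the cycle returns every source sample to its original label. A low agreement rate would indicate unreliable pseudo-labels or poorly aligned domains.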