Search Results - Inner Mongolia University Library

KNOWLEDGE TRANSFER BASED FINE-GRAINED VISUAL CLASSIFICATION

2021 IEEE International Conference on Multimedia and Expo, ICME 2021

Authors: Zhang, Siqing; Du, Ruoyi; Chang, Dongliang; Ma, Zhanyu; Guo, Jun (The Pattern Recognition and Intelligent System Laboratory, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing, China)

ISBN: (print) 9781665438643

Fine-grained visual classification (FGVC) aims to distinguish the sub-classes of the same category, and its essential solution is to mine the subtle and discriminative regions. Convolutional neural networks (CNNs), which employ the cross-entropy loss (CE-loss) as the loss function, show poor performance since the model can only learn the most discriminative part and ignores other meaningful regions. Some existing works try to solve this problem by mining more discriminative regions with detection techniques or attention mechanisms. However, most of them run into the background noise problem when trying to find more discriminative regions. In this paper, we address it in a knowledge transfer learning manner. Multiple models are trained one by one, and all previously trained models are regarded as teacher models to supervise the training of the current one. Specifically, an orthogonal loss (OR-loss) is proposed to encourage the network to find diverse and meaningful regions. In addition, the first model is trained with only CE-loss. Finally, all models' outputs with complementary knowledge are combined for the final prediction result. We demonstrate the superiority of the proposed method and obtain state-of-the-art (SOTA) performance on three popular FGVC datasets. © 2021 IEEE
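
The abstract does not state the form of the OR-loss. As a rough illustration only, assuming it penalizes overlap between the current model's feature vector and those of the earlier teacher models, one hypothetical formulation is the mean squared cosine similarity. The function names below are illustrative and do not come from the paper.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def orthogonal_penalty(student, teachers):
    """Hypothetical OR-style penalty (an assumption, not the paper's loss):
    mean squared cosine similarity between the student's features and each
    teacher's features. It is 0 when the student is orthogonal to all teachers."""
    return sum(cosine(student, t) ** 2 for t in teachers) / len(teachers)
```

Under this assumption, a student whose features duplicate a teacher's is penalized most, which matches the stated goal of finding diverse regions.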

Keywords: Distillation

Cross-Domain Person Re-identification Combining Feature Concatenation and Attention

Authors: Feng Pan; Lin Wang; Yansha Zhang; Jie Wang (College of Data Science and Information Engineering, Guizhou Minzu University; Key Laboratory of Pattern Recognition and Intelligent System of Guizhou Province)

To improve the insufficient generalization and poor cross-domain capability of existing direct cross-dataset person re-identification methods, a cross-domain person re-identification method combining feature concatenation and attention (FCANet) is *** deep features of the network are concatenated to complement the feature information and obtain discriminative features, and a position attention module is introduced to enhance the feature representation capability for the cross-domain task. The network is trained jointly with label-smoothed cross-entropy loss and triplet loss in the source domain and then deployed directly to the target domain for *** verify the performance of the proposed method, experiments were run on three public datasets, Market1501, DukeMTMC-reID and MSMT17, where mAP and Rank-1 reach 51.4% and 62.7% on *** results show that the proposed method improves generalization on cross-domain tasks, and its recognition accuracy exceeds that of the domain generalization algorithms it is compared with.
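
The joint objective named in the abstract combines two standard losses. The sketch below shows their textbook forms for a single sample and a single triplet; the smoothing factor `eps`, the `margin`, and the unweighted sum are assumptions, since the abstract gives no hyperparameters.

```python
import math

def label_smooth_ce(logits, target, eps=0.1):
    """Cross-entropy against a smoothed target: the true class gets
    1 - eps + eps/K and every other class gets eps/K (eps is assumed)."""
    k = len(logits)
    m = max(logits)
    # Numerically stable log-sum-exp for the softmax normalizer.
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    log_probs = [x - log_z for x in logits]
    smooth = [eps / k] * k
    smooth[target] += 1.0 - eps
    return -sum(q * lp for q, lp in zip(smooth, log_probs))

def triplet_loss(anchor, positive, negative, margin=0.3):
    """Hinge on d(anchor, positive) - d(anchor, negative) + margin,
    with Euclidean distance (margin is assumed)."""
    d_ap = math.dist(anchor, positive)
    d_an = math.dist(anchor, negative)
    return max(0.0, d_ap - d_an + margin)

def joint_loss(logits, target, anchor, positive, negative):
    """Unweighted sum of the two terms; the weighting is an assumption."""
    return label_smooth_ce(logits, target) + triplet_loss(anchor, positive, negative)
```

The triplet term is zero once the negative is farther from the anchor than the positive by at least the margin, so only hard triplets contribute to training.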

A Garbage Classification Method Based on Improved YOLOv5

2022 International Conference on Networks, Communications and Information Technology, CNCIT 2022

Authors: Yan, Xiaobo; Yang, Yuan; Feng, Le; Wang, Lin; Tan, Mian (Guizhou Minzu University, Key Laboratory of Pattern Recognition and Intelligent System of Guizhou Province, Guiyang, China; College of Data Science and Information Engineering, Guizhou Minzu University, Guiyang, China)

ISBN: (digital) 9781665452960

ISBN: (print) 9781665452960

With the development of the national economy, the amount of daily household garbage keeps increasing. Sorting garbage by hand is labor-intensive and inefficient. In this paper, an automatic garbage classification system model based on computer vision is proposed to solve these problems. Firstly, the YOLOv5 object detection algorithm is improved by abandoning small-object detection to obtain faster detection speed. Secondly, complex network layers in the YOLOv5 framework are bypassed with shortcuts to speed up recognition while preserving recognition accuracy. Finally, the traditional loss function is used to improve the detection speed for large objects on the Raspberry Pi. We collect a data set of various types of garbage, train the model with the improved algorithm, and test it by deploying it on a Raspberry Pi. The experimental results show that the proposed framework is fast and precise and can complete garbage classification at the source of the garbage. © 2022 IEEE.

Keywords: Network layers

RBP-Former: Joint Prediction of RNA-protein Binding Sites on Full-length RNA Transcripts for Multiple RBPs

2024 IEEE International Conference on Bioinformatics and Biomedicine, BIBM 2024

Authors: Li, Yichong; Liu, Xiaojian; Cheng, Fan; Pan, Xiaoyong; Yang, Yang (Shanghai Jiao Tong University, Department of Computer Science and Engineering, Shanghai 200240, China; Shanghai Jiao Tong University, Institute of Image Processing and Pattern Recognition, Shanghai 200240, China; Ministry of Education of China, Key Laboratory of System Control and Information Processing, Shanghai 200240, China; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Shanghai 200240, China)

ISBN: (print) 9798350386226

RNA-binding proteins (RBPs) are essential for gene expression, and the complex RNA-protein interaction mechanisms require analysis of global RNA information. Therefore, accurate prediction of RBP binding sites on full-length RNA transcripts is crucial for understanding these mechanisms and their roles in diseases. While machine learning methods can predict RBP binding to RNA fragments, extending this to full-length transcripts presents challenges due to sequence length and data imbalance. In this paper, we introduce RBP-Former, a binding site joint prediction model designed specifically for full-length RNA transcripts that can be used for multiple RBPs. This model processes information at both coarse and fine-grained levels to fully exploit sequence data and its interactions with multiple RBPs. We develop multi-level imbalance learning strategies, achieving favorable results on imbalanced data. Our method outperforms existing methods in predicting binding sites on full-length RNA transcripts for multiple RBPs, demonstrating its effectiveness in handling imbalanced label and sample distributions. © 2024 IEEE.

Keywords: Binding sites

Adaptive geodesic feedback controller design for the quadrotor

International Journal of Vehicle Autonomous Systems, 2020, Vol. 15, No. 3-4, pp. 319-342

Authors: Liu, Chao; Yang, Shengyi (Key Laboratory of Pattern Recognition and Intelligent System, Guizhou Minzu University, Huaxi District, Guiyang, Guizhou Province, China)

In this paper, a non-linear adaptive control method based on SO(3) for quadrotor attitude tracking is proposed. Unlike control methods on Euclidean space, the proposed controller is developed on SO(3), which avoids singularities and ambiguities. In addition, the geodesic on SO(3) is used to provide the shortest curve between the current attitude configuration and the desired attitude configuration. Then, a feedback controller based on the geodesic is derived. To solve the problem that the inertia tensor of the quadrotor is unknown, an adaptive term is designed with Lyapunov theory to estimate the inertia tensor. Thus, almost global asymptotic stability of the quadrotor attitude tracking is achieved without prior knowledge of the quadrotor inertia tensor. Numerical simulations are utilised to show the better control performance of the adaptive geodesic feedback control method. Copyright © 2020 Inderscience Enterprises Ltd.
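
The length of the geodesic between two attitudes on SO(3) has a standard closed form, theta = arccos((trace(R_c^T R_d) - 1) / 2). The sketch below computes only that distance; it does not reproduce the paper's controller or adaptive law, and the helper `rot_z` is an illustrative name.

```python
import math

def rot_z(angle):
    """Rotation matrix about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def geodesic_angle(r_current, r_desired):
    """Length of the shortest curve on SO(3) between two attitudes:
    theta = arccos((trace(R_c^T R_d) - 1) / 2)."""
    # trace(A^T B) equals the sum of elementwise products of A and B.
    trace = sum(r_current[i][j] * r_desired[i][j]
                for i in range(3) for j in range(3))
    cos_theta = (trace - 1.0) / 2.0
    # Clamp to guard against rounding just outside [-1, 1].
    return math.acos(max(-1.0, min(1.0, cos_theta)))
```

For example, `geodesic_angle(rot_z(0.2), rot_z(0.7))` measures the 0.5 rad rotation that separates the two attitudes, with no dependence on an Euler-angle parameterization.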

Keywords: Controllers

Unsupervised Attention Regularization Based Domain Adaptation for Oracle Character recognition

arXiv, 2024

Authors: Wang, Mei; Deng, Weihong; Hu, Jiani; Su, Sen (The Pattern Recognition and Intelligent System Laboratory, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China; State Key Laboratory of Networking and Switching Technology, Beijing University of Posts and Telecommunications, Beijing 100876, China)

The study of oracle characters plays an important role in Chinese archaeology and philology. However, the difficulty of collecting and annotating real-world scanned oracle characters hinders the development of oracle character recognition. In this paper, we develop a novel unsupervised domain adaptation (UDA) method, i.e., unsupervised attention regularization network (UARN), to transfer recognition knowledge from labeled handprinted oracle characters to unlabeled scanned data. First, we experimentally show that existing UDA methods are not always consistent with human priors and cannot achieve optimal performance on the target domain. For these oracle characters with flip-insensitivity and high inter-class similarity, model interpretations are not flip-consistent and class-separable. To tackle this challenge, we take visual perceptual plausibility into consideration when adapting. Specifically, our method enforces attention consistency between the original and flipped images to make the model robust to flipping. Simultaneously, we constrain attention separability between the pseudo class and the most confusing class to improve the model's discriminability. Extensive experiments demonstrate that UARN shows better interpretability and achieves state-of-the-art performance on the Oracle-241 dataset, substantially outperforming the previous structure-texture separation network by 8.5%. © 2024, CC BY.
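
The flip-consistency idea can be illustrated on plain 2-D attention maps. The sketch below assumes the consistency term is a mean squared difference after re-aligning the flipped image's attention map; the abstract does not specify the actual loss, so both function names and the squared-error form are assumptions.

```python
def hflip(attn):
    """Mirror a 2-D attention map left to right."""
    return [list(reversed(row)) for row in attn]

def flip_consistency_loss(attn_original, attn_of_flipped):
    """Mean squared difference between the attention map of the original
    image and the re-aligned attention map of its mirrored copy
    (the squared-error form is an assumption)."""
    realigned = hflip(attn_of_flipped)
    diffs = [(a - b) ** 2
             for row_a, row_b in zip(attn_original, realigned)
             for a, b in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)
```

The value is zero exactly when the model attends to the same character strokes regardless of mirroring, which is the flip-insensitivity prior described in the abstract.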

Keywords: Character recognition

Detailed Object Description with Controllable Dimensions

arXiv, 2024

Authors: Wang, Xinran; Zhang, Haiwen; Li, Baoteng; Liang, Kongming; Sun, Hao; He, Zhongjiang; Ma, Zhanyu; Guo, Jun (Pattern Recognition and Intelligent System Laboratory, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China; China Telecom Artificial Intelligence Technology Co. Ltd, Beijing 100034, China)

Object description plays an important role in helping visually impaired individuals understand and compare the differences between objects. Recent multimodal large language models (MLLMs) exhibit powerful perceptual abilities and demonstrate impressive potential for generating object-centric descriptions. However, the descriptions generated by such models often still contain content that is irrelevant to the user's intent or miss important object dimension details. In special scenarios, users may only need the details of certain dimensions of an object. In this paper, we propose a training-free object description refinement pipeline, Dimension Tailor, designed to enhance user-specified details in object descriptions. This pipeline includes three steps: dimension extracting, erasing, and supplementing, which decompose the description into user-specified dimensions. Dimension Tailor can not only improve the quality of object details but also offer flexibility in including or excluding specific dimensions based on user preferences. We conducted extensive experiments to demonstrate the effectiveness of Dimension Tailor on controllable object descriptions. Notably, the proposed pipeline can consistently improve the performance of recent MLLMs. The code is currently accessible at https://***/xin-ran-w/ControllableObjectDescription. Copyright © 2024, The Authors. All rights reserved.

Keywords: Visual languages

Multi-View Active Fine-Grained recognition

arXiv, 2022

Authors: Du, Ruoyi; Yu, Wenqing; Wang, Heqing; Chang, Dongliang; Lin, Ting-En; Li, Yongbin; Ma, Zhanyu (Pattern Recognition and Intelligent System Laboratory, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China)

Fine-grained visual classification (FGVC) has been developed for decades, and prior work has exposed a key direction: finding discriminative local regions and revealing subtle differences. However, unlike identifying visual content within static images, when recognizing objects in the real physical world, discriminative information is present not only within seen local regions but also in other, unseen perspectives. In other words, in addition to focusing on the distinguishable part of the whole, efficient and accurate recognition requires inferring the key perspective within a few glances; e.g., people may recognize a "Benz AMG GT" with a glance at its front and then know that a look at its exhaust pipe can help tell which year's model it is. In this paper, back to reality, we put forward the problem of active fine-grained recognition (AFGR) and complete this study in three steps: (i) a hierarchical, multi-view, fine-grained vehicle dataset is collected as the testbed, (ii) a simple experiment is designed to verify that different perspectives contribute differently to FGVC and that different categories have different discriminative perspectives, (iii) a policy-gradient-based framework is adopted to achieve efficient recognition with active view selection. Comprehensive experiments demonstrate that the proposed method delivers a better performance-efficiency trade-off than previous FGVC methods and advanced neural networks. Code is available at: https://***/PRIS-CV/AFGR. © 2022, CC BY.

Keywords: Economic and social effects

Cycle Label-Consistent Networks for Unsupervised Domain Adaptation

arXiv, 2022

Authors: Wang, Mei; Deng, Weihong (The Pattern Recognition and Intelligent System Laboratory, School of Artificial Intelligence, Beijing University of Posts and Telecommunications, Beijing 100876, China)

Domain adaptation aims to leverage a labeled source domain to learn a classifier for the unlabeled target domain with a different distribution. Previous methods mostly match the distribution between two domains by global or class alignment. However, global alignment methods cannot achieve a fine-grained class-to-class overlap; class alignment methods supervised by pseudo-labels cannot guarantee their reliability. In this paper, we propose a simple yet efficient domain adaptation method, i.e., Cycle Label-Consistent Network (CLCN), which exploits the cycle consistency of classification labels: it applies dual cross-domain nearest centroid classification procedures to generate a reliable self-supervised signal for discrimination in the target domain. The cycle label-consistent loss reinforces the consistency between ground-truth labels and pseudo-labels of source samples, leading to statistically similar latent representations between source and target domains. This new loss can easily be added to any existing classification network with almost no computational overhead. We demonstrate the effectiveness of our approach on MNIST-USPS-SVHN, Office-31, Office-Home and Image CLEF-DA benchmarks. Results validate that the proposed method can alleviate the negative influence of falsely labeled samples and learn more discriminative features, leading to an absolute improvement over the source-only model of 9.4% on Office-31 and 6.3% on Image CLEF-DA. © 2022, CC BY.
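
The dual nearest-centroid cycle can be sketched on raw feature vectors. This is a simplified illustration, not the paper's loss: it uses hard label assignments and reports the fraction of source labels recovered after the cycle, whereas the paper trains with a differentiable cycle label-consistent loss. All function names are illustrative.

```python
import math

def centroids(features, labels):
    """Mean feature vector per class label."""
    groups = {}
    for f, y in zip(features, labels):
        groups.setdefault(y, []).append(f)
    return {y: [sum(col) / len(fs) for col in zip(*fs)]
            for y, fs in groups.items()}

def nearest_centroid(feature, cents):
    """Label of the centroid closest to `feature` (Euclidean distance)."""
    return min(cents, key=lambda y: math.dist(feature, cents[y]))

def cycle_consistency(src_x, src_y, tgt_x):
    """Source centroids pseudo-label the target; target centroids then
    re-label the source. Returns the fraction of source labels recovered."""
    src_cents = centroids(src_x, src_y)
    tgt_pseudo = [nearest_centroid(x, src_cents) for x in tgt_x]
    tgt_cents = centroids(tgt_x, tgt_pseudo)
    src_cycle = [nearest_centroid(x, tgt_cents) for x in src_x]
    return sum(a == b for a, b in zip(src_cycle, src_y)) / len(src_y)
```

A low score signals that the target pseudo-labels are unreliable, because centroids built from them fail to recover the known source labels.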

Keywords: Image enhancement