Search Results - Inner Mongolia University Library

MAEDAY: MAE for few- and zero-shot AnomalY-Detection

COMPUTER VISION AND IMAGE UNDERSTANDING, 2024, Vol. 241

Authors: Schwartz, Eli; Arbelle, Assaf; Karlinsky, Leonid; Harary, Sivan; Scheidegger, Florian; Doveh, Sivan; Giryes, Raja. Affiliations: IBM Res Haifa Israel; Tel Aviv Univ Tel Aviv Israel; MIT IBM Watson AI Lab Cambridge MA USA

We propose using Masked Auto-Encoder (MAE), a transformer model trained in a self-supervised manner on image inpainting, for anomaly detection (AD), assuming that anomalous regions are harder to reconstruct than normal regions. MAEDAY is the first image-reconstruction-based anomaly detection method that utilizes a pre-trained model, enabling its use for Few-Shot Anomaly Detection (FSAD). We also show the same method works surprisingly well for the novel tasks of Zero-Shot AD (ZSAD) and Zero-Shot Foreign Object Detection (ZSFOD), where no normal samples are available.

Keywords: Anomaly-detection; masked autoencoder; Foreign object detection
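
The reconstruction-error idea described in the MAEDAY abstract can be illustrated with a minimal sketch. This is not the authors' implementation: `toy_inpaint` is a hypothetical stand-in for a pre-trained MAE (it only fills masked pixels with the mean of the visible ones), and the 1-D "image", mask ratio, and number of repeats are assumptions chosen for illustration.

```python
import random


def toy_inpaint(pixels, masked):
    """Hypothetical stand-in for a pre-trained MAE: fill every masked
    pixel with the mean of the visible (unmasked) pixels."""
    visible = [p for i, p in enumerate(pixels) if i not in masked]
    fill = sum(visible) / len(visible)
    return [fill if i in masked else p for i, p in enumerate(pixels)]


def anomaly_map(pixels, mask_ratio=0.5, repeats=200, seed=0):
    """Average squared reconstruction error per pixel over random masks.
    Pixels that are hard to reconstruct receive a high score."""
    rng = random.Random(seed)
    n = len(pixels)
    k = max(1, int(n * mask_ratio))
    err_sum = [0.0] * n
    count = [0] * n
    for _ in range(repeats):
        masked = set(rng.sample(range(n), k))
        recon = toy_inpaint(pixels, masked)
        for i in masked:  # error is measured only where pixels were hidden
            err_sum[i] += (recon[i] - pixels[i]) ** 2
            count[i] += 1
    return [e / c if c else 0.0 for e, c in zip(err_sum, count)]


image = [0.5] * 15 + [5.0]  # assumed toy input: the last pixel is the anomaly
scores = anomaly_map(image)
```

In this toy input the last pixel gets the highest score, because it cannot be predicted from the pixels around it.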

Cross-Dataset Model Training for Hyperspectral Image Classification Using Self-Supervised Learning

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2024, Vol. 62

Authors: Bai, Jing; Zhou, Zichen; Chen, Zheng; Xiao, Zhu; Wei, Erlong; Wen, Yihong; Jiao, Licheng. Affiliations: Xidian Univ Sch Artificial Intelligence Minist Educ Xian 710071 Peoples R China; Xidian Univ Key Lab Intelligent Percept & Image Understanding Minist Educ Xian 710071 Peoples R China; Xiaomi Shanghai 200030 Peoples R China; Hunan Univ Chongqing Res Inst Changsha 410082 Peoples R China; Hunan Univ Coll Comp Sci & Elect Engn Changsha 410082 Peoples R China; 54th Res Inst CETC Shijiazhuang 050081 Hebei Peoples R China

With the development of deep learning and the growth in available data, general artificial intelligence models have become a popular research area. When facing a new application scenario, a pretrained general model can often outperform models trained on the new data alone. However, because the spectral bands differ between hyperspectral image datasets, the hyperspectral image classification (HSIC) field has not yet produced a good general model training solution, and it is difficult to use the information in existing hyperspectral datasets when training a model for a new scenario. To solve this problem, this article proposes a generalized hyperspectral classification model training method, which trains hyperspectral classification models across datasets by means of an adaptive channel module and a masked self-supervised pretraining method, and can pretrain and fine-tune hyperspectral classification models using multiple datasets. The adaptive channel module solves the band difference problem of using hyperspectral datasets across datasets, and the masked self-supervised learning method solves the label difference and labeling difficulties of training models across datasets. Experimental results on multiple datasets show that the proposed method can effectively use a large amount of data to pretrain hyperspectral classification models, and the fine-tuning results on downstream datasets have certain advantages relative to current advanced deep learning methods.

Keywords: Hyperspectral imaging; Training; Data models; Feature extraction; Image classification; Adaptation models; Transformers; Self-supervised learning; Supervised learning; Deep learning; Classification; cross dataset; general model; hyperspectral image; masked autoencoder; self-supervised learning

Masked Graph Convolutional Network for Small Sample Classification of Hyperspectral Images

REMOTE SENSING, 2023, Vol. 15, No. 7

Authors: Liu, Wenkai; Liu, Bing; He, Peipei; Hu, Qingfeng; Gao, Kuiliang; Li, Hui. Affiliations: North China Univ Water Resources & Elect Power Coll Surveying & Geoinformat Zhengzhou 450046 Peoples R China; PLA Strateg Support Force Informat Engn Univ Inst Geospatial Informat Zhengzhou 450002 Peoples R China

The deep learning method has achieved great success in hyperspectral image classification, but the lack of labeled training samples still restricts the development and application of deep learning methods. In order to deal with the problem of small samples in hyperspectral image classification, a novel small sample classification method based on rotation-invariant uniform local binary pattern (RULBP) features and a graph-based masked autoencoder is proposed in this paper. Firstly, the RULBP features of hyperspectral images are extracted, and then the k-nearest neighbor method is utilized to construct the graph. Furthermore, self-supervised learning is conducted on the constructed graph so that the model can learn to extract features more suitable for small sample classification. Since the self-supervised training mainly adopts the masked autoencoder method, only unlabeled samples are needed to complete the training. After training, only a small number of samples are used to fine-tune the graph convolutional network, so as to complete the classification of all nodes in the graph. A large number of classification experiments on three commonly used hyperspectral image datasets show that the proposed method could achieve higher classification accuracy with fewer labeled samples.

Keywords: hyperspectral image classification; graph convolutional network; masked autoencoder; rotation-invariant uniform local binary pattern
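
The graph-construction step mentioned in the abstract (a k-nearest-neighbor graph over extracted features) might look like the following sketch. The feature vectors are made-up 2-D points rather than RULBP features, and `knn_graph` is a hypothetical helper name. The abstract does not state the distance measure or the value of k, so Euclidean distance and k=1 are assumptions.

```python
import math


def knn_graph(features, k):
    """Build an undirected k-nearest-neighbor graph.
    Returns a sorted list of edges (i, j) with i < j."""
    edges = set()
    for i, fi in enumerate(features):
        # Distances from node i to every other node, nearest first.
        dists = sorted(
            (math.dist(fi, fj), j) for j, fj in enumerate(features) if j != i
        )
        for _, j in dists[:k]:
            edges.add((min(i, j), max(i, j)))
    return sorted(edges)


# Assumed toy features: two tight clusters of two nodes each.
feats = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0)]
edges = knn_graph(feats, k=1)
```

With these toy points each node links only to its cluster partner, giving the two edges (0, 1) and (2, 3).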

Pose Mask: A Model-Based Augmentation Method for 2D Pose Estimation in Classroom Scenes Using Surveillance Images

SENSORS, 2022, Vol. 22, No. 21, p. 8331

Authors: Liu, Shichang; Ma, Miao; Li, Haiyang; Ning, Hanyang; Wang, Min. Affiliations: Shaanxi Normal Univ Sch Comp Sci Xian 710119 Peoples R China; Minist Educ Key Lab Modern Teaching Technol Xian 710062 Peoples R China; Natl Engn Lab Integrated Aero Space Ground Ocean Xian 710072 Peoples R China

Solid developments have been seen in deep-learning-based pose estimation, but few works have explored performance in dense crowds, such as a classroom scene; furthermore, no specific knowledge is considered in the design of image augmentation for pose estimation. A masked autoencoder was shown to have a non-negligible capability in image reconstruction, where the masking mechanism that randomly drops patches forces the model to build unknown pixels from known pixels. Inspired by this self-supervised learning method, where the restoration of the feature loss induced by the mask is consistent with tackling the occlusion problem in classroom scenarios, we discovered that the transfer performance of the pre-trained weights could be used as a model-based augmentation to overcome the intractable occlusion in classroom pose estimation. In this study, we proposed a top-down pose estimation method that utilized the MAE's natural capability of reconstructing missing information as an effective occluded image augmentation in a pose estimation task. The difference from the original MAE was that instead of using a 75% random mask ratio, we regarded the keypoint distribution probabilistic heatmap as a reference for masking, which we named Pose Mask. To test the performance of our method in heavily occluded classroom scenes, we collected a new dataset for pose estimation in classroom scenes named Class Pose and conducted many experiments, the results of which showed promising performance.

Keywords: pose estimation; masked autoencoder; model-based augmentation; classroom scenes
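
The contrast between uniform random masking and heatmap-guided masking can be sketched as follows. The abstract does not say how the keypoint heatmap is turned into a mask, so this sketch assumes weighted sampling without replacement, where patches with larger heatmap values are more likely to be masked. `heatmap_guided_mask` and the per-patch weights are hypothetical.

```python
import random


def heatmap_guided_mask(patch_weights, num_masked, seed=0):
    """Pick patches to mask, favoring patches with larger heatmap weight.
    Uses exponent keys u ** (1 / w) for weighted sampling without
    replacement; patches with zero weight are never selected."""
    rng = random.Random(seed)
    keys = []
    for idx, w in enumerate(patch_weights):
        if w > 0:
            keys.append((rng.random() ** (1.0 / w), idx))
    keys.sort(reverse=True)  # largest keys are selected first
    return sorted(idx for _, idx in keys[:num_masked])


# Assumed per-patch keypoint probabilities for an image with 8 patches.
weights = [0.0, 0.0, 0.9, 0.8, 0.0, 0.7, 0.0, 0.0]
masked = heatmap_guided_mask(weights, num_masked=2)
```

Under this assumption the masked patches always fall on regions where keypoints are likely, which imitates occlusion of body parts rather than of background.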

A blood cell classification method based on MAE and active learning

BIOMEDICAL SIGNAL PROCESSING AND CONTROL, 2024, Vol. 90

Authors: Lu, Qinghang; Wang, Bangyao; He, Quanhui; Zhang, Qingmao; Guo, Liang; Li, Jiaming; Li, Jie; Ma, Qiongxiong. Affiliations: South China Normal Univ Sch Informat & Optoelect Sci & Engn Guangdong Prov Key Lab Nanophoton Funct Mat & Devi Guangzhou 510006 Peoples R China; Southern Med Univ Nanfang Hosp Dept Hematol Guangzhou 510515 Peoples R China

Cell morphology analysis is a crucial diagnostic tool for identifying blood diseases, including acute leukemia. However, the traditional analysis process is time-consuming and requires significant investment in labor and expertise from laboratory doctors. In recent years, deep learning-based automatic blood cell classification techniques have gained popularity, but acquiring image data and annotations in the medical field is often challenging and costly. With the increasing use of deep learning techniques in clinical practice, it has become vital to ensure both accuracy and high-quality annotations. To address these challenges, this paper proposes a blood cell classification method based on masked autoencoder (MAE) and active learning (AL), namely MAE4AL. This method utilizes the self-supervised loss of MAE and sample uncertainty to select the most valuable samples for labeling. A comprehensive comparison is conducted between our method and the state-of-the-art blood cell classification technique, which employs ResNeXt. Remarkably, our proposed approach achieves comparable classification performance to ResNeXt when utilizing only 20% of the labeled data. When employing half of the labeled data, our method achieves a classification accuracy of 96.36%, surpassing the ResNeXt model trained with 100% labeled data by 0.79%.

Keywords: Deep learning; Blood cell classification; masked autoencoder; Active learning
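
The sample-selection step (choosing which unlabeled images to send for annotation) can be sketched as below. The abstract says only that the self-supervised loss and sample uncertainty are both used. How MAE4AL combines them is not stated, so this sketch assumes a plain sum of the normalized reconstruction loss and the normalized predictive entropy. `select_for_labeling` and all input values are hypothetical, and at least two classes are assumed.

```python
import math


def entropy(probs):
    """Shannon entropy of a predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(recon_losses, class_probs, budget):
    """Rank unlabeled samples by (normalized loss + normalized entropy)
    and return the indices of the `budget` highest-scoring samples."""
    max_loss = max(recon_losses) or 1.0
    scored = []
    for i, (loss, probs) in enumerate(zip(recon_losses, class_probs)):
        max_ent = math.log(len(probs))  # entropy of a uniform prediction
        score = loss / max_loss + entropy(probs) / max_ent
        scored.append((score, i))
    scored.sort(reverse=True)
    return [i for _, i in scored[:budget]]


# Assumed toy values: per-sample MAE loss and classifier probabilities.
losses = [0.10, 0.90, 0.20, 0.80]
probs = [[0.98, 0.02], [0.5, 0.5], [0.9, 0.1], [0.6, 0.4]]
picked = select_for_labeling(losses, probs, budget=2)
```

Samples 1 and 3 are selected here because they are both poorly reconstructed and uncertainly classified.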

A masked graph neural network model for real-time gastric polyp detection in Healthcare 4.0

JOURNAL OF INDUSTRIAL INFORMATION INTEGRATION, 2024, Vol. 34

Authors: Huang, Junjun; Saw, Shier Nee; He, Tianran; Feng, Wei; Loo, Chu Kiong. Affiliations: Univ Malaya Fac Comp Sci & Informat Technol Kuala Lumpur 50603 Malaysia; Annoland Technol PTE LTD Singapore 068902 Singapore; Tongji Univ Sch Elect & Informat Engn Shanghai 201804 Peoples R China; Fuzhou Univ Zhicheng Coll Fac FinTech Fuzhou 350000 Peoples R China

The emergence of Healthcare 4.0 brings convenience to the diagnosis of patients with gastric polyps. A computer-aided gastric polyp detection model can automatically locate gastric polyps in gastroscopic images, which helps endoscopists detect gastric polyps in time and reduces the rate of missed diagnosis. Deep learning models have achieved remarkable success in the field of gastroscopic images; however, the following problems remain to be solved. Firstly, models based on convolutional neural networks analyze only the underlying pixels of the gastroscopic image to locate the polyp and do not take into account the spatial and positional information contained in the anatomical structure of the gastroscopic image. Secondly, although the number of gastroscopic images is huge, the number of manually annotated gastric polyp images is very small, which makes deep learning models prone to overfitting. Therefore, in this work, we propose a masked graph neural network model (MGNN) for real-time detection of the location of polyps in gastroscopic images in Healthcare 4.0. The MGNN model uses the graph structure and graph convolution operations in a novel way to extract spatial location information and semantic information from the gastroscopic images. The information from masked self-training is additionally considered in the prediction value stage to compensate for the shortage of manually labeled gastric polyp images. In this way, the MGNN model can automatically learn the essential features of gastroscopic images without labeled data. The effectiveness of the MGNN model has been verified on real gastroscope images.

Keywords: Gastric polyp detection; Graph neural network; masked autoencoder; Convolutional neural network

Instance-aware diversity feature generation for unsupervised person re-identification

DISPLAYS, 2024, Vol. 83

Authors: Zhang, Xiaowei; Dou, Xiao; Zhao, Xinpeng; Li, Guocong; Wang, Zekang. Affiliations: Qingdao Univ Sch Comp Sci & Technol Qingdao 266071 Peoples R China; Shandong Univ Sch Comp Sci & Technol Qingdao 266237 Peoples R China

Unsupervised person re-identification (Re-ID) methods have made significant progress by exploiting contrastive learning from unlabeled data. However, previous approaches, including cluster-level or instance-level contrast loss, did not fully explore the inherent commonality of each identified individual from unlabeled samples, where the divergence of individual clusters and the convergence of different clusters lead to a set of noisy pseudo labels which may result in label noise accumulation. To address this issue, we propose an instance-aware diversity feature generation (IDFG) framework, which can form a stable clustering feature space by exhuming diverse counterparts of given exemplars to update the memory dictionary of each cluster, so as to reduce the effect of noisy labels. Specifically, we combine instance segmentation and a masked auto-encoder to generate foreground-invariant diversity counterparts of given exemplars to reduce inter-class convergence caused by background similarity between different identification instances. Further, we introduce an instance-aware diversity feature mining module, which gradually creates more reliable clusters to generate more robust pseudo labels by exploiting the compactness and independence of clustering to update the memory dictionary. Extensive experiments demonstrate that the proposed IDFG framework achieves impressive performances of 85.6%, 73.7%, and 31.0% mAP on Market1501, DukeMTMC-reID and MSMT17, respectively.

Keywords: Contrastive learning; Diversity features; masked autoencoder; Pseudo label; Unsupervised re-ID

Single-Image Super-Resolution Method for Rotating Synthetic Aperture System Using Masking Mechanism

REMOTE SENSING, 2024, Vol. 16, No. 9

Authors: Sun, Yu; Zhi, Xiyang; Jiang, Shikai; Shi, Tianjun; Song, Jiachun; Yang, Jiawei; Wang, Shengao; Zhang, Wei. Affiliations: Harbin Inst Technol Res Ctr Space Opt Engn Harbin 150001 Peoples R China; Boston Univ Div Syst Engn Boston MA 02215 USA

The emerging technology of rotating synthetic aperture (RSA) presents a promising solution for the development of lightweight, large-aperture, and high-resolution optical remote sensing systems in geostationary orbit. However, the rectangular shape of the primary mirror and the distinctive imaging mechanism involving the continuous rotation of the mirror lead to a pronounced decline in image resolution along the shorter side of the rectangle compared to the longer side. The resolution also exhibits periodic time-varying characteristics. To address these limitations and enhance image quality, we begin by analyzing the imaging mechanism of the RSA system. Subsequently, we propose a single-image super-resolution method that utilizes a rotated varied-size window attention mechanism instead of full attention, based on the Vision Transformer architecture. We employ a two-stage training methodology for the network, where we pre-train it on images masked with stripe-shaped masks along the shorter side of the rectangular pupil. Following that, we fine-tune the network using unmasked images. Through the strip-wise mask sampling strategy, this two-stage training approach effectively circumvents the interference of lower confidence (clarity) information and outperforms training the network from scratch using the unmasked degraded images. Our digital simulation and semi-physical imaging experiments demonstrate that the proposed method achieves satisfactory performance. This work establishes a valuable reference for future space applications of the RSA system.

Keywords: optical remote sensing; super-resolution (SR); rotating synthetic aperture; masked autoencoder; vision transformer; rectangular pupil
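
The stripe-shaped masking used in the pre-training stage can be pictured with a small sketch. The abstract gives neither the stripe width nor the mask ratio, and it does not map the pupil's shorter side to an image axis. This sketch therefore assumes vertical stripes (whole column bands masked), and `stripe_mask` with all of its parameters is hypothetical.

```python
import random


def stripe_mask(height, width, stripe_width, mask_ratio, seed=0):
    """Return a height x width binary mask (1 = masked) in which
    randomly chosen vertical stripes are masked over the full height."""
    rng = random.Random(seed)
    num_stripes = width // stripe_width
    num_masked = round(num_stripes * mask_ratio)
    chosen = set(rng.sample(range(num_stripes), num_masked))
    # Every row shares the same pattern, so each masked region is a stripe.
    row = [1 if (x // stripe_width) in chosen else 0 for x in range(width)]
    return [list(row) for _ in range(height)]


# Assumed toy size: 4 x 8 image, stripes 2 pixels wide, half of them masked.
mask = stripe_mask(height=4, width=8, stripe_width=2, mask_ratio=0.5)
```

Unlike the patch-wise random masks of a standard masked autoencoder, every row of this mask is identical, so information is removed along one direction only.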

DeepVATS: Deep Visual Analytics for Time Series

KNOWLEDGE-BASED SYSTEMS, 2023, Vol. 277

Authors: Rodriguez-Fernandez, Victor; Montalvo-Garcia, David; Piccialli, Francesco; Nalepa, Grzegorz J.; Camacho, David. Affiliations: Univ Politecn Madrid Sch Comp Syst Engn Calle Alan Turing Madrid 28038 Spain; Univ Politecn Madrid Escuela Tecn Super Ingn Telecomunicac Ave Complutense 30 Madrid 28040 Spain; Univ Naples Federico II Dept Math & Applicat R Caccioppoli Naples Italy; Jagiellonian Univ Inst Appl Comp Sci Jagiellonian Human Ctr Artificial Intelligence La PL-30348 Krakow Poland

The field of Deep Visual Analytics (DVA) has recently arisen from the idea of developing Visual Interactive Systems supported by deep learning, in order to provide them with large-scale data processing capabilities and to unify their implementation across different data and domains. In this paper we present DeepVATS, an open-source tool that brings the field of DVA into time series data. DeepVATS trains, in a self-supervised way, a masked time series autoencoder that reconstructs patches of a time series, and projects the knowledge contained in the embeddings of that model in an interactive plot, from which time series patterns and anomalies emerge and can be easily spotted. The tool includes a back-end for data processing pipeline and model training, as well as a front-end with an interactive user interface. We report on results that validate the utility of DeepVATS, running experiments on both synthetic and real datasets. The code is publicly available on https://***/vrodriguezf/deepvats. (c) 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://***/licenses/by-nc-nd/4.0/).

Keywords: Deep learning; Visual analytics; Time series; masked autoencoder

Self-supervised learning with self-distillation on COVID-19 medical image classification

COMPUTER METHODS AND PROGRAMS IN BIOMEDICINE, 2024, Vol. 243, p. 107876

Authors: Tan, Zhiyong; Yu, Yuhai; Meng, Jiana; Liu, Shuang; Li, Wei. Affiliations: Dalian Minzu Univ Sch Comp Sci & Engn Dalian 116600 Liaoning Peoples R China

Background and objective: Currently, COVID-19 is a highly infectious disease that can be clinically diagnosed based on diagnostic radiology. Deep learning is capable of mining the rich information implied in inpatient imaging data and accomplishing the classification of different stages of the disease process. However, a large amount of training data is essential to train an excellent deep-learning model. Unfortunately, due to factors such as privacy and labeling difficulties, annotated data for COVID-19 is extremely scarce, which encourages us to propose a more effective deep learning model that can effectively assist specialist physicians in COVID-19 diagnosis. Methods: In this study, we introduce masked autoencoder (MAE) for pre-training and fine-tuning directly on small-scale target datasets. Based on this, we propose Self-Supervised Learning with Self-Distillation on COVID-19 medical image classification (SSSD-COVID). In addition to the reconstruction loss computation on the masked image patches, SSSD-COVID further performs self-distillation loss calculations on the latent representation of the encoder and decoder outputs. The additional loss calculation can transfer the knowledge from the global attention of the decoder to the encoder which acquires only local attention. Results: Our model achieves 97.78% recognition accuracy on the SARS-COV-CT dataset containing 2481 images and is further validated on the COVID-CT dataset containing 746 images, which achieves 81.76% recognition accuracy. Further introduction of external knowledge resulted in experimental accuracies of 99.6% and 95.27% on these two datasets, respectively. Conclusions: SSSD-COVID can obtain good results on the target dataset alone, and when external information is introduced, the performance of the model can be further improved to significantly outperform other models. Overall, the experimental results show that our method can effectively mine COVID-19 features from rare data and can assist pr

Keywords: COVID-19; masked autoencoder; Self-supervised learning; Chest CT; Self-distillation
