We propose using the Masked Autoencoder (MAE), a transformer model trained in a self-supervised manner on image inpainting, for anomaly detection (AD), on the assumption that anomalous regions are harder to reconstruct than normal regions. MAEDAY is the first image-reconstruction-based anomaly detection method that utilizes a pre-trained model, enabling its use for Few-Shot Anomaly Detection (FSAD). We also show that the same method works surprisingly well for the novel tasks of Zero-Shot AD (ZSAD) and Zero-Shot Foreign Object Detection (ZSFOD), where no normal samples are available.
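The reconstruction-error idea in this abstract can be illustrated with a minimal sketch. It is not the MAEDAY implementation: `toy_reconstruct` is a hypothetical stand-in for a pre-trained MAE, and the repeated random masking, the averaging, the patch size, and the mask ratio are all assumptions made for illustration. The image is assumed to be a square float array whose side is divisible by the patch size.

```python
import numpy as np


def random_patch_mask(grid, ratio, rng):
    """Return a boolean (grid, grid) array; True marks a masked patch."""
    n = grid * grid
    idx = rng.permutation(n)[: int(round(ratio * n))]
    mask = np.zeros(n, dtype=bool)
    mask[idx] = True
    return mask.reshape(grid, grid)


def toy_reconstruct(image, pixel_mask):
    """Hypothetical stand-in for a pre-trained MAE (assumption):
    fill every masked pixel with the mean of the visible pixels."""
    out = image.copy()
    out[pixel_mask] = image[~pixel_mask].mean()
    return out


def anomaly_map(image, reconstruct, patch=4, ratio=0.75, repeats=32, seed=0):
    """Average squared reconstruction error per pixel over several random masks.

    Errors are only accumulated where a pixel was masked, because visible
    pixels are copied through and carry no information about anomalies.
    """
    rng = np.random.default_rng(seed)
    grid = image.shape[0] // patch
    err_sum = np.zeros(image.shape, dtype=float)
    count = np.zeros(image.shape, dtype=float)
    for _ in range(repeats):
        patch_mask = random_patch_mask(grid, ratio, rng)
        # Expand the patch-level mask to pixel resolution.
        pixel_mask = np.repeat(np.repeat(patch_mask, patch, axis=0), patch, axis=1)
        recon = reconstruct(image, pixel_mask)
        err_sum[pixel_mask] += (recon[pixel_mask] - image[pixel_mask]) ** 2
        count[pixel_mask] += 1
    # Pixels that were never masked keep a score of zero.
    return err_sum / np.maximum(count, 1)
```

Regions that the reconstructor cannot predict from their surroundings receive high scores; with a real pre-trained model, `reconstruct` would wrap its forward pass instead of the mean fill used here.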
With the development of deep learning and the growth in available data, general-purpose artificial intelligence models have become a popular research area. When facing a new application scenario, a pretrained general model can often outperform a model trained only on the new data. However, because the spectral bands differ from one hyperspectral dataset to another, the hyperspectral image classification (HSIC) field still lacks a good general model training solution, and it is difficult to exploit existing hyperspectral datasets when training a model for a new scenario. To solve this problem, this article proposes a generalized hyperspectral classification model training method that trains hyperspectral classification models across datasets by means of an adaptive channel module and masked self-supervised pretraining, and that can pretrain and fine-tune such models using multiple datasets. The adaptive channel module addresses the band differences that arise when hyperspectral datasets are combined, and the masked self-supervised learning method addresses the label differences and labeling difficulties of cross-dataset training. Experimental results on multiple datasets show that the proposed method can effectively use a large amount of data to pretrain hyperspectral classification models, and the fine-tuning results on downstream datasets have certain advantages relative to current advanced deep learning methods.
Deep learning methods have achieved great success in hyperspectral image classification, but the lack of labeled training samples still restricts their development and application. To deal with the small-sample problem in hyperspectral image classification, this paper proposes a novel small-sample classification method based on rotation-invariant uniform local binary pattern (RULBP) features and a graph-based masked autoencoder. First, the RULBP features of the hyperspectral images are extracted, and the k-nearest neighbor method is used to construct the graph. Self-supervised learning is then conducted on the constructed graph so that the model learns to extract features better suited to small-sample classification. Since the self-supervised training mainly adopts the masked autoencoder method, only unlabeled samples are needed to complete the training. After training, only a small number of labeled samples are used to fine-tune the graph convolutional network, which then classifies all nodes in the graph. A large number of classification experiments on three commonly used hyperspectral image datasets show that the proposed method achieves higher classification accuracy with fewer labeled samples.
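The graph-construction step described above can be sketched as follows. This is a minimal illustration, assuming the per-pixel features (RULBP or otherwise) have already been extracted into an array of shape `(n_samples, n_features)`; the Euclidean metric and the symmetrization rule are assumptions, since the abstract does not specify them.

```python
import numpy as np


def knn_adjacency(features, k):
    """Build a symmetric boolean adjacency matrix from k-nearest neighbors.

    features: float array of shape (n_samples, n_features).
    An edge (i, j) exists if j is among the k nearest neighbors of i
    or i is among the k nearest neighbors of j (Euclidean distance).
    """
    sq_norm = np.sum(features ** 2, axis=1)
    # Squared pairwise distances via ||a||^2 + ||b||^2 - 2 a.b
    dist = sq_norm[:, None] + sq_norm[None, :] - 2.0 * features @ features.T
    np.fill_diagonal(dist, np.inf)  # a node is never its own neighbor
    nearest = np.argsort(dist, axis=1)[:, :k]
    adjacency = np.zeros(dist.shape, dtype=bool)
    rows = np.repeat(np.arange(len(features)), k)
    adjacency[rows, nearest.ravel()] = True
    return adjacency | adjacency.T
```

The resulting adjacency matrix is the kind of input a graph convolutional network or graph masked autoencoder would consume; for large images a spatial index would replace the dense distance matrix used here.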
Solid progress has been made in deep-learning-based pose estimation, but few works have explored performance in dense crowds, such as a classroom scene; furthermore, no task-specific knowledge has been considered in the design of image augmentation for pose estimation. The masked autoencoder (MAE) has shown a non-negligible capability in image reconstruction, where a masking mechanism that randomly drops patches forces the model to reconstruct unknown pixels from known pixels. In this self-supervised learning method, restoring the features lost to the mask is consistent with tackling the occlusion problem in classroom scenarios. Inspired by this, we found that the transfer performance of the pre-trained weights can serve as a model-based augmentation to overcome the intractable occlusion in classroom pose estimation. In this study, we propose a top-down pose estimation method that uses the MAE's natural ability to reconstruct missing information as an effective occluded-image augmentation for pose estimation. Unlike the original MAE, which uses a 75% random mask ratio, we use the probabilistic heatmap of the keypoint distribution as a reference for masking, which we name Pose Mask. To test the performance of our method in heavily occluded classroom scenes, we collected a new pose estimation dataset of classroom scenes, named Class Pose, and conducted many experiments, the results of which showed promising performance.
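One way a keypoint heatmap could guide masking is sketched below. The abstract only states that the heatmap serves as a reference, so everything concrete here is an assumption: patches are sampled without replacement with probability proportional to their heatmap mass, and the function name `pose_guided_mask`, the mask ratio, and the small probability floor are hypothetical choices.

```python
import numpy as np


def pose_guided_mask(heatmap, patch, ratio, rng, floor=1e-12):
    """Sample a patch-level mask biased toward high-heatmap regions.

    heatmap: non-negative float array of shape (H, W).
    Returns a boolean (H // patch, W // patch) array; True marks a masked patch.
    """
    h, w = heatmap.shape
    gh, gw = h // patch, w // patch
    # Sum the heatmap inside each patch.
    cropped = heatmap[: gh * patch, : gw * patch]
    mass = cropped.reshape(gh, patch, gw, patch).sum(axis=(1, 3))
    # The floor keeps every patch selectable (assumption, avoids zero probabilities).
    prob = mass.ravel() + floor
    prob = prob / prob.sum()
    n_mask = int(round(ratio * prob.size))
    chosen = rng.choice(prob.size, size=n_mask, replace=False, p=prob)
    mask = np.zeros(prob.size, dtype=bool)
    mask[chosen] = True
    return mask.reshape(gh, gw)
```

Compared with uniform random masking, this concentrates the hidden patches around likely keypoint locations, which imitates occlusion of body parts; whether the original method masks near or away from keypoints is not stated in the abstract.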
Cell morphology analysis is a crucial diagnostic tool for identifying blood diseases, including acute leukemia. However, the traditional analysis process is time-consuming and requires a significant investment of labor and expertise from laboratory doctors. In recent years, deep-learning-based automatic blood cell classification techniques have gained popularity, but acquiring image data and annotations in the medical field is often challenging and costly. With the increasing use of deep learning techniques in clinical practice, it has become vital to ensure both accuracy and high-quality annotations. To address these challenges, this paper proposes a blood cell classification method based on the masked autoencoder (MAE) and active learning (AL), named MAE4AL. The method uses the self-supervised loss of the MAE together with sample uncertainty to select the most valuable samples for labeling. A comprehensive comparison is conducted between our method and the state-of-the-art blood cell classification technique, which employs ResNeXt. Remarkably, our approach achieves classification performance comparable to ResNeXt while using only 20% of the labeled data. With half of the labeled data, our method achieves a classification accuracy of 96.36%, surpassing the ResNeXt model trained with 100% of the labeled data by 0.79%.
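The sample-selection idea can be sketched as below. The abstract does not say how the self-supervised loss and the uncertainty are combined, so the min-max normalization, the use of predictive entropy as the uncertainty measure, the equal weighting, and the function names are all assumptions made for illustration.

```python
import numpy as np


def predictive_entropy(probs, eps=1e-12):
    """Entropy of each row of class probabilities, shape (n_samples, n_classes)."""
    return -np.sum(probs * np.log(probs + eps), axis=1)


def min_max(values):
    """Rescale to [0, 1]; a constant vector maps to all zeros."""
    span = values.max() - values.min()
    if span == 0:
        return np.zeros_like(values)
    return (values - values.min()) / span


def select_for_labeling(recon_loss, probs, budget, weight=0.5):
    """Return indices of the `budget` unlabeled samples with the highest score.

    The score mixes the per-sample reconstruction loss with the classifier's
    predictive entropy; `weight` sets the balance (hypothetical parameter).
    """
    score = weight * min_max(recon_loss) + (1.0 - weight) * min_max(predictive_entropy(probs))
    return np.argsort(-score)[:budget]
```

Samples that the autoencoder reconstructs poorly and the classifier is unsure about rise to the top of the ranking and would be sent to an annotator first.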
The emergence of Healthcare 4.0 brings convenience to the diagnosis of patients with gastric polyps. A computer-aided gastric polyp detection model can automatically locate gastric polyps in gastroscopic images, which helps endoscopists detect gastric polyps in time and reduces the rate of missed diagnoses. Deep learning models have achieved remarkable success on gastroscopic images; however, two problems remain to be solved. First, a model based on a convolutional neural network analyzes only the underlying pixels of the gastroscopic image to locate the polyp and does not take into account the spatial and positional information contained in the anatomical structure shown in the image. Second, although the number of gastroscopic images is huge, the number of manually annotated gastric polyp images is very small, which makes deep learning models prone to overfitting. Therefore, in this work, we propose a masked graph neural network model (MGNN) for real-time detection of the location of polyps in gastroscopic images in Healthcare 4.0. The MGNN model makes novel use of the graph structure and graph convolution operations to extract spatial location information and semantic information from the gastroscopic images. Information from masked self-training is additionally considered at the prediction stage to compensate for the shortage of manually labeled gastric polyp images. In this way, the MGNN model can automatically learn the essential features of gastroscopic images without labeled data. The effectiveness of the MGNN model has been verified on real gastroscopic images.
Unsupervised person re-identification (Re-ID) methods have made significant progress by exploiting contrastive learning from unlabeled data. However, previous approaches, including those based on cluster-level or instance-level contrastive loss, did not fully explore the inherent commonality of each identified individual in the unlabeled samples: the divergence of individual clusters and the convergence of different clusters lead to a set of noisy pseudo labels, which may result in label noise accumulation. To address this issue, we propose an instance-aware diversity feature generation (IDFG) framework, which forms a stable clustering feature space by mining diverse counterparts of given exemplars to update the memory dictionary of each cluster, thereby reducing the effect of noisy labels. Specifically, we combine instance segmentation and a masked autoencoder to generate foreground-invariant diverse counterparts of given exemplars, reducing the inter-class convergence caused by background similarity between different identities. Further, we introduce an instance-aware diversity feature mining module, which gradually creates more reliable clusters and generates more robust pseudo labels by exploiting the compactness and independence of the clustering to update the memory dictionary. Extensive experiments demonstrate that the proposed IDFG framework achieves impressive performance of 85.6%, 73.7%, and 31.0% mAP on Market1501, DukeMTMC-reID, and MSMT17, respectively.
The emerging technology of rotating synthetic aperture (RSA) presents a promising solution for the development of lightweight, large-aperture, and high-resolution optical remote sensing systems in geostationary orbit. However, the rectangular shape of the primary mirror and the distinctive imaging mechanism involving the continuous rotation of the mirror lead to a pronounced decline in image resolution along the shorter side of the rectangle compared to the longer side. The resolution also exhibits periodic time-varying characteristics. To address these limitations and enhance image quality, we begin by analyzing the imaging mechanism of the RSA system. Subsequently, we propose a single-image super-resolution method that utilizes a rotated varied-size window attention mechanism instead of full attention, based on the Vision Transformer architecture. We employ a two-stage training methodology for the network, where we pre-train it on images masked with stripe-shaped masks along the shorter side of the rectangular pupil. Following that, we fine-tune the network using unmasked images. Through the strip-wise mask sampling strategy, this two-stage training approach effectively circumvents the interference of lower confidence (clarity) information and outperforms training the network from scratch using the unmasked degraded images. Our digital simulation and semi-physical imaging experiments demonstrate that the proposed method achieves satisfactory performance. This work establishes a valuable reference for future space applications of the RSA system.
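The stripe-shaped masking used in the pre-training stage can be illustrated with a minimal sketch. It only covers axis-aligned stripes; the stripe width, the mask ratio, the random choice of stripes, and the function name `stripe_mask` are assumptions, and the rotation of the pupil described in the abstract is not modeled.

```python
import numpy as np


def stripe_mask(height, width, stripe, ratio, axis, rng):
    """Return a boolean (height, width) mask; True marks hidden pixels.

    axis=0 hides full-width bands of rows (horizontal stripes);
    axis=1 hides full-height bands of columns (vertical stripes).
    `stripe` is the band thickness in pixels and `ratio` the fraction of
    bands to hide (both hypothetical parameters).
    """
    length = height if axis == 0 else width
    n_stripes = length // stripe
    n_hidden = int(round(ratio * n_stripes))
    hidden = rng.permutation(n_stripes)[:n_hidden]
    line = np.zeros(length, dtype=bool)
    for s in hidden:
        line[s * stripe:(s + 1) * stripe] = True
    if axis == 0:
        return np.repeat(line[:, None], width, axis=1)
    return np.repeat(line[None, :], height, axis=0)
```

A pre-training step would zero out or drop the hidden stripes and ask the network to restore them, while fine-tuning would feed the unmasked degraded images, as the two-stage scheme in the abstract describes.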
The field of Deep Visual Analytics (DVA) has recently arisen from the idea of developing Visual Interactive Systems supported by deep learning, in order to provide them with large-scale data processing capabilities and to unify their implementation across different data and domains. In this paper we present DeepVATS, an open-source tool that brings the field of DVA to time series data. DeepVATS trains, in a self-supervised way, a masked time series autoencoder that reconstructs patches of a time series, and projects the knowledge contained in the embeddings of that model onto an interactive plot, from which time series patterns and anomalies emerge and can be easily spotted. The tool includes a back end for the data processing pipeline and model training, as well as a front end with an interactive user interface. We report results that validate the utility of DeepVATS, running experiments on both synthetic and real datasets. The code is publicly available at https://***/vrodriguezf/deepvats. (c) 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://***/licenses/by-nc-nd/4.0/).
Background and objective: COVID-19 is a highly infectious disease that can be clinically diagnosed based on diagnostic radiology. Deep learning is capable of mining the rich information implied in inpatient imaging data and classifying the different stages of the disease process. However, a large amount of training data is essential to train an excellent deep learning model. Unfortunately, due to factors such as privacy and labeling difficulties, annotated data for COVID-19 is extremely scarce, which motivates us to propose a more effective deep learning model that can assist specialist physicians in COVID-19 diagnosis. Methods: In this study, we introduce the masked autoencoder (MAE) for pre-training and fine-tuning directly on small-scale target datasets. Based on this, we propose Self-Supervised Learning with Self-Distillation on COVID-19 medical image classification (SSSD-COVID). In addition to computing the reconstruction loss on the masked image patches, SSSD-COVID computes a self-distillation loss on the latent representation of the encoder and the decoder outputs. This additional loss transfers knowledge from the global attention of the decoder to the encoder, which acquires only local attention. Results: Our model achieves 97.78% recognition accuracy on the SARS-COV-CT dataset containing 2481 images and is further validated on the COVID-CT dataset containing 746 images, where it achieves 81.76% recognition accuracy. Further introduction of external knowledge resulted in experimental accuracies of 99.6% and 95.27% on these two datasets, respectively. Conclusions: SSSD-COVID can obtain good results on the target dataset alone, and when external information is introduced, the performance of the model can be further improved to significantly outperform other models. Overall, the experimental results show that our method can effectively mine COVID-19 features from rare data and can assist pr
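The two loss terms mentioned in the Methods part can be sketched as follows. The masked reconstruction term follows the usual MAE formulation (mean squared error averaged over masked patches only). The distillation term is an assumption: the abstract does not give its form, so a cosine distance between encoder and decoder features at matching token positions is used here, and the names `cosine_distillation_loss`, `total_loss`, and the `weight` parameter are hypothetical.

```python
import numpy as np


def masked_reconstruction_loss(pred, target, mask):
    """Mean squared error averaged over masked patches only.

    pred, target: float arrays of shape (num_patches, patch_dim).
    mask: array of shape (num_patches,), 1 for masked patches, 0 for visible.
    """
    per_patch = ((pred - target) ** 2).mean(axis=1)
    return float((per_patch * mask).sum() / mask.sum())


def cosine_distillation_loss(student, teacher, eps=1e-8):
    """Mean cosine distance between matching rows (assumed distillation form).

    student: encoder latents; teacher: decoder features at the same tokens.
    NumPy has no gradients; in a training framework the teacher side would
    typically be detached so that only the encoder is updated by this term.
    """
    s = student / (np.linalg.norm(student, axis=1, keepdims=True) + eps)
    t = teacher / (np.linalg.norm(teacher, axis=1, keepdims=True) + eps)
    return float(np.mean(1.0 - np.sum(s * t, axis=1)))


def total_loss(pred, target, mask, enc_latent, dec_latent, weight=1.0):
    """Reconstruction loss plus a weighted self-distillation term."""
    rec = masked_reconstruction_loss(pred, target, mask)
    dist = cosine_distillation_loss(enc_latent, dec_latent)
    return rec + weight * dist
```

The reconstruction term ignores errors on visible patches, while the distillation term pulls the encoder's representation toward the decoder's, which is the direction of knowledge transfer the abstract describes.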