Deep-learning-based pose estimation has advanced considerably, but few works have examined its performance in dense crowds such as classroom scenes; furthermore, image augmentation for pose estimation has not been designed with task-specific knowledge in mind. The masked autoencoder (MAE) has shown a notable capability for image reconstruction: its masking mechanism, which randomly drops patches, forces the model to reconstruct unknown pixels from known ones. Inspired by this self-supervised learning method, in which restoring the features lost to the mask is consistent with tackling occlusion in classroom scenarios, we found that the transfer performance of the pre-trained weights can serve as a model-based augmentation to overcome the otherwise intractable occlusion in classroom pose estimation. In this study, we propose a top-down pose estimation method that uses the MAE's natural ability to reconstruct missing information as an effective occluded-image augmentation for pose estimation. Unlike the original MAE, which uses a 75% random mask ratio, we use the probabilistic heatmap of the keypoint distribution as the reference for masking, which we name Pose Mask. To test our method in heavily occluded classroom scenes, we collected a new dataset for classroom pose estimation, named Class Pose, and conducted extensive experiments, whose results showed promising performance.
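The idea of heatmap-guided masking can be illustrated with a minimal sketch. The abstract does not specify how the heatmap drives the mask, so everything here is an assumption: the function name `pose_mask`, the choice to mask patches with higher keypoint probability more often, and the use of weighted sampling without replacement are illustrative only, not the authors' implementation.

```python
import random


def pose_mask(patch_probs, mask_ratio, rng=None):
    """Choose patch indices to mask, favouring high keypoint probability.

    patch_probs: one non-negative weight per image patch (hypothetical
                 per-patch summary of a keypoint heatmap).
    mask_ratio:  fraction of patches to mask.
    Assumption: higher keypoint probability means more likely to be masked.
    """
    rng = rng or random.Random()
    n_mask = round(len(patch_probs) * mask_ratio)
    eps = 1e-6  # keeps zero-probability patches selectable, but very unlikely
    # Weighted sampling without replacement (Efraimidis-Spirakis):
    # draw u ~ U(0, 1) per patch, rank by u ** (1 / weight), keep the top n_mask.
    keyed = [(rng.random() ** (1.0 / (p + eps)), i)
             for i, p in enumerate(patch_probs)]
    keyed.sort(reverse=True)
    return sorted(i for _, i in keyed[:n_mask])
```

With a uniform `patch_probs` this reduces to random masking at the given ratio, which makes it easy to compare against the original MAE setting.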
The emergence of Healthcare 4.0 brings convenience to the diagnosis of patients with gastric polyps. A computer-aided gastric polyp detection model can automatically locate gastric polyps in gastroscopic images, which helps endoscopists detect polyps in time and reduces the rate of missed diagnoses. Deep learning models have achieved remarkable success on gastroscopic images; however, the following problems remain. First, models based on convolutional neural networks analyze only the underlying pixels of the gastroscopic image to locate the polyp and do not take into account the spatial and positional information contained in the anatomical structure shown in the image. Second, although gastroscopic images are plentiful, manually annotated gastric polyp images are very few, which makes deep learning models prone to overfitting. Therefore, in this work, we propose a masked graph neural network (MGNN) model for real-time detection of polyp locations in gastroscopic images in the context of Healthcare 4.0. The MGNN model uses a graph structure and graph convolution operations to extract the spatial location information and semantic information of gastroscopic images. Information from masked self-training is additionally considered at the prediction stage to compensate for the shortage of manually labeled gastric polyp images. In this way, the MGNN model can automatically learn the essential features of gastroscopic images without labeled data. The effectiveness of the MGNN model has been verified on real gastroscopic images.
Cell morphology analysis is a crucial diagnostic tool for identifying blood diseases, including acute leukemia. However, the traditional analysis process is time-consuming and requires significant investment in labor and expertise from laboratory doctors. In recent years, deep-learning-based automatic blood cell classification techniques have gained popularity, but acquiring image data and annotations in the medical field is often challenging and costly. With the increasing use of deep learning techniques in clinical practice, it has become vital to ensure both accuracy and high-quality annotations. To address these challenges, this paper proposes a blood cell classification method based on the masked autoencoder (MAE) and active learning (AL), named MAE4AL. The method uses the self-supervised loss of the MAE together with sample uncertainty to select the most valuable samples for labeling. We conduct a comprehensive comparison between our method and the state-of-the-art blood cell classification technique, which employs ResNeXt. Remarkably, our approach achieves classification performance comparable to ResNeXt while using only 20% of the labeled data. With half of the labeled data, our method achieves a classification accuracy of 96.36%, surpassing the ResNeXt model trained with 100% of the labeled data by 0.79%.
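A selection rule of this kind might look like the following minimal sketch. The abstract states only that the MAE's self-supervised loss and sample uncertainty are both used; the specific scoring here (min-max normalized reconstruction loss mixed with normalized predictive entropy through a hypothetical weight `alpha`) is an assumption for illustration, not the MAE4AL formula.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_for_labeling(samples, budget, alpha=0.5):
    """Rank unlabeled samples and return the ids of the top `budget`.

    samples: list of (sample_id, recon_loss, class_probs) tuples, where
             recon_loss is the MAE reconstruction loss for that sample and
             class_probs is the classifier's predicted distribution.
    alpha:   hypothetical mixing weight between the two signals.
    """
    losses = [loss for _, loss, _ in samples]
    lo, hi = min(losses), max(losses)
    span = (hi - lo) or 1.0  # avoid dividing by zero when all losses match
    scored = []
    for sample_id, loss, probs in samples:
        norm_loss = (loss - lo) / span                      # scaled to [0, 1]
        norm_entropy = entropy(probs) / math.log(len(probs))  # scaled to [0, 1]
        score = alpha * norm_loss + (1 - alpha) * norm_entropy
        scored.append((score, sample_id))
    scored.sort(reverse=True)
    return [sample_id for _, sample_id in scored[:budget]]
```

Samples that are both poorly reconstructed and uncertainly classified rise to the top, which is the intuition behind spending a limited labeling budget on them first.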
Unsupervised person re-identification (Re-ID) methods have made significant progress by exploiting contrastive learning on unlabeled data. However, previous approaches, including those using cluster-level or instance-level contrastive loss, did not fully explore the inherent commonality of each identified individual across unlabeled samples: the divergence of individual clusters and the convergence of different clusters lead to a set of noisy pseudo labels, which may result in label noise accumulation. To address this issue, we propose an instance-aware diversity feature generation (IDFG) framework, which forms a stable clustering feature space by mining diverse counterparts of given exemplars to update the memory dictionary of each cluster, thereby reducing the effect of noisy labels. Specifically, we combine instance segmentation and a masked autoencoder to generate foreground-invariant diverse counterparts of given exemplars, reducing the inter-class convergence caused by background similarity between different identities. Further, we introduce an instance-aware diversity feature mining module, which gradually creates more reliable clusters and generates more robust pseudo labels by exploiting the compactness and independence of the clustering to update the memory dictionary. Extensive experiments demonstrate that the proposed IDFG framework achieves 85.6%, 73.7%, and 31.0% mAP on Market1501, DukeMTMC-reID, and MSMT17, respectively.
The emerging technology of rotating synthetic aperture (RSA) presents a promising solution for the development of lightweight, large-aperture, and high-resolution optical remote sensing systems in geostationary orbit. However, the rectangular shape of the primary mirror and the distinctive imaging mechanism involving the continuous rotation of the mirror lead to a pronounced decline in image resolution along the shorter side of the rectangle compared to the longer side. The resolution also exhibits periodic time-varying characteristics. To address these limitations and enhance image quality, we begin by analyzing the imaging mechanism of the RSA system. Subsequently, we propose a single-image super-resolution method that utilizes a rotated varied-size window attention mechanism instead of full attention, based on the Vision Transformer architecture. We employ a two-stage training methodology for the network, where we pre-train it on images masked with stripe-shaped masks along the shorter side of the rectangular pupil. Following that, we fine-tune the network using unmasked images. Through the strip-wise mask sampling strategy, this two-stage training approach effectively circumvents the interference of lower confidence (clarity) information and outperforms training the network from scratch using the unmasked degraded images. Our digital simulation and semi-physical imaging experiments demonstrate that the proposed method achieves satisfactory performance. This work establishes a valuable reference for future space applications of the RSA system.
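A stripe-shaped mask of the kind used in the pre-training stage can be sketched as follows. The abstract does not give the stripe width, the mask ratio, or how the stripes are oriented relative to the image axes, so the function name `stripe_mask` and all of its parameters are assumptions chosen for illustration.

```python
import random


def stripe_mask(height, width, stripe_width, mask_ratio, axis=1, rng=None):
    """Build a height x width grid of 0/1 values, where 1 marks a masked pixel.

    axis=1 masks whole bands of columns; axis=0 masks whole bands of rows.
    Which axis corresponds to the shorter side of the pupil is an assumption
    left to the caller.
    """
    rng = rng or random.Random()
    length = width if axis == 1 else height
    n_stripes = -(-length // stripe_width)  # ceiling division
    n_masked = round(n_stripes * mask_ratio)
    masked_stripes = set(rng.sample(range(n_stripes), n_masked))

    def is_masked(row, col):
        position = col if axis == 1 else row
        return position // stripe_width in masked_stripes

    return [[1 if is_masked(r, c) else 0 for c in range(width)]
            for r in range(height)]
```

Because entire bands are dropped along one direction, the network has to rely on information from the complementary direction to fill them in, which is one plausible reading of how stripe-wise sampling sidesteps the lower-confidence information.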
The field of Deep Visual Analytics (DVA) has recently arisen from the idea of developing visual interactive systems supported by deep learning, in order to provide them with large-scale data processing capabilities and to unify their implementation across different data and domains. In this paper we present DeepVATS, an open-source tool that brings DVA to time series data. DeepVATS trains, in a self-supervised way, a masked time series autoencoder that reconstructs patches of a time series, and projects the knowledge contained in that model's embeddings onto an interactive plot, from which time series patterns and anomalies emerge and can be easily spotted. The tool includes a back-end for the data processing pipeline and model training, as well as a front-end with an interactive user interface. We report results that validate the utility of DeepVATS, running experiments on both synthetic and real datasets. The code is publicly available at https://***/vrodriguezf/deepvats. (c) 2023 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY-NC-ND license (http://***/licenses/by-nc-nd/4.0/).
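The patch-masking step behind a masked time series autoencoder can be sketched in a few lines. This is not taken from the DeepVATS code base: the function names, the fill value, and the choice to score reconstruction only on masked positions are assumptions that follow the usual masked-autoencoder recipe.

```python
import random


def mask_patches(series, patch_len, mask_ratio, rng=None, fill=0.0):
    """Mask whole patches of a 1-D series.

    Returns (masked_series, is_masked), where is_masked[t] is True for every
    time step that was overwritten with `fill`. A trailing remainder shorter
    than patch_len is never masked.
    """
    rng = rng or random.Random()
    n_patches = len(series) // patch_len
    n_masked = round(n_patches * mask_ratio)
    chosen = rng.sample(range(n_patches), n_masked)
    masked = list(series)
    is_masked = [False] * len(series)
    for patch in chosen:
        for t in range(patch * patch_len, (patch + 1) * patch_len):
            masked[t] = fill
            is_masked[t] = True
    return masked, is_masked


def masked_mse(original, reconstruction, is_masked):
    """Mean squared error computed over the masked time steps only."""
    errors = [(o - r) ** 2
              for o, r, m in zip(original, reconstruction, is_masked) if m]
    return sum(errors) / len(errors) if errors else 0.0
```

In training, a model would receive `masked` as input and be penalized with `masked_mse` against the original series; the encoder's embeddings are what a tool like the one described would then project for visual exploration.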
Background and objective: COVID-19 is a highly infectious disease that can be clinically diagnosed from radiological imaging. Deep learning is capable of mining the rich information contained in inpatient imaging data and classifying the different stages of the disease process. However, training a strong deep learning model requires a large amount of training data. Unfortunately, due to factors such as privacy and labeling difficulties, annotated data for COVID-19 is extremely scarce, which motivates us to propose a more effective deep learning model that can assist specialist physicians in COVID-19 diagnosis. Methods: In this study, we introduce the masked autoencoder (MAE) for pre-training and fine-tuning directly on small-scale target datasets. Based on this, we propose Self-Supervised Learning with Self-Distillation on COVID19 medical image classification (SSSD-COVID). In addition to computing the reconstruction loss on the masked image patches, SSSD-COVID computes a self-distillation loss on the latent representations output by the encoder and the decoder. This additional loss transfers knowledge from the global attention of the decoder to the encoder, which acquires only local attention. Results: Our model achieves 97.78% recognition accuracy on the SARS-COV-CT dataset, which contains 2481 images, and is further validated on the COVID-CT dataset, which contains 746 images, where it achieves 81.76% recognition accuracy. Introducing external knowledge raises the experimental accuracies to 99.6% and 95.27% on these two datasets, respectively. Conclusions: SSSD-COVID obtains good results on the target dataset alone, and when external information is introduced, the model's performance improves further and significantly outperforms other models. Overall, the experimental results show that our method can effectively mine COVID-19 features from rare data and can assist pr