ISBN: 9798350338935 (print)
Assessing the quality of 360-degree images at the level of individual regions is a challenging task. Because no ground-truth mean opinion scores (MOS) exist for specific regions, regional quality is difficult to evaluate accurately. Existing datasets provide MOS only for entire 360-degree images, which limits the granularity of assessment. To overcome this challenge, we propose a novel framework that employs adaptive patch labeling. We leverage a set of 2D-IQA methods to generate quality score distributions for each patch in the 360-degree images. These distributions, combined with the available MOS, serve as labels for individual patches, providing a more comprehensive characterization of patch quality. Furthermore, we use these labels to adaptively select and refine deep neural features. Choosing label-specific features lets the model focus on the most relevant and informative features for each patch, which improves the accuracy of patch-based 360-degree image quality assessment. Experimental results on two benchmark datasets demonstrate that adaptive patch labeling and feature selection achieve accurate and reliable performance, thus advancing the field of 360-degree image quality assessment.
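The patch-labeling idea can be illustrated with a minimal sketch. It assumes that each 2D-IQA method returns a score already normalized to [0, 1] and that the "quality score distribution" is a simple normalized histogram. The abstract confirms neither detail, and the names `score_distribution`, `patch_label`, and `num_bins` are hypothetical.

```python
def score_distribution(scores, num_bins=5):
    """Normalized histogram of per-method scores, assumed to lie in [0, 1]."""
    counts = [0] * num_bins
    for s in scores:
        # A score of exactly 1.0 is placed in the last bin.
        b = min(int(s * num_bins), num_bins - 1)
        counts[b] += 1
    n = len(scores)
    return [c / n for c in counts]


def patch_label(scores, image_mos, num_bins=5):
    """Pair a patch's score distribution with the MOS of the whole image."""
    return {
        "distribution": score_distribution(scores, num_bins),
        "mos": image_mos,
    }


# Hypothetical scores from four 2D-IQA methods for a single patch.
label = patch_label([0.1, 0.5, 0.9, 1.0], image_mos=3.7)
```

Here the label keeps both the patch-level evidence (the histogram) and the only human rating available (the image-level MOS), which is the combination the abstract describes.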
ISBN: 9798350344868; 9798350344851 (print)
Unlike traditional knowledge distillation, self-knowledge distillation does not require a pre-trained teacher network. Existing methods, however, require either additional parameters or additional memory. To alleviate this problem, this paper proposes a more efficient self-knowledge distillation method named LRMS (learning from role-model samples). In every mini-batch, LRMS selects a role-model sample for each sampled category and takes its prediction as the proxy semantic for that category. The predictions of the other samples are then constrained to be consistent with the proxy semantics, which makes the distribution of predictions for samples within the same category more compact. Meanwhile, the regularization targets corresponding to the proxy semantics are set with a higher distillation temperature to better utilize the classificatory information about the categories. Experimental results show that diverse architectures achieve improvements on four image classification datasets when using LRMS. Code is available: https://***/KAI1179/LRMS
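A rough sketch of the role-model regularizer follows. The abstract does not say how the role-model sample is chosen, so this sketch assumes it is the sample most confident on its true label. It also assumes a KL-divergence consistency term and the temperatures `t_student=1.0` and `t_target=4.0`. All function names are hypothetical, and this is not the authors' released code.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-scaled softmax over a list of logits."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def pick_role_models(batch_logits, labels):
    """Assumed criterion: per class, the sample most confident on its label."""
    best = {}
    for idx, (logits, y) in enumerate(zip(batch_logits, labels)):
        conf = softmax(logits)[y]
        if y not in best or conf > best[y][0]:
            best[y] = (conf, idx)
    return {y: idx for y, (conf, idx) in best.items()}


def role_model_loss(batch_logits, labels, t_student=1.0, t_target=4.0):
    """Mean KL between each sample and its class's softened role-model target."""
    role = pick_role_models(batch_logits, labels)
    total, count = 0.0, 0
    for idx, (logits, y) in enumerate(zip(batch_logits, labels)):
        if idx == role[y]:
            continue  # the role model is not pulled toward itself
        target = softmax(batch_logits[role[y]], t_target)  # higher temperature
        pred = softmax(logits, t_student)
        total += kl_divergence(target, pred)
        count += 1
    return total / count if count else 0.0


# Hypothetical mini-batch: four samples, two classes.
batch = [[2.0, 0.0], [1.0, 0.0], [0.0, 3.0], [0.0, 1.0]]
loss = role_model_loss(batch, [0, 0, 1, 1])
```

Because the targets come from samples already in the mini-batch, a regularizer of this shape needs no extra network and no stored predictions from earlier epochs, which matches the efficiency argument in the abstract.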
To improve the anti-noise performance of traditional image binarization methods, this paper proposes a novel binarization method for low-quality images based on a threshold array system. The proposed method inv...
Fires pose a serious risk to the entire planet, destroying everything from huge cities to impenetrable forests. Fire detection systems can help prevent this damage, but they have been slow to be adopted because of concerns about the high cost, specialized connections, false alarms, and unreliability of existing facility-based detection systems. This work takes a first step towards using deep learning (DL) to detect fire in images. A Forest Fires dataset, obtained from the UCI ML Repository, is used for both training and testing. The preprocessing methods involve defining the paths for the training and testing data, converting images to pixel representations, normalizing the data, and selecting the target variable. The model is applied because of its ability to highlight the intricate details and patterns that are central to precise fire detection. This research presents new methods for detecting forest fires through a carefully selected dataset, transfer learning with a Hybrid (ResNet152V2 and InceptionV3) model, a deep-learning-based ConvNext model, and innovative preprocessing procedures. The novelty of this study lies in the effective incorporation of the Hybrid (ResNet152V2 and InceptionV3) model and the ConvNext model into the field of fire detection, demonstrating their capability to attain high levels of accuracy and precision. Visualization techniques and comprehensive evaluation metrics add to the study's novelty. The Hybrid (ResNet152V2 and InceptionV3) model attains an accuracy, recall, F1-score, and precision of 99.47%, while the ConvNext model attains 95.53% accuracy. This study contributes to the field of fire detection systems by using innovative deep neural network architectures to enhance performance and dependability.
Multi-center cervical cytology images have various image styles due to differences in staining and imaging techniques, which pose a significant challenge to the performance of automated cervical cancer diagnosis tools. We propose a dual-head network architecture that explicitly disentangles image features into content and style features, and applies contrastive self-supervised learning to a large number of unlabeled images, achieving enhanced generalization across various styles. We pretrain our model on 1,024,855 images cropped from 3,561 whole slide images (WSIs), and visualize the features using the t-distributed stochastic neighbor embedding (t-SNE) method, demonstrating the effectiveness of our method in distinguishing between content and style features. In the downstream task, we evaluate our model on 192,123 binary-classified images with 10 styles, and achieve the best accuracy among all methods for every style. Across the 10 different data sources, our method attained an average accuracy of 80.4%, outperforming all other comparative methods by 3% to 17%, demonstrating our method's potential to enhance the performance and robustness of automated cytology image analysis in multi-center settings.
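The abstract does not name the contrastive objective it uses. As an illustration only, the sketch below shows the widely used InfoNCE loss with cosine similarity, a common choice for contrastive self-supervised learning. It is a swapped-in technique, not necessarily the authors' loss, and the function names and the `temperature=0.1` value are assumptions.

```python
import math


def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0


def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE loss: negative log-softmax of the positive pair's similarity."""
    sims = [cosine(anchor, positive)] + [cosine(anchor, n) for n in negatives]
    logits = [s / temperature for s in sims]
    m = max(logits)
    log_denominator = m + math.log(sum(math.exp(l - m) for l in logits))
    return log_denominator - logits[0]


# Toy 2-D embeddings: a matching pair gives a small loss, a mismatch a large one.
aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In a dual-head setting, one would presumably apply such a loss separately to the content and style embeddings with different definitions of a "positive" pair, but that pairing scheme is not described in the abstract.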
ISBN: 9781728198354 (print)
The main objective of this paper is to improve the image segmentation model for handwritten notebook analytics. We conducted a considerable amount of research in this area to increase the accuracy and efficiency of segmentation. To address the issues with traditional methods, we introduced an attention mechanism and a recursive residual convolutional neural network into the multi-task U-Net model. By training and testing the model on a handwritten notebook dataset and comparing it with other existing technologies, we demonstrated the effectiveness of this method. The results showed a significant improvement in the model's accuracy. The research findings in this paper are therefore important for improving the technology of handwritten notebook analytics.
Reconstructing magnetic resonance (MR) images from undersampled k-space data has always been a challenging problem. Compressed Sensing (CS) can reconstruct images from a small amount of sampled data when combined with the robust feature learning ability of deep learning, and it can further reduce the sampling time. Most previous deep learning methods using compressed sensing relied heavily on convolutional neural networks (CNNs) or swin transformer blocks (STBs), even when they reconstructed images through stacking or cross-domain structures. However, due to the limited size of their receptive fields, convolutional neural networks cannot explore the global features of images. Conversely, vast receptive fields would increase model complexity and make the entire network difficult to train. In this paper, we propose a cascade dual-domain swin-conv U-Net for reconstruction (CDSCU-Net), which combines STBs and CNNs to focus on both local and global features during reconstruction. By fusing these features through our designed residual modules in the skip connections, we can mine more refined feature representations. Compared with the best-performing deep learning reconstruction methods based on compressed sensing in recent years, CDSCU-Net better preserves the structural details of images and achieves good reconstruction quality at lower acceleration factors. Additionally, the reconstructed images can serve as raw data for other tasks.
The development of smart homes, equipped with devices connected to the Internet of Things (IoT), has opened up new possibilities to monitor and control energy consumption. In this context, non-intrusive load monitoring (NILM) techniques have emerged as a promising solution for the disaggregation of total energy consumption into the consumption of individual appliances. The classification of electrical appliances in a smart home remains a challenging task for machine learning algorithms. In the present study, we compare and evaluate the performance of two different algorithms, namely Multi-Label K-Nearest Neighbors (MLkNN) and Convolutional Neural Networks (CNN), for NILM in two different scenarios: without and with data augmentation (DAUG). Our results show how the classification results can be better interpreted by generating a scalogram image from the power consumption signal data and processing it with CNNs. The CNN model with the proposed data augmentation performed significantly better than the other methods, obtaining a mean F1-score of 0.484 (an improvement of +0.234). Additionally, the Friedman statistical test indicates that it is significantly different from the other methods compared. Our proposed system can potentially reduce energy waste and promote more sustainable energy use in homes and buildings by providing personalized feedback and energy savings tips.
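For readers unfamiliar with the reported metric, the sketch below shows one way a mean F1-score can be computed for multi-label appliance classification. It assumes the mean is an unweighted average of per-appliance F1-scores (macro averaging), which the abstract does not state. The appliance names and on/off labels are invented for illustration.

```python
def f1_score(y_true, y_pred):
    """Binary F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 with no positives."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def mean_f1(true_by_appliance, pred_by_appliance):
    """Unweighted mean of per-appliance F1-scores (macro average, assumed)."""
    scores = [
        f1_score(true_by_appliance[name], pred_by_appliance[name])
        for name in true_by_appliance
    ]
    return sum(scores) / len(scores)


# Hypothetical on/off states for two appliances over four time windows.
truth = {"kettle": [1, 1, 0, 0], "fridge": [1, 0, 1, 0]}
preds = {"kettle": [1, 0, 1, 0], "fridge": [1, 0, 1, 0]}
result = mean_f1(truth, preds)
```

In this toy case the kettle scores 0.5 and the fridge 1.0, so the mean is 0.75. Read the same way, the reported 0.484 would be an average over the appliances in the study.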
Data-enabled predictive control (DeePC) for linear systems utilizes data matrices of recorded trajectories to directly predict new system trajectories, which is very appealing for real-life applications. In this paper, we leverage the universal approximation properties of neural networks (NNs) to develop neural DeePC algorithms for nonlinear systems. Firstly, we point out that the outputs of the last hidden layer of a deep NN implicitly construct a basis in a so-called neural (feature) space, while the output linear layer performs affine interpolation in the neural space. As such, we can train off-line a deep NN using large data sets of trajectories to learn the neural basis and compute on-line a suitable affine interpolation using DeePC. Secondly, methods for guaranteeing consistency of neural DeePC and for reducing computational complexity are developed. Several neural DeePC formulations are illustrated on a nonlinear pendulum example. Copyright (c) 2024 The Authors.
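To make the "data matrices of recorded trajectories" concrete, the sketch below covers the linear case only. It builds a Hankel matrix from a recorded scalar signal, the usual data matrix in DeePC, and forms a new trajectory as a weighted combination of its columns. The helper names `hankel` and `combine` are hypothetical. The neural variant described in the abstract replaces this raw-data basis with a learned one and is not reproduced here.

```python
def hankel(signal, depth):
    """Hankel matrix with `depth` rows; each column is a length-`depth` window."""
    cols = len(signal) - depth + 1
    if cols < 1:
        raise ValueError("signal is shorter than the requested depth")
    return [[signal[i + j] for j in range(cols)] for i in range(depth)]


def combine(matrix, weights):
    """Matrix-vector product H g: a weighted combination of recorded windows."""
    return [sum(h * g for h, g in zip(row, weights)) for row in matrix]


# Recorded trajectory 1..5 and its depth-3 Hankel matrix.
H = hankel([1, 2, 3, 4, 5], 3)

# Weights that sum to one give an affine interpolation of the windows.
trajectory = combine(H, [0.5, 0.5, 0.0])
```

With weights `[0.5, 0.5, 0.0]`, the result lies halfway between the first two recorded windows. In DeePC the weight vector is chosen on-line by an optimization problem rather than fixed by hand.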
While graph neural networks (GNNs) are popular in the deep learning community, they suffer from several challenges including over-smoothing, over-squashing, and gradient vanishing. Recently, a series of models have attempted to relieve these issues by first augmenting the node features and then imposing node-wise functions based on a multilayer perceptron (MLP); these are widely referred to as graph-augmented MLP (GA-MLP) models. However, while GA-MLP models enjoy deeper architectures for better accuracy, their efficiency largely deteriorates. Moreover, popular acceleration techniques such as stochastic-version or data parallelism cannot be effectively applied due to the dependency among samples (i.e., nodes) in graphs. To address these issues, in this article, instead of data parallelism, we propose a parallel graph deep learning Alternating Direction Method of Multipliers (pdADMM-G) framework to achieve model parallelism: parameters in each layer of GA-MLP models can be updated in parallel. The extended pdADMM-G-Q algorithm reduces communication costs by introducing a quantization technique. Theoretical convergence to a (quantized) stationary point of the pdADMM-G algorithm and the pdADMM-G-Q algorithm is provided with a sublinear convergence rate o(1/k), where k is the number of iterations. Extensive experiments demonstrate the convergence of the two proposed algorithms. Moreover, they lead to a larger speedup and better performance than all state-of-the-art comparison methods on nine benchmark datasets. Last but not least, the proposed pdADMM-G-Q algorithm reduces communication overheads by up to 45% without loss of performance.
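The abstract says only that pdADMM-G-Q introduces "the quantization technique" to cut communication costs. One simple form of quantization, shown below as an assumption rather than as the paper's scheme, projects each value onto the nearest element of a small finite set. Values exchanged between layers can then be sent as short indices instead of full-precision floats. The level set `[-1, 0, 1]` and the function name `quantize` are hypothetical.

```python
def quantize(values, levels):
    """Map each value to the nearest element of the finite set `levels`."""
    return [min(levels, key=lambda q: abs(q - v)) for v in values]


# Hypothetical layer outputs and a three-level quantization set.
outputs = [0.2, 0.6, -1.4]
quantized = quantize(outputs, [-1, 0, 1])
```

With three levels, each entry needs at most two bits to transmit, which gives a sense of where communication savings of the kind reported could come from.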