Fault diagnosis methods based on models and signal processing suffer from problems such as difficulty in modeling and difficulty in extracting signal features. As a neural network becomes deeper, gradients may vanish or explode, and directly converting the fault signal into a one-dimensional or two-dimensional image as the network input cannot retain the temporal information between signal samples, resulting in information loss. To solve these problems, this paper proposes a fault diagnosis method based on Compressed Sensing (CS) and a lightweight SqueezeNet model. First, CS is used to sparsify the original signal, and the Compressive Sampling Matching Pursuit (CoSaMP) algorithm is used to compress and reconstruct the signal, which denoises it and removes data redundancy. Second, the compressed and reconstructed signal is encoded as a recurrence plot (RP) to generate a two-dimensional image that retains the temporal correlation characteristics of the signal. Third, the traditional SqueezeNet model is improved with residual learning to enhance its feature extraction ability. The RP dataset is input into the improved SqueezeNet model for fault feature extraction and fault classification. Finally, experimental results show that, compared with other methods, the proposed method identifies faults with high diagnostic accuracy on rolling bearing data samples from wind turbines.
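The recurrence-plot encoding step mentioned in this abstract can be illustrated with a minimal sketch. It assumes the simplest textbook form, a thresholded distance between scalar samples with no phase-space embedding. The function name `recurrence_plot` and the threshold `eps` are illustrative choices, not details taken from the paper.

```python
def recurrence_plot(signal, eps):
    """Binary recurrence matrix: entry (i, j) is 1 when samples i and j
    lie within eps of each other, else 0.

    Assumption: embedding dimension 1 (raw scalar samples); the paper's
    actual RP settings are not stated in the abstract.
    """
    n = len(signal)
    return [
        [1 if abs(signal[i] - signal[j]) <= eps else 0 for j in range(n)]
        for i in range(n)
    ]


# Toy signal alternating between two levels; the resulting image shows
# which time steps revisit a similar amplitude.
signal = [0.0, 1.0, 0.1, 1.1]
rp = recurrence_plot(signal, eps=0.2)
for row in rp:
    print(row)
```

Because entry (i, j) depends on the pair of time indices, the resulting image keeps the temporal relationship between samples, which is the property the abstract attributes to RP encoding.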
Blind image quality assessment (BIQA) is crucial for user satisfaction and the performance of various image processing applications. Most BIQA methods directly use a pre-trained model to extract features and then perform feature fusion. However, the features extracted by pre-trained models may contain information irrelevant to BIQA. Although some methods pre-train the feature extraction network from scratch, these approaches raise computational costs and resource demands. In this letter, a Feature-selected Pyramid Network (FsPN) is proposed to address this issue from a different perspective. First, a spatial selection module selects useful information from the features extracted by the pre-trained model. Additionally, a pyramid network based on skip connections is utilized to fuse the selected multi-scale features. The proposed method is verified on six public datasets, where it consistently outperforms existing state-of-the-art methods, affirming its effectiveness and adaptability.
Accelerated MR image reconstruction algorithms based on deep learning have demonstrated tremendous potential in improving efficiency and performance. Compared with static reconstruction, the utilization of temporal correlation is the key to cardiac cine MR image reconstruction, and modeling the information of the temporal dimension during the reconstruction process can effectively reduce artifacts. However, current methods typically depend on 3D convolutional neural networks or recurrent neural networks, which may not be able to capture both local feature details and long-range feature dependencies at the same time. In this paper, we propose a Dual-Domain Inter-Frame Feature Enhancement Network (DIFENet) for cardiac MR image reconstruction to extract beneficial information from the neighboring frames. First, an inter-frame feature fusion strategy is designed to learn the non-local spatio-temporal correlations of cardiac motion with attention mechanisms from the features of multi-frame data. Moreover, the fused inter-frame features are used to guide the subsequent refinement of reconstructed details by modulating the multi-scale features. Beyond that, a dual-domain parallel structure is incorporated into the framework, considering the complementarity of inter-frame features in the frequency and image domains. Comprehensive experiments demonstrate that the proposed method consistently outperforms other state-of-the-art static and dynamic reconstruction methods at multiple acceleration rates.
Multimodal biometric systems integrate multiple biometric traits to enhance recognition accuracy and robustness. This study introduces a novel face-iris multimodal biometric framework combining texture-based and deep learning methods. The system applies uniform local binary patterns to capture fine-grained texture features. Additionally, a dual convolutional neural network (CNN) model, incorporating AlexNet and an attention mechanism, extracts high-level discriminative features from entire face and iris images. The attention mechanism prioritizes critical regions in feature maps, improving focus on discriminative details while mitigating noise. The key innovation of the system lies in integrating texture-based and CNN-based features, which collectively enable robust feature extraction and classification. Furthermore, a decision-level fusion strategy based on majority voting combines the independent decisions of the individual methods into a resilient final classification decision. Experiments conducted on the CASIA-Iris-Distance database demonstrate a recognition performance of 99.53%, significantly outperforming unimodal and state-of-the-art multimodal systems.
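Decision-level fusion by majority voting, as named in this abstract, can be sketched in a few lines. The classifier names and identity labels below are hypothetical placeholders. The abstract does not say how many voters are used or how ties are resolved, so this sketch makes no tie-breaking guarantee.

```python
from collections import Counter


def majority_vote(decisions):
    """Return the label predicted by the largest number of classifiers.

    `decisions` is a list of labels, one per independent classifier.
    Ties are not handled specially in this sketch.
    """
    counts = Counter(decisions)
    label, _ = counts.most_common(1)[0]
    return label


# Hypothetical per-method decisions for one probe sample.
decisions = {
    "lbp_face": "subject_07",
    "cnn_face": "subject_07",
    "cnn_iris": "subject_12",
}
fused = majority_vote(list(decisions.values()))
print(fused)
```

With an odd number of voters and two candidate labels a strict majority always exists. With more labels, a real system would need an explicit tie rule, such as falling back on the most confident classifier.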
As a large number of paintings are digitized, the automatic recognition and retrieval of artistic image styles becomes very meaningful. Because there is no standard definition or quantitative description of the characteristics of artistic style, the representation of style is still a difficult problem. Recently, some works have used deep correlation features from neural style transfer to describe the texture characteristics of paintings and have achieved exciting results. Inspired by this, this paper proposes a multimodal style aggregation network that incorporates three modalities of artistic images: texture, structure, and color information. Specifically, a group-wise Gram aggregation model is proposed to capture multi-level texture styles. Global average pooling (GAP) and a histogram operation are employed to distill the high-level structural style and the low-level color style, respectively. Moreover, an improved deep correlation feature calculation method called learnable Gram (L-Gram) is proposed to enhance the ability to express style. Experiments show that our method outperforms several state-of-the-art methods on five style datasets.
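The deep correlation features referred to here are usually computed as a Gram matrix over the channels of a feature map, as in neural style transfer. The sketch below shows that standard, unnormalized Gram computation only. It is not the paper's group-wise aggregation or L-Gram variant, whose details the abstract does not give, and the toy feature values are made up for illustration.

```python
def gram_matrix(features):
    """Channel-by-channel correlation of a flattened feature map.

    `features` is a list of C channels, each a list of N spatial
    activations; entry (i, j) is the inner product of channels i and j.
    """
    return [
        [sum(a * b for a, b in zip(ch_i, ch_j)) for ch_j in features]
        for ch_i in features
    ]


# Toy feature map with 2 channels and 2 spatial positions.
features = [[1.0, 2.0], [3.0, 4.0]]
gram = gram_matrix(features)
print(gram)
```

The Gram matrix discards spatial layout and keeps only which channels fire together, which is why it is commonly used as a texture descriptor.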
Deep neural networks are increasingly used in image processing tasks. However, deep learning models are often vulnerable to adversarial attacks. Active defense is an important way to deal with adversarial attacks in image identification. This letter proposes an active defense strategy against adversarial attacks based on Generative Adversarial Networks (GAN). With the target network already trained and kept unchanged, the perturbation produced by the generator is added to the original image before it is input to the target network; this has little effect on the performance of the target network and resists adversarial attacks well. Experiments apply five adversarial attacks to three target network models on the MNIST dataset and two target network models on CIFAR10, and compare the proposed method with two other defense methods; the results show that our method achieves good defense performance.
ISBN: (print) 9798350344868; 9798350344851
In recent years, score-based generative models (SGM) have achieved state-of-the-art (SOTA) performance in noisy image restoration [1, 2]. At present, however, most of these methods operate in position space and do not model the velocity and acceleration of the image along the restoration path. In this paper, we propose a new image restoration method called conditional acceleration score approximation (CASA), which introduces velocity and acceleration variables on top of the data position along the recovery path. Guided by the degraded image, CASA can effectively and dynamically control the direction and speed of motion along the diffusion path in the reverse-time stochastic differential equation. The key to this process is therefore how to inject the degraded image as guidance into the third-order reverse-time process in this position-velocity-acceleration space, especially in the evolution direction of the diffusion path. We propose a strategy for approximating the conditional acceleration score (CAS) by decomposing the true posterior CAS into a prior CAS and an observed acceleration score for the measurement at the current moment. Experiments on 3 different datasets and 7 kinds of restoration tasks show that CASA outperforms other methods and achieves a new SOTA.
Object re-identification (reID) plays a pivotal role in traffic surveillance systems for matching objects like people, cars, and motorcycles across multiple cameras. This is an active area of research in both industry and academia due to the ever-growing population and the need for smart surveillance, public safety, and traffic management. Most current reID methods use manually designed deep convolutional neural networks as the backbone, whose settings are not optimal as the network complexity increases. This paper introduces MNASreID, an automated approach for designing deep convolutional neural networks specifically for motorcycle reID. Key contributions include proposing a NAS-based optimization framework and designing a comprehensive search space covering backbone architectures and hyperparameters. The grasshopper optimization algorithm is used as the NAS search strategy to find the optimal DNN model. Experimental results on two motorcycle datasets, MoRe and BPReID, demonstrate MNASreID's ability to automatically identify efficient DNN models for reID tasks. Comparative evaluation against existing algorithms reveals significant performance enhancements. Specifically, MNASreID achieves improvements of +1.14% and +1.24% in r1 and mAP metrics, respectively, on the MoRe dataset. On the BPReID dataset, it outperforms existing approaches by +26.82% and +29.56% in r1 and mAP metrics, respectively.
We introduce a novel approach for optimizing image signal processing (ISP) rendering pipelines for night photography through a Bayesian derivative-free procedure. Traditional neural-network-based ISPs depend on differentiable operations to enable backpropagation-based optimization, a requirement that can impose significant constraints. Our method circumvents this by employing Bayesian optimization to fine-tune the pipeline's parameters, independently of their differentiability. Additionally, we address the need for paired data to enable supervised optimization: while such paired data is available in public datasets, it is expensive to collect for new imaging devices. To this end, we design a raw-to-raw mapping procedure that aligns images from an available paired dataset to the target unpaired dataset. This allows us to supervise the optimization of our solution directly within the target space, without the need for device-specific paired data. We validate our approach with extensive experimentation on paired and unpaired datasets, demonstrating its efficacy using both subjective and objective evaluation metrics. Our code is made available for public download at https://***/TheZino/Bayesian-pipeline-optimization.
Various types of control methods are utilized in wind turbines to obtain the optimal amount of power from wind. These methods require the turbine dynamics, and the wind speed is a critical component of the analysis. However, the stochastic nature of wind means that wind speed sensor signals are noisy. This paper proposes a radial basis function neural network (RBFNN) based filter to process the signal, trained with a simulated wind signal. The network differs from a traditional filter in that the number of neurons and the "learning rate" of the network dictate the properties of the filtered signal. The information flow in the network is as follows: the signal to be processed is the input, which is then used as the argument of a radial basis function (which determines the "distance" of each input value from a particular preset point), and the result is multiplied by a weight. The learning rate is obtained from a novel equation proposed in the paper. The results show that the proposed scheme is versatile in terms of noise removal and signal smoothing and, if required, can match the performance of a Butterworth filter. Furthermore, live training and adaptability are advantages over a classic filter. Three "modes" of processing the signal are identified by choosing certain ranges of values for the parameters of the RBFNN (number of neurons and learning rate), and the control designer can choose which one to implement based on performance requirements.
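The information flow described in this abstract (input, radial basis function around a preset point, multiplication by a weight) can be sketched as follows. Everything specific here is an assumption for illustration: the Gaussian basis, the centers and width, the least-mean-squares weight update, and the fixed `learning_rate`. In particular, the paper's own learning-rate equation is not given in the abstract and is not reproduced.

```python
import math


def rbf_activations(x, centers, width):
    """Gaussian response of each neuron to input x (assumed basis)."""
    return [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]


def rbf_output(x, centers, weights, width):
    """Filtered value: weighted sum of the basis responses."""
    phi = rbf_activations(x, centers, width)
    return sum(w * p for w, p in zip(weights, phi))


def lms_step(x, target, centers, weights, width, learning_rate):
    """One least-mean-squares update of the weights (assumed rule)."""
    phi = rbf_activations(x, centers, width)
    error = target - sum(w * p for w, p in zip(weights, phi))
    return [w + learning_rate * error * p for w, p in zip(weights, phi)]


# Hypothetical single-neuron example: one center at 0, unit width.
centers, width = [0.0], 1.0
weights = lms_step(0.0, 1.0, centers, [0.0], width, learning_rate=0.5)
print(weights, rbf_output(0.0, centers, weights, width))
```

In this form, the number of centers and the learning rate control how closely the output tracks the input, in line with the abstract's statement that these two parameters dictate the properties of the filtered signal.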