We present a statistical framework to benchmark the performance of reconstruction algorithms for linear inverse problems, in particular, neural-network-based methods that require large quantities of training data. We generate synthetic signals as realizations of sparse stochastic processes, which makes them ideally matched to variational sparsity-promoting techniques. We derive Gibbs sampling schemes to compute the minimum mean-square error estimators for processes with Laplace, Student's t, and Bernoulli-Laplace innovations. These allow our framework to provide quantitative measures of the degree of optimality (in the mean-square-error sense) for any given reconstruction method. We showcase our framework by benchmarking the performance of some well-known variational methods and convolutional neural network architectures that perform direct nonlinear reconstructions in the context of deconvolution and Fourier sampling. Our experimental results support the understanding that, while these neural networks outperform the variational methods and achieve near-optimal results in many settings, their performance deteriorates severely for signals associated with heavy-tailed distributions.
This paper proposes a graph linear canonical transform (GLCT) for graph signal processing, obtained by decomposing the linear canonical parameter matrix into a fractional Fourier transform, a scale transform, and a chirp modulation. The GLCT enables adjustable smoothing modes, enhancing alignment with graph signals. Leveraging traditional fractional-domain time-frequency analysis, we investigate vertex-frequency analysis in the graph linear canonical domain, aiming to overcome limitations in capturing local information. Filter design methods, including optimal design and learning with stochastic gradient descent, are analyzed and applied to image classification tasks. The proposed GLCT and vertex-frequency analysis present innovative approaches to signal processing challenges, with potential applications in various fields.
Human action detection in static images is an active and challenging field within computer vision. Given the limited features of a single image, achieving precise detection results requires the full utilization of the image's intrinsic features, as well as the integration of methods from other fields to process the images and generate additional features. In this paper, we propose a novel dual-pathway model for action detection, whose main pathway employs a convolutional neural network to extract image features and predict the probability of the image belonging to each action. Meanwhile, the auxiliary pathway uses a pose estimation algorithm to obtain human key points and connection information for constructing a graphical human model for each image. These graphical models are then transformed into graph data and input into a graph neural network for feature extraction and probability prediction. Finally, a corresponding connected neural network proposed by us is used to fuse the probability vectors generated by the two pathways; it learns the weight of each action class in each vector to enable their subsequent fusion. Transfer learning is also used in our model to improve its training speed and detection accuracy. Experimental results on three challenging datasets (Stanford40, PPMI, and MPII) illustrate the superiority of the proposed method.
Image analysis is crucial for microscopic medical images, particularly for imaging sperm cells. Sperm morphology analysis, a key step in assisted fertilization techniques, can be used to evaluate male infertility, which significantly impacts couples' quality of life. This paper proposes a technique that combines convolutional neural networks (CNN) with modified Havrda-Charvat entropic segmentation to identify normal sperm cells in pre-processed image samples. Initially, a noise removal algorithm is applied to the sperm cell images, followed by segmentation using the modified Havrda-Charvat entropy method to isolate individual sperm cells. High detection accuracy is then achieved through a combination of deep learning and feature extraction. This research optimizes three stages: image pre-processing with a Wiener filter, segmentation using the Havrda-Charvat entropy technique, and abnormality detection with a CNN. The proposed method achieves 98.99% accuracy in identifying normal sperm cells based on their morphology, outperforming state-of-the-art techniques. By enhancing sperm cell analysis methods, this research facilitates more precise and automated segmentation, processing, and detection. The proposed approach has the potential to revolutionize reproductive medicine by improving the accuracy of fertility diagnoses and the effectiveness of treatments.
With the stochastic resonance (SR) mechanism, the output signal of a nonlinear system can be enhanced by adding noise to it. On this basis, this paper proposes an image denoising algorithm based on adaptive bi-dimensional stochastic resonance (ABSR). First, the image is sampled as a bi-dimensional signal, and an adaptive bi-dimensional dynamic nonlinear system model is constructed. The peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) of the output image serve as the dual evaluation criteria of the adaptive system, and the optimal parameters of the model are obtained automatically by adjusting the parameters of the dynamic nonlinear system using the reverse positioning method. Compared with the traditional mean filter, the median filter, and one-dimensional stochastic resonance, the image restored by dynamic adaptive bi-dimensional stochastic resonance is closer to the original image, and the histogram, PSNR, and SSIM of the output image are also significantly better than those of the other three methods. The results show that dynamic adaptive bi-dimensional stochastic resonance has a better denoising effect and better robustness to changes in noise intensity in image processing.
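The abstract above scores the output image with PSNR. As a minimal sketch of that metric only (not of the ABSR algorithm itself), the following assumes 8-bit grayscale images stored as nested lists; the function names `mse` and `psnr` and the sample pixel values are illustrative and do not come from the paper.

```python
import math


def mse(reference, test):
    """Mean squared error between two equally sized 2-D images (lists of rows)."""
    total = 0.0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            total += (r - t) ** 2
            count += 1
    return total / count


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB; infinite when the images are identical."""
    err = mse(reference, test)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(max_value ** 2 / err)


# Illustrative 2x2 images (assumed values, not data from the paper).
clean = [[10, 20], [30, 40]]
noisy = [[12, 18], [33, 40]]
print(f"PSNR: {psnr(clean, noisy):.2f} dB")
```

A higher PSNR means the restored image is closer to the reference, which is why it can act as one of the two criteria when the system parameters are tuned.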
To improve the performance of the image preprocessing module in consumer electronics that use an active-matrix organic light-emitting diode (AMOLED) display panel, the concept of judging before processing is proposed for salt-and-pepper denoising. First, a dataset for salt-and-pepper noise image classification is constructed, and a convolutional neural network (CNN) for judging noisy images (CNN-J) is trained. An image classified as normal by CNN-J is not processed, while an image classified as noisy is denoised. In the denoising process, a marking image and a rough denoised image are generated by a CNN for the noise mask (CNN-M) and a CNN for denoising (CNN-D), respectively. Subsequently, the refined denoised image is output using the proposed refining mechanism. The middle layers of CNN-M and CNN-D are constructed with depth-separable CNNs to reduce network complexity. Experimental results show that the misjudgment rate of CNN-M marking is reduced by 19.94% compared with the best existing marking method. Compared with traditional methods, the peak signal-to-noise ratio of the proposed method is increased by 2.95% and the information loss is reduced by 21.46%. In addition, the computational complexity is at least 11.18% lower than that of a traditional CNN. Finally, the display of salt-and-pepper denoised images on a flexible AMOLED is realized.
Video analysis is a computer vision task that is useful for many applications, such as surveillance, human-machine interaction, and autonomous vehicles. Deep learning methods are currently the state of the art for video analysis. In particular, two-stream methods, which leverage both spatial and temporal information, have proven valuable in Human Action Recognition (HAR). However, they have high computational costs and need a large amount of labeled data for training. To address these challenges, this paper adopts a more efficient approach by leveraging Convolutional Spiking Neural Networks (CSNNs) trained with the unsupervised Spike Timing-Dependent Plasticity (STDP) learning rule for action classification. These networks represent information using asynchronous low-energy spikes, which allows the network to be more energy efficient when implemented on neuromorphic hardware. Furthermore, learning visual features with unsupervised learning reduces the need for labeled data during training, making the approach doubly advantageous. Therefore, we explore transposing two-stream convolutional neural networks into the spiking domain, where we train each stream with the unsupervised STDP learning rule. We investigate the performance of these networks in video analysis by employing five distinct configurations for the temporal stream, and evaluate them across four benchmark HAR datasets. In this work, we show that two-stream CSNNs can successfully extract spatio-temporal information from videos despite using limited training data, and that the spiking spatial and temporal streams are complementary. We also show that replacing a dedicated temporal stream with a spatio-temporal one within a spiking two-stream architecture leads to information redundancy that hinders performance.
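The abstract does not state which STDP variant the authors use. Purely as an illustration of the general idea, the sketch below implements the textbook pair-based exponential STDP rule, in which a synapse is strengthened when the presynaptic spike precedes the postsynaptic one and weakened otherwise. The function name `stdp_delta_w` and all constants (`a_plus`, `a_minus`, `tau_plus`, `tau_minus`) are hypothetical choices, not values from the paper.

```python
import math


def stdp_delta_w(dt, a_plus=0.01, a_minus=0.012, tau_plus=20.0, tau_minus=20.0):
    """Weight change for one pre/post spike pair.

    dt is t_post - t_pre in milliseconds. All constants are assumed
    illustrative values, not parameters reported in the abstract.
    """
    if dt > 0:
        # Pre fires before post: potentiation, decaying with the time gap.
        return a_plus * math.exp(-dt / tau_plus)
    if dt < 0:
        # Post fires before pre: depression, decaying with the time gap.
        return -a_minus * math.exp(dt / tau_minus)
    return 0.0


# Example: update a weight for three spike pairs and keep it in [0, 1].
weight = 0.5
for dt in (5.0, -5.0, 40.0):
    weight = min(1.0, max(0.0, weight + stdp_delta_w(dt)))
print(weight)
```

Because the update depends only on spike timing, no labels are involved, which is the property the abstract relies on to reduce the need for labeled training data.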
Recent breakthroughs in generative neural networks have paved the way for transformative capabilities, particularly the generation of novel data such as images. Integrating these models with the increasingly popular technique of transfer learning, which is designed for proficient feature extraction, holds the promise of enhancing overall performance. This paper explores the use of generative models in conjunction with transfer learning methods for feature extraction, with a specific focus on image classification tasks. Our investigation examines the effectiveness of leveraging generative models alongside pre-trained models as feature extractors in the context of image classification. To the best of our knowledge, our investigation is the first to link transfer learning and generative models for a discriminative task under one roof. The proposed approach is evaluated on two distinct datasets, using specific metrics to gauge the model's performance. The results show an improvement of nearly 10% achieved through the integration of generative models, underscoring their potential for achieving higher accuracy in image classification. These findings highlight significant gains in image classification accuracy, surpassing the performance of conventional Artificial Neural Network (ANN) models.
Electroencephalogram (EEG)-based emotion recognition has become an important topic in human-computer interaction and affective computing. However, existing advanced methods still have some problems. First, using too many electrodes decreases the practicality of EEG acquisition devices. Second, transformers are not good at extracting local features. Finally, differential entropy (DE) is unsuitable for extracting features outside the 2-44 Hz frequency band. To solve these problems, we designed a neural network that uses 14 electrodes, extracts features with differential entropy and a designed spectrum sum (SS), learns local features with convolutional neural networks and image segmentation techniques, and learns global features with transformer encoders. The model outperformed advanced methods, with classification results of 98.50% and 99.00% on the SEED-IV and SEED-V datasets, respectively.
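For context on the differential entropy feature mentioned above: if a band-limited EEG segment is assumed to be approximately Gaussian, its differential entropy has the closed form 0.5 * ln(2 * pi * e * variance). The sketch below computes that quantity from raw samples. The Gaussian assumption, the function name `differential_entropy`, and the sample values are illustrative; the abstract does not describe the authors' exact feature pipeline, and the spectrum sum (SS) feature is not sketched because it is not defined there.

```python
import math


def differential_entropy(samples):
    """Differential entropy (in nats) of samples under a Gaussian assumption.

    Uses h = 0.5 * ln(2 * pi * e * var), with var the population variance.
    """
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n
    return 0.5 * math.log(2.0 * math.pi * math.e * var)


# Illustrative segment (assumed values standing in for one band-filtered channel).
segment = [-1.0, 1.0, -1.0, 1.0]
print(differential_entropy(segment))
```

Under this formula, doubling the signal amplitude raises the entropy by exactly ln 2, so the feature behaves like a logarithmic measure of band power.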
Recent advances in Synthetic Aperture Radar (SAR) sensors and innovative imaging techniques have enabled SAR systems to acquire very high-resolution images with wide swaths, large bandwidth, and multiple polarization channels. These improvements in SAR system capabilities also imply a significant increase in SAR data acquisition rates, such that efficient and effective compression methods become necessary. The compression of SAR raw data plays a crucial role in addressing the challenges posed by downlink and memory limitations onboard SAR satellites, and it directly affects the quality of the generated SAR image. Neural data compression techniques using deep models have attracted much interest for natural image compression tasks and have demonstrated promising results. In this study, neural data compression is extended into the complex domain to develop Complex-Valued (CV) autoencoder-based compression for SAR raw data. To this end, the fundamentals of data compression and Rate-Distortion (RD) theory are reviewed; two well-known data compression methods, Block Adaptive Quantization (BAQ) and JPEG2000, are implemented and tested for SAR raw data compression; and a neural data compression method based on CV autoencoders is developed for SAR raw data. Furthermore, since the available Sentinel-1 SAR raw products are already compressed with Flexible Dynamic BAQ (FDBAQ), an adaptation procedure is applied to the decoded SAR raw data to generate SAR raw data with quasi-uniform quantization that resembles the statistics of the uncompressed SAR raw data onboard the satellites.
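To make the idea behind block adaptive quantization concrete, here is a deliberately simplified sketch: each block of real-valued samples is summarized by its RMS level, and the samples are quantized with a few bits relative to that level. This is not the implementation evaluated in the study. In particular, it swaps in a uniform quantizer with an assumed clipping range of three times the block RMS, whereas practical BAQ designs typically use quantizers optimized for Gaussian statistics and handle the in-phase and quadrature components of complex samples. The names `baq_encode`, `baq_decode`, and `clip_factor` are hypothetical.

```python
import math


def baq_encode(block, bits=3, clip_factor=3.0):
    """Quantize one block of real samples relative to its own RMS level.

    Returns (sigma, codes). clip_factor is an assumed illustrative choice.
    """
    sigma = math.sqrt(sum(x * x for x in block) / len(block)) or 1.0
    levels = 2 ** bits
    clip = clip_factor * sigma
    step = 2.0 * clip / levels
    codes = []
    for x in block:
        index = int(math.floor((x + clip) / step))
        codes.append(min(levels - 1, max(0, index)))  # saturate outliers
    return sigma, codes


def baq_decode(sigma, codes, bits=3, clip_factor=3.0):
    """Reconstruct samples at the midpoints of the quantization cells."""
    levels = 2 ** bits
    clip = clip_factor * sigma
    step = 2.0 * clip / levels
    return [-clip + (c + 0.5) * step for c in codes]


# Illustrative block (assumed values, not SAR data).
block = [1.0, -1.0, 2.0, -2.0]
sigma, codes = baq_encode(block)
restored = baq_decode(sigma, codes)
print(codes, restored)
```

Only the per-block level and the low-bit codes need to be stored or downlinked, which is where the data reduction comes from; the price is a quantization error bounded by half a step for samples inside the clipping range.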