ISBN: (print) 9781728198354
Convolutional neural networks have proven proficient at extracting low-level concepts from an image. Given the strong performance of transformers in exploiting long-range correlations in an image, many methods have been explored that exploit the benefits of both architectures. To strengthen our network, we add an important feature to transformers, in which single image super-resolution (SISR) is performed using band grouping that leverages a simple CNN architecture. This paper aims to train a set of simple residual modelling architectures and then integrate them into a transformer architecture to solve the super-resolution problem in HSI. We take a step forward in analysing how to adapt SwinIR to fully exploit the information derived from band grouping for efficient SISR.
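The abstract does not say how the bands are grouped. As a minimal sketch, assuming groups of consecutive spectral bands that share a few bands with their neighbours, the grouping could look like the following. The function name `group_bands` and the parameters `group_size` and `overlap` are hypothetical and do not come from the paper.

```python
def group_bands(num_bands, group_size, overlap):
    """Split band indices 0..num_bands-1 into consecutive groups.

    Assumption: neighbouring groups share `overlap` bands. The paper's
    actual grouping scheme is not given in the abstract.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    if not 0 <= overlap < group_size:
        raise ValueError("overlap must be in [0, group_size)")
    step = group_size - overlap
    groups = []
    start = 0
    while start < num_bands:
        end = min(start + group_size, num_bands)
        groups.append(list(range(start, end)))
        if end == num_bands:
            break  # the last band has been covered
        start += step
    return groups


if __name__ == "__main__":
    # Ten bands, groups of four, one shared band between neighbours.
    for group in group_bands(10, 4, 1):
        print(group)
```

Each group would then be passed to its own small CNN before the results are handed to the transformer stage; that wiring is left out here because the abstract does not describe it.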
ISBN: (print) 9798350349405; 9798350349399
The need for automated systems to aid law enforcement during densely packed events arises from the inherent danger of large crowds, evidenced by historical instances of stampedes and crushes. Existing methods vary from basic crowd statistics extraction to detailed anomaly detection in behavior classification, but often focus on single, pre-segmented scenes. Our work addresses classifying crowd behaviors in environments where multiple behaviors coexist within a single scene, defined as a multi-class crowd motion characterization challenge. We use a microscopic approach for scenes captured by drones at varying altitudes, without prior manipulation. This approach combines graph-based representations of individuals and flow images, facilitating classification of diverse crowd behaviors in unsegmented scenes. Tested on a public dataset, our method shows promising results in analyzing complex crowd dynamics.
Images contain a wealth of information and are frequently targeted by malicious attackers when transmitted over public networks. Image encryption prevents confidential information from being acquired by such attackers. Deep learning-based image encryption is a relatively new research area, and recently proposed methods have not achieved satisfactory levels of generalization, security, and efficiency. To address these limitations, we employ a lightweight dense residual network (Dense-ResNet) to rearrange image pixels, thereby reducing the computational cost. In addition, we design a weight-adjustable loss function that combines the encryption loss, the decryption loss, and the total variation loss. We then adopt bit-XOR diffusion to further encrypt the intermediate ciphertext image produced by the encryption network. We trained and tested the encryption and decryption networks on a dataset of images with no fixed category. Experiments show that our method can complete image encryption and decryption tasks in various scenarios. Additionally, the proposed approach generalizes broadly and achieves high encryption and decryption quality, aided by the total variation loss applied to decryption. Compared to recently proposed deep learning-based image encryption approaches, our method processes both image encryption and decryption faster, with efficiency gains of at least 2.7% and 7.5%, respectively. Furthermore, our method improves decryption performance by at least 1.0% in peak signal-to-noise ratio (PSNR) and 0.5% in structural similarity (SSIM) while maintaining a high level of security. Moreover, our method makes data loss and noise attacks easier to trace, since such attacks leave a noticeable trail on the decrypted images it produces.
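The bit-XOR diffusion step can be illustrated with a small sketch. It assumes a chained scheme in which each ciphertext value is the XOR of the current pixel, a key-stream byte, and the previous ciphertext value. The chaining rule, the `seed` and `iv` parameters, and the use of Python's `random` module as a key stream are all assumptions made for illustration. The abstract does not specify the diffusion rule, and `random` is not cryptographically secure.

```python
import random


def keystream(seed, n):
    """Illustrative key stream of n bytes (NOT cryptographically secure)."""
    rng = random.Random(seed)
    return [rng.randrange(256) for _ in range(n)]


def xor_diffuse(pixels, seed, iv=0):
    """Chained XOR: c[i] = p[i] ^ k[i] ^ c[i-1], with c[-1] = iv (assumed rule)."""
    out, prev = [], iv
    for p, k in zip(pixels, keystream(seed, len(pixels))):
        prev = p ^ k ^ prev
        out.append(prev)
    return out


def xor_undiffuse(cipher, seed, iv=0):
    """Inverse of xor_diffuse: p[i] = c[i] ^ k[i] ^ c[i-1]."""
    out, prev = [], iv
    for c, k in zip(cipher, keystream(seed, len(cipher))):
        out.append(c ^ k ^ prev)
        prev = c
    return out


if __name__ == "__main__":
    plain = [10, 20, 30, 40]          # flattened 8-bit pixel values
    cipher = xor_diffuse(plain, seed=42)
    assert xor_undiffuse(cipher, seed=42) == plain
    print(cipher)
```

Because each output depends on the previous one, a change to a single input pixel alters every later ciphertext value, which is the diffusion property this step is meant to provide.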
ISBN: (print) 9798350388978; 9798350388961
Blood cells play an essential role in various bodily functions, such as protection against infections and the body's defense. The accurate classification of blood cells, generally grouped as red blood cells, white blood cells, and platelets, is important for clinical diagnosis and hematological analysis. However, identifying these cells is a specialized and time-consuming process. High-precision automatic blood cell classification is therefore an active research topic. Convolutional neural networks (CNNs) are deep learning models used for visual data analysis and are very powerful at extracting features from data. In this study, we propose a hybrid classification model that combines the feature extraction power of CNNs with the ensemble-based prediction capabilities of the Random Forest and XGBoost algorithms. The proposed hybrid model is compared with different methods on the BloodMNIST dataset in terms of classification performance and inference time. The results show that the tree-based methods outperform the CNN by up to 8.49 and 11.62 points and achieve inference times up to 82.9 times faster than other methods.
Ground-penetrating radar (GPR) is an important nondestructive testing (NDT) tool for the underground exploration of urban roads. However, due to the large amount of GPR data, traditional manual interpretation is time-consuming and laborious. To address this problem, an efficient underground target detection method for urban roads based on neural networks is proposed in this paper. First, robust principal component analysis (RPCA) is used to suppress the clutter in the B-scan image. Then, three time-domain statistics of each A-scan signal are calculated as its features, and one backpropagation (BP) neural network is adopted to recognize A-scan signals to obtain the horizontal regions of targets. Next, the fusion and deletion (FAD) algorithm is used to further optimize the horizontal regions of targets. Finally, three time-domain statistics of each segmented A-scan signal in the horizontal regions of targets are extracted as the features, and another BP neural network is employed to recognize the segmented A-scan signals to obtain the vertical regions of targets. The proposed method is verified with both simulation and real GPR data. The experimental results show that the proposed method can effectively locate the horizontal ranges and vertical depths of underground targets for urban roads and has higher recognition accuracy and less processing time than the traditional segmentation recognition methods.
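The abstract does not name the three time-domain statistics computed for each A-scan. As a sketch only, assuming they are the mean, the variance, and the signal energy, the per-trace feature vector fed to the BP network could be computed as follows. The function name `ascan_features` and the choice of statistics are hypothetical.

```python
def ascan_features(trace):
    """Return (mean, variance, energy) of one A-scan trace.

    Assumption: these three statistics stand in for the unnamed
    time-domain statistics mentioned in the abstract.
    """
    n = len(trace)
    if n == 0:
        raise ValueError("trace must contain at least one sample")
    mean = sum(trace) / n
    variance = sum((x - mean) ** 2 for x in trace) / n  # population variance
    energy = sum(x * x for x in trace)
    return mean, variance, energy


if __name__ == "__main__":
    # A B-scan is a list of A-scan traces; each trace yields one feature row.
    b_scan = [[0.0, 0.1, -0.1, 0.0], [1.0, -1.0, 1.0, -1.0]]
    for trace in b_scan:
        print(ascan_features(trace))
```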
ISBN: (print) 9781728198354
Over-fitting-based image compression requires compact weights for compression and fast convergence for practical use, which poses challenges for methods based on deep convolutional neural networks (CNNs). This paper presents a simple re-parameterization method to train CNNs with reduced weight storage and accelerated convergence. The convolution kernels are re-parameterized as a weighted sum of discrete cosine transform (DCT) kernels, enabling direct optimization in the frequency domain. Combined with L1 regularization, the proposed method surpasses vanilla convolutions, achieving significantly improved rate-distortion performance at low computational cost. The proposed method is verified with extensive experiments on over-fitting-based image restoration on various datasets, achieving up to -46.12% BD-rate on top of HEIF with only 200 iterations.
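The re-parameterization can be sketched in a few lines: a spatial kernel is built as a weighted sum of 2D DCT basis kernels, so the trainable parameters are the frequency-domain weights. The sketch below assumes an orthonormal DCT-II basis and a sparse dictionary of coefficients. The names `dct_basis` and `kernel_from_coeffs` are hypothetical, and the paper's exact basis normalization is not given in the abstract.

```python
import math


def dct_basis(n):
    """Return {(u, v): n x n kernel} for the orthonormal 2D DCT-II basis."""
    def scale(u):
        return math.sqrt(1.0 / n) if u == 0 else math.sqrt(2.0 / n)

    def b1(u, x):
        # 1D DCT-II basis function of frequency u sampled at position x
        return scale(u) * math.cos((2 * x + 1) * u * math.pi / (2 * n))

    return {
        (u, v): [[b1(u, x) * b1(v, y) for y in range(n)] for x in range(n)]
        for u in range(n)
        for v in range(n)
    }


def kernel_from_coeffs(coeffs, n):
    """Build a spatial kernel as the sum of w * basis[(u, v)] over coeffs."""
    basis = dct_basis(n)
    kernel = [[0.0] * n for _ in range(n)]
    for (u, v), w in coeffs.items():
        for x in range(n):
            for y in range(n):
                kernel[x][y] += w * basis[(u, v)][x][y]
    return kernel


if __name__ == "__main__":
    # Only two non-zero frequency weights: a sparse code for a 3x3 kernel.
    for row in kernel_from_coeffs({(0, 0): 3.0, (0, 1): 0.5}, 3):
        print([round(v, 4) for v in row])
```

Storing only the non-zero weights is where L1 regularization would help in this picture: it pushes most frequency coefficients to zero, so fewer values need to be kept.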
Prostate magnetic resonance imaging (MRI) is widely used in the diagnosis of prostate cancer and other prostate diseases. The automatic segmentation of prostate MRI images plays an important role in the auxiliary diagnosis of prostate diseases. Two approaches are commonly used for automatic segmentation of prostate MRI: 2D image segmentation and 3D image segmentation. In this paper, a two-stage CNN method for segmenting MRI images of prostates with lesions is proposed. In the first stage, we use a CNN model incorporating the Squeeze-Excitation module to determine whether an image contains the prostate. In the second stage, we propose a Residual-Attention U-Net to segment the images that contain the prostate. Finally, the 3D prostate MRI segmentation results are obtained and fully automated segmentation is accomplished. We evaluated our proposed method and other common 2D and 3D segmentation methods on the test dataset and compared their results using the Dice Similarity Coefficient (DSC). Our method performed best, achieving a DSC of 0.860.
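For reference, the Dice Similarity Coefficient of a predicted mask A and a ground-truth mask B is 2|A ∩ B| / (|A| + |B|). The sketch below computes it for flattened binary masks. The function name `dice_coefficient` and the convention of returning 1.0 when both masks are empty are assumptions, not details from the paper.

```python
def dice_coefficient(pred, target):
    """Dice Similarity Coefficient for two flattened binary masks (0/1)."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of elements")
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    if total == 0:
        return 1.0  # assumed convention: two empty masks agree perfectly
    return 2.0 * intersection / total


if __name__ == "__main__":
    pred = [1, 1, 0, 0]
    target = [1, 0, 1, 0]
    print(dice_coefficient(pred, target))
```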
Handwriting images are commonly used to diagnose Parkinson's disease due to their intuitive nature and easy accessibility. However, existing methods have not explored the potential of fusing different handwriting image sources for diagnosis. To address this issue, this study proposes a hybrid fusion approach that makes use of the visual information derived from different handwriting images and handwriting templates, significantly enhancing performance in diagnosing Parkinson's disease. The proposed method involves several key steps. Initially, different preprocessed handwriting images undergo pixel-level fusion using Laplacian transformation. Subsequently, the fused and original images are fed into a pre-trained CNN separately to extract visual features. Finally, feature-level fusion is performed by concatenating the feature vectors extracted from the flatten layer, and the fused feature vectors are input into an SVM to obtain classification results. Our experimental results validate that the proposed method achieves excellent performance using only visual features from images, with 95.45% accuracy on the NewHandPD dataset. Furthermore, the results obtained on our own dataset verify the strong generalizability of the proposed approach.
ISBN: (print) 9781713871088
Thanks to their powerful representation capabilities, transformers have made impressive progress in image restoration. However, existing transformer-based methods do not carefully consider the particularities of image restoration. In general, an ideal image restoration approach should be translation-invariant to the degradation, i.e., the undesirable degradation should be removed irrespective of its position within the image. Furthermore, local relationships also play a vital role and should be faithfully exploited for recovering clean images. Nevertheless, most transformers adopt either local attention with a fixed local window strategy or global attention, which unfortunately breaks translation invariance and causes a large loss of local relationships. To address these issues, we propose an elegant stochastic window strategy for transformers. Specifically, we first introduce window partition with stochastic shift to replace the original fixed window partition for training. Then, we design a new layer expectation propagation algorithm to efficiently approximate the expectation of the induced stochastic transformer for testing. Our stochastic window transformer not only enjoys powerful representation but also maintains the desired properties of translation invariance and locality. Experiments validate that the stochastic window strategy consistently improves performance on various image restoration tasks (deraining, denoising and deblurring) by significant margins. The code is available at https://***/jiexiaou/Stoformer.
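The training-time idea of window partition with stochastic shift can be sketched as follows. The sketch assumes the shift is cyclic and drawn uniformly from the window size in each dimension. The function names `partition` and `random_shift` are hypothetical, and the layer expectation propagation used at test time is not shown. The released Stoformer code may differ in all of these details.

```python
import random


def partition(h, w, win, shift=(0, 0)):
    """Map each (row, col) of an h x w grid to a window id.

    The grid is cyclically shifted by `shift` before being cut into
    non-overlapping win x win windows (assumed behaviour).
    """
    if h % win or w % win:
        raise ValueError("h and w must be multiples of win")
    dy, dx = shift
    cols = w // win  # number of windows per row
    return [
        [((r + dy) % h) // win * cols + ((c + dx) % w) // win for c in range(w)]
        for r in range(h)
    ]


def random_shift(win, rng):
    """Draw one shift per training step, uniform in [0, win) per axis."""
    return rng.randrange(win), rng.randrange(win)


if __name__ == "__main__":
    rng = random.Random(0)
    print(partition(4, 4, 2))                 # fixed windows, no shift
    shift = random_shift(2, rng)
    print(shift, partition(4, 4, 2, shift))   # windows after a random shift
```

With a fixed partition, two neighbouring pixels on opposite sides of a window border never attend to each other. Re-sampling the shift at each step changes which pixels share a window, which is the intuition behind restoring translation invariance.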
Due to increasingly large computational resources, modern neural networks are severely constrained due to their processing speed and energy consumption. Optical neural networks (ONNs), which use photonic structures to...