The compressive sensing (CS) scheme uses far fewer measurements than the Nyquist-Shannon sampling theorem suggests to accurately reconstruct images, and it has attracted considerable attention in the computational imaging community. While classic image CS schemes exploit sparsity using analytical transforms or bases, learning-based approaches have become increasingly popular in recent years. Such methods can effectively model the structure of image patches by optimizing their sparse representations or learning deep neural networks while preserving the known or modeled sensing process. Beyond exploiting local image properties, advanced CS schemes adopt nonlocal image modeling by extracting similar or highly correlated patches at different locations of an image to form groups that are processed jointly. More recent learning-based CS schemes apply nonlocal structured sparsity priors using group sparse (and related) representation (GSR) and/or low-rank (LR) modeling, which have demonstrated promising performance in various computational imaging and image processing applications.
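As a rough illustration of the sparsity-based recovery that classic CS schemes build on, the sketch below recovers a sparse vector from fewer measurements than unknowns using the iterative shrinkage-thresholding algorithm (ISTA). It is not one of the GSR or LR schemes described above. The sensing matrix, problem sizes, regularization weight `lam`, and all function names are assumptions chosen for illustration.

```python
import random

def soft_threshold(v, tau):
    """Shrink every entry toward zero by tau (proximal operator of the L1 norm)."""
    return [max(abs(x) - tau, 0.0) * (1.0 if x > 0 else -1.0) for x in v]

def matvec(A, x):
    """Compute A x for a matrix stored as a list of rows."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def rmatvec(A, r):
    """Compute A^T r."""
    return [sum(A[i][j] * r[i] for i in range(len(A))) for j in range(len(A[0]))]

def objective(A, y, x, lam):
    """Lasso objective: 0.5 * ||A x - y||^2 + lam * ||x||_1."""
    residual = [p - q for p, q in zip(matvec(A, x), y)]
    return 0.5 * sum(v * v for v in residual) + lam * sum(abs(v) for v in x)

def ista(A, y, lam=0.01, iters=500):
    """Minimize the lasso objective by gradient steps followed by shrinkage."""
    # The squared Frobenius norm upper-bounds the Lipschitz constant of the
    # gradient, so 1 / L is a safe (if conservative) step size.
    L = sum(a * a for row in A for a in row)
    step = 1.0 / L
    x = [0.0] * len(A[0])
    for _ in range(iters):
        residual = [p - q for p, q in zip(matvec(A, x), y)]
        grad = rmatvec(A, residual)
        x = soft_threshold([xi - step * gi for xi, gi in zip(x, grad)], step * lam)
    return x

# Hypothetical toy problem: 12 random measurements of a 30-dimensional,
# 2-sparse signal (illustrative sizes only).
rng = random.Random(0)
m, n = 12, 30
A = [[rng.gauss(0, 1) / m ** 0.5 for _ in range(n)] for _ in range(m)]
x_true = [0.0] * n
x_true[3], x_true[17] = 1.5, -2.0
y = matvec(A, x_true)
x_hat = ista(A, y)
```

With this step size each iteration cannot increase the objective, so `objective(A, y, x_hat, 0.01)` is no larger than its value at the all-zero starting point. How closely `x_hat` matches `x_true` depends on the matrix, the sparsity level, and `lam`.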
The automated classification of gastrointestinal endoscopy images is of great importance in modern health care. It streamlines the diagnostic process by enabling faster and more accurate identification of gastrointestinal diseases. While existing automated methods have demonstrated promising performance, a gap remains in consistently achieving high accuracy. This is because endoscopy images suffer from inter-class similarities and intra-class differences, which complicate the classification task. To address these problems, we propose a framework for endoscopy image classification. The proposed framework comprises three essential modules. The first module is the Local-Global Convolutional Neural Network (LGCNN), which aims to extract local fine-grained features and capture global context. The second module is the Endoscopy-Lesion Attention Module (ELA), which enables the framework to emphasize the more crucial regions and filter out noise and other irrelevant information. Finally, the last module, the Gastrointestinal Endoscopy CNN (GE-CNN), leverages the above two modules in an effective way to classify the input image into various categories. We evaluate the performance of the proposed framework on two challenging publicly available datasets, namely Kvasir and HyperKvasir. Based on the experimental results, we illustrate the efficacy of the proposed framework in classifying endoscopy images.
ISBN: (print) 9798350387261; 9798350387254
We present CT-Bound, a robust and fast boundary detection method for very noisy images using a hybrid Convolution and Transformer neural network. The proposed architecture decomposes boundary estimation into two tasks: local detection and global regularization. During local detection, the model uses a convolutional architecture to predict the boundary structure of each image patch in the form of a pre-defined local boundary representation, the field-of-junctions (FoJ) [9]. Then, it uses a feed-forward transformer architecture to globally refine the boundary structures of each patch to generate an edge map and a smoothed color map simultaneously. Our quantitative analysis shows that CT-Bound outperforms the previous best algorithms in edge detection on very noisy images. It also increases the edge detection accuracy of FoJ-based methods while achieving a threefold speed improvement. Finally, we demonstrate that CT-Bound can produce boundary and color maps on real captured images without extra fine-tuning, and can generate boundary-map and color-map videos in real time at ten frames per second.
De-noising is an effective mechanism for removing aberrations from images and has been applied in diverse fields. In this work, a novel deep learning-based profound memory-affiliated neural network (PMANN) is implemented for de-noising aerial images in the context of disaster management. The proposed method is optimized with an adaptive dual-threshold wavelet transform for precise noise suppression in aerial images. The proposed architecture outperforms prior methods employed for de-noising as well as disaster management. The proposed scheme is compared with existing noise removal techniques, namely the convolutional neural network (CNN), CNN with long short-term memory (CNN-LSTM), weighted nuclear norm minimization (WNNM), and de-noising CNN (DNCNN). The peak signal-to-noise ratio of the proposed PMANN is higher by 0.24%, 0.086%, 0.643%, and 0.720% than that of the CNN, CNN-LSTM, WNNM, and DNCNN models, respectively. The structural similarity index is higher by 0.59%, 0.382%, 0.037%, and 0.465% than that of the CNN, CNN-LSTM, WNNM, and DNCNN techniques, respectively. The mean-squared error is lower by 8.93%, 2.1457%, 0.316%, and 0.582% than that of the CNN, CNN-LSTM, WNNM, and DNCNN techniques, respectively.
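For readers unfamiliar with the metrics quoted above, the following minimal sketch shows how the mean-squared error and peak signal-to-noise ratio are commonly computed, and how a percentage gain over a baseline can be expressed. It assumes 8-bit images flattened into lists of pixel values. The function names and sample pixels are hypothetical, and whether the abstract's percentages are relative changes of this kind is itself an assumption.

```python
import math

def mse(reference, estimate):
    """Mean-squared error between two equally sized pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(reference, estimate)) / len(reference)

def psnr(reference, estimate, peak=255.0):
    """Peak signal-to-noise ratio in dB; infinite for identical inputs."""
    err = mse(reference, estimate)
    if err == 0:
        return math.inf
    return 10.0 * math.log10(peak ** 2 / err)

def relative_change(new, old):
    """Percentage change of `new` relative to `old`."""
    return 100.0 * (new - old) / old

# Hypothetical pixel values, for illustration only.
clean = [0, 255, 128, 64]
denoised = [0, 250, 130, 64]
error = mse(clean, denoised)
quality_db = psnr(clean, denoised)
```

A lower MSE and a higher PSNR both indicate a de-noised image closer to the clean reference.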
In the wake of unparalleled expansion in digital communication platforms, the imperative to bolster security and privacy measures has escalated. Within this landscape, image steganalysis emerges as a pivotal domain committed to detecting concealed information embedded in image files. This academic article unveils a novel image steganalysis model, melding dilated convolutional methodologies with a state-of-the-art mutual learning-based artificial bee colony (ML-ABC) approach and reinforcement learning (RL). The architecture operates a consortium of convolutional neural networks, collaboratively deriving features. After derivation, these features are combined to simplify the subsequent classification task. A reinforcement learning-focused (RL-focused) algorithm is employed to address the challenges posed by uneven datasets. The learning path is conceived as a series of linked decision points, with each instance representing a unique state. The network acts as an agent, earning rewards or suffering consequences according to its ability to distinguish between less frequent and more frequent classes. To commence the initial weight training, a methodology grounded in ML-ABC is implemented. This tactic adeptly adjusts the optimal food source for solution candidates, intertwining elements of mutual learning tied to the initial weights. The efficacy of the model is rigorously evaluated utilizing the BossBase 1.01 and BOWS datasets. Thorough experimentation is conducted on the selected dataset, with the objective of identifying optimal parameter values, including the reward mechanism. Subsequent results prominently highlight the superiority of our proposed solution compared to alternative methods explored within this research.
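The abstract does not specify the reward values, so the sketch below only illustrates one common way to set up rewards for imbalanced classification: errors and successes on the less frequent class carry a larger magnitude than those on the more frequent class. The label encoding, the `majority_scale` value, and the function names are assumptions for illustration, not the authors' settings.

```python
def reward(true_label, predicted_label, minority_label=1, majority_scale=0.25):
    """Return a signed reward for one classification decision.

    Minority-class samples carry magnitude 1.0 and majority-class samples
    carry the smaller `majority_scale` (an assumed value), so mistakes on
    the rare class are penalized more heavily.
    """
    magnitude = 1.0 if true_label == minority_label else majority_scale
    return magnitude if predicted_label == true_label else -magnitude

def episode_return(labels, predictions, **kwargs):
    """Sum of rewards when each sample is treated as one state of an episode."""
    return sum(reward(t, p, **kwargs) for t, p in zip(labels, predictions))

# Hypothetical episode: one minority sample (label 1) among three majority
# samples, with a classifier that always predicts the majority class.
labels = [1, 0, 0, 0]
predictions = [0, 0, 0, 0]
total = episode_return(labels, predictions)
```

Under this weighting, a classifier that always predicts the frequent class ends the episode with a negative return even though most of its predictions are correct. That is the behavior an imbalance-aware reward is meant to discourage.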
Convolutional neural networks (CNNs) have found extensive use in medical image segmentation tasks. However, they encounter limitations in capturing long-range semantic interactions. Conversely, Transformers excel at handling long-range dependencies but struggle to preserve local semantic details. To address this challenge, we propose STA-Former, a hybrid CNN-Transformer model for medical image segmentation. Our approach is founded on three fundamental principles: (1) We propose the Shrinkage Triplet Attention (STA) module to enhance feature fusion within the decoder. It focuses on spatial and channel interactions in the feature map, computes thresholds across dimensions, and suppresses irrelevant information through soft-thresholding. (2) We present a redesigned hierarchical hybrid CNN-Transformer encoder that connects CNN and Transformer blocks at multiple scales, enabling the capture of both long-range and short-range dependencies across various scales of feature maps. (3) Unlike traditional decoders that apply the attention mechanism exclusively to low-level features, our approach utilizes a multiscale attention hierarchical decoder, leveraging feature map correlations at different scales for effective feature fusion. Our method exhibits superior performance compared to state-of-the-art methods on three datasets: Synapse multiorgan CT, ACDC cardiac MRI scans, and breast ultrasound images.
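The soft-thresholding step mentioned in principle (1) can be sketched as follows. This is not the STA module itself: it assumes one threshold per channel, set to a scale factor times the channel's mean absolute activation, with the factor standing in for a value that a network would learn. The function name, the `alphas` parameter, and the toy feature map are hypothetical.

```python
import math

def channel_soft_threshold(feature_map, alphas):
    """Apply soft-thresholding to each channel of a feature map.

    feature_map: list of channels, each a flat list of activations.
    alphas: one scale factor per channel (assumed to lie in [0, 1]); the
            threshold is alpha times the channel's mean absolute activation.
    """
    output = []
    for channel, alpha in zip(feature_map, alphas):
        tau = alpha * sum(abs(v) for v in channel) / len(channel)
        # Shrink magnitudes by tau and zero out anything below the threshold,
        # keeping the original sign of each activation.
        output.append([math.copysign(max(abs(v) - tau, 0.0), v) for v in channel])
    return output

# Hypothetical two-channel feature map, for illustration only.
features = [[1.0, -3.0, 0.0, 4.0], [0.1, -0.2, 3.0, -0.1]]
shrunk = channel_soft_threshold(features, alphas=[0.5, 0.5])
```

Activations whose magnitude falls below the channel threshold become zero, while larger ones are kept with reduced magnitude. This is how soft-thresholding suppresses weak, presumably irrelevant responses.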
The digital revolution places great emphasis on digital media watermarking due to the increased vulnerability of multimedia content to unauthorized alterations. With the recent boom in data hiding technology, research has tended toward performing watermarking with numerous deep learning architectures, which have been applied to a variety of problems since their inception. Several watermarking approaches based on deep learning have been proposed, and they have proven their efficiency compared to traditional methods. This paper summarizes recent developments in conventional and deep learning image and video watermarking techniques. It shows that although many conventional techniques focus on video watermarking, there are not yet any deep learning models focusing on this area; for image watermarking, however, different deep learning-based techniques are observed, whose efficiency in invisibility and robustness depends on the network architecture used. The study concludes by discussing possible research directions in deep learning-based video watermarking.
In vehicular communications, channel estimation is a complex problem due to the joint time-frequency selectivity of wireless propagation channels. Several signal processing techniques as well as approaches based on neural networks have been proposed to address this issue. Due to the highly dynamic and random nature of vehicular communication environments, precise characterization of temporal correlation across a received data sequence can enable more accurate channel estimation. This paper proposes a new pilot constellation scheme in combination with a small feed-forward neural network to improve the accuracy of channel estimation in V2X systems while keeping the implementation complexity low. The performance is evaluated in typical vehicular channels using simulated BER curves, and it is found to be superior to traditional channel estimation methods and state-of-the-art neural-network-based implementations such as feed-forward and super-resolution. It is illustrated that the improvement becomes pronounced for small subcarrier spacings (or low 5G numerologies); hence, this paper contributes to the development of more reliable mobile services across rapidly varying vehicular communication channels with rich multi-path interference. In vehicular communication systems, estimating channels is complex due to their time-frequency selectivity. This paper introduces a novel pilot constellation scheme coupled with a compact feed-forward neural network, aiming to enhance the accuracy of channel estimation in vehicle-to-everything (V2X) systems while minimizing implementation complexity. The approach outperforms traditional methods and advanced neural-network-based methods, particularly in environments with small subcarrier spacings, thus aiding the provision of more dependable mobile services in rapidly changing vehicular communication channels with significant multipath interference.
Digital image applications have been extensively utilized in entertainment, education, research, medicine, and industry. Many images should be resized for better demonstration. In general, image resizing is performed ...
This paper proposes a Self-Attention Convolutional Neural Network (SACNN) optimized with the Arithmetic Optimization Algorithm (AOA) for joint Diabetic Retinopathy (DR) and Diabetic Macular Edema Grading (DMEG) (SACNN-AOA-DR-DMEG). Initially, the input images are collected from two openly available benchmark datasets, namely Messidor and ISBI 2018 IDRiD. The input images are then pre-processed using Altered Phase Preserving Dynamic Range Compression (APPDRC) to reduce noise. The SACNN receives the pre-processed images. The SACNN has three modules: (i) a plane attention module, (ii) a depth attention module, and (iii) an Attention Fusion Module. DR and DME features are extracted by the plane attention module and the depth attention module of the SACNN. The Attention Fusion Module receives the extracted features for categorizing and grading DR and DME disorders. On its own, the SACNN does not adopt any optimization technique to ensure accurate grading of DR and DME disorders; therefore, the Arithmetic Optimization Algorithm (AOA) is used to optimize the SACNN weight parameters. The proposed technique is implemented in Python. The proposed SACNN-AOA-DR-DMEG method provides 11.18%, 18.99%, and 23.76% higher accuracy for diabetic retinopathy grading; 11.52%, 29.62%, and 20.38% higher accuracy for DMEG; and 33.39%, 22%, and 39.26% lower computation time on Messidor data compared with the existing methods AMGNN-DR-DMEG, LCNN-DR-DMEG, and FFN-DR-DMEG, respectively.