Colorectal cancer (CRC) histopathological image classification is a critical part of diagnosing CRC. In this context, deep learning methods classify images more efficiently than physicians. However, high model complexity limits the flexibility of these methods in practice. This study proposes an enhanced lightweight convolutional neural network (EL-CNN) architecture with spatial pyramid pooling for automated classification of multi-class colorectal tissue histopathological images. In the EL-CNN, two enhanced convolutional block attention modules (ECBAM) were developed to improve classification stability across datasets. Through a series of ablation studies on two publicly available datasets, the colorectal histology dataset and NCT-CRC-HE-100K, we compared the performance and complexity of the proposed EL-CNN with five common CNNs, and the effectiveness of the ECBAM with four popular attention networks. The models were evaluated by parameter size, FLOPs, accuracy, precision, sensitivity, specificity, F1 score, and area under the curve. Specifically, the proposed EL-CNN, with 3.72 M parameters, attained an accuracy of 95.24% on the colorectal histology dataset and 99.48% on NCT-CRC-HE-100K. The proposed network has lower complexity and better classification scores than most common convolutional neural networks, and even outperforms existing state-of-the-art approaches. The comparative analysis shows that the proposed EL-CNN combines robust performance with low complexity, which may give it significant application prospects in clinical practice.
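The evaluation metrics listed in this abstract can be illustrated with a minimal sketch. It assumes a one-vs-rest treatment of each tissue class; the abstract does not say how the per-class scores were averaged, and the function names `one_vs_rest_counts` and `binary_metrics` are hypothetical, chosen only for this example.

```python
def one_vs_rest_counts(y_true, y_pred, positive):
    """Count TP, FP, FN, TN for one class treated as the positive class."""
    tp = sum(t == positive and p == positive for t, p in zip(y_true, y_pred))
    fp = sum(t != positive and p == positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    tn = sum(t != positive and p != positive for t, p in zip(y_true, y_pred))
    return tp, fp, fn, tn


def binary_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, sensitivity, specificity, and F1 score."""
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0  # recall, true positive rate
    specificity = tn / (tn + fp) if (tn + fp) else 0.0  # true negative rate
    denom = precision + sensitivity
    f1 = 2 * precision * sensitivity / denom if denom else 0.0
    return {
        "accuracy": accuracy,
        "precision": precision,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "f1": f1,
    }


# Toy labels for three classes (illustrative values, not data from the paper).
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 1, 1, 1, 2, 0]
scores = binary_metrics(*one_vs_rest_counts(y_true, y_pred, positive=1))
```

Area under the curve is omitted here because it needs per-sample class scores rather than hard predictions.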
Handwritten digit string recognition (HDSR) has received increased interest in recent years due to its vast practical applicability in both academia and industry. Approaches developed for handwritten text recognition (HTR) can be applied to HDSR if HDSR is viewed as a restricted version of HTR. HDSR does, however, pose different challenges than HTR. For instance, a language model, which is critical for handwritten text recognition, is ineffective and generally cannot be employed in HDSR. The limited amount of training data is another problem affecting HDSR based on end-to-end deep learning methods. In this paper, we present a data-efficient end-to-end neural architecture for HDSR based on the HTR workflow. The proposed architecture is a gated fully convolutional network with no recurrent connections that is trained with the CTC loss function. In addition, two augmentation techniques are used to improve the model's performance. We evaluated our proposed model using the metrics introduced in the ICFHR 2014 competition. On the ORAND CAR-A, ORAND CAR-B, and CVL datasets, our best recognition rates are 95.41%, 95.90%, and 88.06%, respectively.
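A network trained with CTC emits one score vector per time step, and these must be collapsed into a digit string. The sketch below shows greedy (best-path) CTC decoding as one common way to do this. The abstract does not state which decoder the authors used, so this is an assumption, and the alphabet layout (digits 0 to 9 followed by a blank at index 10) is likewise hypothetical.

```python
def ctc_greedy_decode(frame_scores, alphabet="0123456789", blank=10):
    """Best-path CTC decoding: take the argmax per frame, merge repeats, drop blanks."""
    best_path = [max(range(len(f)), key=f.__getitem__) for f in frame_scores]
    decoded = []
    previous = None
    for index in best_path:
        # A symbol is emitted only when it differs from the previous frame
        # and is not the blank; a blank between equal symbols keeps both.
        if index != previous and index != blank:
            decoded.append(alphabet[index])
        previous = index
    return "".join(decoded)


def one_hot(index, size=11):
    """Build a toy score vector with all mass on one class."""
    return [1.0 if i == index else 0.0 for i in range(size)]


# Frame-level path: 1, 1, blank, 1, 7, 7, blank
frames = [one_hot(i) for i in (1, 1, 10, 1, 7, 7, 10)]
digits = ctc_greedy_decode(frames)
```

The blank between the second and third frames carrying `1` is what lets the decoder output a repeated digit instead of merging the two.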
ISBN: 9798350344868; 9798350344851 (print)
Scene Text Image Super-Resolution (STISR) aims to enhance the resolution and legibility of text within low-resolution (LR) images, consequently elevating recognition accuracy in Scene Text Recognition (STR). Previous methods predominantly employ discriminative Convolutional Neural Networks (CNNs) augmented with diverse forms of text guidance to address this issue. Nevertheless, they remain deficient when confronted with severely blurred images, because their generation capability is insufficient when little structural or semantic information can be extracted from the original images. Therefore, we introduce RGDiffSR, a Recognition-Guided Diffusion model for scene text image Super-Resolution, which exhibits great generative diversity and fidelity even in challenging scenarios. Moreover, we propose a Recognition-Guided Denoising Network to guide the diffusion model to generate LR-consistent results through succinct semantic guidance. Experiments on the TextZoom dataset demonstrate the superiority of RGDiffSR over prior state-of-the-art methods in both text recognition accuracy and image fidelity.
Breast cancer (BC) is a deadly disease that kills millions of people every year. Early diagnosis is the only way to reduce the mortality rate. Among all screening methods, medical imaging is essential for screening BC. Existing medical imaging alters the tissue structure and cell morphology. To overcome these limitations, histopathology images are used because they can support pathologists' decisions about the presence or absence of a disease and can help estimate disease progression. Hence, this research develops an efficient method for BC classification using the proposed Adam Golden Search Optimization-based Deep Convolutional Neural Network (AGSO-DCNN). Initially, Gaussian filter-based preprocessing is used to reduce the noise in the input images. Afterwards, the images are segmented using k-means clustering to reduce image complexity. Then, feature extraction is performed to obtain shape features, statistical features, Local Vector Patterns (LVP), and Pyramid Histogram of Oriented Gradients (PHOG) features. Thereafter, the obtained features are forwarded to the multigrade BC classification stage, where the DCNN classifies the image into six categories: apoptosis, tubule, mitosis, non-tubule, tumor nuclei, and non-tumor nuclei. The DCNN is trained by the formulated AGSO mechanism, which is obtained by combining the Adam optimizer and the Golden Search Optimization (GSO) algorithm. The AGSO-based DCNN technique achieved an accuracy, TPR, and TNR of 97.90%, 98.00%, and 98.30%, respectively.
Image restoration refers to restoring the original image as much as possible from a damaged or degraded image. In recent years, deep learning-based methods have become the mainstream approach to image restoration. However, existing deep learning-based image restoration networks use the same structure for the encoder and the decoder. Such a homogeneous network has limited feature representation capability, which limits its image recovery capability. In addition, the homogeneous network does not exploit the relationship between global and local features of the image well, resulting in poor quality of the recovered images. To address these issues, we propose a heterogeneous image restoration network (HIR-Net). HIR-Net uses a Transformer as the encoder to extract the global features of the image, and a convolutional neural network-based feature enhancement block is designed as the decoder to recover the local details of the image. In addition, HIR-Net introduces a new feature fusion strategy to enhance the feature representation capability of the network. This strategy performs cross-attention operations on global features extracted by the encoder and local features extracted by the decoder to achieve cross-fusion of global and local features. Compared with state-of-the-art methods, the proposed method achieves better performance on image denoising, image deraining, and underwater image enhancement.
ISBN: 9798350343557 (print)
Millions of people worldwide suffer from malaria, a potentially fatal disease. Early and precise diagnosis is essential for the condition to be successfully treated and managed. This paper employs three computer-aided methods to determine the percentages of red blood cells that are either parasitized or uninfected, given test sets randomly drawn from the National Institutes of Health (NIH) dataset. The three methods are traditional image processing, Support Vector Machine (SVM), and Convolutional Neural Network-based Deep Learning (CNN-DL). The simulations were performed using a dataset of 27,558 images of red blood cells. The traditional image processing method achieves an accuracy of 91.97%. The SVM classifier using Histogram of Oriented Gradients (HOG) features had an accuracy of 88.6%, which improved to 92.5% with features extracted using Local Binary Patterns (LBP). Both methods proved inferior to the CNN-DL classification, which gave an accuracy of 95.7%.
ISBN: 9798350367331; 9798350367348 (print)
Deep learning has made significant strides in recent years, particularly in supervised learning tasks, leading to the development of numerous architectures aimed at improving various aspects of model performance. Despite the effectiveness of backpropagation (BP) and stochastic gradient descent in training deep networks, these methods are often hindered by time-intensive computations, exploding and vanishing gradients, and significant memory overhead. Alternative training strategies that reduce reliance on global BP are increasingly being explored to address these limitations. This paper proposes a simple architecture that integrates Hilbert-Schmidt Independence Criterion (HSIC) layers with linear layers, where the HSIC layers are trained locally and the linear layers are optimized using global BP. This hybrid approach mitigates the drawbacks of BP while enhancing the model's ability to learn complex features across multiple layers. Our proposed model is benchmarked against existing HSIC-only models across several datasets, including MNIST, CIFAR-10, and Fashion MNIST. Results demonstrate the superior performance of our model in terms of accuracy and memory usage. Additionally, we demonstrate its robustness and effectiveness when handling noisy data. The code is available at https://***/AreeBeee/***
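For readers unfamiliar with HSIC, the sketch below computes the standard biased empirical estimator, tr(KHLH) / (n - 1)^2, where K and L are Gram matrices and H is the centering matrix. It is not the authors' training code: the Gaussian kernel, the bandwidth `sigma`, and all function names are assumptions made for illustration.

```python
import math


def gaussian_gram(samples, sigma=1.0):
    """Gram matrix of a Gaussian kernel over a list of equal-length vectors."""
    return [
        [
            math.exp(-sum((a - b) ** 2 for a, b in zip(x, y)) / (2 * sigma ** 2))
            for y in samples
        ]
        for x in samples
    ]


def double_center(gram):
    """Return H K H, where H = I - (1/n) 11^T is the centering matrix."""
    n = len(gram)
    row_mean = [sum(row) / n for row in gram]
    col_mean = [sum(gram[i][j] for i in range(n)) / n for j in range(n)]
    grand_mean = sum(row_mean) / n
    return [
        [gram[i][j] - row_mean[i] - col_mean[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def hsic(xs, ys, sigma=1.0):
    """Biased empirical HSIC: tr(K H L H) / (n - 1)^2."""
    n = len(xs)
    centered_k = double_center(gaussian_gram(xs, sigma))
    gram_l = gaussian_gram(ys, sigma)
    # tr(H K H L) equals the elementwise sum because L is symmetric.
    trace = sum(centered_k[i][j] * gram_l[i][j] for i in range(n) for j in range(n))
    return trace / (n - 1) ** 2


xs = [[0.0], [1.0], [2.0], [3.0]]
constant = [[5.0]] * 4
dependent_score = hsic(xs, xs)          # a variable compared with itself
independent_score = hsic(xs, constant)  # a constant carries no information
```

A constant second variable gives a score of zero up to rounding, while comparing a variable with itself gives a strictly positive score, which is the behaviour a dependence measure should show.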
Speech enhancement aims to improve the intelligibility and quality of speech across diverse noise conditions. Recently, diffusion models have gained considerable attention in speech enhancement, achieving competitive results. Current diffusion-based methods blur the distribution of the signal with isotropic Gaussian noise and recover the clean speech distribution from the prior. However, these methods often suffer from a substantial computational burden. We argue that the computational inefficiency partially stems from the oversight that speech enhancement is not purely a generative task; it primarily involves noise reduction and completion of missing information, while the clean clues in the original mixture do not need to be regenerated. In this paper, we propose a method that introduces noise with anisotropic guidance during the diffusion process, allowing the neural network to preserve clean clues within noisy recordings. This approach substantially reduces computational complexity while exhibiting robustness against various forms of noise and speech distortion. Experiments demonstrate that the proposed method achieves state-of-the-art results with only approximately 4.5 million parameters, significantly fewer than other diffusion methods require. This effectively narrows the model size disparity between diffusion-based and predictive speech enhancement approaches. Additionally, the proposed method performs well in very noisy scenarios, demonstrating its potential for applications in highly challenging environments.
As a speciality, radiology produces the highest volume of medical images in clinical establishments compared to other commonly employed imaging modalities like digital pathology, ophthalmic imaging, etc. Archiving this massive quantity of images with large file sizes is a major problem since the costs associated with storing medical images continue to rise with an increase in the cost of electronic storage devices. One possible solution is to compress them for effective storage. The prime challenge is that each modality is distinctively characterized by the dynamic range and resolution of the signal and its spatial and statistical distribution. Such variations in medical images are different from camera-acquired natural scene images. Thus, conventional natural image compression algorithms such as J2K and JPEG often fail to preserve the clinically relevant details present in medical images. We address this challenge by developing a modality-specific compressor and a modality-agnostic generic decompressor implemented using a deep neural network (DNN) and capable of preserving clinically relevant image information. The architecture of the DNN is obtained through design space exploration (DSE) with the objective of achieving the least computational complexity at the highest compression and a target high-quality factor, thereby leading to a low power requirement for computation. The neural compressed bitstream is further compressed using lossless Huffman encoding to obtain a variable bit length and high-density compression (20x-400x). Experimental validation is performed on X-ray, CT, and MRI. Through quantitative measurement and clinical validation with a radiologist in the loop, we experimentally demonstrate our approach's performance superiority over traditional methods like JPEG and J2K operating at matching compression factors.
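The final lossless stage mentioned in this abstract, Huffman encoding, can be sketched as follows. This is a generic textbook construction over an arbitrary symbol sequence, not the authors' implementation; how the neural bitstream is split into symbols is not described in the abstract, and the function names here are hypothetical.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix-free code table {symbol: bit string} from symbol frequencies."""
    freq = Counter(symbols)
    if not freq:
        return {}
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Each heap entry is (count, tie_breaker, partial code table); the unique
    # tie breaker guarantees that the dictionaries are never compared.
    heap = [(count, i, {sym: ""}) for i, (sym, count) in enumerate(freq.items())]
    heapq.heapify(heap)
    tie = len(heap)
    while len(heap) > 1:
        count_a, _, codes_a = heapq.heappop(heap)
        count_b, _, codes_b = heapq.heappop(heap)
        merged = {sym: "0" + code for sym, code in codes_a.items()}
        merged.update({sym: "1" + code for sym, code in codes_b.items()})
        heapq.heappush(heap, (count_a + count_b, tie, merged))
        tie += 1
    return heap[0][2]


def encode(symbols, table):
    """Concatenate the code words of all symbols."""
    return "".join(table[sym] for sym in symbols)


def decode(bits, table):
    """Invert a prefix-free code by scanning the bit string left to right."""
    inverse = {code: sym for sym, code in table.items()}
    decoded, buffer = [], ""
    for bit in bits:
        buffer += bit
        if buffer in inverse:
            decoded.append(inverse[buffer])
            buffer = ""
    return decoded


# Toy symbol stream with a skewed distribution (illustrative values only).
stream = [0] * 8 + [1] * 4 + [2] * 2 + [3] * 2
table = huffman_code(stream)
bits = encode(stream, table)
```

Frequent symbols receive shorter code words, which is where the variable bit length comes from: the 16 toy symbols above need 28 bits instead of the 32 a fixed 2-bit code would use.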
In the field of optical fiber vibration signal recognition, one-dimensional signals have few features. Shallow one-dimensional convolutional neural networks (1D-CNNs) are often used, so the network learns few features and the recognition rate is poor. Many existing approaches also rely on complex algorithms and data processing methods, which make the whole signal recognition process more complicated. Therefore, an optical vibration signal recognition method based on an efficient multidimensional feature extraction network was proposed. Based on ResNet-50, efficient channel attention (ECA) was used to improve image feature extraction, and a long short-term memory (LSTM) network was used to enhance the extraction of temporal features. Three different vibration signals were collected using a phase-sensitive optical time-domain reflectometry (Φ-OTDR) optical fiber sensing system. Vibration signals were converted into 128 × 128 grayscale images, which contain more effective vibration information. The experimental results show that the three types of signals can be recognized and classified effectively by the network, and the average recognition rate is 98.67%. © 2024 Optica Publishing Group
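One simple way to turn a one-dimensional vibration signal into a 128 × 128 grayscale image is sketched below. The abstract does not describe the conversion the authors used, so the row-wise reshape and min-max scaling to 8-bit pixel values are assumptions, and `signal_to_grayscale` is a hypothetical name.

```python
import math


def signal_to_grayscale(signal, size=128):
    """Map the first size*size samples to a size x size grid of 0..255 pixel values."""
    needed = size * size
    if len(signal) < needed:
        raise ValueError(f"need at least {needed} samples, got {len(signal)}")
    window = signal[:needed]
    low, high = min(window), max(window)
    span = high - low
    # Min-max scaling; a constant signal maps to an all-black image.
    pixels = [0 if span == 0 else round(255 * (v - low) / span) for v in window]
    # Row-wise reshape: consecutive samples fill each row from left to right.
    return [pixels[r * size:(r + 1) * size] for r in range(size)]


# Synthetic stand-in for a vibration trace (not data from the paper).
trace = [math.sin(0.01 * i) for i in range(128 * 128)]
image = signal_to_grayscale(trace)
```

With this layout each image row holds 128 consecutive samples, so vertical structure in the image reflects how the signal changes between successive windows.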