ISBN (Print): 9798350349405; 9798350349399
Deep learning in image classification has achieved remarkable success but at the cost of high resource demands. Model compression through automatic joint pruning-quantization addresses this issue, yet most existing techniques overlook a critical aspect: layer correlations. These correlations are essential as they expose redundant computations across layers, and leveraging them facilitates efficient design space exploration. This study employs Graph Neural Networks (GNNs) to learn these inter-layer relationships, thereby optimizing the pruning-quantization strategy for the targeted model. This approach has yielded a 99.36% reduction in complexity for ResNet20 on CIFAR-10, with only a minimal 0.11% drop in accuracy. Furthermore, the integration of the GNN sped up convergence, reducing the number of iterations by a factor of 2.46 on average compared to methods without the GNN.
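As an illustration of how layer correlations might be exploited, the following is a minimal PyTorch sketch (not the paper's implementation) of a message-passing GNN that maps a graph of layer descriptors to a per-layer pruning ratio and bit-width. The class name `LayerGraphPolicy`, the feature dimensions, and the bit-width range are all assumptions made for illustration.

```python
import torch
import torch.nn as nn

class LayerGraphPolicy(nn.Module):
    """Toy message-passing GNN mapping per-layer descriptors plus the layer
    adjacency to a pruning ratio and a quantization bit-width per layer."""
    def __init__(self, in_dim=6, hidden=64, rounds=2):
        super().__init__()
        self.embed = nn.Linear(in_dim, hidden)
        self.msg = nn.Linear(hidden, hidden)
        self.upd = nn.GRUCell(hidden, hidden)
        self.head = nn.Linear(hidden, 2)      # [pruning-ratio logit, bit-width logit]
        self.rounds = rounds

    def forward(self, x, adj):
        # x:   (L, in_dim) per-layer features (type, channels, FLOPs, ...)
        # adj: (L, L)      adjacency of the layer graph, incl. skip connections
        h = torch.relu(self.embed(x))
        for _ in range(self.rounds):
            m = adj @ self.msg(h)              # aggregate messages from connected layers
            h = self.upd(m, h)                 # update node states
        out = self.head(h)
        prune_ratio = torch.sigmoid(out[:, 0])         # fraction of channels to prune
        bits = 2.0 + 6.0 * torch.sigmoid(out[:, 1])    # continuous relaxation in [2, 8] bits
        return prune_ratio, bits

# Example: a 20-layer network (e.g. ResNet20) described by 6 features per layer.
policy = LayerGraphPolicy()
feat = torch.randn(20, 6)
adj = torch.eye(20) + torch.diag(torch.ones(19), 1)   # chain edges + self-loops
ratios, bits = policy(feat, adj)
```

Such a policy would typically be trained against an accuracy/complexity reward (e.g. with reinforcement learning or evolutionary search), which the abstract does not detail.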
ISBN (Print): 9798350344868; 9798350344851
Convolutional neural networks (CNNs) have long been the paradigm of choice for robust medical image processing (MIP). Therefore, it is crucial to effectively and efficiently deploy CNNs on devices with different computing capabilities to support computer-aided diagnosis. Many methods employ factorized convolutional layers to alleviate the burden of limited computational resources at the expense of expressiveness. To this end, given weak medical image-driven CNN model optimization, a Singular value equalization generalizer-induced Factorized Convolution (SFConv) is proposed to improve the expressive power of factorized convolutions in MIP models. We first decompose the weight matrix of convolutional filters into two low-rank matrices to achieve model reduction. We then minimize the KL divergence between the two low-rank weight matrices and the uniform distribution, thereby reducing the number of singular value directions with significant variance. Extensive experiments on fundus and OCTA datasets demonstrate that our SFConv yields competitive expressiveness over vanilla convolutions while reducing complexity.
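A rough sketch of the two ingredients described above, a low-rank factorized convolution and a KL-based singular value equalization penalty, is shown below. The factorization into (k x 1)/(1 x k) convolutions, the rank choice, and the helper names are assumptions, since the abstract does not give the exact SFConv construction.

```python
import torch
import torch.nn as nn

class FactorizedConv(nn.Module):
    """A k x k convolution factorized into low-rank (k x 1) and (1 x k) convolutions."""
    def __init__(self, in_ch, out_ch, k=3, rank=None):
        super().__init__()
        rank = rank or min(in_ch, out_ch) // 2
        self.a = nn.Conv2d(in_ch, rank, (k, 1), padding=(k // 2, 0), bias=False)
        self.b = nn.Conv2d(rank, out_ch, (1, k), padding=(0, k // 2), bias=False)

    def forward(self, x):
        return self.b(self.a(x))

def sv_equalization_loss(conv, eps=1e-12):
    """KL divergence between the normalized singular-value spectrum of each factor
    and the uniform distribution; small values mean a flatter (equalized) spectrum."""
    loss = 0.0
    for w in (conv.a.weight, conv.b.weight):
        s = torch.linalg.svdvals(w.flatten(1))   # singular values of the reshaped weight
        p = s / s.sum()
        u = torch.full_like(p, 1.0 / p.numel())
        loss = loss + torch.sum(p * torch.log((p + eps) / u))
    return loss

# Usage: add the regularizer to the task loss with a small weight.
conv = FactorizedConv(32, 64)
y = conv(torch.randn(2, 32, 28, 28))              # same spatial size as the input
reg = 1e-3 * sv_equalization_loss(conv)
```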
Recent work developed convolutional deep kernel machines, achieving 92.7% test accuracy on CIFAR-10 using a ResNet-inspired architecture, which is SOTA for kernel methods. However, this still lags behind neural networ...
The design of arrays capable of receiving wideband signals differs from arrays that can only receive narrowband signals. These arrays must be able to receive signals with an instantaneous bandwidth of several GHz across the entire operating frequency range, as in high-resolution radars or terahertz 6G communication systems. In these arrays, using a time delay line structure increases the number of beamformer coefficients, resulting in high computational complexity. This poses a challenge for beamforming in wideband systems. Classic wideband beamformers also suffer from other factors, such as poor performance in the presence of direction-of-arrival errors, array calibration errors, and the need for many snapshots to reach the beamformer's steady state. This work therefore focuses on the robustness of wideband adaptive beamforming using a deep-unfolding model-based technique, which has not been discussed before. Deep unfolding is an innovative technique that amalgamates iterative optimization approaches with elements of neural networks, and is applied across disciplines such as machine learning, signal and image processing, and telecommunication systems. The network training method is also designed to be more robust against the factors mentioned above. The proposed structure is evaluated against the constraints of previous methods and is observed to perform better than classic algorithms. Comparisons with conventional deep learning methods show that the proposed structure performs on par with them in some cases and better in others. This article introduces a novel approach to estimating robust adaptive wideband beamforming coefficients, called the deep unfolding model-based method. The method transforms an iterative algorithm with a fixed number of iterations into a layer-wise structure resembling a neural network.
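To make the deep-unfolding idea concrete, here is a minimal sketch that unrolls a penalized-gradient iteration on the MVDR cost into a fixed number of layers with learnable step sizes and constraint weights. It is a narrowband toy, not the paper's wideband (tapped-delay-line) formulation, and all names and hyperparameters are illustrative.

```python
import torch
import torch.nn as nn

class UnfoldedBeamformer(nn.Module):
    """Penalized-gradient iterations on the MVDR cost w^H R w + mu |a^H w - 1|^2,
    unrolled into K layers with a learnable step size and penalty per layer."""
    def __init__(self, K=8):
        super().__init__()
        self.K = K
        self.step = nn.Parameter(0.1 * torch.ones(K))
        self.mu = nn.Parameter(torch.ones(K))

    def forward(self, R, a):
        # R: (N, N) sample covariance (complex), a: (N,) steering vector (complex)
        w = a / (a.conj() @ a)                            # start from the matched filter
        for k in range(self.K):
            grad = R @ w + self.mu[k] * (a.conj() @ w - 1.0) * a
            w = w - self.step[k] * grad
        return w

N = 8
theta = 0.3                                               # look direction (radians)
phase = torch.pi * torch.arange(N) * torch.sin(torch.tensor(theta))
a = torch.polar(torch.ones(N), phase)                     # unit-magnitude steering vector
x = torch.randn(200, N, dtype=torch.cfloat)               # snapshots (rows)
R = x.conj().T @ x / x.shape[0]                           # sample covariance
w = UnfoldedBeamformer()(R, a)
```

In practice the layer parameters would be trained on simulated scenarios containing steering and calibration errors, which is what the abstract attributes the robustness gains to.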
Images captured under improper exposure conditions lose their brightness information and texture details. Therefore, the enhancement of low-light images has received widespread attention. In recent years, most methods have been based on deep convolutional neural networks that enhance low-light images in the spatial domain, which tends to introduce a huge number of parameters, thus limiting their practical applicability. In this paper, we propose a Fourier-based two-stage low-light image enhancement method via mutual learning (FT-LLIE), which sequentially enhances the amplitude and phase components. Specifically, we design the amplitude enhancement module (AEM) and phase enhancement module (PEM). In these two enhancement stages, we design the amplitude enhancement block (AEB) and phase enhancement block (PEB) based on the Fast Fourier Transform (FFT) to process the amplitude component and the phase component, respectively. In the AEB and PEB, we design a spatial unit (SU) and a frequency unit (FU) to process spatial-domain and frequency-domain information, and adopt a mutual learning strategy so that the local features extracted in the spatial domain and the global features extracted in the frequency domain can learn from each other and provide complementary information to enhance the image. Extensive experiments show that our network requires only a small number of parameters to effectively enhance image details, outperforming existing low-light image enhancement algorithms in both qualitative and quantitative results.
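The amplitude-enhancement stage can be sketched as follows: the image is taken to the frequency domain with an FFT, the amplitude is corrected by a small network while the phase is kept fixed, and the result is transformed back. This is only a simplified single-path version of the AEB; the paper's spatial unit, frequency unit, and mutual learning strategy are omitted, and the module name and layer sizes are assumptions.

```python
import torch
import torch.nn as nn

class AmplitudeEnhance(nn.Module):
    """First-stage sketch: enhance the Fourier amplitude of a low-light image with a
    small residual network while keeping the phase unchanged."""
    def __init__(self, ch=3, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(ch, hidden, 1), nn.ReLU(inplace=True),
            nn.Conv2d(hidden, ch, 1),
        )

    def forward(self, x):
        f = torch.fft.rfft2(x, norm="ortho")             # complex spectrum (B, C, H, W//2+1)
        amp, phase = torch.abs(f), torch.angle(f)
        amp = torch.relu(amp + self.net(amp))             # residual amplitude correction
        f = torch.polar(amp, phase)                       # recombine amplitude and phase
        return torch.fft.irfft2(f, s=x.shape[-2:], norm="ortho")

x = torch.rand(1, 3, 128, 128) * 0.2                      # a dark input image
y = AmplitudeEnhance()(x)
```

A second, analogous module operating on the phase would form the paper's second stage.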
ISBN (Print): 9798350387261; 9798350387254
With the increasing number of images and videos consumed by computer vision algorithms, compression methods are evolving to consider both perceptual quality and performance in downstream tasks. Traditional codecs can tackle this problem by performing rate-distortion optimization (RDO) to minimize the distance at the output of a feature extractor. However, neural network non-linearities can make the rate-distortion landscape irregular, leading to reconstructions with poor visual quality even for high bit rates. Moreover, RDO decisions are made block-wise, while the feature extractor requires the whole image to exploit global information. In this paper, we address these limitations in three steps. First, we apply Taylor's expansion to the feature extractor, recasting the metric as an input-dependent squared error involving the Jacobian matrix of the neural network. Second, we make a localization assumption to compute the metric block-wise. Finally, we use randomized dimensionality reduction techniques to approximate the Jacobian. The resulting expression is monotonic with the rate and can be evaluated in the transform domain. Simulations with AVC show that our approach provides bit-rate savings while preserving accuracy in downstream tasks with less complexity than using the feature distance directly.
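The first and third steps, linearizing the feature extractor and sketching its Jacobian with random projections, can be illustrated with the following hedged example, which uses vector-Jacobian products to form a few rows of S*J and then evaluates the input-dependent squared error ||S J (x - x_hat)||^2. The block-wise localization and the transform-domain evaluation from the paper are not shown, and the toy feature extractor and function names are stand-ins.

```python
import torch
import torch.nn as nn
from torch.func import vjp

def sketched_jacobian(f, x, m=8):
    """Rows of S*J, where J is the Jacobian of the feature extractor f at x and S is a
    random Gaussian sketch with m rows, so that E||S J d||^2 = ||J d||^2 for any d."""
    y, vjp_fn = vjp(f, x)
    rows = []
    for _ in range(m):
        s = torch.randn_like(y) / m ** 0.5
        (r,) = vjp_fn(s)                       # r = s^T J, same shape as x
        rows.append(r.flatten())
    return torch.stack(rows)                   # (m, numel(x))

def feature_distortion(SJ, x, x_hat):
    """Input-dependent squared error ||S J (x - x_hat)||^2, a cheap proxy for the
    distance between deep features of x and of its reconstruction x_hat."""
    d = (x - x_hat).flatten()
    return torch.sum((SJ @ d) ** 2)

# Toy feature extractor standing in for the downstream network.
f = nn.Sequential(nn.Flatten(start_dim=0), nn.Linear(3 * 32 * 32, 64))
x = torch.rand(3, 32, 32)
SJ = sketched_jacobian(f, x)                   # computed once per image
x_hat = x + 0.01 * torch.randn_like(x)         # a candidate reconstruction
print(feature_distortion(SJ, x, x_hat))
```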
Cuttings logging is an important technology in petroleum exploration and production. It can be used to identify rock types, oil and gas properties, and reservoir features. However, the cuttings collected during cuttings logging are often small and few in number. Moreover, the surface color of cuttings is dark and their boundaries are fuzzy, so traditional image segmentation methods have low accuracy and it is difficult to identify and classify cuttings. It is therefore important to improve the accuracy of cuttings image segmentation. A deep learning-based cuttings image segmentation method is proposed in this paper. First, the MultiRes module concept is introduced into the UNet++ segmentation model, yielding an improved end-to-end semantic segmentation model (called MultiRes-UNet++). Second, batch normalization is added at the input of each layer's feature convolution. Finally, a convolutional attention mechanism is integrated into the improved MultiRes-UNet++ model. Experimental results show that the accuracy between the segmentation results and the original image labels is 0.8791, the Dice coefficient is 0.8785, and the intersection over union is 0.7833. Compared with existing neural network segmentation algorithms, performance is improved by about 5%. Compared with the algorithm before fusing the attention mechanism, training speed is increased by about 75.2%. Our method can provide auxiliary information for cuttings logging and is also of great significance for subsequent rock identification and classification.
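For reference, a MultiRes-style block (in the spirit of the MultiRes module the paper introduces into UNet++) can be sketched as three chained 3x3 convolutions whose outputs are concatenated, plus a 1x1 residual shortcut. The exact channel split, the placement of batch normalization, and the convolutional attention mechanism used in MultiRes-UNet++ are not specified in the abstract, so this layout is only an assumption.

```python
import torch
import torch.nn as nn

class MultiResBlock(nn.Module):
    """MultiRes-style block: three chained 3x3 convolutions whose outputs are
    concatenated (approximating 3x3/5x5/7x7 receptive fields) plus a 1x1 shortcut."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        c = out_ch // 3
        def unit(i, o):
            return nn.Sequential(nn.Conv2d(i, o, 3, padding=1),
                                 nn.BatchNorm2d(o), nn.ReLU(inplace=True))
        self.c1, self.c2, self.c3 = unit(in_ch, c), unit(c, c), unit(c, out_ch - 2 * c)
        self.shortcut = nn.Conv2d(in_ch, out_ch, 1)
        self.bn = nn.BatchNorm2d(out_ch)

    def forward(self, x):
        y1 = self.c1(x)
        y2 = self.c2(y1)
        y3 = self.c3(y2)
        y = torch.cat([y1, y2, y3], dim=1)
        return torch.relu(self.bn(y + self.shortcut(x)))

block = MultiResBlock(64, 96)
out = block(torch.randn(2, 64, 56, 56))        # -> (2, 96, 56, 56)
```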
Surface plasmon resonance (SPR) sensors are technologically attractive for applications that demand quick and accurate monitoring of biological substances. In a typical SPR image response, the resonance condition, indicated by the minimum reflectivity values, acts like an optical signature for changes in the refractive index (RI) of the substance under analysis. Recently, machine and deep learning methods (MDLMs) have been incorporated into SPR sensors to create intelligent tasks along the signal processing chain employed in SPR biosensing. One possible intelligent application is substance identification based on the analysis of SPR responses. Occasionally, this problem is addressed with data from SPR curves, requiring prior SPR image manipulation to generate the respective curves, which adds process steps and time. This article presents the design of an intelligent SPR sensor with analyte identification capabilities taken directly from its SPR image, offering guidance on the precise moment for substance switching during injection routines. An image-based prediction model with convolutional neural networks (CNNs) was fine-tuned to directly identify individual aqueous solutions with different refractive indices. A new approach was described to generate SPR images (Fresnel images) by calculation with the Fresnel analysis (FA) framework. The proposed CNN architecture was evaluated and compared with seven state-of-the-art CNN architectures. The models were integrated into the experimental setup for real-time identification. The experimental tests demonstrate the viability of the overall pipeline for model conception, reaching more than 96% accuracy in the identification task.
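The identification step amounts to image classification of SPR frames. As a hedged stand-in for the paper's own CNN (which is compared against seven reference architectures), the sketch below fine-tunes a pretrained ResNet-18 for an assumed five-class analyte identification task; the class count, learning rate, and backbone are illustrative only.

```python
import torch
import torch.nn as nn
from torchvision import models

# Fine-tune a pretrained backbone to classify SPR image frames by analyte.
n_classes = 5                                              # assumed number of solutions
model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
model.fc = nn.Linear(model.fc.in_features, n_classes)      # replace the classifier head

optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)
loss_fn = nn.CrossEntropyLoss()

def train_step(images, labels):
    # images: (B, 3, H, W) SPR frames (camera captures or synthetic Fresnel images)
    optimizer.zero_grad()
    loss = loss_fn(model(images), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```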
ISBN (Print): 9798350344868; 9798350344851
In image stitching, artifacts caused by misalignment affect the visual quality and the performance of subsequent tasks such as segmentation and detection. This paper proposes SMPR, a reconstruction-based aligned image composition method to minimize artifacts. SMPR fuses images in part of the overlapping areas and reconstructs other portions from single images. Specifically, we propose a seam mask generation method to obtain optimal seam masks that pass through minimal misalignment. During training, we use the seam masks to guide the model in detecting optimal fusion areas. In testing, the model can detect fusion areas without seam masks and reconstruct stitching results. We propose a quantum-inspired local aggregation (QILA) module to improve feature reconstruction performance. We develop an encoder-decoder network with QILA and experiment on a real-world dataset. The experiments show that our method outperforms state-of-the-art methods in both qualitative and quantitative aspects.
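One classical way to obtain a seam mask that "passes through minimal misalignment" is a dynamic-programming seam over the per-pixel difference of the aligned images, as in seam carving. The sketch below shows that idea only; it is not SMPR's learned seam mask generation or its QILA-based reconstruction, and the function name and HxWx3 input layout are assumptions.

```python
import numpy as np

def seam_mask(img1, img2):
    """Binary mask over the aligned overlap whose boundary follows a vertical seam of
    minimal per-pixel difference between the two images (dynamic programming)."""
    diff = np.abs(img1.astype(np.float32) - img2.astype(np.float32)).sum(axis=2)
    h, w = diff.shape
    cost = diff.copy()
    for y in range(1, h):                       # accumulate seam costs top to bottom
        left = np.roll(cost[y - 1], 1)
        left[0] = np.inf
        right = np.roll(cost[y - 1], -1)
        right[-1] = np.inf
        cost[y] += np.minimum(np.minimum(left, cost[y - 1]), right)
    mask = np.zeros((h, w), dtype=np.float32)
    x = int(np.argmin(cost[-1]))                # backtrack the cheapest seam
    for y in range(h - 1, -1, -1):
        mask[y, :x] = 1.0                       # take img1 left of the seam
        if y:
            lo, hi = max(x - 1, 0), min(x + 2, w)
            x = lo + int(np.argmin(cost[y - 1, lo:hi]))
    return mask

# Composite the overlap: out = mask[..., None] * img1 + (1 - mask[..., None]) * img2
```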
Liver cancer remains a significant health concern, and accurate segmentation in CT scans is crucial for diagnosis and treatment. Deep learning-based auxiliary diagnosis techniques, especially those utilizing U-shaped structures, are widely employed in medical image segmentation. However, traditional methods that utilize Convolutional Neural Networks (CNNs) generally have limitations in modeling long-range dependencies. Inspired by the success of Transformers in various vision tasks, approaches that combine Transformers with CNNs have emerged. However, many existing hybrid CNN-Transformer models are prone to yielding poor performance on relatively small-scale medical image datasets when trained from scratch. Moreover, some of these methods involve additional customized fusion modules, which introduce extra workload and parameters to the model. To address these limitations, we propose AD-DUNet, a hybrid CNN-Transformer model for liver and hepatic tumor segmentation, which comprises a dual-branch encoder and a residual decoder. The Transformer-based encoder, utilizing Axial Transformer (AT) blocks, efficiently captures long-range dependencies across the entire image, while the CNN-based encoder, constructed with cascaded dilated convolution (CDC) blocks, extracts fine-grained local features. The two encoders synergize in the shared residual decoder, eliminating the need for additional fusion modules. Extensive experiments conducted on the LiTS2017 and 3DIRCAD datasets demonstrate the superiority of AD-DUNet over existing models. Remarkably, our approach achieves state-of-the-art results without relying on pre-trained weights, showcasing its efficiency with low complexity and only 4.24M parameters.
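A cascaded dilated convolution block of the kind the CNN branch is described as using can be sketched as stacked 3x3 convolutions with growing dilation rates and a residual connection; the dilation schedule, normalization, and how the block feeds the shared residual decoder are assumptions, as the abstract does not specify them.

```python
import torch
import torch.nn as nn

class CascadedDilatedConv(nn.Module):
    """Cascaded dilated convolutions: stacked 3x3 convs with growing dilation enlarge
    the receptive field of the CNN branch without extra downsampling."""
    def __init__(self, ch, dilations=(1, 2, 4)):
        super().__init__()
        layers = []
        for d in dilations:
            layers += [nn.Conv2d(ch, ch, 3, padding=d, dilation=d),
                       nn.BatchNorm2d(ch), nn.ReLU(inplace=True)]
        self.body = nn.Sequential(*layers)

    def forward(self, x):
        return x + self.body(x)                 # residual connection preserves fine detail

x = torch.randn(1, 32, 64, 64)
print(CascadedDilatedConv(32)(x).shape)         # torch.Size([1, 32, 64, 64])
```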