Image segmentation is a critical step in digital image processing applications. One of the most preferred methods for image segmentation is multilevel thresholding, in which a set of threshold values is determined to divide an image into different classes. However, the computational complexity increases when the required number of thresholds is high. Therefore, this paper introduces a modified Coronavirus Optimization algorithm for image segmentation. In the proposed algorithm, a chaotic map is added to the initialization step of the naive algorithm to increase the diversity of solutions. A hybrid of two commonly used methods, Otsu's method and Kapur's entropy, is applied to form a new fitness function that determines the optimum threshold values. The proposed algorithm is evaluated on two datasets comprising six benchmark images and six satellite images. Various evaluation metrics are used to measure the quality of the segmented images, such as mean square error, peak signal-to-noise ratio, Structural Similarity Index, Feature Similarity Index, and Normalized Correlation Coefficient. Additionally, the best fitness values are calculated to demonstrate the proposed method's ability to find the optimum solution. The obtained results are compared to eleven powerful and recent metaheuristics and prove the superiority of the proposed algorithm on the image segmentation problem.
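As a concrete illustration of the two ingredients this abstract describes, here is a minimal numpy sketch of chaotic (logistic-map) population initialization and Otsu's between-class variance for a candidate threshold set. The function names, the map parameters (r = 4.0, the seed value), and the plain sum-based variance are illustrative assumptions, not the paper's actual implementation; the paper's fitness additionally hybridizes Kapur's entropy with the Otsu term.

```python
import numpy as np

def logistic_map_population(pop_size, dim, low, high, r=4.0, seed=0.7):
    """Initialize a candidate population with a logistic chaotic map
    instead of uniform random sampling (assumed variant, not the
    paper's exact scheme)."""
    pop = np.empty((pop_size, dim))
    x = seed
    for i in range(pop_size):
        for j in range(dim):
            x = r * x * (1.0 - x)          # logistic map stays in (0, 1)
            pop[i, j] = low + x * (high - low)
    return pop

def otsu_variance(hist, thresholds):
    """Between-class variance of Otsu's method for a threshold set;
    higher is better, so a metaheuristic maximizes this."""
    p = hist / hist.sum()
    levels = np.arange(len(p))
    edges = [0] + sorted(int(t) for t in thresholds) + [len(p)]
    total_mean = (p * levels).sum()
    var = 0.0
    for a, b in zip(edges[:-1], edges[1:]):
        w = p[a:b].sum()
        if w > 0:
            mu = (p[a:b] * levels[a:b]).sum() / w
            var += w * (mu - total_mean) ** 2
    return var
```

A metaheuristic would score each row of the population with `otsu_variance` (or a weighted Otsu/Kapur hybrid) over the image histogram and evolve the best threshold vector.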
Recognizing real visual textures in nature has been a challenging task, since they are complex and stochastic. In spite of several decades of research, classifying real-world color textures is still challenging because of the intricate nature of the textures and the lack of substantial accuracy improvements on benchmark datasets. Deep learning techniques have been found effective in identifying and classifying texture patterns to a large extent, but they cannot capture spectral information and achieve excellent results for natural images. In this paper, we propose a deep convolutional neural network architecture, WaveTexNeT, that combines wavelet convolutional neural networks (WaveletCNN) and the Xception model with luminance information for classifying real-world natural textures. Spectral and spatial features are extracted from the WaveletCNN and Xception models. The highlight of the work is the utilization of spectral and spatial information along with luminance for texture classification. A color-space image data augmentation technique is proposed that uses luminance images from the YIQ model for color texture classification. This work also sheds light on the significance of luminance information for texture classification. Experimental analysis shows that WaveTexNeT captures better feature representations and outperforms the accuracy obtained by state-of-the-art methods. WaveTexNeT obtained accuracies of 90.34% and 95.01% on the describable and material perception texture datasets DTD and FMD, respectively.
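The luminance channel this abstract refers to comes from the YIQ model, whose Y component is the standard NTSC weighted sum of the RGB channels. A minimal sketch of extracting it for use as an extra augmentation view (the function name and array layout are assumptions, not the paper's code):

```python
import numpy as np

def yiq_luminance(rgb):
    """Extract the Y (luminance) channel of the YIQ color model from
    an RGB image array of shape (H, W, 3), values in any linear range.
    Uses the standard NTSC luma coefficients."""
    weights = np.array([0.299, 0.587, 0.114])
    return rgb @ weights  # weighted sum over the channel axis -> (H, W)
```

In a luminance-based augmentation scheme, the single-channel Y image would be replicated or stacked alongside the RGB input to give the classifier an explicit brightness view of each texture.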
Recently, relying solely on T2I has gradually proven insufficient to meet the demands for image generation. As a result, people have started exploring more controllable image-generation methods based on Diffusion tech...
Convolutional Neural Networks (CNNs) have gained significant popularity in image classification tasks, yet achieving their optimal design remains a challenge due to the vast array of possible layer configurations and ...
Circular synthetic aperture sonars (CSAS) capture multiple observations of a scene to reconstruct high-resolution images. We can characterize resolution by modeling CSAS imaging as the convolution between a scene's underlying point scattering distribution and a system-dependent point spread function (PSF). The PSF is a function of the system bandwidth and determines a fixed degree of blurring on reconstructed imagery. In theory, deconvolution overcomes bandwidth limitations by reversing the PSF-induced blur and recovering the scene's scattering distribution. However, deconvolution is an ill-posed inverse problem that is sensitive to noise. We propose an optimization method that leverages an implicit neural representation (INR) to deconvolve CSAS images. We highlight the performance of our SAS INR pipeline, which we call SINR, by implementing and comparing it to existing deconvolution methods. Additionally, prior SAS deconvolution methods assume a spatially invariant PSF, which we demonstrate yields subpar performance in practice. We provide theory and methods to account for a spatially varying CSAS PSF, and demonstrate that doing so enables SINR to achieve superior deconvolution performance on simulated and real acoustic SAS data.
ISBN:
(print) 9781728198354
This paper proposes a two-stage 3D object detection framework, the multiscale voxel graph neural network (MSV-RGNN), which aims to fully exploit multiscale graph features by establishing global and local relationships between voxel features at different 3D convolutional neural network (CNN) layers. In contrast to conventional graph-based methods, our proposed multiscale-voxel-graph region-of-interest (RoI) pooling module constructs graphs across diverse voxel resolutions to obtain geometric structure information from voxel features. Initially, the multiscale-voxel-graph RoI pooling module samples voxel center points with voxel-wise feature vectors and 3D region proposals from the backbone network. Subsequently, graphs are constructed at different scales and graph features are aggregated for second-stage refinement. The experimental results demonstrate the potential of using multiscale graphs across different voxel resolutions for 3D object detection, achieving results competitive with state-of-the-art methods.
In the evolving digital landscape, the proliferation of manipulated images poses a significant challenge to the authenticity and integrity of visual content. This project investigates cutting-edge image manipulation d...
Ultrasound imaging relies heavily on high-quality signal processing to provide reliable and interpretable image reconstructions. Conventionally, reconstruction algorithms have been derived from physical principles. These algorithms rely on assumptions and approximations of the underlying measurement model, limiting image quality in settings where these assumptions break down. Conversely, more sophisticated solutions based on statistical modeling, careful parameter tuning, or increased model complexity can be sensitive to different environments. Recently, deep learning-based methods, which are optimized in a data-driven fashion, have gained popularity. These model-agnostic techniques often rely on generic model structures and require vast training data to converge to a robust solution. A relatively new paradigm combines the power of the two: leveraging data-driven deep learning while exploiting domain knowledge. These model-based solutions yield high robustness and require fewer parameters and less training data than conventional neural networks. In this work we provide an overview of these techniques from the recent literature and discuss a wide variety of ultrasound applications. We aim to inspire the reader to perform further research in this area and to address the opportunities within the field of ultrasound signal processing. We conclude with a future perspective on model-based deep learning techniques for medical ultrasound. (E-mail: ***@***) (c) 2022 The Author(s). Published by Elsevier Inc. on behalf of World Federation for Ultrasound in Medicine & Biology. This is an open access article under the CC BY license (http://***/licenses/by/4.0/).
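A minimal sketch of the model-based idea this abstract surveys is algorithm unrolling: a fixed number of iterations of a physics-derived update (here, gradient descent on a linear measurement model) is treated as a network whose per-iteration step sizes would be learned from data instead of hand-tuned. The function name and the plain least-squares model are illustrative assumptions, not any specific method from the reviewed literature:

```python
import numpy as np

def unrolled_reconstruction(y, A, steps, step_sizes):
    """Unrolled gradient descent on ||A x - y||^2.
    In model-based deep learning, step_sizes (and possibly extra
    per-iteration filters) are trainable parameters; here they are
    plain numbers, so the physics structure A is kept explicit and
    only a handful of scalars would need learning."""
    x = np.zeros(A.shape[1])
    for t in range(steps):
        grad = A.T @ (A @ x - y)   # gradient of the data-fit term
        x = x - step_sizes[t] * grad
    return x
```

Because the measurement model A is baked into every layer, such a network needs far fewer parameters and far less training data than a generic CNN mapping raw channel data to images, which is the robustness argument the paragraph makes.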
Diabetic retinopathy (DR) and diabetic macular edema (DME) are major causes of permanent blindness in aged people. In this manuscript, an Auto-Metric Graph Neural Network (AGNN) optimized with the Capuchin Search Optimization Algorithm is proposed for joint DR and DME grading (AGNN-CSO-DR-DME). The novelty of this work is to identify diabetic retinopathy and diabetic macular edema grades at an initial stage with higher accuracy, by decreasing the error rate and computation time. Initially, input images are taken from two public benchmark datasets: the ISBI 2018 imbalanced diabetic retinopathy grading dataset and the Messidor dataset. Then, the input fundus image is pre-processed with the APPDRC filtering method, which removes noise from the input images. The pre-processed images are given to a Gray Level Co-occurrence Matrix (GLCM) window-adaptive feature extraction method. The extracted DR and DME features are fed to the AGNN for classifying the grade of both DR and DME diseases. Generally, AGNN does not adopt any optimization method to compute the optimum parameters needed to assure correct grading of both diseases; thus, the Capuchin Search Optimization Algorithm (CSOA) is used to optimize the AGNN weight parameters. The proposed method is implemented in Python, and its efficiency is assessed under performance metrics such as F-measure, execution time, and accuracy. The proposed method attains 99.57%, 97.28%, and 96.34% higher accuracy on the ISBI 2018 IDRiD dataset compared with existing methods like CANet-DR-DME, HDLCNN-MGMO-DR-DME, and ANN-DR-DME, and 91.17%, 96.52%, and 97.36% higher accuracy on the Messidor dataset compared with existing methods like CANet-DR-DME, TCNN-DR-DME, and 2-D-FBSE-FAWT-DR-DME.
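As an illustration of the GLCM features this abstract mentions, here is a minimal numpy sketch of a co-occurrence matrix for a single pixel offset, plus one Haralick statistic (contrast). The paper's window-adaptive scheme is not reproduced; the function names, the single-offset restriction, and the default of 8 gray levels are assumptions for illustration:

```python
import numpy as np

def glcm(image, dx=1, dy=0, levels=8):
    """Gray-level co-occurrence matrix for one offset (dx, dy),
    normalized to a joint probability table. `image` must hold
    integer gray levels in [0, levels)."""
    m = np.zeros((levels, levels))
    h, w = image.shape
    for i in range(h - dy):
        for j in range(w - dx):
            m[image[i, j], image[i + dy, j + dx]] += 1
    return m / m.sum()

def glcm_contrast(p):
    """Haralick contrast: sum over (i, j) of (i - j)^2 * p(i, j).
    Zero for a uniform region, large for rapidly varying texture."""
    i, j = np.indices(p.shape)
    return ((i - j) ** 2 * p).sum()
```

In practice, several offsets and statistics (contrast, energy, homogeneity, correlation) computed per window are concatenated into the feature vector fed to the classifier.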
ISBN:
(print) 9781728198354
Automatic detection of lettuce growth traits is of great significance in modern greenhouse cultivation. Existing methods mainly focus on capturing coarse representations from RGB or RGB-D images with learnable convolutional neural networks. However, due to the significant appearance discrepancies across different growth stages, coarse representations and inefficient depth fusion strategies limit the performance of automatic detection of lettuce growth traits. To alleviate this problem, this paper proposes a novel detection method for lettuce growth traits based on a transformer and a convolutional neural network. In this method, we design a dual-transformer module and a residual module to effectively extract multi-scale representations and depth representations from appearance-varying lettuce images. In addition, a feature coupling bridge is proposed to fuse the multi-scale representations and the depth representations. The experimental results show that our method outperforms state-of-the-art methods.