In ground-penetrating radar (GPR) data, clutter and noise are commonly observed in B-scan images, which can seriously affect the interpretability of the GPR data. In this article, we propose wavelet-GAN, a deep-learning network that integrates a generative adversarial network (GAN) with the discrete wavelet transform (DWT). Wavelet-GAN decomposes a GPR image into multiple frequency sub-images and removes clutter. Additionally, we address error handling when features are absent from the dataset through micro datasets, dataset fine-tuning, high-speed training, and multi-feature generalization. Our method decomposes the GPR image via DWT; a convolutional neural network (CNN) and a GAN are then used to reconstruct the low-frequency and high-frequency target signal information, respectively. Finally, the information from the different frequency bands is combined into a new GPR image by the inverse DWT (IDWT). Wavelet-GAN is trained on a small-scale dataset, which enables rapid adjustment to new target types, even when only one typical target sample is available. We compare our method with traditional methods and other deep-learning-based methods and demonstrate that wavelet-GAN performs better on real data. Finally, we apply the method as a data-preprocessing tool for machine-learning inversion and test its feasibility.
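The decompose-then-recombine pipeline can be sketched with a one-level Haar transform (a stand-in for whichever wavelet the paper actually uses; the learned CNN/GAN denoisers that would process the subbands are omitted here):

```python
import numpy as np

def haar_dwt2(img):
    """One-level 2-D Haar DWT: split an image into LL (low-frequency
    approximation) and LH/HL/HH (high-frequency detail) subbands."""
    a = (img[0::2, :] + img[1::2, :]) / 2   # row-pair averages
    d = (img[0::2, :] - img[1::2, :]) / 2   # row-pair details
    LL = (a[:, 0::2] + a[:, 1::2]) / 2
    LH = (a[:, 0::2] - a[:, 1::2]) / 2
    HL = (d[:, 0::2] + d[:, 1::2]) / 2
    HH = (d[:, 0::2] - d[:, 1::2]) / 2
    return LL, LH, HL, HH

def haar_idwt2(LL, LH, HL, HH):
    """Inverse transform: recombine the four subbands into the image."""
    h, w = LL.shape
    a = np.empty((h, 2 * w)); d = np.empty((h, 2 * w))
    a[:, 0::2] = LL + LH; a[:, 1::2] = LL - LH
    d[:, 0::2] = HL + HH; d[:, 1::2] = HL - HH
    img = np.empty((2 * h, 2 * w))
    img[0::2, :] = a + d
    img[1::2, :] = a - d
    return img
```

In the paper's scheme, the LL subband would be passed through the CNN and the detail subbands through the GAN before `haar_idwt2` reassembles the cleaned B-scan; the round trip above is lossless.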
Over the last decade, the use of machine learning in smart agriculture has surged in popularity. Deep learning, particularly convolutional neural networks (CNNs), has been useful in identifying plant diseases at an early stage. Recently, Vision Transformers (ViTs) have proven effective in image classification tasks, often outperforming state-of-the-art CNN models. However, the adoption of vision transformers in agriculture is still in its infancy. In this paper, we evaluate the performance of vision transformers in identifying mango leaf diseases and compare them with popular CNNs. We propose an optimized model based on a pretrained Data-efficient image Transformer (DeiT) architecture that achieves 99.75% accuracy, better than many popular CNNs including SqueezeNet, ShuffleNet, EfficientNet, DenseNet121, and MobileNet. We also demonstrate that vision transformers can have a shorter training time than CNNs, as they require fewer epochs to achieve optimal results. Finally, we propose a mobile app that uses the model as a backend to identify mango leaf diseases in real time.
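The first step of any ViT/DeiT-style classifier is to turn a leaf image into a sequence of patch tokens; a minimal numpy sketch of that patch embedding (the projection weights and dimensions here are illustrative, not the paper's):

```python
import numpy as np

def patch_embed(img, patch=16, dim=64, seed=0):
    """Split an HxWxC image into non-overlapping patches and linearly
    project each flattened patch to a `dim`-dimensional token, as in the
    input stage of a ViT/DeiT. A random projection stands in for the
    learned weight matrix."""
    rng = np.random.default_rng(seed)
    H, W, C = img.shape
    ph, pw = H // patch, W // patch
    # (ph, patch, pw, patch, C) -> (ph, pw, patch, patch, C)
    patches = img.reshape(ph, patch, pw, patch, C).transpose(0, 2, 1, 3, 4)
    tokens = patches.reshape(ph * pw, patch * patch * C)
    W_proj = rng.normal(scale=0.02, size=(patch * patch * C, dim))
    return tokens @ W_proj   # (num_tokens, dim)
```

A 224x224 RGB image with 16-pixel patches yields 196 tokens, to which DeiT prepends class and distillation tokens before the transformer layers.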
The increasing number of vehicles on the road has made traffic regulations challenging to manage, particularly in large and crowded cities. Real-time traffic monitoring systems are among the most important factors enabling efficient traffic flow and enhanced mobility. Therefore, vehicles and drivers have always needed reliable and accurate real-time traffic information. Recently, various solutions have been proposed to address problems and concerns in traffic situations. One alternative is vehicular cloud computing (VCC). Additionally, an IoT-aided robotic (IoRT) model has been developed with a modern architecture that integrates IoT sensor nodes and cameras to gather real-time traffic data. The main contributions of this work are the implementation of two deep-learning techniques: a modified LeNet-5 for real-time traffic sign recognition and a transfer-learning-based Inception-V3 model for detecting and recognizing traffic lights. Furthermore, the optimal distance between the ultrasonic sensors and obstacles was determined from the ultrasonic waves' travel time and speed to reduce road accidents. The data collected by the sensors and cameras is processed using various image-processing algorithms and sent to the cloud, where it is available to drivers and commuters through a mobile application. Test results indicate that the proposed models deliver significant improvements in accuracy. The modified LeNet-5 achieved accuracy rates of 99.12% and 99.78% on the German Traffic Sign Recognition Benchmark (GTSRB) and the extended GTSRB (EGTSRB) datasets, respectively, whereas the second model, trained on the Laboratory for Intelligent and Safe Automobiles (LISA) dataset, attained a 98.6% accuracy rate. Compared to related traffic monitoring systems, the findings of this study outperform other works by 3.78% for traffic sign recognition and by 1.02% for traffic light detection and recognition.
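The ultrasonic ranging step rests on simple physics: the pulse travels to the obstacle and back, so the distance is half the round-trip time multiplied by the speed of sound. A minimal sketch (the constant assumes air at roughly 20 °C; the paper's exact calibration is not given):

```python
SPEED_OF_SOUND_M_S = 343.0  # m/s in air at ~20 °C (assumed)

def echo_distance_m(round_trip_time_s):
    """Distance to an obstacle from an ultrasonic echo's round-trip time.
    The pulse covers the distance twice (out and back), hence the /2."""
    return SPEED_OF_SOUND_M_S * round_trip_time_s / 2.0
```

For example, a 10 ms echo corresponds to about 1.7 m, the kind of threshold a collision-warning rule could use.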
Direction-of-arrival (DOA) estimation is a fundamental task in audio signal processing that becomes difficult in real-world environments due to the presence of reverberation. To address this difficulty, direct-path dominance (DPD) tests have been proposed as an effective approach for detecting time-frequency (TF) bins dominated by direct sound, which contain accurate DOA information. These have been found to be particularly efficient with spherical arrays. While methods based on neural networks (NNs) have been developed to estimate the DOA, they have limitations, such as the need for a large training database, and understanding of the system's operation is often lacking. This work proposes two novel DPD-test methods based on a model-based deep-learning approach that combines the original DPD-test model with a data-driven system. It is thus possible to preserve the robustness of the original DPD test across acoustic environments while using a data-driven approach to better extract useful information about the direct sound, thereby enhancing the original method's performance. In particular, the paper investigates how energetic, temporal, and spatial information contribute to the identification of TF bins dominated by the direct signal. The proposed methods are trained on simulated data of a single sound source in a room and evaluated on simulated and real data. The results show that energetic and temporal information provide new cues about the direct sound that have not been considered in previous works and can improve the method's performance.
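The model-based component builds on the classic DPD test, which flags a TF bin as direct-path dominated when its spatial correlation matrix is effectively rank one. A minimal sketch of that decision rule (the threshold value is illustrative; the paper's learned variants replace this fixed rule):

```python
import numpy as np

def dpd_test(snapshots, ratio_threshold=10.0):
    """Classic DPD test sketch: estimate the spatial correlation matrix
    from mic snapshots (shape: mics x time frames) and declare direct-path
    dominance when the ratio of the two largest eigenvalues exceeds a
    threshold, i.e. the matrix is close to rank one."""
    R = snapshots @ snapshots.conj().T / snapshots.shape[1]
    eig = np.sort(np.linalg.eigvalsh(R))[::-1]  # descending eigenvalues
    return bool(eig[0] / max(eig[1], 1e-12) > ratio_threshold)
```

A bin containing only one coherent wavefront produces a near-rank-one matrix and passes; diffuse reverberant energy spreads across eigenvalues and fails.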
Automatic segmentation of histopathology whole-slide images (WSI) usually involves supervised training of deep-learning models with pixel-level labels to classify each pixel of the WSI into tissue regions such as benign or cancerous. However, fully supervised segmentation requires large-scale data manually annotated by experts, which can be expensive and time-consuming to obtain. Non-fully-supervised methods, ranging from semi-supervised to unsupervised, have been proposed to address this issue and have been successful in WSI segmentation tasks. However, these methods have mainly focused on technical advancements in algorithmic performance rather than on the development of practical tools that pathologists or researchers could use in real-world scenarios. In contrast, we present DEPICTER (deep rEPresentatIon ClusTERing), an interactive segmentation tool for histopathology annotation that produces a patch-wise dense segmentation map at the WSI level. The interactive nature of DEPICTER leverages self- and semi-supervised learning approaches to let the user participate in the segmentation, producing reliable results while reducing the workload. DEPICTER consists of three steps: first, a pretrained model is used to compute embeddings from image patches. Next, the user selects a number of benign and cancerous patches from the multi-resolution image. Finally, guided by the deep representations, label propagation is achieved using our novel seeded iterative clustering method or by directly interacting with the embedding space via feature-space gating. We report real-time interaction results with three pathologists and evaluate the performance on three public cancer classification dataset benchmarks through simulations. The code and demos of DEPICTER are publicly available at https://***/eduardchelebian/depicter.
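The three steps above can be sketched in miniature. This is a hypothetical reading of "seeded iterative clustering", not DEPICTER's actual implementation: centroids are initialised from the user-selected seed patches, then refined k-means-style while the seeds stay pinned to their user-given labels:

```python
import numpy as np

def seeded_cluster(embeddings, seed_idx, seed_labels, iters=10):
    """Hypothetical seeded iterative clustering sketch. `embeddings` are
    patch embeddings (n x d) from a pretrained model; `seed_idx` /
    `seed_labels` are the user's selected patches and their labels.
    Assumes every class keeps at least one member each iteration."""
    labels = np.unique(seed_labels)
    # Initialise one centroid per class from its seed patches.
    cent = np.stack([embeddings[seed_idx][seed_labels == c].mean(0)
                     for c in labels])
    for _ in range(iters):
        # Assign every patch to its nearest centroid.
        d = ((embeddings[:, None, :] - cent[None]) ** 2).sum(-1)
        assign = labels[d.argmin(1)]
        assign[seed_idx] = seed_labels        # seeds never change label
        # Recompute centroids from the current assignment.
        cent = np.stack([embeddings[assign == c].mean(0) for c in labels])
    return assign
```

With well-separated benign and cancerous clusters in the embedding space, a single seed per class is enough for the labels to propagate over the whole slide.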
This paper presents a deep-learning model specifically designed to effectively classify display Mura images. The model leverages advanced deep-learning techniques and computer vision methods to identify and categorize...
Image stitching is the synthesis of multiple partial image segments into a complete and continuous panoramic image through effective image alignment and seamless fusion techniques. It achieves a wider field of view and richer information for display and analysis. Most deep-learning-based image stitching methods have significant advantages in improving accuracy, but they are not suitable for real-time applications because of multiple iterations of computation or greater network depth. To deal with this problem, a fast unsupervised image stitching model is proposed in this article. In the proposed model, an adaptive feature extraction module (FEM) for deformation is designed, and a fast unsupervised learning-based image alignment network is then proposed. In addition, a stitching restoration network with fewer parameters is presented to remove the redundant and unnecessary sampling and convolution operations found in general deep-learning-based models. Finally, experiments are conducted on both synthetic and real-scene datasets. The total stitching accuracy of the proposed model is higher, and the details of the output images are clearer. The proposed model achieves 1.79 in root-mean-square error (RMSE), 26.54 in peak signal-to-noise ratio (PSNR), and 0.86 in structural similarity (SSIM) on the alignment results, all better than the state-of-the-art methods. Furthermore, the comparison results show that the proposed model effectively reduces memory consumption and achieves fast unsupervised image stitching with a very small model size.
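The alignment step that such networks learn to predict is classically a planar homography. As a point of reference, a minimal Direct Linear Transform (DLT) sketch that recovers the 3x3 homography from point correspondences (the learned network would replace the correspondence-finding, not this geometry):

```python
import numpy as np

def homography_dlt(src, dst):
    """Estimate the homography H mapping src points to dst points
    (each an iterable of (x, y) pairs, at least 4 non-degenerate ones)
    via the Direct Linear Transform: stack two linear constraints per
    correspondence and take the SVD null vector."""
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A, dtype=float))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]   # fix the arbitrary scale
```

Once H is known, one image is warped into the other's frame and the fusion/restoration stage blends the overlap.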
ISBN (print): 9781450395687
Abstract: To improve the intelligence level of water resources management, a real-time water level recognition method based on deep-learning algorithms and image-processing techniques is proposed in this paper. The recognition process consists of four steps. First, for digit detection, a YOLO-v3 model is deployed to extract numbers from the water gauges. Then, the cropped number images are fed into an LSTM + CTC model as training samples so that the digits can be recognized. In the third step, the Hough transform is adopted to correct the tilt of the water gauge based on its vertical edge feature. Finally, morphological operations, combined with horizontal projection, locate the upper and lower edges of the water gauge so that the scale lines are recognized correctly, and the water level is determined accordingly. Model application shows that the recognition model has satisfying accuracy and efficiency, with potential for practical deployment.
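The final conversion from detected image features to a water level amounts to linear interpolation between recognised scale lines. A sketch under assumed conventions (readings in cm increase upward, pixel rows increase downward; all names are illustrative, not from the paper):

```python
def waterline_reading_cm(upper_px, upper_cm, lower_px, lower_cm, water_px):
    """Map the waterline's pixel row to a gauge reading by interpolating
    between two recognised scale lines, each given as (pixel row, reading
    in cm). Assumes the gauge is tilt-corrected so rows map linearly to
    readings."""
    cm_per_px = (lower_cm - upper_cm) / (lower_px - upper_px)
    return upper_cm + (water_px - upper_px) * cm_per_px
```

For example, with the 50 cm line at row 100 and the 30 cm line at row 300, a waterline detected at row 200 reads 40 cm.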
Random noise attenuation is significant in seismic data processing. Deep-learning-based denoising methods have been widely developed and applied in recent years. In practice, it is often time-consuming and laborious to obtain noise-free data for supervised training. Therefore, we propose a novel deep-learning framework to denoise prestack seismic data without clean labels, which trains a high-resolution residual neural network (SRResNet) with noisy data as input and the same valid data with different noise as targets. Since the valid signals in noisy sample pairs are spatially correlated while random noise is spatially independent and unpredictable, the model can learn the features of the valid data while suppressing random noise. The data targets are generated by a simple conventional method without fine-tuning parameters; these initial estimates may allow signal or noise leakage, as the network does not require clean labels. A Monte Carlo strategy is applied to select training patches, increasing the number of valid patches and expanding the training set. Transfer learning is used to improve generalization to real data. Both synthetic and real data tests perform better than commonly used state-of-the-art denoising methods.
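The label-free training idea can be sketched as a pair generator: the same valid signal carries two independent noise realisations as (input, target), so a network can only fit the shared signal, not the noise. This toy version adds synthetic Gaussian noise, whereas the paper derives both members of the pair from field data via a conventional initial estimate:

```python
import numpy as np

def make_noise2noise_pair(valid_patch, sigma, rng):
    """Build one (input, target) training pair from a valid-signal patch
    by adding two independent noise realisations. Spatially independent
    noise differs between the two, the signal is common to both."""
    x = valid_patch + rng.normal(0.0, sigma, valid_patch.shape)
    y = valid_patch + rng.normal(0.0, sigma, valid_patch.shape)
    return x, y
```

Averaging the regression loss over many such pairs drives the network toward the common component, i.e. the valid signal.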
This paper introduces an innovative framework for wind power prediction that focuses on the future of energy forecasting utilizing intelligent deep learning and strategic feature engineering. This research investigates the application of a state-of-the-art deep-learning model for wind energy prediction to make extremely short-term forecasts using real-time data on wind generation from New South Wales, Australia. In contrast with typical approaches to wind energy forecasting, this model relies entirely on historical data and strategic feature engineering to make predictions, rather than on meteorological parameters. A significant contribution of this work is a hybrid feature engineering strategy that integrates features from several feature generation techniques to obtain the optimal input parameters. The model's performance is assessed using key metrics, yielding optimal results with a mean absolute error (MAE) of 8.76, mean squared error (MSE) of 139.49, root mean squared error (RMSE) of 11.81, R-squared score of 0.997, and mean absolute percentage error (MAPE) of 4.85%. Additionally, the proposed framework outperforms six other deep-learning and hybrid deep-learning models in wind energy prediction accuracy. These findings highlight the importance of advanced data analysis for feature generation in data processing, pointing to its key role in boosting the precision of forecasting applications.
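History-only feature engineering of this kind typically means deriving model inputs purely from the past of the generation series itself. A minimal sketch with lagged values and a rolling mean (illustrative choices; the paper's hybrid strategy combines several such generation techniques):

```python
import numpy as np

def make_lag_features(series, lags=(1, 2, 3), window=4):
    """Build a supervised (X, y) dataset from a generation time series
    using only its own history: lagged values plus a rolling mean over
    the preceding `window` steps, with the next value as the target."""
    series = np.asarray(series, dtype=float)
    start = max(max(lags), window)       # first index with full history
    rows = []
    for t in range(start, len(series)):
        feats = [series[t - lag] for lag in lags]
        feats.append(series[t - window:t].mean())  # rolling mean
        rows.append(feats)
    return np.array(rows), series[start:]
```

Each row of X then feeds the forecasting model, with no meteorological inputs involved.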