Automatic detection of pavement cracks is an important task in road maintenance and an important part of intelligent transportation systems. However, it is challenging because cracks have poor continuity and varying widths, and because the contrast between cracks and the surrounding pavement is low. This study proposes a novel pavement crack detection method based on an end-to-end trainable deep convolutional neural network. The authors build the network on an encoder-decoder architecture and adopt a pyramid module to exploit global context information for the complex topological structures of cracks. Moreover, they introduce a combined spatial-channel attention module into the encoder-decoder network to refine crack features. Further, dilated convolution is used to reduce the loss of crack details caused by the pooling operations in the encoder network. In addition, they introduce a Lovász hinge loss function, which is suitable for small objects. They train their network on the CRACK500 dataset and evaluate it on three pavement crack datasets. Among the methods compared, their method achieves the best experimental results.
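The role of dilated convolution in the abstract above can be illustrated with a minimal sketch. It is not the authors' network: the 1-D signal, the kernel weights, and the function names `dilated_conv1d` and `receptive_field` are all assumptions chosen for illustration. The point it shows is that spacing the kernel taps apart widens the receptive field without any pooling step.

```python
def receptive_field(kernel_size, dilation):
    """Number of input samples spanned by one output of a dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """Valid-mode 1-D correlation with `dilation - 1` skipped samples between taps."""
    span = receptive_field(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for k, weight in enumerate(kernel):
            acc += weight * signal[start + k * dilation]
        out.append(acc)
    return out


# Hypothetical thin "crack" responses at positions 2 and 6 of a scan line.
signal = [0, 0, 1, 0, 0, 0, 1, 0, 0]
kernel = [1, 1, 1]

dense = dilated_conv1d(signal, kernel, dilation=1)  # each output sees 3 samples
wide = dilated_conv1d(signal, kernel, dilation=2)   # each output sees 5 samples
print(dense)
print(wide)
```

With the same three weights, the dilated kernel relates the two responses that lie four samples apart, which the dense kernel cannot do in a single layer.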
Seismic facies analysis studies the sedimentary environment of a stratigraphic sequence and provides an important basis for reservoir prediction. Most existing analysis methods are inefficient and rely heavily on manual experience, which makes it difficult to interpret increasingly complex seismic data. Deep learning techniques can help solve these problems and achieve automatic seismic facies classification. We regard seismic facies classification as a target segmentation problem and propose a new method and training strategies. Our workflow primarily involves four steps. First, we process the manually annotated labels and seismic data with mirroring and cropping operations so that the network can accept input of arbitrary size and model training is not limited by GPU memory. Second, data augmentation is applied to automatically generate a large number of training samples from the processed data. Third, we build two independent networks based on an encoder-decoder architecture: one identifies all seismic facies simultaneously, and the other identifies a single seismic facies per model. However, the results of both networks have some drawbacks. Fourth, to overcome these drawbacks, we propose an ensemble learning method to obtain an optimized model and test it on 3-D seismic data. The test results show that the proposed method improves the predictive ability of the model, accurately describes the seismic facies, and is applicable to the entire seismic data volume.
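The mirroring and cropping step in the first stage of that workflow can be sketched as follows. The abstract does not give the pad width, tile size, or stride, so the values below and the helper names `mirror_pad` and `crop_tiles` are assumptions. The sketch works on a single 1-D trace, and a 2-D section would presumably apply the same idea along each axis.

```python
def mirror_pad(row, pad):
    """Reflect `pad` samples at each end without repeating the edge sample."""
    row = list(row)
    if pad == 0:
        return row
    if pad >= len(row):
        raise ValueError("pad must be smaller than the row length")
    left = row[1:pad + 1][::-1]
    right = row[-pad - 1:-1][::-1]
    return left + row + right


def crop_tiles(row, tile, stride):
    """Cut fixed-size windows; the last window is shifted back to stay in bounds."""
    if len(row) < tile:
        raise ValueError("row is shorter than one tile")
    last_start = len(row) - tile
    starts = list(range(0, last_start + 1, stride))
    if starts[-1] != last_start:
        starts.append(last_start)
    return [(start, row[start:start + tile]) for start in starts]


trace = [1, 2, 3, 4, 5]                       # hypothetical seismic samples
padded = mirror_pad(trace, pad=2)             # borders filled by reflection
tiles = crop_tiles(padded, tile=4, stride=3)  # fixed-size network inputs
for start, window in tiles:
    print(start, window)
```

Because every tile has the same length, the memory needed per training step stays fixed regardless of how large the original section is.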
In recent years, the use of wireless sensor networks has become increasingly widespread. Because wireless networks are unstable, packet loss occasionally occurs. To reduce the impact of packet loss on data integrity, we take advantage of the deep neural network's excellent ability to understand natural data and propose a data repair method based on a deep convolutional neural network with an encoder-decoder architecture. Compared with common interpolation algorithms and compressed sensing algorithms, this method obtains better repair results, is suitable for a wider range of applications, and does not need prior knowledge. The method relies on measures such as careful preparation of the training set and the design and optimization of loss functions to achieve faster convergence, higher repair accuracy, and better stability. To compare repair performance fairly across different signals, the mean squared error, relative peak-to-peak average error, and relative peak-to-peak max error are adopted to evaluate the repair results quantitatively. Comparative experiments show that this method has better data recovery performance than traditional interpolation and compressed sensing algorithms.
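The three evaluation metrics named above can be sketched in a few lines. The abstract does not define them, so the formulas here are an assumption: the two relative errors are taken as the mean and the maximum absolute error divided by the peak-to-peak range of the original signal. The function name `repair_errors` and the sample values are hypothetical.

```python
def repair_errors(original, repaired):
    """Return MSE plus mean and max absolute error relative to the signal range."""
    if not original or len(original) != len(repaired):
        raise ValueError("signals must be non-empty and of equal length")
    diffs = [abs(o - r) for o, r in zip(original, repaired)]
    peak_to_peak = max(original) - min(original)
    if peak_to_peak == 0:
        raise ValueError("a constant signal has no peak-to-peak range")
    return {
        "mse": sum(d * d for d in diffs) / len(diffs),
        # Assumed definition: mean absolute error over the original range.
        "rel_pp_avg": sum(diffs) / len(diffs) / peak_to_peak,
        # Assumed definition: worst absolute error over the original range.
        "rel_pp_max": max(diffs) / peak_to_peak,
    }


original = [0.0, 2.0, 4.0, 2.0]   # hypothetical sensor readings
repaired = [0.0, 1.0, 4.0, 3.0]   # hypothetical output of a repair method
scores = repair_errors(original, repaired)
print(scores)
```

Dividing by the peak-to-peak range makes the error dimensionless, which is one plausible way to compare signals of different amplitude on the same scale.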
Removing hair from digital dermoscopy images is occasionally a necessary step before further analysis is applied to the images. This work considers two machine learning approaches that segment the hair pixels in dermoscopy images. Subsequently, morphological post-processing is applied to refine the segmented hair, and an image inpainting algorithm replaces the hair pixels with values based on the surrounding image structures. The first hair segmentation approach combines pixel-wise features extracted using the well-known Gaussian image pyramid with a traditional shallow multilayer perceptron (MLP-ANN) to detect hair pixels in images. The second approach uses a deep convolutional encoder-decoder (ED) network to segment hair. Both hair segmentation methods (MLP-ANN and ED) are trained with a set of 32 dermoscopy images with manually annotated hair, whereas the MLP-ANN dataset is constructed in a pixel-wise manner. Both proposed methods underwent three different assessments. First, a set of 50 images with a priori known hair is used to evaluate hair segmentation. Second, a set of 13 different dermoscopy images with hair added using a suitably trained Generative Adversarial Network (GAN) is used to assess the quality of the hair removal that generates the hair-free image, in terms of several error metrics with respect to the original hair-free image. Finally, both proposed hair segmentation methods (MLP-ANN and ED) are applied to a set of 200 hair and hair-free images, which is used to train an image classifier to distinguish melanoma from nevi lesions, and the improvement in image classification accuracy is measured. Comparative results against several other state-of-the-art hair removal techniques are also presented. Results show that both proposed hair removal techniques outperform the best of the state-of-the-art methods under comparison in terms of several error metrics. Considering the effect of hair re
A multi-stage attack is a sophisticated intrusion strategy that has been widely used to penetrate well-protected network infrastructures. To detect such attacks, state-of-the-art research advocates the use of hidden Markov models (HMMs). However, although HMMs can model the relationships and dependencies among different alerts and stages for detection, they cannot handle well the stage dependencies buried in a longer sequence of alerts. In this paper, we tackle the challenge of long-term dependency between stages and propose a new detection solution using a sequence-to-sequence (seq2seq) model. The basic idea is to encode a sequence of alerts (i.e., the detector's observations) into a latent feature vector using a long short-term memory (LSTM) network and then decode this vector into a sequence of predicted attack stages with another LSTM. Through the encoder-decoder collaboration, we can decouple the local constraint between the observed alerts and the potential attack stages, and are thus able to use full knowledge of all the alerts to detect stages on a sequence basis. With the LSTM, we can learn to "forget" irrelevant alerts and thereby have more opportunities to "remember" the long-term dependency between different stages for our sequence detection. To evaluate our model's effectiveness, we conducted extensive experiments using four public datasets, all of which include simulated or reconstructed samples of real-world multi-stage attacks in controlled testbeds. Our results confirm the better detection performance of our model compared with previous HMM solutions. (c) 2021 Elsevier Ltd. All rights reserved.
ISBN: 9781728160344 (print)
Neural response generation aims to generate a human-like response to a human utterance using deep learning. Previous studies have shown that expressing emotion in response generation improves user performance, user engagement, and user satisfaction, and allows conversational agents to communicate with users at a human level. However, the previous emotional response generation model cannot capture the subtle aspects of emotion because it represents the desired emotion of the response as a token. Moreover, that model has difficulty generating natural responses that relate to the input utterance at the content level, since the information in the input utterance can be biased toward the emotion token. To overcome these limitations, we propose an emotional response generation model that generates emotional and natural responses by using emotion feature extraction. Our model consists of two parts: an extraction part and a generation part. The extraction part extracts the emotion of the input utterance as a vector using a pre-trained LSTM-based classification model. The generation part generates an emotional and natural response to the input utterance by reflecting the emotion vector from the extraction part and the thought vector from the encoder. We evaluate our model on the emotion-labeled dialogue dataset DailyDialog, using quantitative and qualitative analysis: emotion classification, response generation modeling, and a comparative study. Overall, the experiments show that the proposed model can generate emotional and natural responses.
ISBN: 9781728163956 (print)
Encoder-decoder based methods for semi-supervised video object segmentation (Semi-VOS) have received extensive attention due to their superior performance. However, most of them have complex intermediate networks that generate strong specifiers to be robust against challenging scenarios, which is quite inefficient when dealing with relatively simple scenarios. To solve this problem, we propose a real-time network, Clue Refining Network for Video Object Segmentation (CRVOS), which has no intermediate network and can therefore handle these scenarios efficiently. In this work, we propose a simple specifier, referred to as the Clue, which consists of the previous frame's coarse mask and coordinate information. We also propose a novel refine module that outperforms general ones by using a deconvolution layer instead of a bilinear upsampling layer. Our proposed method is the fastest among the existing methods while maintaining competitive accuracy. On the DAVIS 2016 validation set, our method achieves 63.5 fps and a J&F score of 81.6%.
Although autonomous driving has become applicable in industry, the key techniques still need to be refined before they can be widely applied to autonomous vehicles. For instance, fast and accurate segmentation of road markings, which supports subsequent pedestrian path prediction and the creation of high-definition (HD) maps, would make autonomous driving more practical. Current road marking segmentation mainly relies on semantic segmentation techniques from computer vision with an encoder-decoder architecture. However, as demonstrated in this paper, the upsampling layer of convolutional neural networks with an encoder-decoder architecture plays a significant role in the efficiency and accuracy of road marking segmentation. The bilinear upsampling layer is fast because of its intrinsically simple interpolation but is less accurate; in contrast, the upsampling layer with offsets is relatively accurate but has a higher computational cost. Therefore, in terms of wide application, efficiency, and accuracy, the upsampling layer of the decoder in convolutional neural networks deserves more attention in future research on autonomous driving. Copyright (C) 2020 The Authors.
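The "intrinsic simple interpolation" of a bilinear upsampling layer can be made concrete with a minimal sketch. It assumes an endpoint-preserving grid (comparable to an align-corners convention) and an integer scale factor. The function names `upsample_linear` and `upsample_bilinear` are hypothetical, and the paper's actual layer may differ.

```python
def upsample_linear(values, factor):
    """Linearly interpolate a 1-D list, keeping the first and last samples."""
    n = len(values)
    if n < 2 or factor < 1:
        raise ValueError("need at least two samples and a factor >= 1")
    out = []
    for i in range((n - 1) * factor + 1):
        pos = i / factor
        lo = min(int(pos), n - 2)   # left neighbour, clamped at the border
        frac = pos - lo
        out.append(values[lo] * (1 - frac) + values[lo + 1] * frac)
    return out


def upsample_bilinear(grid, factor):
    """Bilinear upsampling as two separable linear passes (rows, then columns)."""
    rows = [upsample_linear(row, factor) for row in grid]
    cols = [upsample_linear(list(col), factor) for col in zip(*rows)]
    return [list(row) for row in zip(*cols)]


coarse = [[0.0, 2.0],
          [4.0, 6.0]]              # hypothetical 2x2 decoder feature map
fine = upsample_bilinear(coarse, factor=2)
for row in fine:
    print(row)
```

The operation has no learned parameters and only a few multiplications per output value, which is consistent with the speed advantage described above. The offset-based layer, by contrast, has to predict where to sample, which is where its extra computational cost comes from.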
Concrete deck delamination often shows strong variations in size, shape, and temperature distribution under the influence of outdoor weather conditions. These variations make purely analytical solutions for infrared image segmentation of delaminated areas difficult. The recently developed supervised deep learning approach has shown potential for automatic segmentation of RGB images. However, its effectiveness in segmenting thermal images remains under-explored. The main challenge lies in the development of specific models and the generation of a large range of labeled infrared images for training. To address this challenge, a customized deep learning model based on an encoder-decoder architecture is proposed to segment the delaminated areas in thermal images at the pixel level. Data augmentation strategies were used in creating the training data set to improve the performance of the proposed model. The trained model was deployed in a real-world project to further evaluate its applicability and robustness. The results of these experimental studies supported the effectiveness of the deep learning model in segmenting concrete delamination areas from infrared images. They also suggested that data augmentation is a helpful technique for addressing the small size of the training set. The field test with validation further demonstrated the generalizability of the proposed framework. Limitations of the proposed approach are briefly discussed at the end of the paper.
ISBN: 9781510638297 (print)
Blood vessel segmentation is an important step in the automated diagnosis of ophthalmic disease from retinal fundus images. The UNet is a popular encoder-decoder architecture widely used in biomedical pixel-wise segmentation problems. In this paper, we analyze how the UNet can be used in a more computationally efficient way. Pre-trained weights are used to initialize the network, and three different deep architectures (VGG16, ResNet34, DenseNet121) are compared on the blood vessel segmentation task in terms of both computational cost and performance. The ResNet34 architecture achieved the highest sensitivity, 0.849, with an accuracy of 0.961 and a specificity of 0.9843, using as few as 510178 parameters, compared with the standard UNet, which has 34525168 parameters and a sensitivity of 0.756.
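The sensitivity, specificity, and accuracy figures quoted above are standard pixel-wise measures, and a minimal sketch of how they follow from confusion-matrix counts is given here. The function name `segmentation_scores` and the ten labelled pixels are hypothetical and do not come from the paper.

```python
def segmentation_scores(truth, pred):
    """Pixel-wise sensitivity, specificity and accuracy for binary masks (1 = vessel)."""
    if not truth or len(truth) != len(pred):
        raise ValueError("masks must be non-empty and of equal length")
    tp = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(truth, pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(truth, pred) if t == 1 and p == 0)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("both classes must be present in the ground truth")
    return {
        "sensitivity": tp / (tp + fn),   # share of vessel pixels that were found
        "specificity": tn / (tn + fp),   # share of background pixels kept clean
        "accuracy": (tp + tn) / len(truth),
    }


truth = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]   # hypothetical flattened ground-truth mask
pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]    # hypothetical flattened prediction
scores = segmentation_scores(truth, pred)
print(scores)
```

Because vessel pixels are a small minority in a fundus image, accuracy and specificity can stay high even when many vessels are missed, so sensitivity is the figure that separates the models most clearly in the comparison above.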