Although autonomous driving have become applicable to the industry, the prevalent application of key techniques to the autonomous vehicles still needs to be refined. For instance, how to fast and accurately segment ro...
详细信息
Although autonomous driving have become applicable to the industry, the prevalent application of key techniques to the autonomous vehicles still needs to be refined. For instance, how to fast and accurately segment road markings in order to assist the next pedestrian path prediction and the creation of high-definition (HD) map respectively is useful for autonomous driving to be more practical. Current road marking segmentation mainly rely on the techniques of semantic segmentation of computer vision with encoder-decoder architecture. However, as demonstrated in this paper, the upsampling layer of convolutional neural networks with encoder-decoder architecture plays a significant role in the efficiency and accuracy of the road marking segmentation. The bilinear upsampling layer is fast due to its intrinsic simple interpolation but with less accuracy; on the contrary, the upsampling layer with offsets is relatively accurate but with more computational cost. Therefore, at least, in terms of prevalent application, efficiency, and accuracy, the upsampling layer of decoder of convolution neural networks should be paid more attention to for the next research work of autonomous driving.
Timely and accurate change detection on satellite images by using computer vision techniques has been attracting lots of research efforts in recent years. Existing approaches based on deep learning frameworks have ach...
详细信息
Timely and accurate change detection on satellite images by using computer vision techniques has been attracting lots of research efforts in recent years. Existing approaches based on deep learning frameworks have achieved good performance for the task of change detection on satellite images. However, under the scenario of disjoint changed areas in various shapes on land surface, existing methods still have shortcomings in detecting all changed areas correctly and representing the changed areas boundary. To deal with these problems, we design a coarse-to-fine detection framework via a boundary-aware attentive network with a hybrid loss to detect the change in high resolution satellite images. Specifically, we first perform an attention guided encoder-decoder subnet to obtain the coarse change map of the bi-temporal image pairs, and then apply residual learning to obtain the refined change map. We also propose a hybrid loss to provide the supervision from pixel, patch, and map levels. Comprehensive experiments are conducted on two benchmark datasets: LEBEDEV and SZTAKI to verify the effectiveness of the proposed method and the experimental results show that our model achieves state-of-the-art performance.
Recent approaches for real-time semantic segmentation usually employ the encoder-decoder architecture as the backbone to generate a high-quality segmentation prediction. There has been a lot of research on designing e...
详细信息
ISBN:
(纸本)9781728137988
Recent approaches for real-time semantic segmentation usually employ the encoder-decoder architecture as the backbone to generate a high-quality segmentation prediction. There has been a lot of research on designing efficient encoding methods. However, enhancing the performance of components in decoder is also crucial for pixel-level recognition. In this paper, we propose a self-learned feature reconstruction (SFR) method and an offset-dilated feature fusion (ODFF) module to improve the prediction reconstruction capability of the decoder. Concretely, SFR can effectively reconstruct the high-resolution feature maps by recombining feature space, in which the space transformation matrix implicitly contained in a convolution layer can selectively highlight features at each position by leveraging the knowledge of label space in a self-learned way. Moreover, ODFF module can effectively fuse multilevel features with multiscale contextual information by feeding the feature maps into designed parallel offset-dilated convolutions, which enhances the feature representation capability of the decoder. Experiments on Cityscapes and CamVid datasets demonstrate the superior performance of our proposed methods embedded in ESPNet.
Nuclear segmentation is an important step in quantitative profiling of colony organization in 3D cell culture models. However, complexities arise from technical variations and biological heterogeneities. We proposed a...
详细信息
ISBN:
(数字)9781538661000
ISBN:
(纸本)9781538661000
Nuclear segmentation is an important step in quantitative profiling of colony organization in 3D cell culture models. However, complexities arise from technical variations and biological heterogeneities. We proposed a new 3D segmentation model based on convolutional neural networks for 3D nuclear segmentation, which overcomes the complexities associated with non-uniform staining, aberrations in cellular morphologies, and cells being in different states. The uniqueness of the method originates from (i) volumetric operations to capture all the three-dimensional features, and (ii) the encoder-decoder architecture, which enables segmentation of the spheroid models in one forward pass. The method is validated with four human mammary epithelial cell (HMEC) lines-each with unique genetic makeup. The performance of the proposed method is compared with the previous methods and is shown that the deep learning model has a superior pixel-based segmentation, and an F1-score of 0.95 is reported.
Change detection (CD) is essential to the accurate understanding of land surface changes using available Earth observation data. Due to the great advantages in deep feature representation and nonlinear problem modelin...
详细信息
Change detection (CD) is essential to the accurate understanding of land surface changes using available Earth observation data. Due to the great advantages in deep feature representation and nonlinear problem modeling, deep learning is becoming increasingly popular to solve CD tasks in remote-sensing community. However, most existing deep learning-based CD methods are implemented by either generating difference images using deep features or learning change relations between pixel patches, which leads to error accumulation problems since many intermediate processing steps are needed to obtain final change maps. To address the above-mentioned issues, a novel end-to-end CD method is proposed based on an effective encoder-decoder architecture for semantic segmentation named UNet++, where change maps could be learned from scratch using available annotated datasets. Firstly, co-registered image pairs are concatenated as an input for the improved UNet++ network, where both global and fine-grained information can be utilized to generate feature maps with high spatial accuracy. Then, the fusion strategy of multiple side outputs is adopted to combine change maps from different semantic levels, thereby generating a final change map with high accuracy. The effectiveness and reliability of our proposed CD method are verified on very-high-resolution (VHR) satellite image datasets. Extensive experimental results have shown that our proposed approach outperforms the other state-of-the-art CD methods.
This work models the reliability of software systems using recurrent neural networks with long short-term memory (LSTM) units and truncated backpropagation algorithm, and encoder-decoder LSTM architecture and proposes...
详细信息
ISBN:
(纸本)9781450355735
This work models the reliability of software systems using recurrent neural networks with long short-term memory (LSTM) units and truncated backpropagation algorithm, and encoder-decoder LSTM architecture and proposes LSTM with software reliability functions as activation functions and LSTM with input features as the output of software reliability functions. An initial evaluation on data coming from 4 industrial projects is also provided.
We introduce a novel approach that is used to convert images into the corresponding language descriptions. This method follows the most popular encoder-decoder architecture. The encoder uses the recently proposed dens...
详细信息
ISBN:
(纸本)9781538660676
We introduce a novel approach that is used to convert images into the corresponding language descriptions. This method follows the most popular encoder-decoder architecture. The encoder uses the recently proposed densely convolutional neural network (DenseNet) to extract the feature maps. Meanwhile, the decoder uses the long short time memory (LSTM) to parse the feature maps to descriptions. We predict the next word of descriptions by taking the effective combination of feature maps with word embedding of current input word by "visual attention switch". Finally, we compare the performance of the proposed model with other baseline models and achieve good results.
Transliteration is the process of converting words from a given source language alphabet to a target language alphabet, in a way that best preserves the phonetic and orthographic aspects of the transliterated words. E...
详细信息
Transliteration is the process of converting words from a given source language alphabet to a target language alphabet, in a way that best preserves the phonetic and orthographic aspects of the transliterated words. Even though an important effort has been made towards improving this process for many languages such as English, French and Chinese, little research work has been accomplished with regard to the Arabic language. In this work, an attention-based encoder-decoder system is proposed for the task of Machine Transliteration between the Arabic and English languages. Our experiments proved the efficiency of our proposal approach in comparison to some previous research developed in this area. (C) 2017 The Authors. Published by Elsevier B.V.
This paper presents an adaptive encoding framework for the reduction of transition activity in high-capacitance off-chip data buses, since power dissipation associated with those buses can be significant for high-spee...
详细信息
This paper presents an adaptive encoding framework for the reduction of transition activity in high-capacitance off-chip data buses, since power dissipation associated with those buses can be significant for high-speed communication. The technique relies on the observation of data characteristics over fixed window sizes and formation of cluster with bit lines having highly correlated switching patterns. The proposed method utilizes redundancy in space and time to prevent loss of information while retrieving data. We present analytical and experimental analyses, which demonstrate the activity reduction of our encoding scheme for various data. The extra power cost due to the encoder and decoder circuitry along with redundancy is offset due to reduced number of off-chip transitions.
In bioacoustics, automatic animal voice detection and recognition from audio recordings is an emerging topic for animal preservation. Our research focuses on bird bioacoustics, where the goal is to segment bird syllab...
详细信息
ISBN:
(纸本)9781509041176
In bioacoustics, automatic animal voice detection and recognition from audio recordings is an emerging topic for animal preservation. Our research focuses on bird bioacoustics, where the goal is to segment bird syllables from the recording and predict the bird species for the syllables. Traditional methods for this task addresses the segmentation and species prediction separately, leading to propagated errors. This work presents a new approach that performs simultaneous segmentation and classification of bird species using a Convolutional Neural Network (CNN) with encoder-decoder architecture. Experimental results on bird recordings show significant improvement compared to recent state-of-the-art methods for both segmentation and species classification.
暂无评论