Search Results - Inner Mongolia University Library

Learning and Adapting Robust Features for Satellite Image Segmentation on Heterogeneous Data Sets

IEEE TRANSACTIONS ON GEOSCIENCE AND REMOTE SENSING, 2019, vol. 57, no. 9, pp. 6517-6529

Authors: Ghassemi, Sina; Fiandrotti, Attilio; Francini, Gianluca; Magli, Enrico. Polytech Univ Turin Elect & Telecommun Dept I-10129 Turin Italy; Univ Paris Saclay Telecom Paristech Image Data Signals Dept Multimedia Grp St Aubin France; Politecn Torino Telecom Italia I-10148 Turin Italy

This paper addresses the problem of training a deep neural network for satellite image segmentation so that it can be deployed over images whose statistics differ from those used for training. For example, in postdisaster damage assessment, the tight time constraints make it impractical to train a network from scratch for each image to be segmented. We propose a convolutional encoder-decoder network able to learn visual representations of increasing semantic level as its depth increases, allowing it to generalize over a wider range of satellite images. Then, we propose two additional methods to improve the network performance over each specific image to be segmented. First, we observe that updating the batch normalization layers' statistics over the target image improves the network performance without human intervention. Second, we show that refining a trained network over a few samples of the image boosts the network performance with minimal human intervention. We evaluate our architecture over three data sets of satellite images, showing state-of-the-art performance in binary segmentation of previously unseen images and competitive performance with respect to more complex techniques in a multiclass segmentation task.

Keywords: Convolutional neural network (CNN); deep learning; domain adaptation; encoder-decoder architecture; satellite image segmentation
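The first adaptation step in the abstract above, updating the batch normalization statistics over the target image, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `batchnorm` and `adapt_statistics`, the single-channel setting, and the activation values are all assumptions made for illustration. It only shows why re-estimating the mean and variance on the target image needs no labels.

```python
from statistics import fmean, pvariance


def batchnorm(values, mean, var, gamma=1.0, beta=0.0, eps=1e-5):
    """Normalize one channel's activations with the given statistics."""
    return [gamma * (v - mean) / (var + eps) ** 0.5 + beta for v in values]


def adapt_statistics(values):
    """Estimate the channel mean and (population) variance; no labels are needed."""
    return fmean(values), pvariance(values)


# Hypothetical activations of one channel: training images vs. a brighter target image.
source_acts = [0.0, 1.0, 2.0, 3.0]
target_acts = [10.0, 12.0, 14.0, 16.0]

source_mean, source_var = adapt_statistics(source_acts)
target_mean, target_var = adapt_statistics(target_acts)

# Stale statistics leave the target activations far from zero mean and unit variance.
stale = batchnorm(target_acts, source_mean, source_var)
# Statistics re-estimated on the target image restore the expected distribution.
adapted = batchnorm(target_acts, target_mean, target_var)

print("stale mean:", fmean(stale))
print("adapted mean:", fmean(adapted))
```

In a real network the same re-estimation would be applied per channel in every batch normalization layer, while all learned weights stay fixed.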

Convolutional Edge Constraint-Based U-Net for Salient Object Detection

IEEE ACCESS, 2019, vol. 7, pp. 48890-48900

Authors: Han, Le; Li, Xuelong; Dong, Yongsheng. Chinese Acad Sci Xian Inst Opt & Precis Mech Key Lab Spectral Imaging Technol CAS Xian 710119 Shaanxi Peoples R China; Univ Chinese Acad Sci Beijing 100049 Peoples R China; Northwestern Polytech Univ Sch Comp Sci Xian 710072 Shaanxi Peoples R China; Northwestern Polytech Univ Ctr OPT IMagery Anal & Learning OPTIMAL Xian 710072 Shaanxi Peoples R China; Henan Univ Sci & Technol Sch Informat Engn Luoyang 471023 Peoples R China

Salient object detection is receiving more and more attention from researchers. An accurate saliency map will be useful for subsequent tasks. However, in most saliency maps predicted by existing models, the object regions are very blurred and the edges of objects are irregular. The reason is that hand-crafted features are the main basis on which existing traditional methods predict salient objects, so different pixels belonging to the same object are often assigned different saliency scores. Besides, the convolutional neural network (CNN)-based models predict saliency maps at patch scale, which causes the object edges in the output to be fuzzy. In this paper, we attempt to add an edge convolution constraint to a modified U-Net to predict the saliency map of the image. The network structure we adopt can fuse the features of different layers to reduce the loss of information. Our SalNet predicts the saliency map pixel-by-pixel, rather than at the patch scale as the CNN-based models do. Moreover, in order to better guide the network in mining object edge information, we design a new loss function based on image convolution, which adds an L1 constraint to the edge information of the saliency map and the ground truth. Finally, experimental results reveal that our SalNet is effective in the salient object detection task and is also competitive when compared with 11 state-of-the-art models.

Keywords: encoder-decoder architecture; image convolution; edge extraction; salient object detection; skip connection; U-Net
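The loss described in the abstract above, an L1 constraint on edge information extracted by image convolution, could look roughly like the following sketch. The abstract does not name the convolution kernel, so the 3x3 Laplacian used here is an assumption, as are the function names `convolve2d` and `edge_l1_loss` and the toy maps; this is an illustration of the idea, not the paper's loss.

```python
# Assumed edge kernel: a 3x3 Laplacian (the abstract does not specify the kernel).
LAPLACIAN = [[0, 1, 0],
             [1, -4, 1],
             [0, 1, 0]]


def convolve2d(image, kernel):
    """Valid-mode 2D filtering of a list-of-lists image with a square kernel.

    The Laplacian is symmetric, so cross-correlation and convolution coincide.
    """
    k = len(kernel)
    rows = len(image) - k + 1
    cols = len(image[0]) - k + 1
    return [[sum(kernel[i][j] * image[r + i][c + j]
                 for i in range(k) for j in range(k))
             for c in range(cols)]
            for r in range(rows)]


def edge_l1_loss(prediction, ground_truth, kernel=LAPLACIAN):
    """Mean absolute difference between the edge maps of prediction and ground truth."""
    pred_edges = convolve2d(prediction, kernel)
    true_edges = convolve2d(ground_truth, kernel)
    diffs = [abs(p - t)
             for pred_row, true_row in zip(pred_edges, true_edges)
             for p, t in zip(pred_row, true_row)]
    return sum(diffs) / len(diffs)


# Hypothetical 4x4 maps: a crisp ground-truth object and a weaker, blurrier prediction.
ground_truth = [[0.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 1.0, 0.0],
                [0.0, 1.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 0.0]]
blurry = [[0.5 * v for v in row] for row in ground_truth]

print(edge_l1_loss(ground_truth, ground_truth))  # identical edge maps
print(edge_l1_loss(blurry, ground_truth))        # weaker edges are penalized
```

In training, a term like this would be added to a standard pixel-wise saliency loss so that errors along object boundaries are penalized explicitly.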

Deep Road Scene Understanding

IEEE SIGNAL PROCESSING LETTERS, 2019, vol. 26, no. 4, pp. 587-591

Authors: Zhou, Wujie; Lv, Sijia; Jiang, Qiuping; Yu, Lu. Zhejiang Univ Sci & Technol Sch Informat & Elect Engn Hangzhou 310023 Zhejiang Peoples R China; Zhejiang Univ Inst Informat & Commun Engn Hangzhou 310027 Zhejiang Peoples R China; Ningbo Univ Fac Informat Sci & Engn Ningbo 315211 Zhejiang Peoples R China

Road scene understanding is a difficult task in autonomous driving. In this letter, we propose a novel deep encoder-decoder architecture for road scene understanding in an end-to-end manner. This core trainable understanding engine includes an encoder network, a decoder network with two streams, and a pixel-level fusion network with a classification layer. The encoder network is composed of the front-end model of the classical convolutional neural network, VGGNet. The decoder network with two streams includes multi-scale skip connection modules to reduce the down-scaling effect. Finally, a fusion network fuses the two-level information from the two streams of the decoder network for precise pixel-level classification. Additionally, a convolution layer is added to each skip connection module to increase the depth of the architecture. Our architecture achieves outstanding performance on the publicly available CamVid dataset and significantly outperforms previous architectures. This deep architecture is ideal for road scene understanding.

Keywords: Road scene understanding; encoder-decoder architecture; skip connection; fusion layer

Mixed spatial pyramid pooling for semantic segmentation

APPLIED SOFT COMPUTING, 2020, vol. 91, p. 106209

Authors: Xia, Zhengyu; Kim, Joohee. IIT Dept Elect & Comp Engn Chicago IL 60616 USA

Semantic segmentation is a challenging task as each pixel should be labeled accurately in the image. To improve the performance of semantic segmentation, some Fully Convolutional Network (FCN) based semantic segmentation methods adopt a spatial pyramid pooling structure to enrich contextual information. Others employ an encoder-decoder architecture to recover object details gradually. In this paper, we propose a semantic segmentation framework which combines the benefits of these approaches. Specifically, we propose a Mixed Spatial Pyramid Pooling (MSPP) module based on region-based average pooling and dilated convolution to obtain dense multi-level contextual priors. To refine the details of objects more effectively, we also propose a Global-Attention Fusion (GAF) module to provide global context as guidance for low-level features. Our proposed method achieves an mIoU of 84.1% on the PASCAL VOC 2012 dataset and 80.4% on the Cityscapes dataset without using any post-processing or additional datasets for the pretrained model. (C) 2020 Elsevier B.V. All rights reserved.

Keywords: Semantic segmentation; Convolutional neural network; Spatial pyramid pooling; Dilated convolution; encoder-decoder architecture; Scene understanding
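The abstract above reports results as mIoU (mean intersection-over-union). As a reminder of what that metric measures, here is a minimal sketch on flat label lists. The function name `mean_iou` and the toy labels are assumptions for illustration; benchmark toolkits typically accumulate intersections and unions over a whole dataset rather than a single image.

```python
def mean_iou(predicted, target, num_classes):
    """Average the per-class intersection-over-union.

    Classes absent from both the prediction and the target are skipped.
    """
    ious = []
    for cls in range(num_classes):
        intersection = sum(1 for p, t in zip(predicted, target)
                           if p == cls and t == cls)
        union = sum(1 for p, t in zip(predicted, target)
                    if p == cls or t == cls)
        if union:
            ious.append(intersection / union)
    return sum(ious) / len(ious)


# Hypothetical per-pixel class labels for a six-pixel image.
predicted = [0, 0, 1, 1, 2, 2]
target = [0, 1, 1, 1, 2, 0]

# Per-class IoU here is 1/3, 2/3, and 1/2, so the mean is 0.5.
print(mean_iou(predicted, target, num_classes=3))
```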

Self-co-attention neural network for anatomy segmentation in whole breast ultrasound

MEDICAL IMAGE ANALYSIS, 2020, vol. 64, p. 101753

Authors: Lei, Baiying; Huang, Shan; Li, Hang; Li, Ran; Bian, Cheng; Chou, Yi-Hong; Qin, Jing; Zhou, Peng; Gong, Xuehao; Cheng, Jie-Zhi. Shenzhen Univ Natl Reg Key Technol Engn Lab Med Ultrasound Guangdong Key Lab Biomed Measurements & Ultrasoun Sch Biomed Engn Hlth Sci Ctr Shenzhen 518060 Peoples R China; Yuanpei Univ Med Technol Dept Med Imaging & Radiol Technol Hsinchu Taiwan; Natl Yang Ming Univ Taipei Vet Gen Hosp Dept Radiol Taipei Taiwan; Natl Yang Ming Univ Sch Med Taipei Taiwan; Yee Zen Gen Hosp Dept Radiol Taoyuan Taiwan; Hong Kong Polytech Univ Sch Nursing Ctr Smart Hlth Hong Kong Peoples R China; Shenzhen Univ Peoples Hosp Shenzhen 2 Affiliated Hosp 1 Dept Ultrasound Shenzhen Peoples Hosp 2 Shenzhen 518035 Peoples R China; Shanghai United Imaging Intelligence Co Ltd UII Shanghai Peoples R China

Automated whole breast ultrasound (AWBUS) is a new breast imaging technique that can depict the whole breast anatomy. To facilitate the reading of AWBUS images and support breast density estimation, an automatic breast anatomy segmentation method for AWBUS images is proposed in this study. The problem at hand is quite challenging as it needs to address issues of low image quality, ill-defined boundaries, large anatomical variation, etc. To address these issues, a new deep learning encoder-decoder segmentation method based on a self-co-attention mechanism is developed. The self-attention mechanism comprises a spatial and channel attention module (SC) and is embedded in the ResNeXt block (i.e., Res-SC) in the encoder path. A non-local context block (NCB) is further incorporated to augment the learning of high-level contextual cues. The decoder path of the proposed method is equipped with a weighted up-sampling block (WUB) to attain a better class-specific up-sampling effect. Meanwhile, a co-attention mechanism is also developed to improve the segmentation coherence between two consecutive slices. Extensive experiments are conducted in comparison with several state-of-the-art deep learning segmentation methods. The experimental results corroborate the effectiveness of the proposed method on the difficult breast anatomy segmentation problem on AWBUS images. (C) 2020 Elsevier B.V. All rights reserved.

Keywords: Breast anatomy segmentation; Self-co-attention mechanism; Non-local cue; encoder-decoder architecture

A novel fully convolutional network for visual saliency prediction

PEERJ COMPUTER SCIENCE, 2020, vol. 6, e280

Authors: Ghariba, Bashir Muftah; Shehata, Mohamed S.; McGuire, Peter. Mem Univ Newfoundland Fac Engn & Appl Sci St John NF Canada; Elmergib Univ Fac Engn Dept Elect & Comp Engn Khoms Libya; Univ British Columbia Dept Comp Sci Math Phys & Stat Kelowna BC Canada; C CORE St John NF Canada

The human visual system (HVS) has the ability to pay visual attention, which is one of its many functions. Despite the many advancements being made in visual saliency prediction, there continues to be room for improvement. Deep learning has recently been used to deal with this task. This study proposes a novel deep learning model based on a Fully Convolutional Network (FCN) architecture. The proposed model is trained in an end-to-end style and designed to predict visual saliency. The entire model is trained from scratch to extract distinguishing features. The proposed model is evaluated using several benchmark datasets, such as MIT300, MIT1003, TORONTO, and DUT-OMRON. The quantitative and qualitative experimental analyses demonstrate that the proposed model achieves superior performance in predicting visual saliency.

Keywords: Deep learning; Convolutional neural networks; Fully Convolutional Network; Semantic Segmentation; encoder-decoder architecture; Human eye fixation

Upsampling Matters for Road Marking Segmentation of Autonomous Driving

IFAC-PapersOnLine, 2020, vol. 53, no. 5, pp. 232-237

Authors: Ye Liu; Xi Zhang; Lei Liu; Lei Zhang. School of Mechanical Engineering, Shanghai Jiao Tong University; CSSC

Although autonomous driving has become applicable to the industry, the application of key techniques to autonomous vehicles still needs to be refined. For instance, segmenting road markings quickly and accurately, in order to assist the subsequent pedestrian path prediction and the creation of high-definition (HD) maps respectively, would make autonomous driving more practical. Current road marking segmentation mainly relies on semantic segmentation techniques from computer vision with an encoder-decoder architecture. However, as demonstrated in this paper, the upsampling layer of convolutional neural networks with an encoder-decoder architecture plays a significant role in the efficiency and accuracy of road marking segmentation. The bilinear upsampling layer is fast because of its simple interpolation but is less accurate; in contrast, the upsampling layer with offsets is relatively accurate but has a higher computational cost. Therefore, at least in terms of prevalent application, efficiency, and accuracy, the upsampling layer of the decoder of convolutional neural networks deserves more attention in future research on autonomous driving.

Keywords: Semantic Segmentation; Upsampling; encoder-decoder architecture; Road Marking; Autonomous Driving
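The abstract above attributes the speed of the bilinear upsampling layer to its simple interpolation. The sketch below shows that interpolation on a small grid. The function name `bilinear_upsample`, the corner-aligned sampling convention, and the toy input are assumptions for illustration; the offset-based upsampling layer that the paper compares against is not shown.

```python
def bilinear_upsample(grid, out_rows, out_cols):
    """Resize a 2D list of floats with bilinear interpolation (corner-aligned sampling)."""
    in_rows, in_cols = len(grid), len(grid[0])
    result = []
    for r in range(out_rows):
        # Map the output row back to a fractional input row.
        y = r * (in_rows - 1) / (out_rows - 1) if out_rows > 1 else 0.0
        y0 = int(y)
        y1 = min(y0 + 1, in_rows - 1)
        wy = y - y0
        row = []
        for c in range(out_cols):
            # Map the output column back to a fractional input column.
            x = c * (in_cols - 1) / (out_cols - 1) if out_cols > 1 else 0.0
            x0 = int(x)
            x1 = min(x0 + 1, in_cols - 1)
            wx = x - x0
            # Interpolate along x on the two neighboring rows, then along y.
            top = grid[y0][x0] * (1 - wx) + grid[y0][x1] * wx
            bottom = grid[y1][x0] * (1 - wx) + grid[y1][x1] * wx
            row.append(top * (1 - wy) + bottom * wy)
        result.append(row)
    return result


# Hypothetical 2x2 feature map upsampled to 3x3.
coarse = [[0.0, 2.0],
          [4.0, 6.0]]
fine = bilinear_upsample(coarse, 3, 3)
for row in fine:
    print(row)
```

Each output value is a fixed weighted average of at most four inputs, with no learned parameters, which is why this layer is cheap but cannot adapt its sampling positions to thin structures such as road markings.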

Deep Learning-Based Feature Silencing for Accurate Concrete Crack Detection

SENSORS, 2020, vol. 20, no. 16, p. 4403

Authors: Billah, Umme Hafsa; La, Hung Manh; Tavakkoli, Alireza. Univ Nevada Dept Comp Sci & Engn Reno NV 89557 USA

An autonomous concrete crack inspection system is necessary for preventing hazardous incidents arising from deteriorated concrete surfaces. In this paper, we present a concrete crack detection framework to aid the process of automated inspection. The proposed approach employs a deep convolutional neural network architecture for crack segmentation, while addressing the effect of the vanishing gradient problem. A feature silencing module is incorporated in the proposed framework, capable of eliminating non-discriminative feature maps from the network to improve performance. Experimental results support the benefit of incorporating feature silencing within a convolutional neural network architecture for improving the network's robustness, sensitivity, and specificity. An added benefit of the proposed architecture is its ability to accommodate the trade-off between specificity (negative class detection accuracy) and sensitivity (positive class detection accuracy) with respect to the target application. Furthermore, the proposed framework achieves a higher precision rate and better processing time than the state-of-the-art crack detection architectures.

Keywords: convolutional neural network; encoder-decoder architecture; semantic segmentation; feature silencing; crack detection

Self-Learned Feature Reconstruction and Offset-Dilated Feature Fusion for Real-Time Semantic Segmentation

31st IEEE International Conference on Tools with Artificial Intelligence (ICTAI)

Authors: Qi, Gege; Pan, Lin; Liu, Song; Luo, Zhengding; Zhu, Yuesheng. Peking Univ Shenzhen Grad Sch Commun & Informat Secur Lab Shenzhen Peoples R China

ISBN: (print) 9781728137988

Recent approaches for real-time semantic segmentation usually employ the encoder-decoder architecture as the backbone to generate a high-quality segmentation prediction. There has been a lot of research on designing efficient encoding methods. However, enhancing the performance of components in the decoder is also crucial for pixel-level recognition. In this paper, we propose a self-learned feature reconstruction (SFR) method and an offset-dilated feature fusion (ODFF) module to improve the prediction reconstruction capability of the decoder. Concretely, SFR can effectively reconstruct the high-resolution feature maps by recombining the feature space, in which the space transformation matrix implicitly contained in a convolution layer can selectively highlight features at each position by leveraging the knowledge of the label space in a self-learned way. Moreover, the ODFF module can effectively fuse multilevel features with multiscale contextual information by feeding the feature maps into designed parallel offset-dilated convolutions, which enhances the feature representation capability of the decoder. Experiments on the Cityscapes and CamVid datasets demonstrate the superior performance of our proposed methods embedded in ESPNet.

Keywords: real-time semantic segmentation; encoder-decoder architecture; feature reconstruction; feature fusion

Coarse-to-Fine Satellite Images Change Detection Framework via Boundary-Aware Attentive Network

SENSORS, 2020, vol. 20, no. 23, p. 6735

Authors: Zhang, Yi; Zhang, Shizhou; Li, Ying; Zhang, Yanning. Northwestern Polytech Univ Shaanxi Prov Key Lab Speech & Image Informat Proc Natl Engn Lab Integrated Aerosp Ground Ocean Big Sch Comp Sci Xian 710129 Peoples R China; Xian Univ Posts & Telecommun Sch Commun & Informat Engn Xian 710121 Peoples R China

Timely and accurate change detection on satellite images using computer vision techniques has attracted considerable research effort in recent years. Existing approaches based on deep learning frameworks have achieved good performance on the task of change detection on satellite images. However, when the changed areas are disjoint and vary in shape across the land surface, existing methods still have shortcomings in detecting all changed areas correctly and in representing the boundaries of the changed areas. To deal with these problems, we design a coarse-to-fine detection framework via a boundary-aware attentive network with a hybrid loss to detect changes in high-resolution satellite images. Specifically, we first apply an attention-guided encoder-decoder subnet to obtain the coarse change map of the bi-temporal image pairs, and then apply residual learning to obtain the refined change map. We also propose a hybrid loss to provide supervision at the pixel, patch, and map levels. Comprehensive experiments are conducted on two benchmark datasets, LEBEDEV and SZTAKI, to verify the effectiveness of the proposed method, and the experimental results show that our model achieves state-of-the-art performance.

Keywords: change detection; deep learning; attentive; coarse-to-fine; encoder-decoder architecture; end-to-end
