In this paper we present an approach to training neural networks to generate sequences using successor feature learning from reinforcement learning. The model can be thought of as two components: an MLE-based token generator and an estimator that predicts the future value of the whole sentence. Reinforcement learning has been applied to the exposure bias problem in sequence generation. Compared with other RL algorithms, successor features (SF) can learn a robust value function from observations and rewards by decomposing the value function into two components, a reward predictor and a successor map. An encoder-decoder framework with SF enables the decoder to generate outputs that receive more future reward, meaning the model attends not only to the current word but also to the words that remain to be generated. We demonstrate that the approach improves performance on two translation tasks. (C) 2019 Elsevier B.V. All rights reserved.
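As a rough illustration of the successor-feature decomposition this abstract refers to, the sketch below computes the value of a trajectory as the product of a discounted feature sum (the successor map) and linear reward weights (the reward predictor). The names `phi`, `psi`, and `w` and the toy dimensions are illustrative assumptions, not the paper's implementation.

```python
import numpy as np

def successor_value(phi, w, gamma=0.95):
    """Value of a trajectory under the successor-feature decomposition.

    phi   : (T, d) array of state features along a sampled trajectory
    w     : (d,)   reward weights, so r_t ~= phi[t] @ w (reward predictor)
    gamma : discount factor
    """
    T, d = phi.shape
    psi = np.zeros(d)
    for t in range(T):
        psi += (gamma ** t) * phi[t]      # successor map: discounted feature sum
    return psi @ w                        # value = successor map . reward weights

# toy usage: 5-step trajectory with 3-dimensional features
rng = np.random.default_rng(0)
phi = rng.normal(size=(5, 3))
w = np.array([0.2, -0.1, 0.5])
print(successor_value(phi, w))
```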
While traditional cartoon character drawings are simple for humans to create, they remain highly challenging for machines to interpret. Parsing alleviates the issue through fine-grained semantic segmentation of images. Although well studied on naturalistic images, research on cartoon parsing is very sparse. Owing to the lack of available datasets and the diversity of artwork styles, cartoon character parsing is harder than the well-known human parsing task. In this paper, we study one type of cartoon instance: cartoon dogs. We introduce a novel dataset for cartoon dog parsing and create a new deep convolutional neural network (DCNN) to tackle the problem. Our dataset contains 965 precisely annotated cartoon dog images with seven semantic part labels. Our new model, called dense feature pyramid network (DFPnet), makes use of recent popular techniques in semantic segmentation to handle cartoon dog parsing efficiently. We achieve an mIoU of 68.39%, a Mean Accuracy of 79.4%, and a Pixel Accuracy of 93.5% on our cartoon dog validation set. Our method outperforms state-of-the-art models from similar tasks trained on our dataset: CE2P for single human parsing and Mask R-CNN for instance segmentation. We hope this work can serve as a starting point for future research toward digital artwork understanding with DCNNs. Our DFPnet and dataset will be publicly available.
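The metrics quoted here (mIoU, Mean Accuracy, Pixel Accuracy) follow the standard semantic-segmentation definitions. The sketch below shows how they are typically computed from a confusion matrix; it is a generic illustration, not the authors' evaluation code.

```python
import numpy as np

def segmentation_metrics(pred, gt, num_classes):
    """Standard parsing metrics from predicted and ground-truth label maps."""
    # confusion matrix: rows = ground truth, cols = prediction
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    for g, p in zip(gt.ravel(), pred.ravel()):
        cm[g, p] += 1
    tp = np.diag(cm).astype(float)
    pixel_acc = tp.sum() / cm.sum()
    mean_acc = np.nanmean(tp / cm.sum(axis=1))           # per-class accuracy, averaged
    iou = tp / (cm.sum(axis=1) + cm.sum(axis=0) - tp)    # intersection over union
    miou = np.nanmean(iou)
    return miou, mean_acc, pixel_acc

# toy example with 3 classes on a 4x4 label map
gt   = np.array([[0,0,1,1],[0,0,1,1],[2,2,1,1],[2,2,2,2]])
pred = np.array([[0,0,1,1],[0,1,1,1],[2,2,1,2],[2,2,2,2]])
print(segmentation_metrics(pred, gt, num_classes=3))
```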
Dynamic scene deblurring is a significant technique in the field of computer vision. The multi-scale strategy has been successfully extended to deep end-to-end learning-based deblurring, but its expensive computation gave rise to the multi-patch framework. The success of the multi-patch framework comes from the local residual information passed across the hierarchy. One problem is that the finest levels contribute little to their residuals, so their contributions are dominated by the coarser levels, which limits deblurring performance. To this end, we replace the building blocks of the encoder-decoders in the multi-patch network with nested module blocks, whose powerful and complex representation ability is used to improve deblurring performance. Additionally, an attention mechanism is introduced so that the network can differentiate blur across the whole blurry image of a dynamic scene, further improving its ability to handle blur from moving objects. Our modification boosts the contributions of the finest levels to their residuals and enables the network to learn different weights for feature information extracted from spatially varying blurred images. Extensive experiments show that the improved network achieves competitive performance on the GoPro dataset in terms of PSNR and SSIM.
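The attention mechanism is only described at a high level in this abstract. A minimal sketch of one common choice, a spatial attention gate that reweights encoder features so that spatially varying blur regions can be emphasized, is given below; the module layout and channel sizes are assumptions for illustration, not the paper's design.

```python
import torch
import torch.nn as nn

class SpatialAttention(nn.Module):
    """Reweights feature maps with a per-pixel attention mask in [0, 1]."""
    def __init__(self, channels):
        super().__init__()
        self.mask = nn.Sequential(
            nn.Conv2d(channels, channels // 2, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 2, 1, kernel_size=1),
            nn.Sigmoid(),                      # per-pixel weight for blur regions
        )

    def forward(self, feats):
        return feats * self.mask(feats)        # emphasize spatially varying blur

# toy usage on a batch of encoder features
feats = torch.randn(2, 64, 32, 32)
print(SpatialAttention(64)(feats).shape)       # torch.Size([2, 64, 32, 32])
```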
Generative (abstractive) text summarization is an important branch of natural language processing. Aiming at the insufficient use of semantic information, limited summary precision, and loss of semantics in current generative summarization methods, an enhanced semantic model based on a dual encoder is proposed, which provides richer semantic information for the sequence-to-sequence architecture. The enhanced attention architecture with dual-channel semantics is optimized, and an empirical distribution and a Gain-Benefit gate are built for decoding. In addition, position embeddings are merged with word embeddings, and TF-IDF (term frequency-inverse document frequency), part-of-speech, and key-score features are added to each word's representation, while the optimal word embedding dimension is tuned. This paper aims to optimize the traditional sequence mapping and word feature representation, enhance the model's semantic understanding, and improve summary quality. The LCSTS and SOGOU datasets are used to validate the proposed method. The experimental results show that the proposed method improves ROUGE scores by 10-13 percentage points compared with the other listed algorithms. We observe that the semantic understanding of the generated summaries is more accurate and the generation quality is better, which indicates a promising application prospect.
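The feature-enriched embedding described here (word plus position embeddings, with TF-IDF, part-of-speech, and key-score features appended) can be sketched as follows. The dimensions, feature count, and projection scheme are assumptions for illustration only, not the paper's exact configuration.

```python
import torch
import torch.nn as nn

class EnrichedEmbedding(nn.Module):
    """Word + position embeddings, concatenated with hand-crafted token features."""
    def __init__(self, vocab_size, max_len, emb_dim=256, feat_dim=3):
        super().__init__()
        self.word = nn.Embedding(vocab_size, emb_dim)
        self.pos = nn.Embedding(max_len, emb_dim)
        # project [word+pos ; tf-idf, POS id, key score] back to emb_dim
        self.proj = nn.Linear(emb_dim + feat_dim, emb_dim)

    def forward(self, token_ids, features):
        # token_ids: (B, T) ints; features: (B, T, 3) = TF-IDF, POS tag id, key score
        positions = torch.arange(token_ids.size(1), device=token_ids.device)
        x = self.word(token_ids) + self.pos(positions)       # merge word and position
        return self.proj(torch.cat([x, features], dim=-1))   # append extra features

# toy usage
emb = EnrichedEmbedding(vocab_size=5000, max_len=128)
ids = torch.randint(0, 5000, (2, 10))
feats = torch.rand(2, 10, 3)
print(emb(ids, feats).shape)   # torch.Size([2, 10, 256])
```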
Video-based sports movement analysis has important application value. Introducing digital video, human-computer interaction, and other technologies into sports training can greatly improve training efficiency. This paper studies the technical characteristics of players in basketball game videos and proposes a behavior analysis method based on deep learning. We first design a method to automatically extract the basketball court and its marking lines. Subsequently, key frames in the video are captured using a spatiotemporal scoring mechanism. Afterward, we develop a behavior recognition and prediction method based on an encoder-decoder framework. The analysis results can be fed back to coaches and data analysts in real time to help them analyze tactics and technical choices. Experiments are carried out on a large basketball video dataset. The results show that the proposed method can effectively identify the motion of people in the video while achieving high behavior analysis accuracy.
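The spatiotemporal scoring mechanism for key-frame capture is only outlined in the abstract. A minimal sketch of the general idea, scoring each frame and keeping the highest-scoring ones, is shown below; the frame-difference score used here is a simple placeholder assumption, not the paper's scoring function.

```python
import numpy as np

def select_key_frames(frames, k=5):
    """Score frames by how much they change from the previous frame, keep the top k."""
    frames = np.asarray(frames, dtype=float)          # (T, H, W) grayscale frames
    scores = np.zeros(len(frames))
    scores[1:] = np.abs(np.diff(frames, axis=0)).mean(axis=(1, 2))  # temporal change
    return np.sort(np.argsort(scores)[-k:])           # key-frame indices, in order

# toy usage: 30 random frames of size 64x64
frames = np.random.rand(30, 64, 64)
print(select_key_frames(frames, k=5))
```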
ISBN: (Print) 9781467360739; 9781467360753
In this paper, we present an effective encoder-decoder design utilizing the Flexible Cross Correlation (FCC) code for Spectral Amplitude Coding-Optical Code Division Multiple Access (SAC-OCDMA) systems. The FCC code offers a flexible cross-correlation property for any given number of users and weights, and effectively reduces the impact of Multiple-Access Interference (MAI). The proposed FCC SAC-OCDMA encoder-decoder shows superior performance, supporting 100%, 287%, and 331% more active users than MDW (K=60), MFH (K=31), and Hadamard (K=29), respectively.
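The key claim is a controlled cross-correlation between any pair of user code sequences. The sketch below shows how such a property can be checked numerically for a set of binary spectral codes; the example codewords are made up for illustration and are not actual FCC codewords.

```python
import numpy as np
from itertools import combinations

def cross_correlations(codes):
    """In-phase cross-correlation (dot product) between every pair of 0/1 codewords."""
    codes = np.asarray(codes)
    return {(i, j): int(codes[i] @ codes[j])
            for i, j in combinations(range(len(codes)), 2)}

# toy 0/1 code set of weight 2 whose pairwise cross-correlation is 1 (illustrative only)
codes = [
    [1, 1, 0],
    [1, 0, 1],
    [0, 1, 1],
]
print(cross_correlations(codes))   # {(0, 1): 1, (0, 2): 1, (1, 2): 1}
```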
Seismic image interpretation is indispensable for the oil and gas industry. Currently, artificial intelligence has been undertaken to increase the level of confidence in exploratory activities. Detecting potentially recoverable hydrocarbon zones (leads) from the viewpoint of computer vision is an emerging problem that demands thorough examination. This paper introduces a processing workflow to recognize geologic leads in seismic images that resorts to encoder-decoder architectures of a convolutional neural network (CNN) accompanied by segmentation maps and post-processing operations. We have used seismic images collected at offshore sites of the Sergipe-Alagoas Basin (northeast of Brazil) as input. After performing a patch-based data augmentation, a total of 29600 patches were obtained. Of these, 24000 were used for training, 5000 for validation, and 600 for testing. Each image generated for the training set was post-processed through reconstruction, thresholding (binarization and deblurring), and outlier removal. By using the dice loss function, intersection-over-union index, and relative areal residual computed after intensive cross-validation training rounds, we show that the accuracy of the network in detecting leads is higher than 80%. Furthermore, the validation error limits were found to be stable within 5%-10% in all validation rounds, resulting in a fairly accurate prediction of the pre-labelled hydrocarbon spots.
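The evaluation relies on the Dice loss and the intersection-over-union index. Below is a minimal sketch of both for binary lead masks, using the standard formulations rather than the authors' exact code.

```python
import numpy as np

def dice_loss(pred, gt, eps=1e-6):
    """1 - Dice coefficient for soft binary masks in [0, 1]."""
    inter = (pred * gt).sum()
    return 1.0 - (2.0 * inter + eps) / (pred.sum() + gt.sum() + eps)

def iou(pred, gt, thresh=0.5, eps=1e-6):
    """Intersection-over-union after thresholding the prediction."""
    p = pred > thresh
    g = gt > 0.5
    inter = np.logical_and(p, g).sum()
    union = np.logical_or(p, g).sum()
    return (inter + eps) / (union + eps)

# toy masks: prediction overlaps ground truth on 3 of 4 positive pixels
gt   = np.array([[1, 1], [1, 1]], dtype=float)
pred = np.array([[0.9, 0.8], [0.7, 0.1]], dtype=float)
print(dice_loss(pred, gt), iou(pred, gt))
```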
The encoder-decoder framework has been widely used for video captioning with promising results, and various attention mechanisms have been proposed to further improve performance. While temporal attention determines where to look, semantic attention decides the context. However, the combination of semantic and temporal attention has never been exploited for video captioning. To tackle this issue, we propose an end-to-end pipeline named Fused GRU with Semantic-Temporal Attention (STA-FG), which explicitly incorporates high-level visual concepts into the generation of semantic-temporal attention for video captioning. The encoder network extracts visual features from the videos and predicts their semantic concepts, while the decoder network focuses on efficiently generating coherent sentences using both the visual features and the semantic concepts. Specifically, the decoder combines the visual and semantic representations and incorporates a semantic and temporal attention mechanism in a fused GRU network to accurately learn the sentences for video captioning. We experimentally evaluate our approach on the two prevalent datasets MSVD and MSR-VTT, and the results show that our STA-FG achieves the current best performance in both BLEU and METEOR. (C) 2019 Elsevier B.V. All rights reserved.
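In the spirit of the semantic-temporal attention described above, the sketch below shows one decoder step that attends over frame features and feeds a semantic-concept vector into a GRU cell. The module layout, fusion by concatenation, and dimensions are assumptions for illustration, not the STA-FG specification.

```python
import torch
import torch.nn as nn

class SemanticTemporalStep(nn.Module):
    """One decoder step: temporal attention over frames + semantic concept input."""
    def __init__(self, feat_dim, sem_dim, hid_dim, vocab_size):
        super().__init__()
        self.attn = nn.Linear(feat_dim + hid_dim, 1)        # temporal attention scores
        self.gru = nn.GRUCell(feat_dim + sem_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, frame_feats, sem_concepts, h):
        # frame_feats: (B, T, feat_dim); sem_concepts: (B, sem_dim); h: (B, hid_dim)
        B, T, _ = frame_feats.shape
        scores = self.attn(torch.cat(
            [frame_feats, h.unsqueeze(1).expand(B, T, -1)], dim=-1))  # (B, T, 1)
        alpha = torch.softmax(scores, dim=1)
        context = (alpha * frame_feats).sum(dim=1)           # attended visual context
        h = self.gru(torch.cat([context, sem_concepts], dim=-1), h)
        return self.out(h), h                                # word logits, new state

# toy usage
step = SemanticTemporalStep(feat_dim=512, sem_dim=300, hid_dim=256, vocab_size=1000)
logits, h = step(torch.randn(2, 20, 512), torch.randn(2, 300), torch.zeros(2, 256))
print(logits.shape)   # torch.Size([2, 1000])
```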
Magnetic Resonance Images (MRI) are often contaminated by Rician noise at acquisition time. This type of noise typically deteriorates the performance of disease diagnosis by a human observer or an automated system. Thus, it is necessary to remove the Rician noise from MRI scans as a preprocessing step. In this letter, we propose a novel Convolutional Neural Network (CNN), viz. CNN-DMRI, for denoising of MRI scans. The network uses a set of convolutions to separate the image features from the noise. The network also employs an encoder-decoder structure for preserving the prominent features of the image while ignoring unnecessary ones. The training of the network is carried out end-to-end using a residual learning scheme. The performance of the proposed CNN has been tested qualitatively and quantitatively on one simulated and four real MRI datasets. Extensive experimental findings suggest that the proposed network can denoise MRI images effectively without losing crucial image details. (C) 2020 Elsevier B.V. All rights reserved.
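Residual learning for denoising usually means the network predicts the noise map and subtracts it from its input. The sketch below illustrates that idea with a tiny CNN together with a simple Rician noise simulator; both are generic illustrations under that assumption, not CNN-DMRI itself.

```python
import torch
import torch.nn as nn

def add_rician_noise(img, sigma=0.05):
    """Simulate Rician noise: magnitude of a complex signal with Gaussian noise."""
    real = img + sigma * torch.randn_like(img)
    imag = sigma * torch.randn_like(img)
    return torch.sqrt(real ** 2 + imag ** 2)

class ResidualDenoiser(nn.Module):
    """Tiny residual CNN: learns the noise map, output = input - predicted noise."""
    def __init__(self, channels=1, width=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, width, 3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(width, channels, 3, padding=1),
        )

    def forward(self, noisy):
        return noisy - self.body(noisy)    # residual learning: subtract predicted noise

# toy usage on a random "MRI slice"
clean = torch.rand(1, 1, 64, 64)
noisy = add_rician_noise(clean)
print(ResidualDenoiser()(noisy).shape)     # torch.Size([1, 1, 64, 64])
```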
Skin lesion detection and classification are critical in diagnosing skin malignancy. Existing deep learning-based computer-aided diagnosis (CAD) methods still perform poorly on challenging skin lesions with complex features such as fuzzy boundaries, artifact presence, low contrast with the background, and limited training datasets. They also rely heavily on suitable tuning of millions of parameters, which often leads to over-fitting, poor generalization, and heavy consumption of computing resources. This study proposes a new framework that performs both segmentation and classification of skin lesions for automated detection of skin cancer. The proposed framework consists of two stages: the first stage leverages an encoder-decoder Fully Convolutional Network (FCN) to learn the complex and inhomogeneous skin lesion features, with the encoder learning the coarse appearance and the decoder learning the lesion border details. Our FCN is designed with its sub-networks connected through a series of skip pathways that incorporate both long skip and short-cut connections, unlike the long skip connections alone commonly used in the traditional FCN, for a residual learning strategy and effective training. The network also integrates a Conditional Random Field (CRF) module, which employs a linear combination of Gaussian kernels for its pairwise edge potentials, for contour refinement and lesion boundary localization. The second stage proposes a novel FCN-based DenseNet framework composed of dense blocks that are merged and connected via a concatenation strategy and transition layers. The system also employs hyper-parameter optimization techniques to reduce network complexity and improve computing efficiency. This approach encourages feature reuse, thus requiring a small number of parameters and remaining effective with limited data. The proposed model was evaluated on the publicly available HAM10000 dataset of over 10000 images consisting of 7 different categories of d...
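The DenseNet component described in the second stage concatenates each layer's output with all earlier feature maps in the block, which is what drives the feature reuse and small parameter count mentioned above. Below is a minimal sketch of that concatenation strategy; the growth rate and layer sizes are illustrative assumptions, not the paper's configuration.

```python
import torch
import torch.nn as nn

class DenseBlock(nn.Module):
    """Each layer sees the concatenation of all previous feature maps (feature reuse)."""
    def __init__(self, in_channels, growth_rate=12, num_layers=4):
        super().__init__()
        self.layers = nn.ModuleList()
        channels = in_channels
        for _ in range(num_layers):
            self.layers.append(nn.Sequential(
                nn.BatchNorm2d(channels),
                nn.ReLU(inplace=True),
                nn.Conv2d(channels, growth_rate, kernel_size=3, padding=1),
            ))
            channels += growth_rate      # next layer also sees this layer's output

    def forward(self, x):
        feats = [x]
        for layer in self.layers:
            feats.append(layer(torch.cat(feats, dim=1)))   # concatenate everything so far
        return torch.cat(feats, dim=1)

# toy usage: 16 input channels -> 16 + 4 * 12 = 64 output channels
block = DenseBlock(in_channels=16)
print(block(torch.randn(1, 16, 32, 32)).shape)   # torch.Size([1, 64, 32, 32])
```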