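Image captioning is a task in artificial intelligence that combines two major fields, computer vision and natural language processing, and it has attracted great attention from the research community. This paper presents a comprehensive study of current mainstream image captioning algorithms. Building on the CNN-LSTM caption generation algorithm, which has achieved strong results on the image captioning task, we introduce Faster R-CNN, an object detection framework that has seen substantial progress in computer vision, as the encoder in place of the CNN, and feed image region features to the decoder. In the recurrent neural network of the decoder we apply an attention mechanism, which further strengthens the contribution of regional image features to the natural language description generated by the decoder. Together these form a structured image captioning framework that goes from region features to a global description. The algorithm is trained and tested on the MSCOCO dataset (on the training set and test set, respectively), and the proposed model outperforms the baseline model.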
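The attention step described above, in which the decoder weighs region features at each time step, can be illustrated with a minimal NumPy sketch. The abstract does not specify the attention form, so the additive (tanh) scoring used here is an assumption, as are the function name `attend`, the parameter names `W_v`, `W_h`, `w_a`, and all dimensions. Random arrays stand in for Faster R-CNN region features and for the LSTM hidden state.

```python
import numpy as np

def softmax(x):
    # Subtract the max for numerical stability before exponentiating.
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attend(regions, hidden, W_v, W_h, w_a):
    """Soft attention over region features (hypothetical additive form).

    regions: (k, d_v) one feature vector per detected region
    hidden:  (d_h,)   current decoder hidden state
    W_v: (d_v, d_a), W_h: (d_h, d_a), w_a: (d_a,)  assumed learned parameters
    """
    # One scalar score per region, conditioned on the decoder state.
    scores = np.tanh(regions @ W_v + hidden @ W_h) @ w_a   # (k,)
    weights = softmax(scores)                              # (k,), sums to 1
    # Context vector: attention-weighted average of the region features.
    context = weights @ regions                            # (d_v,)
    return context, weights

# Stand-in data: 6 regions with 16-dim features and a 8-dim hidden state.
rng = np.random.default_rng(0)
regions = rng.normal(size=(6, 16))
hidden = rng.normal(size=8)
W_v = rng.normal(size=(16, 4))
W_h = rng.normal(size=(8, 4))
w_a = rng.normal(size=4)

context, weights = attend(regions, hidden, W_v, W_h, w_a)
print(weights.round(3), context.shape)
```

In a full decoder, `context` would be combined with the previous word embedding as input to the next LSTM step, so regions with larger weights contribute more to the next generated word.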
ISBN: 9783319690049 (print)
Recently, image captioning, which aims to automatically generate a textual description for an image, has attracted researchers from various *** performance has been achieved by applying deep neural *** of these works aim at generating a single caption, which may be incomplete, especially for complex *** paper proposes a topic-specific multi-caption generator, which first infers topics from the image and then generates a variety of topic-specific captions, each of which depicts the image from a particular *** perform experiments on flickr8k, flickr30k and *** results show that the proposed model performs better than a single-caption generator when generating topic-specific *** proposed model effectively generates diverse captions under reasonable topics, and the captions differ from each other at the topic level.