Author Affiliations: Guangxi Normal Univ, Guangxi Key Lab Multisource Informat Min & Secur, 15 Yucai Rd, Guilin 541004, Guangxi, Peoples R China; Northwest Normal Univ, Coll Comp Sci & Engn, 967 Anning East Rd, Lanzhou 730070, Gansu, Peoples R China; Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, 6 Kexueyuan South Rd, Beijing 100190, Peoples R China
Publication: ACM TRANSACTIONS ON MULTIMEDIA COMPUTING COMMUNICATIONS AND APPLICATIONS (ACM Trans. Multimedia Comput. Commun. Appl.)
Year/Volume/Issue: 2021, Vol. 17, No. 2
Pages: 1–22
Subject Classification: 0809 [Engineering - Electronic Science and Technology (Engineering or Science degree conferrable)]; 08 [Engineering]; 0835 [Engineering - Software Engineering]; 0812 [Engineering - Computer Science and Technology (Engineering or Science degree conferrable)]
Funding: National Natural Science Foundation of China [61966004, 61663004, 61866004, 61762078]; Guangxi Natural Science Foundation [2019GXNSFDA245018, 2018GXNSFDA281009]; Guangxi "Bagui Scholar" Teams for Innovation and Research Project; Guangxi Talent Highland Project of Big Data Intelligence and Application; Guangxi Collaborative Innovation Center of Multi-Source Information Integration and Intelligent Processing
Keywords: Image captioning; attention mechanism; scene semantics; encoder-decoder framework
Abstract: Most existing image captioning methods use only the visual information of the image to guide caption generation, lack the guidance of effective scene semantic information, and rely on a visual attention mechanism that cannot adjust its focus intensity on the image. In this article, we first propose an improved visual attention model. At each timestep, we compute the focus intensity coefficient of the attention mechanism from the context information of the model, then automatically adjust the focus intensity of the attention mechanism through this coefficient to extract more accurate visual information. In addition, we represent the scene semantic knowledge of the image through topic words related to the image scene and add them to the language model. We use the attention mechanism to determine the visual information and scene semantic information that the model attends to at each timestep and combine them so that the model can generate more accurate and scene-specific captions. Finally, we evaluate our model on the Microsoft COCO (MSCOCO) and Flickr30k standard datasets. The experimental results show that our approach generates more accurate captions and outperforms many recent advanced models on various evaluation metrics.
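The two mechanisms described in the abstract can be illustrated with a minimal PyTorch-style sketch. This is not the authors' released code: the module names, tensor shapes, the softplus used to keep the focus-intensity coefficient positive, and the single-layer LSTM decoder are all assumptions made for illustration. The sketch shows (1) an additive attention layer whose softmax is scaled by a focus-intensity coefficient predicted from the decoder's hidden state, and (2) a parallel attention over topic-word (scene semantic) embeddings whose output is fused with the attended visual context before the language-model step.

```python
# Minimal sketch of context-adjusted attention focus intensity plus scene-topic
# attention for image captioning (illustrative only; shapes and names assumed).
import torch
import torch.nn as nn
import torch.nn.functional as F


class FocusIntensityAttention(nn.Module):
    """Soft attention over features (B, N, D) guided by hidden state h (B, Dh).
    A coefficient beta > 0, predicted from h, sharpens or smooths the weights."""

    def __init__(self, feat_dim, hid_dim, att_dim):
        super().__init__()
        self.w_v = nn.Linear(feat_dim, att_dim)
        self.w_h = nn.Linear(hid_dim, att_dim)
        self.w_a = nn.Linear(att_dim, 1)
        self.w_beta = nn.Linear(hid_dim, 1)  # predicts the focus-intensity coefficient

    def forward(self, feats, h):
        # Additive attention score for each region / topic word.
        scores = self.w_a(torch.tanh(self.w_v(feats) + self.w_h(h).unsqueeze(1))).squeeze(-1)
        beta = F.softplus(self.w_beta(h))                 # (B, 1), strictly positive
        alpha = F.softmax(beta * scores, dim=1)           # larger beta -> sharper focus
        context = torch.bmm(alpha.unsqueeze(1), feats).squeeze(1)  # weighted sum, (B, D)
        return context, alpha


class CaptionStep(nn.Module):
    """One decoding timestep: attend to visual regions and topic-word embeddings,
    fuse the two contexts with the word embedding, and update an LSTM decoder."""

    def __init__(self, feat_dim, topic_dim, hid_dim, att_dim, vocab_size, emb_dim):
        super().__init__()
        self.visual_att = FocusIntensityAttention(feat_dim, hid_dim, att_dim)
        self.topic_att = FocusIntensityAttention(topic_dim, hid_dim, att_dim)
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.lstm = nn.LSTMCell(emb_dim + feat_dim + topic_dim, hid_dim)
        self.out = nn.Linear(hid_dim, vocab_size)

    def forward(self, word_ids, feats, topics, state):
        h, c = state
        v_ctx, _ = self.visual_att(feats, h)   # attended visual information
        t_ctx, _ = self.topic_att(topics, h)   # attended scene-semantic information
        x = torch.cat([self.embed(word_ids), v_ctx, t_ctx], dim=1)
        h, c = self.lstm(x, (h, c))
        return self.out(h), (h, c)


if __name__ == "__main__":
    B, R, T = 2, 36, 5                       # batch, image regions, topic words (assumed)
    step = CaptionStep(feat_dim=2048, topic_dim=300, hid_dim=512,
                       att_dim=512, vocab_size=10000, emb_dim=512)
    feats = torch.randn(B, R, 2048)          # e.g., region features from a CNN detector
    topics = torch.randn(B, T, 300)          # e.g., embeddings of scene topic words
    state = (torch.zeros(B, 512), torch.zeros(B, 512))
    logits, state = step(torch.zeros(B, dtype=torch.long), feats, topics, state)
    print(logits.shape)                      # torch.Size([2, 10000])
```

Here beta acts like an inverse softmax temperature: values above 1 concentrate the attention weights on a few regions, while values below 1 spread them more evenly, which corresponds to automatically adjusting the focus intensity from the model's context at each timestep.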