Search results - Inner Mongolia University Library

19th ACM International Conference on multimedia ACM multimedia 2011, MM'11

Authors: Ushiku, Yoshitaka; Harada, Tatsuya; Kuniyoshi, Yasuo (Grad. School of Information Science and Technology, University of Tokyo, Japan; Grad. School of Information Science and Technology, University of Tokyo / JST PRESTO, Japan)

ISBN: (print) 9781450306164

We propose a novel system which generates sentential captions for general images. For people to use numerous images effectively on the web, technologies must be able to explain image contents and must be capable of searching for data that users need. Moreover, images must be described with natural sentences based not only on the names of objects contained in an image but also on their mutual relations. The proposed system uses general images and captions available on the web as training data to generate captions for new images. Furthermore, because the learning cost is independent from the amount of data, the system has scalability, which makes it useful with large-scale data. © 2011 ACM.

Keywords: multi-stack decoding; probabilistic canonical correlation analysis; similarity measure

Automatic sentence generation from images

19th ACM International Conference on multimedia ACM multimedia 2011, MM'11

ISBN: (print) 9781450306164

For the overwhelming amounts of multimedia used on the Web, methods of search and understanding with sentences are necessary. Representing the contents not only using labels but also using sentences including labels' relations enables users to search with a story and to understand multimedia deeply. However, few existing works describe such sentences because obtaining objects' relations and grammar is difficult. We specifically examine captions of images that are similar to an input image. They are expected to explain the input image to some degree. Therefore, we propose a novel approach to generate a sentential caption for the input image by summarizing those captions. Our experiment using a dataset consisting of images and text demonstrates that the proposed method can generate sentential captions. Copyright 2011 ACM.

Keywords: probabilistic canonical correlation analysis; similarity measure; multi-stack decoding

Improving the multi-stack decoding algorithm in a segment-based speech recognizer

16th International Conference on Industrial and Engineering Applications of Artificial Intelligence and Expert Systems

Authors: Gosztolya, G; Kocsor, A (Hungarian Acad Sci, Res Grp Artificial Intelligence, H-6720 Szeged, Hungary; Univ Szeged, H-6720 Szeged, Hungary)

ISBN: (print) 3540404554

During automatic speech recognition, selecting the best hypothesis over a combinatorially huge hypothesis space is a very hard task, so selecting fast and efficient heuristics is a reasonable strategy. In this paper a general purpose heuristic, the multi-stack decoding method, was refined in several ways. For comparison, these improved methods were tested along with the well-known Viterbi beam search algorithm on a Hungarian number recognition task where the aim was to minimize the scanned hypothesis elements during the search process. The test showed that our method runs 6 times faster than the basic multi-stack decoding method, and 9 times faster than the Viterbi beam search method.

Keywords: search methods; segmental speech model; speech recognition; multi-stack decoding; Viterbi beam search

A Hierarchical Evaluation Methodology in Speech Recognition

Acta Cybernetica, 2005, vol. 17, no. 2, pp. 213-224

Authors: Gosztolya, Gabor; Kocsor, Andras (Hungarian Acad Sci, Res Grp Artificial Intelligence, Budapest, Hungary; Univ Szeged, H-6720 Szeged, Hungary)

In speech recognition vast hypothesis spaces are generated, so the search methods used and their speedup techniques are both of great importance. One way of getting a speedup gain is to search in multiple steps. In this multi-pass search technique the first steps use only a rough estimate, while the latter steps apply the results of the previous ones. To construct these raw tests we use simplified phoneme groups which are based on some distance function defined over phonemes. The tests we performed show that this technique could significantly speed up the recognition process.

Keywords: speech recognition; search methods; multi-stack decoding; multi-pass search; phoneme grouping

Various robust search methods in a Hungarian speech recognition system

Acta Cybernetica, 2003, vol. 16, no. 2, pp. 229-240

Authors: Gosztolya, Gábor; Kocsor, András; Tóth, László; Felföldi, László (Research Group on Artificial Intell., Hungarian Academy of Sciences, University of Szeged, Aradi Vértanúk tere 1., H-6720 Szeged, Hungary; Department of Informatics, University of Szeged, Arpád tér 2., H-6720 Szeged, Hungary)

This work focuses on the search aspect of speech recognition. We describe some standard algorithms such as stack decoding, multi-stack decoding, the Viterbi beam search and an A* heuristic, then present improvements on these search methods. Finally we compare the performance of each algorithm, grading them according to their performance. We will show that our improvements can outperform the standard methods.

Keywords: multi-stack decoding; search methods; stack decoding; Viterbi beam search
