We propose a novel system which generates sentential captions for general images. For people to use numerous images effectively on the web, technologies must be able to explain image contents and must be capable of se...
ISBN: 9781450306164 (print)
The overwhelming amount of multimedia on the Web makes methods for sentence-based search and understanding necessary. Representing content not only with labels but also with sentences that express the relations among labels enables users to search with a story and to understand multimedia deeply. However, few existing works generate such sentences, because obtaining the relations among objects and the grammar is difficult. We specifically examine the captions of images that are similar to an input image, which are expected to explain the input image to some degree. We therefore propose a novel approach that generates a sentential caption for the input image by summarizing those captions. Our experiment on a dataset consisting of images and text demonstrates that the proposed method can generate sentential captions. Copyright 2011 ACM.
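The abstract does not say how similar images are found or how their captions are summarized, so the following is only a toy sketch of the general idea and not the authors' method. It assumes images are represented by hypothetical feature vectors, retrieves the captions of the nearest neighbours, and then, as a simple stand-in for summarization, selects the caption whose words are most widely shared among those neighbours. The names `nearest_captions`, `summarize_by_consensus`, and `TOY_DATASET` are illustrative.

```python
from collections import Counter

def nearest_captions(query, dataset, k):
    """Return the captions of the k images whose features are closest to query."""
    def squared_distance(item):
        features, _caption = item
        return sum((q - f) ** 2 for q, f in zip(query, features))
    ranked = sorted(dataset, key=squared_distance)
    return [caption for _features, caption in ranked[:k]]

def summarize_by_consensus(captions):
    """Pick the caption whose words appear in the most neighbouring captions.

    This is extractive selection, a stand-in for real summarization.
    """
    # For each word, count how many captions contain it.
    doc_freq = Counter(w for c in captions for w in set(c.lower().split()))

    def score(caption):
        words = set(caption.lower().split())
        return sum(doc_freq[w] for w in words) / len(words)

    return max(captions, key=score)

# Hypothetical data: (feature vector, caption) pairs.
TOY_DATASET = [
    ((0.0, 0.0), "a dog runs on the grass"),
    ((0.1, 0.0), "a dog plays on the grass"),
    ((0.0, 0.1), "a brown dog on grass"),
    ((5.0, 5.0), "a car parked on the street"),
]

neighbours = nearest_captions((0.05, 0.05), TOY_DATASET, k=3)
caption = summarize_by_consensus(neighbours)
```

A system that actually generates new sentences would have to combine words and relations from several captions rather than return one of them unchanged.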
ISBN: 3540404554 (print)
During automatic speech recognition, selecting the best hypothesis from a combinatorially huge hypothesis space is a very hard task, so using fast and efficient heuristics is a reasonable strategy. In this paper, a general-purpose heuristic, the multi-stack decoding method, was refined in several ways. For comparison, these improved methods were tested along with the well-known Viterbi beam search algorithm on a Hungarian number recognition task, where the aim was to minimize the number of hypothesis elements scanned during the search process. The test showed that our method runs 6 times faster than the basic multi-stack decoding method and 9 times faster than the Viterbi beam search method.
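As a rough illustration of the baseline mentioned above, here is a minimal sketch of Viterbi beam search over a toy hidden Markov model. It is not the paper's implementation: the two-state model, the probabilities, and the function names are all assumptions made for the example. The sketch counts how many hypothesis extensions are scored, which is the kind of quantity the abstract says the experiments aimed to minimize.

```python
import math

def prune(hyps, beam_width):
    """Keep only the beam_width best-scoring hypotheses."""
    best = sorted(hyps.items(), key=lambda kv: kv[1][0], reverse=True)
    return dict(best[:beam_width])

def viterbi_beam_search(observations, states, log_start, log_trans, log_emit,
                        beam_width):
    """Return (best_path, best_log_prob, expanded).

    expanded is the number of hypothesis extensions that were scored.
    """
    expanded = 0
    beam = {}
    for s in states:
        expanded += 1
        beam[s] = (log_start[s] + log_emit[s][observations[0]], [s])
    beam = prune(beam, beam_width)

    for obs in observations[1:]:
        candidates = {}
        for prev, (log_prob, path) in beam.items():
            for s in states:
                expanded += 1
                score = log_prob + log_trans[prev][s] + log_emit[s][obs]
                # Viterbi recombination: keep the best path into each state.
                if s not in candidates or score > candidates[s][0]:
                    candidates[s] = (score, path + [s])
        beam = prune(candidates, beam_width)

    best_state = max(beam, key=lambda s: beam[s][0])
    best_log_prob, best_path = beam[best_state]
    return best_path, best_log_prob, expanded

# Hypothetical toy model: state A mostly emits "x", state B mostly emits "y".
STATES = ["A", "B"]
LOG_START = {s: math.log(0.5) for s in STATES}
LOG_TRANS = {p: {s: math.log(0.5) for s in STATES} for p in STATES}
LOG_EMIT = {
    "A": {"x": math.log(0.9), "y": math.log(0.1)},
    "B": {"x": math.log(0.1), "y": math.log(0.9)},
}
OBS = ["x", "y", "x"]

wide_path, _, wide_count = viterbi_beam_search(
    OBS, STATES, LOG_START, LOG_TRANS, LOG_EMIT, beam_width=2)
narrow_path, _, narrow_count = viterbi_beam_search(
    OBS, STATES, LOG_START, LOG_TRANS, LOG_EMIT, beam_width=1)
```

With a beam as wide as the state set the search is exact; a narrower beam scores fewer extensions but may, in general, discard the path that would have turned out best.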
In speech recognition, vast hypothesis spaces are generated, so the search methods used and their speedup techniques are both of great importance. One way of gaining speed is to search in multiple steps. In this multi-pass search technique, the first steps use only a rough estimate, while the later steps apply the results of the previous ones. To construct these rough tests, we use simplified phoneme groups based on a distance function defined over phonemes. The tests we performed show that this technique can significantly speed up the recognition process.
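The abstract does not specify the distance function or the grouping procedure, so the following is only a sketch under assumptions: phonemes are described by hypothetical two-dimensional feature vectors, the distance is Euclidean, and groups are formed greedily by comparing each phoneme with the first member of each existing group. The resulting phoneme-to-group map is the kind of simplification a coarse first pass could score before a later pass examines individual phonemes.

```python
def euclidean(a, b):
    """Euclidean distance between two equal-length feature tuples."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def group_phonemes(features, distance, threshold):
    """Greedily cluster phonemes; a group's first member is its representative."""
    groups = []
    for phoneme in features:
        for group in groups:
            representative = group[0]
            if distance(features[phoneme], features[representative]) <= threshold:
                group.append(phoneme)
                break
        else:
            # No existing group is close enough, so start a new one.
            groups.append([phoneme])
    return groups

# Hypothetical feature vectors, not real acoustic measurements.
TOY_FEATURES = {
    "p": (0.0, 0.0),
    "b": (0.1, 0.0),
    "t": (0.0, 0.2),
    "a": (1.0, 1.0),
    "e": (1.1, 0.9),
}

groups = group_phonemes(TOY_FEATURES, euclidean, threshold=0.3)

# Map each phoneme to its group index for use in a coarse first pass.
phoneme_to_group = {ph: i for i, group in enumerate(groups) for ph in group}
```

The greedy scheme depends on the order in which phonemes are visited; a proper clustering method would be a reasonable replacement in practice.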
This work focuses on the search aspect of speech recognition. We describe some standard algorithms, such as stack decoding, multi-stack decoding, the Viterbi beam search, and an A* heuristic, then present improvements on these search methods. Finally, we compare the algorithms and rank them according to their performance. We show that our improved methods can outperform the standard ones.