It is well known that speech sounds evolve at multiple timescales over the course of tens to hundreds of milliseconds. Such temporal modulations are crucial for speech perception and are believed to directly influence...
ISBN: 9781617827457 (print)
Are conclusions about loudness drawn from tones presented via earphones in laboratories applicable to listening to a talker in a room? The present experiment tests the following hypothesis: speech from the same talkers presented under more ecologically valid conditions results in a smaller binaural-to-monaural loudness ratio than speech presented without visual cues and/or presented via headphones. Twelve normal listeners were presented with two types of stimuli (recorded speech, with and without visual cues) monaurally and binaurally across a wide range of levels. The same stimuli were presented via earphones and loudspeakers. Loudness was measured using magnitude estimation. Results show that the binaural-to-monaural loudness ratio was significantly smaller for speech with visual cues presented via a loudspeaker than for stimuli with any other combination of test parameters (i.e., speech without visual cues presented via both headphones and loudspeakers, and speech presented with visual cues via headphones). The present results indicate that the loudness of a visually present talker in daily environments is little affected by switching between binaural and monaural listening. This phenomenon has been dubbed "Binaural Loudness Constancy" because of its similarity to the loudness constancy that occurs with distance from the speaker. The present experiment supports the importance of ecological validity in loudness research, which could change how perception of loudness is understood.
Community Coordinated Multimedia (CCM) provides an extended and enhanced human experience by collaboratively consuming electronic and networked content and multimedia-intensive services. Community coordinated multimed...
We propose to use graph-based diffusion techniques with data-dependent kernels to build unigram language models. Our approach entails building graphs in which each vertex corresponds uniquely to a word from a closed vocabulary, and the existence of an edge (with an appropriate weight) between two words indicates some form of similarity between them. In one of our constructions, we place an edge between two words if the number of times these words were seen in a training set differs by at most one count. This graph construction results in a similarity matrix with small intrinsic dimension, since words with the same counts have the same neighbors. Experimental results on a benchmark language modeling task show that our method is competitive with the Good-Turing estimator.
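The count-based graph construction described in this abstract can be illustrated with a small sketch. The edge rule (counts differing by at most one) comes from the abstract; everything else is an assumption made for illustration, including the function names `count_graph` and `diffuse`, the unit edge weights, the added self-loops, and the single-step random-walk smoothing. The abstract does not specify the kernel, so this should not be read as the authors' method.

```python
from collections import Counter

import numpy as np


def count_graph(tokens, vocab):
    """Return word counts and an adjacency matrix over a closed vocabulary.

    Two distinct words are connected when their training counts differ
    by at most one (the construction described in the abstract).
    """
    counts = Counter(tokens)
    c = np.array([counts[w] for w in vocab], dtype=float)
    adj = (np.abs(c[:, None] - c[None, :]) <= 1).astype(float)
    np.fill_diagonal(adj, 0.0)  # no self-loops in the raw graph
    return c, adj


def diffuse(c, adj, steps=1):
    """Smooth relative frequencies over the graph (hypothetical kernel).

    Assumption: a row-normalized random-walk kernel with self-loops.
    The abstract does not say which data-dependent kernel is used.
    """
    p = c / c.sum()  # maximum-likelihood unigram estimate
    w = adj + np.eye(len(c))  # self-loops keep every row non-empty
    kernel = w / w.sum(axis=1, keepdims=True)
    for _ in range(steps):
        p = kernel @ p  # each word takes the mean over its neighborhood
    return p / p.sum()  # renormalize to a probability distribution


tokens = "a a a b b c d".split()
vocab = ["a", "b", "c", "d", "e"]  # "e" is unseen in the training tokens
c, adj = count_graph(tokens, vocab)
p = diffuse(c, adj)
```

In this toy example, "c" and "d" both occur once, so their rows are identical once self-loops are added, which is the redundancy behind the small intrinsic dimension noted in the abstract. The unseen word "e" receives non-zero probability because it is linked to the words seen once.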
ISBN: 1595932100 (print)
Molecular substructure mining is currently an intensively studied research area. In this paper we present an implementation of an algorithm for finding frequent substructures in a set of molecules, which may also be used to find substructures that discriminate well between a focus and a complement group. In addition to the basic algorithm, we discuss advanced pruning techniques, demonstrating their effectiveness with experiments on two publicly available molecular data sets, and briefly mention some other extensions. Copyright 2005 ACM.
ISBN: 1581138814 (print)
Information retrieval using word senses is emerging as a good research challenge in semantic information retrieval. In this paper, we propose a new method for using word senses in information retrieval: the root sense tagging method. This method assigns coarse-grained word senses defined in WordNet to query terms and document terms in an unsupervised way, using automatically constructed co-occurrence information. Our sense tagger is crude, but it performs consistent disambiguation by considering only the single most informative word as evidence to disambiguate the target word. We also allow multiple-sense assignment to alleviate the problem caused by incorrect disambiguation. Experimental results on a large-scale TREC collection show that our approach successfully improves retrieval effectiveness, whereas most previous work failed to improve performance even on small text collections. Our method also shows promising results when combined with pseudo relevance feedback and a state-of-the-art retrieval function such as BM25.
Human papillomavirus (HPV) is considered to be the most common sexually transmitted disease, and HPV infection is known to be the major factor in cervical cancer. There are more than 100 types of HPV and each HPV ...
In this study, we discuss a number of issues in audio stream phrase recognition for information retrieval for a new National Gallery of the Spoken Word (NGSW). NGSW is the first large-scale repository of its kind, consisting of speeches, news broadcasts, and recordings of historical content from the 20th Century. We propose a system diagram and discuss critical tasks associated with effective audio information retrieval, which include advanced audio segmentation, speech recognition model adaptation for acoustic background noise and speaker variability, and natural language processing for text query requests. A number of questions regarding copyright assessment, metadata construction, and digital watermarking must also be addressed for a sustainable audio collection of this magnitude. We present our experimental online system, "SpeechFind," which allows audio retrieval from a portion of the NGSW corpus. We discuss a number of research challenges to address the overall task of robust phrase searching in unrestricted audio corpora.
Automatic data entry plays an important role in improving the speed and effectiveness of information technology. In order to recognize forms and join results into a database, it is necessary to correctly isolate geometr...