Search results - Inner Mongolia University Library

Pinyin-indexed method for approximate matching in Chinese

Qinghua Daxue Xuebao/Journal of Tsinghua University, 2009, vol. 49, no. SUPPL. 1, pp. 1328-1332

Authors: Cao, Jiang; Wu, Xiaojun; Xia, Yunqing; Zheng, Fang (Department of Computer Science and Technology, Tsinghua University, Beijing 100084, China; Center for Speech and Language Technologies, Division of Technical Innovation and Development, Tsinghua National Laboratory for Information Science and Technology, Beijing 100084, China)

Exact keyword matching is central to popular commercial search engines. A Chinese approximate matching method with an index structure was developed to improve retrieval when the input contains errors. Three similarity measures between two Chinese strings were developed, based on the character edit-distance, the Pinyin edit-distance and the Pinyin improved edit-distance. The similarity measures are used to expand the user's query so that the approximate matching task can be represented as several exact matching sub-tasks. The results of these exact matches are merged and sorted by their similarity to the original query. Tests on a webpage text database gave a 50.4% recall rate and a 60.4% precision with the Pinyin improved edit-distance, with only a small increase in time and space complexity.

Keywords: Search engines
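The abstract above contrasts a character edit-distance with a Pinyin edit-distance. The following is a minimal sketch of that contrast, not the paper's method: it uses plain Levenshtein distance, a hypothetical six-character `TOY_PINYIN` table in place of a real lexicon, and an assumed normalisation (`1 - distance / longest length`). The "Pinyin improved edit-distance" is not specified in the abstract and is not attempted here.

```python
from typing import List, Sequence

# Hypothetical toy table (toneless Pinyin); a real system would use a full lexicon.
TOY_PINYIN = {
    "清": "qing", "青": "qing",
    "华": "hua", "花": "hua",
    "大": "da", "学": "xue",
}


def edit_distance(a: Sequence, b: Sequence) -> int:
    """Classic Levenshtein distance between two sequences."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def similarity(a: Sequence, b: Sequence) -> float:
    """Assumed normalisation: 1.0 for identical, 0.0 for fully different."""
    longest = max(len(a), len(b))
    return 1.0 if longest == 0 else 1.0 - edit_distance(a, b) / longest


def to_pinyin(text: str) -> List[str]:
    """Map each character to a Pinyin syllable; unknown characters pass through."""
    return [TOY_PINYIN.get(ch, ch) for ch in text]


def char_similarity(a: str, b: str) -> float:
    return similarity(a, b)


def pinyin_similarity(a: str, b: str) -> float:
    return similarity(to_pinyin(a), to_pinyin(b))
```

With this toy table, a homophone typo such as "青花大学" for "清华大学" scores 0.5 on characters but 1.0 on Pinyin syllables, which is the kind of input error a Pinyin-based measure is meant to tolerate.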

Linear scaling based dynamic programming algorithm for accurate matching in QBH

Qinghua Daxue Xuebao/Journal of Tsinghua University, 2009, vol. 49, no. SUPPL. 1, pp. 1402-1407

Authors: Cao, Wenxiao; Liu, Yi; Zheng, Fang; Jiang, Danning; Qin, Yong (Center for Speech and Language Technologies, Research Institute of Information Technology, Tsinghua University, Beijing 100084, China; Division of Technology Innovation and Development, Center for Speech and Language Technologies, Tsinghua National Laboratory for Information Science and Technology, Beijing 100084, China; IBM China Research Lab, Beijing 100094, China)

A linear scaling (LS) based dynamic programming (DP) algorithm was developed for accurate matching in query by humming (QBH). The query contours are split into phrases, and the LS match is calculated for each phrase. Dynamic programming is then applied across all the phrases to choose the optimal matching path. The algorithm handles the query contours of the individual phrases more efficiently, thus overcoming the tendency of dynamic programming to miss the globally optimal path in long path matching. Tests on a database of 5,223 MIDI files show that the algorithm outperforms the traditional LS method by 10.5%, the DP method by 6.0% and recursive alignment by 2.8% in top-1 rate. Thus, the algorithm is more efficient and more accurate while being less expensive.

Keywords: Dynamic programming
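For readers unfamiliar with linear scaling, the sketch below shows only the basic LS step that the abstract builds on: stretch or compress the hummed pitch contour by a few factors and keep the best match against the start of a target melody. The phrase splitting and the DP over phrases described in the abstract are not reproduced. The scale factors, the mean-normalisation for key invariance, and the mean absolute difference are all assumptions made for illustration.

```python
from typing import List, Sequence


def resample(contour: Sequence[float], length: int) -> List[float]:
    """Linearly interpolate a pitch contour to exactly `length` points."""
    if length < 1 or not contour:
        raise ValueError("need a non-empty contour and length >= 1")
    if len(contour) == 1 or length == 1:
        return [float(contour[0])] * length
    step = (len(contour) - 1) / (length - 1)
    out = []
    for i in range(length):
        pos = i * step
        lo = min(int(pos), len(contour) - 2)  # clamp so lo + 1 stays in range
        frac = pos - lo
        out.append(contour[lo] * (1 - frac) + contour[lo + 1] * frac)
    return out


def ls_distance(query: Sequence[float], target: Sequence[float],
                scales: Sequence[float] = (0.5, 0.75, 1.0, 1.25, 1.5)) -> float:
    """Best mean absolute pitch difference over several stretch factors.

    Both contours are mean-normalised so that humming in a different key
    does not count as an error. Returns infinity if no scale fits.
    """
    best = float("inf")
    for s in scales:
        n = max(2, round(len(query) * s))
        if n > len(target):
            continue  # scaled query would run past the target
        q = resample(query, n)
        t = list(target[:n])
        q_mean = sum(q) / n
        t_mean = sum(t) / n
        d = sum(abs((a - q_mean) - (b - t_mean)) for a, b in zip(q, t)) / n
        best = min(best, d)
    return best
```

In this sketch a query hummed an octave higher and 1.5 times faster than the target still matches with distance 0, because the 1.5 scale factor undoes the tempo change and the mean-normalisation undoes the transposition.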

Automatic grammar inference based on sentence segmentation for spoken Chinese

Qinghua Daxue Xuebao/Journal of Tsinghua University, 2009, vol. 49, no. SUPPL. 1, pp. 1322-1327

Authors: Zhang, He; Wu, Xiaojun; Wang, Xiaodong; Zheng, Fang (College of Computer and Information Technology, Henan Normal University, Xinxiang 453007, China; Center for Speech and Language Technologies, Tsinghua National Laboratory for Information Science and Development, Tsinghua University, Beijing 100084, China)

The grammar for spoken dialogue systems for information enquiry is often manually designed by experts. An automatic grammar inference method based on sentence segmentation was developed using an enhanced context-free grammar for spoken Chinese. The system parses the training sentences with an initial rule set. If the parsed syntactic tree is incomplete, the top-most constituents are used to recursively infer the missing rules after disambiguation and normalization, and the rule set is then updated. The output grammar is improved by adjusting the processing order of the training sentences to refine the process. Evaluations on weather forecast enquiries gave a parsing accuracy for the output grammar of 64.8% with an empty initial rule set and 86.4% with an initial rule set that included only rules for date descriptions.

Keywords: Context free grammars

VP-tree based multi-stage matching algorithm for query-by-humming systems

Qinghua Daxue Xuebao/Journal of Tsinghua University, 2009, vol. 49, no. SUPPL. 1, pp. 1419-1424

Authors: Hou, Jue; Liu, Yi; Zheng, Fang; Jiang, Danning; Qin, Yong; Huang, Shilei; Liu, Yong (Center for Speech and Language Technologies, Research Institute of Information Technology, Tsinghua University, Beijing 100084, China; Center for Speech and Language Technologies, Division of Technology Innovation and Development, Tsinghua National Laboratory for Information Science and Technology, Beijing 100084, China; IBM China Research Lab, Beijing 100094, China; PKU HKUST Shenzhen Hong Kong Institution, Shenzhen 518057, China)

Query by humming (QBH) is an important application of musical information retrieval. The key challenges in QBH are the unstructured data modules in audio songs and the balance between search speed and accuracy. This paper presents a data structure for audio songs that uses a hand labeling method to label the melody and to divide the songs into natural segments. The search index uses the segmentation structure rather than the entire lyrics of the song. The system generates a VP-tree search structure with a multi-level search algorithm that combines coarse searching for a fast match with dynamic time warping (DTW) for a fine match. In evaluations with 2,213 melody segments, the search time was reduced by over 40% without greatly reducing the recognition accuracy.

Keywords: Music
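The abstract names a VP-tree (vantage-point tree) as the coarse search structure. As a rough illustration of how such a tree prunes a nearest-neighbour search, here is a minimal sketch. It assumes fixed-length feature vectors and Euclidean distance, which is a simplification chosen for this example: the paper's segment features, its multi-stage scheme, and its DTW fine match are not reproduced. All names (`VPNode`, `build`, `nearest`) are hypothetical.

```python
import math
import random
from dataclasses import dataclass
from typing import List, Optional, Sequence, Tuple

Point = Sequence[float]


def euclidean(a: Point, b: Point) -> float:
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


@dataclass
class VPNode:
    point: Point                 # the vantage point
    radius: float                # median distance from the vantage point
    inside: Optional["VPNode"]   # points at distance <= radius
    outside: Optional["VPNode"]  # points at distance >= radius


def build(points: List[Point], rng: Optional[random.Random] = None) -> Optional[VPNode]:
    """Recursively split the points around a randomly chosen vantage point."""
    if not points:
        return None
    rng = rng or random.Random(0)
    rest = list(points)
    vp = rest.pop(rng.randrange(len(rest)))
    if not rest:
        return VPNode(vp, 0.0, None, None)
    dists = sorted(((euclidean(vp, p), p) for p in rest), key=lambda dp: dp[0])
    mid = len(dists) // 2
    radius = dists[mid][0]
    inside = [p for _, p in dists[:mid]]
    outside = [p for _, p in dists[mid:]]
    return VPNode(vp, radius, build(inside, rng), build(outside, rng))


def nearest(root: Optional[VPNode], query: Point) -> Tuple[Optional[Point], float]:
    """Return (closest point, distance); subtrees are skipped via the triangle inequality."""
    best: List = [math.inf, None]

    def visit(node: Optional[VPNode]) -> None:
        if node is None:
            return
        d = euclidean(query, node.point)
        if d < best[0]:
            best[0], best[1] = d, node.point
        if d < node.radius:
            visit(node.inside)
            if d + best[0] >= node.radius:  # best ball crosses the boundary
                visit(node.outside)
        else:
            visit(node.outside)
            if d - best[0] <= node.radius:
                visit(node.inside)

    visit(root)
    return best[1], best[0]
```

The pruning step relies on the triangle inequality, so the distance must be a true metric. DTW does not satisfy it in general, which is one plausible reason to keep the tree for the coarse stage and apply DTW only to the surviving candidates.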

Local mismatch phone for confidence measure in standard and accented Chinese speech recognition

2008 6th International Symposium on Chinese Spoken Language Processing, ISCSLP 2008

Authors: Cao, Wenxiao; Liu, Yi; Zheng, Thomas Fang (Center for Speech and Language Technologies, Division of Technical Innovation and Development, Tsinghua National Laboratory for Information Science and Technology, Beijing, China)

ISBN: (print) 9781424429431

The high error rate in speech recognition is largely due to local phone mismatch caused by unclear speech or noise. In this paper, we propose using local mismatch phones to improve the reliability of the confidence measure. The local mismatch phone features are extracted from the recognized phone sequence by computing the occurrence frequency of each phone and comparing it with a preset threshold. The occurrence frequency is defined as the number of times the recognized phone occurs in its frame-best phone sequence, divided by the length of the interval. The frame-best phone is the symbol of the HMM state at the end of the maximum likelihood token at a given frame. The effectiveness of this feature is evaluated on standard and accented Mandarin speech databases, where it gives significant Equal Error Rate reductions of 19.7% and 8.4%, respectively. In addition to being fast to compute, the feature is independent of the acoustic model and is convenient to combine with other features. © 2008 IEEE.

Keywords: Speech recognition
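The occurrence-frequency definition in the abstract can be read as a simple ratio. The sketch below is one interpretation only: it assumes each recognized phone comes with a frame interval `[start, end)`, that the frame-best phone sequence is available as a list of labels, and that a phone is flagged when its frequency falls below the threshold. The threshold value and the direction of the comparison are assumptions, since the abstract does not state them.

```python
from typing import List, Sequence, Tuple

# (phone label, start frame, end frame), end exclusive; a hypothetical layout.
Segment = Tuple[str, int, int]


def occurrence_frequency(phone: str, frame_best: Sequence[str],
                         start: int, end: int) -> float:
    """Fraction of frames in [start, end) whose frame-best phone equals `phone`."""
    if end <= start:
        raise ValueError("interval must contain at least one frame")
    hits = sum(1 for label in frame_best[start:end] if label == phone)
    return hits / (end - start)


def flag_local_mismatch(segments: Sequence[Segment], frame_best: Sequence[str],
                        threshold: float = 0.6) -> List[str]:
    """Return recognized phones whose occurrence frequency is below `threshold`.

    Both the default threshold and the 'below means mismatch' rule are
    assumptions made for this sketch.
    """
    return [phone for phone, start, end in segments
            if occurrence_frequency(phone, frame_best, start, end) < threshold]
```

For example, with frame-best labels `["a", "a", "b", "a", "n", "n", "m", "m"]` and recognized segments `("a", 0, 4)` and `("n", 4, 8)`, the frequencies are 0.75 and 0.5, so only "n" is flagged at a threshold of 0.6.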

Automatic rule acquisition for Chinese intra-chunk relations

3rd International Joint Conference on Natural Language Processing, IJCNLP 2008

Authors: Zhou, Qiang (Center for Speech and Language Technologies, Division of Technical Innovation and Development, Tsinghua National Laboratory for Information Science and Technology, Tsinghua University, Beijing 100084, China)

Multiword chunking is defined as the task of automatically analyzing the external function and internal structure of a multiword chunk (MWC) in a sentence. To address this problem, we propose a rule acquisition algorithm that automatically learns a chunk rule base, supported by a large-scale annotated corpus and a lexical knowledge base. We also propose an expectation precision index to objectively evaluate the descriptive capability of the refined rule base. Experimental results indicate that the algorithm can acquire about 9% useful expanded rules that cover 86% of the annotated positive examples, and can improve the expectation precision from 51% to 83%. These rules can be used to build an efficient rule-based Chinese MWC parser. © 2008 IJCNLP 2008 - 3rd International Joint Conference on Natural Language Processing, Proceedings of the Conference. All rights reserved.

Keywords: Knowledge based systems

Text-dependent speaker identification using LPC and DTW for Thai language

1999 IEEE Region 10 Conference, TENCON 1999

Authors: Wutiwiwatchai, C.; Achariyakulporn, V.; Tanprasert, C. (Software and Language Engineering Laboratory, National Electronics and Computer Technology Center, National Science and Technology Development Agency, Ministry of Science Technology and Environment, 22nd Floor, Gypsum-Metropolitan Tower, Sri-Ayudhaya Rd., Phayathai, Bangkok 10400, Thailand)

ISBN: (print) 0780357396

This paper proposes a text-dependent speaker identification system for the Thai language. Isolated digits 0-9 and their concatenations are used as the spoken text. Linear prediction coefficients (LPC) are extracted and formed into feature vectors representing each speech signal. Dynamic time warping (DTW) is used to measure distances between reference and evaluated vectors. These distances, which indicate how near the unknown vectors are to the references, are combined with the K-nearest neighbor (KNN) decision technique to decide which speaker the unknown vectors belong to. The experimental results show that the best identification rate for a single digit is 95.83%, and that the highest rates for concatenated digits at top-3, top-5, and top-7 are 98.75%, 100%, and 99.20%, respectively. © 1999 IEEE.

Keywords: Vectors
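The DTW-plus-KNN decision described in the abstract can be sketched as follows. This is an illustrative outline under stated assumptions, not the paper's system: LPC extraction is omitted and each utterance is assumed to be already available as a sequence of numeric feature vectors; the local cost is Euclidean distance; the step pattern is the basic three-neighbour one; and the function names and default `k` are hypothetical.

```python
import math
from collections import Counter
from typing import List, Sequence, Tuple

Frame = Sequence[float]


def frame_cost(a: Frame, b: Frame) -> float:
    """Euclidean distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def dtw_distance(a: Sequence[Frame], b: Sequence[Frame]) -> float:
    """Dynamic time warping cost between two sequences of feature vectors."""
    m = len(b)
    prev = [0.0] + [math.inf] * m  # row 0: only the empty-empty alignment is free
    for i in range(1, len(a) + 1):
        curr = [math.inf] * (m + 1)
        for j in range(1, m + 1):
            cost = frame_cost(a[i - 1], b[j - 1])
            curr[j] = cost + min(prev[j],      # a advances, b repeats
                                 curr[j - 1],  # b advances, a repeats
                                 prev[j - 1])  # both advance
        prev = curr
    return prev[m]


def knn_identify(unknown: Sequence[Frame],
                 references: Sequence[Tuple[str, Sequence[Frame]]],
                 k: int = 3) -> str:
    """Majority vote among the k references closest to `unknown` under DTW."""
    scored = sorted((dtw_distance(unknown, ref), speaker)
                    for speaker, ref in references)
    votes = Counter(speaker for _, speaker in scored[:k])
    return votes.most_common(1)[0][0]
```

A short usage example with one-dimensional placeholder features: given references `("A", [[0], [1], [2]])`, `("A", [[0], [1], [1], [2]])`, `("B", [[5], [6], [7]])` and `("B", [[5], [5], [6], [7]])`, the unknown sequence `[[0], [0], [1], [2]]` warps onto both "A" references at zero cost, so `knn_identify` returns "A".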

Text-dependent speaker identification using neural network on distinctive Thai tone marks

International Joint Conference on Neural Networks (IJCNN)

Authors: C. Tanprasert; C. Wutiwiwatchai; Sutat Sae-Tang (Software and Language Engineering Laboratory, National Electronics and Computer Technology Center, Ministry of Science Technology and Environment, National Science and Technology Development Agency, Bangkok, Thailand)

This paper presents a neural network based text-dependent speaker identification system for the Thai language. Linear prediction coefficients (LPC) are extracted from the speech signal and formed into feature vectors. These features are fed into a multilayer perceptron (MLP) neural network with a backpropagation learning algorithm for training and identification. The five Thai tone marks are considered closely when choosing the sentences in order to achieve the best speaker identification accuracy. Five spoken texts, one for each Thai tone, and a mixed-tone text are compared experimentally. The average identification rate over 9 speakers is above 95% with the mixed-tone text, while poor results occur with the middle and low tone texts, which usually produce vague or unclear voices.

Keywords: Neural networks; Speech; Linear predictive coding; Frequency; Natural languages; Paper technology; Signal processing; Backpropagation; Speaker recognition; Feature extraction

Text-dependent speaker identification using LPC and DTW for Thai language

IEEE Region 10 International Conference TENCON

Authors: C. Wutiwiwatchai; V. Achariyakulporn; C. Tanprasert (Software and Language Engineering Laboratory, National Electronics and Computer Technology Center, National Science and Technology Development Agency, Ministry of Science Technology and Environment, Bangkok, Thailand)

Keywords: Linear predictive coding; Natural languages; Speaker recognition; Feature extraction; Artificial neural networks; Vectors; Loudspeakers; Isolation technology; Pattern matching; Hidden Markov models

Advances in Natural Language Processing, Intelligent Informatics and Smart Technology

Series: Advances in Intelligent Systems and Computing

Year: 1000

Authors: Thanaruk Theeramunkong; Rachada Kongkachandra; Thepchai Supnithi
