Search Results - Inner Mongolia University Library

Corrigendum to “Self-adaptive statistical process control for anomaly detection in time series” [Expert Systems With Applications 57 (2016) 324–336]

Expert Systems with Applications, 2016, vol. 62, p. 385

Authors: Dequan Zheng, Fenghuan Li, Tiejun Zhao (MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin 150001, PR China; School of Software Engineering, South China University of Technology, Guangzhou 510006, PR China)

Detection on Inconsistency of Verb Phrase in TreeBank

3rd CIPS-SIGHAN Joint Conference on Chinese Language Processing, CLP 2014

Authors: Duan, Chaoqun; Zheng, Dequan; Zhu, Conghui; Li, Sheng; Tan, Hongye (MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin 150001, China; Key Lab. of Computational Intelligence and Chinese Information Processing of Ministry of Education, Shanxi University, Taiyuan 030006, China)

Annotating linguistic data is often a complex, time-consuming and expensive endeavor. Even with strict annotation guidelines, human subjects often deviate in their analyses, each bringing different biases, interpretations of the task and levels of consistency. The aim of this paper is to explore a way to find inconsistencies in the TreeBank corpus, which is used for syntactic analysis, by studying the inconsistencies of verb phrase tagging in that corpus. We also analyze the verb phrase tagging inconsistencies found in TreeBank in order to find a way to improve the consistency of verb phrase tagging automatically, which is effective for improving the quality of the corpus. © 2014 CLP 2014 - 3rd CIPS-SIGHAN Joint Conference on Chinese Language Processing. All rights reserved.

Keywords: Syntactics

Microblog-Oriented Backbone Nodes Identification in Public Opinion Diffusion

International Conference on Audio, Language and Image Processing

Authors: Wanlong Sun, Dequan Zheng, Xinchen Hu, Tiejun Zhao (MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology)

ISBN (print): 9781479939046

Backbone nodes in public opinion diffusion can help people understand how opinion spreads. Previous work relies on how opinion diffuses over time, which yields disappointing results. This paper presents a novel method for identifying backbone nodes in public opinion diffusion, which can be applied to different platforms. We take Sina microblog as an example platform. Besides traditional factors, our model takes personal contribution degree and physical contribution degree into consideration by estimating personal features and diffusion scale, respectively. Finally, we employ a visual graph to show a person's role in public opinion diffusion intuitively. Experimental results show that this method performs well in identifying backbone nodes.

Keywords: Microblog; Public opinion transmission; Backbone node; Contribution degree

Topic model-based micro-blog user interest analysis

International Conference on Audio, Language and Image Processing, ICALIP

Authors: Xinchen Hu, Dequan Zheng, Wanglong Sun, Sheng Li (MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, China)

ISBN (print): 9781479939046

As a popular Internet information exchange platform, micro-blogs such as Twitter attract a large number of users who share information through short and noisy messages. In this paper, we aim to discover micro-blog users' interests using a topic model. In the topic model, user metadata such as labels are taken as new features and put into the user document, which is then used to infer the user's interests. Experimental results indicate that this method yields satisfactory user interests and is suitable for real-world projects. This paper also introduces two applications based on the detected user interests: 1) keyword extraction based on interest (we calculate word entropy using the word-topic distribution as a new feature); 2) user clustering based on user interest.

Keywords: Load modeling; Data models; Analytical models; Training; Semantics; Entropy; Training data
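
The abstract above mentions computing word entropy from a word-topic distribution as a keyword-extraction feature, but the record does not say how. The following is only a minimal sketch of one plausible reading: it assumes each word comes with a list of non-negative per-topic weights (for example from a trained topic model), normalizes them into a distribution, and takes the Shannon entropy. The function names `word_topic_entropy` and `rank_keywords`, and the idea that low-entropy words are the better keyword candidates, are assumptions made for illustration and are not taken from the paper.

```python
import math


def word_topic_entropy(topic_weights):
    """Shannon entropy (in bits) of one word's distribution over topics.

    topic_weights: non-negative weights of the word under each topic.
    They are normalized here, so they do not need to sum to 1.
    """
    total = sum(topic_weights)
    if total <= 0:
        raise ValueError("topic_weights must contain a positive value")
    probs = [w / total for w in topic_weights if w > 0]
    return -sum(p * math.log2(p) for p in probs)


def rank_keywords(word_to_topic_weights):
    """Sort words by ascending entropy (most topic-specific first).

    Treating low entropy as 'good keyword' is an assumption of this sketch.
    """
    scored = [(word, word_topic_entropy(weights))
              for word, weights in word_to_topic_weights.items()]
    return sorted(scored, key=lambda pair: pair[1])
```

A word concentrated in one topic gets an entropy near 0, while a word spread evenly over four topics gets 2 bits, so `rank_keywords` places topic-specific words first.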

Measuring Domain Similarity for Statistical Machine Translation

International Conference on Fuzzy Systems and Knowledge Discovery

Authors: Lin Liu, Hailong Cao, Tiejun Zhao (MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology)

ISBN (print): 9781467352512

It is well known that statistical machine translation (SMT) performance suffers when a model is applied to out-of-domain data. It is also known that the more similar the test domain and the training domain are, the more useful the training data are for SMT performance. Hence, measuring the similarity of domains is an important step in selecting appropriate training data. The most widely used method combines the cosine similarity function with word frequency. The lack of exploration of other approaches motivates us to propose and compare several similarity measures. Aiming for better SMT performance, we compared 10 similarity measures, which are combinations of 2 feature representations and 5 similarity functions. The results show that using relative word frequency as the feature representation and skew divergence as the similarity function performs best among the 10 measures and outperforms random data selection.

Keywords: Domain adaptation; Domain similarity; Statistical machine translation (SMT)
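
To make the two measures named in the abstract above concrete, here is a small sketch that compares the cosine baseline with skew divergence over relative word frequencies. It uses the commonly cited definition of skew divergence, KL(p || alpha*q + (1 - alpha)*p). The value alpha = 0.99, the whitespace tokenization, and all function names are assumptions for illustration; the record does not give the paper's exact settings.

```python
import math
from collections import Counter


def relative_frequencies(tokens):
    """Map each word to its relative frequency in the token list."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return {word: count / total for word, count in counts.items()}


def cosine_similarity(p, q):
    """Cosine of the angle between two sparse frequency vectors."""
    dot = sum(value * q.get(word, 0.0) for word, value in p.items())
    norm_p = math.sqrt(sum(v * v for v in p.values()))
    norm_q = math.sqrt(sum(v * v for v in q.values()))
    return dot / (norm_p * norm_q) if norm_p and norm_q else 0.0


def skew_divergence(p, q, alpha=0.99):
    """KL(p || alpha*q + (1 - alpha)*p); lower means more similar.

    Mixing a little of p into q keeps the divergence finite even when
    q assigns zero probability to a word that occurs in p.
    alpha=0.99 is an assumed value, not one taken from the paper.
    """
    total = 0.0
    for word, p_w in p.items():
        mixed = alpha * q.get(word, 0.0) + (1.0 - alpha) * p_w
        total += p_w * math.log(p_w / mixed)
    return total
```

For data selection, one would compute `skew_divergence(test_domain, candidate)` for each candidate training set and prefer the candidates with the smallest values. Note that skew divergence is asymmetric, so the argument order matters.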

Multi-pattern fusion based semi-supervised Name Entity Recognition

International Conference on Machine Learning and Cybernetics (ICMLC)

Authors: Ziguang Cheng, Dequan Zheng, Sheng Li (MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, China)

ISBN (print): 9781479902590

Named Entity Recognition (NER) is one of the most important problems in natural language processing (NLP). NER also has broad application prospects and important research value. There are many methods and technologies for solving the NER problem. In this paper, for a specific application background, a new multi-pattern fusion based semi-supervised NER method is proposed. We first use a soft-matching method on the entity-internal pattern. Then, through a bootstrapping process on the training corpus, we obtain an entity-external pattern. Finally, we fuse the internal and external patterns to complete the named entity recognition. Experiments were performed on Chinese weapon names from the People's Daily corpus and some military news articles. They showed that when the internal characteristic is significant and the training corpus is highly similar to the test corpus, this method performs better than the soft-matching method and the external-pattern-based bootstrapping method, improving named entity recognition precision by 18.2%.

Keywords: Weapons; Abstracts; Supervised learning; Training data

An incremental learning strategy for search results optimization

International Conference on Machine Learning and Cybernetics (ICMLC)

Authors: Xiang Liu, Dequan Zheng, Bing Xu (MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, China)

ISBN (print): 9781479902590

Traditional search engines rarely consider features of the document set, so the retrieval results are unsatisfactory after new documents are added to the retrieval system. In this paper, we combine features of the document set with traditional retrieval models and propose an incremental learning strategy to optimize the retrieval results. We build a feature thesaurus by extracting features from the document set. We then collect new features from the newly added documents and refresh the feature thesaurus. Finally, the search results are reordered according to how well they match the feature thesaurus for a given query. Several sets of experiments show that, on the top 10 results, this method improves on traditional retrieval methods by an average of 9.4% in precision, 14.9% in MAP and 4.6% in DCG, which means that it performs better when handling a query, even better when querying newly added documents, and locates the required information faster.

Keywords: Feature extraction; Optimization; Abstracts; Thesauri
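
The abstract above reports gains in precision, MAP and DCG over the top 10 results. As a reminder of what those three numbers measure, the sketch below computes each from a ranked list of relevance labels. It is not the paper's evaluation code: the function names are hypothetical, DCG uses the common `rel / log2(rank + 1)` form, and average precision is normalized by the number of relevant items found within the cutoff, which is a simplifying assumption (MAP is then the mean of this value over all queries).

```python
import math


def precision_at_k(relevance, k=10):
    """Fraction of the top-k results that are relevant (label > 0)."""
    return sum(1 for rel in relevance[:k] if rel > 0) / k


def average_precision(relevance, k=10):
    """Mean of precision values at each rank where a relevant item appears.

    Normalized by the relevant items seen in the top k (an assumption;
    other definitions divide by all relevant items for the query).
    """
    hits = 0
    total = 0.0
    for rank, rel in enumerate(relevance[:k], start=1):
        if rel > 0:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def dcg_at_k(relevance, k=10):
    """Discounted cumulative gain: relevance discounted by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevance[:k], start=1))
```

Reordering a result list so that relevant documents move toward the top leaves precision at k unchanged but raises average precision and DCG, which is why re-ranking methods are usually reported with all three.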

Research on text categorization based on a weakly-supervised transfer learning method

13th Annual Conference on Intelligent Text Processing and Computational Linguistics, CICLing 2012

Authors: Zheng, Dequan; Zhang, Chenghe; Fei, Geli; Zhao, Tiejun (MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, China)

ISBN (print): 9783642286001

This paper presents a weakly-supervised, transfer learning based text categorization method that does not need newly tagged training documents when facing classification tasks in a new area. Instead, we make use of already tagged documents in other domains to accomplish the automatic categorization task. By extracting linguistic information such as part of speech, semantics and keyword co-occurrence, we construct a domain-adaptive transfer knowledge base. Related experiments show that the presented method improved the performance of text categorization on a traditional corpus, and that our results were only about 5% lower than the baseline on cross-domain classification tasks. This demonstrates the effectiveness of our method. © 2012 Springer-Verlag.

Keywords: Semantics

Syllable-based Machine Transliteration with Extra Phrase Features

4th Named Entity Workshop, NEWS 2012 at the 50th Annual Meeting of the Association for Computational Linguistics, ACL 2012

Authors: Zhang, Chunyue; Li, Tingting; Zhao, Tiejun (MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, China)

ISBN (print): 9781937284404

This paper describes our syllable-based phrase transliteration system for the NEWS 2012 shared task on the English-Chinese track and its back-transliteration track. Grapheme-based transliteration maps the character(s) on the source side directly to the target character(s). However, character-based segmentation on the English side causes ambiguity in the alignment step. In this paper, we use a phrase-based model to solve machine transliteration, mapping between Chinese characters and English syllables rather than English characters. Two heuristic rule-based syllable segmentation algorithms are applied. The transliteration model also incorporates three phonetic features to enhance its discriminative ability for phrases. The primary system achieved 0.330 on Chinese-English and 0.177 on English-Chinese in terms of top-1 accuracy. © 2012 Proceedings of the Annual Meeting of the Association for Computational Linguistics. All rights reserved.

Keywords: Heuristic algorithms

Discriminate Chinese word segmenter with global and context features

2012 International Applied Mechanics, Mechatronics Automation and System Simulation Meeting, AMMASS 2012

Authors: Zhu, Conghui; Wang, Shiliang; Zheng, Dequan (MOE-MS Key Laboratory of Natural Language Processing and Speech, Harbin Institute of Technology, Harbin, China)

A Chinese word segmenter is the basis for all subsequent natural language processing applications. The corpus-based statistical method has become the predominant approach. However, training corpora are insufficient, especially in certain domains. Therefore, we introduce some global features and context features in order to obtain almost the same performance with a much smaller corpus. The experimental results show that our approach significantly outperforms the original feature sets on the same training data. Meanwhile, model training time is also reduced. In addition, these features do not depend on the classifier, so our method can easily be applied to other models. © (2012) Trans Tech Publications, Switzerland.

Keywords: Natural language processing systems
