Search results - Inner Mongolia University Library

2010 2nd International Conference on Future Computer and Communication, ICFCC 2010

Authors: Zhou, Bing; Xiao, Bo; Lin, Zhiqing; Zhang, Chuang. Pattern Recognition and Intelligent System, Beijing University of Posts and Telecommunications, BUPT, Beijing, China

ISBN: (print) 9781424458226

Due to the explosive growth of web pages, centralized crawlers are no longer sufficient to crawl the web efficiently. Many distributed crawlers are in wide use; however, none of them is suitable for template-customized vertical crawling. In this paper, we present a distributed template-customized vertical crawler designed specifically for crawling Internet forums. The client-server architecture of the system and the function of every module are described in detail, and the design can be extended easily to other fields. A crawling-period based distribution strategy is also proposed, with which the crawler manager can balance the quantity of crawling tasks against the resources of each crawler, and the crawler can flexibly process websites with different updating frequencies. We also define a communication protocol between the crawlers and the crawler manager and describe how to solve the duplicated crawling problem in the distributed system. The performance of the centralized vertical crawler and the distributed vertical crawler is compared experimentally. Experimental results demonstrate that the parallel operation of all the crawlers in the distributed system can greatly enhance crawling efficiency. ©2010 IEEE.

Keywords: Websites

The Journal of China Universities of Posts and Telecommunications, 2008, Vol. 15, No. 2, pp. 130-134

Authors: ZHAO Jian; DONG Yuan; ZHAO Xian-yu; YANG Hao; WANG Hai-la. Laboratory of Pattern Recognition and Intelligent System, Beijing University of Posts and Telecommunications, Beijing 100876, China; France Telecom Research and Development Center, Beijing 100080, China

Speaker adaptive test normalization (ATnorm) is the most effective approach of the widely used score normalization in text-independent speaker verification, which selects speaker adaptive impostor cohorts with an extra development corpus in order to enhance the recognition performance. In this paper, an improved implementation of ATnorm that can offer overall significant advantages over the original ATnorm is presented. This method adopts a novel cross similarity measurement in speaker adaptive cohort model selection without an extra development corpus. It can achieve a comparable performance with the original ATnorm and reduce the computation complexity moderately. With the full use of the saved extra development corpus, the overall system performance can be improved significantly. The results are presented on NIST 2006 Speaker Recognition Evaluation data corpora where it is shown that this method provides significant improvements in system performance, with relatively 14.4% gain on equal error rate (EER) and 14.6% gain on decision cost function (DCF) obtained as a whole.

Keywords: speaker ATnorm; score normalization; cross similarity measurement; speaker verification; NIST speaker recognition evaluation

The study of human behavior dynamics based on blogosphere

2010 International Conference on Web Information Systems and Mining, WISM 2010

Authors: Song, Yading; Zhang, Chuang; Wu, Ming. Pattern Recognition and Intelligent System Lab, Beijing University of Posts and Telecommunications, Beijing, China

ISBN: (print) 9780769542249

Blogs and microblogs have become some of the most popular applications in which individuals can act as the message source. As a result, interactivity between individuals has been greatly enhanced. Understanding the mechanisms behind message delivery and human behavior is therefore significant. Although human dynamics has become a popular subject, most relevant studies so far have used the Poisson process to approximate human behavior. However, extensive evidence from email exchange, online games, and mobile communication shows non-Poisson statistics with heavy tails in those applications. Meanwhile, studies of human dynamics are still relatively rare. In this paper, the authors provide empirical evidence that such non-Poisson statistics also exist in the Web 2.0 applications blog and microblog, and show the difference between the two applications. Human behavior in both blog and microblog is shown to follow a heavy-tailed distribution, with α≈1.3 and α≈2. Finally, the study of human behavior related to social networks can also be enriched. © 2010 IEEE.

Keywords: Blogs

Multi-class classifier of non-speech audio based on Fisher kernel

Frontiers of Electrical and Electronic Engineering in China, 2010, Vol. 5, No. 1, pp. 72-76

Authors: Rongyan WANG; Gang LIU; Jun GUO; Yu FANG. Pattern Recognition and Intelligent System Laboratory, Beijing University of Posts and Telecommunications, Beijing 100876, China

Traditional multi-class classification methods based on Fisher kernel combine generative models such as Gaussian mixture models (GMMs) of all the classes ***, the combination generates high dimensional feature vectors and leads to large *** this paper, a new classification method is *** method adopts an intelligent feature space selection strategy by clustering similar Gaussian mixtures in order to reduce the feature *** classification experiments show that the proposed method is more accurate and effective with less computation compared with traditional methods.

Keywords: Fisher kernel; support vector machine (SVM); Gaussian mixture model (GMM); mixture clustering

Speech enhancement based on modified a priori SNR estimation

Frontiers of Electrical and Electronic Engineering in China, 2011, Vol. 6, No. 4, pp. 542-546

Authors: Yu FANG; Gang LIU; Jun GUO. Pattern Recognition and Intelligent System Laboratory, Beijing University of Posts and Telecommunications, Beijing 100876, China

To solve the frame delay problem and match the previous frame, Plapous et al. [IEEE Transactions on Audio, Speech, and Language Processing, 2006, 14(6): 2098–2108] introduced a novel approach called two-step noise reduction (TSNR) technique to improve the performance of the speech enhancement ***, TSNR approach results in spectral peaks of short duration and the broken spectral outlier, which degrade the spectral characteristics of the *** solve this problem, a cepstral smoothing step is added in order to remove these spectral peaks brought by TSNR *** analysis shows that the proposed approach can effectively smooth the spectral peaks and keep the spectral outlier so as to protect the speech *** results also show that the proposed approach can bring significant improvement compared to decision-directed (DD) and TSNR approaches, especially in non-stationary noisy environments.

Keywords: speech enhancement; decision-directed (DD); two-step noise reduction (TSNR); signal-to-noise ratio (SNR) estimation

Modeling microblogging communication based on human dynamics

2011 8th International Conference on Fuzzy Systems and Knowledge Discovery, FSKD 2011, Jointly with the 2011 7th International Conference on Natural Computation, ICNC'11

Authors: Xie, Jianjun; Zhang, Chuang; Wu, Ming. Pattern Recognition and Intelligent System Lab, Beijing University of Posts and Telecommunications, Beijing, China

ISBN: (print) 9781612841816

Microblog is a large-scale information sharing platform where intensive communications are taking place through interactive user behaviors. Previous studies have analyzed and modeled a series of traditional communications, such as letter, email and phone calls. The timing of human activities tends to be non-Poisson with bursts and heavy tails. Hence, several models have been proposed to explain human dynamics of bursts and heavy tails in various fields. However, as a newly developed product of Web2.0, microblog possesses inherent new characteristics, e.g. user-centric broadcast medium and asymmetric user relations. The communication in microblog is still poorly understood. Our work proposed an interest-driven model to simulate basic user communicating behaviors and processes. We came to the conclusion that, in microblogging communication, individual behaviors are bursts of rapidly occurring events separated by long periods of inactivity. Collective behaviors follow heavy-tailed Power Law distribution, whose origin is the descent of interests. Empirical statistics were also given to verify the model. © 2011 IEEE.

Keywords: Behavioral research

Learning Discriminative Representations for Open Relation Extraction with Instance Ranking and Label Calibration

2022 Findings of the Association for Computational Linguistics: NAACL 2022

Authors: Wang, Shusen; Duan, Bin; Wu, Yanan; Xu, Yajing. Pattern Recognition & Intelligent System Laboratory, Beijing University of Posts and Telecommunications, Beijing, China

ISBN: (print) 9781955917766

Open relation extraction is the task of extracting relational facts without pre-defined relation types from open-domain corpora. However, because some hard or semi-hard instances share similar context and entity information but belong to different underlying relations, current OpenRE methods always cluster them into the same relation type. In this paper, we propose a novel method based on Instance Ranking and Label Calibration strategies (IRLC) to learn discriminative representations for open relation extraction. Because the original instance labels are unavailable, we provide three surrogate strategies to generate the positive, hard negative, and semi-hard negative instances for the original instance. Instance ranking aims to refine the relational feature space by pushing the hard and semi-hard negative instances apart from the original instance with different margins and pulling the original instance and its positive instance together. To refine the cluster probability distributions of these instances, we introduce a label calibration strategy to model the constraint relationship between instances. Experimental results on two public datasets demonstrate that our proposed method can significantly outperform the previous state-of-the-art methods. © Findings of the Association for Computational Linguistics: NAACL 2022 - Findings.

Keywords: Calibration

RCL: Relation Contrastive Learning for Zero-Shot Relation Extraction

2022 Findings of the Association for Computational Linguistics: NAACL 2022

Authors: Wang, Shusen; Zhang, Bosen; Xu, Yajing; Wu, Yanan; Xiao, Bo. Pattern Recognition & Intelligent System Laboratory, Beijing University of Posts and Telecommunications, Beijing, China

ISBN: (print) 9781955917766

Zero-shot relation extraction aims to identify novel relations which cannot be observed at the training stage. However, it still faces some challenges: when the unseen relations of instances are similar or the input sentences have similar entities, the unseen relation representations from different categories tend to overlap and lead to errors. In this paper, we propose a novel Relation Contrastive Learning framework (RCL) to mitigate the above two types of similarity problems: Similar Relations and Similar Entities. By jointly optimizing a contrastive instance loss with a relation classification loss on seen relations, RCL can learn subtle differences between instances and simultaneously achieve better separation between different relation categories in the representation space. In particular, in contrastive instance learning, dropout noise is adopted as data augmentation to amplify the semantic difference between similar instances without breaking the relation representation, so as to help the model learn more effective representations. Experiments conducted on two well-known datasets show that RCL can significantly outperform previous state-of-the-art methods. Moreover, if the seen relations are insufficient, RCL can also obtain comparable results with the model trained on the full training set, showing the robustness of our approach. © Findings of the Association for Computational Linguistics: NAACL 2022 - Findings.

Keywords: Semantics

Novel active learning sample evaluation method based on multi-level confusion networks

2010 2nd IEEE International Conference on Network Infrastructure and Digital Content, IC-NIDC 2010

Authors: Chen, Wei; Liu, Gang; Guo, Jun. Pattern Recognition and Intelligent System Laboratory, Beijing University of Posts and Telecommunications, Beijing, China

ISBN: (print) 9781424468546

Active Learning (AL) is designed to aid the labor-intensive process of training acoustic models for speech recognition. In AL, only the most informative training samples are selected for manual annotation. Thus, how to evaluate the unlabeled samples is worth researching. In this paper, we propose a unified framework to generate confusion networks of multiple levels, including character, syllable and phone, and present a novel active learning sample evaluation method for Chinese acoustic modeling in which posterior probabilities obtained from multi-level confusion networks are respectively adopted to evaluate the unlabeled samples. Our experiments show that, compared with the widely used sample evaluation method using word posterior probability obtained from a word confusion network, our proposed method can achieve satisfying performances. ©2010 IEEE.

Keywords: Speech recognition

An efficient algorithm of hot events detection in text streams

International Conference on Cyber-Enabled Distributed Computing and Knowledge Discovery

Authors: Bai, Junliang; Guo, Jun; Chen, Guang; Xu, Weiran; Du, Gang. Pattern Recognition and Intelligent System Lab., Beijing University of Posts and Telecommunications, China

ISBN: (print) 9780769542355

Hot events detection in text streams has drawn increasing attention in recent sequential data mining work. Unlike the traditional TDT task, which finds clusters for all real events, hot events detection only identifies hot events of public concern. This paper proposes a novel approach to identify those events based on burst terms, term co-occurrence and a generative probabilistic model. Experiments with huge text stream sets crawled from the WWW suggest that our algorithm can work on-line and identify hot events effectively and efficiently. © 2010 IEEE.

Keywords: Data mining
