Search Results - Inner Mongolia University Library

International Conference on Big Data and Smart Computing (BigComp)

Authors: Hong, Ki-Joo; Kim, Han-Joon. Affiliations: Univ Seoul, Sch Elect & Comp Engn, Seoul, South Korea

ISBN: (print) 9781467387965

Semantic search is a set of activities and techniques that improve search accuracy by clearly understanding users' search intent. Usually, semantic search engines require an ontology and semantic metadata to analyze user queries. However, building a particular ontology and semantic metadata for large amounts of data is a very time-consuming and costly task. To resolve this problem, we propose a novel semantic search method that does not require ontologies or semantic metadata, by taking advantage of a semantically enriched text model. Through extensive experiments using the OHSUMED document collection and SCOPUS library data, we show that our proposed method improves users' search satisfaction.

Keywords: Semantic search; text representation model; text mining; Wikipedia; Tensor

Research on Text Representation Model Integrated Semantic Relationship

IEEE International Conference on Systems, Man, and Cybernetics (SMC)

Authors: Zhu, Jianlin; Fang, You; Yang, Xiaoping; Wang, Qian. Affiliations: Renmin Univ China, Informat Sch, Beijing, Peoples R China; North China Elect Power Univ, Student Off, Baoding, Peoples R China

ISBN: (print) 9781479986965

The word-text matrix is commonly used as the text representation model in text classification and text clustering. However, its high dimensionality and sparsity reduce its expressive ability. To improve this, the authors mine word-word relations and text-text relations and integrate these semantic relationships into the word-text matrix. The classification experiments show that the new representation models improve text classification accuracy and represent the text information better.

Keywords: text classification; text representation model; feature matrix; word-text matrix
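
One way to picture the integration described in the abstract above is the minimal sketch below. It assumes that word-word relations come from shared texts, that text-text relations come from shared words, and that both are mixed into the matrix with hypothetical weights `alpha` and `beta`. The abstract does not state the authors' actual relation measures or combination rule, so every formula here is an illustrative assumption.

```python
import numpy as np

def row_normalize(m):
    """Scale each row to sum to 1; all-zero rows are left as zeros."""
    sums = m.sum(axis=1, keepdims=True)
    return m / np.where(sums == 0, 1.0, sums)

def integrate_relations(x, alpha=0.5, beta=0.5):
    """Add word-word and text-text relation terms to a word-text matrix x (words x texts).

    The relation definitions and the mixing weights are assumptions for illustration.
    """
    word_word = x @ x.T                 # words are related if they share texts
    text_text = x.T @ x                 # texts are related if they share words
    np.fill_diagonal(word_word, 0.0)    # ignore self-relations
    np.fill_diagonal(text_text, 0.0)
    w = row_normalize(word_word)
    t = row_normalize(text_text)
    return x + alpha * (w @ x) + beta * (x @ t)

# Rows: words ["stock", "market", "goal"]; columns: three short texts.
x = np.array([
    [1.0, 1.0, 0.0],
    [1.0, 0.0, 0.0],
    [0.0, 0.0, 1.0],
])
enriched = integrate_relations(x)
```

In this toy case the entry for "market" in the second text, which was zero, becomes non-zero because "market" is related to "stock", so the enriched matrix is less sparse than the original.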

A new text representation model enriched with semantic relations

15th International Conference on Control, Automation and Systems (ICCAS)

Authors: Nugumanova, Aliya; Baiburin, Yerzhan; Apaev, Kurmash. Affiliations: East Kazakhstan State Tech Univ, Dept Informat Technol, Ust Kamenogorsk, Kazakhstan

ISBN: (print) 9788993215090

In this paper we present a novel approach based on an efficient text representation that employs semantic relations between words. We use singular value decomposition of the co-occurrence matrix to overcome its noise and sparseness. We thereby obtain a refined co-occurrence matrix, which allows us to determine relations between words as distances in it. We use these distances as correction factors for the bag-of-words text representation. In other words, we transform text representation vectors by including relations between words. To validate our representation model, we apply it to a binary classification task. We study how our model improves the classification of documents that are relevant to a given domain (topic). For this purpose, we implement a Support Vector Machine and classify documents from the Reuters-21578 collection. The results of our experiments demonstrate the superiority of our model.

Keywords: text representation model; text classification; Singular value decomposition
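
As a rough illustration of the approach in the abstract above, the sketch below smooths a tiny word co-occurrence matrix with a truncated SVD and then uses similarity between the reduced word vectors to spread bag-of-words counts onto related words. The toy vocabulary, the rank `k`, the use of cosine similarity, and the correction rule in `enrich_bow` are all assumptions for illustration; the abstract does not specify the paper's distance measure or correction factors.

```python
import numpy as np

def smooth_cooccurrence(cooc, k):
    """Return the rank-k SVD approximation of a co-occurrence matrix and word vectors."""
    u, s, vt = np.linalg.svd(cooc, full_matrices=False)
    word_vecs = u[:, :k] * s[:k]          # rows are reduced word vectors
    refined = word_vecs @ vt[:k, :]       # rank-k reconstruction of cooc
    return refined, word_vecs

def word_similarity(word_vecs, eps=1e-12):
    """Cosine similarity between reduced word vectors; near-zero vectors get similarity ~0."""
    norms = np.linalg.norm(word_vecs, axis=1, keepdims=True)
    unit = word_vecs / np.where(norms < eps, 1.0, norms)
    return unit @ unit.T

def enrich_bow(bow, sim):
    """Spread each word's count onto related words, weighted by similarity (assumed rule)."""
    return bow @ np.clip(sim, 0.0, None)  # negative similarities are ignored

# Toy vocabulary: ["oil", "crude", "barrel", "football"]
cooc = np.array([
    [0, 4, 3, 0],
    [4, 0, 2, 0],
    [3, 2, 0, 0],
    [0, 0, 0, 0],
], dtype=float)
refined, vecs = smooth_cooccurrence(cooc, k=2)
sim = word_similarity(vecs)
bow = np.array([[2.0, 0.0, 0.0, 0.0]])    # a document that mentions only "oil"
enriched = enrich_bow(bow, sim)
```

The enriched vector keeps the original weight for "oil" and gives no weight to "football", which never co-occurs with the other words. A classifier such as an SVM would then be trained on the enriched vectors instead of the raw counts.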

Tensor Space Model-based Textual Data Augmentation for Text Classification

2023 IEEE International Conference on Big Data, BigData 2023

Authors: Chang, Minsuk; Kim, Han-Joon. Affiliations: Seoul National University, Department of Computer Science and Engineering, Seoul, Korea, Republic of; University of Seoul, Department of Electrical and Computer Engineering, Seoul, Korea, Republic of

ISBN: (print) 9798350324457

In this paper, we first introduce a new text representation method to convert a textual document into a tensor space model named textCuboid, which can preserve various meanings of polysemy. Based upon the new model, we propose two novel data augmentation techniques (called Boolean augmentation and CuboidGAN) that can be directly applied to the textCuboid model for text classification tasks. Boolean augmentation includes three simple keyword modifications: synonym replacement, synonym insertion, and random deletion. CuboidGAN is composed of two key components, style encoding, and residual regression, and it is trained in two phases to generate unambiguous and plausible concept vectors. Through intensive experiments using five commonly used datasets, we prove that our proposed methods perform better data augmentation than other conventional methods. We also show that each augmentation method component significantly contributes to text classification through ablation studies. © 2023 IEEE.

Keywords: Autoencoder; Data Augmentation; Deep Learning; Generative Adversarial Networks; Tensor Space model; text Classification; text representation model
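
The three keyword modifications named in the abstract above (synonym replacement, synonym insertion, and random deletion) can be sketched on plain keyword lists as follows. This is only an illustration: the paper applies them to its textCuboid tensor representation, whereas the `SYNONYMS` table, the function names, and the deletion probability `p` here are hypothetical.

```python
import random

# Hypothetical synonym table; a real system would draw on a thesaurus.
SYNONYMS = {
    "film": ["movie", "picture"],
    "great": ["excellent", "superb"],
}

def synonym_replacement(tokens, rng):
    """Replace one token that has a known synonym with a randomly chosen synonym."""
    candidates = [i for i, t in enumerate(tokens) if t in SYNONYMS]
    if not candidates:
        return list(tokens)
    i = rng.choice(candidates)
    out = list(tokens)
    out[i] = rng.choice(SYNONYMS[out[i]])
    return out

def synonym_insertion(tokens, rng):
    """Insert a synonym of one existing token at a random position."""
    candidates = [t for t in tokens if t in SYNONYMS]
    if not candidates:
        return list(tokens)
    new_word = rng.choice(SYNONYMS[rng.choice(candidates)])
    out = list(tokens)
    out.insert(rng.randrange(len(out) + 1), new_word)
    return out

def random_deletion(tokens, rng, p=0.2):
    """Drop each token with probability p, keeping at least one token."""
    if not tokens:
        return []
    kept = [t for t in tokens if rng.random() >= p]
    return kept if kept else [rng.choice(tokens)]

rng = random.Random(0)  # fixed seed so the example is repeatable
doc = ["great", "film", "tonight"]
replaced = synonym_replacement(doc, rng)
inserted = synonym_insertion(doc, rng)
deleted = random_deletion(doc, rng)
```

Each function returns a new list and leaves the input untouched, so several augmented variants can be generated from the same document.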

Text clustering algorithm based on deep representation learning

JOURNAL OF ENGINEERING-JOE, 2018, vol. 2018, no. 16, pp. 1407-1414

Authors: Wang, Binyu; Liu, Wenfen; Lin, Zijie; Hu, Xuexian; Wei, Jianghong; Liu, Chun. Affiliations: State Key Lab Math Engn & Adv Comp, Zhengzhou, Henan, Peoples R China; Guilin Univ Elect Technol, Guangxi Key Lab Cryptog & Informat Secur, Guilin 541000, Peoples R China

Text clustering is an important method for effectively organising, summarising, and navigating text information. However, in the absence of labels, the text data to be clustered cannot be used to train a text representation model based on deep learning. To address the problem, a text clustering algorithm based on deep representation learning is proposed, using transfer-learning domain adaptation and parameter updates during cluster iteration. First, source domain data is used to pre-train the deep learning classification model. This procedure acts as an initialisation of the model parameters. Then, a domain discriminator is added to the model to domain-divide the input samples. If the discriminator cannot distinguish which domain the data belongs to, the common feature space of the two domains is obtained, so the domain adaptation problem is solved. Finally, the text feature vectors obtained by the model are clustered with the MCSKM++ algorithm. The algorithm not only resolves the model pre-training problem in unsupervised clustering, but also has a good clustering effect on the transfer problem caused by different numbers of domain labels. Experiments suggest that the clustering accuracy of the algorithm is superior to other similar algorithms.

Keywords: text analysis; feature extraction; pattern clustering; learning (artificial intelligence); text information; text representation model; deep representation learning; transfer learning; domain adaptation; parameters update; source domain data; deep learning classification model; domain discriminator; domain-divide; domain adaptation problem; text feature vectors; MCSKM++ algorithm; clustering iteration process; expectation maximisation algorithm; target domain data; text clustering result; model pre-training problem; unsupervised clustering; transfer problem; domain labels; clustering accuracy; text clustering algorithm

Study on Hot Topics Identification and Key Issues in on-Line News about Emergency Events

International Conference on Advanced Intelligence and Awareness Internet (AIAI 2011)

Authors: Chen, Liping; Song, Maoqiang. Affiliations: Beijing Univ Posts & Telecommun, Sch Comp Sci, Beijing 100876, Peoples R China; Beijing Univ Posts & Telecommun, Key Lab Trustworthy Distributed Comp & Serv, Minist Educ, Beijing 100876, Peoples R China

ISBN: (print) 9781849194716

Concerning the system of hot topics detection about the emergency events, an overall technical framework is established to implement the system. Description and solution strategy about the key issues in the four components of the system are provided. In terms of the content and structure features of the news reports as well as the distribution feature of the report sources, the text clipping method and the modified model of feature weighting calculation are proposed based on the VSM text representation model and the TF-IDF formula. The news reports about the earthquake emergency event are evaluated for this model as the data sources. Experiment results indicate that the information such as the headline, the lead and relevant feature parameters by clipping the main body of the news report can be considered as the sample set of the hot topics to be identified. Furthermore, compared with the classical model, the modified feature items weighting calculation model is more efficient in execution and more adaptive in terms of the text representation capability.

Keywords: emergency event; news report; hot topic identification; text clipping; text representation model
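
To make the idea of position-aware feature weighting on top of VSM and TF-IDF concrete, here is a minimal sketch. It assumes each news report has already been clipped into headline, lead, and body fields, and it boosts terms by field before applying the classic tf * log(N / df) weighting. The `FIELD_WEIGHTS` values and the function names are hypothetical; the abstract does not give the paper's actual modified weighting formula.

```python
import math
from collections import Counter

# Assumed position weights: headline and lead terms count more than body terms.
FIELD_WEIGHTS = {"headline": 3.0, "lead": 2.0, "body": 1.0}

def weighted_tf(doc):
    """Sum field-weighted term counts for one document given as {field: [tokens]}."""
    tf = Counter()
    for field, tokens in doc.items():
        for tok in tokens:
            tf[tok] += FIELD_WEIGHTS.get(field, 1.0)
    return tf

def tfidf_vectors(docs):
    """Return one {term: weight} vector per document using tf * log(N / df)."""
    tfs = [weighted_tf(d) for d in docs]
    df = Counter(term for tf in tfs for term in tf)  # document frequency per term
    n = len(docs)
    return [{t: w * math.log(n / df[t]) for t, w in tf.items()} for tf in tfs]

docs = [
    {"headline": ["earthquake", "rescue"], "lead": ["earthquake"], "body": ["city", "rescue"]},
    {"headline": ["election"], "lead": ["city"], "body": ["vote", "city"]},
]
vectors = tfidf_vectors(docs)
```

In this toy corpus "earthquake" gets the highest weight in the first report because it appears in both the headline and the lead, while "city" gets weight zero because it occurs in every document.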

Study on Hot Topics Identification and Key Issues in on-Line News about Emergency Events

2011 International Conference on Advanced Intelligence and Awareness Internet (2nd International Conference on Advanced Intelligence and Awareness Internet, AIAI 2011)

Authors: Liping Chen; Maoqiang Song. Affiliations: School of Computer Science, Beijing University of Posts and Telecommunications, Beijing 100876, China; Key Laboratory of Trustworthy Distributed Computing and Service, Ministry of Education, Beijing Unive

ISBN: (print) 9781622761234

Keywords: emergency event; news report; hot topic identification; text clipping; text representation model
