版权所有:内蒙古大学图书馆 技术提供:维普资讯• 智图
内蒙古自治区呼和浩特市赛罕区大学西街235号 邮编: 010021
作者机构:Wuhan Univ Comp Sci & Technol Sch Comp Sci & Technol Wuhan Peoples R China Wuhan Univ Comp Sci & Technol Sch Sci Wuhan Peoples R China
出 版 物:《CONCURRENCY AND COMPUTATION-PRACTICE & EXPERIENCE》 (并行学和计算:实践与经验)
年 卷 期:2022年第34卷第9期
核心收录:
学科分类:08[工学] 0835[工学-软件工程] 0812[工学-计算机科学与技术(可授工学、理学学位)]
主 题:CNN encoder-decoder hidden information word embedding
摘 要:The paper classification method aims to correctly divide the paper data according to the similarity of its content. However, how to accurately classify according to the content expressed in the paper has always been a problem that various classification algorithms need to face. At present, there is a kind of paper classification method based on deep learning and implemented by the encoder-decoder structure. This method inputs the words from a large number of papers into encoder, after calculating by NN (neural network) algorithm, the similarity degree of different papers is compared to achieve the purpose of classification. However, this type of method only considers the similarity between words, a NN algorithm can only calculate a large number of word information once, and it cannot find the regularity of classification through word information. But it has a difference with the similarity of the content. This paper starts from the perspective of considering the content, its label information is extracted, and the input vector of encoder-decoder structure is formed with labels and words. This improves the original paper classification method based on encoder-decoder structure. Firstly, the label information is based on the content, which can reflect the content of the paper. Secondly, the classification method which combines label information and word information can reflect the content of the paper comprehensively. Thirdly, the label information is independent of word information and NN algorithm is used separately to make this part of the content more consistent in the encoder-decoder structure. Finally, the label information and the word information are combined, respectively, with the output values obtained by different NN algorithms to realize the classification of the content. This paper proves the effectiveness of the proposed method by evaluating the paper data in web of science and obtaining relevant experimental results.