Search Results - Inner Mongolia University Library

Partial Label Feature Selection: An Adaptive Approach

IEEE Transactions on Knowledge and Data Engineering, 2024, Vol. 36, No. 8, pp. 4178-4191

Authors: Zan Zhang, Jialu Yao, Lin Liu, Jiuyong Li, Lei Li, Xindong Wu. Affiliations: Key Laboratory of Knowledge Engineering with Big Data (the Ministry of Education of China), Hefei, Anhui, China; School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, Anhui, China; Intelligent Interconnected Systems Laboratory of Anhui Province, Hefei University of Technology, Hefei, Anhui, China; UniSA STEM, University of South Australia, Adelaide, SA, Australia; Key Laboratory of Knowledge Engineering with Big Data (the Ministry of Education of China), School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, Anhui, China; Key Laboratory of Knowledge Engineering with Big Data (the Ministry of Education of China), Hefei University of Technology, Hefei, Anhui, China

As an emerging weakly supervised learning framework, partial label learning aims to induce a multi-class classifier from ambiguous supervision information where each training example is associated with a set of candidate labels, among which only one is the true label. Traditional feature selection methods, whether for single-label or multi-label problems, are not applicable to partial label learning, as the ambiguous information contained in the label space obfuscates the importance of features and misleads the selection process. This makes the selection of a proper feature subset from partial label examples particularly challenging, and the problem has therefore rarely been investigated. In this paper, we propose a novel feature selection algorithm for partial label learning, named PLFS, which not only considers the relationships between features and labels but also exploits the relationships between instances to select the most informative features and enhance the performance of partial label learning. PLFS constructs an adaptive weighted graph to exploit the similarity information among instances, differentiate the label space and weight the feature space, which leads to the selection of a proper feature subset. Extensive experiments over a broad range of benchmark data sets clearly validate the effectiveness of our proposed feature selection approach.
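The partial label setting described in this abstract can be made concrete with a minimal sketch. This is not the PLFS algorithm: it only illustrates how a similarity-weighted graph over instances can turn candidate label sets into soft label confidences, and it does no feature weighting. The function names, the Gaussian kernel, the k-nearest-neighbor graph, and the toy data are all assumptions made for illustration.

```python
import math

def knn_graph(X, k=2, sigma=1.0):
    """Return {i: {j: weight}} linking each instance to its k nearest neighbors."""
    graph = {}
    for i, xi in enumerate(X):
        dists = sorted((math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i)
        # Gaussian kernel: closer neighbors receive larger weights.
        graph[i] = {j: math.exp(-d * d / (2 * sigma ** 2)) for d, j in dists[:k]}
    return graph

def disambiguate(X, candidates, labels, k=2, iters=10):
    """Propagate label confidences over the graph, restricted to candidate sets."""
    # Start with uniform confidence over each instance's (non-empty) candidate set.
    conf = [{y: (1.0 / len(c) if y in c else 0.0) for y in labels} for c in candidates]
    graph = knn_graph(X, k)
    for _ in range(iters):
        new_conf = []
        for i, c in enumerate(candidates):
            # Score each candidate by its own confidence plus weighted neighbor support.
            scores = {
                y: conf[i][y] + sum(w * conf[j][y] for j, w in graph[i].items())
                for y in c
            }
            total = sum(scores.values())
            # Labels outside the candidate set keep zero confidence.
            new_conf.append({y: scores.get(y, 0.0) / total for y in labels})
        conf = new_conf
    return conf

# Toy data (hypothetical): two well-separated clusters with ambiguous candidate sets.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (5.0, 5.0), (5.1, 5.0), (5.0, 5.1)]
CANDIDATES = [{"a", "b"}, {"a"}, {"a", "c"}, {"b", "c"}, {"b"}, {"b", "a"}]
LABELS = ["a", "b", "c"]
confidences = disambiguate(X, CANDIDATES, LABELS)
```

On this toy data, each ambiguous instance ends up favoring the label that its unambiguous neighbor carries, which is the intuition behind using instance similarity to differentiate the label space.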

Keywords: Feature extraction; Supervised learning; Training; Knowledge engineering; Labeling; Classification algorithms; Big data

Automatic Fusion for Multimodal Entity Alignment: A New Perspective from Automatic Architecture Search

IEEE International Conference on Multimedia and Expo (ICME)

Authors: Chenyang Bu, Yunpeng Hong, Shiji Zang, Guojie Chang, Xindong Wu. Affiliation: Key Laboratory of Knowledge Engineering With Big Data (the Ministry of Education of China), School of Computer Science and Engineering, Hefei University of Technology

ISBN (digital): 9798350390155

ISBN (print): 9798350390162

Integrating multimodal data from diverse sources is crucial for enhancing various applications. Multimodal entity alignment (MMEA), which discovers equivalent entities across different sources and modalities, aims to eliminate data silos for comprehensive integration. A key challenge in MMEA is effectively fusing vector representations from different modalities of the same entity for optimal entity matching. Existing fusion methods involve individual fusion operators (e.g., concatenation and summation) or the manual design of complex network structures, incurring significant human resource costs. In this paper, for the first time, we introduce the research question of automatic fusion for MMEA and propose an efficient approach from the perspective of automated architecture search. Experimental comparisons with state-of-the-art methods on real-world datasets demonstrate the effectiveness of the proposed approach.

Keywords: Costs; Search methods; Complex networks; Vectors

A Part-of-Speech Tagging Model Employing Word Clustering and Syntactic Parsing

Chinese Journal of Electronics, 2025, Vol. 23, No. 1, pp. 109-114

Author: Lichi Yuan. Affiliations: School of Information Technology, Jiangxi University of Finance and Economics, Nanchang, China; Jiangxi Key Laboratory of Data and Knowledge Engineering, Jiangxi University of Finance and Economics, Nanchang, China

Part-of-speech (POS) tagging is a basic task in the field of natural language processing. This paper builds a POS tagger based on an improved hidden Markov model by employing word clustering and a syntactic parsing model. First, to overcome the defects of the classical HMM, a new statistical model, the Markov family model (MFM), is introduced. Second, to solve the problem of data sparseness, we propose a bottom-up hierarchical word clustering algorithm. Then we combine syntactic parsing with part-of-speech tagging. The POS tagging experiments show that the improved model outperforms hidden Markov models (HMMs) under the same testing conditions: the precision increases from 94.642% to 97.235%.
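As a point of reference for the classical HMM baseline named in this abstract, the following is a minimal sketch of Viterbi decoding for a first-order HMM tagger. It does not implement the Markov family model, the word clustering, or the syntactic parsing from the paper. The tag set and all probabilities are invented for illustration, and the `floor` value for unseen events is an assumed smoothing shortcut.

```python
import math

def viterbi(words, tags, start_p, trans_p, emit_p, floor=1e-12):
    """Return the most probable tag sequence for `words` under a first-order HMM."""
    def lp(p):
        # Log-probability with a small floor so unseen events do not yield log(0).
        return math.log(max(p, floor))

    # best[i][tag] = (log-prob of the best path ending in `tag` at position i, previous tag)
    best = [{t: (lp(start_p.get(t, 0)) + lp(emit_p[t].get(words[0], 0)), None) for t in tags}]
    for w in words[1:]:
        column = {}
        for t in tags:
            prev, score = max(
                ((p, best[-1][p][0] + lp(trans_p[p].get(t, 0))) for p in tags),
                key=lambda item: item[1],
            )
            column[t] = (score + lp(emit_p[t].get(w, 0)), prev)
        best.append(column)

    # Backtrack from the best final tag through the stored predecessors.
    tag = max(tags, key=lambda t: best[-1][t][0])
    path = [tag]
    for column in reversed(best[1:]):
        tag = column[tag][1]
        path.append(tag)
    return path[::-1]

# Toy model (hypothetical numbers, not taken from the paper).
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.3, "VERB": 0.6},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "barks": 0.1},
    "VERB": {"barks": 0.6, "dog": 0.05},
}
print(viterbi(["the", "dog", "barks"], TAGS, START, TRANS, EMIT))
# prints ['DET', 'NOUN', 'VERB']
```

Because the classical HMM conditions each tag only on the previous tag and each word only on its own tag, rare words get unreliable emission estimates, which is the data sparseness problem the abstract addresses with word clustering.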

Measuring China's Real Estate Financial Innovation from the Perspective of Government, Enterprises and the Public: Index Compilation and Its Spatial-Temporal Characteristics Analysis

Journal of Systems Science and Information, 2023, Vol. 11, No. 1, pp. 1-34

Authors: Jichang DONG, Lijun YIN, Xiaoting LIU, Xiuting LI. Affiliations: School of Economics and Management, University of Chinese Academy of Sciences, Beijing 100190, China; Key Laboratory of Big Data Mining and Knowledge Management, Chinese Academy of Sciences, Beijing 100190, China; China Construction Second Engineering Bureau Co., Ltd., Beijing 100070, China

In recent years, China has witnessed rapid development in housing finance, and real estate finance innovations have emerged constantly; however, there exists no relevant index for measuring the innovations of China's real estate *** on the perspectives of the governments, enterprises and the public, this paper constructs the "innovation index of real estate finance" on a quarterly basis from 2009 to 2019, with a weighting method that combines a subjective method (analytic hierarchy process) and an objective one (range coefficient method). It clearly and concretely depicts the innovations in housing finance and the related temporal-spatial characteristics in China since the outbreak of the financial crisis in *** index covers 30 provinces, autonomous regions and municipalities directly under the central government, and analyzes its temporal and spatial *** findings show that there exist a strong spatial autocorrelation and a big regional difference in innovations.

Keywords: governments; enterprises; public; real estate finance innovation

SLMP: A Scientific Literature Management Platform Based on Large Language Models

15th IEEE International Conference on Knowledge Graph, ICKG 2024

Authors: Guo, Menghao; Jiang, Jinling; Wu, Fan; Sun, Shanxin; Zhang, Chen; Li, Wenhui; Sun, Zeyi; Chen, Guangyong; Wu, Xindong. Affiliations: Research Center for Life Sciences Computing, Zhejiang Lab, Hangzhou, China; Research Center for Data Hub and Security, Zhejiang Lab, Hangzhou, China; Research Center for High Efficiency Computing System, Zhejiang Lab, Hangzhou, China; Hefei University of Technology, Key Laboratory of Knowledge Engineering With Big Data, Hefei, China

ISBN (print): 9798331508821

This paper presents a Scientific Literature Management Platform (SLMP) based on large language models (LLMs). The platform consists of four modules: literature management, literature extraction, literature retrieval, and question answering. The core techniques used to support the four modules across the platform include a fine-tuned model PaperExtractGPT and a continual pre-training model ChatPaperGPT based on ChatGLM2 using the data from scientific research literature, responsible for information extraction and communication, respectively. Due to their powerful capabilities in natural language understanding and generation, LLMs can understand complex scientific concepts based on the provided contexts, and thus generate high-quality texts and conduct in-depth information retrieval and question answering. Our platform can help researchers manage and utilize literature more effectively and efficiently for finding relevant literature, obtaining required information, and generating new knowledge. © 2024 IEEE.

Keywords: Question answering

Enabling Efficient NVM-Based Text Analytics without Decompression

International Conference on Data Engineering

Authors: Xiaokun Fang, Feng Zhang, Junxiang Nong, Mingxing Zhang, Puyun Hu, Yunpeng Chai, Xiaoyong Du. Affiliations: Key Laboratory of Data Engineering and Knowledge Engineering (MOE) and School of Information, Renmin University of China; Department of Computer Science and Engineering, Tsinghua University

ISBN (digital): 9798350317152

ISBN (print): 9798350317169

Text analytics directly on compression (TADOC) is a promising technology designed for handling big data analytics. However, a substantial amount of DRAM is required for high performance, which limits its usage in many important scenarios where the capacity of DRAM is limited, such as memory-constrained systems. Non-volatile memory (NVM) is a novel storage technology that combines the read performance and byte addressability of DRAM with the durability of traditional storage devices like SSD and HDD. Unfortunately, no research demonstrates how to use NVM to reduce DRAM utilization in compressed data analytics. In this paper, we propose N-TADOC, which substitutes DRAM with NVM while maintaining TADOC's analytics performance and space savings. Utilizing an NVM block device to reduce DRAM utilization presents two challenges: poor data locality in traversing datasets and auxiliary data structure reconstruction on NVM. We develop novel designs to solve these challenges, including a pruning method with NVM pool management, bottom-up upper bound estimation, correspondent data structures, and persistence strategy at different levels of cost. Experimental results show that on four real-world datasets, N-TADOC achieves a 2.04× performance speedup compared to processing directly on the uncompressed data and a 70.7% DRAM space saving compared to the original TADOC.
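The idea of running analytics directly on compressed text can be illustrated with a small sketch. It assumes a grammar-style compressed representation, in which repeated word sequences are factored into rules that reference words or other rules; this representation, the rule names, and the toy corpus are assumptions for illustration. The sketch counts words by visiting each rule once and reusing its result, rather than expanding the text. It does not model N-TADOC's NVM pool management, pruning, upper bound estimation, or persistence.

```python
from collections import Counter

# Hypothetical compressed corpus: each rule is a sequence of words or rule references.
RULES = {
    "R0": ["R1", "R2", "R1", "end"],  # root rule
    "R1": ["R2", "big", "data"],
    "R2": ["text", "analytics"],
}

def word_counts(rules, root="R0"):
    """Count words in the expanded text without materializing the expansion."""
    cache = {}

    def count(rule):
        # Each rule is evaluated once; later references reuse the cached counts.
        if rule not in cache:
            total = Counter()
            for symbol in rules[rule]:
                if symbol in rules:
                    total.update(count(symbol))
                else:
                    total[symbol] += 1
            cache[rule] = total
        return cache[rule]

    return count(root)

def expand(rules, root="R0"):
    """Reference decompression, used here only to check the result."""
    out = []
    for symbol in rules[root]:
        out.extend(expand(rules, symbol) if symbol in rules else [symbol])
    return out

counts = word_counts(RULES)
```

The work done by `word_counts` grows with the size of the rule set rather than the size of the expanded text, which is where the time and space savings of this style of analytics come from.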

Keywords: Performance evaluation; Upper bound; Data analysis; Costs; Nonvolatile memory; Random access memory; Estimation

Short Text Topic Modeling Techniques, Applications, and Performance: A Survey

IEEE Transactions on Knowledge and Data Engineering, 2022, Vol. 34, No. 3, pp. 1427-1445

Authors: Qiang, Jipeng; Qian, Zhenyu; Li, Yun; Yuan, Yunhao; Wu, Xindong. Affiliations: Department of Computer Science, Yangzhou University, Jiangsu 225009, China; Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education, Hefei University of Technology, Hefei, Anhui 230026, China; Mininglamp Academy of Sciences, Mininglamp, Beijing, China

Inferring discriminative and coherent latent topics from short texts is a critical and fundamental task, since many real-world applications require semantic understanding of short texts. Traditional long text topic modeling algorithms (e.g., PLSA and LDA) based on word co-occurrences cannot solve this problem very well, since only very limited word co-occurrence information is available in short texts. Therefore, short text topic modeling, which aims at overcoming the problem of sparseness in short texts, has attracted much attention from the machine learning research community in recent years. In this survey, we conduct a comprehensive review of various short text topic modeling techniques proposed in the literature. We present three categories of methods based on Dirichlet multinomial mixture, global word co-occurrences, and self-aggregation, with examples of representative approaches in each category and analysis of their performance on various tasks. We develop the first comprehensive open-source library, called STTM, for use in Java, which integrates all surveyed algorithms within a unified interface together with benchmark datasets, to facilitate the development of new methods in this research field. Finally, we evaluate these state-of-the-art methods on many real-world datasets and compare their performance against one another and against long text topic modeling algorithms. © 2020 IEEE.

Keywords: Surveys

Learning Inter-Entity Interaction for Few-Shot Knowledge Graph Completion

2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022

Authors: Li, Yuling; Yu, Kui; Huang, Xiaoling; Zhang, Yuhong. Affiliations: Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education, Hefei, China; School of Computer Science and Information Engineering, Hefei University of Technology, China

Few-shot knowledge graph completion (FKGC) aims to infer unknown fact triples of a relation using its few-shot reference entity pairs. Recent FKGC studies focus on learning semantic representations of entity pairs by separately encoding the neighborhoods of head and tail entities. Such practice, however, ignores the inter-entity interaction, resulting in low-discrimination representations for entity pairs, especially when these entity pairs are associated with 1-to-N, N-to-1, and N-to-N relations. To address this issue, this paper proposes a novel FKGC model, named Cross-Interaction Attention Network (CIAN), to investigate the inter-entity interaction between head and tail entities. Specifically, we first explore the interactions within entities by computing the attention between the task relation and each entity neighbor, and then model the interactions between head and tail entities by letting each entity attend to the neighborhood of its paired entity. In this way, CIAN can figure out the relevant semantics between head and tail entities, thereby generating more discriminative representations for entity pairs. Extensive experiments on two public datasets show that CIAN outperforms several state-of-the-art methods. The source code is available at https://***/cjlyl/FKGC-CIAN. © 2022 Association for Computational Linguistics.

Keywords: Knowledge graph

Stable Learning via Triplex Learning

IEEE Transactions on Artificial Intelligence, 2024, Vol. 5, No. 10, pp. 5267-5276

Authors: Yang, Shuai; Jiang, Tingting; Dang, Qianlong; Gu, Lichuan; Wu, Xindong. Affiliations: Anhui Agricultural University, School of Information and Artificial Intelligence, Hefei 230036, China; Anhui Provincial Engineering Research Center for Agricultural Information Perception and Intelligent Computing, Hefei 230036, China; Northwest A & F University, College of Science, Yangling 712100, China; Hefei University of Technology, Key Laboratory of Knowledge Engineering with Big Data, The Ministry of Education of China, Hefei 230601, China; Hefei University of Technology, School of Computer Science and Information Engineering, Hefei 230601, China

Stable learning aims to learn a model that generalizes well to arbitrary unseen target domains by leveraging a single source domain. Recent advances in stable learning have focused on balancing the distribution of confounders for each feature to eliminate spurious correlations. However, previous studies treat all features equally without considering the difficulties of confounder balancing associated with different features, and regard irrelevant features as confounders, deteriorating generalization performance. To tackle these issues, this article proposes a novel triplex learning (TriL) based stable learning algorithm, which performs sample reweighting, causal feature selection, and representation learning to remove spurious correlations. Specifically, first, TriL adaptively assigns weights to the confounder balancing term of each feature in accordance with the difficulties of confounder balancing, and aligns the confounder distribution of each feature by learning a group of sample weights. Second, TriL integrates the sample weights into a weighted cross-entropy model to compute causal effects of features for excluding irrelevant features from the confounder set. Finally, TriL relearns a set of sample weights and uses them to guide a new supervised dual-autoencoder containing two classifiers to learn feature representations. TriL forces the results of the two classifiers to remain consistent, using a cross-classifier consistency regularization to remove spurious correlations. Extensive experiments on synthetic and two real-world datasets show the superiority of TriL compared with seven methods. © 2024 IEEE.
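The role of sample weights in a weighted cross-entropy objective can be shown with a minimal sketch. It only illustrates how per-sample weights enter the loss; how TriL learns those weights, and the exact form of its objective, are not given in the abstract. Normalizing by the total weight and the toy numbers are assumptions.

```python
import math

def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean of -log p_i[y_i], normalized by the total sample weight."""
    total_weight = sum(weights)
    loss = sum(-w * math.log(p[y]) for p, y, w in zip(probs, labels, weights))
    return loss / total_weight

# Toy predictions (hypothetical): rows are per-class probabilities for each sample.
PROBS = [[0.9, 0.1], [0.2, 0.8], [0.5, 0.5]]
LABELS = [0, 1, 0]

uniform_loss = weighted_cross_entropy(PROBS, LABELS, [1.0, 1.0, 1.0])
reweighted_loss = weighted_cross_entropy(PROBS, LABELS, [1.0, 1.0, 0.0])
```

With uniform weights this reduces to the ordinary mean cross-entropy, while a zero weight removes a sample's influence entirely, which is how reweighting can change which samples drive the fitted model.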

Keywords: Feature selection

Gender Disparity in Expressed Emotions within Health-Related Online Support Groups

Proceedings of the Association for Information Science and Technology, 2021, Vol. 58, No. 1, pp. 883-885

Authors: Zhao, Yuehua; Wang, Hao; Deng, Sanhong; Chen, Ye. Affiliations: Nanjing University, Jiangsu Key Laboratory of Data Engineering and Knowledge Service, Nanjing, China; Central China Normal University, Wuhan, China

Online support groups offer users a new way to communicate with others regarding certain health issues. Taking autism-related support groups on Facebook as an example, we examine whether the expressed emotions differ between female and male users in online health-related support groups and whether such gender disparity varies with the topics of the groups. Experimental results reveal a significant gender difference in expressed emotions in the groups. We find that female users tended to express more positive emotions in the group discussions than male group members did. In addition, users appeared to express different sentiments within groups focused on different topics. Male users tended to convey more negative emotions in the treatment-related group, while female users were more positive than male users when posting in the research-related group. This study is beneficial for tracking and moderating the emotional environment in online support groups. 84th Annual Meeting of the Association for Information Science & Technology | Oct. 29 – Nov. 3, 2021 | Salt Lake City, UT. Author(s) retain copyright, but ASIS&T receives an exclusive publication license.

Keywords: Sentiment analysis
