As an emerging weakly supervised learning framework, partial label learning aims to induce a multi-class classifier from ambiguous supervision in which each training example is associated with a set of candidate labels, only one of which is the true label. Traditional feature selection methods, whether for single-label or multi-label problems, are not applicable to partial label learning, because the ambiguous information in the label space obscures the importance of features and misleads the selection process. This makes selecting a proper feature subset from partial label examples particularly challenging, and the problem has therefore rarely been investigated. In this paper, we propose a novel feature selection algorithm for partial label learning, named PLFS, which considers the relationships between features and labels and also exploits the relationships between instances to select the most informative features and thereby enhance the performance of partial label learning. PLFS constructs an adaptive weighted graph to exploit the similarity information among instances, differentiate the label space, and weight the feature space, which leads to the selection of a proper feature subset. Extensive experiments over a broad range of benchmark data sets validate the effectiveness of the proposed feature selection approach.
ISBN (digital): 9798350390155
ISBN (print): 9798350390162
Integrating multimodal data from diverse sources is crucial for enhancing various applications. Multimodal entity alignment (MMEA), which discovers equivalent entities across different sources and modalities, aims to eliminate data silos for comprehensive integration. A key challenge in MMEA is effectively fusing vector representations from different modalities of the same entity for optimal entity matching. Existing fusion methods involve individual fusion operators (e.g., concatenation and summation) or the manual design of complex network structures, incurring significant human resource costs. In this paper, for the first time, we introduce the research question of automatic fusion for MMEA and propose an efficient approach from the perspective of automated architecture search. Experimental comparisons with state-of-the-art methods on real-world datasets demonstrate the effectiveness of the proposed approach.
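The two individual fusion operators named above, concatenation and summation, can be illustrated with a minimal sketch. The function names and the toy vectors are hypothetical and chosen only for illustration; this is not the automated fusion approach the abstract proposes, only the simple baseline operators it mentions.

```python
def fuse_concat(vectors):
    """Concatenation fusion: join the modality vectors end to end."""
    return [x for v in vectors for x in v]


def fuse_sum(vectors):
    """Summation fusion: add the modality vectors element-wise.

    All vectors must have the same dimensionality.
    """
    if len({len(v) for v in vectors}) != 1:
        raise ValueError("summation fusion needs vectors of equal length")
    return [sum(xs) for xs in zip(*vectors)]


# Hypothetical embeddings of one entity from two modalities.
visual = [0.5, 1.0]
textual = [1.5, 2.0]

concat_fused = fuse_concat([visual, textual])  # dimension grows to 4
sum_fused = fuse_sum([visual, textual])        # dimension stays 2
```

Concatenation keeps every modality's information but grows the dimension with each modality, while summation keeps the dimension fixed but requires the modalities to share one embedding space.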
Part-of-speech (POS) tagging is a basic task in natural language processing. This paper builds a POS tagger based on an improved hidden Markov model (HMM) by employing word clustering and a syntactic parsing model. First, to overcome the defects of the classical HMM, a new statistical model, the Markov family model (MFM), is introduced. Second, to solve the problem of data sparseness, we propose a bottom-up hierarchical word clustering algorithm. Third, we combine syntactic parsing with part-of-speech tagging. Experiments show that the improved POS tagging model outperforms HMMs under the same testing conditions: precision increases from 94.642% to 97.235%.
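For orientation, the classical HMM tagger that serves as the baseline above can be sketched with Viterbi decoding. This is a minimal sketch of the standard first-order HMM only, not the Markov family model, the word clustering, or the parsing integration from the paper. The tag set, the probabilities, and the `unk` fallback for unseen words are all made-up assumptions for illustration.

```python
import math


def viterbi(words, tags, start_p, trans_p, emit_p, unk=1e-6):
    """Return the most likely tag sequence under a first-order HMM."""
    if not words:
        return []
    # best[i][t]: log-probability of the best path that ends in tag t at word i
    best = [{t: math.log(start_p[t]) + math.log(emit_p[t].get(words[0], unk))
             for t in tags}]
    back = [{}]  # back[i][t]: predecessor tag on that best path
    for i in range(1, len(words)):
        best.append({})
        back.append({})
        for t in tags:
            prev, score = max(
                ((p, best[i - 1][p] + math.log(trans_p[p][t])) for p in tags),
                key=lambda pair: pair[1],
            )
            best[i][t] = score + math.log(emit_p[t].get(words[i], unk))
            back[i][t] = prev
    # Follow the back-pointers from the best final tag.
    path = [max(tags, key=lambda t: best[-1][t])]
    for i in range(len(words) - 1, 0, -1):
        path.append(back[i][path[-1]])
    return path[::-1]


# Toy model with hypothetical probabilities (not estimated from any corpus).
TAGS = ["DET", "NOUN", "VERB"]
START = {"DET": 0.6, "NOUN": 0.3, "VERB": 0.1}
TRANS = {
    "DET": {"DET": 0.05, "NOUN": 0.9, "VERB": 0.05},
    "NOUN": {"DET": 0.1, "NOUN": 0.2, "VERB": 0.7},
    "VERB": {"DET": 0.5, "NOUN": 0.4, "VERB": 0.1},
}
EMIT = {
    "DET": {"the": 0.9},
    "NOUN": {"dog": 0.5, "barks": 0.05},
    "VERB": {"barks": 0.6},
}

tagged = viterbi(["the", "dog", "barks"], TAGS, START, TRANS, EMIT)
```

The `unk` fallback is a crude stand-in for handling unseen words; the data sparseness it papers over is the problem the abstract addresses with word clustering.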
In recent years, China has witnessed rapid development in housing finance, and real estate finance innovations have constantly emerged; however, there exists no relevant index for measuring the innovations of China's real estate *** on the perspectives of the governments, enterprises and the public, this paper constructs the "innovation index of real estate finance" on a quarterly basis from 2009 to 2019, using a weighting method that combines a subjective method (analytic hierarchy process) with an objective one (range coefficient method). It clearly and concretely depicts the innovations in housing finance and the related temporal-spatial characteristics in China since the outbreak of the financial crisis in *** index covers 30 provinces, autonomous regions and municipalities directly under the central government, and analyzes its temporal and spatial *** findings show that there is strong spatial autocorrelation and a large regional difference in innovations.
This paper presents a Scientific Literature Management Platform (SLMP, demo link1 ) based on large language models (LLMs). The platform consists of four modules: literature management, literature extraction, literatur...
ISBN (digital): 9798350317152
ISBN (print): 9798350317169
Text analytics directly on compression (TADOC) is a promising technology designed for handling big data analytics. However, a substantial amount of DRAM is required for high performance, which limits its usage in many important scenarios where the capacity of DRAM is limited, such as memory-constrained systems. Non-volatile memory (NVM) is a novel storage technology that combines the read performance and byte addressability of DRAM with the durability of traditional storage devices such as SSDs and HDDs. Unfortunately, no research demonstrates how to use NVM to reduce DRAM utilization in compressed data analytics. In this paper, we propose N-TADOC, which substitutes DRAM with NVM while maintaining TADOC's analytics performance and space savings. Utilizing an NVM block device to reduce DRAM utilization presents two challenges: poor data locality in traversing datasets and auxiliary data structure reconstruction on NVM. We develop novel designs to solve these challenges, including a pruning method with NVM pool management, bottom-up upper bound estimation, correspondent data structures, and a persistence strategy at different levels of cost. Experimental results show that on four real-world datasets, N-TADOC achieves a 2.04× speedup compared to processing the uncompressed data directly and a 70.7% DRAM space saving compared to the original TADOC.
Analyzing short texts to infer discriminative and coherent latent topics is a critical and fundamental task, since many real-world applications require semantic understanding of short texts. Traditional long text to...
Few-shot knowledge graph completion (FKGC) aims to infer unknown fact triples of a relation using its few-shot reference entity pairs. Recent FKGC studies focus on learning semantic representations of entity pairs by ...
Stable learning aims to learn a model that generalizes well to arbitrary unseen target domain by leveraging a single source domain. Recent advances in stable learning have focused on balancing the distribution of conf...
Online support groups offer users a new way to communicate with others about certain health issues. Taking autism-related support groups on Facebook as an example, we examine whether expressed emotions differ between female and male users in online health-related support groups and whether such gender disparity varies with the topics of the groups. Experimental results reveal a significant gender difference in expressed emotions in the groups. We find that female users tended to express more positive emotions in group discussions than male members did. In addition, users appeared to express different sentiments in groups focused on different topics. Male users tended to convey more negative emotions in the treatment-related group, while female users were more positive than male users when posting in the research-related group. This study is beneficial for tracking and moderating the emotional environment in online support groups. 84 Annual Meeting of the Association for Information Science & Technology | Oct. 29 – Nov. 3, 2021 | Salt Lake City, UT. Author(s) retain copyright, but ASIS&T receives an exclusive publication license.