Search Results - Inner Mongolia University Library

A Heuristic Sampling Method for Maintaining the Probability Distribution

Journal of Computer Science & Technology, 2021, Vol. 36, No. 4, pp. 896-909

Authors: Jiao-Yun Yang, Jun-Da Wang, Yi-Fang Zhang, Wen-Juan Cheng, Lian Li. Affiliations: Key Laboratory of Knowledge Engineering with Big Data of Ministry of Education, Hefei University of Technology, Hefei 230601, China; National Smart Eldercare International Science and Technology Cooperation Base, Hefei University of Technology, Hefei 230601, China; School of Computer Science and Information Engineering, Hefei University of Technology, Hefei 230601, China; School of Mathematics, Hefei University of Technology, Hefei 230601, China

Sampling is a fundamental method for generating data *** many data analysis methods are developed based on probability distributions, maintaining distributions when sampling can help to ensure good data analysis ***, sampling a minimum subset while maintaining probability distributions is still a *** this paper, we decompose a joint probability distribution into a product of conditional probabilities based on Bayesian networks and use the chi-square test to formulate a sampling problem that requires that the sampled subset pass the distribution test to ensure the ***, a heuristic sampling algorithm is proposed to generate the required subset by designing two scoring functions: one based on the chi-square test and the other based on likelihood *** on four types of datasets with a size of 60000 show that when the significant difference level, α, is set to 0.05, the algorithm can exclude 99.9%, 99.0%, 93.1% and 96.7% of the samples based on their Bayesian networks: ASIA, ALARM, HEPAR2, and ANDES, *** subsets of the same size are sampled, the subset generated by our algorithm passes all the distribution tests and the average distribution difference is approximately 0.03; by contrast, the subsets generated by random sampling pass only 83.8% of the tests, and the average distribution difference is approximately 0.24.

Keywords: Bayesian network; chi-square test; sampling; probability distribution
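
The distribution test mentioned in the abstract above can be illustrated with a minimal sketch. It is not the authors' algorithm: it checks a single categorical variable only (the paper works with conditional probabilities from a Bayesian network), and the function names `chi_square_statistic` and `passes_test` are hypothetical. The critical value is supplied by the caller; 3.841 is the standard chi-square critical value for one degree of freedom at a significance level of 0.05.

```python
from collections import Counter


def chi_square_statistic(population, subset):
    """Pearson chi-square statistic of the subset's category counts
    against the counts expected from the population's proportions."""
    pop_counts = Counter(population)
    sub_counts = Counter(subset)
    n_pop, n_sub = len(population), len(subset)
    stat = 0.0
    for value, count in pop_counts.items():
        expected = n_sub * count / n_pop   # expected count in the subset
        observed = sub_counts.get(value, 0)
        stat += (observed - expected) ** 2 / expected
    return stat


def passes_test(population, subset, critical_value):
    """True if the subset is not significantly different from the population."""
    return chi_square_statistic(population, subset) <= critical_value


if __name__ == "__main__":
    population = ["a"] * 60 + ["b"] * 40
    balanced = ["a"] * 6 + ["b"] * 4       # same proportions as the population
    skewed = ["a"] * 10                    # drops category "b" entirely
    print(passes_test(population, balanced, 3.841))
    print(passes_test(population, skewed, 3.841))
```

A sampling heuristic in this spirit would keep adding or swapping samples until every such test passes; how the paper scores candidate samples is not recoverable from the abstract.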

Generalized Category Discovery with Large Language Models in the Loop

arXiv, 2023

Authors: An, Wenbin; Shi, Wenkai; Tian, Feng; Lin, Haonan; Wang, QianYing; Wu, Yaqiang; Cai, Mingxiang; Wang, Luyan; Chen, Yan; Zhu, Haiping; Chen, Ping. Affiliations: School of Automation Science and Engineering, Xi’an Jiaotong University, China; School of Computer Science and Technology, Xi’an Jiaotong University, China; Ministry of Education Key Laboratory of Intelligent Networks and Network Security, China; Shaanxi Province Key Laboratory of Big Data Knowledge Engineering, China; Lenovo Research, China; University of Massachusetts Boston, United States

Generalized Category Discovery (GCD) is a crucial task that aims to recognize both known and novel categories from a set of unlabeled data by utilizing a few labeled data with only known categories. Due to the lack of supervision and category information, current methods usually perform poorly on novel categories and struggle to reveal semantic meanings of the discovered clusters, which limits their applications in the real world. To mitigate the above issues, we propose Loop, an end-to-end active-learning framework that introduces Large Language Models (LLMs) into the training loop, which can boost model performance and generate category names without relying on any human efforts. Specifically, we first propose Local Inconsistent Sampling (LIS) to select samples that have a higher probability of falling to wrong clusters, based on neighborhood prediction consistency and entropy of cluster assignment probabilities. Then we propose a Scalable Query strategy to allow LLMs to choose true neighbors of the selected samples from multiple candidate samples. Based on the feedback from LLMs, we perform Refined Neighborhood Contrastive Learning (RNCL) to pull samples and their neighbors closer to learn clustering-friendly representations. Finally, we select representative samples from clusters corresponding to novel categories to allow LLMs to generate category names for them. Extensive experiments on three benchmark datasets show that Loop outperforms SOTA models by a large margin and generates accurate category names for the discovered clusters. Code and data are available at https://***/Lackel/LOOP. Copyright © 2023, The Authors. All rights reserved.

Keywords: Semantics

Text-guided Reconstruction Network for Sentiment Analysis with Uncertain Missing Modalities

IEEE Transactions on Affective Computing, 2025

Authors: Shi, Piao; Hu, Min; Nakagawa, Satoshi; Zheng, Xiangming; Shi, Xuefeng; Ren, Fuji. Affiliations: Hefei University of Technology, Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education, Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine, National Smart Eldercare International Science and Technology Cooperation Base, School of Computer Science and Information Engineering, Anhui, Hefei 230601, China; Bozhou University, School of Electronic and Information Engineering, Bozhou 236800, China; University of Tokyo, Graduate School of Information Science and Technology, Tokyo 113-8656, Japan; University of Electronic Science and Technology of China, College of Computer Science and Engineering, Chengdu 611731, China; University of Electronic Science and Technology of China, Shenzhen Institute for Advanced Study, Shenzhen 518110, China

Multimodal Sentiment Analysis (MSA) is an attractive research area that aims to integrate sentiment expressed in textual, visual, and acoustic signals. There are two main problems in the existing methods: 1) the dominant role of the text is underutilized in unaligned multimodal data, and 2) modalities with uncertain missing features are not sufficiently explored. This paper proposes a Text-guided Reconstruction Network (TgRN) for MSA with uncertain missing modalities in non-aligned sequences. The TgRN network includes three primary modules: Text-guided Extraction Module (TEM), Reconstruction Module (RM) and Text-guided Fusion Module (TFM). First, the TEM consists of the text-guided cross attention units and self-attention units to capture inter-modal features and intra-modal features, respectively. Second, leveraging enhanced attention units and a three-way squeeze-and-excitation block, the RM is designed to learn semantic information from incomplete data and reconstruct missing modality features. Third, the TFM utilizes a progressive modality-mixing adaptation gate to explore the dynamic correlations between nonverbal and verbal modalities, effectively addressing the modality gap issue. Finally, under the supervision of sentiment prediction loss and reconstruction loss, the TgRN effectively processes both uncertain missing-modality conditions and ideal complete modality conditions. Extensive experiments on CMU-MOSI and CH-SIMS demonstrate that our proposed method outperforms state-of-the-art approaches. © 2010-2012 IEEE.

Keywords: Semantics

Fuzzy Preference Completion with Ranked and Unranked Preferences

Cognitive Computation, 2025, Vol. 17, No. 3, pp. 1-21

Authors: Li, Lei; Liu, Pan; Zhang, Renjie; Tao, Zhenchao; Wu, Xindong. Affiliations: Key Laboratory of Knowledge Engineering with Big Data (the Ministry of Education of China), Hefei University of Technology, Hefei, China; School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China; Department of Radiation Oncology, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, China; Department of Radiation Oncology, Anhui Provincial Cancer Hospital, Hefei, China

As for social choice, all alternatives are ranked by agents to form preferences as linear orders. However, in applications, sometimes some alternatives cannot be ranked, or it is unnecessary to rank them, which leads to unranked alternatives. Hence, without loss of generality, by dividing the set of alternatives into three ranked and unranked subsets, including top-k alternatives, intermediate-r alternatives, and last-l alternatives, the Mallows model on ranked and unranked preferences can be analyzed systematically. Technically, a repeated insertion model is adopted during sampling, and probability distributions are derived for ranked and unranked preferences of alternatives. Experimental results verify the accuracy of the probability distributions for different ranked and unranked preferences of alternatives. Furthermore, in order to solve the preference completion problem where agents have multiple partial rankings, a fuzzy preference completion algorithm, Fuzzy-Multi-Rankings, is proposed, which introduces a fuzzy ranking to complete the target agent’s preference in addition to the traditional nearest-neighbor-based methods. Based on the three ranked and unranked preferences, seven cases can be classified and analyzed for fuzzy preference completion. Experiments on the synthetic datasets and MovieLens dataset confirm the effectiveness and efficiency of our proposed Fuzzy-Multi-Rankings algorithm and also verify the accuracy of the evaluated probability distributions for the proposed seven cases.
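
The repeated insertion model named in the abstract above can be sketched for the simplest case of a full ranking. This is the standard textbook construction, not the paper's extension to top-k, intermediate-r, and last-l subsets; the function name `sample_mallows_rim` and its parameters are hypothetical. The i-th item of the reference ranking is inserted at position j (1 ≤ j ≤ i) with probability proportional to `phi ** (i - j)`, where `phi` in (0, 1] is the Mallows dispersion.

```python
import random


def sample_mallows_rim(reference, phi, rng=random):
    """Draw one ranking from a Mallows model centred on `reference`
    using the repeated insertion model (full rankings only)."""
    ranking = []
    for i, item in enumerate(reference, start=1):
        # Weight for inserting the i-th reference item at position j.
        weights = [phi ** (i - j) for j in range(1, i + 1)]
        j = rng.choices(range(1, i + 1), weights=weights, k=1)[0]
        ranking.insert(j - 1, item)
    return ranking


if __name__ == "__main__":
    reference = ["a", "b", "c", "d"]
    # phi close to 0 concentrates mass on the reference ranking;
    # phi = 1 gives a uniform distribution over all permutations.
    print(sample_mallows_rim(reference, 0.3, random.Random(42)))
```

Passing a seeded `random.Random` instance keeps draws reproducible; the degenerate value `phi = 0` always returns the reference ranking unchanged.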

High-Quality Noise Detection for Knowledge Graph Embedding with Rule-Based Triple Confidence

18th Pacific Rim International Conference on Artificial Intelligence, PRICAI 2021

Authors: Hong, Yan; Bu, Chenyang; Wu, Xindong. Affiliations: Ministry of Education Key Laboratory of Knowledge Engineering with Big Data, Hefei University of Technology, Hefei, China; School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China; Mininglamp Academy of Sciences, Mininglamp Technology, Beijing, China

ISBN: 9783030891879 (print)

Knowledge representation learning is usually used in knowledge reasoning and other related fields. Its goal is to use low-dimensional vectors to represent the entities and relations in a knowledge graph. In the process of automatic knowledge graph construction, the complexity of unstructured text and incorrect text may make automatic construction tools unable to accurately obtain the semantic information in the text. This leads to high-quality noise with matched entity types but semantic errors. Current knowledge representation learning methods assume that the knowledge in knowledge graphs is completely correct, and ignore the noise data generated in the process of automatic construction of knowledge graphs, resulting in errors in the vector representation of entities and relations. In order to reduce the negative impact of noise data on the construction of a representation learning model, in this study, a high-quality noise detection method with rule information is proposed. Based on the semantic association between triples in the same rule, we propose the concept of rule-based triple confidence. The calculation strategy of triple confidence is inspired by probabilistic soft logic (PSL). The influence of high-quality noise data in the training process of the model can be weakened by this confidence. Experiments show the effectiveness of the proposed method in dealing with high-quality noise. © 2021, Springer Nature Switzerland AG.

Keywords: Knowledge graph

A Baseline for Early Classification of Time Series in An Open World

IEEE Annual International Computer Software and Applications Conference (COMPSAC)

Authors: Junwei Lv, Xuegang Hu. Affiliations: Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education, Hefei University of Technology, Hefei, China; School of Computer Science and Information Engineering, Hefei University of Technology, Hefei, China

ISBN: 9781665488105 (digital)

ISBN: 9781665488112 (print)

Early classification of time series aims to accurately predict the class label of a time series as early as possible, which is significant but challenging in many time-sensitive applications. Existing early classification methods hold a basic closed-world assumption that the classifier must have seen the classes of test samples. However, new samples that do not belong to any trained class may appear in the real world. In this paper, we first address the early classification in an open world and design two detectors to identify which known class or unknown class a sample belongs to. Specifically, based on the observed data, an early known-class detector is designed to determine the known-class confidence and an early unknown-class detector is designed to determine the unknown-class confidence according to the Minimum Reliable Length (MRL) and the Weibull distribution of each class. Experimental results evaluated on real-world datasets demonstrate that the proposed model can identify samples of unknown and known classes accurately and early.

Keywords: Computational modeling; Conferences; Time series analysis; Detectors; Reliability engineering; Software; Weibull distribution

MEGCF: Multimodal Entity Graph Collaborative Filtering for Personalized Recommendation

arXiv, 2022

Authors: Liu, Kang; Xue, Feng; Guo, Dan; Wu, Le; Li, Shujie; Hong, Richang. Affiliations: Hefei University of Technology, School of Computer Science and Information Engineering, Key Laboratory of Knowledge Engineering with Big Data, Intelligent Interconnected Systems Laboratory of Anhui Province, 485 Danxia Road, Anhui Province, Hefei 230601, China; Hefei University of Technology, School of Software, Key Laboratory of Knowledge Engineering with Big Data, Intelligent Interconnected Systems Laboratory of Anhui Province, 485 Danxia Road, Anhui Province, Hefei 230601, China

In most E-commerce platforms, whether the displayed items trigger the user's interest largely depends on their most eye-catching multimodal content. Consequently, increasing efforts focus on modeling multimodal user preference, and the pressing paradigm is to incorporate complete multimodal deep features of the items into the recommendation module. However, the existing studies ignore the mismatch problem between multimodal feature extraction (MFE) and user interest modeling (UIM). That is, MFE and UIM have different emphases. Specifically, MFE is migrated from and adapted to upstream tasks such as image classification. In addition, it is mainly a content-oriented and non-personalized process, while UIM, with its greater focus on understanding user interaction, is essentially a user-oriented and personalized process. Therefore, the direct incorporation of MFE into UIM for purely user-oriented tasks tends to introduce a large amount of preference-independent multimodal noise and contaminate the embedding representations in UIM. This paper aims at solving the mismatch problem between MFE and UIM, so as to generate high-quality embedding representations and better model multimodal user preferences. Towards this end, we develop a novel model, multimodal entity graph collaborative filtering, short for MEGCF. The UIM of the proposed model captures the semantic correlation between interactions and the features obtained from MFE, thus making a better match between MFE and UIM. More precisely, semantic-rich entities are first extracted from the multimodal data, since they are more relevant to user preferences than other multimodal information. These entities are then integrated into the user-item interaction graph. Afterwards, a symmetric linear Graph Convolution Network (GCN) module is constructed to perform message propagation over the graph, in order to capture both high-order semantic correlation and collaborative filtering signals. Finally, the sentiment information fr

Keywords: Sentiment analysis

Scientific Value Weights more than Being Open or Toll Access: An analysis of the OA advantage in Nature and Science

Journal of Data and Information Science, 2021, Vol. 6, No. 4, pp. 62-75

Authors: Howell Y. Wang, Shelia X. Wei, Cong Cao, Xianwen Wang, Fred Y. Ye. Affiliations: International Joint Informatics Laboratory & Jiangsu Key Laboratory of Data Engineering and Knowledge Service, School of Information Management, Nanjing University, Nanjing 210023, China; Nottingham University Business School China, University of Nottingham Ningbo China, Ningbo 315100, China; WISE Lab., Dalian University of Technology, Dalian 116024, China

Purpose: We attempt to find out whether OA or TA really affects the dissemination of scientific ***/methodology/approach: We design the indicators, hot-degree, and R-index to indicate a topic OA or TA ***, according to the OA classification of the Web of Science (WoS), we collect data from the WoS by downloading OA and TA articles, letters, and reviews published in Nature and Science during 2010–*** papers are divided into three broad disciplines, namely biomedicine, physics, and ***, taking a discipline in a journal and using the classical Latent Dirichlet Allocation (LDA) to cluster 100 topics of OA and TA papers respectively, we apply the Pearson correlation coefficient to match the topics of OA and TA, and calculate the hot-degree and R-index of every OA-TA topic ***, characteristics of the discipline can be *** qualitative comparison, we choose some high-quality papers which belong to Nature remarkable papers or Science breakthroughs, and analyze the relations between OA/TA and citation ***: The result shows that OA hot-degree in biomedicine is significantly greater than that of TA, but significantly less than that of TA in *** on the R-index, it is found that OA advantages exist in biomedicine and TA advantages do in ***, the dissemination of average scientific discoveries in all fields is not necessarily affected by OA or ***, OA promotes the spread of important scientific discoveries in high-quality *** limitations: We lost some citations by ignoring other open sources such as arXiv and *** limitation came from that Nature employs some strong measures for access-promoting subscription-based articles, on which the boundary between OA and TA became *** implications: It is useful to select hot topics in a set of publications by the hot-degree *** finding comprehensively reflects the differences of OA and TA in different disciplines, which is a u

Keywords: Open access; Toll access; Scientific discovery; Academic dissemination; Data analysis

Hierarchical Alignment-enhanced Adaptive Grounding Network for Generalized Referring Expression Comprehension

arXiv, 2025

Authors: Wang, Yaxian; Ding, Henghui; He, Shuting; Jiang, Xudong; Wei, Bifan; Liu, Jun. Affiliations: School of Computer Science and Technology, Xi'an Jiaotong University, China; Ministry of Education Key Laboratory of Intelligent Networks and Network Security, Xi'an Jiaotong University, China; Institute of Big Data, Fudan University, China; Shanghai University of Finance and Economics, China; Nanyang Technological University, Singapore; School of Continuing Education, Xi'an Jiaotong University, China; Shaanxi Province Key Laboratory of Big Data Knowledge Engineering, Xi'an Jiaotong University, China

In this work, we address the challenging task of Generalized Referring Expression Comprehension (GREC). Compared to the classic Referring Expression Comprehension (REC) that focuses on single-target expressions, GREC extends the scope to a more practical setting by further encompassing no-target and multi-target expressions. Existing REC methods face challenges in handling the complex cases encountered in GREC, primarily due to their fixed output and limitations in multi-modal representations. To address these issues, we propose a Hierarchical Alignment-enhanced Adaptive Grounding Network (HieA2G) for GREC, which can flexibly deal with various types of referring expressions. First, a Hierarchical Multi-modal Semantic Alignment (HMSA) module is proposed to incorporate three levels of alignments, including word-object, phrase-object, and text-image alignment. It enables hierarchical cross-modal interactions across multiple levels to achieve comprehensive and robust multi-modal understanding, greatly enhancing grounding ability for complex cases. Then, to address the varying number of target objects in GREC, we introduce an Adaptive Grounding Counter (AGC) to dynamically determine the number of output targets. Additionally, an auxiliary contrastive loss is employed in AGC to enhance object-counting ability by pulling in multi-modal features with the same counting and pushing away those with different counting. Extensive experimental results show that HieA2G achieves new state-of-the-art performance on the challenging GREC task and also the other 4 tasks, including REC, Phrase Grounding, Referring Expression Segmentation (RES), and Generalized Referring Expression Segmentation (GRES), demonstrating the remarkable superiority and generalizability of the proposed HieA2G. Copyright © 2025, The Authors. All rights reserved.

Keywords: Electric grounding

Few-shot Open Relation Extraction with Gaussian Prototype and Adaptive Margin

arXiv, 2024

Authors: Guo, Tianlin; Zhang, Lingling; Wang, Jiaxin; Lei, Yunkuo; Li, Yifei; Wang, Haofen; Liu, Jun. Affiliations: School of Computer Science and Technology, Ministry of Education Key Laboratory of Intelligent Networks and Network Security, Xi'an Jiaotong University, Xi'an 710049, China; School of Computer Science and Technology, Shaanxi Province Key Laboratory of Big Data Knowledge Engineering, Xi'an Jiaotong University, Xi'an 710049, China; College of Design and Innovation, Tongji University, Shanghai 200092, China

Few-shot relation extraction with none-of-the-above (FsRE with NOTA) aims at predicting labels in few-shot scenarios with unknown classes. FsRE with NOTA is more challenging than the conventional few-shot relation extraction task, since the boundaries of unknown classes are complex and difficult to learn. Meta-learning based methods, especially prototype-based methods, are the mainstream solutions to this task. They obtain the classification boundary by learning the sample distribution of each class. However, their performance is limited because few-shot overfitting and NOTA boundary confusion lead to misclassification between known and unknown classes. To this end, we propose a novel framework based on Gaussian prototype and adaptive margin named GPAM for FsRE with NOTA, which includes three modules, semi-factual representation, GMM-prototype metric learning and decision boundary learning. The first two modules obtain better representations to solve the few-shot problem through debiased information enhancement and Gaussian space distance measurement. The third module learns more accurate classification boundaries and prototypes through adaptive margin and negative sampling. In the training procedure of GPAM, we use contrastive learning loss to comprehensively consider the effects of range and margin on the classification of known and unknown classes to ensure the model's stability and robustness. Sufficient experiments and ablations on the FewRel dataset show that GPAM surpasses previous prototype methods and achieves state-of-the-art performance. Copyright © 2024, The Authors. All rights reserved.

Keywords: Contrastive Learning
