Node classification is an essential problem in graph learning. However, many models typically obtain unsatisfactory performance when applied to few-shot scenarios. Some studies have attempted to combine meta-learning ...
The integration of psychology and computer science has become the mainstream contemporary research method on psychological data. Weibo, China's largest open platform for communication and information sharing betwe...
In a world continually brimming with new products, novel waste types are ubiquitous. As a result, current image-based garbage classification systems struggle to perform well due to the long-tailed effects of distributi...
Spreadsheets contain a lot of valuable data and have many practical applications. The key technology of these practical applications is how to make machines understand the semantic structure of spreadsheets, e.g., identifying cell function types and discovering relationships between cell pairs. Most existing methods for understanding the semantic structure of spreadsheets do not make use of the semantic information of cells. A few studies do, but they ignore the layout structure information of spreadsheets, which affects the performance of cell function classification and the discovery of different relationship types of cell pairs. In this paper, we propose a Heuristic algorithm for Understanding the Semantic Structure of spreadsheets (HUSS). Specifically, for improving the cell function classification, we propose an error correction mechanism (ECM) based on an existing cell function classification model [11] and the layout features of spreadsheets. For improving the table structure analysis, we propose five types of heuristic rules to extract four different types of cell pairs, based on the cell style and spatial location information. The experimental results on five real-world datasets demonstrate that HUSS can effectively understand the semantic structure of spreadsheets and outperforms corresponding baselines.
This paper presents a Scientific Literature Management Platform (SLMP, demo link1) based on large language models (LLMs). The platform consists of four modules: literature management, literature extraction, literatur...
Cross-lingual image captioning, with its ability to caption an unlabeled image in a target language other than English, is an emerging topic in the multimedia field. In order to save the precious human resource from r...
Relation prediction in knowledge graphs (KGs) aims at predicting missing relations in incomplete triples, but the dominant KG-embedding paradigm is limited in predicting the relation between unseen entities...
Relation prediction in knowledge graphs (KGs) aims at predicting missing relations in incomplete triples, whereas the dominant embedding paradigm has a restriction on handling unseen entities during testing. In the re...
Semi-Supervised Learning (SSL) under class distribution mismatch aims to tackle a challenging problem wherein unlabeled data contain lots of unknown categories unseen in the labeled ones. In such mismatch scenarios, traditional SSL suffers severe performance damage due to the harmful invasion of the instances with unknown categories into the target classifier. In this study, by strict mathematical reasoning, we reveal that the SSL error under class distribution mismatch is composed of pseudo-labeling error and invasion error, both of which jointly bound the SSL population risk. To alleviate the SSL error, we propose a robust SSL framework called Weight-Aware Distillation (WAD) that, by weights, selectively transfers knowledge beneficial to the target task from unsupervised contrastive representation to the target classifier. Specifically, WAD captures adaptive weights and high-quality pseudo-labels to target instances by exploring point mutual information (PMI) in representation space to maximize the role of unlabeled data and filter unknown categories. Theoretically, we prove that WAD has a tight upper bound of population risk under class distribution mismatch. Experimentally, extensive results demonstrate that WAD outperforms five state-of-the-art SSL approaches and one standard baseline on two benchmark datasets, CIFAR10 and CIFAR100, and an artificial cross-dataset. The code is available at https://***/RUC-DWBI-ML/research/tree/main/WAD-master.
Relation clustering is a general approach for open relation extraction (OpenRE). Current methods have two major problems. One is that their good performance relies on large amounts of labeled and pre-defined relationa...