Multimodal machine learning has achieved remarkable progress in a wide range of scenarios. However, the reliability of multimodal learning remains largely unexplored. In this paper, through extensive empirical studies...
ISBN (digital): 9798350317152
ISBN (print): 9798350317169
Graph Neural Networks (GNNs) have achieved great success in various data mining tasks, but they rely heavily on a large number of annotated nodes, which requires considerable human effort. Despite the effectiveness of existing GNN-based active learning (AL) methods, they assume that the annotated labels are always correct, which contradicts the error-prone labeling process in a practical crowdsourcing environment. Moreover, because of this impractical assumption, existing works focus only on optimizing node selection in AL and neglect optimizing the labeling process. Therefore, we present NC-ALG, the first GNN-based AL framework that optimizes both the node selection and the node labeling process under a noisy crowd. For node selection, NC-ALG introduces a new measurement to model influence reliability and an effective influence maximization objective to select nodes. For node labeling, NC-ALG significantly reduces the labeling cost by considering the model-predicted labels and the labels of mirror nodes. To the best of our knowledge, this is the first attempt to consider GNN-based AL under a practical noisy crowd. Empirical studies on public datasets demonstrate that NC-ALG significantly outperforms existing methods in terms of labeling efficiency. Notably, NC-ALG needs only one-third of the labeling budget that the competitive baseline GRAIN needs to achieve an accuracy of 70.7% on PubMed.
Multi-view clustering has attracted growing attention recently, since many real-world data are composed of different representations or views. Recent multi-view clustering works mainly exploit instance consistency to obtain shared representations across different views, and then apply a single-view clustering method to partition the data. However, these existing methods often ignore the inconsistency of instance associations within the views, which may enlarge the intra-class diversity among the views and therefore degrade clustering performance. To address this issue, this paper proposes an efficient mutual contrastive teacher-student learning (MC-TSL) model to enhance multi-view clustering, which is the first attempt to study inconsistency distillation for consistency learning. First, the proposed MC-TSL approach uses a view-specific encoder with two heads, an instance encoding head and a semantic distillation head, to capture consistent and discriminative feature representations. Specifically, the former head uses a cross-view contrastive learning method to obtain a redundancy-free consistent representation at the instance level, while the latter head uses a mutual teacher-student learning module to capture the intra-view information at the semantic level. By training these two heads in an end-to-end manner, the discriminative multi-view embeddings are efficiently obtained and refined by minimizing the weighted sum of the reconstruction loss, the contrastive loss and the contrast distillation loss. Extensive experiments verify the superiority of the proposed MC-TSL framework and show its competitive clustering performance.
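The abstract does not give the loss formulas, so the following is only a minimal sketch of the general idea, assuming an InfoNCE-style cross-view contrastive term and fixed loss weights. The function names, the temperature, and the weights `w_con` and `w_dis` are illustrative assumptions, not the MC-TSL definitions.

```python
import numpy as np

def cross_view_info_nce(z1, z2, temperature=0.5):
    """InfoNCE-style loss: row i of z1 should match row i of z2.

    Assumption: this generic form stands in for the paper's
    cross-view contrastive loss, which the abstract does not specify.
    """
    # L2-normalize so the dot product is a cosine similarity
    z1 = z1 / np.linalg.norm(z1, axis=1, keepdims=True)
    z2 = z2 / np.linalg.norm(z2, axis=1, keepdims=True)
    logits = z1 @ z2.T / temperature
    # Subtract the row maximum for numerical stability
    logits = logits - logits.max(axis=1, keepdims=True)
    log_prob = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    # The positive pair for instance i sits on the diagonal
    return float(-np.mean(np.diag(log_prob)))

def weighted_total_loss(rec, con, dis, w_con=1.0, w_dis=1.0):
    """Weighted sum of reconstruction, contrastive and distillation losses.

    The weights are hypothetical hyperparameters chosen for illustration.
    """
    return rec + w_con * con + w_dis * dis
```

With embeddings of the same instances in two views, the loss is lower when the rows are aligned than when one view is shuffled, which is the property the instance-level head relies on.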
The rapid advancements in artificial intelligence (AI) are catalyzing transformative changes in atomic modeling, simulation, and design. AI-driven potential energy models have demonstrated the capability to conduct large-scale, long-duration simulations with the accuracy of ab initio electronic structure ***, the model generation process remains a bottleneck for large-scale *** propose a shift towards a model-centric ecosystem, wherein a large atomic model (LAM), pretrained across multiple disciplines, can be efficiently fine-tuned and distilled for various downstream tasks, thereby establishing a new framework for molecular *** this study, we introduce the DPA-2 architecture as a prototype for ***-trained on a diverse array of chemical and materials systems using a multi-task approach, DPA-2 demonstrates superior generalization capabilities across multiple downstream tasks compared to the traditional single-task pre-training and fine-tuning *** approach sets the stage for the development and broad application of LAMs in molecular and materials simulation research.
Feature representation and feature fusion are important factors in image classification. In this paper, local features, mid-level features and convolutional features are combined using the multiple kernel learning method. Experimental results show that these three feature types can be fused effectively, improving classification performance by about 4%-6% on several popular benchmarks.
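As a rough illustration of kernel-level fusion, the sketch below builds one RBF kernel per feature type and combines them with non-negative weights. It assumes fixed, hand-chosen weights and RBF kernels; a real multiple kernel learning method would learn the weights from data, and the abstract does not say which kernels or solver the paper uses.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """RBF kernel matrix between the rows of A and the rows of B."""
    sq_dist = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    # Clip tiny negative values caused by floating-point round-off
    return np.exp(-gamma * np.maximum(sq_dist, 0.0))

def combine_kernels(kernels, weights):
    """Convex combination of kernel matrices.

    Assumption: the weights are given; multiple kernel learning
    would optimize them jointly with the classifier.
    """
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0):
        raise ValueError("kernel weights must be non-negative")
    w = w / w.sum()
    return sum(wi * K for wi, K in zip(w, kernels))
```

A non-negative combination of valid kernels is itself a valid kernel, so the fused matrix can be passed to any kernel classifier that accepts a precomputed kernel.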
It is meaningful to study high performance image classification algorithms for massive image management and effective organization. Image feature representations directly affect the performance of classification algor...
The three-way decisions model proposed by Yao gives a semantic interpretation of the positive region, negative region and boundary region. This model was developed in the framework of classical rough sets, the approached targe...
The rapid advancements in artificial intelligence (AI) are catalyzing transformative changes in atomic modeling, simulation, and design. AI-driven potential energy models have demonstrated the capability to conduct la...
ISBN (print): 9781538652152
In practice, there are many imbalanced data classification problems, for example spam filtering, credit card fraud detection and software defect prediction. Investigating imbalanced data classification is therefore important in both theory and application. To deal with this problem, this paper proposes an approach to binary imbalanced data classification based on the extreme learning machine autoencoder. The proposed method includes three steps. (1) The positive instances are used as seeds, and an extreme learning machine autoencoder generates new samples to increase the number of positive instances; the generated samples are similar to the positive instances but not identical. (2) Step (1) is repeated several times to obtain a balanced data set. (3) A classifier is trained on the balanced data set and used to classify unseen samples. The experimental results demonstrate that the proposed approach is feasible and effective.
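Steps (1) and (2) can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's procedure: it assumes the new samples are the lossy reconstructions of the positive instances produced by a freshly randomized ELM autoencoder in each round, with a tanh hidden layer and ridge-regularized output weights. The names `elm_ae_reconstruct` and `balance_with_elm_ae` and all hyperparameters are hypothetical.

```python
import numpy as np

def elm_ae_reconstruct(X, n_hidden, reg, rng):
    """Fit an ELM autoencoder on X and return its reconstruction of X."""
    n_features = X.shape[1]
    # Input weights and biases are random and never trained (the ELM idea)
    W = rng.standard_normal((n_features, n_hidden))
    b = rng.standard_normal(n_hidden)
    H = np.tanh(X @ W + b)
    # Output weights by ridge-regularized least squares: H @ beta ~ X
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ X)
    return H @ beta

def balance_with_elm_ae(X_pos, n_neg, n_hidden=2, reg=1e-2, seed=0):
    """Grow the positive class until it has n_neg samples.

    Assumption: each round uses new random hidden weights, so each
    round yields different samples near the original positives.
    """
    rng = np.random.default_rng(seed)
    batches = [X_pos]
    n_total = len(X_pos)
    while n_total < n_neg:
        new = elm_ae_reconstruct(X_pos, n_hidden, reg, rng)
        new = new[: n_neg - n_total]  # do not overshoot the target size
        batches.append(new)
        n_total += len(new)
    return np.vstack(batches)
```

Step (3) then trains any standard classifier on the generated positives together with the original negative instances.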