Search Results - Inner Mongolia University Library

Robust domain adaptation with noisy and shifted label distribution

Frontiers of Computer Science, 2025, Vol. 19, No. 3, pp. 25-36

Authors: Shao-Yuan LI, Shi-Ji ZHAO, Zheng-Tao CAO, Sheng-Jun HUANG, Songcan CHEN. Affiliations: MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China.

Unsupervised Domain Adaptation (UDA) intends to achieve excellent results by transferring knowledge from labeled source domains to unlabeled target domains in which the data or label distribution *** UDA methods have acquired great success when labels in the source domain are ***, even the acquisition of scarce clean labels in the source domain needs plenty of costs as *** the presence of label noise in the source domain, the traditional UDA methods will be seriously degraded as they do not deal with the label *** this paper, we propose an approach named Robust Self-training with Label Refinement (RSLR) to address the above *** adopts the self-training framework by maintaining a Labeling Network (LNet) on the source domain, which is used to provide confident pseudo-labels to target samples, and a Target-specific Network (TNet) trained by using the pseudo-labeled *** combat the effect of label noise, LNet progressively distinguishes and refines the mislabeled source *** combination with class rebalancing to combat the label distribution shift issue, RSLR achieves effective performance on extensive benchmark datasets.
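The abstract's two ingredients, confident pseudo-labels and class rebalancing, can be illustrated with a minimal sketch. This is not the RSLR implementation: the function name `select_pseudo_labels`, the confidence threshold of 0.9, and the rule of capping every class at the size of the smallest confident class are all assumptions chosen for illustration.

```python
from collections import defaultdict

def select_pseudo_labels(probs, threshold=0.9, per_class_cap=None):
    """Return sorted (index, label) pairs of confident, class-balanced samples.

    probs: list of per-sample class-probability lists, e.g. softmax outputs
    of a labeling network applied to unlabeled target samples.
    """
    by_class = defaultdict(list)  # label -> [(confidence, index), ...]
    for i, p in enumerate(probs):
        conf = max(p)
        if conf >= threshold:  # keep only confident predictions
            by_class[p.index(conf)].append((conf, i))
    if not by_class:
        return []
    if per_class_cap is None:
        # Assumed rebalancing rule: cap every class at the size of the
        # smallest confident class so no class dominates the pseudo-labels.
        per_class_cap = min(len(items) for items in by_class.values())
    selected = []
    for label, items in by_class.items():
        items.sort(key=lambda t: -t[0])  # most confident first
        selected.extend((i, label) for _, i in items[:per_class_cap])
    return sorted(selected)

probs = [[0.95, 0.05], [0.92, 0.08], [0.97, 0.03], [0.10, 0.90], [0.60, 0.40]]
print(select_pseudo_labels(probs))
```

In this toy input, class 0 has three confident samples and class 1 has one, so the cap keeps only the single most confident sample of each class.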

Keywords: unsupervised domain adaptation; label noise; label distribution shift; self-training; class rebalancing

Boundary Data Augmentation for Offline Reinforcement Learning

ZTE Communications, 2023, Vol. 21, No. 3, pp. 29-36

Authors: SHEN Jiahao, JIANG Ke, TAN Xiaoyang. Affiliations: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China.

Offline reinforcement learning (ORL) aims to learn a rational agent purely from behavior data without any online *** of the major challenges encountered in ORL is the problem of distribution shift, i.e., the mismatch between the knowledge of the learned policy and the reality of the underlying *** works usually handle this in an overly pessimistic manner to avoid out-of-distribution (OOD) queries as much as possible, but this can influence the robustness of the agents at unseen *** this paper, we propose a simple but effective method to address this *** key idea of our method is to enhance the robustness of the new policy learned offline by weakening its confidence in highly uncertain regions, and we propose to find those regions by simulating them with modified Generative Adversarial Nets (GAN) such that the generated data not only follow the same distribution as the old experience but are very difficult to deal with by themselves, with regard to the behavior policy or some other reference *** then use this information to regularize the ORL algorithm to penalize the overconfidence behavior in these *** experiments on several publicly available offline RL benchmarks demonstrate the feasibility and effectiveness of the proposed method.

Keywords: offline reinforcement learning; out-of-distribution state; robustness; uncertainty

DouGNN: An End-to-End Deep Learning Framework for Predicting Individual Behaviors from fMRI Data

2nd International Conference on Image Processing, Computer Vision and Machine Learning, ICICML 2023

Authors: Cao, Qumei; Wen, Xuyun. Affiliations: MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing University of Aeronautics and Astronautics, College of Computer Science and Technology, Jiangsu, Nanjing, China.

ISBN: (print) 9798350331417

Predicting individual behavior from functional connectivity (FC) using machine learning is a critical research topic in neuroscience. While various models have been proposed, they mainly focus on designing behavior prediction methods, overlooking the influence of upstream FC network construction approach on downstream tasks. In this paper, leveraging graph neural networks (GNNs), we introduce an end-to-end deep learning framework named DouGNN, aimed at learning individual behavior from fMRI data. Through joint training of two GNNs, DouGNN innovatively integrates individualized cortical parcellation (for FC network construction) and individual behavior prediction into a unified optimization model. By integrating individualized cortical parcellation, the model gains the ability to dynamically modify the subject's cortical parcellation and the downstream tasks. This enhancement enables the subsequently constructed FC networks to better represent individual-specific features and task-related characteristics, thereby significantly improving the performance of individual behavior prediction. Moreover, the generated individual-specific, task-aware cortical parcellations from this process can also help us in understanding the relationship between FC and behaviors. We conducted three representative behavior prediction experiments on the publicly available HCP dataset. The results demonstrate that DouGNN outperforms existing methods in all behavior prediction tasks, achieving superior performance and generating more functionally homogeneous individualized cortical parcellations. © 2023 IEEE.

Keywords: behavior prediction; functional magnetic resonance imaging; graph neural network; individualized cortical parcellation

A Systematic Evaluation of Large Language Models for Natural Language Generation Tasks

22nd Chinese National Conference on Computational Linguistics, CCL 2023

Authors: Ni, Xuanfan; Li, Piji. Affiliations: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Jiangsu, Nanjing 210016, China.

ISBN: (print) 9781713889892

Recent efforts have evaluated large language models (LLMs) in areas such as commonsense reasoning, mathematical reasoning, and code generation. However, to the best of our knowledge, no work has specifically investigated the performance of LLMs in natural language generation (NLG) tasks, a pivotal criterion for determining model excellence. Thus, this paper conducts a comprehensive evaluation of well-known and high-performing LLMs, namely ChatGPT, ChatGLM, T5-based models, LLaMA-based models, and Pythia-based models, in the context of NLG tasks. We select English and Chinese datasets encompassing Dialogue Generation and Text Summarization. Moreover, we propose a common evaluation setting that incorporates input templates and post-processing strategies. Our study reports automatic results, accompanied by a detailed analysis. © 2023 China National Conference on Computational Linguistics.

Keywords: computational linguistics

AE-TPGG: a novel autoencoder-based approach for single-cell RNA-seq data imputation and dimensionality reduction

Frontiers of Computer Science, 2023, Vol. 17, No. 3, pp. 217-234

Authors: Shuchang ZHAO, Li ZHANG, Xuejun LIU. Affiliations: MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; Collaborative Innovation Center of Novel Software Technology and Industrialization, Nanjing 210023, China; College of Computer Science and Technology, Nanjing Forestry University, Nanjing 210037, China.

Single-cell RNA sequencing (scRNA-seq) technology has become an effective tool for high-throughput transcriptomic study, which circumvents the averaging artifacts corresponding to bulk RNA-seq technology, yielding new perspectives on the cellular diversity of potential superficially homogeneous *** various sequencing techniques have decreased the amplification bias and improved capture efficiency caused by the low amount of starting material, the technical noise and biological variation are inevitably introduced into the experimental process, resulting in high dropout events, which greatly hinder the downstream *** the bimodal expression pattern and the right-skewed characteristic present in normalized scRNA-seq data, we propose a customized autoencoder based on a two-part-generalized-gamma distribution (AE-TPGG) for scRNA-seq data analysis, which takes mixed discrete-continuous random variables of scRNA-seq data into account using a two-part model and utilizes the generalized gamma (GG) distribution for fitting the positive and right-skewed continuous *** adopted autoencoder enables AE-TPGG to capture the inherent relationship between *** addition to the ability of achieving low-dimensional representation, the AE-TPGG model also provides a denoised imputation according to statistical characteristic of gene *** on real datasets demonstrate that our proposed model is competitive with current imputation methods and ameliorates a diverse set of typical scRNA-seq data analyses.

Keywords: scRNA-seq; autoencoder; TPGG; data imputation; dimensionality reduction

Characteristic AI Agents via Large Language Models

Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024

Authors: Wang, Xi; Dai, Hongliang; Gao, Shen; Li, Piji. Affiliations: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing, China; School of Computer Science and Technology, Shandong University, China.

ISBN: (print) 9782493814104

The advancement of Large Language Models (LLMs) has led to significant enhancements in the performance of chatbot systems. Many researchers have dedicated their efforts to bringing characteristics to chatbots. While there have been commercial products for developing role-driven chatbots using LLMs, it is worth noting that academic research in this area remains relatively scarce. Our research focuses on investigating the performance of LLMs in constructing Characteristic AI Agents by simulating real-life individuals across different settings. Current investigations have primarily focused on acting out roles with simple profiles. In response to this research gap, we create a benchmark for the characteristic AI agents task, including dataset, techniques, and evaluation metrics. A dataset called "Character100" is built for this benchmark, comprising the most-visited people on Wikipedia for language models to role-play. With the constructed dataset, we conduct a comprehensive assessment of LLMs across various settings. In addition, we devise a set of automatic metrics for quantitative performance evaluation. The experimental results underscore the potential directions for further improvement in the capabilities of LLMs in constructing characteristic AI agents. The benchmark is available at https://***/nuaa-nlp/Character100. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Keywords: computational linguistics

On the learning dynamics of two-layer quadratic neural networks for understanding deep learning

Frontiers of Computer Science, 2022, Vol. 16, No. 3, pp. 77-82

Authors: Zhenghao TAN, Songcan CHEN. Affiliations: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, Nanjing 211106, China; College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing 211106, China.

Deep learning performs as a powerful paradigm in many real-world applications; however, its mechanism remains much of a *** gain insights about nonlinear hierarchical deep networks, we theoretically describe the coupled nonlinear learning dynamic of the two-layer neural network with quadratic activations, extending existing results from the linear *** quadratic activation, although rarely used in practice, shares convexity with the widely used ReLU activation, thus producing similar *** this work, we focus on the case of a canonical regression problem under the standard normal distribution and use a coupled dynamical system to mimic the gradient descent method in the sense of a continuous-time limit, then use the high-order moment tensor of the normal distribution to simplify these ordinary differential *** simplified system yields unexpected fixed *** existence of these non-global-optimal stable points leads to the existence of saddle points in the loss surface of the quadratic *** analysis shows there are conserved quantities during the training of the quadratic *** quantities might result in a failed learning process if the network is initialized ***, we illustrate the comparison between the numerical learning curves and the theoretical one, which reveals the two alternately appearing stages of the learning process.

Keywords: learning dynamic; quadratic network; ordinary differential equations

Forgetting, Ignorance or Myopia: Revisiting Key Challenges in Online Continual Learning

38th Conference on Neural Information Processing Systems, NeurIPS 2024

Authors: Wang, Xinrui; Geng, Chuanxing; Wan, Wenhai; Li, Shao-Yuan; Chen, Songcan. Affiliations: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, China; School of Computer Science and Technology, Huazhong University of Science and Technology, China.

Online continual learning (OCL) requires the models to learn from constant, endless streams of data. While significant efforts have been made in this field, most were focused on mitigating the catastrophic forgetting issue to achieve better classification ability, at the cost of a much heavier training workload. They overlooked that in real-world scenarios, e.g., in high-speed data stream environments, data do not pause to accommodate slow models. In this paper, we emphasize that model throughput, defined as the maximum number of training samples that a model can process within a unit of time, is equally important. It directly limits how much data a model can utilize and presents a challenging dilemma for current methods. With this understanding, we revisit key challenges in OCL from both empirical and theoretical perspectives, highlighting two critical issues beyond the well-documented catastrophic forgetting: (i) Model's ignorance: the single-pass nature of OCL challenges models to learn effective features within constrained training time and storage capacity, leading to a trade-off between effective learning and model throughput; (ii) Model's myopia: the local learning nature of OCL on the current task leads the model to adopt overly simplified, task-specific features and an excessively sparse classifier, resulting in a gap between the optimal solution for the current task and the global objective. To tackle these issues, we propose the Non-sparse Classifier Evolution framework (NsCE) to facilitate effective global discriminative feature learning with minimal time cost. NsCE integrates non-sparse maximum separation regularization and targeted experience replay techniques with the help of pre-trained models, enabling rapid acquisition of new globally discriminative features. Extensive experiments demonstrate the substantial improvements of our framework in performance, throughput and real-world practicality. © 2024 Neural information processing systems foundation.

All-around Neural Collapse for Imbalanced Classification

arXiv, 2024

Authors: Zhang, Enhao; Li, Chaohua; Geng, Chuanxing; Chen, Songcan. Affiliations: MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, China; Nanjing 211106, China.

Neural Collapse (NC) presents an elegant geometric structure that enables individual activations (features), class means and classifier (weights) vectors to reach optimal interclass separability during the terminal phase of training on a balanced dataset. Once shifted to imbalanced classification, such an optimal structure of NC can be readily destroyed by the notorious minority collapse, where the classifier vectors corresponding to the minority classes are squeezed. In response, existing works endeavor to recover NC typically by optimizing classifiers. However, we discover that this squeezing phenomenon is not only confined to classifier vectors but also occurs with class means. Consequently, reconstructing NC solely at the classifier aspect may be futile, as the feature means remain compressed, leading to the violation of inherent self-duality in NC (i.e., class means and classifier vectors converge mutually) and incidentally, resulting in an unsatisfactory collapse of individual activations towards the corresponding class means. To shake off these dilemmas, we present a unified All-around Neural Collapse framework (AllNC), aiming to comprehensively restore NC across multiple aspects including individual activations, class means and classifier vectors. We thoroughly analyze its effectiveness and verify on multiple benchmark datasets that it achieves state-of-the-art in both balanced and imbalanced settings. Copyright © 2024, The Authors. All rights reserved.
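In the balanced setting, the geometry associated with neural collapse is commonly described as a simplex equiangular tight frame (ETF): K unit vectors whose pairwise cosine similarity is -1/(K-1), the most negative value K vectors can share equally. The sketch below only illustrates that target geometry. It is not part of AllNC, and the helper names `simplex_etf` and `cosine` are hypothetical.

```python
import math

def simplex_etf(num_classes):
    """Return K unit vectors in R^K (as rows) that form a simplex ETF."""
    k = num_classes
    scale = math.sqrt(k / (k - 1))
    # Row i is the scaled, mean-centered i-th standard basis vector.
    return [[scale * ((1.0 if i == j else 0.0) - 1.0 / k) for j in range(k)]
            for i in range(k)]

def cosine(u, v):
    """Cosine similarity between two vectors given as lists."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

etf = simplex_etf(4)
for i in range(4):
    for j in range(i + 1, 4):
        print(i, j, round(cosine(etf[i], etf[j]), 6))
```

For K = 4 every pair has cosine -1/3. Minority collapse, as the abstract describes it, corresponds to the minority-class vectors being squeezed together so that these angles are no longer equal.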

Keywords: contrastive learning

Causality-enhanced Discreted Physics-informed Neural Networks for Predicting Evolutionary Equations

33rd International Joint Conference on Artificial Intelligence, IJCAI 2024

Authors: Li, Ye; Chen, Siqi; Shan, Bin; Huang, Sheng-Jun. Affiliations: College of Computer Science and Technology, Nanjing University of Aeronautics and Astronautics, China; College of Electronic and Information Engineering, Nanjing University of Aeronautics and Astronautics, China; MIIT Key Laboratory of Pattern Analysis and Machine Intelligence, Nanjing, China.

ISBN: (print) 9781956792041

Physics-informed neural networks (PINNs) have shown promising potential for solving partial differential equations (PDEs) using deep learning. However, PINNs face training difficulties for evolutionary PDEs, particularly for dynamical systems whose solutions exhibit multi-scale or turbulent behavior over time. The reason is that PINNs may violate the temporal causality property since all the temporal features in the PINNs loss are trained simultaneously. This paper proposes to use implicit time differencing schemes to enforce temporal causality, and use transfer learning to sequentially update the PINNs in space as surrogates for PDE solutions in different time frames. The evolving PINNs are better able to capture the varying complexities of the evolutionary equations, while only requiring minor updates between adjacent time frames. Our method is theoretically proven to be convergent if the time step is small and each PINN in different time frames is well-trained. In addition, we provide state-of-the-art (SOTA) numerical results for a variety of benchmarks for which existing PINNs formulations may fail or be inefficient. We demonstrate that the proposed method improves the accuracy of PINNs approximation for evolutionary PDEs and improves efficiency by a factor of 4-40x. The code is available at https://***/SiqiChen9/TL-DPINNs. © 2024 International Joint Conferences on Artificial intelligence. All rights reserved.
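The idea of implicit time differencing can be illustrated on a toy problem. The sketch below applies the backward (implicit) Euler scheme to the scalar decay equation du/dt = -lam * u, where the implicit equation at each step has a closed-form solution. It is an assumption-laden illustration, not the authors' method: the abstract describes PINNs as the per-frame surrogates, whereas here a plain algebraic solve stands in for them, and the function name and parameter values are hypothetical.

```python
def backward_euler_decay(u0, lam, dt, steps):
    """Integrate du/dt = -lam * u with the implicit (backward) Euler scheme.

    Each step solves u_next = u_prev + dt * (-lam * u_next) for u_next,
    so every time frame is computed only from the frame before it.
    """
    trajectory = [u0]
    for _ in range(steps):
        u_prev = trajectory[-1]
        # Closed-form solution of the implicit equation for this linear ODE.
        u_next = u_prev / (1.0 + lam * dt)
        trajectory.append(u_next)
    return trajectory

# lam * dt = 5 is a stiff step: explicit Euler would multiply by (1 - 5) = -4
# each step and diverge, while the implicit update multiplies by 1/6.
traj = backward_euler_decay(u0=1.0, lam=50.0, dt=0.1, steps=5)
print(traj)
```

Because each frame depends only on the previous one, the frames are solved in temporal order, which is the causality property the abstract says a simultaneously trained PINN loss may violate.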
