The transformer architecture [1] has been widely used for natural language processing (NLP) tasks. Inspired by its excellent performance in NLP, transformer-based models [2, 3] have set many new records in various computer vision tasks. However, most vision transformers (ViTs) suffer from large model sizes, high run-time memory consumption, and high computational costs. Therefore, there is a pressing need to develop and deploy lightweight and efficient vision transformers.
With the rapid development of deep learning, current deep models can learn a fixed number of classes with high performance. However, in our ever-changing world, data often come from an open environment, arriving as a stream or available only temporarily due to privacy concerns. As a result, a classification model should learn new classes incrementally instead of restarting the training process.
For Unmanned Aerial Vehicle (UAV) monitoring tasks, capturing high-quality images of target objects is important for subsequent recognition. To address this problem, many prior works study placement/trajectory planni...
Relation extraction is a pivotal task within the field of natural language processing, boasting numerous real-world applications. Prior research predominantly centers on monolingual relation extraction or cross-lingual enhancement for relation extraction. However, there exists a notable gap in understanding relation extraction within mix-lingual (or code-switching) scenarios, in which individuals blend content from different languages within sentences, generating mix-lingual content. The effectiveness of existing relation extraction models in such scenarios remains largely unexplored due to the absence of dedicated datasets. To address this gap, we introduce the Mix-Lingual Relation Extraction (MixRE) task and construct a human-annotated dataset MixRED to support this task. Moreover, we propose a hierarchical training approach for the mix-lingual scenario named Mix-Lingual Training (MixTrain), designed to enhance the performance of large language models (LLMs) when capturing relational dependencies from mix-lingual content spanning different semantic levels. Our experiments involve evaluating state-of-the-art supervised models and LLMs on the constructed dataset, with results indicating that MixTrain notably improves model performance. Additionally, we investigate the effectiveness of using mix-lingual content as a tool to transfer learned relational dependencies across different languages. Finally, we delve into factors influencing model performance for both supervised models and LLMs in the novel MixRE task.
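To make the setting concrete, the following is a hypothetical illustration of what a mix-lingual relation-extraction instance could look like; the field names and the sentence are invented for exposition and are not drawn from the MixRED dataset.

# Hypothetical MixRED-style instance (illustrative only): English entities
# embedded in a Chinese carrier sentence, annotated with one relation.
sample = {
    # "Steve Jobs founded Apple in California in 1976."
    "sentence": "Steve Jobs 于 1976 年在加州创立了 Apple。",
    "head": "Steve Jobs",
    "tail": "Apple",
    "relation": "founder_of",
}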
Hallucination is a big shadow hanging over the rapidly evolving multimodal large language models (MLLMs), referring to the phenomenon that the generated text is inconsistent with the image content. To mitigate hallucinations, existing studies mainly resort to instruction tuning, which requires retraining the models with specific data. In this paper, we pave a different way, introducing a training-free method named Woodpecker. Like a woodpecker heals trees, it picks out and corrects hallucinations in the generated text. Concretely, Woodpecker consists of five stages: key concept extraction, question formulation, visual knowledge validation, visual claim generation, and hallucination correction. Implemented in a post-remedy manner, Woodpecker can easily serve different MLLMs while remaining interpretable, since the intermediate outputs of the five stages can be inspected. We evaluate Woodpecker both quantitatively and qualitatively and show the huge potential of this new paradigm. On the POPE benchmark, our method obtains a 30.66%/24.33% improvement in accuracy over the baseline MiniGPT-4/mPLUG-Owl. The source code is released at https://***/BradyFU/Woodpecker.
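As a rough illustration of how the five stages could fit together, here is a minimal Python sketch; the llm, vqa_model, and detector interfaces and all their method names are assumptions for exposition, not the released implementation.

def woodpecker_correct(image, generated_text, llm, vqa_model, detector):
    # Stage 1: key concept extraction -- pull the main objects mentioned in
    # the generated text (e.g., "dog", "frisbee"). The interface is assumed.
    concepts = llm.extract_key_concepts(generated_text)

    # Stage 2: question formulation -- turn each concept into verification
    # questions such as "Is there a dog in the image?".
    questions = [q for c in concepts for q in llm.formulate_questions(c)]

    # Stage 3: visual knowledge validation -- answer the questions against the
    # image with a VQA model and ground the concepts with an object detector.
    answers = [vqa_model.answer(image, q) for q in questions]
    boxes = {c: detector.detect(image, c) for c in concepts}

    # Stage 4: visual claim generation -- compile the validated answers and
    # detections into structured claims about the image.
    claims = llm.generate_claims(questions, answers, boxes)

    # Stage 5: hallucination correction -- rewrite the original text so every
    # statement is consistent with the visual claims, and return it.
    return llm.correct(generated_text, claims)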
Emotion-cause pair extraction (ECPE) aims to extract all the pairs of emotions and corresponding causes in a document. It generally contains three subtasks: emotion extraction, cause extraction, and detection of causal relations between emotions and causes. Existing works adopt pipelined approaches or multi-task learning to address the ECPE task. However, the pipelined approaches easily suffer from error propagation in real-world scenarios, and multi-task learning cannot optimize all tasks globally, which may lead to suboptimal extraction results. To address these issues, we propose a novel framework, the Pairwise Tagging Framework (PTF), tackling the complete emotion-cause pair extraction in one unified tagging task. Unlike prior works, PTF innovatively transforms all subtasks of ECPE, i.e., emotion extraction, cause extraction, and causal relation detection between emotions and causes, into one unified clause-pair tagging task. Through this unified tagging task, we can optimize the ECPE task globally and extract more accurate emotion-cause pairs. To validate the feasibility and effectiveness of PTF, we design an end-to-end PTF-based neural network and conduct experiments on the ECPE benchmark dataset. The experimental results show that our method significantly outperforms both pipelined approaches and typical multi-task learning approaches.
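For intuition, the sketch below shows how a single clause-pair tag table can yield emotions, causes, and pairs in one decoding pass; the two-tag scheme ("EC"/"O") is a simplified assumption, not the paper's exact tagging scheme.

def decode_pairs(tag_matrix):
    """tag_matrix[i][j] is the predicted tag for clause pair (i, j):
    "EC" if clause i expresses an emotion caused by clause j, else "O"."""
    pairs = []
    n = len(tag_matrix)
    for i in range(n):          # candidate emotion clause
        for j in range(n):      # candidate cause clause
            if tag_matrix[i][j] == "EC":
                pairs.append((i, j))
    # All three subtask outputs fall out of the one tagging table.
    emotions = sorted({i for i, _ in pairs})
    causes = sorted({j for _, j in pairs})
    return emotions, causes, pairs

# Example: clause 2 is an emotion caused by clauses 0 and 1.
tags = [["O", "O", "O"],
        ["O", "O", "O"],
        ["EC", "EC", "O"]]
print(decode_pairs(tags))  # ([2], [0, 1], [(2, 0), (2, 1)])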
Long-tailed multi-label text classification aims to identify a subset of relevant labels from a large candidate label set, where the training datasets usually follow long-tailed label distributions. Many previous studies have treated head and tail labels equally, resulting in unsatisfactory performance on tail labels. To address this issue, this paper proposes a novel learning method that combines arbitrary models with two steps. The first step is a "diverse ensemble" that encourages diverse predictions among multiple shallow classifiers, particularly on tail labels, and can improve the generalization of tail labels. The second is an "error correction" step that takes advantage of the base model's accurate predictions on head labels and approximates its residual errors on tail labels. Thus, it enables the "diverse ensemble" to focus on optimizing tail label performance. The overall procedure is called the residual diverse ensemble (RDE). RDE is implemented via a single-hidden-layer perceptron and can scale up to hundreds of thousands of labels. We empirically show that RDE consistently improves many existing models with considerable performance gains on benchmark datasets, especially with respect to propensity-scored evaluation metrics. Moreover, RDE converges in fewer than 30 training epochs without increasing the computational overhead.
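A conceptual sketch of the two steps, assuming a pretrained base model whose scores are given; the paper's diversity-encouraging objective is approximated here by random seeds, and the propensity scoring is omitted.

import numpy as np
from sklearn.neural_network import MLPRegressor

def train_rde(X, Y, base_scores, n_members=3, hidden=512):
    # "Error correction": the ensemble fits the base model's residual errors,
    # which concentrate on tail labels that the base model handles poorly.
    residual_target = Y - base_scores
    members = []
    for seed in range(n_members):  # different seeds stand in for the
        m = MLPRegressor(          # paper's diversity-encouraging loss
            hidden_layer_sizes=(hidden,),  # single hidden layer, as in RDE
            random_state=seed,
            max_iter=30,           # the paper reports convergence < 30 epochs
        )
        m.fit(X, residual_target)
        members.append(m)
    return members

def rde_predict(base_scores, members, X):
    # Final score = head-accurate base prediction + averaged tail correction.
    return base_scores + np.mean([m.predict(X) for m in members], axis=0)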
Multi-label image classification is a critical task in computer vision, in which modern classifiers typically exploit the correlations between labels for effective classification. In this study, we propose...
Partial-label learning (PLL) is a typical weakly supervised learning problem, where each training instance is annotated with a set of candidate labels. Self-training PLL models achieve state-of-the-art performance but suffer from error accumulation caused by mistakenly disambiguated instances. Although co-training can alleviate this issue by training two networks simultaneously and allowing them to interact with each other, most existing co-training methods train two structurally identical networks on the same task, i.e., they are symmetric, rendering them insufficient for correcting each other due to their similar limitations. Therefore, in this paper, we propose an asymmetric dual-task co-training PLL model called AsyCo, which forces its two networks, i.e., a disambiguation network and an auxiliary network, to learn from different views explicitly by optimizing distinct tasks. Specifically, the disambiguation network is trained with a self-training PLL task to learn label confidence, while the auxiliary network is trained in a supervised learning paradigm to learn from noisy pairwise similarity labels that are constructed according to the learned label confidence. Finally, the error accumulation problem is mitigated via information distillation and confidence refinement. Extensive experiments on both uniform and instance-dependent partially labeled datasets demonstrate the effectiveness of AsyCo.
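One ingredient can be made concrete: constructing noisy pairwise similarity labels from the disambiguation network's label confidence. The rule below (agreement of the most confident classes) is an illustrative assumption, not necessarily the paper's exact construction.

import torch

def pairwise_similarity_labels(confidence):
    """confidence: (N, C) tensor of per-instance label confidence learned
    by the disambiguation network."""
    pseudo = confidence.argmax(dim=1)  # most confident class per instance
    # Two instances are labeled "similar" (1) if they share the same
    # pseudo-label, "dissimilar" (0) otherwise -- noisy supervision that
    # gives the auxiliary network a different view of the same data.
    return (pseudo.unsqueeze(0) == pseudo.unsqueeze(1)).float()

conf = torch.softmax(torch.randn(4, 5), dim=1)
print(pairwise_similarity_labels(conf))  # 4x4 matrix of 0/1 similarities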
The large-scale multi-objective optimization algorithm (LSMOA), based on the grouping of decision variables, is an advanced method for handling high-dimensional decision variables. However, in practical problems, the interaction among decision variables is intricate, leading to large group sizes and suboptimal optimization effects; hence, a large-scale multi-objective optimization algorithm based on weighted overlapping grouping of decision variables (MOEAWOD) is proposed in this paper. First, the decision variables are perturbed and categorized into convergence and diversity variables; subsequently, the convergence variables are subdivided into groups based on the interactions among different decision variables. If the size of a group surpasses the set threshold, that group undergoes a process of weighting and overlapping grouping. Specifically, the interaction strength is evaluated based on the interaction frequency and the number of objectives involved among various decision variables. The decision variable with the highest interaction in the group is identified and set aside, and the remaining variables are then reclassified into groups. Finally, the decision variable with the strongest interaction is added to each group. This minimizes the interactivity between different groups and maximizes the interactivity of decision variables within groups, which contributes to optimizing the directions of convergence and diversity exploration with different groups. MOEAWOD was tested on 18 benchmark large-scale optimization problems, and the experimental results demonstrate the effectiveness of our method. Compared with other algorithms, our method remains advantageous.
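An illustrative sketch of the weighted overlapping grouping step, under simplified assumptions: the perturbation-based interaction test is replaced by a given interaction-strength table, and the greedy grouping and splitting rules are stand-ins for the paper's procedure.

def overlapping_groups(interaction, threshold=4):
    """interaction: dict mapping variable index -> {neighbor: strength}."""
    strength = {v: sum(nb.values()) for v, nb in interaction.items()}
    # Initial grouping: directly interacting variables go together (greedy).
    groups, assigned = [], set()
    for v in sorted(interaction):
        if v in assigned:
            continue
        group = {v} | set(interaction[v])
        groups.append(group)
        assigned |= group
    refined = []
    for g in groups:
        if len(g) > threshold:
            # Set aside the most interactive variable, split the rest...
            pivot = max(g, key=lambda v: strength.get(v, 0.0))
            rest = sorted(g - {pivot})
            halves = [set(rest[: len(rest) // 2]), set(rest[len(rest) // 2:])]
            # ...then add it back to every subgroup, creating the overlap
            # that preserves within-group interactivity.
            refined += [h | {pivot} for h in halves if h]
        else:
            refined.append(g)
    return refined

# Example: variable 1 interacts strongly and ends up shared across groups.
inter = {0: {1: 2.0}, 1: {0: 2.0, 2: 1.0}, 2: {1: 1.0, 3: 0.5}, 3: {2: 0.5}}
print(overlapping_groups(inter, threshold=2))  # [{0, 1}, {1, 2}, {1, 3}]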