Search Results - Inner Mongolia University Library

2022 Conference on Empirical Methods in Natural Language Processing, EMNLP 2022

Authors: Chen, Zhi; Chen, Bei; Chen, Lu; Yu, Kai; Lou, Jian-Guang. Affiliations: X-LANCE Lab, Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, China; Microsoft Research Asia.

Thanks to the development of pre-trained language models, multitask learning (MTL) methods have achieved great success in natural language understanding. However, current MTL methods pay more attention to task selection or model design to fuse as much knowledge as possible, while the intrinsic task correlation is often neglected. It is important to learn sharing strategies among multiple tasks rather than sharing everything. In this paper, we propose AdapterShare, an adapter differentiation method to explicitly model task correlation among multiple tasks. AdapterShare is automatically learned based on the gradients on tiny held-out validation data. Compared to single-task learning and fully shared MTL methods, our proposed method obtains obvious performance improvements. Compared to the existing MTL method AdapterFusion, AdapterShare achieves an absolute average improvement of 1.90 points on five dialogue understanding tasks and 2.33 points on NLU tasks. Our implementation is available at https://***/microsoft/ContextualSP. © 2022 Association for Computational Linguistics.

Keywords: Learning systems

On the emergence of cross-task linearity in pretraining-finetuning paradigm

Proceedings of the 41st International Conference on Machine Learning

Authors: Zhanpeng Zhou, Zijun Chen, Yilan Chen, Bo Zhang, Junchi Yan. Affiliations: School of Artificial Intelligence & Department of Computer Science and Engineering & MoE Lab of AI, Shanghai Jiao Tong University, Shanghai, China; School of Artificial Intelligence & Department of Computer Science and Engineering & MoE Lab of AI, Shanghai Jiao Tong University, Shanghai, China and Shanghai Artificial Intelligence Laboratory; Computer Science and Engineering, University of California San Diego; Shanghai Artificial Intelligence Laboratory.

The pretraining-finetuning paradigm has become the prevailing trend in modern deep learning. In this work, we discover an intriguing linear phenomenon in models that are initialized from a common pretrained checkpoint and finetuned on different tasks, termed as Cross-Task Linearity (CTL). Specifically, we show that if we linearly interpolate the weights of two finetuned models, the features in the weight-interpolated model are often approximately equal to the linear interpolation of features in two finetuned models at each layer. We provide comprehensive empirical evidence supporting that CTL consistently occurs for finetuned models that start from the same pretrained checkpoint. We conjecture that in the pretraining-finetuning paradigm, neural networks approximately function as linear maps, mapping from the parameter space to the feature space. Based on this viewpoint, our study unveils novel insights into explaining model merging/editing, particularly by translating operations from the parameter space to the feature space. Furthermore, we delve deeper into the root cause for the emergence of CTL, highlighting the role of pretraining. We released our source code at https://***/zzp1012/Cross-Task-Linearity.
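The linearity claim in this abstract can be read as: the features of a weight-interpolated model are close to the same interpolation of the two models' features. The sketch below is a toy check of that reading, not the authors' code. The two-layer tanh network, the function names, and the use of small random perturbations as a stand-in for finetuning are all assumptions made for illustration.

```python
import numpy as np

def features(weights, x):
    """Features of a toy two-layer tanh network (hypothetical stand-in for a real model)."""
    w1, w2 = weights
    return np.tanh(np.tanh(x @ w1) @ w2)

def interpolate(wa, wb, alpha):
    """Linear interpolation of two weight sets, layer by layer."""
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(wa, wb)]

def ctl_similarity(wa, wb, x, alpha=0.5):
    """Mean cosine similarity between the features of the weight-interpolated
    model and the interpolation of the two models' features."""
    f_merged = features(interpolate(wa, wb, alpha), x)
    f_interp = alpha * features(wa, x) + (1.0 - alpha) * features(wb, x)
    num = np.sum(f_merged * f_interp, axis=1)
    den = np.linalg.norm(f_merged, axis=1) * np.linalg.norm(f_interp, axis=1)
    return float(np.mean(num / den))

rng = np.random.default_rng(0)
# Shared "pretrained" weights (random here, purely for illustration).
pretrained = [rng.normal(size=(8, 16)) / np.sqrt(8),
              rng.normal(size=(16, 4)) / np.sqrt(16)]
# Assumption: "finetuning" is simulated as a small perturbation of the shared weights.
model_a = [w + 0.05 * rng.normal(size=w.shape) for w in pretrained]
model_b = [w + 0.05 * rng.normal(size=w.shape) for w in pretrained]
x = rng.normal(size=(32, 8))

similarity = ctl_similarity(model_a, model_b, x)
print(f"cosine similarity at alpha=0.5: {similarity:.4f}")
```

A similarity close to 1 indicates that interpolating in weight space and interpolating in feature space give nearly the same result for this toy setup. It says nothing about real finetuned networks, which the paper studies empirically.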

Knowledge Proxy Intervention for Deconfounded Video Question Answering

International Conference on Computer Vision (ICCV)

Authors: Jiangtong Li, Li Niu, Liqing Zhang. Affiliation: Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University.

Recently, Video Question-Answering (VideoQA) has drawn increasing attention from both industry and the research community. Despite the success of recent works, dataset bias often misleads current methods into focusing on spurious correlations in the training data. To analyze the effects of dataset bias, we frame the VideoQA pipeline as a causal graph, which shows the causal relations among the video, the question, the aligned feature between video and question, the answer, and the underlying confounder. Through the causal graph, we prove that the confounder and the backdoor path lead to spurious causality. To tackle the challenge that the confounder in VideoQA is generally unobserved and non-enumerable, we propose a model-agnostic framework called Knowledge Proxy Intervention (KPI), which introduces an extra knowledge proxy variable into the causal graph to cut the backdoor path and remove the effect of the confounder. Our KPI framework exploits the front-door adjustment, which requires no prior knowledge about the confounder. The effectiveness of our KPI framework is corroborated with three baseline methods on five benchmark datasets: MSVD-QA, MSRVTT-QA, TGIF-QA, NExT-QA, and Causal-VidQA.
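For reference, the standard front-door adjustment that this abstract mentions can be written as below. The symbols are assumptions chosen for illustration and are not taken from the paper: X stands for the input (video and question), K for the knowledge proxy, read here as the mediating variable, and Y for the answer.

```latex
P\bigl(Y \mid \mathrm{do}(X = x)\bigr)
  = \sum_{k} P(K = k \mid X = x)
    \sum_{x'} P(Y \mid X = x', K = k)\, P(X = x')
```

The formula contains no term for the confounder itself, which matches the abstract's statement that no prior knowledge about the confounder is required.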

Deep Image Harmonization with Learnable Augmentation

International Conference on Computer Vision (ICCV)

Authors: Li Niu, Junyan Cao, Wenyan Cong, Liqing Zhang. Affiliation: Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University.

The goal of image harmonization is to adjust the foreground appearance in a composite image so that the whole image looks harmonious. To construct paired training images, existing datasets adopt different ways of adjusting the illumination statistics of foregrounds in real images to produce synthetic composite images. However, different datasets have considerable domain gaps, and performance on small-scale datasets is limited by insufficient training data. In this work, we explore learnable augmentation to enrich the illumination diversity of small-scale datasets for better harmonization performance. In particular, our designed SYnthetic COmposite Network (SycoNet) takes in a real image with a foreground mask and a random vector to learn a suitable color transformation, which is applied to the foreground of this real image to produce a synthetic composite image. Comprehensive experiments demonstrate the effectiveness of our proposed learnable augmentation for image harmonization. The code of SycoNet is released at https://***/bcmi/SycoNet-Adaptive-Image-Harmonization.
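The compositing step described in this abstract, a color transformation applied only to the masked foreground of a real image, might look roughly like the sketch below. This is not SycoNet. A per-channel affine transform derived from a random vector stands in for the learned transformation, and the function and parameter names are hypothetical.

```python
import numpy as np

def synthesize_composite(image, mask, gain, bias):
    """Apply a per-channel affine color transform to the masked foreground only.

    image: (H, W, 3) float array in [0, 1]
    mask:  (H, W) array, 1 for foreground and 0 for background
    gain, bias: length-3 arrays (assumed stand-in for a learned transformation)
    """
    transformed = np.clip(image * gain + bias, 0.0, 1.0)
    m = mask[..., None].astype(image.dtype)
    # Foreground takes the transformed colors; background stays untouched.
    return m * transformed + (1.0 - m) * image

rng = np.random.default_rng(0)
real = rng.uniform(size=(4, 4, 3))
mask = np.zeros((4, 4))
mask[1:3, 1:3] = 1

# Assumption: the random vector is mapped to gain/bias by hand here,
# whereas a trained network would predict the transformation.
z = rng.normal(size=6)
gain, bias = 1.0 + 0.2 * z[:3], 0.05 * z[3:]
composite = synthesize_composite(real, mask, gain, bias)
```

Sampling a different random vector produces a different synthetic composite from the same real image, which is the sense in which such augmentation could enrich illumination diversity.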

Fine-grained Visible Watermark Removal

International Conference on Computer Vision (ICCV)

Authors: Li Niu, Xing Zhao, Bo Zhang, Liqing Zhang. Affiliation: Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, Shanghai Jiao Tong University.

Visible watermark removal aims to erase the watermark from a watermarked image and recover the background image, which is a challenging task due to the diversity of watermarks. Previous works have designed dynamic networks to handle various types of watermarks adaptively, but they ignore that even the watermarked region in a single image can be divided into multiple local parts with distinct visual appearances. In this work, we advance from an image-specific dynamic network to a part-specific dynamic network, which discovers multiple local parts within the watermarked region and handles them adaptively. Specifically, we propose a query-based multi-task framework, in which part query embeddings are jointly used in two branches to predict part masks and restore watermarked parts. Extensive experiments demonstrate the effectiveness of our fine-grained watermark removal network.

BiBL: AMR Parsing and Generation with Bidirectional Bayesian Learning

29th International Conference on Computational Linguistics, COLING 2022

Authors: Cheng, Ziming; Li, Zuchao; Zhao, Hai. Affiliations: Department of Computer Science and Engineering, Shanghai Jiao Tong University, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, China.

Abstract Meaning Representation (AMR) offers a unified semantic representation for natural language sentences. Thus, transformation between AMR and text yields two transition tasks in opposite directions, i.e., Text-to-AMR parsing and AMR-to-Text generation. Existing AMR studies focus only on one-sided improvements despite the duality of the two tasks, and their improvements are largely attributed to the inclusion of large amounts of extra training data or complex structure modifications that harm inference speed. Instead, we propose data-efficient Bidirectional Bayesian learning (BiBL) to facilitate bidirectional information transition by adopting a single-stage multitasking strategy, so that the resulting model also enjoys much lighter training. Evaluation on benchmark datasets shows that our proposed BiBL outperforms strong previous seq2seq refinements without the help of extra data, which is indispensable in existing counterpart models. We release the code of BiBL at: https://***/KHAKhazeus/BiBL. © 2022 Proceedings - International Conference on Computational Linguistics, COLING. All rights reserved.

Keywords: Semantics

Aspect-based Sentiment Analysis as Machine Reading Comprehension

29th International Conference on Computational Linguistics, COLING 2022

Authors: Yang, Yifei; Zhao, Hai. Affiliations: Department of Computer Science and Engineering, Shanghai Jiao Tong University, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, China.

Existing studies typically handle aspect-based sentiment analysis by stacking multiple neural modules, which inevitably results in severe error propagation. Instead, we propose a novel end-to-end framework, MRCOOL: MRC-PrOmpt mOdeL framework, where numerous sentiment aspects are elicited by a machine reading comprehension (MRC) model and their corresponding sentiment polarities are classified in a prompt learning way. Experiments show that our end-to-end framework consistently yields promising results on widely used benchmark datasets, significantly outperforming existing state-of-the-art models or achieving comparable performance. © 2022 Proceedings - International Conference on Computational Linguistics, COLING. All rights reserved.

Keywords: Sentiment analysis

Nested Named Entity Recognition as Corpus Aware Holistic Structure Parsing

29th International Conference on Computational Linguistics, COLING 2022

Authors: Yang, Yifei; Li, Zuchao; Zhao, Hai. Affiliations: Department of Computer Science and Engineering, Shanghai Jiao Tong University, China; MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, China.

As a fundamental natural language processing task and one of the core knowledge extraction techniques, named entity recognition (NER) is widely used to extract information from texts for downstream tasks. Nested NER is a branch of NER in which the named entities (NEs) are nested within each other. However, most previous studies on nested NER apply a linear structure to model the nested NEs, which are actually accommodated in a hierarchical structure. To address this mismatch, this work models all nested NEs in a sentence as a holistic structure, and we propose a holistic structure parsing algorithm to disclose all the NEs at once. Besides, there is currently no research on applying corpus-level information to NER. To make up for the loss of this information, we introduce Point-wise Mutual Information (PMI) and other frequency features from corpus-aware statistics for even better performance through holistic modeling from the sentence level to the corpus level. Experiments show that our model yields promising results on widely used benchmarks, approaching or even achieving state-of-the-art performance. Further empirical studies show that our proposed corpus-aware features can substantially improve NER domain adaptation, which demonstrates the surprising advantage of our proposed corpus-level holistic structure modeling. © 2022 Proceedings - International Conference on Computational Linguistics, COLING. All rights reserved.

Keywords: Syntactics
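As a small illustration of the Point-wise Mutual Information (PMI) statistic named in this abstract, the sketch below estimates PMI(a, b) = log(p(a, b) / (p(a) p(b))) for adjacent token pairs from raw corpus counts. How the paper turns such statistics into NER features is not stated in the abstract, so the pairing scheme, the toy corpus, and the function name are assumptions.

```python
import math
from collections import Counter

def pmi_table(sentences):
    """PMI for adjacent token pairs, estimated from raw unigram and bigram counts."""
    unigrams, bigrams = Counter(), Counter()
    for tokens in sentences:
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    table = {}
    for (a, b), count in bigrams.items():
        p_ab = count / n_bi
        p_a = unigrams[a] / n_uni
        p_b = unigrams[b] / n_uni
        table[(a, b)] = math.log(p_ab / (p_a * p_b))
    return table

# Toy corpus, for illustration only.
corpus = [
    ["bank", "of", "china", "opened", "in", "new", "york"],
    ["new", "york", "university", "is", "in", "new", "york"],
    ["the", "bank", "of", "china", "is", "in", "beijing"],
]
pmi = pmi_table(corpus)
print(pmi[("new", "york")] > pmi[("in", "new")])
```

In this toy corpus "new" is always followed by "york", so that pair scores higher than "in new", where "in" also precedes other words. A pair that tends to stay together across a corpus is a plausible hint that it belongs to one entity.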

Learning to Communicate Among Agents for Large-Scale Dynamic Path Planning With Genetic Programming Hyperheuristic

IEEE Transactions on Artificial Intelligence, 2025, vol. 6, no. 5, pp. 1269-1283

Authors: Liao, Xiao-Cheng; Hu, Xiao-Min; Chen, Xiang-Ling; Mei, Yi; Jia, Ya-Hui; Chen, Wei-Neng. Affiliations: Victoria University of Wellington, Centre for Data Science and Artificial Intelligence, School of Engineering and Computer Science, Wellington 6140, New Zealand; Guangdong University of Technology, School of Computer Science and Technology, Guangzhou 510006, China; Hanyang University, Department of Electrical and Electronic Engineering, Ansan 15588, Republic of Korea; South China University of Technology, School of Future Technology, Guangzhou 510006, China; Pazhou Lab, Guangzhou 510005, China; South China University of Technology, School of Computer Science and Engineering, State Key Laboratory of Subtropical Building and Urban Science, Guangzhou 510006, China.

Genetic programming hyperheuristic (GPHH) has recently become a promising methodology for large-scale dynamic path planning (LDPP) since it can produce reusable heuristics rather than disposable solutions. However, in this methodology, the extracted local, decentralized heuristics for agents that lack a global systemic view may sometimes be problematic. Therefore, a new challenge is to strike a balance between conciseness, which improves generalization ability, and the incorporation of more global information, which yields better performance. In this work, we target the LDPP problem and propose a communication learning mechanism (ComLGP) for GPHH to address the above difficulties. In ComLGP, a communication function is introduced to serve as a communication protocol, and it exists in the form of an extra terminal in GPHH. Unlike the classic terminals, which are fixed in genetic programming, this communication function undergoes optimization along with the evolutionary process of GPHH. In this way, the communication function can be learned, which enables agents to communicate without a predefined communication protocol. Then, a caching and lazy updating mechanism for ComLGP is presented to accelerate the calculation of communication content. Last, we verified our method on 22 scenarios including two real-world road networks. The experimental results demonstrate that the proposed ComLGP can successfully learn to communicate. Even without any manually designed communication features, ComLGP achieves performance competitive with the state-of-the-art method that employs a predefined communication protocol and outperforms the remaining compared methods in most scenarios. © 2020 IEEE.

Keywords: Genetic programming

Fast and High-Quality Auto-Regressive Speech Synthesis via Speculative Decoding

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Bohan Li, Hankun Wang, Situo Zhang, Yiwei Guo, Kai Yu. Affiliation: Department of Computer Science and Engineering, MoE Key Lab of Artificial Intelligence, AI Institute, X-LANCE Lab, Shanghai Jiao Tong University, Shanghai, China.

ISBN (digital): 9798350368741

ISBN (print): 9798350368758

The auto-regressive (AR) architecture, exemplified by models such as GPT, is extensively utilized in modern Text-to-Speech (TTS) systems. However, it often leads to considerable inference delays, primarily due to the challenges associated with next-token prediction in long speech sequences. In this work, we introduce VADUSA, one of the first approaches to accelerate AR-based TTS through speculative decoding. Our findings demonstrate that VADUSA not only delivers a significant reduction in inference time but also enhances TTS quality by employing draft heads to predict future speech tokens in an auto-regressive manner. Additionally, the incorporation of a tolerance mechanism during the sampling process further boosts performance, yielding approximately a 3× speedup in AR TTS. Moreover, our approach exhibits strong generalization across diverse datasets and various speech token types.

Keywords: Speech enhancement; Signal processing; Acoustics; Decoding; Text to speech; Delays
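The general idea of speculative decoding referred to in this abstract can be sketched as a draft-then-verify loop. The code below is a generic greedy variant over integer tokens and is not VADUSA: the draft heads and the tolerance mechanism described in the abstract are not modeled, and the toy "models" and function names are hypothetical.

```python
from typing import Callable, List

NextToken = Callable[[List[int]], int]

def greedy_decode(target: NextToken, prefix: List[int], n_new: int) -> List[int]:
    """Baseline: one sequential target-model step per generated token."""
    seq = list(prefix)
    for _ in range(n_new):
        seq.append(target(seq))
    return seq

def speculative_decode(target: NextToken, draft: NextToken,
                       prefix: List[int], n_new: int, k: int = 4) -> List[int]:
    """Greedy draft-then-verify: the draft proposes up to k tokens, the target keeps
    the agreeing prefix and substitutes its own token at the first mismatch."""
    seq = list(prefix)
    goal = len(prefix) + n_new
    while len(seq) < goal:
        # Draft proposes tokens auto-regressively with the cheap model.
        proposal: List[int] = []
        for _ in range(min(k, goal - len(seq))):
            proposal.append(draft(seq + proposal))
        # Verification. A real system would score all proposed positions in a
        # single batched target pass; here the checks run one by one for clarity.
        for tok in proposal:
            expected = target(seq)
            seq.append(expected)
            if expected != tok:
                break  # discard the rest of the proposal
    return seq

def toy_target(seq: List[int]) -> int:
    """Hypothetical 'large' model: a fixed rule on the last token."""
    return (3 * seq[-1] + 1) % 7

def toy_draft(seq: List[int]) -> int:
    """Hypothetical 'small' model: agrees with the target except at some positions."""
    tok = toy_target(seq)
    return (tok + 1) % 7 if len(seq) % 5 == 0 else tok

baseline = greedy_decode(toy_target, [2], 12)
fast = speculative_decode(toy_target, toy_draft, [2], 12)
print(fast == baseline)
```

Because every appended token is the target's own greedy choice, this variant reproduces the baseline output exactly. Any speedup in practice comes from verifying several drafted tokens in one target pass rather than one at a time.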
