Search Results - Inner Mongolia University Library

Efficient unified information extraction model based on large language models

Applied Soft Computing, 2025, Vol. 180

Authors: Zhang, Xieyun; Cai, Shimin; Shen, Xiaorong; Yang, Han; Hu, Wenhao; Zhang, Yanru. Affiliations: Big Data Research Center, University of Electronic Science and Technology of China, Chengdu 611731, China; i-Large Model Innovation Lab of Ideological and Political Science, University of Electronic Science and Technology of China, Chengdu 611731, China

Large Language Models (LLMs) have shown underperformance in information extraction (IE) tasks compared to smaller models. This underperformance is largely attributed to the mismatch between IE's structured output and LLMs’ natural language output and the absence of IE tasks in the pre-training corpus. Furthermore, prior research aimed to adapt LLMs to IE tasks through instructional design without updating model parameters, or by fine-tuning them with substantial computational resources, at the expense of their original performance in other tasks. Inspired by the Parameter Efficient Fine-Tuning (PEFT) technique, we designed an efficient unified information framework for LLMs (LLM-UIE), which performs domain adaptation fine-tuning with low resource requirements. Importantly, LLM-UIE introduced an additional answer selection task to improve LLMs’ ability to generate desired answers, efficiently addressing the inconsistency between LLMs’ fuzzy outputs and standard answers. Experiments on extensive information extraction datasets show that LLM-UIE not only matches but even surpasses the F1 scores of state-of-the-art models while demonstrating significant advantages in training efficiency, substantially reducing training time. Moreover, compared to previous LLM-based studies, LLM-UIE significantly lowers computational resource requirements. © 2025 Elsevier B.V.

Keywords: Online searching

Downstream-agnostic Adversarial Examples

International Conference on Computer Vision (ICCV)

Authors: Ziqi Zhou; Shengshan Hu; Ruizhi Zhao; Qian Wang; Leo Yu Zhang; Junhui Hou; Hai Jin. Affiliations: School of Cyber Science and Engineering, Huazhong University of Science and Technology; National Engineering Research Center for Big Data Technology and System; Services Computing Technology and System Lab; Hubei Key Laboratory of Distributed System Security; Hubei Engineering Research Center on Big Data Security; School of Cyber Science and Engineering, Wuhan University; School of Information and Communication Technology, Griffith University; Department of Computer Science, City University of Hong Kong; School of Computer Science and Technology, Huazhong University of Science and Technology; Cluster and Grid Computing Lab

Self-supervised learning usually uses a large amount of unlabeled data to pre-train an encoder which can be used as a general-purpose feature extractor, such that downstream users only need to perform fine-tuning operations to enjoy the benefit of "large model". Despite this promising prospect, the security of pre-trained encoder has not been thoroughly investigated yet, especially when the pre-trained encoder is publicly available for commercial *** this paper, we propose AdvEncoder, the first framework for generating downstream-agnostic universal adversarial examples based on the pre-trained encoder. AdvEncoder aims to construct a universal adversarial perturbation or patch for a set of natural images that can fool all the downstream tasks inheriting the victim pre-trained encoder. Unlike traditional adversarial example works, the pre-trained encoder only outputs feature vectors rather than classification labels. Therefore, we first exploit the high frequency component information of the image to guide the generation of adversarial examples. Then we design a generative attack framework to construct adversarial perturbations/patches by learning the distribution of the attack surrogate dataset to improve their attack success rates and transferability. Our results show that an attacker can successfully attack downstream tasks without knowing either the pre-training dataset or the downstream dataset. We also tailor four defenses for pre-trained encoders, the results of which further prove the attack ability of AdvEncoder. Our codes are available at: https://***/CGCL-codes/AdvEncoder.

Hyperbolic Geometric Latent Diffusion Model for Graph Generation

41st International Conference on Machine Learning, ICML 2024

Authors: Fu, Xingcheng; Gao, Yisen; Wei, Yuecen; Sun, Qingyun; Peng, Hao; Li, Jianxin; Li, Xianxian. Affiliations: Key Lab of Education Blockchain and Intelligent Technology, Ministry of Education, Guangxi Normal University, Guilin, China; Institute of Artificial Intelligence, Beihang University, Beijing, China; School of Software, Beihang University, Beijing, China; Beijing Advanced Innovation Center for Big Data and Brain Computing, School of Computer Science and Engineering, Beihang University, Beijing, China

Diffusion models have made significant contributions to computer vision, sparking growing interest in the community in applying them to graph generation. Existing discrete graph diffusion models exhibit heightened computational complexity and diminished training efficiency. A preferable and natural way is to directly diffuse the graph within the latent space. However, because the non-Euclidean structure of graphs is not isotropic in the latent space, existing latent diffusion models have difficulty capturing and preserving the topological information of graphs. To address the above challenges, we propose a novel geometrically latent diffusion framework HypDiff. Specifically, we first establish a geometrically latent space with interpretability measures based on hyperbolic geometry, to define anisotropic latent diffusion processes for graphs. Then, we propose a geometrically latent diffusion process that is constrained by both radial and angular geometric properties, thereby ensuring the preservation of the original topological properties in the generated graphs. Extensive experimental results demonstrate the superior effectiveness of HypDiff for graph generation with various topologies. Copyright 2024 by the author(s)

Keywords: Graph theory

Hierarchical Classification Auxiliary Network for Time Series Forecasting

arXiv, 2024

Authors: Sun, Yanru; Xie, Zongxia; Chen, Dongyue; Eldele, Emadeldeen; Hu, Qinghua. Affiliations: Tianjin Key Lab of Machine Learning, College of Intelligence and Computing, Tianjin University, China; Centre for Frontier AI Research, Agency for Science, Technology and Research, Singapore; Institute for InfoComm Research, Agency for Science, Technology and Research, Singapore

Deep learning has significantly advanced time series forecasting through its powerful capacity to capture sequence relationships. However, training these models with the Mean Square Error (MSE) loss often results in over-smooth predictions, making it challenging to handle the complexity and learn high-entropy features from time series data with high variability and unpredictability. In this work, we introduce a novel approach by tokenizing time series values to train forecasting models via cross-entropy loss, while considering the continuous nature of time series data. Specifically, we propose a Hierarchical Classification Auxiliary Network, HCAN, a general model-agnostic component that can be integrated with any forecasting model. HCAN is based on a Hierarchy-Aware Attention module that integrates multi-granularity high-entropy features at different hierarchy levels. At each level, we assign a class label for timesteps to train an Uncertainty-Aware Classifier. This classifier mitigates the over-confidence in softmax loss via evidence theory. We also implement a Hierarchical Consistency Loss to maintain prediction consistency across hierarchy levels. Extensive experiments integrating HCAN with state-of-the-art forecasting models demonstrate substantial improvements over baselines on several real-world datasets. Copyright © 2024, The Authors. All rights reserved.
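
The idea of tokenizing continuous values into class labels at more than one granularity can be illustrated with a minimal sketch. It assumes equal-width bins and a fixed merge factor between levels. The function names and the binning rule are hypothetical and are not taken from HCAN, whose labeling scheme the abstract does not specify.

```python
from bisect import bisect_right


def equal_width_edges(lo, hi, num_bins):
    """Return the num_bins - 1 interior boundaries that split [lo, hi] evenly."""
    step = (hi - lo) / num_bins
    return [lo + step * i for i in range(1, num_bins)]


def tokenize(values, edges):
    """Map each real value to the index of the bin it falls into."""
    return [bisect_right(edges, v) for v in values]


def coarsen(labels, factor):
    """Merge every `factor` adjacent fine bins into one coarse class."""
    return [label // factor for label in labels]


# Toy series scaled to [0, 1] (assumed preprocessing, not from the abstract).
series = [0.1, 0.4, 0.9, 0.55, 0.2]
fine_edges = equal_width_edges(0.0, 1.0, 4)  # boundaries at 0.25, 0.5, 0.75
fine_labels = tokenize(series, fine_edges)   # fine-level class per timestep
coarse_labels = coarsen(fine_labels, 2)      # coarse-level class per timestep
print(fine_labels, coarse_labels)
```

Each timestep then carries one integer target per hierarchy level, which is the form a cross-entropy loss expects. The regression target is left unchanged.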

Keywords: Entropy

Rejection Sampling IMLE: Designing Priors for Better Few-Shot Image Synthesis

arXiv, 2024

Authors: Vashist, Chirag; Peng, Shichong; Li, Ke. Affiliation: APEX Lab, School of Computing Science, Simon Fraser University, Canada

An emerging area of research aims to learn deep generative models with limited training data. Prior generative models like GANs and diffusion models require a lot of data to perform well, and their performance degrades when they are trained on only a small amount of data. A recent technique called Implicit Maximum Likelihood Estimation (IMLE) has been adapted to the few-shot setting, achieving state-of-the-art performance. However, current IMLE-based approaches encounter challenges due to inadequate correspondence between the latent codes selected for training and those drawn during inference. This results in suboptimal test-time performance. We theoretically show a way to address this issue and propose RS-IMLE, a novel approach that changes the prior distribution used for training. This leads to substantially higher quality image generation compared to existing GAN and IMLE-based methods, as validated by comprehensive experiments conducted on nine few-shot image datasets. © 2024, CC BY-NC-ND.

A nearly optimal distributed algorithm for computing the weighted girth

Science China (Information Sciences), 2021, Vol. 64, No. 11, pp. 80-94

Authors: Qiang-Sheng HUA; Lixiang QIAN; Dongxiao YU; Xuanhua SHI; Hai JIN. Affiliations: National Engineering Research Center for Big Data Technology and System/Services Computing Technology and System Lab/Cluster and Grid Computing Lab, School of Computer Science and Technology, Huazhong University of Science and Technology; School of Computer Science and Technology, Shandong University

Computing the weighted girth, which is the sum of the edge weights in the minimum-weight cycle, is an important problem in network analysis. The problem of distributively computing the girth in unweighted graphs has garnered a lot of attention, but there are few studies on weighted graphs. In this paper, we propose a distributed randomized algorithm for computing the weighted girth in weighted graphs with integral edge weights in the range \([1, n^{c}]\), where \(n\) is the number of vertices and \(c\) is a constant. The algorithm is devised under the standard synchronous \(\mathcal{CONGEST}\) model, in which each vertex can transfer only \(O(\log n)\) bits of information along each incident edge in a round. The upper bound of the algorithm is \(O(n \log^{2} n)\) rounds. We also prove that the lower bound for computing the weighted girth is \(\Omega(D + n/\log n)\), where \(D\) is the hop diameter of the weighted graph. This means our distributed algorithm is optimal within a factor of \(O(\log^{3} n)\).
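
To make the quantity being computed concrete, here is a minimal centralized sketch of the weighted girth. It is a sequential baseline and not the paper's distributed algorithm. It assumes positive edge weights and uses a standard technique: for each edge, find the shortest path between its endpoints that avoids that edge, then close the cycle with the edge. The function name and input format are hypothetical.

```python
import heapq
import math


def weighted_girth(n, edges):
    """Weight of the minimum-weight cycle in an undirected graph, or math.inf if acyclic.

    edges: list of (u, v, w) with vertices in range(n) and positive weights w.
    """
    adj = [[] for _ in range(n)]
    for idx, (u, v, w) in enumerate(edges):
        adj[u].append((v, w, idx))
        adj[v].append((u, w, idx))

    def shortest_path_without(src, dst, banned):
        # Dijkstra from src to dst that never traverses the edge with index `banned`.
        dist = [math.inf] * n
        dist[src] = 0
        heap = [(0, src)]
        while heap:
            d, x = heapq.heappop(heap)
            if d > dist[x]:
                continue  # stale heap entry
            if x == dst:
                return d
            for y, w, idx in adj[x]:
                if idx != banned and d + w < dist[y]:
                    dist[y] = d + w
                    heapq.heappush(heap, (d + w, y))
        return math.inf

    best = math.inf
    for idx, (u, v, w) in enumerate(edges):
        # The cheapest cycle through this edge is the edge plus the best detour around it.
        best = min(best, w + shortest_path_without(u, v, idx))
    return best


# Small example: two triangles sharing the edge 0-2.
example_edges = [(0, 1, 1), (1, 2, 2), (2, 0, 3), (2, 3, 1), (3, 0, 1)]
print(weighted_girth(4, example_edges))
```

This runs one Dijkstra search per edge, so it is only meant for checking small instances by hand.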

Keywords: distributed algorithms; weighted girth; \(\mathcal{CONGEST}\) model; communication complexity; round complexity

Enhancing SNN-based Spatio-Temporal Learning: A Benchmark Dataset and Cross-Modality Attention Model

arXiv, 2024

Authors: Zhou, Shibo; Yang, Bo; Yuan, Mengwen; Jiang, Runhao; Yan, Rui; Pan, Gang; Tang, Huajin. Affiliations: Research Center for Data Hub and Security, Zhejiang Lab, Hangzhou, China; College of Computer Science and Technology, Zhejiang University, Hangzhou, China; Research Center for High Efficiency Computing System, Zhejiang Lab, Hangzhou, China; College of Computer Science and Technology, Zhejiang University of Technology, Hangzhou, China; The State Key Lab of Brain-Machine Intelligence, Zhejiang University, Hangzhou, China

Spiking Neural Networks (SNNs), renowned for their low power consumption, brain-inspired architecture, and spatio-temporal representation capabilities, have garnered considerable attention in recent years. Similar to Artificial Neural Networks (ANNs), high-quality benchmark datasets are of great importance to the advances of SNNs. However, our analysis indicates that many prevalent neuromorphic datasets lack strong temporal correlation, preventing SNNs from fully exploiting their spatio-temporal representation capabilities. Meanwhile, the integration of event and frame modalities offers more comprehensive visual spatio-temporal information. Yet, the SNN-based cross-modality fusion remains underexplored. In this work, we present a neuromorphic dataset called DVS-SLR that can better exploit the inherent spatio-temporal properties of SNNs. Compared to existing datasets, it offers advantages in terms of higher temporal correlation, larger scale, and more varied scenarios. In addition, our neuromorphic dataset contains corresponding frame data, which can be used for developing SNN-based fusion methods. By virtue of the dual-modal feature of the dataset, we propose a Cross-Modality Attention (CMA) based fusion method. The CMA model efficiently utilizes the unique advantages of each modality, allowing for SNNs to learn both temporal and spatial attention scores from the spatio-temporal features of event and frame modalities, subsequently allocating these scores across modalities to enhance their synergy. Experimental results demonstrate that our method not only improves recognition accuracy but also ensures robustness across diverse scenarios. © 2024, CC0.

Keywords: Spatio-temporal data

Context-Driven Index Trimming: A Data Quality Perspective to Enhancing Precision of RALMs

2024 Findings of the Association for Computational Linguistics, EMNLP 2024

Authors: Ma, Kexin; Jin, Ruochun; Wang, Haotian; Wang, Xi; Chen, Huan; Tang, Yuhua; Wang, Qian. Affiliations: Institute for Quantum Information, State Key Laboratory of High Performance Computing, China; College of Computer Science and Technology, National University of Defense Technology, Changsha, China; Intelligent Game and Decision Lab, Academy of Military Science, Beijing, China

ISBN: (print) 9798891761681

Retrieval-Augmented Large Language Models (RALMs) have made significant strides in enhancing the accuracy of generated responses. However, existing research often overlooks the data quality issues within retrieval results, often caused by inaccurate existing vector-distance-based retrieval methods. We propose to boost the precision of RALMs' answers from a data quality perspective through the Context-Driven Index Trimming (CDIT) framework, where Context Matching Dependencies (CMDs) are employed as logical data quality rules to capture and regulate the consistency between retrieved contexts. Based on the semantic comprehension capabilities of Large Language Models (LLMs), CDIT can effectively identify and discard retrieval results that are inconsistent with the query context and further modify indexes in the database, thereby improving answer quality. Experiments demonstrate average improvement of 3.75% in accuracy on challenging question-answering tasks. Also, the flexibility of CDIT is verified through its compatibility with various language models and indexing methods, which offers a promising approach to bolster RALMs' data quality and retrieval precision jointly. © 2024 Association for Computational Linguistics.

Keywords: Data accuracy

Feature Fusion Network for Personalized Online Advertising Systems

IEEE International Conference on Big Data

Authors: Weijie Zhao; Peng Yang; Dong Li; Xing Shen; Lin Liu; Ping Li. Affiliations: Cognitive Computing Lab, Baidu Research; Baidu Search Ads (Phoenix Nest), Baidu Inc, Beijing, China

ISBN: (print) 9781665480468

Sponsored online advertising delivers many billions in revenue for online ads publishers. The ads systems take user-input query keywords and display ads that are relevant to the query. The task of click-through rate (CTR) prediction aims to estimate the likelihood of a user clicking on the ads, which has become one of the core goals in the ads system. In order to further improve the CTR, user portraits are also considered as an input to make personalized ads display and recommendations in the current deep learning CTR training platform. The naive combination of user space (\(\sim 10^{9}\)) and feature space (\(\sim 10^{12}\)), however, would yield a \(10^{21}\)-dimensional space. It is not only infeasible to feed the \(10^{21}\) parameters into the embedding layer with any off-the-shelf storage, but also impractical to train the network in such a massive-scale dimensional space. In this paper, we design a novel CTR prediction framework for ads systems to tackle the massive-scale user-feature combination challenge. Specifically, we introduce a feature fusion network to explicitly learn user-feature cross embedding in an end-to-end manner. To improve the efficiency, we prune the feature fusion networks to a practical number through a network importance ranking scheme. Extensive empirical experiments on Baidu’s ads data validate the effectiveness of the proposed feature fusion networks.
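
The scale argument can be checked with a few lines of arithmetic. This is a rough sketch: the user and feature counts are the orders of magnitude quoted in the abstract, while the four bytes per parameter (one float32 per combined dimension) is an assumption made here only for illustration.

```python
USERS = 10**9         # approximate size of the user space, from the abstract
FEATURES = 10**12     # approximate size of the feature space, from the abstract
BYTES_PER_PARAM = 4   # assumption: a single float32 per combined dimension

# Naively crossing every user with every feature multiplies the two spaces.
combined = USERS * FEATURES

# 1 zettabyte = 10**21 bytes.
storage_zettabytes = combined * BYTES_PER_PARAM / 10**21

print(f"combined dimensions: {combined:.0e}")
print(f"storage at {BYTES_PER_PARAM} bytes each: {storage_zettabytes} ZB")
```

Even with one scalar per crossed dimension, the table would need zettabytes of storage, which is why the naive cross is ruled out.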

Keywords: Training; Deep learning; Neural networks; Predictive models; Big data; Prediction algorithms; History

Detecting adversarial examples using image reconstruction differences

Soft Computing, 2023, Vol. 27, No. 12, pp. 7863-7877

Authors: Sun, Jiaze; Yi, Meng. Affiliations: Xian Univ Posts & Telecommun, Sch Comp Sci & Technol, Xian 710121, Peoples R China; Shaanxi Key Lab Network Data Anal & Intelligent Pr, Xian 710121, Peoples R China; Xian Key Lab Big Data & Intelligent Comp, Xian 710127, Peoples R China

Adversarial examples (AEs) cause misjudgments and damage the robustness of DNN systems. Previous studies have defended against AEs through detection, but it is challenging to ensure stable, high detection performance while keeping the false detection rate low. To this end, an AE detection method named image reconstruction differences (IRD) is proposed to enhance the robustness of DNNs. Firstly, we use an end-to-end Com-Rec network to reconstruct examples with feature compression to expand the distinguishing features. Secondly, we propose an image reconstruction difference measure composed of information-theoretic VIF, structural information UQI and spectral information RASE to discriminate AEs. Moreover, we introduce the idea of ensemble learning to form a strong random forest binary classifier to enhance the performance of detecting AEs. We further validate it through extensive experiments on the MNIST and CIFAR-10 datasets. These experiments demonstrate that IRD effectively detects AEs and achieves a high average accuracy of 98.33%. Specifically, it also performs favorably against methods based on Feature Squeezing, Local Intrinsic Dimensionality, Kernel Density and Network Invariance Checking, with an average detection rate of 99.54% and a 1.44% average false positive rate.
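
One of the three difference measures named in the abstract, UQI, has a compact closed form, so a small sketch can show what kind of feature the classifier would receive. This is the global, single-window form of the Universal Quality Index applied to flattened pixel lists. The windowing used in the paper, the VIF and RASE features, the Com-Rec network, and the random forest are not reproduced, and the function name is hypothetical.

```python
from statistics import fmean


def uqi(x, y):
    """Global Universal Quality Index between two equal-length pixel sequences.

    Q = 4 * cov(x, y) * mean(x) * mean(y) / ((var(x) + var(y)) * (mean(x)^2 + mean(y)^2))
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("inputs must have the same length, at least 2")
    n = len(x)
    mx, my = fmean(x), fmean(y)
    var_x = sum((a - mx) ** 2 for a in x) / (n - 1)
    var_y = sum((b - my) ** 2 for b in y) / (n - 1)
    cov_xy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    denom = (var_x + var_y) * (mx ** 2 + my ** 2)
    if denom == 0:
        raise ValueError("UQI is undefined when both inputs are constant or zero-mean")
    return 4 * cov_xy * mx * my / denom


# Toy data: an "input" patch and a slightly different "reconstruction" (made-up values).
original = [10, 20, 30, 40]
reconstructed = [12, 18, 33, 37]
print(uqi(original, original), uqi(original, reconstructed))
```

An image compared with itself scores 1, and the score drops as the reconstruction drifts, so the gap between an input and its reconstruction can serve as one numeric feature.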

Keywords: Deep neural networks; Adversarial examples; Detection; Compress and reconstruct; Image reconstruction differences; Random forest
