Search Results - Inner Mongolia University Library

arXiv, 2024

Authors: Wang, Jiaxin; Zhang, Lingling; Liu, Jun; Guo, Tianlin; Wu, Wenjun. Affiliations: The School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, China; The Shaanxi Provincial Key Laboratory of Big Data Knowledge Engineering, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, China; National Engineering Lab for Big Data Analytics, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, China

We introduce a novel task, called Generalized Relation Discovery (GRD), for open-world relation extraction. GRD aims to identify unlabeled instances in existing pre-defined relations or discover novel relations by assigning instances to clusters as well as providing specific meanings for these clusters. The key challenges of GRD are how to mitigate the serious model biases caused by labeled pre-defined relations to learn effective relational representations and how to determine the specific semantics of novel relations when classifying or clustering unlabeled instances. We then propose a novel framework, SFGRD, for this task to solve the above issues by learning from semi-factuals in two stages. The first stage is semi-factual generation implemented by a tri-view debiased relation representation module, in which we take each original sentence as the main view and design two debiased views to generate semi-factual examples for this sentence. The second stage is semi-factual thinking executed by a dual-space tri-view collaborative relation learning module, where we design a cluster-semantic space and a class-index space to learn relational semantics and relation label indices, respectively. In addition, we devise alignment and selection strategies to integrate the two spaces and establish a self-supervised learning loop for unlabeled data by doing semi-factual thinking across three views. Extensive experimental results show that SFGRD surpasses state-of-the-art models in terms of accuracy by 2.36% to 5.78% and cosine similarity by 32.19% to 84.45% for relation label index and relation semantic quality, respectively. To the best of our knowledge, we are the first to exploit the efficacy of semi-factuals in relation extraction. Copyright © 2024, The Authors. All rights reserved.

Keywords: Extraction

EFFICIENTLY PARAMETERIZED NEURAL METRIPLECTIC SYSTEMS

arXiv, 2024

Authors: Gruber, Anthony; Lee, Kookjin; Lim, Haksoo; Park, Noseong; Trask, Nathaniel. Affiliations: Center for Computing Research, Sandia National Laboratories, Albuquerque, NM, United States; School of Computing and Augmented Intelligence, Arizona State University, Tempe, AZ, United States; Big Data Analytics Laboratory, Yonsei University, Seoul, Republic of Korea; Big Data Analytics Laboratory, Korea Advanced Institute of Science and Technology, Daejeon, Republic of Korea; School of Engineering and Applied Science, University of Pennsylvania, Philadelphia, PA, United States

Metriplectic systems are learned from data in a way that scales quadratically in both the size of the state and the rank of the metriplectic data. Besides being provably energy conserving and entropy stable, the proposed approach comes with approximation results demonstrating its ability to accurately learn metriplectic dynamics from data as well as an error estimate indicating its potential for generalization to unseen timescales when approximation error is low. Examples are provided which illustrate performance in the presence of both full state information as well as when entropic variables are unknown, confirming that the proposed approach exhibits superior accuracy and scalability without compromising on model expressivity. Copyright © 2024, The Authors. All rights reserved.

Are Large Language Models Really Good Logical Reasoners? A Comprehensive Evaluation and Beyond

arXiv, 2023

Authors: Xu, Fangzhi; Lin, Qika; Han, Jiawei; Zhao, Tianzhe; Liu, Jun; Cambria, Erik. Affiliations: The School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, Shaanxi 710049, China; The Shaanxi Provincial Key Laboratory of Big Data Knowledge Engineering, National Engineering Lab for Big Data Analytics, Xi’an, Shaanxi 710049, China; The School of Computer Science and Engineering, Nanyang Technological University, Singapore

Logical reasoning consistently plays a fundamental and significant role in the domains of knowledge engineering and artificial intelligence. Recently, Large Language Models (LLMs) have emerged as a noteworthy innovation in natural language processing (NLP). However, the question of whether LLMs can effectively address the task of logical reasoning, which requires gradual cognitive inference similar to human intelligence, remains unanswered. To this end, we aim to bridge this gap and provide comprehensive evaluations in this paper. Firstly, to offer systematic evaluations, we select fifteen typical logical reasoning datasets and organize them into deductive, inductive, abductive and mixed-form reasoning settings. Considering the comprehensiveness of evaluations, we include 3 early-era representative LLMs and 4 trending LLMs. Secondly, different from previous evaluations relying only on simple metrics (e.g., accuracy), we propose fine-level evaluations in objective and subjective manners, covering both answers and explanations, including answer correctness, explain correctness, explain completeness and explain redundancy. Additionally, to uncover the logical flaws of LLMs, problematic cases will be attributed to five error types from two dimensions, i.e., evidence selection process and reasoning process. Thirdly, to avoid the influences of knowledge bias and concentrate purely on benchmarking the logical reasoning capability of LLMs, we propose a new dataset with neutral content. Based on the in-depth evaluations, this paper finally forms a general evaluation scheme of logical reasoning capability from six dimensions (i.e., Correct, Rigorous, Self-aware, Active, Oriented and No hallucination). It reflects the pros and cons of LLMs and gives guiding directions for future works. © 2023, CC BY.

Keywords: Natural language processing systems

Generative Model-based Feature Knowledge Distillation for Action Recognition

arXiv, 2023

Authors: Wang, Guiqin; Zhao, Peng; Shi, Yanjiang; Zhao, Cong; Yang, Shusen. Affiliations: School of Computer Science and Technology, Xi’an Jiaotong University, China; School of Mathematics and Statistics, Xi’an Jiaotong University, China; National Engineering Laboratory for Big Data Analytics, Xi’an Jiaotong University, China

Knowledge distillation (KD), a technique widely employed in computer vision, has emerged as a de facto standard for improving the performance of small neural networks. However, prevailing KD-based approaches in video tasks primarily focus on designing loss functions and fusing cross-modal information. This overlooks the spatial-temporal feature semantics, resulting in limited advancements in model compression. Addressing this gap, our paper introduces an innovative knowledge distillation framework that uses a generative model to train a lightweight student model. In particular, the framework is organized into two steps: the initial phase is Feature Representation, wherein a generative model-based attention module is trained to represent feature semantics; subsequently, the Generative-based Feature Distillation phase encompasses both Generative Distillation and Attention Distillation, with the objective of transferring attention-based feature semantics with the generative model. The efficacy of our approach is demonstrated through comprehensive experiments on diverse popular datasets, showing considerable improvements on the video action recognition task. Moreover, the effectiveness of our proposed framework is validated on the more intricate video action detection task. Our code is available at https://***/aaai24/Generative-based-KD. Copyright © 2023, The Authors. All rights reserved.

Keywords: Distillation

FedDSV: Shapley Value-based Contribution Estimation in Federated Learning with Dynamic Participation

IEEE Transactions on Mobile Computing, 2025

Authors: Lei, Kaijia; Ren, Xuebin; Yang, Shusen; Wang, Xiaocheng; Zhao, Fangyuan. Affiliations: Xi'an Jiaotong University, School of Computer Science and Technology, Xi'an, Shaanxi 710049, China; Xi'an Jiaotong University, National Engineering Laboratory for Big Data Analytics, Xi'an, Shaanxi 710049, China; Xi'an Jiaotong University, National Engineering Laboratory for Big Data Analytics (NEL-BDA), Xi'an, Shaanxi 710049, China; Xi'an Jiaotong University, Ministry of Education Key Lab for Intelligent Networks and Network Security (MOE KLINNS Lab), Xi'an, Shaanxi 710049, China

Federated Learning (FL) succeeds in collaborative and privacy-preserving ML model training among multiple distributed data owners. To maintain a healthy FL ecosystem, it is crucial to estimate the contributions of all participants fairly. Due to its provable fairness, the Shapley value (SV) is widely used for contribution estimation in FL. However, current studies focus on static scenarios with fixed participants and neglect the dynamic settings in which participants randomly join or leave in practice. This paper fills the gap by proposing FedDSV, a novel contribution estimation framework for FL with dynamic participation. FedDSV supports flexible weighting mechanisms and is compatible with the SV fairness properties in dynamic scenarios. To reduce the computational complexity, we propose a Monte Carlo variant sampling method (SMC), which can adapt well to dynamic scenarios and approximate the true SVs. To evaluate the effectiveness and efficiency of our proposed approaches, extensive experiments under different settings (e.g., frequency switching, low-quality detection, etc.) are conducted on both i.i.d. and non-i.i.d. distributions. Experimental results demonstrate that FedDSV can reflect the real utility contribution of data sources for dynamic FL, and SMC can approximate the exact dynamic SVs with larger similarities in a much shorter time than the state-of-the-art methods. © 2002-2012 IEEE.
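For readers unfamiliar with sampling-based Shapley estimation, the following is a minimal sketch of the classic permutation-sampling Monte Carlo estimator in a static setting. It is not the SMC method or the FedDSV framework from the abstract, whose details are not given here. The function name `monte_carlo_shapley`, the `utility` callback, and the toy weights are all hypothetical and chosen only for illustration.

```python
import random
from typing import Callable, Dict, FrozenSet, Sequence


def monte_carlo_shapley(
    players: Sequence[str],
    utility: Callable[[FrozenSet[str]], float],
    num_permutations: int = 200,
    seed: int = 0,
) -> Dict[str, float]:
    """Estimate Shapley values by averaging marginal contributions
    over randomly sampled orderings of the players (hypothetical helper)."""
    rng = random.Random(seed)
    totals = {p: 0.0 for p in players}
    for _ in range(num_permutations):
        order = list(players)
        rng.shuffle(order)
        coalition = set()
        previous = utility(frozenset(coalition))
        for p in order:
            # Marginal contribution of p when it joins the current coalition.
            coalition.add(p)
            current = utility(frozenset(coalition))
            totals[p] += current - previous
            previous = current
    return {p: total / num_permutations for p, total in totals.items()}


# Toy additive utility (an assumption): each participant adds a fixed amount,
# so the exact Shapley value of each participant equals its own weight.
toy_weights = {"A": 1.0, "B": 2.0, "C": 3.0}


def additive_utility(coalition: FrozenSet[str]) -> float:
    return sum(toy_weights[p] for p in coalition)


estimates = monte_carlo_shapley(list(toy_weights), additive_utility)
```

In a real FL setting the utility would be something like the validation performance of a model trained on a coalition's updates, which is why reducing the number of utility evaluations matters.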

Keywords: Contribution estimation; dynamic participation; federated learning; Shapley value

Joint Energy and Frequency Regulation Market Clearing Considering Wind Power Uncertainty

IEEE/IAS Industrial and Commercial Power System Asia (I&CPS Asia)

Authors: Yemin Wu; Jiyao Wang; Chenyu Peng; Chunyu Chen; Xuemei Dai; Jianxiao Wang. Affiliations: School of Electrical Engineering, China University of Mining and Technology, Xuzhou, China; School of Electrical Engineering, Shanghai Electric Power University, Shanghai, China; National Engineering Laboratory of Big Data Analytics and Applied Technologies, Peking University, Beijing, China

Efficient integration of renewable power such as wind energy in electricity markets is among the highest priorities. However, deterministic market designs may not effectively adapt to the uncertainty of renewable power. In this paper, we investigate joint energy and frequency regulation market clearing considering wind power uncertainty. By introducing a chance-constrained market design, we propose a joint market-clearing strategy for both conventional deterministic units and uncertain wind farms. Through simulation based on real market data, we show that as the confidence level increases, system stability improves, but this comes at the expense of increased wind abandonment losses and higher operating costs in the market. At the same time, a higher penetration of wind power leads to lower market operating costs and energy shadow prices.
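The trade-off between confidence level and wind abandonment can be illustrated with a single chance constraint. The sketch below assumes a Gaussian wind forecast, which the abstract does not state, and is not the paper's market-clearing model. The function `max_scheduled_wind` and the numbers used are hypothetical. It rewrites P(actual wind >= scheduled wind) >= confidence as scheduled wind <= mean + std * Phi^{-1}(1 - confidence), where Phi^{-1} is the standard normal quantile function.

```python
from statistics import NormalDist


def max_scheduled_wind(mean_mw: float, std_mw: float, confidence: float) -> float:
    """Largest wind schedule x (MW) with P(actual wind >= x) >= confidence,
    assuming a Gaussian forecast (an illustrative assumption)."""
    if not 0.0 < confidence < 1.0:
        raise ValueError("confidence must be strictly between 0 and 1")
    # Standard normal quantile at 1 - confidence; negative for confidence > 0.5.
    z = NormalDist().inv_cdf(1.0 - confidence)
    return max(0.0, mean_mw + z * std_mw)


# Hypothetical forecast: 100 MW mean with a 20 MW standard deviation.
schedule_90 = max_scheduled_wind(100.0, 20.0, 0.90)
schedule_95 = max_scheduled_wind(100.0, 20.0, 0.95)
schedule_99 = max_scheduled_wind(100.0, 20.0, 0.99)
```

Under these assumptions, raising the confidence level lowers the amount of wind that can be scheduled, which is consistent with the higher wind abandonment reported in the abstract.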

Transfer and Alignment Network for Generalized Category Discovery

arXiv, 2023

Authors: An, Wenbin; Tian, Feng; Shi, Wenkai; Chen, Yan; Wu, Yaqiang; Wang, Qianying; Chen, Ping. Affiliations: School of Automation Science and Engineering, Xi’an Jiaotong University, China; School of Computer Science and Technology, Xi’an Jiaotong University, China; National Engineering Laboratory for Big Data Analytics, China; Lenovo Research, China; Department of Engineering, University of Massachusetts Boston, United States

Generalized Category Discovery (GCD) is a crucial real-world task that aims to recognize both known and novel categories from an unlabeled dataset by leveraging another labeled dataset with only known categories. Despite the improved performance on known categories, current methods perform poorly on novel categories. We attribute the poor performance to two reasons: biased knowledge transfer between labeled and unlabeled data and noisy representation learning on the unlabeled data. The former leads to unreliable estimation of learning targets for novel categories and the latter hinders models from learning discriminative features. To mitigate these two issues, we propose a Transfer and Alignment Network (TAN), which incorporates two knowledge transfer mechanisms to calibrate the biased knowledge and two feature alignment mechanisms to learn discriminative features. Specifically, we model different categories with prototypes and transfer the prototypes in labeled data to correct model bias towards known categories. On the one hand, we pull instances with known categories in unlabeled data closer to these prototypes to form more compact clusters and avoid boundary overlap between known and novel categories. On the other hand, we use these prototypes to calibrate noisy prototypes estimated from unlabeled data based on category similarities, which allows for more accurate estimation of prototypes for novel categories that can be used as reliable learning targets later. After knowledge transfer, we further propose two feature alignment mechanisms to acquire both instance- and category-level knowledge from unlabeled data by aligning instance features with both augmented features and the calibrated prototypes, which can boost model performance on both known and novel categories with less noise. Experiments on three benchmark datasets show that our model outperforms SOTA methods, especially on novel categories. Theoretical analysis is provided for an in-depth understanding

Keywords: Alignment

Weakly-Supervised Action Localization by Hierarchically-structured Latent Attention Modeling

International Conference on Computer Vision (ICCV)

Authors: Guiqin Wang; Peng Zhao; Cong Zhao; Shusen Yang; Jie Cheng; Luziwei Leng; Jianxing Liao; Qinghai Guo. Affiliations: School of Computer Science and Technology, Xi’an Jiao Tong University; School of Mathematics and Statistics, Xi’an Jiao Tong University; National Engineering Laboratory for Big Data Analytics, Xi’an Jiao Tong University; ACS Lab, Huawei Technologies

Weakly-supervised action localization aims to recognize and localize action instances in untrimmed videos with only video-level labels. Most existing models rely on multiple instance learning (MIL), where the predictions of unlabeled instances are supervised by classifying labeled bags. The MIL-based methods are relatively well studied with cogent performance achieved on classification but not on localization. Generally, they locate temporal regions by the video-level classification but overlook the temporal variations of feature semantics. To address this problem, we propose a novel attention-based hierarchically-structured latent model to learn the temporal variations of feature semantics. Specifically, our model entails two components: the first is an unsupervised change-points detection module that detects change-points by learning the latent representations of video features in a temporal hierarchy based on their rates of change, and the second is an attention-based classification model that selects the change-points of the foreground as the boundaries. To evaluate the effectiveness of our model, we conduct extensive experiments on two benchmark datasets, THUMOS-14 and ActivityNet-v1.3. The experiments show that our method outperforms current state-of-the-art methods, and even achieves comparable performance with fully-supervised methods.

Mind Reasoning Manners: Enhancing Type Perception for Generalized Zero-shot Logical Reasoning over Text

arXiv, 2023

Authors: Xu, Fangzhi; Liu, Jun; Lin, Qika; Zhao, Tianzhe; Zhang, Jian; Zhang, Lingling. Affiliations: School of Computer Science and Technology, Xi’an Jiaotong University, Xi’an, China; Shaanxi Province Key Laboratory of Satellite and Terrestrial Network Tech. R&D, National Engineering lab for Big Data Analytics, Xi’an, China

Logical reasoning task involves diverse types of complex reasoning over text, based on the form of multiple-choice question answering. Given the context, question and a set of options as the input, previous methods achieve superior performances on the full-data setting. However, the current benchmark dataset has the ideal assumption that the reasoning type distribution on the train split is close to the test split, which is inconsistent with many real application scenarios. To address it, there remain two problems to be studied: (1) How is the zero-shot capability of the models (train on seen types and test on unseen types)? (2) How to enhance the perception of reasoning types for the models? For problem 1, we propose a new benchmark for generalized zero-shot logical reasoning, named ZsLR. It includes six splits based on the three type sampling strategies. For problem 2, a type-aware model TaCo is proposed. It utilizes both the heuristic input reconstruction and the contrastive learning to improve the type perception in the global representation. Extensive experiments on both the zero-shot and full-data settings prove the superiority of TaCo over the state-of-the-art methods. Also, we experiment and verify the generalization capability of TaCo on other logical reasoning dataset. © 2023, CC BY.

Keywords: Benchmarking

Weakly-Supervised Action Localization by Hierarchically-structured Latent Attention Modeling

arXiv, 2023

Authors: Wang, Guiqin; Zhao, Peng; Zhao, Cong; Yang, Shusen; Cheng, Jie; Leng, Luziwei; Liao, Jianxing; Guo, Qinghai. Affiliations: School of Computer Science and Technology, Xi’an Jiaotong University, China; ACS Lab, Huawei Technologies, China; School of Mathematics and Statistics, Xi’an Jiaotong University, China; National Engineering Laboratory for Big Data Analytics, Xi’an Jiaotong University, China

Weakly-supervised action localization aims to recognize and localize action instances in untrimmed videos with only video-level labels. Most existing models rely on multiple instance learning (MIL), where the predictions of unlabeled instances are supervised by classifying labeled bags. The MIL-based methods are relatively well studied with cogent performance achieved on classification but not on localization. Generally, they locate temporal regions by the video-level classification but overlook the temporal variations of feature semantics. To address this problem, we propose a novel attention-based hierarchically-structured latent model to learn the temporal variations of feature semantics. Specifically, our model entails two components: the first is an unsupervised change-points detection module that detects change-points by learning the latent representations of video features in a temporal hierarchy based on their rates of change, and the second is an attention-based classification model that selects the change-points of the foreground as the boundaries. To evaluate the effectiveness of our model, we conduct extensive experiments on two benchmark datasets, THUMOS-14 and ActivityNet-v1.3. The experiments show that our method outperforms current state-of-the-art methods, and even achieves comparable performance with fully-supervised methods. Copyright © 2023, The Authors. All rights reserved.

Keywords: Semantics
