Search Results - Inner Mongolia University Library

arXiv, 2024

Authors: Buettner, Kyle; Kovashka, Adriana. Intelligent Systems Program; Department of Computer Science, University of Pittsburgh, United States

There is a scarcity of multilingual vision-language models that properly account for the perceptual differences that are reflected in image captions across languages and cultures. In this work, through a multimodal, multilingual retrieval case study, we quantify the existing lack of model flexibility. We empirically show performance gaps between training on captions that come from native German perception and captions that have been either machine-translated or human-translated from English into German. To address these gaps, we further propose and evaluate caption augmentation strategies. While we achieve mean recall improvements (+1.3), gaps still remain, indicating an open area of future work for the community. Copyright © 2024, The Authors. All rights reserved.

Keywords: Machine translation

An Approach to Formation Control of UAVs Based on Applying Adapted Kohonen Neural Network

2023 IEEE Ural-Siberian Conference on Biomedical Engineering, Radioelectronics and Information Technology, USBEREIT 2023

Author: Khachumov, Mikhail. Ailamazyan Program Systems Institute of RAS, Intelligent Control Laboratory, Pereslavl-Zalessky, Russia; Federal Research Center "Computer Science and Control" of RAS, Moscow, Russia

ISBN: (print) 9798350336054

The paper gives a statement and considers the solution of an urgent scientific problem of formation control for a group of unmanned aerial vehicles (UAVs) operating in an unstable environment. To construct the reference trajectories and allocate UAVs to positions of a given structure, an original approach is proposed based on the adapted Kohonen neural network with a set of metrics, including Euclidean, Mahalanobis and Euclidean-Mahalanobis distances. To implement the movement of a UAV group in a complex environment, we apply the principles of intelligent-geometric control, which allows to combine flexible intelligent and precise geometric control methods within one concept. When moving along a given route, UAVs use a set of allowable control strategies. The developed modeling scheme is designed to take into account wind loads and possibly dangerous rapprochement of vehicles. To reflect the dynamics of UAVs, a special module is used that contains transfer functions integrated into a single stabilization system with an autopilot. The proposed approach has shown promise when simulating a series of formation control problems. © 2023 IEEE.

Keywords: Unmanned aerial vehicles (UAV)

Adding Argumentation into Human Evaluation of Long Document Abstractive Summarization: A Case Study on Legal Opinions

4th Workshop on Human Evaluation of NLP systems, HumEval 2024

Authors: Elaraby, Mohamed; Xu, Huihui; Gray, Morgan; Ashley, Kevin; Litman, Diane. Department of Computer Science, School of Computing and Information, United States; Learning Research and Development Center, United States; Intelligent Systems Program, School of Computing Information, University of Pittsburgh, Pittsburgh, PA, United States

ISBN: (print) 9782493814418

Human evaluation remains the gold standard for assessing abstractive summarization. However, current practices often prioritize constructing evaluation guidelines for fluency, coherence, and factual accuracy, overlooking other critical dimensions. In this paper, we investigate argument coverage in abstractive summarization by focusing on long legal opinions, where summaries must effectively encapsulate the document’s argumentative nature. We introduce a set of human-evaluation guidelines to evaluate generated summaries based on argumentative coverage. These guidelines enable us to assess three distinct summarization models, studying the influence of including argument roles in summarization. Furthermore, we utilize these evaluation scores to benchmark automatic summarization metrics against argument coverage, providing insights into the effectiveness of automated evaluation methods. © 2024 European Language Resources Association (ELRA).

Keywords: Human Evaluation; Legal Summarization; Summarization

Using LLMs to Discover Legal Factors

37th Annual Conference on Legal Knowledge and Information systems, JURIX 2024

Authors: Gray, Morgan; Savelka, Jaromir; Oliver, Wesley; Ashley, Kevin. Intelligent Systems Program, University of Pittsburgh, United States; School of Law, University of Pittsburgh, United States; School of Computer Science, Carnegie Mellon University, United States; School of Law, Duquesne University, United States

ISBN: (print) 9781643685625

Factors are a foundational component of legal analysis and computational models of legal reasoning. These factor-based representations enable lawyers, judges, and AI and Law researchers to reason about legal cases. In this paper, we introduce a methodology that leverages large language models (LLMs) to discover lists of factors that effectively represent a legal domain. Our method takes as input raw court opinions and produces a set of factors and associated definitions. We demonstrate that a semi-automated approach, incorporating minimal human involvement, produces factor representations that can predict case outcomes with moderate success, if not yet as well as expert-defined factors can. © 2024 The Authors.

Keywords: Case based reasoning

The Model of Managing a Group of UAVs by a Single Operator for Tasks Requiring Increased Attention

2023 International Conference on Industrial Engineering, Applications and Manufacturing, ICIEAM 2023

Authors: Khachumov, Mikhail; Khachumov, Vyacheslav. Intelligent Control Laboratory, Ailamazyan Program Systems Institute of RAS, Pereslavl-Zalessky, Russia; RUDN University, Moscow, Russia; Federal Research Center "Computer Science and Control" of RAS, Moscow, Russia

ISBN: (print) 9781665475952

The problem of optimizing the load on an operator of unmanned aerial vehicles (UAVs), which performs real-time tasks of researching and monitoring territories in an unstable environment is considered. Working load depends on the intensity of the information exchange between onboard and ground control stations and is determined by the agreed limit that leads to a limitation in the number of managed vehicles. As a mathematical model of the multi-functional control system, we use a closed-loop queuing system, which allows to establish coordination among its major elements (onboard computing complex, ground control station and operator) and reduce the loss of service requests. The latter is one of the main requirements to functioning of the system during emergencies. Of considerable interest are optimization statements for the problems of distribution of functions between elements and choosing the load modes. We gave formulations of some optimization problems in a general form and made appropriate conclusions. It is assumed that as an applied task, the problem of monitoring fire sites in forest areas can be considered, which requires increased attention from the operator to process information coming to the ground control station. © 2023 IEEE.

Keywords: Distribution functions

Solution to the Problem of Passing Over the Given Targets by an Unmanned Aerial Vehicle in an Unstable Environment

2023 International Russian Automation Conference, RusAutoCon 2023

Author: Khachumov, Mikhail. Intelligent Control Laboratory, Ailamazyan Program Systems Institute of RAS, Pereslavl-Zalessky, Russia; Federal Research Center "Computer Science and Control" of RAS, Moscow, Russia; RUDN University, Moscow, Russia

ISBN: (print) 9798350345551

The paper gives a statement and considers the solution to an urgent problem of flying over the given targets by an unmanned aerial vehicle (UAV) in unstable conditions. A criterion is formulated for constructing efficient routes and passing arbitrary located sites according to a given formal description of the terrain map with the presence of interferences. Heuristic rules are proposed aimed at minimizing the graph of the initial map and reducing the number of alternative flight routes, given the time constraints for completing the task. The search of the solution in this case can be carried out according to the base tree of routes. A heuristic algorithm for searching efficient routes in a limited time is considered, which has the complexity that ensures its implementation on the UAV onboard computer. It is substantiated that the proposed approach can significantly reduce the number of comparing alternatives for flying around given arbitrary located targets in comparison with the well-known methods for solving this problem. The paper describes a successful example of solving the UAV route planning problem with the proposed approach, which is complicated by the presence of wind flows affecting the speed and trajectory. © 2023 IEEE.

Keywords: Motion planning

Strategies to Leverage Foundational Model Knowledge in Object Affordance Grounding

IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Authors: Arushi Rai; Kyle Buettner; Adriana Kovashka. Department of Computer Science, University of Pittsburgh, PA, USA; Intelligent Systems Program, University of Pittsburgh, PA, USA

ISBN: (digital) 9798350365474

ISBN: (print) 9798350365481

An important task for intelligent systems is affordance grounding, where the goal is to locate regions on an object where an action can be performed. Past weakly supervised approaches learn from human-object interaction (HOI) by transferring grounding knowledge from exocentric to ego-centric views of an object. The use of HOI priors is inherently noisy and thus provides a limited source of supervision. To address this challenge, we identify that recent foundational models (i.e. VLMs and LLMs) can serve as auxiliary sources of knowledge for frameworks due to their vast world knowledge. In this work, we propose strategies to extract and leverage foundational model knowledge related to attributes and object parts to enhance an HOI-based affordance grounding framework. In particular, we propose to combine HOI and foundational model priors through (1) a spatial consistency loss and (2) heatmap aggregation. Our strategies result in mKLD and mNSS improvements, and insights suggest future directions for improving affordance grounding capabilities.

Keywords: Heating systems; Computer vision; Grounding; Affordances; Conferences; Pipelines; Pattern recognition

Investigating the Role of Attribute Context in Vision-Language Models for Object Recognition and Detection

IEEE Workshop on Applications of Computer Vision (WACV)

Authors: Kyle Buettner; Adriana Kovashka. Intelligent Systems Program, University of Pittsburgh, PA, USA; Department of Computer Science, University of Pittsburgh, PA, USA

Vision-language alignment learned from image-caption pairs has been shown to benefit tasks like object recognition and detection. Methods are mostly evaluated in terms of how well object class names are learned, but captions also contain rich attribute context that should be considered when learning object alignment. It is unclear how methods use this context in learning, as well as whether models succeed when tasks require attribute and object understanding. To address this gap, we conduct extensive analysis of the role of attributes in vision-language models. We specifically measure model sensitivity to the presence and meaning of attribute context, gauging influence on object embeddings through unsupervised phrase grounding and classification via description methods. We further evaluate the utility of attribute context in training for open-vocabulary object detection, fine-grained text-region retrieval, and attribution tasks. Our results show that attribute context can be wasted when learning alignment for detection, attribute meaning is not adequately considered in embeddings, and describing classes by only their attributes is ineffective. A viable strategy that we find to increase benefits from attributes is contrastive training with adjective-based negative captions.

LEATHER: A Framework for Learning to Generate Human-like Text in Dialogue

2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, AACL-IJCNLP 2022

Authors: Sicilia, Anthony; Alikhani, Malihe. Intelligent Systems Program, University of Pittsburgh, Pittsburgh, PA, United States; Computer Science Department, University of Pittsburgh, Pittsburgh, PA, United States

ISBN: (print) 9781959429043

Algorithms for text-generation in dialogue can be misguided. For example, in task-oriented settings, reinforcement learning that optimizes only task-success can lead to abysmal lexical diversity. We hypothesize this is due to poor theoretical understanding of the objectives in text-generation and their relation to the learning process (i.e., model training). To this end, we propose a new theoretical framework for learning to generate text in dialogue. Compared to existing theories of learning, our framework allows for analysis of the multi-faceted goals inherent to text-generation. We use our framework to develop theoretical guarantees for learners that adapt to unseen data. As an example, we apply our theory to study data-shift within a cooperative learning algorithm proposed for the GuessWhat?! visual dialogue game. From this insight, we propose a new algorithm, and empirically, we demonstrate our proposal improves both task-success and human-likeness of the generated text. Finally, we show statistics from our theory are empirically predictive of multiple qualities of the generated dialogue, suggesting our theory is useful for model-selection when human evaluations are not available. © AACL-IJCNLP *** rights reserved

Keywords: Learning algorithms

Decomposition and Reorganization of Phonetic Information for Speaker Embedding Learning

IEEE/ACM Transactions on Audio Speech and Language Processing, 2023, vol. 31, pp. 1745-1757

Authors: Hong, Qian-Bei; Wu, Chung-Hsien; Wang, Hsin-Min. National Cheng Kung University and Academia Sinica, Graduate Program of Multimedia Systems and Intelligent Computing, Tainan 701, Taiwan; Academia Sinica, Taipei 115, Taiwan; National Cheng Kung University, Department of Computer Science and Information Engineering, Tainan 701401, Taiwan; Academia Sinica, Institute of Information Science, Taipei 115, Taiwan

Speech content is closely related to the stability of speaker embeddings in speaker verification tasks. In this paper, we propose a novel architecture based on self-constraint learning (SCL) and reconstruction task (RT) to remove the influence of phonetic information on speaker embedding generation. First, SCL is used to reduce the divergence of frame-level features, which can avoid ambiguity between the resulting embeddings of the two utterances being compared. Second, RT is used to further remove phonetic information in frame-level layers, focusing on speaker-discriminative feature transformation. In our experiments, the speaker embedding models were trained on the VoxCeleb2 dataset and evaluated on the VoxCeleb1, Librispeech, SITW and VoxMovies datasets. Experimental results on VoxCeleb1 show that the proposed DROP-TDNN system reduced the EER by 7.5%, compared to the state-of-the-art ECAPA-TDNN system. Furthermore, the proposed DROP-TDNN system also outperformed the ECAPA-TDNN system in the experiments on SITW, Librispeech and VoxMovies under cross-dataset conditions. In the experiments on SITW, the proposed system reduced the EER by 3.4% compared to the ECAPA-TDNN system. In the experiments on Librispeech, the proposed system demonstrated the advantage of removing phonetic information under the clean speech condition, with a significant reduction of 25.5% in EER compared to the ECAPA-TDNN system. In the experiments on VoxMovies, the proposed system reduced the EER by up to 7.9% compared to the ECAPA-TDNN system under different pronunciation and background conditions. © 2014 IEEE.

Keywords: Speech recognition
