Search results - Inner Mongolia University Library

ViGT: proposal-free video grounding with a learnable token in the transformer

Science China (Information Sciences), 2023, Vol. 66, No. 10, pp. 196-212

Authors: Kun LI, Dan GUO, Meng WANG. School of Computer Science and Information Engineering, Hefei University of Technology; Key Laboratory of Knowledge Engineering with Big Data, Ministry of Education; Intelligent Interconnected Systems Laboratory of Anhui Province; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center

The video grounding (VG) task aims to locate the queried action or event in an untrimmed video based on rich linguistic descriptions. Existing proposal-free methods are trapped in the complex interaction between video and query, overemphasizing cross-modal feature fusion and feature correlation for VG. In this paper, we propose a novel boundary regression paradigm that performs regression token learning in a transformer. Particularly, we present a simple but effective proposal-free framework, namely video grounding transformer (ViGT), which predicts the temporal boundary using a learnable regression token rather than multi-modal or cross-modal features. In ViGT, the benefits of a learnable token are manifested as follows. (1) The token is unrelated to the video or the query and avoids data bias toward the original video and query. (2) The token simultaneously performs global context aggregation from video and query ***, we employed a shared feature encoder to project both video and query into a joint feature space before performing cross-modal co-attention (i.e., video-to-query attention and query-to-video attention) to highlight discriminative features in each modality. Furthermore, we concatenated a learnable regression token [REG] with the video and query features as the input of a vision-language transformer. Finally, we utilized the token [REG] to predict the target moment and visual features to constrain the foreground and background probabilities at each timestamp. The proposed ViGT performed well on three public datasets: ANet-Captions, TACoS, and YouCookⅡ. Extensive ablation studies and qualitative analysis further validated the interpretability of ViGT.

Keywords: video grounding temporal sentence grounding boundary regression token learning proposal-free

Automated Deep Learning Based Knee Osteoarthritis Joint Extraction and Classification

17th International Conference on Open Source Systems and Technologies, ICOSST 2023

Authors: Abbas, Muhammad Sohail; Jamil, Sonain; Khurshid, Atif. Department of Data Science and Artificial Engineering, Islamabad, Pakistan; Department of Computer Science, Gjovik 2815, Norway

ISBN: (print) 9798350381320

Knee osteoarthritis (KOA) is a widespread global condition, impacting over 300 million individuals as per the World Health Organization (WHO). Particularly prevalent among older adults, knee OA is a prominent cause of disability. Its occurrence increases with age, especially after 50, and is more frequent in women, particularly post-menopause. Several studies have been carried out so far for automated grading and classification of knee osteoarthritis (KOA), but none of them built strong foundations enough to make this system automated. This study focuses on machine-controlled knee joint extraction and grading classification with improved accuracy and performance. We used the osteoarthritis initiative (OAI) dataset of X-ray images for our study. Initially, a single-stage detector is used for joint extraction of the knee area as the X-ray images contain entire knees with both joints. Enhanced osteoarthritis feature extraction (OAFE) and osteoarthritis dimensionality reduction (OADR) blocks are used for grading classification. We have significantly improved state-of-the-art results. We have acquired joint extraction with a mean average precision (mAP) of 95.3% and grading classification accuracy of 78.93%. Furthermore, the performance due to the dimensionality reduction block has improved by a huge factor. © 2023 IEEE.

Keywords: Extraction

Enhancing Document-Level Relation Extraction through Entity-Pair-Level Interaction Modeling

2025 IEEE International Conference on Acoustics, Speech, and Signal Processing, ICASSP 2025

Authors: Liu, Wanlong; Zeng, Dingyi; Zhou, Li; Xiao, Yichen; Zhang, Malu; Chen, Wenyu. School of Computer Science and Engineering, University of Electronic Science and Technology of China, China; School of Data Science, The Chinese University of Hong Kong, Shenzhen, China

ISBN: (print) 9798350368741

Document-level relation extraction aims at extracting relational facts between two entities in a document. Existing approaches mainly focus on target entities, utilizing techniques such as graph neural networks to enhance their representations. However, they ignore the rich semantic correlations among entity pairs which provide wider and multifaceted information at a higher level. In this paper, we propose the Relation-based Entity-pair-level Inference (REI) model, which facilitates information interaction at the entity-pair level, enhancing logical reasoning among entities and capturing semantic correlations among entity pairs. Our REI model comprises two modules: Relation-based Information Aggregation (RIA) and Entity-pair-level Information Interaction (EII). The RIA module builds and integrates relation representations to filter out distractions from unrelated entity pairs, while the EII module models entity-pair-level information interaction through multi-head attentions. Extensive experiments on the DocRED, DWIE, CDR, and GDA datasets demonstrate the superiority of the proposed REI model, outperforming previous state-of-the-art approaches. Furthermore, we provide detailed experimental analyses based on the performance gains and illustrate the interpretability. © 2025 IEEE.

Keywords: Information Extraction; Linguistic Inference; Natural Language Processing; Relation Extraction

Word-Diffusion: Diffusion-Based Handwritten Text Word Image Generation

27th International Conference on Pattern Recognition, ICPR 2024

Authors: Gurav, Aniket; Krishnan, Narayanan C.; Chanda, Sukalpa. Department of Computer Science and Communication, Østfold University College, Halden 1757, Norway; Department of Data Science, Indian Institute of Technology Palakkad, Kerala, India

ISBN: (print) 9783031784941

Generating realistic handwritten word images that closely resemble a target style remains a challenging task in document image analysis. In recent years, deep learning techniques, such as Latent Diffusion Models (LDM), have shown promise in generating styled handwritten text. However, these models face significant challenges when creating images for ‘Out of Vocabulary’ (OOV) words, impacting their overall effectiveness. In this paper, we introduce an extended diffusion-based handwritten generation method that incorporates a novel conditioning mechanism. It is based on the Pyramidal Histogram of Shapes (PHOS) representation, which takes into account the spatial and structural characteristics of the target handwriting style. By conditioning the diffusion model on input text, PHOS vector, and writer ID, our approach enables the generation of handwritten word images. Notably, our approach outperforms the original diffusion model, which only uses text and writer ID as conditions, in generating both in-sample and out-of-sample. Furthermore, we have developed a faster inference method that significantly reduces the number of steps required for generating the output. Through qualitative and quantitative evaluations, we demonstrate the effectiveness of our proposed method. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: Deep learning

Knowledge Leaks in Data-Driven Business Models? Exploring Different Types of Knowledge Risks and Protection Measures

Schmalenbach Journal of Business Research, 2024, Vol. 76, No. 3, pp. 357-396

Authors: Fruhwirth, Michael; Pammer-Schindler, Viktoria; Thalmann, Stefan. Know-Center GmbH; Institute for Interactive Systems and Data Science, Faculty of Computer Science and Biomedical Engineering, Graz University of Technology; Silicon Austria Labs GmbH; Business Analytics and Data Science-Center (BANDAS-Center), School of Business, Economics and Social Sciences, University of Graz

Data-driven business models imply the inter-organisational exchange of data or similar value objects. Data science methods enable organisations to discover patterns and eventually knowledge from data. Further, by training machine learning models, knowledge is materialised in those models. Thus, organisations might risk the exposure of competitive knowledge by sharing data-related value objects, such as data, models or predictions. Although knowledge risks have been studied in traditional business models, little research has been conducted in the direction of data-driven business models. In this explorative qualitative study, we conducted 28 expert interviews in three rounds (two exploratory and one evaluatory) and identified five types of risks along the three basic types of value objects: data, models and predictions. These risks depend on the context, i.e., when competitive knowledge could be discovered from shared value objects. We found that those risks can be mitigated by technology, contractual regulations, trusted relationships, and adjusting the business model design. In this study, we show that the risk of knowledge leakage is a relevant risk factor in data-driven business models. Overall, knowledge risks should be considered already during business model design, and their management requires an interdisciplinary approach via a balanced assessment. The level of knowledge protection from a technology perspective highly depends on computer science innovations and thus is a moving target. As an outlook, we suggest that knowledge risk will become even more relevant with the extensive usage of machine learning and artificial intelligence in data-driven business models. © The Author(s) 2024.

Keywords: Business model innovation; Data analytics; Data-driven business models; Knowledge risks; Risk management; Value objects

Data-Efficient Radiology Report Generation via Similar Report Features Enhancement

3rd International Workshop on Applications of Medical Artificial Intelligence, AMAI 2024, held in conjunction with the 27th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2024

Authors: Li, Yanfeng; Sun, Jinghan; Wang, Liansheng. Department of Computer Science at School of Informatics, Xiamen University, Xiamen, China; National Institute for Data Science in Health and Medicine, Xiamen University, Xiamen, China

ISBN: (print) 9783031820069

The utilization of Artificial Intelligence in automatically generating radiology reports presents a promising solution for enhancing the efficiency of the diagnostic process and reducing human error. However, existing methods require training on large datasets of image-report pairs, which are often scarce. Moreover, the accuracy of reports generated with limited paired data significantly diminishes. To address these challenges, this study introduces a data-efficient method that integrates the retrieval of similar reports with text fusion enhancements to tackle the scarcity of image-report pairs and generate accurate radiology reports. Our method is compared with several state-of-the-art approaches, showing advancements on the MIMIC-CXR and IU X-ray benchmarks with the same limited data pairs. It achieves near-optimal results on MIMIC-CXR and comparable results on IU-Xray, highlighting not only its effectiveness and potential to improve radiological diagnosis with fewer image reports but also its ability to generate more accurate reports. By enhancing cross-modal feature interaction and demonstrating higher diagnostic accuracy, this work contributes to the fields of clinical medicine and artificial intelligence. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: Radiology

Phasmatodea Population Evolution Algorithm Based on Spiral Mechanism and Its Application to Data Clustering

Computers, Materials & Continua, 2025, Vol. 83, No. 4, pp. 475-496

Authors: Jeng-Shyang Pan, Mengfei Zhang, Shu-Chuan Chu, Xingsi Xue, Václav Snášel. College of Computer Science and Engineering, Shandong University of Science and Technology, Qingdao 266590, China; School of Artificial Intelligence, Nanjing University of Information Science and Technology, Nanjing 210044, China; Department of Information Management, Chaoyang University of Technology, Taichung 41349, Taiwan, China; Fujian Provincial Key Laboratory of Big Data Mining and Applications, Fujian University of Technology, Fuzhou 350118, China; Faculty of Electrical Engineering and Computer Science, VŠB-Technical University of Ostrava, Ostrava 70833, Czech Republic

Data clustering is an essential technique for analyzing complex datasets and continues to be a central research topic in data *** clustering algorithms, such as K-means, are widely used due to their simplicity and *** paper proposes a novel Spiral Mechanism-Optimized Phasmatodea Population Evolution Algorithm (SPPE) to improve clustering *** SPPE algorithm introduces several enhancements to the standard Phasmatodea Population Evolution (PPE) ***, a Variable Neighborhood Search (VNS) factor is incorporated to strengthen the local search capability and foster population ***, a position update model, incorporating a spiral mechanism, is designed to improve the algorithm’s global exploration and convergence ***, a dynamic balancing factor, guided by fitness values, adjusts the search process to balance exploration and exploitation *** performance of SPPE is first validated on CEC2013 benchmark functions, where it demonstrates excellent convergence speed and superior optimization results compared to several state-of-the-art metaheuristic *** further verify its practical applicability, SPPE is combined with the K-means algorithm for data clustering and tested on seven *** results show that SPPE-K-means improves clustering accuracy, reduces dependency on initialization, and outperforms other clustering *** study highlights SPPE’s robustness and efficiency in solving both optimization and clustering challenges, making it a promising tool for complex data analysis tasks.

Keywords: Phasmatodea population evolution algorithm; data clustering; meta-heuristic algorithm

Anchor Data Augmentation

37th Conference on Neural Information Processing Systems, NeurIPS 2023

Authors: Schneider, Nora; Goshtasbpour, Shirin; Perez-Cruz, Fernando. Computer Science Department, ETH Zurich, Zurich, Switzerland; Swiss Data Science Center, Zurich, Switzerland

ISBN: (print) 9781713899921

We propose a novel algorithm for data augmentation in nonlinear over-parametrized regression. Our data augmentation algorithm borrows from the literature on causality and extends the recently proposed Anchor regression (AR) method for data augmentation, which is in contrast to the current state-of-the-art domain-agnostic solutions that rely on the Mixup literature. Our Anchor Data Augmentation (ADA) uses several replicas of the modified samples in AR to provide more training examples, leading to more robust regression predictions. We apply ADA to linear and nonlinear regression problems using neural networks. ADA is competitive with state-of-the-art C-Mixup solutions. © 2023 Neural Information Processing Systems Foundation. All rights reserved.

Open-Vocabulary Calibration for Fine-tuned CLIP

41st International Conference on Machine Learning, ICML 2024

Authors: Wang, Shuoyuan; Wang, Jindong; Wang, Guoqing; Zhang, Bob; Zhou, Kaiyang; Wei, Hongxin. Department of Statistics and Data Science, Southern University of Science and Technology, Shenzhen, China; Department of Computer and Information Science, University of Macau, Taipa, China; William & Mary, Williamsburg, VA, United States; School of Computer Science and Engineering, University of Electronic Science and Technology of China, China; Department of Computer Science, Hong Kong Baptist University, Hong Kong

Vision-language models (VLMs) have emerged as formidable tools, showing their strong capability in handling various open-vocabulary tasks in image recognition, text-driven visual content generation, and visual chatbots, to name a few. In recent years, considerable efforts and resources have been devoted to adaptation methods for improving the downstream performance of VLMs, particularly on parameter-efficient fine-tuning methods like prompt learning. However, a crucial aspect that has been largely overlooked is the confidence calibration problem in fine-tuned VLMs, which could greatly reduce reliability when deploying such models in the real world. This paper bridges the gap by systematically investigating the confidence calibration problem in the context of prompt learning and reveals that existing calibration methods are insufficient to address the problem, especially in the open-vocabulary setting. To solve the problem, we present a simple and effective approach called Distance-Aware Calibration (DAC), which is based on scaling the temperature using as guidance the distance between predicted text labels and base classes. The experiments with 7 distinct prompt learning methods applied across 11 diverse downstream datasets demonstrate the effectiveness of DAC, which achieves high efficacy without sacrificing the inference speed. Our code is available at https://***/mlstat-Sustech/CLIP Calibration. Copyright 2024 by the author(s)

Keywords: Calibration

Functional Connectivity Disruptions in Alzheimer’s Disease: A Maximum Flow Perspective

12th International Conference on Computational Advances in Bio and Medical Sciences, ICCABS 2023

Authors: Stubby, Emma T.; Razavi, Seyed Majid; Khanmohammadi, Sina. School of Computer Science, University of Oklahoma, Norman 73019, United States; Data Science and Analytics Institute, University of Oklahoma, Norman 73019, United States

ISBN: (print) 9783031827679

Alzheimer’s disease is a neurological disorder characterized by functional and structural atrophy, leading to symptoms like memory loss and cognitive decline. This study seeks to analyze the disruptions of functional connectivity pathways within the brain caused by Alzheimer’s disease from the maximum flow perspective. More specifically, we computed the maximum flow pathways within the functional brain networks, and compared them between healthy controls and Alzheimer’s patients. Our results suggest that the Alzheimer’s patients utilize pathways related to the default mode network (DMN) more frequently and display significant alterations in the usage of paths connected to the striate cortex (SC). The increased usage of DMN pathways might point to a compensation mechanism that facilitates interregional communications in Alzheimer’s patients. Understanding the nature of such a compensation mechanism could help develop new treatment options for Alzheimer’s patients. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: Neurodegenerative diseases
