Search Results - Inner Mongolia University Library

15th International Conference on Machine Learning and Computing, ICMLC 2023

Authors: Mejhed Mkhinini, Meriem Sidibe, Aboubacar Sidiki Benali, Khaoula Bentaarit, Nouha Khelifi, Aymen Kaisens Data Paris France ENSI National School of Computer Science Tunis Tunisia

ISBN: (print) 9781450398411

The amount of violent content shared on social networks makes them a very unhealthy space. This has given rise to a growing research domain that involves filtering social media content using Artificial Intelligence-powered violence detection systems. In this paper, we propose a new deep learning-based approach to address this issue. We use a two-layered model: first, a deep representation-based model that uses transfer learning to recognize violent content in a video; second, a text classifier that detects verbal violence using the audio cue. The results show that our approach outperforms state-of-the-art accuracies by learning the most discriminating features, achieving 90% accuracy on the test set for physical violence detection and 89% for verbal violence detection. © 2023 ACM.

Keywords: computer vision

Identification of Post Covid Symptoms in Long Haulers

3rd IEEE International Conference on ICT in Business Industry and Government, ICTBIG 2023

Authors: Jadhav, Aditi Bhamare, Mamta S School of Computer Engineering and Technology Mit Wpu Data Science and Analysis Pune India

ISBN: (print) 9798350343274

Covid-19 is a disastrous infection that the whole world battled for two years, from 2020 to 2021. Because the virus was new and doctors had no prior knowledge of it, they treated patients as best they could from experience in order to save lives. The virus was contagious and lockdowns were imposed; people faced financial problems from job losses or businesses shut down for months, and many could not be admitted to hospitals and lost their lives for lack of facilities. While the world was dealing with these issues, one group of people recovered from Covid-19 but experienced changes in their health, such as impairments they did not have before infection or a worsening of health issues that had been mild before infection. These are called post-Covid-19 symptoms, and the patients who show them are termed 'Long Haulers'. Long Covid-19 symptoms differ from person to person, ranging from mild symptoms such as headache and fatigue to severe symptoms that affect vital organs. In some cases, the effect on the vital organ is irreversible. The idea behind this study is to collect data on post-Covid-19 patients, analyze it, and train an ML model; the trained model then predicts which disease a post-Covid-19 patient is prone to from his or her details. This will help users get an early indication and take prophylactic measures by consulting the relevant specialist. © 2023 IEEE.

Keywords: COVID-19

Data-Efficient Radiology Report Generation via Similar Report Features Enhancement

3rd International Workshop on Applications of Medical Artificial Intelligence, AMAI 2024 held in conjunction with the 27th International Conference on Medical Image Computing and Computer Assisted Intervention, MICCAI 2024

Authors: Li, Yanfeng Sun, Jinghan Wang, Liansheng Department of Computer Science at School of Informatics Xiamen University Xiamen China National Institute for Data Science in Health and Medicine Xiamen University Xiamen China

ISBN: (print) 9783031820069

The utilization of Artificial Intelligence in automatically generating radiology reports presents a promising solution for enhancing the efficiency of the diagnostic process and reducing human error. However, existing methods require training on large datasets of image-report pairs, which are often scarce. Moreover, the accuracy of reports generated with limited paired data significantly diminishes. To address these challenges, this study introduces a data-efficient method that integrates the retrieval of similar reports with text fusion enhancements to tackle the scarcity of image-report pairs and generate accurate radiology reports. Our method is compared with several state-of-the-art approaches, showing advancements on the MIMIC-CXR and IU X-ray benchmarks with the same limited data pairs. It achieves near-optimal results on MIMIC-CXR and comparable results on IU X-ray, highlighting not only its effectiveness and potential to improve radiological diagnosis with fewer image-report pairs but also its ability to generate more accurate reports. By enhancing cross-modal feature interaction and demonstrating higher diagnostic accuracy, this work contributes to the fields of clinical medicine and artificial intelligence. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: Radiology

A Novelty Framework in Image-Captioning with Visual Attention-Based Refined Visual Features

Computers, Materials & Continua, 2025, Vol. 82, No. 3, pp. 3943-3964

Authors: Alaa Thobhani, Beiji Zou, Xiaoyan Kui, Amr Abdussalam, Muhammad Asim, Mohammed ELAffendi, Sajid Shah. School of Computer Science and Engineering, Central South University, Changsha 410083, China; Electronic Engineering and Information Science Department, University of Science and Technology of China, Hefei 230026, China; EIAS Data Science Lab, College of Computer and Information Sciences, Prince Sultan University, Riyadh 11586, Saudi Arabia

Image captioning, the task of generating descriptive sentences for images, has advanced significantly with the integration of semantic ***, traditional models still rely on static visual features that do not evolve with the changing linguistic context, which can hinder the ability to form meaningful connections between the image and the generated *** limitation often leads to captions that are less accurate or *** this paper, we propose a novel approach to enhance image captioning by introducing dynamic interactions where visual features continuously adapt to the evolving linguistic *** model strengthens the alignment between visual and linguistic elements, resulting in more coherent and contextually appropriate ***, we introduce two innovative modules: the Visual Weighting Module (VWM) and the Enhanced Features Attention Module (EFAM). The VWM adjusts visual features using partial attention, enabling dynamic reweighting of the visual inputs, while the EFAM further refines these features to improve their relevance to the generated *** continuously adjusting visual features in response to the linguistic context, our model bridges the gap between static visual features and dynamic language *** demonstrate the effectiveness of our approach through experiments on the MS-COCO dataset, where our method outperforms state-of-the-art techniques in terms of caption quality and contextual *** results show that dynamic visual-linguistic alignment significantly enhances image captioning performance.

Keywords: image captioning; visual attention; deep learning; visual features

An Efficient Partial Video Copy Detection for a Large-scale Video Database

9th International Conference on Big Data Computing and Communications, BigCom 2023

Authors: Luo, Zhan Zhang, Lan Lai, Jiewei Wang, Xinming Tang, Chen University of Science and Technology of China School of Data Science Hefei China University of Science and Technology of China School of Computer Science and Technology Hefei China

ISBN: (print) 9798350331240

Partial video copy detection (PVCD) aims to discover copy segments of query videos from a video database, which plays an important role in video copyright protection, filtering, tracking, etc. For a large-scale video database, PVCD can be divided into two stages: the first stage involves searching for video-level copies of the query video in the database, and the second stage is to further localize the copy segments within the video-level copies. Thus, two major challenges arise: (1) efficiently and effectively calculating the similarity between videos; (2) localizing mixed-duration video pairs. To address the above challenges, we propose an efficient PVCD approach for a large-scale video database, based on the Bag-of-Words (BoW) framework, which decouples video-level similarity and copy localization into cell-level. This approach consists of two modules. The first is an efficient video similarity measurement (VSM) module for the large-scale video database. VSM aggregates cell-level similarity into video-level similarity, and with a dual index, it greatly improves retrieval speed while accurately measuring spatiotemporal transformations. The second is a greedy pattern detection (GPD) module for video copy localization. GPD quickly and accurately detects similarity patterns through a greedy strategy on the similarity matrix formed by matching frames in each cell, then aggregates them into complete predicted copy segments. On the comprehensive dataset self-SVD, VSM significantly outperforms state-of-the-art methods by 7.28% in mAP, and the retrieval speed is increased by over 318 times. Additionally, for short videos at the scale of hundreds of millions, the response speed can theoretically reach seconds. On the copy localization dataset MIX, composed of mixed-duration videos, GPD also achieves the best performance. © 2023 IEEE.

Keywords: Aggregates

An Amalgamated CNN-Transformer Network for Lightweight Image Super-Resolution

Journal of Network Intelligence, 2024, Vol. 9, No. 3, pp. 1376-1387

Authors: Fang, Jinsheng Lin, Hanjiang Zeng, Kun School of Computer Science and Engineering Minnan Normal University Zhangzhou 363000 China Key Laboratory of Data Science and Intelligence Application Fujian Province University Fujian Zhangzhou 363000 China School of Computer and Control Engineering Fujian Provincial Key Laboratory of Information Processing and Intelligent Control Minjiang University Fuzhou 350108 China

Recently, Transformer-based methods for single image super-resolution (SISR) have achieved better performance than methods based on convolutional neural networks (CNNs). Exploiting the self-attention mechanism to model global context improves the SR results. However, neglecting local information inevitably reduces network performance. In this work, we propose an Amalgamated CNN-Transformer network for lightweight SR, namely ACTSR. Specifically, an amalgamated CNN-Transformer block (ACTB) is developed to extract useful information from both local and global features. By employing stacked ACTBs, our ACTSR extracts more informative deep features for super-resolution reconstruction, improving network performance while remaining lightweight and flexible. Extensive experiments on commonly used benchmark datasets show that our ACTSR outperforms advanced competitors. Our code is available at: https://***/ginsengf/ACTSR. © 2024, Taiwan Ubiquitous Information CO LTD. All rights reserved.

Keywords: Convolutional neural networks

RAM-CG: Relation-aware Adaptive Model for Continual Graph Learning

2nd Asia Conference on Advanced Robotics, Automation, and Control Engineering, ARACE 2023

Authors: Shen, Qinghua Liu, Guiquan University of Science and Technology of China School of Data Science Hefei China University of Science and Technology of China School of Computer Science and Technology Hefei China

ISBN: (print) 9798350328363

Neural networks excel at mining static graphs, but learning from streaming graphs without forgetting previous knowledge is an emerging challenge known as continual graph learning (CGL). Despite recent progress in this area, two significant challenges persist: 1) most existing works only manipulate the intermediate graph embedding and ignore intrinsic properties of graphs, which limits the learning method's pertinence; 2) recent attempts obscure the transferable knowledge and lack an explicit description of the transfer process. It is non-trivial to tell what information is transferred across graphs and how. In this paper, we point out that latent relations behind graph edges, as an intrinsic graph property, can be regarded as an invariant factor of the evolving graph sequence and can enable knowledge transfer. Motivated by this, we design a relation-aware adaptive model, dubbed RAM-CG, which consists of a relation analysis module that explores latent relations behind edges for message passing and a task-awareness masking classifier that accounts for the shifting knowledge in the graph sequence. Extensive experiments show that RAM-CG provides significant accuracy improvements of 2.2%, 4.2% and 6.6% over the state-of-the-art results on the CitationNet, OGBN-arxiv and TWITCH datasets. © 2023 IEEE.

Keywords: Message passing

A novel approach for image retrieval in remote sensing using vision-language-based image caption generation

Multimedia Tools and Applications, 2025, Vol. 84, No. 6, pp. 2985-3014

Authors: Yadav, Prem Shanker Tyagi, Dinesh Kumar Vipparthi, Santosh Kumar Department of Computer Science and Engineering Malaviya National Institute of Technology Rajasthan Jaipur 302017 India School of Artificial Intelligence and Data Engineering Indian Institute of Technology Ropar Punjab Rupnagar 140001 India

Recent advancements in satellite technologies have resulted in the emergence of Remote Sensing (RS) images. Hence, a primary research need is a precise retrieval model that returns the most pertinent images for a query. Present Remote Sensing Image Retrieval (RSIR) systems use visual descriptors to characterize the primitives (such as various land-cover types) that are visible in the images. However, visual descriptors are inadequate for describing the complicated content of RS images. To solve this problem, a new model is devised for image retrieval based on image captions. The goal is to generate textual descriptions with captions that define relations among objects precisely. Here, image captioning is performed with a vision-language pre-training model. The image captions are used to generate features such as term frequency-inverse document frequency (TF-IDF), length of text, and Bag of Words. The same features (TF-IDF, text length, and Bag of Words) are extracted from the query text. The similarity between the query text features and the image caption features is computed with a hybrid similarity measure whose weights are tuned with the proposed Honey Badger Political Optimizer (HBPO) to retrieve the image. The proposed HBPO provided enhanced efficiency, with a precision of 93.3%, recall of 93.7%, F1-score of 93.5%, and Recall-Oriented Understudy for Gisting Evaluation (ROUGE) of 0.441. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2024.
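As a rough illustration of the caption-based retrieval idea in this abstract, the sketch below ranks captions against a query by a weighted mix of TF-IDF cosine similarity, Bag-of-Words overlap, and text-length similarity. It is only a sketch under stated assumptions: the function names, the Jaccard and length-ratio measures, and the fixed weights are hypothetical stand-ins, and the paper's HBPO weight tuning and vision-language captioning model are not reproduced.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase whitespace tokenizer (an assumption; the paper's preprocessing is not given)."""
    return text.lower().split()


def tfidf_vectors(docs):
    """Return one {term: tf-idf weight} dict per tokenized document."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed idf so that terms present in every document keep a nonzero weight.
    idf = {term: math.log((1 + n) / (1 + count)) + 1.0 for term, count in df.items()}
    return [
        {term: count / len(doc) * idf[term] for term, count in Counter(doc).items()}
        for doc in docs
    ]


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def jaccard(a, b):
    """Bag-of-Words overlap measured as Jaccard similarity of the token sets."""
    a, b = set(a), set(b)
    return len(a & b) / len(a | b) if a | b else 0.0


def length_similarity(a, b):
    """Ratio of the shorter token count to the longer one."""
    return min(len(a), len(b)) / max(len(a), len(b)) if a and b else 0.0


def rank_captions(query, captions, weights=(0.6, 0.3, 0.1)):
    """Return caption indices sorted from most to least similar to the query.

    The weights are fixed, hypothetical values; the paper tunes them with HBPO.
    """
    docs = [tokenize(caption) for caption in captions]
    q = tokenize(query)
    vectors = tfidf_vectors(docs + [q])
    q_vec = vectors[-1]
    w_tfidf, w_bow, w_len = weights
    scored = []
    for i, doc in enumerate(docs):
        score = (
            w_tfidf * cosine(q_vec, vectors[i])
            + w_bow * jaccard(q, doc)
            + w_len * length_similarity(q, doc)
        )
        scored.append((score, i))
    return [i for _, i in sorted(scored, key=lambda item: (-item[0], item[1]))]
```

Calling `rank_captions("airplanes parked at an airport", captions)` on a small list of captions returns the caption indices in descending order of hybrid similarity; swapping in tuned weights only requires changing the `weights` argument.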

Keywords: Visual languages

CMNEE: A Large-Scale Document-Level Event Extraction Dataset based on Open-Source Chinese Military News

Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024

Authors: Zhu, Mengna Xu, Zijie Zeng, Kaisheng Xiao, Kaiming Wang, Mao Ke, Wenjun Huang, Hongbin Laboratory for Big Data and Decision National University of Defense Technology China School of Computer Science and Engineering Southeast University China Computer Science and Technology Tsinghua University China

ISBN: (print) 9782493814104

Extracting structured event knowledge, including event triggers and corresponding arguments, from military texts is fundamental to many applications, such as intelligence analysis and decision assistance. However, event extraction in the military field faces a data scarcity problem, which impedes research on event extraction models in this domain. To alleviate this problem, we propose CMNEE, a large-scale, document-level open-source Chinese Military News Event Extraction dataset. It contains 17,000 documents and 29,223 events, all manually annotated according to a pre-defined schema for the military domain that includes 8 event types and 11 argument role types. We designed a two-stage, multi-turn annotation strategy to ensure the quality of CMNEE and reproduced several state-of-the-art event extraction models with a systematic evaluation. The experimental results on CMNEE are clearly lower than those on datasets from other domains, which demonstrates that event extraction for the military domain poses unique challenges and requires further research efforts. Our code and data can be obtained from https://***/Mzzzhu/CMNEE. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Keywords: data mining

Question-Answering Pair Matching Based on Question Classification and Ensemble Sentence Embedding

Computer Systems Science & Engineering, 2023, Vol. 46, No. 9, pp. 3471-3489

Authors: Jae-Seok Jang, Hyuk-Yoon Kwon. Department of Computer Science and Engineering, Seoul National University of Science and Technology, Seoul 01811, Korea; Department of Industrial Engineering/Graduate School of Data Science, Seoul National University of Science and Technology, Seoul 01811, Korea

Question-answering (QA) models find answers to a given *** necessity of automatically finding answers is increasing because it is very important and challenging from the large-scale QA data *** this paper, we deal with the QA pair matching approach in QA models, which finds the most relevant question and its recommended answer for a given *** studies for the approach performed on the entire dataset or datasets within a category that the question writer manually *** contrast, we aim to automatically find the category to which the question belongs by employing the text classification model and to find the answer corresponding to the question within the *** to the text classification model, we can effectively reduce the search space for finding the answers to a given ***, the proposed model improves the accuracy of the QA matching model and significantly reduces the model inference ***, to improve the performance of finding similar sentences in each category, we present an ensemble embedding model for sentences, improving the performance compared to the individual embedding *** real-world QA data sets, we evaluate the performance of the proposed QA matching *** a result, the accuracy of our final ensemble embedding model based on the text classification model is 81.18%, which outperforms the existing models by 9.81%∼14.16%***, in terms of the model inference speed, our model is faster than the existing models by 2.61∼5.07 times due to the effective reduction of search spaces by the text classification model.
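The two-step idea in this abstract (classify the question first, then match only within the predicted category using an ensemble of sentence representations) can be sketched as follows. This is a toy sketch under stated assumptions: the keyword-overlap `classify` function stands in for a trained text classification model, and the word-count and character-trigram views stand in for learned sentence embeddings; all names and data fields here are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(count * b[key] for key, count in a.items())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def word_view(text):
    """First representation: word counts."""
    return Counter(text.lower().split())


def trigram_view(text):
    """Second representation: character trigram counts."""
    s = text.lower()
    return Counter(s[i:i + 3] for i in range(len(s) - 2))


def ensemble_similarity(a, b):
    """Average the similarities from both views (a stand-in for ensemble embeddings)."""
    return 0.5 * cosine(word_view(a), word_view(b)) + 0.5 * cosine(
        trigram_view(a), trigram_view(b)
    )


def classify(question, keywords_by_category):
    """Toy classifier: pick the category whose keyword set overlaps the question most."""
    tokens = set(question.lower().split())
    return max(
        sorted(keywords_by_category),
        key=lambda category: len(tokens & keywords_by_category[category]),
    )


def match(question, qa_pairs, keywords_by_category):
    """Return (predicted category, best matching QA pair or None, candidates searched)."""
    category = classify(question, keywords_by_category)
    # Search-space reduction: only pairs in the predicted category are compared.
    candidates = [pair for pair in qa_pairs if pair["category"] == category]
    best = max(
        candidates,
        key=lambda pair: ensemble_similarity(question, pair["question"]),
        default=None,
    )
    return category, best, len(candidates)
```

With three stored QA pairs of which two belong to the predicted category, `match` compares the incoming question against only those two, which is the mechanism the abstract credits for the faster inference.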

Keywords: question answering; text classification model; data augmentation; text embedding
