Search Results - Inner Mongolia University Library

Robust video question answering via contrastive cross-modality representation learning

Science China (Information Sciences), 2024, Vol. 67, No. 10, pp. 211-226

Authors: Xun YANG, Jianming ZENG, Dan GUO, Shanshan WANG, Jianfeng DONG, Meng WANG. Affiliations: School of Information Science and Technology, University of Science and Technology of China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center; School of Computer Science and Information Engineering, Hefei University of Technology; Institutes of Physical Science and Information Technology, Anhui University; School of Computer Science and Technology, Zhejiang Gongshang University

Video question answering (VideoQA) is a challenging yet important task that requires a joint understanding of low-level video content and high-level textual semantics. Despite the promising progress of existing efforts, recent studies revealed that current VideoQA models mostly tend to over-rely on the superficial correlations rooted in the dataset bias while overlooking the key video content, thus leading to unreliable results. Effectively understanding and modeling the temporal and semantic characteristics of a given video for robust VideoQA is crucial but, to our knowledge, has not been well investigated. To fill the research gap, we propose a robust VideoQA framework that can effectively model the cross-modality fusion and enforce the model to focus on the temporal and global content of videos when making a QA decision instead of exploiting the shortcuts in datasets. Specifically, we design a self-supervised contrastive learning objective to contrast the positive and negative pairs of multimodal input, where the fused representation of the original multimodal input is enforced to be closer to that of the intervened input based on video perturbation. We expect the fused representation to focus more on the global context of videos rather than some static keyframes. Moreover, we introduce an effective temporal order regularization to enforce the inherent sequential structure of videos for video representation. We also design a Kullback-Leibler divergence-based perturbation invariance regularization of the predicted answer distribution to improve the robustness of the model against temporal content perturbation of videos. Our method is model-agnostic and can be easily compatible with various VideoQA backbones. Extensive experimental results and analyses on several public datasets show the advantage of our method over the state-of-the-art methods in terms of both accuracy and robustness.

Keywords: video question answering; cross-modality fusion; contrastive learning; cross-media reasoning
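
The abstract above mentions a Kullback-Leibler divergence-based perturbation invariance regularization on the predicted answer distribution. As a rough illustration only, the following sketch computes such a penalty between the answer distributions for an original and a perturbed video. The function names, the direction of the divergence, and the example logits are assumptions made for this note and are not taken from the paper.

```python
import math


def softmax(logits):
    """Convert raw answer scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def perturbation_invariance_loss(logits_original, logits_perturbed):
    """Hypothetical penalty: how far the answer distribution moves when the
    video is perturbed. The KL direction used here is an assumption."""
    p = softmax(logits_original)
    q = softmax(logits_perturbed)
    return kl_divergence(p, q)


# Made-up answer logits for the same question on a clean and a perturbed clip.
clean = [2.0, 0.5, -1.0]
perturbed = [1.8, 0.7, -0.9]
loss = perturbation_invariance_loss(clean, perturbed)
```

A loss of zero would mean the predicted answer distribution is unchanged by the perturbation; larger values indicate greater sensitivity.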

A survey on augmenting knowledge graphs (KGs) with large language models (LLMs): models, evaluation metrics, benchmarks, and challenges

Discover Artificial Intelligence, 2024, Vol. 4, No. 1, pp. 1-28

Authors: Ibrahim, Nourhan; Aboulela, Samar; Ibrahim, Ahmed; Kashef, Rasha. Affiliations: Electrical, Computer and Biomedical Engineering, Toronto Metropolitan University, Toronto, ON, Canada; Faculty of Engineering, Alexandria University, Alexandria, Egypt; Computer Science, Western University, London, ON, Canada

Integrating Large Language Models (LLMs) with Knowledge Graphs (KGs) enhances the interpretability and performance of AI systems. This research comprehensively analyzes this integration, classifying approaches into three fundamental paradigms: KG-augmented LLMs, LLM-augmented KGs, and synergized frameworks. The evaluation examines each paradigm’s methodology, strengths, drawbacks, and practical applications in real-life scenarios. The findings highlight the substantial impact of these integrations in fundamentally improving real-time data analysis, efficient decision-making, and promoting innovation across various domains. In this paper, we also describe essential evaluation metrics and benchmarks for assessing the performance of these integrations, addressing challenges like scalability and computational overhead, and providing potential solutions. This comprehensive analysis underscores the profound impact of these integrations on improving real-time data analysis, enhancing decision-making efficiency, and fostering innovation across various domains. © The Author(s) 2024.

Keywords: Knowledge graph

A Novel CAPTCHA Recognition System Based on Refined Visual Attention

Computers, Materials & Continua, 2025, Vol. 83, No. 4, pp. 115-136

Authors: Zaid Derea, Beiji Zou, Xiaoyan Kui, Monir Abdullah, Alaa Thobhani, Amr Abdussalam. Affiliations: School of Computer Science and Engineering, Central South University, Changsha 410083, China; College of Computer Science and Information Technology, Wasit University, Wasit 52001, Iraq; Department of Computer Science and Artificial Intelligence, College of Computing and Information Technology, University of Bisha, Bisha 67714, Saudi Arabia; Electronic Engineering and Information Science Department, University of Science and Technology of China, Hefei 230026, China

Improving website security to prevent malicious online activities is crucial, and CAPTCHA (Completely Automated Public Turing test to tell Computers and Humans Apart) has emerged as a key strategy for distinguishing human users from automated ***-based CAPTCHAs, designed to be easily decipherable by humans yet challenging for machines, are a common form of this ***, advancements in deep learning have facilitated the creation of models adept at recognizing these text-based CAPTCHAs with surprising *** our comprehensive investigation into CAPTCHA recognition, we have tailored the renowned UpDown image captioning model specifically for this *** approach innovatively combines an encoder to extract both global and local features, significantly boosting the model’s capability to identify complex details within CAPTCHA *** the decoding phase, we have adopted a refined attention mechanism, integrating enhanced visual attention with dual layers of Long Short-Term Memory (LSTM) networks to elevate CAPTCHA recognition *** rigorous testing across four varied datasets, including those from Weibo, BoC, Gregwar, and Captcha 0.3, demonstrates the versatility and effectiveness of our *** results not only highlight the efficiency of our approach but also offer profound insights into its applicability across different CAPTCHA types, contributing to a deeper understanding of CAPTCHA recognition technology.

Keywords: Text-based CAPTCHA recognition; refined visual attention; web security; computer vision

A Novel Enhanced Approach for Security and Privacy Preserving in IoT Devices with Federal Learning Technique

SN Computer Science, 2024, Vol. 5, No. 6, pp. 1-17

Authors: Moeed, Syed Abdul; Karnati, Ramesh; Ashmitha, G.; Mohammad, Gouse Baig; Mohanty, Sachi Nandan. Affiliations: Department of Computer Science and Engineering, Kakatiya Institute of Technology and Science; Department of Computer Science and Engineering, Vardhaman College of Engineering; School of Computer Science & Engineering (SCOPE), VIT-AP University

The Internet of Things (IoT) has revolutionized our lives, but it has also introduced significant security and privacy challenges. The vast amount of data collected by these devices, often containing sensitive information, makes them prime targets for cyberattacks. Traditional security methods struggle to keep pace with the evolving threat landscape of the interconnected IoT world. In response, this paper explores the transformative potential of federated learning (FL) in safeguarding both the privacy and security of IoT devices. FL keeps data on individual devices, only sharing updated models, not raw data, thus protecting user privacy and eliminating the need for a central data storage server, which reduces the risk of data breaches and security vulnerabilities. The effectiveness of the proposed model is evaluated using established open-source datasets like NSL-KDD and UNSW NB15, ensuring real-world applicability. Analysis of the dataset's features enabled the development of a model utilizing federated learning. Notably, the proposed model achieved superior performance with FL in detecting attacks on IoT networks. Furthermore, this research investigates the transformative potential of FL to address the inherent security and privacy challenges plaguing traditional, centralized data collection methods in the ever-expanding realm of IoT devices. FL empowers collaborative learning on distributed datasets, allowing devices to collectively improve security without compromising user privacy. The findings of this study demonstrate the promise of FL as a secure and privacy-preserving solution for the future of IoT. © The Author(s), under exclusive licence to Springer Nature Singapore Pte Ltd. 2024.

Keywords: Cyberattacks; Federal learning; Internet of Things; Machine learning; Security and privacy
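
The abstract above states that federated learning shares only updated models rather than raw data. One common way to combine such updates is weighted federated averaging (FedAvg); the abstract does not say which aggregation rule the authors use, so the sketch below is an assumption for illustration, with hypothetical function names and toy numbers.

```python
def federated_average(client_weights, client_sizes):
    """Aggregate per-client model parameters into one global model.

    client_weights: one flat parameter list per client.
    client_sizes: number of local training samples per client, used as the
    averaging weight. Raw training data never leaves the clients; only these
    parameter lists are exchanged.
    """
    total = sum(client_sizes)
    n_params = len(client_weights[0])
    return [
        sum(w[i] * n for w, n in zip(client_weights, client_sizes)) / total
        for i in range(n_params)
    ]


# Two hypothetical IoT devices with 100 and 300 local samples.
updates = [[0.0, 1.0], [1.0, 3.0]]
sizes = [100, 300]
global_weights = federated_average(updates, sizes)
```

Weighting by sample count means the device with more local data pulls the global parameters further toward its own update.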

Deep learning-based method for detection of copy-move forgery in videos

Neural Computing and Applications, 2025, Vol. 37, No. 14, pp. 8451-8464

Authors: Elbarougy, Reda; Abdelfatah, Osama; Behery, G.M.; El-Badry, Noha M. Affiliations: Department of Artificial Intelligence and Data Science, College of Computer Science and Engineering, University of Ha’il, Ha’il 81481, Saudi Arabia; Department of Information Technology, Faculty of Computer and Artificial Intelligence, Damietta University, New Damietta, Kafr Saad, Egypt; Department of Computer Science, Faculty of Computer and Artificial Intelligence, Damietta University, New Damietta, Kafr Saad, Egypt; Department of Mathematics, Faculty of Science, Damietta University, New Damietta, Kafr Saad, Egypt

Video forgery is one of the most serious problems affecting the credibility and reliability of video content. Therefore, detecting video forgery presents a major challenge for researchers due to the diversity of forgery types, the modernity of the programs used in forgery operations, and the abundance of information and content present in videos. The seriousness of this issue arises from the widespread use of videos in vital fields that require very high accuracy with no room for doubt or error, such as courtrooms, journalism, and others. Copy-move forgery is one of the most common and dangerous types of video forgery because of the difficulty in identifying it with the naked eye and the great diversity of forgery techniques involved in this particular type. One of the biggest challenges researchers have faced in the past is the complexity of the steps required to detect forgery in videos, leading to significant computational complexity and time consumption. The proposed method also aims to achieve better results than previous methods while reducing computational operations. Ultimately, forgery is detected with great efficiency. Compared to previous methods used for detecting copy-move forgery in the Rewind dataset, the proposed method achieves the highest F1, reaching 0.86, with a significant difference of 0.13 compared to the best result of the previous methods. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2025.

Keywords: Deep learning

A systematic mapping to investigate the application of machine learning techniques in requirement engineering activities

CAAI Transactions on Intelligence Technology, 2024, Vol. 9, No. 6, pp. 1412-1434

Authors: Shoaib Hassan, Qianmu Li, Khursheed Aurangzeb, Affan Yasin, Javed Ali Khan, Muhammad Shahid Anwar. Affiliations: School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing, Jiangsu, China; Department of Computer Engineering, College of Computer and Information Sciences, King Saud University, Riyadh, Saudi Arabia; School of Software, Northwestern Polytechnical University, Xian, Shaanxi, China; Department of Computer Science, School of Physics, Engineering & Computer Science, University of Hertfordshire, Hatfield, UK; Department of AI and Software, Gachon University, Seongnam-si, Seongnam, South Korea

Over the past few years, the application and usage of Machine Learning (ML) techniques have increased exponentially due to continuously increasing the size of data and computing *** the popularity of ML techniques, only a few research studies have focused on the application of ML especially supervised learning techniques in Requirement Engineering (RE) activities to solve the problems that occur in RE *** authors focus on the systematic mapping of past work to investigate those studies that focused on the application of supervised learning techniques in RE activities between the period of 2002–*** authors aim to investigate the research trends, main RE activities, ML algorithms, and data sources that were studied during this ***-five research studies were selected based on our exclusion and inclusion *** results show that the scientific community used 57 *** those algorithms, researchers mostly used the five following ML algorithms in RE activities: Decision Tree, Support Vector Machine, Naïve Bayes, K-nearest neighbour Classifier, and Random *** results show that researchers used these algorithms in eight major RE *** activities are requirements analysis, failure prediction, effort estimation, quality, traceability, business rules identification, content classification, and detection of problems in requirements written in natural *** selected research studies used 32 private and 41 public data *** most popular data sources that were detected in selected studies are the Metric Data Programme from NASA, Predictor Models in Software Engineering, and iTrust Electronic Health Care System.

Keywords: data sources; machine learning; requirement engineering; supervised learning algorithms

AInvR: Adaptive Learning Rewards for Knowledge Graph Reasoning Using Agent Trajectories

Tsinghua Science and Technology, 2023, Vol. 28, No. 6, pp. 1101-1114

Authors: Hao Zhang, Guoming Lu, Ke Qin, Kai Du. Affiliation: School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China

Multi-hop reasoning for incomplete Knowledge Graphs (KGs) demonstrates excellent interpretability with decent *** Learning (RL) based approaches formulate multi-hop reasoning as a typical sequential decision *** intractable shortcoming of multi-hop reasoning with RL is that sparse reward signals make performance *** mainstream methods apply heuristic reward functions to counter this ***, the inaccurate rewards caused by heuristic functions guide the agent to improper inference paths and unrelated object *** this end, we propose a novel adaptive Inverse Reinforcement Learning (IRL) framework for multi-hop reasoning, called AInvR. (1) To counter the missing and spurious paths, we replace the heuristic rule rewards with an adaptive rule reward learning mechanism based on agent’s inference trajectories; (2) to alleviate the impact of over-rewarded object entities misled by inaccurate reward shaping and rules, we propose an adaptive negative hit reward learning mechanism based on agent’s sampling strategy; (3) to further explore diverse paths and mitigate the influence of missing facts, we design a reward dropout mechanism to randomly mask and perturb reward parameters for the reward learning *** results on several benchmark knowledge graphs demonstrate that our method is more effective than existing multi-hop approaches.

Keywords: Knowledge Graph Reasoning (KGR); Inverse Reinforcement Learning (IRL); multi-hop reasoning

An enhanced AMBTC for color image compression using color palette

Multimedia Tools and Applications, 2024, Vol. 83, No. 11, pp. 31783-31803

Authors: Xiong, Lizhi; Zhang, Mengtao; Yang, Ching-Nung; Kim, Cheonshik. Affiliations: School of Computer Science, Nanjing University of Information Science and Technology, Nanjing, China; Department of Computer Science and Information Engineering, National Dong Hwa University, Hualien, Taiwan; Department of Computer Engineering, Sejong University, Seoul, Republic of Korea

Digital images have been used in various fields as an essential carrier. Many color images are constantly produced because of their more realistic description, which takes up much storage space and network bandwidth. Thus, color image compression has become an essential key technology. Absolute Moment Block Truncation Coding (AMBTC) has been widely studied as one of the classical image compression methods. However, in the existing methods, the visual quality of the reconstructed images and the compression rate are all relatively low. Therefore, this paper proposes an enhanced AMBTC for color image compression using a color palette. In the proposed method, the K-means clustering algorithm is utilized for training the image's palette pattern. The color palette obtained by K-means will be more suitable for reconstructing this image than the standard color palette, and the visual quality will be higher. The six clustered central pixels are matched with the palette through a color difference formula, and the obtained index values are used as the quantization levels. Huffman coding is used to build a bitmap to achieve a higher compression rate, that is, a lower bit rate. At last, a block of a color image can be represented by six index values and a bitmap. Experimental results and theoretical analysis demonstrate that the proposed method has better visual quality and bit rate than similar schemes. © The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature 2023.

Keywords: Color
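
For readers unfamiliar with the baseline that the abstract above builds on, the following sketch shows classical single-channel AMBTC on one block: the block is reduced to a low level, a high level, and a bitmap. It is not the authors' enhanced palette-based method; the function names and the sample block are assumptions chosen for illustration.

```python
def ambtc_encode(block):
    """Classical AMBTC on one grayscale block (a list of pixel rows).

    Returns (low_level, high_level, bitmap): pixels at or above the block
    mean are marked 1 and represented by the mean of that group; the rest
    are marked 0 and represented by the mean of the lower group.
    """
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    bitmap = [[1 if p >= mean else 0 for p in row] for row in block]
    high = [p for p in pixels if p >= mean]
    low = [p for p in pixels if p < mean]
    high_level = round(sum(high) / len(high))
    # A flat block has no pixel below the mean, so reuse the high level.
    low_level = round(sum(low) / len(low)) if low else high_level
    return low_level, high_level, bitmap


def ambtc_decode(low_level, high_level, bitmap):
    """Rebuild the block by substituting the two levels for the bitmap bits."""
    return [[high_level if bit else low_level for bit in row] for row in bitmap]


# Hypothetical 4x4 block with a dark left half and a bright right half.
block = [[10, 12, 200, 204] for _ in range(4)]
low_level, high_level, bitmap = ambtc_encode(block)
reconstructed = ambtc_decode(low_level, high_level, bitmap)
```

In this classical form a 4x4 block is stored as two quantization levels plus 16 bitmap bits, which is the representation the abstract's six index values and Huffman-coded bitmap are described as improving on.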

A Review on the Use of Smart Agriculture Systems to Protect Crops

3rd International Conference on Electronics and Renewable Systems, ICEARS 2025

Authors: Kumar, Gulshan; Gourshettiwar, Palash; Suman, Saurabh; Kumar, Shravan; Kumar, Dhiraj; Raushan, Rahul. Affiliations: Faculty of Engineering and Technology, Department of Computer Science and Design, Maharashtra, Wardha, India; Faculty of Engineering and Technology, Department of Computer Science and Medical Engineering, Maharashtra, Wardha, India

ISBN: (print) 9798331509675

Smart agriculture systems leverage the possibilities offered by cutting-edge technologies such as IoT, AI, and remote sensing to revolutionize conventional farming by enhancing resource utilization, production, and crop damage mitigation. Real-time monitoring of soil and crop health, predictive analytics, pest control, and precision irrigation measures are all enabled by these systems. They are able to address major Indian agriculture issues, consequently boosting yield and profitability and promoting environmental sustainability. The large-scale deployment of intelligent agriculture systems will change the agriculture landscape in India and will assure long-term food security for an ever-growing population. Challenges include adequate research and future studies in order to better install and achieve smart agricultural systems to protect crops. Intelligent agriculture involves all advanced research, including science and innovations, in national development through space technologies to enhance soil quality, conserve water, and facilitate agriculture information. Space ventures will undergo improved modernization through the introduction of crop sprayers, precision gene editors, epigenetics, big data analytics, IoT, wind and photovoltaic smart energy, AI-enabled robotic applications, and wide-scale desalination technologies. Implementing digital farming systems in developing economies will help their sectors as 85 percent of the global population is set to live in developing countries by 2030. Automation will prove to be necessary since food scarcity is on the rise along with resource wastage. Control strategies such as the IoT, aerial imagery, machine learning, and artificial intelligence will boost production and prevent soil degradation. These advanced technologies are also able to alleviate such issues as plant disease detection, pesticide management, and water application. The introduction of the Internet of Things in the agricultural research world has started

Keywords: Smart agriculture

Detection of Left Ventricular Cavity from Cardiac MRI Images Using Faster R-CNN

Computers, Materials & Continua, 2023, Vol. 74, No. 1, pp. 1819-1835

Authors: Zakarya Farea Shaaf, Muhammad Mahadi Abdul Jamil, Radzi Ambar, Ahmed Abdu Alattab, Anwar Ali Yahya, Yousef Asiri. Affiliations: Faculty of Electrical and Electronic Engineering, Universiti Tun Hussein Onn Malaysia, Parit Raja, Batu Pahat 86400, Johor, Malaysia; Department of Computer Science, College of Science and Arts, Sharurah, Najran University, Najran 61441, Saudi Arabia; Department of Computer Science, Faculty of Computer Science and Info. Systems, Thamar University, Dhamar 87246, Yemen; Department of Computer Science, College of Computer Science and Information Systems, Najran University, Najran 61441, Saudi Arabia

The automatic localization of the left ventricle (LV) in short-axis magnetic resonance (MR) images is a required step to process cardiac images using convolutional neural networks for the extraction of a region of interest (ROI). The precise extraction of the LV’s ROI from cardiac MRI images is crucial for detecting heart disorders via cardiac segmentation or ***, this task appears to be intricate due to the diversities in the size and shape of the LV and the scattering of surrounding tissues across different ***, this study proposed a region-based convolutional network (Faster R-CNN) for the LV localization from short-axis cardiac MRI images using a region proposal network (RPN) integrated with deep feature classification and *** was trained using images with corresponding bounding boxes (labels) around the LV, and various experiments were applied to select the appropriate layers and set the suitable *** experimental findings show that the proposed model was adequate, with accuracy, precision, recall, and F1 score values of 0.91, 0.94, 0.95, and 0.95, *** model also allows the cropping of the detected area of LV, which is vital in reducing the computational cost and time during segmentation and classification ***, it would be an ideal model and clinically applicable for diagnosing cardiac diseases.

Keywords: Cardiac short-axis MRI images; automatic left ventricle localization; deep learning models; faster R-CNN
