Search Results - Inner Mongolia University Library

Robust video question answering via contrastive cross-modality representation learning

Science China (Information Sciences), 2024, Vol. 67, No. 10, pp. 211-226

Authors: Xun YANG, Jianming ZENG, Dan GUO, Shanshan WANG, Jianfeng DONG, Meng WANG. Affiliations: School of Information Science and Technology, University of Science and Technology of China; Institute of Artificial Intelligence, Hefei Comprehensive National Science Center; School of Computer Science and Information Engineering, Hefei University of Technology; Institutes of Physical Science and Information Technology, Anhui University; School of Computer Science and Technology, Zhejiang Gongshang University

Video question answering (VideoQA) is a challenging yet important task that requires a joint understanding of low-level video content and high-level textual semantics. Despite the promising progress of existing efforts, recent studies revealed that current VideoQA models mostly tend to over-rely on the superficial correlations rooted in the dataset bias while overlooking the key video content, thus leading to unreliable results. Effectively understanding and modeling the temporal and semantic characteristics of a given video for robust VideoQA is crucial but, to our knowledge, has not been well investigated. To fill the research gap, we propose a robust VideoQA framework that can effectively model the cross-modality fusion and enforce the model to focus on the temporal and global content of videos when making a QA decision instead of exploiting the shortcuts in datasets. Specifically, we design a self-supervised contrastive learning objective to contrast the positive and negative pairs of multimodal input, where the fused representation of the original multimodal input is enforced to be closer to that of the intervened input based on video perturbation. We expect the fused representation to focus more on the global context of videos rather than some static keyframes. Moreover, we introduce an effective temporal order regularization to enforce the inherent sequential structure of videos for video representation. We also design a Kullback-Leibler divergence-based perturbation invariance regularization of the predicted answer distribution to improve the robustness of the model against temporal content perturbation of videos. Our method is model-agnostic and can be easily compatible with various VideoQA backbones. Extensive experimental results and analyses on several public datasets show the advantage of our method over the state-of-the-art methods in terms of both accuracy and robustness.
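The abstract names a Kullback-Leibler divergence-based perturbation invariance regularizer but does not give its form. A minimal sketch, assuming the term is KL(p || q) between the answer distribution predicted from the original video and the one predicted from a temporally perturbed copy; the function names and the example logits are hypothetical, not taken from the paper.

```python
import math

def softmax(logits):
    """Turn raw answer scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions over the same answers."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def perturbation_invariance_loss(logits_original, logits_perturbed):
    """Assumed form of the regularizer: penalize any change in the
    predicted answer distribution caused by perturbing the video."""
    p = softmax(logits_original)
    q = softmax(logits_perturbed)
    return kl_divergence(p, q)

# Hypothetical answer logits for the same question on two video variants.
same = perturbation_invariance_loss([2.0, 1.0, 0.0], [2.0, 1.0, 0.0])
shifted = perturbation_invariance_loss([2.0, 1.0, 0.0], [0.0, 1.0, 2.0])
```

The term vanishes when the two predictions agree and grows as the perturbed prediction drifts away, which is the behavior an invariance regularizer needs.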

Keywords: video question answering; cross-modality fusion; contrastive learning; cross-media reasoning

Optimizing wireless sensor network topology with node load consideration

虚拟现实与智能硬件(中英文) (Virtual Reality & Intelligent Hardware), 2025, Vol. 7, No. 1, pp. 47-61

Authors: Ruizhi CHEN. Affiliation: School of Computer Engineering, Zhanjiang University of Science and Technology, Zhanjiang 524094, China

Background: With the development of the Internet, the topology optimization of wireless sensor networks has received increasing ***, traditional optimization methods often overlook the energy imbalance caused by node loads, which affects network *** To improve the overall performance and efficiency of wireless sensor networks, a new method for optimizing the wireless sensor network topology based on K-means clustering and firefly algorithms is *** K-means clustering algorithm partitions nodes by minimizing the within-cluster variance, while the firefly algorithm is an optimization algorithm based on swarm intelligence that simulates the flashing interaction between fireflies to guide the search *** proposed method first introduces the K-means clustering algorithm to cluster nodes and then introduces a firefly algorithm to dynamically adjust the *** The results showed that the average clustering accuracies in the Wine and Iris data sets were 86.59% and 94.55%, respectively, demonstrating good clustering *** calculating the node mortality rate and network load balancing standard deviation, the proposed algorithm showed dead nodes at approximately 50 iterations, with an average load balancing standard deviation of 1.7×10^(4), proving its contribution to extending the network *** This demonstrates the superiority of the proposed algorithm in significantly improving the energy efficiency and load balancing of wireless sensor networks to extend the network *** research results indicate that wireless sensor networks have theoretical and practical significance in fields such as monitoring, healthcare, and agriculture.
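The K-means step described in the abstract (partitioning nodes by minimizing within-cluster variance) can be sketched as plain Lloyd iterations. This is only an illustration under stated assumptions: the node coordinates, the initial centroids, and the function names are hypothetical, and the firefly-based adjustment is omitted because the abstract does not describe it in enough detail.

```python
import math

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [
        min(range(len(centroids)), key=lambda i: math.dist(p, centroids[i]))
        for p in points
    ]

def kmeans(points, initial_centroids, iterations=20):
    """Lloyd's algorithm: alternate assignment and centroid update."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(iterations):
        labels = assign(points, centroids)
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                # The mean of the members minimizes within-cluster variance.
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(old)  # keep an empty cluster's centroid in place
        if updated == centroids:
            break  # converged
        centroids = updated
    return centroids, assign(points, centroids)

# Hypothetical 2-D sensor node positions forming two obvious groups.
nodes = [(0, 0), (0, 2), (10, 10), (10, 12)]
centroids, labels = kmeans(nodes, initial_centroids=[(0, 0), (10, 10)])
```

Fixed initial centroids keep the sketch deterministic; a practical deployment would more likely use random or k-means++ initialization.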

Keywords: Node load; Wireless sensor network; K-means clustering; Firefly algorithm; Topology optimization

Byzantine Robust Federated Learning Scheme Based on Backdoor Triggers

Computers, Materials & Continua, 2024, Vol. 79, No. 5, pp. 2813-2831

Authors: Zheng Yang, Ke Gu, Yiming Zuo. Affiliation: School of Computer and Communication Engineering, Changsha University of Science and Technology, Changsha 410114, China

Federated learning is widely used to solve the problem of data decentralization and can provide privacy protection for data owners. However, since multiple participants are required in federated learning, this allows attackers to compromise. Byzantine attacks pose great threats to federated learning. Byzantine attackers upload maliciously created local models to the server to affect the prediction performance and training speed of the global model. To defend against Byzantine attacks, we propose a Byzantine robust federated learning scheme based on backdoor triggers. In our scheme, backdoor triggers are embedded into benign data samples, and then malicious local models can be identified by the server according to its validation dataset. Furthermore, we calculate the adjustment factors of local models according to the parameters of their final layers, which are used to defend against data poisoning-based Byzantine attacks. To further enhance the robustness of our scheme, each local model is weighted and aggregated according to the number of times it is identified as malicious. Relevant experimental data show that our scheme is effective against Byzantine attacks in both independent identically distributed (IID) and non-independent identically distributed (non-IID) scenarios.

Keywords: Federated learning; Byzantine attacks; backdoor triggers

Efficient breast cancer detection using neural networks and explainable artificial intelligence

Neural Computing and Applications, 2025, Vol. 37, No. 5, pp. 3759-3776

Authors: Murugan, Tamilarasi Kathirvel; Karthikeyan, Pritikaa; Sekar, Pavithra. Affiliation: School of Computer Science Engineering, Vellore Institute of Technology, Tamilnadu, Chennai, India

The growing dependence on deep learning models for medical diagnosis underscores the critical need for robust interpretability and transparency to instill trust and ensure responsible usage. This study investigates the efficacy of various explainable artificial intelligence (XAI) techniques in comprehending deep learning models utilized for breast cancer classification from down sampled histopathology images. A comparative assessment of multiple convolutional neural network (CNN) architectures, encompassing standard CNNs, ResNet, VGG-16, and VGG-19, on down sampled images was conducted. The primary goal is to pinpoint the model exhibiting the highest accuracy and subsequently employ three prominent XAI methods—LIME, SHAP, and Saliency Maps—to get insights into the top-performing model. This study identifies VGG-19 as the best-performing model with an accuracy of 92.59% and demonstrates that among various XAI techniques, LIME provides the most accurate and clinically relevant explanations for breast cancer classification from down sampled histopathology images. These findings, validated by medical professionals, enhance the interpretability and reliability of deep learning models in clinical settings, promoting their responsible integration into healthcare practices. This validation was further corroborated through consultation with medical professionals, including doctors specializing in breast cancer diagnosis. This research endeavors to deepen the understanding of the model’s rationale and instill confidence in its outputs. The outcomes of this study hold significant promise in elevating the interpretability and reliability of deep learning models tailored for breast cancer diagnosis, thus facilitating their responsible integration into clinical settings. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.

Keywords: Convolutional neural networks

Research on Image Defogging Algorithm Based on Improved FFA-Net

IAENG International Journal of Computer Science, 2024, Vol. 51, No. 6, pp. 634-641

Authors: Qinrong, Li; Chi, Ma; Qiang, Guo; Hui, Hu. Affiliations: School of Computer Science and Software Engineering, University of Science and Technology LiaoNing, AnShan 114051, China; School of Computer Science and Engineering, Huizhou University, Huizhou 516007, China

Images captured under severe weather conditions, such as haze and fog, suffer from image quality degradation caused by atmospheric particle diffusion. This degradation manifests as color fading and reduced contrast, and it adversely affects the performance of various computer vision tasks. To address this, this paper presents an end-to-end feature fusion attention network (FFA-Net) designed to directly restore haze-free images. By incorporating the SSIM loss into the original loss function, the proposed method effectively captures the visual disparities between the estimated defogged image and the authentic haze-free image. Additionally, it mitigates the color distortion problem inherent in the original algorithm. To address the challenge of low brightness in input images, a low illumination enhancement module is introduced, seamlessly integrated with the FFA-Net defogging method. Subsequently, a comparative analysis of different defogging algorithms is conducted using two distinct foggy datasets. Multiple evaluation metrics are employed to assess the performance of these algorithms. The findings indicate that our algorithm significantly outperforms others in terms of objective indicators such as PSNR and SSIM, as well as visual effects. © (2024), (International Association of Engineers). All rights reserved.

Keywords: Image enhancement

A Fusion Model for Personalized Adaptive Multi-Product Recommendation System Using Transfer Learning and Bi-GRU

Computers, Materials & Continua, 2024, Vol. 81, No. 12, pp. 4081-4107

Authors: Buchi Reddy Ramakantha Reddy, Ramasamy Lokesh Kumar. Affiliation: School of Computer Science and Engineering, Vellore Institute of Technology, Vellore 632014, Tamilnadu, India

Traditional e-commerce recommendation systems often struggle with dynamic user preferences and a vast array of products, leading to suboptimal user *** address this, our study presents a Personalized Adaptive Multi-Product Recommendation System (PAMR) leveraging transfer learning and Bi-GRU (Bidirectional Gated Recurrent Units). Using a large dataset of user reviews from Amazon and Flipkart, we employ transfer learning with pre-trained models (AlexNet, GoogleNet, ResNet-50) to extract high-level attributes from product data, ensuring effective feature representation even with limited ***-GRU captures both spatial and sequential dependencies in user-item *** innovation of this study lies in the innovative feature fusion technique that combines the strengths of multiple transfer learning models, and the integration of an attention mechanism within the Bi-GRU framework to prioritize relevant *** approach addresses the classic recommendation systems that often face challenges such as cold start along with data sparsity difficulties, by utilizing robust user and item *** model demonstrated an accuracy of up to 96.9%, with precision and an F1-score of 96.2% and 96.97%, respectively, on the Amazon dataset, significantly outperforming the baselines and marking a considerable advancement over traditional *** study highlights the effectiveness of combining transfer learning with Bi-GRU for scalable and adaptive recommendation systems, providing a versatile solution for real-world applications.

Keywords: Personalized recommendation systems; transfer learning; bidirectional gated recurrent units (Bi-GRU); performance metrics; adaptive systems; product reviews

Weakly-supervised instance co-segmentation via tensor-based salient co-peak search

Frontiers of Computer Science, 2024, Vol. 18, No. 2, pp. 83-92

Authors: Wuxiu QUAN, Yu HU, Tingting DAN, Junyu LI, Yue ZHANG, Hongmin CAI. Affiliations: School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou 510665, China; School of Computer Science and Engineering, South China University of Technology, Guangzhou 510006, China

Instance co-segmentation aims to segment the co-occurrent instances among two *** task heavily relies on instance-related cues provided by co-peaks, which are generally estimated by exhaustively exploiting all paired candidates in point-to-point ***, such patterns could yield a high number of false-positive co-peaks, resulting in over-segmentation whenever there are mutual *** tackle with this issue, this paper proposes an instance co-segmentation method via tensor-based salient co-peak search (TSCPS-ICS). The proposed method explores high-order correlations via triple-to-triple matching among feature maps to find reliable co-peaks with the help of co-saliency *** proposed method is shown to capture more accurate intra-peaks and inter-peaks among feature maps, reducing the false-positive rate of co-peak *** having accurate co-peaks, one can efficiently infer responses of the targeted *** on four benchmark datasets validate the superior performance of the proposed method.

Keywords: weakly-supervised; co-segmentation; co-peak; tensor matching; deep network; instance segmentation

An Apricot Detection Algorithm in Complex Environments Based on Improved YOLOv7

IAENG International Journal of Computer Science, 2024, Vol. 51, No. 12, pp. 2135-2144

Authors: Guo, Qiang; Ma, Chi; Hu, Hui. Affiliations: School of Computer Science and Software Engineering, University of Science and Technology LiaoNing, AnShan 114051, China; School of Computer Science and Engineering, Huizhou University, Huizhou 516007, China

Apricot detection is a prerequisite for counting and harvesting tasks. Existing algorithms face challenges in adapting to the impacts of complex environmental factors such as lighting variations, shadows, dense foliage, and the uneven distribution of samples in mechanized apricot harvesting. This paper proposes an enhanced model, YOLOv7-DC, based on YOLOv7, to address these challenges. YOLOv7-DC preprocesses diverse apricot tree samples to accommodate real-world harvesting detection scenarios. To improve model inference speed and detection accuracy, the detection network is redesigned with a new feature fusion method. DCNv2 is embedded within the efficient layer aggregation network (ELAN), and PConv is introduced to replace conventional convolutions, reducing the parameter impact of DCNv2. The training process incorporates the CBAM attention mechanism to enhance spatial and channel information. The ConvMixer architecture captures spatial and channel relationships transmitted to the detection head through the attention mechanism, improving the model’s detection accuracy for each specific classification sample. Experimental results show that YOLOv7-DC maintains good detection speed and recognition rates across various classification tasks. The improved model achieves a 6.2% increase in average detection accuracy compared to previous algorithms, with a 13% reduction in model parameters. YOLOv7-DC is better suited for handling imbalanced samples and complex environmental scenarios. © (2024), (International Association of Engineers). All rights reserved.

Keywords: Apricot biloba detection; Attention mechanism; Feature fusion; YOLOv7

GDMNet: A Unified Multi-Task Network for Panoptic Driving Perception

Computers, Materials & Continua, 2024, Vol. 80, No. 8, pp. 2963-2978

Authors: Yunxiang Liu, Haili Ma, Jianlin Zhu, Qiangbo Zhang. Affiliation: School of Computer Science and Information Engineering, Shanghai Institute of Technology, Shanghai 201418, China

To enhance the efficiency and accuracy of environmental perception for autonomous vehicles, we propose GDMNet, a unified multi-task perception network for autonomous driving, capable of performing drivable area segmentation, lane detection, and traffic object ***, in the encoding stage, features are extracted, and Generalized Efficient Layer Aggregation Network (GELAN) is utilized to enhance feature extraction and gradient ***, in the decoding stage, specialized detection heads are designed; the drivable area segmentation head employs DySample to expand feature maps, the lane detection head merges early-stage features and processes the output through the Focal Modulation Network (FMN). Lastly, the Minimum Point Distance IoU (MPDIoU) loss function is employed to compute the matching degree between traffic object detection boxes and predicted boxes, facilitating model training *** results on the BDD100K dataset demonstrate that the proposed network achieves a drivable area segmentation mean intersection over union (mIoU) of 92.2%, lane detection accuracy and intersection over union (IoU) of 75.3% and 26.4%, respectively, and traffic object detection recall and mAP of 89.7% and 78.2%, *** detection performance surpasses that of other single-task or multi-task algorithm models.
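The box-matching quantity underlying the reported IoU figures can be illustrated with plain intersection over union for axis-aligned boxes. This is a minimal sketch, not the paper's loss: MPDIoU adds corner-distance terms on top of IoU that are not reproduced here, and the `(x1, y1, x2, y2)` box layout and the function name are assumptions.

```python
def box_iou(a, b):
    """IoU of two axis-aligned boxes given as (x1, y1, x2, y2),
    with x1 < x2 and y1 < y2 (assumed layout)."""
    # Corners of the overlapping rectangle, if any.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Hypothetical ground-truth and predicted boxes.
overlap = box_iou((0, 0, 2, 2), (1, 1, 3, 3))   # intersection 1, union 7
disjoint = box_iou((0, 0, 1, 1), (2, 2, 3, 3))  # no overlap
```

Plain IoU is zero for every pair of non-overlapping boxes regardless of how far apart they are, which is the usual motivation for distance-aware variants.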

Keywords: Autonomous driving; multitask learning; drivable area segmentation; lane detection; vehicle detection

A secure double spectrum auction scheme

Digital Communications and Networks, 2024, Vol. 10, No. 5, pp. 1415-1427

Authors: Jiaqi Wang, Ning Lu, Ziyang Gong, Wenbo Shi, Chang Choi. Affiliations: School of Computer Science and Engineering, Northeastern University, Shenyang 110004, China; School of Computer Science and Technology, Xidian University, Xi'an, China; Dept. of Computer Engineering, Gachon University, 1342 Seongnam-daero, Sujeong-gu, Seongnam-si 13120G; School of Computer and Communication Engineering, Northeastern University, Qinhuangdao 066004, China

With the arrival of the 5G era, wireless communication technologies and services are rapidly exhausting the limited spectrum *** auctions came into being, which can effectively utilize spectrum *** of the complexity of the electronic spectrum auction network environment, the security of spectrum auction cannot be *** scholars focus on researching the security of the single-sided auctions, while ignoring the practical scenario of a secure double spectrum auction where participants are composed of multiple sellers and *** begin to design the secure double spectrum auction mechanisms, in which two semi-honest agents are introduced to finish the spectrum auction *** these two agents may collude with each other or be bribed by buyers and sellers, which may create security risks, therefore, a secure double spectrum auction is proposed in this *** traditional secure double spectrum auctions, the spectrum auction server with Software Guard Extensions (SGX) component is used in this paper, which is an Ethereum blockchain platform that performs spectrum auctions. A secure double spectrum protocol is also designed, using SGX technology and cryptographic tools such as Paillier cryptosystem, stealth address technology and one-time ring signatures to well protect the private information of spectrum *** addition, the smart contracts provided by the Ethereum blockchain platform are executed to assist offline verification, and to verify important spectrum auction information to ensure the fairness and impartiality of spectrum ***, security analysis and performance evaluation of our protocol are discussed.
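The abstract lists the Paillier cryptosystem among its tools without saying how the protocol uses it. A toy sketch of the one property that makes Paillier attractive for sealed bids, additive homomorphism, is shown here. Everything in it is illustrative: the primes are far too small for real use, the bid values and function names are hypothetical, and nothing here reflects the paper's actual protocol.

```python
import math
import secrets

def keygen(p, q):
    """Toy Paillier key generation from two primes, using g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p-1, q-1)
    mu = pow(lam, -1, n)  # modular inverse (Python 3.8+)
    return n, lam, mu

def encrypt(n, m):
    """c = (1 + n)^m * r^n mod n^2 for a random r coprime to n."""
    n2 = n * n
    while True:
        r = secrets.randbelow(n - 1) + 1
        if math.gcd(r, n) == 1:
            break
    return (1 + m * n) % n2 * pow(r, n, n2) % n2

def decrypt(n, lam, mu, c):
    """m = L(c^lam mod n^2) * mu mod n, where L(x) = (x - 1) // n."""
    x = pow(c, lam, n * n)
    return (x - 1) // n * mu % n

# Insecure demonstration primes; real keys use primes of 1024+ bits.
n, lam, mu = keygen(101, 113)
bid_a, bid_b = encrypt(n, 17), encrypt(n, 25)

# Multiplying ciphertexts adds the underlying plaintexts (mod n).
total = decrypt(n, lam, mu, bid_a * bid_b % (n * n))
```

A party holding only ciphertexts can therefore aggregate bids without learning any individual value; only the key holder can decrypt the sum.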

Keywords: Secure double spectrum auction; SGX technology; Privacy information; Ethereum platform; Verification
