Video question answering (VideoQA) is a challenging yet important task that requires a joint understanding of low-level video content and high-level textual semantics. Despite the promising progress of existing efforts, recent studies revealed that current VideoQA models tend to over-rely on superficial correlations rooted in dataset bias while overlooking the key video content, which leads to unreliable results. Effectively understanding and modeling the temporal and semantic characteristics of a given video is crucial for robust VideoQA but, to our knowledge, has not been well investigated. To fill this research gap, we propose a robust VideoQA framework that effectively models cross-modality fusion and forces the model to focus on the temporal and global content of videos when making a QA decision instead of exploiting shortcuts in the datasets. Specifically, we design a self-supervised contrastive learning objective to contrast the positive and negative pairs of multimodal input, where the fused representation of the original multimodal input is forced to be closer to that of the intervened input based on video perturbation. We expect the fused representation to focus more on the global context of videos than on a few static keyframes. Moreover, we introduce an effective temporal order regularization to enforce the inherent sequential structure of videos in the video representation. We also design a Kullback-Leibler divergence-based perturbation invariance regularization of the predicted answer distribution to improve the robustness of the model against temporal content perturbation of videos. Our method is model-agnostic and readily compatible with various VideoQA backbones. Extensive experimental results and analyses on several public datasets show the advantage of our method over state-of-the-art methods in terms of both accuracy and robustness.
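The Kullback-Leibler perturbation invariance term mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulation, so the direction of the divergence, the function names, and the example logits below are all assumptions made for illustration only.

```python
import math

def softmax(logits):
    """Turn raw answer scores into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * (math.log(pi + eps) - math.log(qi + eps))
               for pi, qi in zip(p, q))

def perturbation_invariance_penalty(logits_original, logits_perturbed):
    """Assumed form: KL between the answer distribution predicted for the
    original video and the one predicted for a perturbed video."""
    return kl_divergence(softmax(logits_original), softmax(logits_perturbed))

# Hypothetical answer logits for one question (three candidate answers).
logits_clean = [2.0, 0.5, -1.0]
logits_perturbed = [1.5, 0.7, -0.8]

penalty_same = perturbation_invariance_penalty(logits_clean, logits_clean)
penalty_diff = perturbation_invariance_penalty(logits_clean, logits_perturbed)
```

The penalty is zero when both predictions agree and grows as the perturbed prediction drifts away, which is the behaviour a regularizer of this kind would penalize during training.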
Background: With the development of the Internet, the topology optimization of wireless sensor networks has received increasing ***, traditional optimization methods often overlook the energy imbalance caused by node loads, which affects network *** To improve the overall performance and efficiency of wireless sensor networks, a new method for optimizing the wireless sensor network topology based on K-means clustering and firefly algorithms is *** K-means clustering algorithm partitions nodes by minimizing the within-cluster variance, while the firefly algorithm is an optimization algorithm based on swarm intelligence that simulates the flashing interaction between fireflies to guide the search *** proposed method first introduces the K-means clustering algorithm to cluster nodes and then introduces a firefly algorithm to dynamically adjust the *** The results showed that the average clustering accuracies in the Wine and Iris data sets were 86.59% and 94.55%, respectively, demonstrating good clustering *** calculating the node mortality rate and network load balancing standard deviation, the proposed algorithm showed dead nodes at approximately 50 iterations, with an average load balancing standard deviation of 1.7×10^(4), proving its contribution to extending the network *** This demonstrates the superiority of the proposed algorithm in significantly improving the energy efficiency and load balancing of wireless sensor networks to extend the network *** research results indicate that wireless sensor networks have theoretical and practical significance in fields such as monitoring, healthcare, and agriculture.
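The K-means step described in the abstract (partitioning nodes by minimizing within-cluster variance) can be sketched with plain Lloyd iterations. This is only the clustering half; the firefly adjustment is not reproduced here because the abstract does not specify it. The node coordinates and the choice of initial centroids are hypothetical.

```python
import math

def kmeans(points, centroids, iterations=20):
    """Lloyd's algorithm on 2-D points with caller-supplied initial centroids."""
    centroids = [tuple(c) for c in centroids]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each node joins its nearest centroid.
        labels = [
            min(range(len(centroids)), key=lambda k: math.dist(p, centroids[k]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        new_centroids = []
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centroids.append(
                    tuple(sum(axis) / len(members) for axis in zip(*members))
                )
            else:
                new_centroids.append(centroids[k])  # keep an empty cluster in place
        if new_centroids == centroids:
            break  # converged
        centroids = new_centroids
    return labels, centroids

# Hypothetical sensor node positions forming two well-separated groups.
nodes = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0),
         (10.0, 10.0), (11.0, 10.0), (10.0, 11.0)]
labels, centroids = kmeans(nodes, centroids=[nodes[0], nodes[3]])
```

In a topology setting, each resulting centroid would be a natural candidate location for a cluster head, which a second-stage optimizer could then adjust.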
Federated learning is widely used to solve the problem of data decentralization and can provide privacy protection for data owners. However, since federated learning requires multiple participants, it gives attackers an opportunity to compromise them. Byzantine attacks pose great threats to federated learning. Byzantine attackers upload maliciously created local models to the server to affect the prediction performance and training speed of the global model. To defend against Byzantine attacks, we propose a Byzantine-robust federated learning scheme based on backdoor triggers. In our scheme, backdoor triggers are embedded into benign data samples, and malicious local models can then be identified by the server according to its validation dataset. Furthermore, we calculate the adjustment factors of local models according to the parameters of their final layers, which are used to defend against data poisoning-based Byzantine attacks. To further enhance the robustness of our scheme, each local model is weighted and aggregated according to the number of times it is identified as malicious. Relevant experimental data show that our scheme is effective against Byzantine attacks in both independent identically distributed (IID) and non-independent identically distributed (non-IID) scenarios.
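The final aggregation idea, where a local model counts for less the more often it has been flagged as malicious, can be sketched as a weighted average. The abstract does not state the weighting rule, so the `1 / (1 + count)` weight, the function name, and the toy parameter vectors are assumptions chosen only to make the idea concrete.

```python
def aggregate(local_models, flag_counts):
    """Weighted average of local parameter vectors.

    Assumed rule: a model flagged `c` times gets weight 1 / (1 + c),
    so repeatedly flagged participants contribute less.
    """
    weights = [1.0 / (1 + c) for c in flag_counts]
    total = sum(weights)
    dim = len(local_models[0])
    return [
        sum(w * model[i] for w, model in zip(weights, local_models)) / total
        for i in range(dim)
    ]

# Two benign updates and one outlier that has been flagged three times.
models = [[1.0, 1.0], [1.0, 1.0], [9.0, 9.0]]
counts = [0, 0, 3]
global_model = aggregate(models, counts)
plain_average = [sum(m[i] for m in models) / len(models) for i in range(2)]
```

Compared with the plain average, the down-weighted aggregate stays closer to the two benign updates.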
The growing dependence on deep learning models for medical diagnosis underscores the critical need for robust interpretability and transparency to instill trust and ensure responsible usage. This study investigates th...
Images captured under severe weather conditions, such as haze and fog, suffer from image quality degradation caused by atmospheric particle diffusion. This degradation manifests as color fading, reduced contrast, and ...
Traditional e-commerce recommendation systems often struggle with dynamic user preferences and a vast array of products, leading to suboptimal user *** address this, our study presents a Personalized Adaptive Multi-Product Recommendation System (PAMR) leveraging transfer learning and Bi-GRU (Bidirectional Gated Recurrent Units). Using a large dataset of user reviews from Amazon and Flipkart, we employ transfer learning with pre-trained models (AlexNet, GoogleNet, ResNet-50) to extract high-level attributes from product data, ensuring effective feature representation even with limited *** Bi-GRU captures both spatial and sequential dependencies in user-item *** innovation of this study lies in the feature fusion technique that combines the strengths of multiple transfer learning models, and the integration of an attention mechanism within the Bi-GRU framework to prioritize relevant *** approach addresses challenges that classic recommendation systems often face, such as cold start and data sparsity, by utilizing robust user and item *** model demonstrated an accuracy of up to 96.9%, with precision and an F1-score of 96.2% and 96.97%, respectively, on the Amazon dataset, significantly outperforming the baselines and marking a considerable advancement over traditional *** study highlights the effectiveness of combining transfer learning with Bi-GRU for scalable and adaptive recommendation systems, providing a versatile solution for real-world applications.
Instance co-segmentation aims to segment the co-occurrent instances among two *** task heavily relies on instance-related cues provided by co-peaks, which are generally estimated by exhaustively exploiting all paired candidates in point-to-point ***, such patterns could yield a high number of false-positive co-peaks, resulting in over-segmentation whenever there are mutual *** tackle this issue, this paper proposes an instance co-segmentation method via tensor-based salient co-peak search (TSCPS-ICS). The proposed method explores high-order correlations via triple-to-triple matching among feature maps to find reliable co-peaks with the help of co-saliency *** proposed method is shown to capture more accurate intra-peaks and inter-peaks among feature maps, reducing the false-positive rate of co-peak *** having accurate co-peaks, one can efficiently infer responses of the targeted *** on four benchmark datasets validate the superior performance of the proposed method.
Apricot detection is a prerequisite for counting and harvesting tasks. Existing algorithms face challenges in adapting to the impacts of complex environmental factors such as lighting variations, shadows, dense foliag...
To enhance the efficiency and accuracy of environmental perception for autonomous vehicles, we propose GDMNet, a unified multi-task perception network for autonomous driving, capable of performing drivable area segmentation, lane detection, and traffic object ***, in the encoding stage, features are extracted, and Generalized Efficient Layer Aggregation Network (GELAN) is utilized to enhance feature extraction and gradient ***, in the decoding stage, specialized detection heads are designed; the drivable area segmentation head employs DySample to expand feature maps, the lane detection head merges early-stage features and processes the output through the Focal Modulation Network (FMN). Lastly, the Minimum Point Distance IoU (MPDIoU) loss function is employed to compute the matching degree between traffic object detection boxes and predicted boxes, facilitating model training *** results on the BDD100K dataset demonstrate that the proposed network achieves a drivable area segmentation mean intersection over union (mIoU) of 92.2%, lane detection accuracy and intersection over union (IoU) of 75.3% and 26.4%, respectively, and traffic object detection recall and mAP of 89.7% and 78.2%, *** detection performance surpasses that of other single-task or multi-task algorithm models.
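As a rough illustration of the MPDIoU loss named above, the sketch below assumes the commonly cited formulation: IoU minus the squared distances between the two boxes' top-left corners and bottom-right corners, each normalized by the squared image diagonal. The abstract does not spell out the formula, so this form, the `(x1, y1, x2, y2)` box layout, and the example boxes are assumptions rather than the network's actual implementation.

```python
def mpdiou(box_a, box_b, img_w, img_h):
    """MPDIoU for two axis-aligned boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Plain IoU.
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    iou = inter / union if union > 0 else 0.0

    # Squared corner distances, normalized by the squared image diagonal.
    norm = img_w ** 2 + img_h ** 2
    d_top_left = (ax1 - bx1) ** 2 + (ay1 - by1) ** 2
    d_bottom_right = (ax2 - bx2) ** 2 + (ay2 - by2) ** 2
    return iou - d_top_left / norm - d_bottom_right / norm

def mpdiou_loss(box_a, box_b, img_w, img_h):
    """Loss is zero for a perfect match and grows as the boxes diverge."""
    return 1.0 - mpdiou(box_a, box_b, img_w, img_h)

# Hypothetical predicted and reference boxes in a 100x100 image.
predicted = (0.0, 0.0, 10.0, 10.0)
reference = (5.0, 5.0, 15.0, 15.0)
score = mpdiou(predicted, reference, img_w=100, img_h=100)
perfect_loss = mpdiou_loss(predicted, predicted, img_w=100, img_h=100)
```

Unlike plain IoU, the corner-distance terms still provide a signal when two boxes do not overlap at all, which is the usual motivation given for distance-aware IoU variants.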
With the arrival of the 5G era, wireless communication technologies and services are rapidly exhausting the limited spectrum *** auctions came into being, which can effectively utilize spectrum *** of the complexity of the electronic spectrum auction network environment, the security of spectrum auction cannot be *** scholars focus on researching the security of the single-sided auctions, while ignoring the practical scenario of a secure double spectrum auction where participants are composed of multiple sellers and *** begin to design the secure double spectrum auction mechanisms, in which two semi-honest agents are introduced to finish the spectrum auction *** these two agents may collude with each other or be bribed by buyers and sellers, which may create security risks; therefore, a secure double spectrum auction is proposed in this *** traditional secure double spectrum auctions, the spectrum auction server with Software Guard Extensions (SGX) component is used in this paper, which is an Ethereum blockchain platform that performs spectrum auctions. A secure double spectrum protocol is also designed, using SGX technology and cryptographic tools such as Paillier cryptosystem, stealth address technology and one-time ring signatures to well protect the private information of spectrum *** addition, the smart contracts provided by the Ethereum blockchain platform are executed to assist offline verification, and to verify important spectrum auction information to ensure the fairness and impartiality of spectrum ***, security analysis and performance evaluation of our protocol are discussed.
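The Paillier cryptosystem listed among the protocol's tools is additively homomorphic: multiplying two ciphertexts yields an encryption of the sum of the plaintexts. The toy sketch below shows only that property. How the protocol actually applies Paillier to bids is not stated in the abstract, and the tiny primes, fixed random values, and bid amounts are illustrative assumptions that would be insecure in practice.

```python
import math

def keygen(p, q):
    """Toy Paillier key generation with the common choice g = n + 1."""
    n = p * q
    lam = (p - 1) * (q - 1) // math.gcd(p - 1, q - 1)  # lcm(p - 1, q - 1)
    g = n + 1
    # mu is the modular inverse of L(g^lam mod n^2), where L(x) = (x - 1) // n.
    mu = pow((pow(g, lam, n * n) - 1) // n, -1, n)
    return (n, g), (lam, mu)

def encrypt(pub, m, r):
    """Encrypt m < n with randomness r that is coprime to n."""
    n, g = pub
    return (pow(g, m, n * n) * pow(r, n, n * n)) % (n * n)

def decrypt(pub, priv, c):
    n, _ = pub
    lam, mu = priv
    return ((pow(c, lam, n * n) - 1) // n * mu) % n

# Toy primes only; real deployments use primes of 1024 bits or more
# and draw r from a cryptographically secure random source.
pub, priv = keygen(61, 53)
c1 = encrypt(pub, 120, r=17)   # hypothetical bid 1
c2 = encrypt(pub, 75, r=23)    # hypothetical bid 2

# Homomorphic addition: multiply the ciphertexts modulo n^2.
c_sum = (c1 * c2) % (pub[0] ** 2)
total = decrypt(pub, priv, c_sum)
```

A party holding only the public key can therefore combine encrypted values without learning either of them; only the private-key holder can read the result.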