Video question answering (VideoQA) is a challenging yet important task that requires a joint understanding of low-level video content and high-level textual semantics. Despite the promising progress of existing efforts, recent studies have revealed that current VideoQA models tend to over-rely on superficial correlations rooted in dataset bias while overlooking the key video content, which leads to unreliable results. Effectively understanding and modeling the temporal and semantic characteristics of a given video is crucial for robust VideoQA but, to our knowledge, has not been well investigated. To fill this research gap, we propose a robust VideoQA framework that effectively models cross-modality fusion and forces the model to focus on the temporal and global content of videos when making a QA decision instead of exploiting shortcuts in the datasets. Specifically, we design a self-supervised contrastive learning objective that contrasts positive and negative pairs of multimodal input, where the fused representation of the original multimodal input is pulled closer to that of the intervened input obtained by video perturbation. We expect the fused representation to focus more on the global context of videos rather than on a few static keyframes. Moreover, we introduce a temporal order regularization to enforce the inherent sequential structure of videos in the video representation. We also design a Kullback-Leibler divergence-based perturbation invariance regularization on the predicted answer distribution to improve the robustness of the model against temporal content perturbation of videos. Our method is model-agnostic and is readily compatible with various VideoQA backbones. Extensive experimental results and analyses on several public datasets show the advantage of our method over state-of-the-art methods in terms of both accuracy and robustness.
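The two loss terms named in this abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the InfoNCE form of the contrastive loss, the cosine similarity, the temperature value, the direction of the KL term, and every function name below are assumptions chosen for illustration, and the toy vectors stand in for fused video-question representations and answer logits.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of answer logits.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) for two discrete distributions of equal length.
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def perturbation_invariance_penalty(logits_original, logits_perturbed):
    # Assumed form: penalize the shift in the predicted answer distribution
    # when the video input is perturbed.
    return kl_divergence(softmax(logits_original), softmax(logits_perturbed))

def cosine(u, v):
    # Cosine similarity; assumes both vectors are non-zero.
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    # InfoNCE-style loss (an assumed choice): low when the anchor is closer
    # to the positive than to any negative.
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Toy usage: identical predictions give zero penalty, diverging ones do not.
same = perturbation_invariance_penalty([2.0, 0.5, -1.0], [2.0, 0.5, -1.0])
shifted = perturbation_invariance_penalty([2.0, 0.5, -1.0], [-1.0, 0.5, 2.0])
aligned = info_nce([1.0, 0.0], [1.0, 0.0], [[0.0, 1.0]])
misaligned = info_nce([1.0, 0.0], [0.0, 1.0], [[1.0, 0.0]])
```

In a full training loop these terms would be added to the usual answer-classification loss with weighting coefficients; the abstract does not state those weights, so none are assumed here.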
Artificial Intelligence (AI) has become part of our daily life. Owing to increased computational power, AI significantly impacts human life and is changing social systems as well. On one side AI making our ...
The domain of natural language processing has reached a level where various tools are available for automatically identifying and correcting grammatical errors in text across a wide range of languages. Though these tools ...
Breast cancer poses a threat to women’s health and contributes to an increase in mortality rates. Mammography has proven to be an effective tool for the early detection of breast cancer. However, it faces many challenges in early detection due to poor image quality and the limitations of traditional segmentation and feature extraction. This work addresses these issues and proposes an attention-based backpropagation convolutional neural network (ABB-CNN) to detect breast cancer from mammogram images more accurately. The proposed work includes image enhancement, reinforcement learning-based semantic segmentation (RLSS), and multiview feature extraction and classification. Image enhancement is performed by removing noise and artefacts through a hybrid filter (HF), scaling the image through pixel-based bilinear interpolation (PBI), and enhancing contrast through an election-based optimization (EO) algorithm. In addition, the RLSS uses a deep Q network (DQN) to segment the region of interest (ROI) strategically. Moreover, the proposed ABB-CNN performs multiview feature extraction on the segmented region to classify the mammograms into normal, malignant, and benign classes. The proposed framework is evaluated on the collected dataset and the digital database for screening mammography (DDSM) dataset. It provides better outcomes in terms of accuracy, sensitivity, specificity, precision, F-measure, false-negative rate (FNR), and area under the curve (AUC). On the (collected, DDSM) datasets, this work achieved accuracy of (99.20%, 99.35%), sensitivity of (99.56%, 99.66%), specificity of (98.96%, 98.99%), precision of (99.05%, 99.12%), FNR of (0.44%, 0.34%), F-measure of (99.31%, 99.39%), and AUC of (99.27%, 99.32%). This research addresses the prevalent challenges in breast cancer identification and offers a robust and highly accurate solution by integrating advanced deep-learning techniques. The evaluated re
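For readers unfamiliar with the metrics listed in this abstract, the sketch below shows how they follow from confusion-matrix counts in the binary case. The function name and the example counts are hypothetical, and the abstract does not say how the three-class results were averaged, so this is only the standard two-class definition. It also makes one relation explicit: FNR is the complement of sensitivity, which matches the reported pairs (100% - 99.56% = 0.44% and 100% - 99.66% = 0.34%).

```python
def binary_metrics(tp, fp, tn, fn):
    # Standard two-class definitions; the counts are assumed to be non-zero
    # wherever they appear in a denominator.
    sensitivity = tp / (tp + fn)          # true-positive rate (recall)
    specificity = tn / (tn + fp)          # true-negative rate
    precision = tp / (tp + fp)
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    fnr = fn / (fn + tp)                  # equals 1 - sensitivity
    return {
        "accuracy": accuracy,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
        "fnr": fnr,
    }

# Hypothetical counts, not taken from the paper.
metrics = binary_metrics(tp=90, fp=5, tn=95, fn=10)
```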
The threat posed by phishing websites is significant in the digital era due to the widespread use of online activities, which has prompted the need for a comprehensive remedy. The project involves building a website t...
The rapid growth of the IoT and wearable device industries has expanded remote patient monitoring, but healthcare systems face security vulnerabilities inherent in their client/server architecture. To address these ch...
The variability of the output power of distributed renewable energy sources (DRESs), which originates from fast-changing climatic conditions, can negatively affect the grid ***, grid operators have incorporated ramp-rate limitations (RRLs) for the injected DRES power in the grid *** the DRES penetration levels increase, the mitigation of high-power ramps is no longer considered a system support function but rather an ancillary service (AS). Energy storage systems (ESSs) coordinated by ramp-rate (RR) control algorithms are often applied to mitigate these power ***, no unified definition of active power ramps, which is essential to treat the RRL as AS, currently *** paper assesses the various definitions of the RR and proposes an RRL control method for a central battery ESS (BESS) in distribution systems (DSs). The ultimate objective is to restrain high-power ramps at the distribution transformer level so that RRL can be traded as AS to the upstream transmission system (TS). The proposed control is based on the direct control of the ΔP/Δt, which means that the control parameters are directly correlated with the RR requirements included in the grid *** addition, a novel method for restoring the state of charge (SoC) within a specific range following a high ramp-up/down event is ***, a parametric method for estimating the sizing of central BESSs (BESS sizing for short) is *** BESS sizing is determined by considering the RR requirements, the DRES units, and the load mix of the examined *** BESS sizing is directly related to the constant RR achieved using the proposed ***, the proposed methodologies are validated through simulations in MATLAB/Simulink and laboratory tests in a commercially available BESS.
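The idea of limiting ΔP/Δt directly can be sketched as a simple clamp on the change in grid power between samples, with the battery supplying or absorbing the difference. This is a generic illustration and not the control proposed in the paper: the function name, the sign convention (positive BESS power means discharging), and the ideal battery with no power or SoC limits are all assumptions.

```python
def limit_ramp_rate(p_raw, rr_limit, dt):
    """Clamp the per-sample change of grid power to +/- rr_limit * dt.

    p_raw    : unsmoothed net power samples (for example in kW)
    rr_limit : allowed ramp rate (power units per second)
    dt       : sampling interval in seconds
    Returns (p_grid, p_bess), where p_bess = p_grid - p_raw is the power
    an ideal battery would have to inject (positive) or absorb (negative).
    """
    max_step = rr_limit * dt
    p_grid = [float(p_raw[0])]
    p_bess = [0.0]
    for p in p_raw[1:]:
        # Desired change, clipped so that |dP/dt| never exceeds rr_limit.
        step = max(-max_step, min(max_step, p - p_grid[-1]))
        p_grid.append(p_grid[-1] + step)
        p_bess.append(p_grid[-1] - p)
    return p_grid, p_bess

# Toy usage: a sudden 60 kW drop is spread over three samples.
p_grid, p_bess = limit_ramp_rate([100.0, 100.0, 40.0, 40.0, 40.0],
                                 rr_limit=10.0, dt=2.0)
```

Integrating the battery power over time in such a sketch gives the energy the battery must exchange during a ramp event, which is the kind of quantity a sizing estimate would depend on; the paper's own sizing and SoC restoration methods are not reproduced here.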
Inefficient task scheduling schemes compromise network performance and increase latency for delay-intolerant tasks. Cybertwin-based 6G services support data logging of operational queries for appropriate resource allo...
The rapid increase in vehicle traffic volume in modern societies has raised the need to develop innovative solutions to reduce traffic congestion and enhance traffic management *** advanced technology, such as Intelligent Transportation Systems (ITS), enables improved traffic management, helps eliminate congestion, and supports a safer *** provides real-time information on vehicle traffic and transportation systems that can improve decision-making for road ***, ITS suffers from routing issues at the network layer when utilising Vehicular Ad Hoc Networks (VANETs). This is because each vehicle plays the role of a router in this network, which leads to a complex vehicle communication network and causes issues such as repeated link breakages between vehicles resulting from the mobility of the network and rapid topological *** may lead to loss or delay in packet transmissions; this weakness can be exploited in routing attacks, such as black-hole and gray-hole attacks, that threaten the availability of ITS *** this paper, a blockchain-based smart contracts model is proposed to offer convenient and comprehensive security mechanisms, enhancing the trustworthiness between ***-Classification Blockchain-Based Contracts (SCBC) and Voting-Classification Blockchain-Based Contracts (VCBC) are utilised in the proposed *** results show that VCBC attains better PDR and TP performance even in the presence of black-hole and gray-hole attacks.
Portable sinks play a vital role in data collection within wireless sensor networks (WSNs), offering dynamic and flexible solutions to address the limitations of static sink-based architectures. Unlike their ...