Fog computing is an emerging paradigm that provides services near the end-user. The tremendous increase in IoT devices and big data leads to complexity in fog resource allocation. Inefficient resource allocation can l...
Neural network-based encoders and decoders are an emerging technique for image compression. To improve the compression rate, these models use a special module called the quantizer that improves the entropy of t...
Video question answering (VideoQA) is a challenging yet important task that requires a joint understanding of low-level video content and high-level textual semantics. Despite the promising progress of existing efforts, recent studies have revealed that current VideoQA models tend to over-rely on superficial correlations rooted in dataset bias while overlooking the key video content, which leads to unreliable results. Effectively understanding and modeling the temporal and semantic characteristics of a given video for robust VideoQA is crucial but, to our knowledge, has not been well investigated. To fill this research gap, we propose a robust VideoQA framework that effectively models cross-modality fusion and forces the model to focus on the temporal and global content of videos when making a QA decision, instead of exploiting shortcuts in the datasets. Specifically, we design a self-supervised contrastive learning objective that contrasts positive and negative pairs of multimodal input, where the fused representation of the original multimodal input is pulled closer to that of the intervened input obtained by video perturbation. We expect the fused representation to focus more on the global context of videos than on a few static keyframes. Moreover, we introduce a temporal order regularization that enforces the inherent sequential structure of videos in the video representation. We also design a Kullback-Leibler divergence-based perturbation invariance regularization on the predicted answer distribution to improve the robustness of the model against temporal content perturbation of videos. Our method is model-agnostic and is readily compatible with various VideoQA backbones. Extensive experimental results and analyses on several public datasets show the advantage of our method over state-of-the-art methods in terms of both accuracy and robustness.
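The two loss terms named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not give the exact loss forms, so the InfoNCE-style contrastive term, the direction of the KL divergence, the temperature value, and all function names below are assumptions chosen for illustration.

```python
import math

def softmax(logits):
    """Turn raw answer scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) between two answer distributions (direction is an assumption)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))

def cosine(u, v):
    """Cosine similarity between two fused representations."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v + 1e-12)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """InfoNCE-style contrastive loss (assumed form, not taken from the paper).

    anchor:    fused representation of the original (video, question) input
    positive:  fused representation of the perturbed-video input
    negatives: fused representations of unrelated inputs
    """
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

# Hypothetical toy values: answer logits for the original and perturbed video.
p_original = softmax([2.0, 0.5, -1.0])
p_perturbed = softmax([1.8, 0.7, -0.9])
invariance_penalty = kl_divergence(p_original, p_perturbed)

# Hypothetical toy fused representations.
contrastive_loss = info_nce([1.0, 0.0], [0.9, 0.1], [[0.0, 1.0], [-1.0, 0.2]])
```

In a training loop, terms like these would typically be added to the usual answer-classification loss with weighting coefficients; the abstract does not state how the terms are weighted.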
The Quantum Internet of Things (QIoT) in the healthcare industry holds the promise of transforming patient care, diagnostics, and medical research. Quantum-enhanced sensors, communication, and computation offer unprecedented capabilities that can revolutionize how healthcare services are delivered and experienced. This paper explores the potential of QIoT in the context of smart healthcare, where interconnected quantum-enabled devices and systems create an ecosystem that enhances data security, enables real-time monitoring, and advances medical knowledge. We delve into the applications of quantum sensors in precise health monitoring, the role of quantum communication in secure telemedicine, and the computational power of quantum computing in drug discovery and personalized medicine. We discuss challenges such as technical feasibility, scalability, and regulatory considerations, along with the emerging trends and opportunities in this transformative field. By examining the intersection of quantum technologies and smart healthcare, this paper aims to shed light on the novel approaches and breakthroughs that could redefine the future of healthcare delivery and patient outcomes. IEEE
In recent years, IoT has transformed personal environments by integrating diverse smart devices. This paper presents an advanced IoT architecture that optimizes network infrastructure, focusing on the adoption of MQTT...
This study proposes a contactless and real-time hand gesture recognition system suitable for smartwatches. The proposed system adopts inductive proximity sensing to collect Mechanomyography (MMG) signals induced by fi...
If adversaries were to obtain quantum computers in the future, their massive computing power would likely break existing security schemes. Since security is a continuous process, more substantial security schemes must...
Crude oil prices (COP) profoundly influence global economic stability, with fluctuations reverberating across various sectors. Accurate forecasting of COP is indispensable for governments, policymakers, and stakeholde...
Chatbots use artificial intelligence (AI) and natural language processing (NLP) algorithms to construct a clever system. By copying human connections in the most helpful way possible, chatbots emulate individuals and...
Text mining, a subfield of natural language processing (NLP), has received considerable attention in recent years due to its ability to extract valuable insights from large volumes of unstructured textual data. This r...