The Telecare Medicine Information System (TMIS) revolutionizes healthcare delivery by integrating medical equipment and sensors, facilitating proactive and cost-effective services. Accessible online, TMIS empowers pat...
Video question answering (VideoQA) is a challenging yet important task that requires a joint understanding of low-level video content and high-level textual semantics. Despite the promising progress of existing efforts, recent studies have revealed that current VideoQA models tend to over-rely on superficial correlations rooted in dataset bias while overlooking the key video content, which leads to unreliable results. Effectively understanding and modeling the temporal and semantic characteristics of a given video for robust VideoQA is crucial but, to our knowledge, has not been well investigated. To fill this research gap, we propose a robust VideoQA framework that effectively models cross-modality fusion and forces the model to focus on the temporal and global content of videos when making a QA decision, instead of exploiting shortcuts in the datasets. Specifically, we design a self-supervised contrastive learning objective that contrasts positive and negative pairs of multimodal input, where the fused representation of the original multimodal input is pulled closer to that of the intervened input obtained by video perturbation. We expect the fused representation to focus more on the global context of videos than on a few static keyframes. Moreover, we introduce a temporal order regularization that enforces the inherent sequential structure of videos in the video representation. We also design a Kullback-Leibler divergence-based perturbation invariance regularization of the predicted answer distribution to improve the robustness of the model against temporal content perturbation of videos. Our method is model-agnostic and readily compatible with various VideoQA backbones. Extensive experimental results and analyses on several public datasets show the advantage of our method over state-of-the-art methods in terms of both accuracy and robustness.
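The perturbation invariance regularization mentioned in this abstract can be illustrated with a minimal sketch. It assumes the penalty is the KL divergence from the answer distribution predicted on the original video to the one predicted on a perturbed video; the direction of the divergence, the function names, and the use of raw answer logits are assumptions made for illustration, not details stated in the abstract.

```python
import math


def softmax(logits):
    """Turn raw answer scores into a probability distribution."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p, q, eps=1e-12):
    """KL(p || q) for two discrete distributions of equal length."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


def perturbation_invariance_penalty(logits_original, logits_perturbed):
    """Hypothetical regularizer: grows as the predicted answer distribution
    changes between the original and the perturbed video."""
    p = softmax(logits_original)
    q = softmax(logits_perturbed)
    return kl_divergence(p, q)


if __name__ == "__main__":
    same = perturbation_invariance_penalty([2.0, 0.5, 0.1], [2.0, 0.5, 0.1])
    shifted = perturbation_invariance_penalty([2.0, 0.5, 0.1], [0.1, 0.5, 2.0])
    print(same, shifted)
```

In a training loop, a term like this would typically be added to the main answer-classification loss with a weighting coefficient, so that predictions stay stable when the temporal content of the video is perturbed.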
Background: With the development of the Internet, the topology optimization of wireless sensor networks has received increasing attention. However, traditional optimization methods often overlook the energy imbalance caused by node loads, which affects network *** To improve the overall performance and efficiency of wireless sensor networks, a new method for optimizing the wireless sensor network topology based on K-means clustering and firefly algorithms is proposed. The K-means clustering algorithm partitions nodes by minimizing the within-cluster variance, while the firefly algorithm is an optimization algorithm based on swarm intelligence that simulates the flashing interaction between fireflies to guide the search *** proposed method first introduces the K-means clustering algorithm to cluster nodes and then introduces a firefly algorithm to dynamically adjust the *** The results showed that the average clustering accuracies on the Wine and Iris data sets were 86.59% and 94.55%, respectively, demonstrating good clustering *** calculating the node mortality rate and network load balancing standard deviation, the proposed algorithm showed dead nodes at approximately 50 iterations, with an average load balancing standard deviation of 1.7×10^4, proving its contribution to extending the network *** This demonstrates the superiority of the proposed algorithm in significantly improving the energy efficiency and load balancing of wireless sensor networks to extend the network *** research results indicate that wireless sensor networks have theoretical and practical significance in fields such as monitoring, healthcare, and agriculture.
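The K-means step described in this abstract (partitioning nodes by minimizing within-cluster variance) can be sketched as below. This is plain Lloyd's iteration on hypothetical 2-D node coordinates, not the paper's implementation: the deterministic initialization from the first k points, the function names, and the use of the standard deviation of cluster sizes as a stand-in for the load balancing measure are all assumptions, and the firefly adjustment stage is not shown.

```python
import math
import statistics


def kmeans(points, k, iterations=20):
    """Lloyd's algorithm: alternate nearest-centroid assignment and
    centroid recomputation. Initial centroids are the first k points
    (an assumption chosen to keep the example deterministic)."""
    centroids = [points[i] for i in range(k)]
    assignment = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: each node joins its nearest centroid.
        assignment = [
            min(range(k), key=lambda c: math.dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignment) if a == c]
            if members:
                centroids[c] = tuple(
                    sum(coord) / len(members) for coord in zip(*members)
                )
    return centroids, assignment


def cluster_size_std(assignment, k):
    """Hypothetical load-balance proxy: population standard deviation
    of the number of nodes per cluster."""
    return statistics.pstdev([assignment.count(c) for c in range(k)])


if __name__ == "__main__":
    nodes = [(0.0, 0.0), (0.0, 1.0), (10.0, 10.0), (10.0, 11.0)]
    centers, labels = kmeans(nodes, k=2)
    print(centers, labels, cluster_size_std(labels, 2))
```

A lower value of the size-based proxy means nodes are spread more evenly across cluster heads, which is the kind of imbalance the abstract says traditional methods overlook.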
Federated learning is widely used to solve the problem of data decentralization and can provide privacy protection for data owners. However, because federated learning requires multiple participants, it gives attackers an opportunity to compromise some of them. Byzantine attacks pose great threats to federated learning. Byzantine attackers upload maliciously created local models to the server to affect the prediction performance and training speed of the global model. To defend against Byzantine attacks, we propose a Byzantine-robust federated learning scheme based on backdoor triggers. In our scheme, backdoor triggers are embedded into benign data samples, and malicious local models can then be identified by the server according to its validation dataset. Furthermore, we calculate the adjustment factors of local models according to the parameters of their final layers, which are used to defend against data poisoning-based Byzantine attacks. To further enhance the robustness of our scheme, each local model is weighted and aggregated according to the number of times it is identified as malicious. Relevant experimental data show that our scheme is effective against Byzantine attacks in both independent identically distributed (IID) and non-independent identically distributed (non-IID) scenarios.
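The final aggregation step in this abstract, where each local model is weighted by how often it has been identified as malicious, can be sketched as follows. The abstract does not give the weighting formula, so the inverse rule 1 / (1 + count) used here is purely an assumption, as are the function name and the representation of a model as a flat list of parameters.

```python
def aggregate(local_models, malicious_counts):
    """Weighted average of local model parameters.

    local_models: list of equal-length parameter lists, one per client.
    malicious_counts: how many times each client was flagged so far.
    The weight 1 / (1 + count) is a hypothetical choice: never-flagged
    clients get weight 1, frequently flagged clients fade out.
    """
    weights = [1.0 / (1 + count) for count in malicious_counts]
    total = sum(weights)
    dim = len(local_models[0])
    return [
        sum(w * model[i] for w, model in zip(weights, local_models)) / total
        for i in range(dim)
    ]


if __name__ == "__main__":
    models = [[1.0, 1.0], [1.0, 1.0], [100.0, 100.0]]
    flags = [0, 0, 9]  # the third client was flagged nine times
    print(aggregate(models, flags))
```

With equal flag counts the rule reduces to a plain average, so under this assumed weighting the server only deviates from standard averaging when some clients have been flagged more often than others.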
The growing dependence on deep learning models for medical diagnosis underscores the critical need for robust interpretability and transparency to instill trust and ensure responsible usage. This study investigates th...
Images captured under severe weather conditions, such as haze and fog, suffer from image quality degradation caused by atmospheric particle diffusion. This degradation manifests as color fading, reduced contrast, and ...
Traditional e-commerce recommendation systems often struggle with dynamic user preferences and a vast array of products, leading to suboptimal user *** address this, our study presents a Personalized Adaptive Multi-Product Recommendation System (PAMR) leveraging transfer learning and Bi-GRU (Bidirectional Gated Recurrent Units). Using a large dataset of user reviews from Amazon and Flipkart, we employ transfer learning with pre-trained models (AlexNet, GoogleNet, ResNet-50) to extract high-level attributes from product data, ensuring effective feature representation even with limited data. Bi-GRU captures both spatial and sequential dependencies in user-item interactions. The innovation of this study lies in a feature fusion technique that combines the strengths of multiple transfer learning models, and in the integration of an attention mechanism within the Bi-GRU framework to prioritize relevant *** approach addresses challenges that classic recommendation systems often face, such as cold start and data sparsity, by utilizing robust user and item *** model demonstrated an accuracy of up to 96.9%, with precision and an F1-score of 96.2% and 96.97%, respectively, on the Amazon dataset, significantly outperforming the baselines and marking a considerable advancement over traditional *** study highlights the effectiveness of combining transfer learning with Bi-GRU for scalable and adaptive recommendation systems, providing a versatile solution for real-world applications.
Image retrieval systems based on content can be developed for mobile applications that provide users with a seamless and efficient way to search for images based on their content. The development of CBIR systems for m...
Instance co-segmentation aims to segment the co-occurrent instances among two images. This task heavily relies on instance-related cues provided by co-peaks, which are generally estimated by exhaustively exploiting all paired candidates in point-to-point patterns. However, such patterns could yield a high number of false-positive co-peaks, resulting in over-segmentation whenever there are mutual *** tackle this issue, this paper proposes an instance co-segmentation method via tensor-based salient co-peak search (TSCPS-ICS). The proposed method explores high-order correlations via triple-to-triple matching among feature maps to find reliable co-peaks with the help of co-saliency *** proposed method is shown to capture more accurate intra-peaks and inter-peaks among feature maps, reducing the false-positive rate of co-peak *** having accurate co-peaks, one can efficiently infer responses of the targeted instances. Experiments on four benchmark datasets validate the superior performance of the proposed method.
Apricot detection is a prerequisite for counting and harvesting tasks. Existing algorithms face challenges in adapting to the impacts of complex environmental factors such as lighting variations, shadows, dense foliag...