In the past few decades, gait as a biometric attribute has garnered a lot of attention for its application in a broad variety of security and privacy-related scenarios, including identity recognition and authenticatio...
ISBN: 9781665499507 (print)
The proceedings contain 76 papers. The topics discussed include: ship detection with optical image based on attention and loss improved YOLO; ECG characteristic detection using DenseNet based on attention mechanism and feature pyramid; a method to detect the onsets and ends of paroxysmal atrial fibrillation episodes based on sliding window and coding; dynamic feature extraction using I-vector for video fire detection; bimodal information fusion network for salient object detection based on transformer; person re-identification method based on multi-view and attention mechanism; research on road unevenness recognition method based on off-road vehicle driving characteristic; and vehicle re-identification approach combining multiple attention mechanisms and style transfer.
Despite the fact that many character datasets for several languages are publicly available, there are only a very few standardized datasets for Tamil characters. This article presents a subset of the Mepco Tamil Chara...
Liquid crystal display (LCD) screens are widely used in various types of smart meters, with ultrasonic water meters being one of their applications. The display on LCD screens is composed of various digits and icons, ...
In criminal and victim identification, when information such as fingerprints and facial images cannot be obtained, developing other biometric recognition methods becomes an important task. It is found from inve...
ISBN: 9781665490627 (digital)
ISBN: 9781665490627 (print)
Source-free Domain Adaptation (SFDA) aims to adapt a model trained on a given (source) environment to a new (target) environment without directly accessing the source data. Due to the lack of labeled source data, it is often difficult for SFDA methods to provide reliable class representations for the target data. To overcome this issue, we propose Confidence-based Subsets Feature Alignment (CSFA). CSFA divides the target data into two subsets: a confident subset, consisting of samples for which the source model gives low-entropy class predictions, and a non-confident subset containing the remaining samples. By using the pseudo-labels from the confident subset, we can frame the original SFDA problem as a Universal Domain Adaptation (UniDA) problem and provide reliable class representations for the target data by aligning the feature distributions of the two subsets. Specifically, we propose a multi-task framework that applies a standard SFDA algorithm together with a UniDA-inspired algorithm, which further infuses class representations into the adaptation process. We evaluate the proposed method on a wide range of cross-domain object recognition tasks and achieve higher or comparable accuracy compared to existing SFDA methods. Ablation studies are conducted to verify the effectiveness of the proposed method.
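The confidence-based split described in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the entropy threshold, the function names, and the toy probabilities are all assumptions made for illustration, and the feature-alignment and multi-task training steps are omitted entirely. It only shows how target samples might be divided by the entropy of the source model's class predictions, with arg-max pseudo-labels taken from the confident subset.

```python
import math


def prediction_entropy(probs):
    """Shannon entropy (in nats) of one class-probability vector."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def split_by_confidence(prob_rows, threshold):
    """Return (confident, non_confident) lists of sample indices.

    A sample counts as confident when its prediction entropy is below
    `threshold`. The threshold is a hypothetical hyperparameter; the
    abstract does not say how the split is actually decided.
    """
    confident, non_confident = [], []
    for i, probs in enumerate(prob_rows):
        if prediction_entropy(probs) < threshold:
            confident.append(i)
        else:
            non_confident.append(i)
    return confident, non_confident


def pseudo_labels(prob_rows, indices):
    """Map each selected sample index to its arg-max class."""
    return {
        i: max(range(len(prob_rows[i])), key=prob_rows[i].__getitem__)
        for i in indices
    }


# Toy softmax outputs of a source model on four target samples (made up).
source_model_probs = [
    [0.97, 0.02, 0.01],  # peaked prediction, low entropy
    [0.40, 0.35, 0.25],  # spread out, high entropy
    [0.05, 0.90, 0.05],  # peaked prediction, low entropy
    [0.34, 0.33, 0.33],  # nearly uniform, high entropy
]

confident, non_confident = split_by_confidence(source_model_probs, threshold=0.5)
labels = pseudo_labels(source_model_probs, confident)
```

With these toy values, samples 0 and 2 fall into the confident subset and receive pseudo-labels 0 and 1, while samples 1 and 3 remain in the non-confident subset.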
ISBN: 9783031850660 (print)
The proceedings contain 29 papers. The special focus in this conference is on Language Processing and Knowledge Management. The topics include: Machine Learning Serving the School Orientation Process; Improving Arabic Fake News Detection Across Context-Aware Attention Deep Model Based on Natural Language Processing; Towards a Maude-Based Approach for Formal Modeling Deep Neural Networks; Anti-pattern Based IoRT-Aware Business Process Structure Verification Approach; Opinion Analysis Based on a Sentiment Lexical Ontology and Deep Learning Models: Tunisian Dialect Case; Disfluent-to-Fluent Tunisian Dialect Speech Translation with Fine-Tuning Pre-trained Language Models; BERT-Based Model for Sarcasm Detection in Arabic Texts; Typology of Event Data Imperfections; Towards Sentiment Analysis for Libyan Dialect; Text Categorization Can Enhance Domain-Agnostic Stopword Extraction; Normalized Orthography for Tunisian Arabic; Deep Learning Approach for Early Prediction of Depression on Social Network; Adapting Large Language Models to Biomedical Domain: A Survey of Techniques and Approaches; Tunisian Arabic Understanding: Resources Analysis and Evaluation; From Data to Decisions: An Ontology-Driven Method for Opinion Mining; LLMs for Cyberbullying Detection in Political Social Media; Assessing BERT Models for Arabic Named Entity Recognition in a Multi-dialectal Context; Tunisian Normalized Pronunciation; traffScOnto: Ontology for Traffic Management in the Context of Smart City Domain; SERTUS Dataset Collection from Spontaneous Environments; An Agricultural Sentiment Dataset for Pest Control and Crop Diseases; Deep Learning Approach to Identify and Classify Arabic Verbal Multi-word Expressions; A Design Pattern-Based Approach for Analyzing MapReduce Applications; Analyzing the Impact of Big Data in Mental Health; The Impact of AI on Knowledge Management; A Rule-Based System for Translating Libyan Dialect Dual Forms to Modern Standard Arabic.
ISBN: 9783031820069 (print)
The proceedings contain 24 papers. The special focus in this conference is on Applications of Medical Artificial Intelligence. The topics include: SP-NAS: Surgical Phase Recognition-Based Navigation Adjustment System for Distal Gastrectomy; Transforming Multimodal Models into Action Models for Radiotherapy; Enhanced Interpretability in Histopathological Images via Combined Tissue and Cell-Level Graph Analysis; Targeted Visual Prompting for Medical Visual Question Answering; Deep Learning for Resolving 3D Microstructural Changes in the Fibrotic Liver; Predicting Falls Through Muscle Weakness from a Single Whole Body Image: A Multimodal Contrastive Learning Framework; Optimizing ICU Readmission Prediction: A Comparative Evaluation of AI Tools; Source Matters: Source Dataset Impact on Model Robustness in Medical Imaging; Evaluating Perceived Workload, Usability and Usefulness of Artificial Intelligence Systems in Low-Resource Settings: Semi-automated Classification and Detection of Community Acquired Pneumonia; Incremental Augmentation Strategies for Personalised Continual Learning in Digital Pathology Contexts; Assessing Generalization Capabilities of Malaria Diagnostic Models from Thin Blood Smears; Automated Feedback System for Surgical Skill Improvement in Endoscopic Sinus Surgery; Quantifying Knee Cartilage Shape and Lesion: From Image to Metrics; RadImageGAN – A Multi-modal Dataset-Scale Generative AI for Medical Imaging; Ensemble-KAN: Leveraging Kolmogorov Arnold Networks to Discriminate Individuals with Psychiatric Disorders from Controls; SCIsegV2: A Universal Tool for Segmentation of Intramedullary Lesions in Spinal Cord Injury; EHRmonize: A Framework for Medical Concept Abstraction from Electronic Health Records using Large Language Models; Evaluating the Impact of Pulse Oximetry Bias in Machine Learning Under Counterfactual Thinking; Normative Modeling with Focal Loss and Adversarial Autoencoders for Alzheimer’s Disease Diagnosis and Biomarker Identification; One-Shot Medical
ISBN: 9781665490627 (digital)
ISBN: 9781665490627 (print)
Video prediction is a complicated task because countless equally plausible future frames exist. While recent works have made progress in the prediction and generation of future video frames, they have not attempted to disentangle different features of videos, such as an object's structure and its dynamics. Such a disentanglement would allow one to control these aspects to some extent in the prediction phase while maintaining the object's intrinsic properties, which are learned as the model's internal representation. In this work, we propose Ladder Variational Recurrent Neural Networks (LVRNN). We employ a type of ladder autoencoder shown to be effective for feature disentanglement on images and apply it to the Variational Recurrent Neural Network (VRNN) architecture, which has been used for video prediction. We rely on keypoints extracted from each frame to separate the structure from the visual features. We then show how different levels of the ladder network learn to disentangle features and demonstrate that each of these levels can be used to control different aspects of future frames, such as structure and dynamics. We evaluate our method on the Human3.6M and BAIR robot datasets. We show that our method is able to perform hierarchical disentanglement while still providing reasonable results compared to similar methods.
ISBN: 9781665490627 (digital)
ISBN: 9781665490627 (print)
Multi-spectral pedestrian detection has gained extensive attention over the past decade. To alleviate the problem of modality imbalance in multi-spectral tasks, a novel cross-guided feature fusion network based on the auto-encoder framework is proposed, using RGB-thermal image pairs as inputs. To obtain complementary features, a cross-guided loss is designed so that the output images are balanced between both modalities in an unsupervised manner. An intra-modality reweighting module is implemented to filter redundant features before the fusion. Finally, YOLOv3 is chosen as the detector and is fed the fused features. The proposed method is verified using the public KAIST and VOT-RGBT datasets. Experimental results demonstrate that the proposed method can outperform state-of-the-art methods: the miss rate of pedestrian detection reaches 48.57% and 4.52% on the KAIST and VOT-RGBT datasets, respectively.