This poster presents an approach to systematically distinguish between interaction techniques (ITs) in the context of magic ITs in immersive virtual environments. Currently, heterogeneous terms are used in research to describe the concept of enhancing the abilities of users beyond the limits of the real world, such as magic, super-natural, hyper-natural, superhuman abilities, superpowers, augmentation or empowerment. As a first step towards clarifying and systematically defining the terminology, we propose using the orthogonal concepts of interalizability, congruence, and enhancement (or ICE-cube) as a simple yet expressive conceptual framework.
Person image synthesis with controllable body poses and appearances is an essential task owing to the practical needs in the context of virtual try-on, image editing and video production. However, existing methods fac...
ISBN (digital): 9798331511272
ISBN (print): 9798331511289
As smart homes become more prevalent in daily life, the ability to understand dynamic environments becomes essential, and it increasingly depends on AI systems. This study focuses on developing an intelligent algorithm that can navigate a robot through a kitchen, recognize objects, and track their relocation. The kitchen was chosen as the testing ground because of its dynamic nature: objects are frequently moved, rearranged and replaced. Various techniques, such as SLAM feature-based tracking and deep learning-based object detection (e.g., Faster R-CNN), are commonly used for object tracking. Methods such as optical flow analysis and 3D reconstruction have also been used to track the relocation of objects. These approaches often struggle with lighting variations and partial occlusions, where parts of an object are hidden in some frames but visible in others. The proposed method uses the YOLOv5 architecture, initialized with pre-trained weights and subsequently fine-tuned on a custom dataset. It introduces a novel frame-scoring algorithm that calculates a score for each object based on its location and features across all frames. This scoring identifies changes by determining the best-associated frame for each object and comparing the results for each scene, overcoming limitations seen in other methods while keeping the design simple. The experimental results show an accuracy of 97.72%, a precision of 95.83% and a recall of 96.84% for this algorithm, which highlights the efficacy of the model in detecting spatial changes.
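The abstract does not give the scoring formula, so the following is only a minimal sketch of the general idea: pick the best-scoring frame per object, then compare scenes. Everything in it is an assumption made for illustration: the `Detection` fields, the score (detector confidence times a hypothetical visible fraction), the shared map coordinates, and the `moved_threshold` value. None of it is taken from the paper.

```python
from dataclasses import dataclass
from math import dist


@dataclass
class Detection:
    label: str               # object class, e.g. "mug"
    frame: int               # index of the frame the detection came from
    confidence: float        # detector confidence in [0, 1]
    visible_fraction: float  # hypothetical feature: share of the object not occluded
    position: tuple          # hypothetical (x, y) location in a shared map frame


def frame_score(det):
    """Illustrative score: confident, unoccluded detections rank higher."""
    return det.confidence * det.visible_fraction


def best_frames(detections):
    """For each label, keep the detection from its best-scoring frame."""
    best = {}
    for det in detections:
        current = best.get(det.label)
        if current is None or frame_score(det) > frame_score(current):
            best[det.label] = det
    return best


def detect_changes(before, after, moved_threshold=0.3):
    """Compare the best detection of each object between two scenes."""
    a, b = best_frames(before), best_frames(after)
    changes = {}
    for label in sorted(set(a) | set(b)):
        if label not in b:
            changes[label] = "removed"
        elif label not in a:
            changes[label] = "added"
        elif dist(a[label].position, b[label].position) > moved_threshold:
            changes[label] = "moved"
        else:
            changes[label] = "unchanged"
    return changes
```

Choosing one best frame per object is what lets a partially occluded view be ignored when a clearer view of the same object exists elsewhere in the sequence.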
This paper introduces a novel approach for enabling real-time imitation of human head motion by a Nao robot, with a primary focus on elevating human-robot interactions. By using the robust capabilities of the MediaPip...
Guiding a user’s hand along a 3D path can help individuals avoid obstacles and manipulate everyday items with eyes-free. While prior work focused on haptic approaches using robots, auditory approaches for 3D path gui...
Neural radiance fields (NeRF), and in particular their extension by instant neural graphics primitives, are a novel rendering method for view synthesis that uses real-world images to build photo-realistic immersive virtual scenes. Despite their enormous potential for virtual reality (VR) applications, there is currently little robust integration of NeRF into typical VR systems available for research and benchmarking in the VR community. In this poster paper, we present an extension to instant neural graphics primitives and bring stereoscopic, high-resolution, low-latency, 6-DoF NeRF rendering to the Unity game engine for immersive VR applications. Link to the repository: https://***/uhhhci/immersive-ngp
While handwritten zip code recognition and ledger sheet recognition are in practical use, character recognition technology for free handwritten documents is still in the process of commercialization. If character recognition for free handwritten documents is realized, scanned images of notes and memos written on paper can be converted into text data on a computer, which will be useful for keyword searches and natural language processing. One reason character recognition for free handwritten documents is still in the research stage is that it is difficult to detect lines and extract characters from a document. To improve the accuracy of character recognition for free handwritten documents, it is first necessary to improve the accuracy of line detection and character segmentation. In this study, we propose a method that segments characters from words or strings using a CNN and dynamic programming so that the sum of character similarities is optimal, with the aim of improving the accuracy of character segmentation.
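The dynamic-programming step can be sketched as follows, assuming the word image is divided into pixel columns and that some function returns a character similarity for any candidate span. The function name `segment`, the `similarity(start, end)` callback (which in practice would wrap a CNN) and the `max_width` limit are hypothetical and chosen for illustration; the abstract does not specify them.

```python
def segment(n_columns, similarity, max_width):
    """Split columns [0, n_columns) into consecutive spans that maximise
    the sum of similarity(start, end) over all spans.

    similarity: callable returning how character-like the span is
                (assumed here; in practice a CNN score).
    max_width:  widest span, in columns, considered as one character.
    """
    neg_inf = float("-inf")
    best = [neg_inf] * (n_columns + 1)  # best[i]: best total for columns [0, i)
    back = [0] * (n_columns + 1)        # back[i]: start of the last span ending at i
    best[0] = 0.0

    for end in range(1, n_columns + 1):
        for start in range(max(0, end - max_width), end):
            if best[start] == neg_inf:
                continue
            candidate = best[start] + similarity(start, end)
            if candidate > best[end]:
                best[end] = candidate
                back[end] = start

    # Walk the back-pointers to recover the chosen spans.
    spans = []
    end = n_columns
    while end > 0:
        start = back[end]
        spans.append((start, end))
        end = start
    spans.reverse()
    return best[n_columns], spans
```

The table has one entry per column boundary, and each entry examines at most `max_width` candidate spans, so the search costs `n_columns * max_width` similarity evaluations instead of enumerating every possible segmentation.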
ISBN (digital): 9781665410205
ISBN (print): 9781665410212
Post-training quantization (PTQ) is a technique used to reduce the memory footprint and computational requirements of machine learning models. It has been used primarily for neural networks. For brain-computer interfaces (BCIs) to be fully portable and usable in various situations, they need approaches that are lightweight in storage and computation. In this paper, we evaluate post-training quantization on state-of-the-art approaches in brain-computer interfaces and assess its impact on accuracy. We evaluate performance on the single-trial detection of event-related potentials, which represents one major BCI paradigm. The area under the receiver operating characteristic curve drops from 0.861 to 0.825 when PTQ is applied to both the spatial filters and the classifier, while the size of the model is reduced by a factor of about 15. The results support the conclusion that PTQ can substantially reduce the memory footprint of the models while keeping roughly the same level of accuracy.
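As a rough illustration of what post-training quantization does to a set of already-trained weights, the sketch below applies uniform affine quantization to a list of floats. The scheme, the 8-bit default and the function names are assumptions for illustration; the abstract does not state which quantization scheme or bit width was used, and this sketch does not reproduce the reported size reduction.

```python
def quantize(weights, n_bits=8):
    """Map floats to integer codes in [0, 2**n_bits - 1] (uniform affine scheme)."""
    lo, hi = min(weights), max(weights)
    levels = 2 ** n_bits - 1
    scale = (hi - lo) / levels if hi > lo else 1.0  # avoid dividing by zero
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo


def dequantize(codes, scale, lo):
    """Recover approximate float weights from the integer codes."""
    return [c * scale + lo for c in codes]
```

Only the integer codes plus the two floats `scale` and `lo` need to be stored, and each reconstructed weight differs from the original by at most about half a quantization step.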
ISBN (digital): 9798350359312
ISBN (print): 9798350359329
Brain-computer interface (BCI) systems facilitate unique communication between humans and computers, benefiting severely disabled individuals. Despite decades of research, BCIs are not fully integrated into clinical and commercial settings. It is crucial to assess and explain BCI performance and to offer potential users clear explanations, so that they are not frustrated when a system does not work as expected. This work investigates the efficacy of different deep learning and Riemannian geometry-based classification models in the context of motor imagery (MI) based BCI using electroencephalography (EEG). We then propose an approach based on optimal transport theory that uses the earth mover's distance (EMD) to quantify how well the feature relevance map agrees with domain knowledge from neuroscience. For this, we use explainable AI (XAI) techniques to generate feature relevance in the spatial domain and identify the channels that matter for model outcomes. Three state-of-the-art models are implemented: 1) a Riemannian geometry-based classifier, 2) EEGNet, and 3) EEG Conformer. The observed trend in accuracy across these architectures on the dataset correlates with the proposed feature relevance metrics. The models with diverse architectures perform significantly better when trained on channels relevant to motor imagery than when trained on channels chosen by data-driven selection. This work draws attention to the need for interpretability and for metrics beyond accuracy, and it underscores the value of combining domain knowledge and quantified model interpretations with data-driven approaches in creating reliable and robust BCIs.
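To make the EMD comparison concrete, here is a deliberately simplified sketch in which the channels are assumed to lie along a single axis with unit spacing. That one-dimensional layout, the function name `emd_1d` and the example vectors are assumptions for illustration only. Real electrode montages are two- or three-dimensional and would need a general optimal transport solver with an explicit ground distance, which the abstract does not describe.

```python
def emd_1d(relevance, prior):
    """Earth mover's distance between two non-negative channel weightings,
    assuming channels sit on a line with unit spacing (a simplification)."""
    if len(relevance) != len(prior):
        raise ValueError("both vectors need one value per channel")
    total_r, total_p = sum(relevance), sum(prior)
    if total_r <= 0 or total_p <= 0:
        raise ValueError("each vector needs positive total mass")

    distance = 0.0
    carried = 0.0  # mass that still has to be moved past the current channel
    for r, p in zip(relevance, prior):
        carried += r / total_r - p / total_p
        distance += abs(carried)
    return distance
```

A value of 0 means the model's relevance map matches the prior exactly, and larger values mean relevance has to be moved further across channels to match it. For example, `emd_1d([1, 0, 0], [0, 0, 1])` moves all the mass across two channel gaps.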