Extracting modular segments from raw video demonstrations of high-level actions is important for understanding the underlying building blocks of different tasks in human-robot interaction. While (data-hungry) supervised learning approaches for Action Segmentation show good performance when the underlying segments are predefined, their performance degrades when unseen actions are introduced on the go, as new data samples are scarce. In this regard, Zero- and Few-Shot Learning approaches have shown good performance in generalizing to unseen examples. In Action Segmentation, where each frame needs to be labeled, annotating new data even for a few tasks can become tedious as the number of tasks scales. In this work, we propose Interactive Iterative Improvement $(I^{3})$ for Few-Shot Action Segmentation, a Semi-Supervised Interactive Meta-Learning approach for Zero-Shot Learning on unlabeled videos and Few-Shot Learning on small amounts of labeled videos. $I^{3}$ consists of a Prototypical Network model for frame-wise prediction coupled with a Hidden Semi-Markov Model to prevent over-segmentation. The model is iteratively improved in an interactive manner through users’ annotations provided via a web interface. This is done in a task-agnostic manner that, in theory, can be reused for a number of different actions. Our model provides sequentially accurate segmentations using only a limited amount of labeled data, which shows the efficacy of our learning approach. A lower edit distance compared to baselines indicates a lower number of required user edits, making it well suited for non-expert users to smoothly provide annotations and giving them more control over the learned model.
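The abstract above describes two coupled components: distance-based, prototypical frame classification learned from a handful of labeled frames, and a Hidden Semi-Markov Model that smooths the frame-wise predictions to prevent over-segmentation. The following is a minimal sketch of that idea, assuming numpy, toy 16-dimensional frame embeddings, and a constant label-switch penalty with Viterbi decoding as a simplified stand-in for the full HSMM; all function names and parameters are illustrative, not the authors' implementation.

```python
# Sketch: prototypical frame classification + Viterbi-style smoothing
# (a simplified surrogate for the HSMM used in I^3).
import numpy as np

def class_prototypes(support_emb, support_labels, n_classes):
    """Mean embedding per class from the few labeled support frames."""
    return np.stack([support_emb[support_labels == c].mean(axis=0)
                     for c in range(n_classes)])

def frame_log_probs(query_emb, prototypes):
    """Log-softmax over negative squared distances to each prototype, per frame."""
    d2 = ((query_emb[:, None, :] - prototypes[None, :, :]) ** 2).sum(-1)
    logits = -d2
    logits -= logits.max(axis=1, keepdims=True)
    return logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))

def smooth_decode(log_probs, switch_penalty=2.0):
    """Viterbi decoding with a constant penalty for changing the action label;
    this discourages over-segmentation much like the HSMM prior."""
    T, C = log_probs.shape
    score = log_probs[0].copy()
    back = np.zeros((T, C), dtype=int)
    for t in range(1, T):
        trans = score[None, :] - switch_penalty * (1 - np.eye(C))
        back[t] = trans.argmax(axis=1)
        score = trans.max(axis=1) + log_probs[t]
    path = [int(score.argmax())]
    for t in range(T - 1, 0, -1):
        path.append(int(back[t, path[-1]]))
    return np.array(path[::-1])

# Toy usage: 3 action classes, 16-d frame embeddings.
rng = np.random.default_rng(0)
protos = class_prototypes(rng.normal(size=(30, 16)),
                          rng.integers(0, 3, size=30), n_classes=3)
labels = smooth_decode(frame_log_probs(rng.normal(size=(100, 16)), protos))
print(labels.shape)  # (100,) frame-wise segment labels
```

In this sketch the switch penalty plays the role of the segment-duration prior in the HSMM: larger values trade frame-level responsiveness for fewer, longer segments.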
The recent surge in artificial intelligence, particularly in multimodal processing technology, has advanced human-computer interaction by altering how intelligent systems perceive, understand, and respond to contextu...
A unique digital identity, user ID, is essential for everyday online activities in the Internet era. These user IDs represent a user in a digital environment using stored credentials on a system called authentication ...
ISBN (digital): 9798331536459
ISBN (print): 9798331536466
Advancements in 3D rendering like Gaussian Splatting (GS) allow novel view synthesis and real-time rendering in virtual reality (VR). However, GS-created 3D environments are often difficult to edit. For scene enhancement or to incorporate 3D assets, segmenting Gaussians by class is essential. Existing segmentation approaches are typically limited to certain types of scenes, e.g., "circular" scenes, to determine clear object boundaries. However, this method is ineffective when removing large objects in non-"circling" scenes such as large outdoor scenes. We propose Semantics-Controlled GS (SCGS), a segmentation-driven GS approach, enabling the separation of large scene parts in uncontrolled, natural environments. SCGS allows scene editing and the extraction of scene parts for VR. Additionally, we introduce a challenging outdoor dataset, overcoming the "circling" setup. We outperform the state-of-the-art in visual quality on our dataset and in segmentation quality on the 3D-OVS dataset. We conducted an exploratory user study, comparing a 360-video, plain GS, and SCGS in VR with a fixed viewpoint. In our subsequent main study, users were allowed to move freely, evaluating plain GS and SCGS. Our main study results show that participants clearly prefer SCGS over plain GS. Overall, we present an innovative approach that surpasses the state-of-the-art both technically and in user experience.
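SCGS attaches a semantic class to each Gaussian so that whole scene parts can be extracted as assets or removed for editing. The sketch below illustrates only that final masking step on a splat container; the GaussianScene fields and the per-Gaussian classes array are assumptions for illustration, not the paper's data structures or segmentation pipeline.

```python
# Sketch: extracting or removing scene parts once each Gaussian carries a class id.
from dataclasses import dataclass
import numpy as np

@dataclass
class GaussianScene:
    means: np.ndarray      # (N, 3) splat centers
    scales: np.ndarray     # (N, 3) per-axis extents
    rotations: np.ndarray  # (N, 4) quaternions
    opacities: np.ndarray  # (N,)
    colors: np.ndarray     # (N, 3) base color per Gaussian
    classes: np.ndarray    # (N,) integer semantic label per Gaussian

    def subset(self, mask: np.ndarray) -> "GaussianScene":
        """Keep only the Gaussians selected by a boolean mask."""
        return GaussianScene(self.means[mask], self.scales[mask],
                             self.rotations[mask], self.opacities[mask],
                             self.colors[mask], self.classes[mask])

def extract_class(scene: GaussianScene, class_id: int) -> GaussianScene:
    """Pull out one semantic part of the scene, e.g. to reuse it as a VR asset."""
    return scene.subset(scene.classes == class_id)

def remove_class(scene: GaussianScene, class_id: int) -> GaussianScene:
    """Delete a (possibly large) object from the scene for editing."""
    return scene.subset(scene.classes != class_id)

# Toy usage with random splats and 5 hypothetical classes.
rng = np.random.default_rng(0)
N = 1000
scene = GaussianScene(rng.normal(size=(N, 3)), np.abs(rng.normal(size=(N, 3))),
                      rng.normal(size=(N, 4)), rng.uniform(size=N),
                      rng.uniform(size=(N, 3)), rng.integers(0, 5, size=N))
asset = extract_class(scene, class_id=2)    # reusable scene part
edited = remove_class(scene, class_id=2)    # scene with that object deleted
print(asset.means.shape, edited.means.shape)
```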
ISBN (digital): 9798350374766
ISBN (print): 9798350374773
Roopkotha is a storytelling robot that seamlessly combines traditional storytelling methods and technology, creating a captivating robot storyteller. We are creating a special prototype in the world of robots that tells stories in a way that is easy to understand and enjoy. In this era of technological advancement, Roopkotha combines voice recognition with Bangla language processing, emotion recognition, and human behavior detection. Roopkotha aims to revolutionize the way stories are told and to engage with users on a deep emotional level. Furthermore, Roopkotha is equipped with advanced facial expression and emotion recognition technology. The emotion recognition feature helps the robot form a profound connection with its users.
Person image synthesis with controllable body poses and appearances is an essential task owing to the practical needs in the context of virtual try-on, image editing and video production. However, existing methods fac...
Over the years, Machine Learning models have been successfully employed on neuroimaging data for accurately predicting brain age. Deviations from the healthy brain aging pattern are associated with the accelerated bra...
ISBN (digital): 9798350359312
ISBN (print): 9798350359329
Contrastive clustering has recently been an emerging topic in deep unsupervised learning. Nevertheless, previous works mostly adopt stochastic data augmentations, which easily lead to the semantic drift problem due to limited transformations. Moreover, these approaches ignore data distribution information when generating positive and negative pair-wise samples. In light of this, this paper proposes a simple yet effective unsupervised clustering network termed Confidence-oriented Contrastive Graph Clustering (CoCGC). In particular, we design an end-to-end network paradigm with un-shared weights, in which a hybrid graph filter is utilized to generate two views of reliable augmentations. Guided by the non-dominated sorting theory, we further construct a confidence-oriented sample set from the latent data distribution perspective. By considering the local density and cluster distribution of the embedding representations, discriminative sample pairs can be derived from the confidence-oriented sets in a two-view contrastive manner. Finally, a cross-view neighbor contrastive loss is devised to better exploit the self-supervised network signals. Extensive experimental results on five benchmark datasets demonstrate the effectiveness of our method against existing state-of-the-art deep graph clustering methods.
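Two ingredients named in the CoCGC abstract lend themselves to a compact illustration: building two views by graph filtering rather than stochastic augmentation, and contrasting the same node across the two views. The sketch below is a hedged approximation, using a symmetrically normalized adjacency raised to two different powers as the "hybrid" filter and a plain cross-view InfoNCE loss in place of the paper's neighbor-aware, confidence-oriented variant; the filter orders and temperature are assumptions.

```python
# Sketch: graph-filtered views and a cross-view contrastive loss
# (an illustrative approximation, not the CoCGC implementation).
import numpy as np

def normalized_adjacency(A):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A + I) D^-1/2."""
    A_hat = A + np.eye(A.shape[0])
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def graph_filter_views(A, X, low_order=2, high_order=5):
    """Two smoothed views of the node features X: a mildly and a strongly filtered one."""
    S = normalized_adjacency(A)
    return (np.linalg.matrix_power(S, low_order) @ X,
            np.linalg.matrix_power(S, high_order) @ X)

def cross_view_contrastive_loss(Z1, Z2, temperature=0.5):
    """InfoNCE-style loss: the same node in the other view is the positive pair."""
    Z1 = Z1 / np.linalg.norm(Z1, axis=1, keepdims=True)
    Z2 = Z2 / np.linalg.norm(Z2, axis=1, keepdims=True)
    sim = Z1 @ Z2.T / temperature                        # (N, N) cross-view similarity
    sim -= sim.max(axis=1, keepdims=True)                # numerical stability
    log_prob = sim - np.log(np.exp(sim).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_prob))                   # positives on the diagonal

# Toy usage on a random undirected graph with 50 nodes and 8-d features.
rng = np.random.default_rng(0)
A = (rng.uniform(size=(50, 50)) < 0.1).astype(float)
A = np.maximum(A, A.T)
np.fill_diagonal(A, 0)
X = rng.normal(size=(50, 8))
Z1, Z2 = graph_filter_views(A, X)
print(cross_view_contrastive_loss(Z1, Z2))
```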
This paper presents a CAD-based approach for automated surface defect detection. We leverage the a-priori knowledge embedded in a CAD model and integrate it with point cloud data acquired from commercially available s...
Exposure to ideas in domains outside a scientist’s own may benefit her in reformulating existing research problems in novel ways and discovering new application domains for existing solution ideas. While improved per...