Search Results - Inner Mongolia University Library

Authors: Schöne, Sandra; Pol, Nirmal; Kahrs, Lueder A. Affiliations: Medical Computer Vision and Robotics Lab, University of Toronto, Toronto, Canada; University of Stuttgart, Stuttgart, Germany; Medical Computer Vision and Robotics Lab, Institute of Biomedical Engineering, University of Toronto, Toronto, Canada; Department of Mathematical and Computational Sciences, University of Toronto Mississauga, Mississauga, Canada

To automate surgical (sub-)tasks in robotic surgery, knowledge of the exact pose of the instrument is mandatory. Optical Coherence Tomography (OCT) appears promising for pose measurement because it offers 3D imaging at micron-scale resolution. To investigate this, 175 image sequences of the EndoWrist Round Tip Scissor Tool were acquired with an OCT system. The sequences differ in the opening angle of the scissor blades and in the rotation angle of the entire instrument about its central axis. The individual images were processed with computer vision methods, and point clouds were then generated from them. For pose estimation, an Iterative Closest Point algorithm was implemented to register the acquired point clouds to reference point clouds created from the instrument CAD file. The implemented algorithm was able to determine the opening angle with an overall error of 2 ± 1.3 and the rotation angle with a standard deviation between several runs of 0.6 ± 2.8. However, the overall processing time of (39 ± 17) s on a standard PC leaves room for further investigations. © 2024 by Walter de Gruyter Berlin/Boston.
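
As a rough illustration of the registration step named in the abstract, the sketch below runs a point-to-point Iterative Closest Point loop in pure Python. It is not the authors' implementation: the restriction to 2D points, the brute-force nearest-neighbor search, and all function names (`best_rigid_2d`, `apply_rigid_2d`, `icp_2d`) are assumptions made to keep the example short.

```python
import math

def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src points onto dst."""
    n = len(src)
    sx = sum(p[0] for p in src) / n
    sy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - sx, py - sy          # centered source point
        bx, by = qx - dx, qy - dy          # centered target point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)         # closed-form optimum in 2D
    c, s = math.cos(theta), math.sin(theta)
    t = (dx - (c * sx - s * sy), dy - (s * sx + c * sy))
    return theta, t

def apply_rigid_2d(points, theta, t):
    """Rotate points by theta about the origin, then translate by t."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + t[0], s * x + c * y + t[1]) for x, y in points]

def nearest(p, cloud):
    """Brute-force nearest neighbor of p in cloud (fine for small clouds only)."""
    return min(cloud, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)

def icp_2d(source, reference, iterations=20, tol=1e-12):
    """Align source to reference; returns (aligned points, mean squared residual)."""
    current = list(source)
    prev_err = float("inf")
    err = prev_err
    for _ in range(iterations):
        matches = [nearest(p, reference) for p in current]   # correspondence step
        theta, t = best_rigid_2d(current, matches)           # alignment step
        current = apply_rigid_2d(current, theta, t)
        err = sum((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2
                  for p, q in zip(current, matches)) / len(current)
        if abs(prev_err - err) < tol:                        # residual stopped changing
            break
        prev_err = err
    return current, err

if __name__ == "__main__":
    # Hypothetical data: a small grid as the reference, slightly rotated and shifted as the source.
    reference = [(float(i), float(j)) for i in range(4) for j in range(3)]
    source = apply_rigid_2d(reference, math.radians(3.0), (0.05, -0.02))
    aligned, residual = icp_2d(source, reference)
    print("mean squared residual:", residual)
```

A 3D version would typically replace the closed-form angle with an SVD-based rotation estimate and the brute-force search with a spatial index; both are omitted here.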

Keywords: Optical coherence tomography

iTeach: Interactive Teaching for Robot Perception using Mixed Reality

arXiv, 2024

Authors: Jaykumar, Jishnu P.; Salvato, Cole; Bomnale, Vinaya; Wang, Jikai; Xiang, Yu. Affiliations: Intelligent Robotics and Vision Lab, Department of Computer Science, The University of Texas, Dallas, United States

We introduce iTeach, a human-in-the-loop Mixed Reality (MR) system that enhances robot perception through interactive teaching. Our system enables users to visualize robot perception outputs, such as object detection and segmentation results, using an MR device. Users can therefore inspect failures of perception models using the system on real robots. Moreover, iTeach facilitates real-time, informed data collection and annotation, where users can use hand gestures, eye gaze, and voice commands to annotate images collected from robots. The annotated images can be used to fine-tune perception models to improve their accuracy and adaptability. The system continually improves perception models by collecting annotations of failed examples from users. When applied to object detection and unseen object instance segmentation (UOIS) tasks, iTeach demonstrates encouraging results in improving pre-trained vision models for these two tasks. These results highlight the potential of MR to make robotic perception systems more capable and adaptive in real-world environments. © 2024, CC BY.

Keywords: Mixed reality

EfficientNet-SAM: A Novel EffecientNet with Spatial Attention Mechanism for COVID-19 Detection in Pulmonary CT Scans

IEEE Computer Society Conference on Computer Vision and Pattern Recognition Workshops (CVPRW)

Authors: Ramy Farag; Parth Upadhay; Jacket Demby’s; Yixiang Gao; Katherin Garces Montoya; Seyed Mohamad Ali Tousi; Gbenga Omotara; Guilherme DeSouza. Affiliations: Department of Electrical Engineering and Computer Science, Vision-Guided and Intelligent Robotics Lab - ViGIR Lab, University of Missouri-Columbia

ISBN (digital): 9798350365474

ISBN (print): 9798350365481

Manual analysis and diagnosis of COVID-19 through the examination of Computed Tomography (CT) images of the lungs can be time-consuming and result in errors, especially given the high volume of patients and the numerous images per patient. We therefore address the need to automate this task by developing a new deep learning-based pipeline. Our motivation was sparked by the CVPR Workshop on "Domain Adaptation, Explainability and Fairness in AI for Medical Image Analysis", more specifically, the "COVID-19 Diagnosis Competition (DEF-AI-MIA COV19D)" under the same Workshop. This challenge provides an opportunity to assess our proposed pipeline for COVID-19 detection from CT scan images. The pipeline incorporates one of the architectures in the EfficientNet "family", but with an added Spatial Attention Mechanism: EfficientNet-SAM. Also, unlike past pipelines, which relied on a preprocessing step, our pipeline takes the raw selected input images without any such step, except for an image-selection step that simply reduces the number of CT images required for training and/or testing. Moreover, our pipeline is computationally efficient; for example, it does not incorporate a decoder for segmenting the lungs. It also does not combine different models or combine an RNN with a backbone, as other pipelines in the past did. Nevertheless, our pipeline outperformed all approaches presented by other teams in last year’s instance of the same challenge using the validation subset. It also placed 5th in this year’s competition, ranking less than 1.3% below the 1st place and close to 3.5% above the 6th place based on the macro-F1 score.
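
The ranking above is stated in terms of the macro-F1 score, which averages the per-class F1 scores with equal weight regardless of class size. The following is a minimal pure-Python sketch of that standard definition; the function name `macro_f1` and the toy labels are illustrative assumptions, not the competition's evaluation code.

```python
def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 scores over all labels seen in either list."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = []
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)  # true positives
        fp = sum(1 for t, p in pairs if t != label and p == label)  # false positives
        fn = sum(1 for t, p in pairs if t == label and p != label)  # false negatives
        denom = 2 * tp + fp + fn
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the class never appears.
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)

if __name__ == "__main__":
    # Hypothetical binary labels (1 = COVID, 0 = non-COVID), for illustration only.
    truth = [1, 1, 1, 0]
    preds = [1, 1, 0, 0]
    print(macro_f1(truth, preds))
```

Because each class contributes equally, a model that neglects the minority class is penalized more under macro-F1 than under plain accuracy.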

Keywords: COVID-19; Training; Visualization; Attention mechanisms; Computed tomography; Conferences; Pipelines

Long-Term Invariant Local Features via Implicit Cross-Domain Correspondences

arXiv, 2023

Authors: Pataki, Zador; Altillawi, Mohammad; Kanakis, Menelaos; Pautrat, Rémi; Shen, Fengyi; Liu, Ziyuan; Van Gool, Luc; Pollefeys, Marc. Affiliations: The Computer Vision and Geometry Lab, Department of Computer Science, ETH Zurich, Switzerland; The Computer Vision Center, CVC-Barcelona; The Intelligent Robotics Cloud Technology lab of Huawei-Munich, Germany; The Computer Vision Lab, Department Electrical Engineering, ETH Zurich, Switzerland; The Intelligent Robotics Cloud Technology lab of Huawei-Munich, Germany; The Intelligent Robotics Cloud Technology lab of Huawei-Munich, Germany; The Center for Processing Speech and Images, KU Leuven; The Computer Vision Lab, ETH Zurich, Switzerland

Modern learning-based visual feature extraction networks perform well in intra-domain localization; however, their performance declines significantly when image pairs are captured across long-term visual domain variations, such as seasonal and daytime changes. In this paper, our first contribution is a benchmark to investigate the performance impact of long-term variations on visual localization. We conduct a thorough analysis of the performance of current state-of-the-art feature extraction networks under various domain changes and find a significant performance gap between intra- and cross-domain localization. We investigate different methods to close this gap by improving the supervision of modern feature extractor networks. We propose a novel data-centric method, Implicit Cross-Domain Correspondences (iCDC). iCDC represents the same environment with multiple Neural Radiance Fields, each fitting the scene under individual visual domains. It utilizes the underlying 3D representations to generate accurate correspondences across different long-term visual conditions. Our proposed method enhances cross-domain localization performance, significantly reducing the performance gap. When evaluated on popular long-term localization benchmarks, our trained networks consistently outperform existing methods. This work serves as a substantial stride toward more robust visual localization pipelines for long-term deployments, and opens up research avenues in the development of long-term invariant descriptors. Copyright © 2023, The Authors. All rights reserved.

Keywords: Feature extraction

Maize EfficientNet Fusion: Advancing Maize Disease Detection with MF-NET

25th International Conference on Digital Image Computing: Techniques and Applications, DICTA 2024

Authors: Khalid, Fatima; Hanif, Muhammad; Ul Ain, Qurat. Affiliations: Ghulam Ishaq Khan Institute of Engineering Sciences and Engineering, Faculty of Computer Science and Engineering, KPK, Topi, Pakistan; Ghulam Ishaq Khan Institute of Engineering Sciences and Engineering, Aerial Robotics and Vision Lab, KPK, Topi, Pakistan; Shifa Tameer-E-Millat University, Department of Computing, Islamabad, Pakistan

ISBN (print): 9798350379037

Maize is a vital global crop, essential for food security but highly susceptible to diseases that threaten yield and quality. Traditional methods for detecting these diseases are computationally intensive and rely on high-quality images, limiting their practical use in diverse field conditions. This research addresses these challenges by proposing MF-Net, a novel model that leverages two EfficientNet-b0 architectures with ReLU and Swish activation functions. Our approach involves partially training Model R with frozen early layers using ReLU, and fully training Model S with Swish activation for enhanced optimization and robustness. We employ model concatenation and feature fusion techniques to create a lightweight yet powerful model, complemented by additional layers for improved feature extraction and regularization. Experimental results on the Corn Disease and Severity (CD&S) dataset are compelling: MF-Net achieves a remarkable 95.7% accuracy and 96% precision in disease classification, and 88.3% accuracy and 88.46% precision in severity level detection. These results highlight the model's effectiveness under challenging conditions such as cluttered backgrounds and variable lighting, significantly reducing computation time without compromising accuracy. Hence, MF-Net presents an innovative and efficient solution for maize disease detection and severity assessment, offering significant advancements for agricultural practices and crop management. © 2024 IEEE.
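
The two branches described in the abstract differ in their activation function. For reference, the standard definitions are ReLU(x) = max(0, x) and Swish(x) = x · sigmoid(x). The pure-Python sketch below only illustrates these two functions; the function names are illustrative and this is not the MF-Net code.

```python
import math

def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    z = math.exp(x)
    return z / (1.0 + z)

def relu(x):
    """ReLU: passes positive inputs unchanged and clamps negatives to zero."""
    return max(0.0, x)

def swish(x):
    """Swish: x * sigmoid(x); smooth, and slightly negative for negative inputs."""
    return x * sigmoid(x)

if __name__ == "__main__":
    for value in (-2.0, -0.5, 0.0, 0.5, 2.0):
        print(value, relu(value), swish(value))
```

Unlike ReLU, Swish is smooth and lets small negative values through, which is the usual motivation given for trying it alongside ReLU.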

Keywords: CD&S Dataset; EfficientNet-b0; Fused Truncated EfficientNet; Maize Disease; Maize Severity Level

Learning Inverse Kinematics Multiplicity of Concentric Tube Robots Using Invertible Neural Networks

International Symposium on Medical Robotics (ISMR)

Authors: Paul H. Kang; Radian Gondokaryono; Majid Roshanfar; Robert H. Nguyen; Thomas Looi; James M. Drake; Dale Podolsky. Affiliations: Institute of Biomedical Engineering, University of Toronto, Canada; Wilfred and Joyce Posluns Centre for Image Guided Innovation and Therapeutic Intervention, Hospital for Sick Children, Toronto, Canada; Department of Computer Science, Medical Computer Vision and Robotics Lab, University of Toronto, Canada

ISBN (digital): 9798331599003

ISBN (print): 9798331599010

Concentric tube robots (CTRs) are a promising technology for medical applications due to their small size, flexibility, and ability to form complex shapes. These robots are built from a series of pre-curved, super-elastic tubes that are arranged concentrically and manipulated through rotational and translational movements at their proximal ends. Achieving accurate kinematics is essential to making CTRs useful in minimally invasive surgical procedures, where precision and safety are paramount. Due to the difficulty of incorporating nonlinear effects like friction and tube clearances into analytical models, previous works have investigated machine learning-based models for CTR kinematics, leading to higher kinematic accuracies. We present a kinematic model for CTRs using an invertible neural network architecture that, unlike other learning-based models, can generate multiple inverse kinematic solutions. Our model achieved a mean forward kinematic tip error of 2.86 mm (3.43% normalized to arclength), outperforming a Cosserat rod-based analytical CTR model. In three test cases (two static points and a circle trajectory), our model achieved mean inverse kinematic errors of 6.19, 3.33, and 3.86 mm (5.66%, 4.12%, and 4.16% normalized to arclength). We additionally present a robust data capture pipeline that uses state-of-the-art segmentation models to reconstruct the CTR's shape.

Keywords: Analytical models; Concentric tube robots; Accuracy; Minimally invasive surgery; Shape; Neural networks; Pipelines; Kinematics; Trajectory; Safety

FusionPortableV2: A Unified Multi-Sensor Dataset for Generalized SLAM Across Diverse Platforms and Scalable Environments

arXiv, 2024

Authors: Wei, Hexiang; Jiao, Jianhao; Hu, Xiangcheng; Yu, Jingwen; Xie, Xupeng; Wu, Jin; Zhu, Yilong; Liu, Yuxuan; Wang, Lujia; Liu, Ming. Affiliations: Department of Electronic and Computer Engineering, The Hong Kong University of Science and Technology, Hong Kong; Robot Perception and Learning Lab, Department of Computer Science, University College London, United Kingdom; Shenzhen Key Laboratory of Robotics and Computer Vision, Southern University of Science and Technology, China; Guangzhou, China

Simultaneous Localization and Mapping (SLAM) technology has been widely applied in various robotic scenarios, from rescue operations to autonomous driving. However, the generalization of SLAM algorithms remains a significant challenge, as current datasets often lack scalability in terms of platforms and environments. To address this limitation, we present FusionPortableV2, a multi-sensor SLAM dataset featuring sensor diversity, varied motion patterns, and a wide range of environmental scenarios. Our dataset comprises 27 sequences, spanning over 2.5 hours and collected from four distinct platforms: a handheld suite, a legged robot, an unmanned ground vehicle (UGV), and a vehicle. These sequences cover diverse settings, including buildings, campuses, and urban areas, with a total length of 38.7 km. Additionally, the dataset includes ground-truth (GT) trajectories and RGB point cloud maps covering approximately 0.3 km². To validate the utility of our dataset in advancing SLAM research, we assess several state-of-the-art (SOTA) SLAM algorithms. Furthermore, we demonstrate the dataset's broad application beyond traditional SLAM tasks by investigating its potential for monocular depth estimation. The complete dataset, including sensor data, GT, and calibration details, is accessible at https://***/dataset/fusionportable v2. Copyright © 2024, The Authors. All rights reserved.

Keywords: SLAM robotics

Emotion Recognition from Occluded Facial Images Using Deep Ensemble Model

Computers, Materials & Continua, 2022, Vol. 73, No. 12, pp. 4465-4487

Authors: Zia Ullah; Muhammad Ismail Mohmand; Sadaqat ur Rehman; Muhammad Zubair; Maha Driss; Wadii Boulila; Rayan Sheikh; Ibrahim Alwawi. Affiliations: Department of Computer Science, The Brains Institute, Peshawar 25000, Pakistan; School of Natural and Computing Sciences, University of Aberdeen, Aberdeen, UK; Department of Neurosciences, KU Leuven Medical School, Leuven 3000, Belgium; Security Engineering Laboratory, CCIS, Prince Sultan University, Riyadh 12435, Saudi Arabia; Robotics and Internet of Things Lab, Prince Sultan University, Riyadh 12435, Saudi Arabia; Department of Computer Science, Robert Gordon University, Aberdeen, UK

Facial expression recognition has been a hot topic for decades, but high intraclass variation makes it *** overcome intraclass variation for visual recognition, we introduce a novel fusion methodology, in which the proposed model first extracts features followed by feature ***, ResNet-50, VGG-19, and Inception-V3 are used to ensure feature learning followed by feature ***, the three feature extraction models are utilized using Ensemble Learning techniques for final expression *** representation learnt by the proposed methodology is robust to occlusions and pose variations and offers promising *** evaluate the efficiency of the proposed model, we use two wild benchmark datasets, Real-world Affective Faces Database (RAF-DB) and AffectNet, for facial expression *** proposed model classifies the emotions into seven different categories, namely: happiness, anger, fear, disgust, sadness, surprise, and ***, the performance of the proposed model is also compared with other algorithms, focusing on the analysis of computational cost, convergence, and accuracy based on a standard problem specific to classification applications.

Keywords: Ensemble learning; emotion recognition; feature fusion; occlusion

Preference detection of the humanoid robot face based on EEG and eye movement

Neural Computing and Applications, 2024, Vol. 36, No. 19, pp. 11603-11621

Authors: Wang, Pengchao; Mu, Wei; Zhan, Gege; Wang, Aiping; Song, Zuoting; Fang, Tao; Zhang, Xueze; Wang, Junkongshuai; Niu, Lan; Bin, Jianxiong; Zhang, Lihua; Jia, Jie; Kang, Xiaoyang. Affiliations: Laboratory for Neural Interface and Brain Computer Interface, Engineering Research Center of AI and Robotics, Ministry of Education, Shanghai Engineering Research Center of AI and Robotics, MOE Frontiers Center for Brain Science, State Key Laboratory of Medical Neurobiology, Institute of AI and Robotics, Academy for Engineering and Technology, Fudan University, Shanghai 200433, China; Ji Hua Laboratory, Guangdong Province, Foshan 528200, China; Department of Rehabilitation Medicine, Huashan Hospital, Fudan University, Shanghai 200040, China; Yiwu Research Institute of Fudan University, Zhejiang, Yiwu 322000, China; Research Center for Intelligent Sensing, Zhejiang Lab, Hangzhou 311121, China

The face of a humanoid robot can affect the user experience, and the detection of face preference is particularly important. Preference detection belongs to a branch of emotion recognition that has received much attention from researchers. Most of the previous preference detection studies have been conducted based on a single modality. In this paper, we detect face preferences of humanoid robots based on electroencephalogram (EEG) signals and eye movement signals for single modality, canonical correlation analysis fusion modality, and bimodal deep autoencoder (BDAE) fusion modality, respectively. We validated the theory of frontal asymmetry by analyzing the preference patterns of EEG and found that participants had higher alpha wave energy for preference faces. In addition, hidden preferences extracted by EEG signals were better classified than preferences from participants' subjective feedback, and also, the classification performance of eye movement data was improved. Finally, experimental results showed that BDAE multimodal fusion using frontal alpha and beta power spectral densities and eye movement information as features performed best, with the highest average accuracy of 83.13% for the SVM and 71.09% for the KNN. © The Author(s), under exclusive licence to Springer-Verlag London Ltd., part of Springer Nature 2024.

Keywords: Electroencephalography

Robot Person Following Under Partial Occlusion

IEEE International Conference on Robotics and Automation (ICRA)

Authors: Hanjing Ye; Jieting Zhao; Yaling Pan; Weinan Chen; Li He; Hong Zhang. Affiliations: Department of Electronic and Electrical Engineering, Shenzhen Key Laboratory of Robotics and Computer Vision, Southern University of Science and Technology (SUSTech); SUSTech; Biomimetic and Intelligent Robotics Lab, Guangdong University of Technology

Robot person following (RPF) is a capability that supports many useful human-robot-interaction (HRI) applications. However, existing solutions to person following often assume full observation of the tracked person. As a consequence, they cannot track the person reliably under partial occlusion, where the assumption of full observation is not satisfied. In this paper, we focus on the problem of robot person following under partial occlusion caused by the limited field of view of a monocular camera. Based on the key insight that it is possible to locate the target person when one or more of his/her joints are visible, we propose a method in which each visible joint contributes a location estimate of the followed person. Experiments on a public person-following dataset show that, even under partial occlusion, the proposed method can still locate the person more reliably than the existing SOTA methods. In addition, the application of our method is demonstrated in real experiments on a mobile robot.
