To automate surgical (sub-)tasks in robotic surgery, the knowledge of the exact pose of the instrument is mandatory. The application of Optical Coherence Tomography (OCT) to the problem of pose measurement appears pro...
We introduce iTeach, a human-in-the-loop Mixed Reality (MR) system that enhances robot perception through interactive teaching. Our system enables users to visualize robot perception outputs such as object detection a...
ISBN (digital): 9798350365474
ISBN (print): 9798350365481
Manual analysis and diagnosis of COVID-19 through the examination of Computed Tomography (CT) images of the lungs can be time-consuming and error-prone, especially given the high volume of patients and the numerous images per patient. We therefore address the need to automate this task by developing a new deep learning-based pipeline. Our motivation was sparked by the CVPR Workshop on "Domain Adaptation, Explainability and Fairness in AI for Medical Image Analysis", and more specifically by the "COVID-19 Diagnosis Competition (DEF-AI-MIA COV19D)" held under the same workshop. This challenge provides an opportunity to assess our proposed pipeline for COVID-19 detection from CT scan images. The pipeline incorporates one of the architectures in the EfficientNet family, extended with a Spatial Attention Mechanism: EfficientNet-SAM. Unlike past pipelines, which relied on a preprocessing step, our pipeline takes the raw selected input images without any such step, apart from an image-selection step that simply reduces the number of CT images required for training and/or testing. Moreover, our pipeline is computationally efficient: for example, it does not incorporate a decoder for segmenting the lungs, and it neither combines different models nor pairs an RNN with a backbone, as earlier pipelines did. Nevertheless, our pipeline outperformed all approaches presented by other teams in last year's instance of the same challenge on the validation subset. It also placed 5th in this year's competition, ranking less than 1.3% below 1st place and close to 3.5% above 6th place based on the macro-F1 score.
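Since the ranking above is stated in terms of macro-F1, a minimal sketch of that metric may help: it is the unweighted mean of the per-class F1 scores, so the minority class counts as much as the majority class. The function name `macro_f1`, the label strings, and the toy predictions below are illustrative assumptions, not data or code from the challenge.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Return the unweighted mean of per-class F1 scores."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("y_true and y_pred must be non-empty and equally long")
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted class gains a false positive
            fn[truth] += 1  # true class gains a false negative
    scores = []
    for label in labels:
        denom = 2 * tp[label] + fp[label] + fn[label]
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the class never occurs
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical per-scan labels, for illustration only.
y_true = ["covid"] * 3 + ["non-covid"] * 5
y_pred = ["covid", "covid", "non-covid",
          "non-covid", "non-covid", "non-covid", "non-covid", "covid"]
score = macro_f1(y_true, y_pred)
print(f"macro-F1 = {score:.4f}")
```

In this toy case the per-class F1 scores are 2/3 for "covid" and 0.8 for "non-covid", so the macro average sits midway between them regardless of how many scans each class has.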
Modern learning-based visual feature extraction networks perform well in intra-domain localization, however, their performance significantly declines when image pairs are captured across long-term visual domain variat...
Maize is a vital global crop, essential for food security but highly susceptible to diseases that threaten yield and quality. Traditional methods for detecting these diseases are computationally intensive and rely on ...
ISBN (digital): 9798331599003
ISBN (print): 9798331599010
Concentric tube robots (CTR) are a promising technology for medical applications due to their small size, flexibility, and ability to form complex shapes. These robots are built from a series of pre-curved, super-elastic tubes that are arranged concentrically and manipulated through rotational and translational movements at their proximal ends. Achieving accurate kinematics is essential to making CTRs useful in minimally invasive surgical procedures, where precision and safety are paramount. Because nonlinear effects such as friction and tube clearances are difficult to incorporate into analytical models, previous works have investigated machine learning-based models for CTR kinematics, leading to higher kinematic accuracy. We present a kinematic model for CTRs using an invertible neural network architecture that, unlike other learning-based models, can generate multiple inverse kinematic solutions. Our model achieved a mean forward kinematic tip error of 2.86 mm (3.43% normalized to arclength), outperforming a Cosserat rod-based analytical CTR model. In three test cases (two static points and a circle trajectory), our model achieved mean inverse kinematic errors of 6.19, 3.33, and 3.86 mm (5.66%, 4.12%, and 4.16% normalized to arclength). We additionally present a robust data capture pipeline that reconstructs the CTR's shape using state-of-the-art segmentation models.
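The error figures above pair an absolute tip error in millimetres with a percentage of arclength. A minimal sketch of how such a pair could be computed follows, assuming the tip error is the Euclidean distance between predicted and measured tip positions and that normalization divides by a single robot arclength. The abstract does not state the exact procedure, and the function names, coordinates, and the 100 mm arclength are hypothetical.

```python
import math


def mean_tip_error(predicted, measured):
    """Mean Euclidean distance (mm) between paired 3D tip positions."""
    if len(predicted) != len(measured) or not predicted:
        raise ValueError("need equally many, at least one, tip positions")
    errors = [math.dist(p, m) for p, m in zip(predicted, measured)]
    return sum(errors) / len(errors)


def percent_of_arclength(error_mm, arclength_mm):
    """Express an error in mm as a percentage of the robot's arclength."""
    return 100.0 * error_mm / arclength_mm


# Hypothetical tip positions in mm, for illustration only.
predicted = [(0.0, 0.0, 50.0), (10.0, 0.0, 60.0)]
measured = [(3.0, 4.0, 50.0), (10.0, 0.0, 62.0)]

mean_err = mean_tip_error(predicted, measured)  # distances are 5 mm and 2 mm
pct = percent_of_arclength(mean_err, arclength_mm=100.0)  # assumed arclength
print(f"mean tip error = {mean_err:.2f} mm ({pct:.2f}% of arclength)")
```

Normalizing by arclength makes errors comparable across robots of different lengths, which is presumably why both figures are reported.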
Simultaneous Localization and Mapping (SLAM) technology has been widely applied in various robotic scenarios, from rescue operations to autonomous driving. However, the generalization of SLAM algorithms remains a sign...
Facial expression recognition has been a hot topic for decades, but high intraclass variation makes it *** overcome intraclass variation for visual recognition, we introduce a novel fusion methodology, in which the proposed model first extracts features followed by feature ***, ResNet-50, VGG-19, and Inception-V3 are used to ensure feature learning followed by feature ***, the three feature extraction models are combined using ensemble learning techniques for final expression *** representation learnt by the proposed methodology is robust to occlusions and pose variations and offers promising *** evaluate the efficiency of the proposed model, we use two in-the-wild benchmark datasets, Real-world Affective Faces Database (RAF-DB) and AffectNet, for facial expression *** proposed model classifies the emotions into seven different categories, namely: happiness, anger, fear, disgust, sadness, surprise, and ***, the performance of the proposed model is also compared with other algorithms, focusing on the analysis of computational cost, convergence, and accuracy based on a standard problem specific to classification applications.
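The abstract says the three backbones are combined with ensemble learning but does not specify the combination rule. One common choice, shown here purely as an assumption, is soft voting: average the class-probability vectors from each model and pick the class with the highest mean. The function name `soft_vote` and all probability values are hypothetical, and classes are referred to by index because the listing above is incomplete.

```python
def soft_vote(prob_rows):
    """Average per-model class probabilities and return (best_index, averages)."""
    if not prob_rows:
        raise ValueError("need at least one model's probabilities")
    n_classes = len(prob_rows[0])
    if any(len(row) != n_classes for row in prob_rows):
        raise ValueError("all models must score the same number of classes")
    averages = [
        sum(row[i] for row in prob_rows) / len(prob_rows)
        for i in range(n_classes)
    ]
    best = max(range(n_classes), key=averages.__getitem__)
    return best, averages


# Hypothetical softmax outputs over seven expression classes for one image.
resnet50_probs = [0.50, 0.10, 0.10, 0.10, 0.10, 0.05, 0.05]
vgg19_probs = [0.20, 0.40, 0.10, 0.10, 0.10, 0.05, 0.05]
inception_v3_probs = [0.45, 0.15, 0.10, 0.10, 0.10, 0.05, 0.05]

best, averages = soft_vote([resnet50_probs, vgg19_probs, inception_v3_probs])
print(f"predicted class index: {best}")
```

Here one model disagrees with the other two, yet the averaged scores still favor class index 0, which illustrates how an ensemble can smooth over a single model's error.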
The face of a humanoid robot can affect the user experience, and the detection of face preference is particularly important. Preference detection belongs to a branch of emotion recognition that has received much atten...
Robot person following (RPF) is a capability that supports many useful human-robot-interaction (HRI) applications. However, existing solutions to person following often assume full observation of the tracked person. As a consequence, they cannot track the person reliably under partial occlusion, where the assumption of full observation is not satisfied. In this paper, we focus on the problem of robot person following under partial occlusion caused by the limited field of view of a monocular camera. Based on the key insight that it is possible to locate the target person when one or more of his/her joints are visible, we propose a method in which each visible joint contributes a location estimate of the followed person. Experiments on a public person-following dataset show that, even under partial occlusion, the proposed method locates the person more reliably than existing SOTA methods. In addition, the application of our method is demonstrated in real experiments on a mobile robot.
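The idea that each visible joint contributes its own location estimate can be illustrated with a simple fusion step. The sketch below assumes each joint yields an (x, y) position estimate of the person plus a detection confidence, and combines the visible ones with a confidence-weighted mean. The abstract does not describe how the estimates are actually combined, so the function name, the threshold, and the numbers are all hypothetical.

```python
def fuse_joint_estimates(estimates, min_confidence=0.3):
    """Fuse per-joint (x, y, confidence) estimates into one (x, y) position.

    Joints below min_confidence are treated as occluded and ignored.
    Returns None when no joint is visible.
    """
    visible = [(x, y, c) for x, y, c in estimates if c >= min_confidence]
    if not visible:
        return None
    total = sum(c for _, _, c in visible)
    fused_x = sum(x * c for x, _, c in visible) / total
    fused_y = sum(y * c for _, y, c in visible) / total
    return fused_x, fused_y


# Hypothetical estimates in metres; the third joint is outside the field of view.
estimates = [
    (2.0, 0.0, 0.5),  # e.g. an estimate derived from a visible knee
    (2.2, 0.2, 0.5),  # e.g. an estimate derived from a visible ankle
    (9.0, 9.0, 0.1),  # low-confidence detection, ignored
]
position = fuse_joint_estimates(estimates)
print(position)
```

Because the fusion only needs at least one visible joint, it still returns a position when most of the body is cut off by the camera's field of view.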