ISBN: (print) 9783031773914; 9783031773921
Estimating the 6D pose of objects is a critical challenge for robotics and augmented reality applications. The problem is aggravated by the fact that critical attributes, such as an object's texture and material, as well as the specific lighting conditions under which it must be identified, are often unknown. Neural Radiance Fields (NeRFs) and 3D Gaussian splatting (3DGS) are techniques that enable high-quality reconstruction of real-world scenes. By revising the scene fitting function, these representations can facilitate the estimation of an object's pose within a given environment. However, a major complication is that the unique textures, materials, and lighting conditions are fixed within the scene, which can impair the accuracy of pose estimation. To address this, we adopt two alterations to the standard NeRF framework that enhance its ability to handle greatly varied object appearances such as material and texture. Our modified approaches are evaluated on the prevalent YCB-V object dataset, demonstrating their effectiveness. Our two proposed algorithms achieve mesh-free 6D Object Pose Estimation for objects with previously unseen appearances, requiring only a collection of input images to train the NeRF model.
ISBN: (print) 9783031732805; 9783031732812
While most vision tasks are essentially visual in nature (for recognition), some important tasks, especially in the medical field, also require quantitative analysis (for quantification) using quantitative images. Unlike in visual analysis, pixel values in quantitative images correspond to physical metrics measured by specific devices (e.g., a depth image). However, recent work has shown that it is sometimes possible to synthesize accurate quantitative values from visual ones (e.g., depth from visual cues or defocus). This research aims to improve quantitative image synthesis (QIS) by exploring pretraining and image resolution scaling. We propose a benchmark for evaluating pretraining performance using the task of QIS-based bone mineral density (BMD) estimation from plain X-ray images, where the synthesized quantitative image is used to derive BMD. Our results show that appropriate pretraining can improve QIS performance, significantly raising the correlation of BMD estimation from 0.820 to 0.898, while other pretraining choices do not help or even hinder it. Scaling up the resolution can further boost the correlation to 0.923, a significant enhancement over conventional methods.
ISBN: (print) 9789819787913; 9789819787920
Benefiting from the remarkable performance of Neural Radiance Fields (NeRF) technology in 3D reconstruction, its integration into Simultaneous Localization and Mapping (SLAM) tasks for map representation has become a widely recognized and novel approach in recent years. This paper proposes an improved end-to-end multilevel NeRF-based dense RGB-D SLAM, building upon the current state-of-the-art NICE-SLAM. Firstly, we enhance the structure design of the multilevel MLP, improving its ability to represent high-level details and enhancing network scalability. Secondly, we refine the keyframe selection strategy to alleviate network forgetting issues. Finally, improvements are made in eliminating depth uncertainty and refining the multi-level weight settings to further enhance system performance. Extensive experiments across multiple datasets validate the accuracy improvement of our method compared to NICE-SLAM.
ISBN: (print) 9783031758225; 9783031758232
Hierarchical clustering is a popular classic technique for cluster analysis, in particular because it is easy to understand and explain. The key limitation of hierarchical agglomerative clustering is its run time: the standard algorithm runs in cubic time, and improved methods use at least quadratic time. We propose novel strategies for accelerating hierarchical clustering using incremental similarity search. Using a priority search on a vantage-point tree, we often find the next merge without computing all pairwise distances. We propose two strategies based on heaps of searches for single linkage and a third strategy based on the nearest-neighbor chain algorithm for Ward, centroid, and median linkage; other linkages are not supported efficiently (yet). Experimentally, we demonstrate 2- to 10-fold speedups on real data sets and show that subquadratic scalability is possible, although it cannot be guaranteed.
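To make the run-time limitation concrete, the following is a minimal sketch of the naive agglomerative baseline with single linkage. It is not the method proposed in the abstract (no vantage-point tree, no heaps of searches); the function name `naive_single_linkage` and the toy points are illustrative assumptions. Every merge step rescans all cluster pairs, which is what makes this baseline cubic in the number of points.

```python
import math
from itertools import combinations


def naive_single_linkage(points):
    """Naive agglomerative clustering with single linkage.

    Hypothetical helper for illustration only. Returns the merges in order
    as (cluster_a, cluster_b, distance), where each cluster is a tuple of
    point indices.
    """
    clusters = [(i,) for i in range(len(points))]
    merges = []
    while len(clusters) > 1:
        best = None
        # Rescan every pair of clusters at every step: this is the costly part.
        for a, b in combinations(range(len(clusters)), 2):
            # Single linkage: distance between the closest pair of members.
            d = min(
                math.dist(points[i], points[j])
                for i in clusters[a]
                for j in clusters[b]
            )
            if best is None or d < best[0]:
                best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = tuple(sorted(clusters[a] + clusters[b]))
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


if __name__ == "__main__":
    toy_points = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (5.5, 0.0)]
    for left, right, dist in naive_single_linkage(toy_points):
        print(left, right, dist)
```

An index-based approach such as the one described in the abstract aims to find the next merge without this exhaustive rescan.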
ISBN: (print) 9789819784899; 9789819784905
Set data appears in various application scenarios, such as point cloud data in autonomous driving and flow cytometry data in the medical field. Set Transformer combines an attention mechanism with sparse inducing-point learning to obtain a set data processing network with linear computational complexity, which is used for tasks such as point cloud object classification and abnormal flow cytometry data detection. However, the sparsity of the inducing points proposed in Set Transformer makes it challenging for the network to learn to encode the global structure of set data directly. This article proposes the Synchronous Encoding Cross Attention module and the Composite Inducing Points, which work together to address this issue. Firstly, the proposed Synchronous Encoding Cross Attention module encodes the set data while encoding the inducing points, thereby filtering out interference information and enhancing the learning of the global structural information of the set data in the subsequent cross attention. Secondly, we use composite inducing points to maintain the consistency of global structural information during the encoding process. Finally, we validate the effectiveness of the proposed methods in four application scenarios through comprehensive comparisons.
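The linear complexity attributed to inducing points can be illustrated with a stripped-down sketch of the standard inducing-point attention pattern: a small set of m inducing points first attends to the n set elements, and the elements then attend to that m-sized summary, so the cost scales with n·m rather than n². This is not the Synchronous Encoding Cross Attention module or the Composite Inducing Points proposed in the abstract; the function names are hypothetical, and learned projections, multiple heads, residual connections, and feed-forward layers are deliberately omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(queries, keys_values):
    """Scaled dot-product attention without learned projections (an assumption)."""
    dim = len(queries[0])
    outputs = []
    for q in queries:
        scores = [
            sum(qi * ki for qi, ki in zip(q, kv)) / math.sqrt(dim)
            for kv in keys_values
        ]
        weights = softmax(scores)
        outputs.append(
            [sum(w * kv[d] for w, kv in zip(weights, keys_values)) for d in range(dim)]
        )
    return outputs


def induced_attention(elements, inducing_points):
    """Two-step attention through m inducing points.

    Step 1: inducing points attend to the set (m x n score computations).
    Step 2: set elements attend to the summary (n x m score computations).
    """
    summary = cross_attention(inducing_points, elements)
    return cross_attention(elements, summary)


if __name__ == "__main__":
    toy_set = [[0.0, 1.0], [1.0, 0.0], [2.0, 2.0], [-1.0, 0.5]]
    toy_inducing = [[1.0, 0.0], [0.0, 1.0]]  # fixed here; learned in practice
    for row in induced_attention(toy_set, toy_inducing):
        print(row)
```

Because each element only ever interacts with the summary, reordering the input set reorders the output rows in the same way, which is the permutation behavior expected of a set encoder.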
ISBN: (print) 9783031732928; 9783031732904
Medical imaging cohorts are often confounded by factors such as acquisition devices, hospital sites, patient backgrounds, and many more. As a result, deep learning models tend to learn spurious correlations instead of causally related features, limiting their generalizability to new and unseen data. This problem can be addressed by minimizing dependence measures between intermediate representations of task-related and non-task-related variables. These measures include mutual information, distance correlation, and the performance of adversarial classifiers. Here, we benchmark such dependence measures for the task of preventing shortcut learning. We study a simplified setting using Morpho-MNIST and a medical imaging task with CheXpert chest radiographs. Our results provide insights into how to mitigate confounding factors in medical imaging. The project's code is publicly available (https://***/berenslab/dependence-measures-medical-imaging).
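Of the dependence measures listed, distance correlation is the easiest to show in a few lines. The sketch below computes the standard sample distance correlation for two one-dimensional samples using double-centered pairwise distance matrices. It is a generic illustration, not the benchmark code from the project repository; the function names are hypothetical, and a training setup would need a differentiable, batched version applied to intermediate representations.

```python
import math


def _centered_distances(values):
    """Pairwise |a - b| distances, double-centered (row, column, grand mean)."""
    n = len(values)
    dist = [[abs(a - b) for b in values] for a in values]
    row_means = [sum(row) / n for row in dist]
    grand_mean = sum(row_means) / n
    return [
        [dist[i][j] - row_means[i] - row_means[j] + grand_mean for j in range(n)]
        for i in range(n)
    ]


def distance_correlation(x, y):
    """Sample distance correlation of two equally long 1-D samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    n = len(x)
    a = _centered_distances(x)
    b = _centered_distances(y)

    def mean_product(p, q):
        return sum(p[i][j] * q[i][j] for i in range(n) for j in range(n)) / (n * n)

    dcov_sq = mean_product(a, b)    # squared distance covariance
    dvar_x_sq = mean_product(a, a)  # squared distance variance of x
    dvar_y_sq = mean_product(b, b)  # squared distance variance of y
    denom = math.sqrt(dvar_x_sq * dvar_y_sq)
    if denom == 0.0:
        return 0.0  # a constant sample carries no dependence information
    return math.sqrt(max(dcov_sq, 0.0) / denom)


if __name__ == "__main__":
    xs = [-2.0, -1.0, 0.0, 1.0, 2.0]
    print(distance_correlation(xs, [2.0 * v + 1.0 for v in xs]))  # linear relation
    print(distance_correlation(xs, [v * v for v in xs]))          # nonlinear relation
```

The second call is the reason such a measure is attractive for this task: for the symmetric sample and its square, Pearson correlation is zero, while distance correlation remains clearly positive, so a nonlinear dependence on a confounder is still penalized.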
ISBN: (print) 9783031713965; 9783031713972
As part of Leonardo-LABS research on autonomous and intelligent systems, a prototyping simulation stack has recently been set up to support the development of core autonomy functions for future Leonardo rotary-wing unmanned platforms. To increase the safety and reliability of UAVs, several requirements shared among all Leonardo-LHD platforms have been identified as key enablers for future autonomous flight. This led to the selection of the four streams currently under development: GOA (Ground Object Awareness), ATA (Air Traffic Awareness), ATOL (Autonomous Take-Off and Landing), and ANav (Alternative/GNSS-denied navigation). The proposed simulation solution relies on Ansys (R) AVxcelerate to simulate sensor output, which is then interfaced to a widely used robotics middleware (ROS2). This allows rapid development of perception pipelines, leading toward full SIL capability, and makes it possible to test algorithms in one-to-one correspondence with real sensor output. This work discusses the overall software architecture together with the integration of a UAV's flight dynamics and IMU models into the simulation. This integration uses the Bullet Physics C++ SDK and leverages the connection with the ROS2 stack mentioned above. AVxcelerate has recently been adopted in the automotive industry to support autonomous car development, as it can model and simulate multiple sensors such as visible cameras, LWIR cameras, various LiDAR models (both flashing and rotating), and mmWave array radars. Moreover, among the most advantageous features of the Ansys solution are its ability to model the interaction of the light spectrum with matter and the possibility of setting sensor properties quickly and effectively (camera optics, radar antennas, LiDAR ray geometry and mechanical behavior, etc.). The last section of this work presents the visual-inertial navigation case study (belonging to the ANav set of autonomy functions). It will
ISBN: (print) 9783031789540; 9783031789557
As the usage of RDF knowledge graphs (KGs) becomes more pervasive in practical applications, there is a growing need for high-quality RDF data. The SHApes Constraint Language (SHACL) enables precise constraint expression on RDF graphs, ensuring data structure compliance. However, a SHACL validation that overlooks the crucial implicit information encoded in the ontology of the KG may produce unsound results. Semantic-aware SHACL validation addresses this by considering implicit information in RDF graphs, thus enabling thorough and accurate data validation. Current methods that incorporate entailment into SHACL validation often face efficiency challenges due to the resource-intensive nature of applying inference rules across entire datasets. In this doctoral work, we explore methods to enhance the efficiency of semantic-aware SHACL validation, presenting the problem statement, research questions, and hypotheses. The paper concludes by outlining our proposed method and sharing preliminary results from our research.
ISBN: (print) 9783031777370; 9783031777387
Online hotel booking has become increasingly popular over time, and with its popularity, the amount of data that can be collected from customer actions has grown. This data can serve to build intelligent systems that provide knowledge for both customers and hotel owners. In this paper, we focus on hotel owners, who can benefit from the collected data by adjusting prices to optimise the profit of their accommodations. To accomplish this, we built a system that collected data from *** and assembled a useful dataset for price prediction. We used five regression algorithms and an optimization technique to obtain the best results, reaching a 9% error for price prediction. This result allows accommodation owners to predict room prices so as to keep the rooms fully occupied.
ISBN: (print) 9783031777370; 9783031777387
The proposed European Union regulations for Artificial Intelligence (AI) have highlighted the necessity and importance of explainable AI as a way of understanding the output and working basis of current AI systems. Recommender systems are popular AI-based tools that provide users with the items that best fit their preferences and needs in a search space overloaded with possible choices, and the development of explainable recommendation approaches is attracting increasing interest. However, most research efforts focus on explaining individual recommendations. In contrast, the current contribution develops a novel approach for generating post-hoc explanations over the output of group recommender systems. To accomplish this goal, the local rule-based explanation approach (LORE), popular in machine learning settings, is used. An experimental protocol based on item tags, which evaluates the proposed approach in terms of model fidelity and feature coverage rate, shows that it achieves appropriate fidelity values while covering most of the features used for characterizing items.