One of the key tasks for autonomous vehicles and robots is robust perception of their 3D environment, which is why they are equipped with a wide range of different sensors. Building upon a robust sensor setup, understanding and interpreting the 3D environment is the next important step. Semantic segmentation of 3D sensor data, e.g. point clouds, provides valuable information for this task and is often seen as a key enabler for 3D scene understanding. This work presents an iterative deep fusion architecture for semantic segmentation of 3D point clouds, which builds upon a range image representation of the point clouds and additionally exploits camera features to increase accuracy and robustness. In contrast to other approaches, which fuse LiDAR and camera features once, the proposed fusion strategy iteratively combines and refines LiDAR and camera features at different scales inside the network architecture. Additionally, the proposed approach can deal with camera failure as well as jointly predict LiDAR and camera segmentation. We demonstrate the benefits of the presented iterative deep fusion approach on two challenging datasets, outperforming all range image-based LiDAR and fusion approaches. An in-depth evaluation underlines the effectiveness of the proposed fusion strategy and the potential of camera features for 3D semantic segmentation.
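To make the described multi-scale fusion idea concrete, here is a minimal PyTorch sketch of a LiDAR stream that absorbs camera features at every scale and degrades to LiDAR-only inference on camera failure. The module names, channel counts, the five-channel range-image encoding, and the assumption that camera pixels are pre-projected onto the range-image grid are all illustrative, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class FusionBlock(nn.Module):
    """Combine and refine LiDAR and camera features at one scale."""
    def __init__(self, channels):
        super().__init__()
        self.merge = nn.Conv2d(2 * channels, channels, kernel_size=1)
        self.refine = nn.Conv2d(channels, channels, kernel_size=3, padding=1)
        self.act = nn.ReLU(inplace=True)

    def forward(self, lidar_feat, cam_feat):
        fused = self.act(self.merge(torch.cat([lidar_feat, cam_feat], dim=1)))
        # Residual refinement keeps the LiDAR stream primary.
        return lidar_feat + self.act(self.refine(fused))

class IterativeFusionNet(nn.Module):
    """Fuses camera features into a LiDAR range-image stream at every
    scale, rather than once; falls back to LiDAR-only on camera failure."""
    def __init__(self, channels=(32, 64, 128), num_classes=20):
        super().__init__()
        self.lidar_stages, self.cam_stages, self.fusions = (
            nn.ModuleList(), nn.ModuleList(), nn.ModuleList())
        in_l, in_c = 5, 3  # range image: x, y, z, range, intensity; camera: RGB
        for ch in channels:
            self.lidar_stages.append(nn.Sequential(
                nn.Conv2d(in_l, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True)))
            self.cam_stages.append(nn.Sequential(
                nn.Conv2d(in_c, ch, 3, stride=2, padding=1), nn.ReLU(inplace=True)))
            self.fusions.append(FusionBlock(ch))
            in_l = in_c = ch
        self.head = nn.Conv2d(channels[-1], num_classes, kernel_size=1)

    def forward(self, range_image, camera_image, camera_ok=True):
        l, c = range_image, camera_image
        for lidar_stage, cam_stage, fusion in zip(
                self.lidar_stages, self.cam_stages, self.fusions):
            l = lidar_stage(l)
            if camera_ok:
                c = cam_stage(c)
                l = fusion(l, c)  # iterative fusion at this scale
        return self.head(l)

# Toy usage: camera pixels assumed pre-projected onto the range-image grid.
logits = IterativeFusionNet()(torch.randn(1, 5, 64, 512), torch.randn(1, 3, 64, 512))
```

The `camera_ok` flag illustrates how a single set of weights could serve both the fused and camera-failure cases, since the fusion blocks only add residual corrections to the LiDAR branch.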
Purpose: We propose a single network trained by pixel-to-label deep learning to address the general issue of automatic multiple organ segmentation in three-dimensional (3D) computed tomography (CT) images. Our method can be described as a voxel-wise multiple-class classification scheme for automatically assigning labels to each pixel/voxel in a 2D/3D CT image. Methods: We simplify the segmentation of anatomical structures (including multiple organs) in a CT image (generally in 3D) to a majority voting scheme over the semantic segmentation of multiple 2D slices drawn from different viewpoints with redundancy. The proposed method inherits the spirit of fully convolutional networks (FCNs) that consist of convolution and deconvolution layers for 2D semantic image segmentation, and expands the core structure with 3D-2D-3D transformations to adapt to 3D CT image segmentation. All parameters in the proposed network are trained pixel-to-label from a small number of CT cases with human annotations as the ground truth. The proposed network naturally fulfills the requirements of multiple organ segmentation in CT cases of different sizes that cover arbitrary scan regions without any adjustment. Results: The proposed network was trained and validated using the simultaneous segmentation of 19 anatomical structures in the human torso, including 17 major organs and two special regions (lumen and content inside the stomach). Some of these structures have never been reported in previous research on CT segmentation. A database consisting of 240 3D CT scans (95% for training and 5% for testing), together with their manually annotated ground-truth segmentations, was used in our experiments. The results show that the 19 structures of interest were segmented with acceptable accuracy (88.1% and 87.9% of voxels in the training and testing datasets, respectively, were labeled correctly) against the ground truth. Conclusions: We propose a single network based on pixel-to-label deep learning that addresses automatic multiple organ segmentation in 3D CT images.
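The 3D-2D-3D majority-voting scheme described above can be sketched in a few lines of NumPy: segment 2D slices taken along each of the three orthogonal viewpoints, scatter the per-slice labels back into a 3D vote tensor, and assign each voxel its most-voted class. The `segment_slice` helper below is a hypothetical stand-in for the trained 2D FCN, replaced by a trivial intensity threshold so the sketch runs end to end.

```python
import numpy as np

def segment_slice(slice_2d, num_classes=2):
    """Stand-in for a trained 2D FCN; returns per-pixel class labels."""
    # Dummy intensity threshold just to keep the sketch runnable.
    return (slice_2d > slice_2d.mean()).astype(np.int64)

def segment_volume(volume, num_classes=2):
    """Label each voxel by majority vote over 2D segmentations of
    axial, coronal, and sagittal slices (the 3D-2D-3D scheme)."""
    votes = np.zeros(volume.shape + (num_classes,), dtype=np.int32)
    one_hot = np.eye(num_classes, dtype=np.int32)
    for axis in range(3):                          # three orthogonal viewpoints
        for i in range(volume.shape[axis]):
            labels = segment_slice(np.take(volume, i, axis=axis), num_classes)
            idx = [slice(None)] * 3
            idx[axis] = i
            votes[tuple(idx)] += one_hot[labels]   # scatter votes back into 3D
    return votes.argmax(axis=-1)                   # per-voxel majority class

# Example: segment a toy 32^3 volume standing in for a CT scan.
segmentation = segment_volume(np.random.rand(32, 32, 32).astype(np.float32))
```

Because every voxel appears in one slice per viewpoint, each voxel collects three redundant votes, which is what lets errors of any single 2D view be outvoted.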
ISBN (print): 9781665479431
Deep learning is being widely used to identify and segment various 2D and 3D structures in voxelized data in fields such as robotics and medical imaging. Automated object detection and segmentation has had a rich history in semiconductor inspection and defect detection technologies for the past few decades. Deep learning-based object detection and image segmentation has the potential to further improve defect detection accuracy and reduce the manpower required for the quality inspection process. We develop a novel framework that utilizes advancements in deep learning-based object detection and image segmentation to leverage partially labeled data and the remaining unlabeled data, significantly improving the performance of locating microscopic bumps and defects such as voids in the defect detection process. We apply our semi-supervised learning approach to various buried structures such as memory bumps and logic bumps. We briefly describe our fabrication and scanning process and thereafter explain in detail our approach to locating these different structures in 3D scans. We extract virtual 2D slices from the 3D scans and perform semi-supervised object detection and image segmentation to classify each pixel of these individual slices into solders, voids, Cu-pillars, and Cu-pads. We compare our approach with state-of-the-art fully supervised techniques and perform a thorough analysis to discuss the advantages and disadvantages of our approach in both the object detection and image segmentation steps.
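The abstract does not specify which semi-supervised scheme is used, but a common way to leverage unlabeled slices is pseudo-labeling: confident predictions on unlabeled data become training targets. The following PyTorch sketch illustrates that pattern under stated assumptions; the toy model, the four classes (solder, void, Cu-pillar, Cu-pad), the confidence threshold, and the random tensors are all illustrative, not the authors' framework.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

model = nn.Sequential(               # toy per-pixel classifier over 4 classes
    nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
    nn.Conv2d(16, 4, kernel_size=1))  # solder, void, Cu-pillar, Cu-pad
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
CONF_THRESH = 0.9                    # only trust confident pseudo-labels

def train_step(images, targets, ignore_index=-1):
    opt.zero_grad()
    loss = F.cross_entropy(model(images), targets, ignore_index=ignore_index)
    loss.backward()
    opt.step()
    return loss.item()

def pseudo_label(unlabeled):
    """Predict labels for unlabeled slices; mask out low-confidence pixels."""
    with torch.no_grad():
        probs = model(unlabeled).softmax(dim=1)
        conf, labels = probs.max(dim=1)
        labels[conf < CONF_THRESH] = -1   # ignored by the loss
    return labels

# One semi-supervised round: a supervised pass on annotated 2D slices,
# then a pass on pseudo-labeled slices extracted from unlabeled 3D scans.
labeled_x = torch.randn(8, 1, 64, 64)
labeled_y = torch.randint(0, 4, (8, 64, 64))
unlabeled_x = torch.randn(8, 1, 64, 64)
train_step(labeled_x, labeled_y)
train_step(unlabeled_x, pseudo_label(unlabeled_x))
```

Masking uncertain pixels with the loss's `ignore_index` is what keeps noisy pseudo-labels from dominating training when only a small labeled set is available.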