ISBN: 9783031776090 (print)
The proceedings contain 23 papers. The conference's special focus is on the Skin Imaging Collaboration, Interpretability of Machine Intelligence in Medical Image Computing, the Embodied AI and Robotics for HealTHcare Workshop, and the MICCAI Workshop on Distributed, Collaborative and Federated Learning. The topics include: DeCaF 2024 Preface; I2M2Net: Inter/Intra-modal Feature Masking Self-distillation for Incomplete Multimodal Skin Lesion Diagnosis; From Majority to Minority: A Diffusion-Based Augmentation for Underrepresented Groups in Skin Lesion Analysis; Segmentation Style Discovery: Application to Skin Lesion Images; A Vision Transformer with Adaptive Cross-Image and Cross-Resolution Attention; Lesion Elevation Prediction from Skin Images Improves Diagnosis; DWARF: Disease-Weighted Network for Attention Map Refinement; PIPNet3D: Interpretable Detection of Alzheimer in MRI Scans; Detecting Unforeseen Data Properties with Diffusion Autoencoder Embeddings Using Spine MRI Data; Interpretability of Uncertainty: Exploring Cortical Lesion Segmentation in Multiple Sclerosis; TextCAVs: Debugging Vision Models Using Text; Evaluating Visual Explanations of Attention Maps for Transformer-Based Medical Imaging; Exploiting XAI Maps to Improve MS Lesion Segmentation and Detection in MRI; EndoGS: Deformable Endoscopic Tissues Reconstruction with Gaussian Splatting; VISAGE: Video Synthesis Using Action Graphs for Surgery; A Review of 3D Reconstruction Techniques for Deformable Tissues in Robotic Surgery; SurgTrack: CAD-Free 3D Tracking of Real-World Surgical Instruments; MUTUAL: Towards Holistic Sensing and Inference in the Operating Room; Complex-Valued Federated Learning with Differential Privacy and MRI Applications; Enhancing Privacy in Federated Learning: Secure Aggregation for Real-World Healthcare Applications; Federated Impression for Learning with Distributed Heterogeneous Data; A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation; Probing the Effic...
ISBN: 9789819985364; 9789819985371 (print)
Unsupervised Image-to-Image Translation (UNIT) has gained significant attention because it is a powerful tool for data augmentation. UNIT aims to generate a visually pleasing image by synthesizing one image's content with another's style. However, current methods cannot ensure that the style of the generated image closely matches that of the input style image. To overcome this issue, we present a new two-stage framework, called Unsupervised Image-to-Image Translation with Style Consistency (SC-UNIT), for improving the style consistency between the style-domain image and the generated image. The key idea of SC-UNIT is to build a style consistency module that prevents the learned style from deviating from the input one. Specifically, in the first stage, SC-UNIT trains a content encoder to extract multiple-layer content features, in which the last layer's feature represents the abstract domain-shared content. In the second stage, we train a generator to integrate the content features with the style feature to generate a new image. During the generation process, dynamic skip connections and multiple-layer content features are used to build multiple-level content correspondences. Furthermore, we design a style reconstruction loss to make the style of the generated image consistent with that of the input style image. Extensive experimental results show that SC-UNIT outperforms state-of-the-art methods in image quality, style diversity, and style consistency, even for domains with significant visual differences. The code is available at https://***/GZHU-DVL/SC-UNIT.
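The abstract does not say how the style reconstruction loss is computed, so the following is only a minimal sketch of the general idea: encode the style of the generated image, encode the style of the input style image, and penalize the difference. Everything here is an assumption made for illustration, including the use of per-channel mean and standard deviation as a stand-in "style code" and the hypothetical names `style_code` and `style_reconstruction_loss`; SC-UNIT itself may use a learned style encoder and a different distance.

```python
from statistics import fmean, pstdev
from typing import List, Tuple

# An image is modelled as a list of channels, each a flat list of pixel values.
Image = List[List[float]]


def style_code(image: Image) -> List[Tuple[float, float]]:
    """Hypothetical style descriptor: (mean, std) of every channel.

    This is an assumed stand-in for a learned style encoder.
    """
    return [(fmean(channel), pstdev(channel)) for channel in image]


def style_reconstruction_loss(generated: Image, style_image: Image) -> float:
    """Mean L1 distance between the style codes of the two images."""
    gen_code = style_code(generated)
    ref_code = style_code(style_image)
    diffs = [
        abs(g_mean - r_mean) + abs(g_std - r_std)
        for (g_mean, g_std), (r_mean, r_std) in zip(gen_code, ref_code)
    ]
    return fmean(diffs)


# Toy two-channel images with four pixels per channel.
style_img = [[0.2, 0.4, 0.6, 0.8], [0.1, 0.1, 0.9, 0.9]]
generated_img = [[0.3, 0.3, 0.7, 0.7], [0.5, 0.5, 0.5, 0.5]]

print(style_reconstruction_loss(generated_img, style_img))
```

In a training loop, a term of this kind would be added to the generator objective so that the loss falls to zero only when the generated image carries the same style statistics as the style input.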
ISBN: 9789819785070; 9789819785087 (print)
This work mainly addresses the challenges in 3D human pose and shape estimation from real partial point clouds. Existing methods for 3D human estimation from point clouds usually generalize poorly to real data because of self-occlusion, random noise, and the domain gap between real and synthetic data. In this paper, we propose a pose-aware auto-augmentation framework for 3D human pose and shape estimation from partial point clouds. Specifically, we design an occlusion-aware module for the estimator network that obtains refined features to accurately regress human pose and shape parameters from partial point clouds, even if the point clouds are self-occluded. Based on the pose parameters and global point-cloud features from the estimator network, we design a learnable augmentor network that drives and deforms real data to enrich data diversity during the training of the estimator network. To guide the augmentor network to generate challenging augmented samples, we adopt an adversarial learning strategy based on the error feedback of the estimator. The experimental results on real and synthetic data demonstrate that the proposed approach accurately estimates 3D human pose and shape from partial point clouds and outperforms prior works in terms of reconstruction accuracy.
ISBN: 9781713871088 (print)
In many real-world inverse problems, only incomplete measurement data are available for training, which can pose a problem for learning a reconstruction function. Indeed, unsupervised learning using a fixed incomplete measurement process is impossible in general, as there is no information in the nullspace of the measurement operator. This limitation can be overcome by using measurements from multiple operators. While this idea has been successfully applied in various applications, a precise characterization of the conditions for learning is still lacking. In this paper, we fill this gap by presenting necessary and sufficient conditions for learning the underlying signal model needed for reconstruction. These conditions describe the interplay between the number of distinct measurement operators, the number of measurements per operator, the dimension of the model, and the dimension of the signals. Furthermore, we propose a novel and conceptually simple unsupervised learning loss which only requires access to incomplete measurement data and achieves a performance on par with supervised learning when the sufficient condition is satisfied. We validate our theoretical bounds and demonstrate the advantages of the proposed unsupervised loss compared to previous methods via a series of experiments on various imaging inverse problems, such as accelerated magnetic resonance imaging, compressed sensing, and image inpainting.
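The nullspace argument can be illustrated with a toy inpainting setting. The sketch below is an assumed example, not the paper's method or its conditions: each measurement operator is a 0/1 mask, so its nullspace is spanned by the coordinates it never observes, and the information missing from a whole set of operators is the intersection of those nullspaces. The helper names `nullspace_coords` and `joint_nullspace` are hypothetical.

```python
from typing import List, Set


def nullspace_coords(mask: List[int]) -> Set[int]:
    """Coordinates a 0/1 inpainting mask never observes.

    For a masking operator these coordinates span its nullspace.
    """
    return {i for i, m in enumerate(mask) if m == 0}


def joint_nullspace(masks: List[List[int]]) -> Set[int]:
    """Coordinates unobserved by every mask (intersection of nullspaces)."""
    unseen = set(range(len(masks[0])))
    for mask in masks:
        unseen &= nullspace_coords(mask)
    return unseen


# One fixed operator: some coordinates are never measured.
single = [[1, 1, 0, 0, 1, 0]]

# Several operators: each is incomplete, but together they cover the signal.
multiple = [
    [1, 1, 0, 0, 1, 0],
    [0, 1, 1, 0, 0, 1],
    [1, 0, 0, 1, 1, 0],
]

print(sorted(joint_nullspace(single)))    # coordinates with no information
print(sorted(joint_nullspace(multiple)))  # empty when the masks jointly cover
```

An empty joint nullspace only shows that every coordinate is observed by some operator. The abstract's necessary and sufficient conditions also involve the number of measurements per operator and the dimension of the signal model, which this toy example does not capture.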
To address the problem of incomplete Multi-view Stereo (MVS) reconstruction, the initial depth and loss function of the depth residual iterative network are investigated, and a new multi-view stereo reconstruction net...
Computer vision is a very wide field, and although research is advancing rapidly, many areas of research remain incomplete. For this reason, we are interested in the 3D reconstruction field, which consists of cre...
ISBN: 9783031263125; 9783031263132 (print)
Panoramic images have a large 360-degree field of view and provide rich contextual information for object detection; they are widely used in virtual reality, augmented reality, scene understanding, etc. However, existing methods for object detection on panoramic images still have some problems. When 360-degree content is converted to the projection plane, the geometric distortion introduced by the projection model prevents the neural network from extracting features efficiently, and objects at the boundary of the projected image are incomplete. To solve these problems, we propose a novel two-stage detection network, RepF-Net, which uses multiple distortion-aware convolution modules to handle geometric distortion while performing effective feature extraction, and uses a non-maximum fusion algorithm to fuse the content of detected objects in the post-processing stage. Our unified distortion-aware convolution modules can handle distortions from geometric transforms and projection models; in our network, they address the geometric distortion caused by equirectangular projection and stereographic projection. Our non-maximum fusion algorithm fuses the content of detected objects to deal with incomplete object content separated by the projection boundary. Experimental results show that RepF-Net outperforms previous state-of-the-art methods by 6% on mAP. Based on RepF-Net, we present an implementation of a 3D object detection and scene layout reconstruction application.
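The abstract names a non-maximum fusion step but gives no algorithmic details, so the following is a guess at the general shape of such a step rather than RepF-Net's actual procedure. It assumes axis-aligned boxes in a single image plane and an assumed IoU threshold; where non-maximum suppression would discard overlapping lower-scoring boxes, this sketch merges them into one enclosing box so that the parts of a split object are kept. The function names and the threshold value are hypothetical.

```python
from typing import List, Tuple

# (x1, y1, x2, y2, score); an assumed box format for this sketch.
Box = Tuple[float, float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def non_maximum_fusion(boxes: List[Box], iou_thr: float = 0.3) -> List[Box]:
    """Fuse overlapping detections instead of suppressing them.

    Boxes are visited in descending score order. Every remaining box that
    overlaps the current top box by at least iou_thr is merged into one
    enclosing box, which keeps the top box's score.
    """
    remaining = sorted(boxes, key=lambda b: b[4], reverse=True)
    fused: List[Box] = []
    while remaining:
        best = remaining.pop(0)
        x1, y1, x2, y2, score = best
        unmatched = []
        for other in remaining:
            if iou(best, other) >= iou_thr:
                # Grow the fused box to enclose the overlapping detection.
                x1, y1 = min(x1, other[0]), min(y1, other[1])
                x2, y2 = max(x2, other[2]), max(y2, other[3])
            else:
                unmatched.append(other)
        remaining = unmatched
        fused.append((x1, y1, x2, y2, score))
    return fused


detections = [
    (0, 0, 10, 10, 0.9),        # one part of an object
    (5, 0, 15, 10, 0.8),        # overlapping part of the same object
    (100, 100, 110, 110, 0.7),  # unrelated object
]
print(non_maximum_fusion(detections))
```

A panorama-specific version would also have to match detections across the left and right edges of the projected image, where the same object is split by the projection boundary; that wrap-around handling is omitted here.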
A deep learning approach will be used to recover ancient pictures that have suffered significant damage. Unlike typical reconstruction processes that are easily handled by supervised learning methods, real-world pictu...
Compressed sensing using a dictionary is known to be effective for reconstructing CT images from incomplete projection data (e.g., limited-angle CT and sparse-view CT), and its practical applications are increasing. Howe...
ISBN: 9783031164460; 9783031164453 (print)
Photoacoustic imaging (PAI) is a newly emerging bimodal imaging technology based on the photoacoustic effect; specifically, it uses sound waves caused by light absorption in a material to obtain 3D structure data noninvasively. PAI has attracted attention as a promising measurement technology for comprehensive clinical application and medical diagnosis. Because it requires exhaustively scanning an entire object and recording ultrasonic waves from various locations, it encounters two problems: a long imaging time and a huge data size. To reduce the imaging time, a common solution is to apply compressive sensing (CS) theory. CS can effectively accelerate the imaging process by reducing the number of measurements, but the data size is still large, and efficient compression of such incomplete data remains a problem. In this paper, we present the first attempt at direct compression of incomplete 3D PA observations, which simultaneously reduces the data acquisition time and alleviates the data size issue. Specifically, we first use a graph model to represent the incomplete observations. Then, we propose three coding modes and a reliability-aware rate-distortion optimization (RDO) to adaptively compress the data into sparse coefficients. Finally, we obtain a coded bit stream through entropy coding. We demonstrate the effectiveness of our proposed framework through both objective evaluation and subjective visual checking of real medical PA data captured from patients.