ISBN: (print) 9789819620708; 9789819620715
Reconstructing hyperspectral images (HSIs) from Coded Aperture Snapshot Spectral Imaging (CASSI) measurements is an important yet challenging task. The core issue lies in recovering a reliable and detailed 3D HSI cube from a 2D measurement. The deep unfolding framework, which alternates between solving data subproblems and prior subproblems, has made satisfactory progress on the HSI reconstruction task. However, current methods do not fully utilize the spatial-spectral priors of HSIs. To solve this problem and further enhance the spectral-spatial representation capabilities in the prior subproblems, we propose a Spatial-Spectral Correlation Transformer Based on Deep Unfolding Framework (SSCDUF). Specifically, we introduce a multi-scale Spatial-Spectral Correlation Fusion Transformer (SSCT) module that simultaneously utilizes the similarity and correlation of spectral features as well as local and non-local spatial features, jointly using spatial and spectral priors to enhance feature representation. Moreover, we propose an Adaptive Aggregation Skip Connection (AASC) module to adaptively aggregate spatial and spectral features at multiple scales. Extensive experimental results on both simulated and real scenes demonstrate that SSCDUF outperforms state-of-the-art methods in terms of quantitative metrics while maintaining low parameter costs and runtime.
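The alternation between a data subproblem and a prior subproblem described in this abstract can be illustrated with a minimal sketch. This is not the SSCDUF method: the sensing matrix `A`, the function names, and the stage count are assumptions made for illustration, and a simple soft-thresholding step stands in for the learned transformer prior.

```python
import numpy as np

def soft_threshold(x, tau):
    """Stand-in prior step (hypothetical); a learned network would replace this."""
    return np.sign(x) * np.maximum(np.abs(x) - tau, 0.0)

def unfolded_reconstruction(y, A, num_stages=200, tau=0.01):
    """Run a fixed number of stages, each with a data step and a prior step."""
    # Step size from the squared spectral norm of A (Lipschitz constant of the gradient).
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = A.T @ y  # simple back-projection initialization
    for _ in range(num_stages):
        x = x - step * (A.T @ (A @ x - y))  # data subproblem: gradient step on ||Ax - y||^2 / 2
        x = soft_threshold(x, step * tau)   # prior subproblem: sparsity-promoting shrinkage
    return x

# Toy problem (assumed sizes): recover a sparse vector from fewer measurements.
rng = np.random.default_rng(0)
n, m, k = 64, 32, 4
A = rng.standard_normal((m, n)) / np.sqrt(m)
x_true = np.zeros(n)
x_true[rng.choice(n, size=k, replace=False)] = 1.0
y = A @ x_true

x_hat = unfolded_reconstruction(y, A)
initial_residual = np.linalg.norm(A @ (A.T @ y) - y)
final_residual = np.linalg.norm(A @ x_hat - y)
```

In a deep unfolding network the per-stage step size and the prior step would be trained end to end; here both are fixed so the sketch stays self-contained.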
With the rapid development of digital technology and deep learning, recovering 3D scene information and reconstructing human bodies from a single image has become a focal point of research in computer vision and compu...
ISBN: (print) 9783031776090
The proceedings contain 23 papers. The special focus in this conference is on Skin Imaging Collaboration, Interpretability of Machine Intelligence in Medical Image Computing, Embodied AI and Robotics for HealTHcare Workshop and MICCAI Workshop on Distributed, Collaborative and Federated Learning. The topics include: DeCaF 2024 Preface; i2M2Net: Inter/Intra-modal Feature Masking Self-distillation for Incomplete Multimodal Skin Lesion Diagnosis; From Majority to Minority: A Diffusion-Based Augmentation for Underrepresented Groups in Skin Lesion Analysis; Segmentation Style Discovery: Application to Skin Lesion Images; A Vision Transformer with Adaptive Cross-Image and Cross-Resolution Attention; Lesion Elevation Prediction from Skin Images Improves Diagnosis; DWARF: Disease-Weighted Network for Attention Map Refinement; PIPNet3D: Interpretable Detection of Alzheimer in MRI Scans; Detecting Unforeseen Data Properties with Diffusion Autoencoder Embeddings Using Spine MRI Data; Interpretability of Uncertainty: Exploring Cortical Lesion Segmentation in Multiple Sclerosis; TextCAVs: Debugging Vision Models Using Text; Evaluating Visual Explanations of Attention Maps for Transformer-Based Medical Imaging; Exploiting XAI Maps to Improve MS Lesion Segmentation and Detection in MRI; EndoGS: Deformable Endoscopic Tissues Reconstruction with Gaussian Splatting; VISAGE: Video Synthesis Using Action Graphs for Surgery; A Review of 3D Reconstruction Techniques for Deformable Tissues in Robotic Surgery; SurgTrack: CAD-Free 3D Tracking of Real-World Surgical Instruments; MUTUAL: Towards Holistic Sensing and Inference in the Operating Room; Complex-Valued Federated Learning with Differential Privacy and MRI Applications; Enhancing Privacy in Federated Learning: Secure Aggregation for Real-World Healthcare Applications; Federated Impression for Learning with Distributed Heterogeneous Data; A Federated Learning-Friendly Approach for Parameter-Efficient Fine-Tuning of SAM in 3D Segmentation; Probing the Effic
ISBN: (digital) 9798350368741
ISBN: (print) 9798350368758
Compressive imaging (CI) consists of reconstructing images from incomplete observed data. The reconstruction process involves solving an ill-posed inverse problem that is highly dependent on the number of real measurements, with a greater number of measurements typically leading to more accurate reconstructions. Due to their ability to learn data distributions, diffusion models (DMs) have emerged as promising techniques for various inverse problems. DMs mainly solve inverse problems by conditioning the generation process on the acquired measurements. In this work, we introduce a new approach to improve this conditioning by exploiting synthetic measurements, which come from a synthetic sensing matrix. Synthetic measurements are estimated from real data via a neural network. The combined real and synthetic measurements form an augmented set, which is input into the conditional DM to enhance reconstruction capacity. Computational experiments demonstrate that augmenting measurements with the conditional DM improves performance compared to using only real measurements.
Purpose: Recent advancements in generative adversarial networks (GANs) have demonstrated substantial potential in medical image processing. Despite this progress, reconstructing images from incomplete data remains a c...
ISBN: (digital) 9798331512248
ISBN: (print) 9798331512255
Stable Fast 3D is widely recognized for its remarkable capacity to generate 3D models from a single 2D image in as little as 0.5 seconds. This can be further improved by utilizing text-to-image latent diffusion, especially the inpainting technique in Stable Diffusion. The purpose of this work is to improve the quality and fidelity of generated 3D models by allowing user-guided customizations during the reconstruction process. Inpainting addresses two significant challenges, incomplete or noisy input data and visualization differences, by completing unobserved areas and improving input textures. Inpainting enables users to iteratively modify their inputs and potentially obtain more coherent and aesthetically pleasing final 3D models. Experimental results indicate that incorporating inpainting with Stable Fast 3D increases model precision while retaining the original speed of model generation. The method proposed in this paper expands the use of 3D reconstruction techniques to other domains, including gaming, virtual reality, and product design, by providing a solution that is more interactive and makes high-quality 3D assets easier to create.
ISBN: (digital) 9798331533816
ISBN: (print) 9798331533823
Reconstructing high-quality computed tomography (CT) images from limited-angle projections is a challenging and ill-posed problem, often resulting in severe artifacts and loss of structural details. Traditional analytical methods, such as Filtered Back Projection (FBP), struggle with incomplete data, while existing deep learning approaches face limitations in generalization and reliance on extensive paired datasets. To address these challenges, we propose a novel Generative Adversarial Network (GAN)-based framework comprising a U-Net-inspired generator enhanced with residual blocks and self-attention mechanisms, coupled with a PatchGAN discriminator. The generator effectively captures long-range dependencies and structural features critical for artifact removal and reconstruction accuracy, while the PatchGAN discriminator enforces local texture realism. Additionally, a perceptual loss derived from a pretrained VGG network preserves fine anatomical details and high-level semantic consistency. Extensive evaluations on clinical datasets demonstrate the superiority of our method over state-of-the-art techniques. Quantitative metrics, including PSNR, SSIM, and MSE, confirm significant improvements, and qualitative results showcase the effective suppression of artifacts and recovery of fine structural details.
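Two of the quantitative metrics named in this abstract, MSE and PSNR, follow from standard definitions and can be sketched in a few lines. The function names, the `data_range` parameter, and the toy arrays below are assumptions for illustration and are not taken from the paper; SSIM is omitted because it needs a windowed implementation.

```python
import numpy as np

def mse(reference, estimate):
    """Mean squared error between two images of the same shape."""
    reference = np.asarray(reference, dtype=np.float64)
    estimate = np.asarray(estimate, dtype=np.float64)
    return float(np.mean((reference - estimate) ** 2))

def psnr(reference, estimate, data_range=1.0):
    """Peak signal-to-noise ratio in dB; data_range is the maximum possible pixel value."""
    err = mse(reference, estimate)
    if err == 0.0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(data_range ** 2 / err))

# Toy images (hypothetical): a constant error of 0.1 on a [0, 1] intensity scale.
reference = np.zeros((4, 4))
estimate = np.full((4, 4), 0.1)
example_mse = mse(reference, estimate)
example_psnr = psnr(reference, estimate)
```

A constant error of 0.1 gives an MSE of about 0.01 and, with `data_range=1.0`, a PSNR of about 20 dB, which is a quick sanity check on the formula.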
ISBN: (print) 0819463957
The proceedings contain 20 papers. The topics discussed include: turbulence profiling using extended objects for slope detection and ranging (SLODAR); mitigating atmospheric effects in high-resolution infrared surveillance imagery with bispectral speckle imaging; restoration of nonuniformly wrapped images using accurate frame by frame shiftmap accumulation; three-dimensional image reconstruction in variable density acoustic tomography; imaging with singular electromagnetic beam; comparative study of projection/back-projection schemes in cryo-EM tomography; intensity diffraction tomography with a novel scanning protocol; quantifying and correcting motion artifacts in MRI; the optimal reconstruction from blurred and nonuniformly sampled data based on the optimum discrete approximation minimizing various worst-case measures of error; and analysis of gravel river beds using three-dimensional laser scanning.
ISBN: (digital) 9798331510831
ISBN: (print) 9798331510848
As of 2023, a record 117 million people have been displaced worldwide, more than double the number from a decade ago [22]. Of these, 32 million are refugees under the UNHCR's mandate, with 8.7 million residing in refugee camps. A critical issue faced by these populations is the lack of access to electricity, with 80% of the 8.7 million refugees and displaced persons in camps globally relying on traditional biomass for cooking and lacking reliable power for essential tasks such as cooking and charging phones. Often, the burden of collecting firewood falls on women and children, who frequently travel up to 20 kilometers into dangerous areas, increasing their vulnerability [7]. Electricity access could significantly alleviate these challenges, but a major obstacle is the lack of the accurate power grid infrastructure maps needed for energy access planning, particularly in resource-constrained environments like refugee camps. Existing power grid maps are often outdated, incomplete, or dependent on costly, complex technologies, limiting their practicality. To address this issue, we present PGRID, a novel application-based approach that uses high-resolution aerial imagery to detect electrical poles and segment electrical lines, creating precise power grid maps. PGRID was tested in the Turkana region of Kenya, specifically the Kakuma and Kalobeyei Camps, covering 84 km² and housing over 200,000 residents. Our findings show that PGRID delivers high-fidelity power grid maps, especially in unplanned settlements, with F1-scores of 0.71 and 0.82 for pole detection and line segmentation, respectively. This study highlights a practical application for leveraging open data and limited labels to improve power grid mapping in unplanned settlements, where the growing number of displaced persons urgently needs sustainable energy infrastructure solutions.
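For readers unfamiliar with the F1-score reported in this abstract, the following minimal sketch shows how it is computed from detection counts using the standard definition. The function name and the counts are hypothetical and are not results from the PGRID study.

```python
def f1_score(true_positives, false_positives, false_negatives):
    """Harmonic mean of precision and recall, computed from raw counts."""
    if true_positives == 0:
        return 0.0  # no correct detections, so precision and recall are both zero
    precision = true_positives / (true_positives + false_positives)
    recall = true_positives / (true_positives + false_negatives)
    return 2.0 * precision * recall / (precision + recall)

# Hypothetical counts: 8 poles correctly detected, 2 false alarms, 2 poles missed.
example_f1 = f1_score(true_positives=8, false_positives=2, false_negatives=2)
```

With these counts, precision and recall are both 0.8, so the F1-score is also 0.8.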
ISBN: (digital) 9798350368741
ISBN: (print) 9798350368758
Electroencephalography (EEG) is vital for brain-computer interfaces (BCIs) because it is non-invasive and offers high temporal resolution, but it faces challenges such as data scarcity and the need for extensive labeling. Significant inter-individual variability in EEG signals further limits model generalization. Concurrently, self-supervised pre-training, particularly through masked modeling, is gaining traction in time series analysis as a way to mitigate labeling costs. Although this method involves reconstructing masked signals from the unmasked series, random masking can disrupt critical temporal variations, complicating effective representation learning. We thus introduce SSL-MEMI, a novel self-supervised contrastive learning framework for masked EEG motor imagery modeling, integrating Domain Adaptive Alignment (DAA) and a Multi-View Temporal-Spatial Attention (MTSA) module to effectively handle EEG variability. This framework utilizes manifold-based masking to reconstruct original sequences from masked series, thereby enhancing classification accuracy. When tested on the BCI Competition IV and High Gamma datasets, SSL-MEMI outperforms existing methods, achieving top accuracies and demonstrating superior domain adaptation through reduced global $\mathcal{A}$-distance scores. This study advances EEG classification and indicates broader applications for self-supervised learning in biomedical signal processing. The source code is available at https://***/KunKun-Zhang/***.