Whole-slide image (WSI) classification methods play a crucial role in tumor diagnosis. Most of them use hematoxylin and eosin (H&E) stained images, while Immunohistochemistry (IHC) staining provides molecular mark...
ISBN: (print) 9798350390155; 9798350390162
Parametric 3D models have enabled a wide variety of computer vision and graphics tasks, such as modeling human faces, bodies and hands. In 3D face modeling, 3DMM is the most widely used parametric model, but it cannot generate fine geometric details solely from identity and expression inputs. To tackle this limitation, we propose a neural parametric model named DNPM for facial geometric details, which uses a deep neural network to extract latent codes from facial displacement maps encoding details and wrinkles. Built upon DNPM, a novel 3DMM named Detailed3DMM is proposed, which augments traditional 3DMMs by synthesizing facial details from the identity and expression inputs alone. Moreover, we show that DNPM and Detailed3DMM can facilitate two downstream applications: speech-driven detailed 3D facial animation and 3D face reconstruction from a degraded image. Extensive experiments show the usefulness of DNPM and Detailed3DMM and the progressiveness of the two proposed applications.
ISBN: (print) 9789819786848; 9789819786855
The Transformer has achieved significant progress in light field image super-resolution (LFSR) due to its long-range dependency learning ability for inter-intra view feature aggregation. However, the locality information of each sub-aperture view is ignored in intra-view and inter-view aggregation with the Transformer, hampering high-quality light field image reconstruction. To this end, we propose a global-to-local aggregation approach termed Focal Aggregation for LFSR. In particular, Focal Aggregation includes two strategies: inter-view global-to-local aggregation (InterG2L) and intra-view global-to-local aggregation (IntraG2L). InterG2L is proposed to obtain complementary information from different views. IntraG2L is developed to extract efficient representations of a single sub-aperture view. InterG2L and IntraG2L are organized in a cascade so that the global information of the input can be gathered for each sub-aperture image in a coarse-to-fine manner. Meanwhile, we also develop a global-to-local hierarchical feature aggregation approach named HierG2L, which enhances the last hierarchical feature used for light field reconstruction according to the input. Based on these three global-to-local aggregation strategies, we construct a focal aggregation transformer (FAT) for LFSR. Experiments are performed on commonly used LFSR benchmarks. The results demonstrate that FAT outperforms other leading methods on synthesized and real data.
Managing chronic wounds is a global challenge that can be alleviated by the adoption of automatic systems for clinical wound assessment from consumer-grade videos. While 2D image analysis approaches are insufficient f...
ISBN: (print) 9789819913534; 9789819913541
The image reconstruction of electrical impedance tomography (EIT) is highly ill-posed and nonlinear, and the reconstructed images tend to have artifacts due to noise in the measurement system. Although deep neural networks have demonstrated great potential to remove artifacts from initial conductivity images, the interpretability and generalization ability of the network are difficult to guarantee. A deep learning structure, namely a conditional generative adversarial network with an attention mechanism and residual connection (CGAN-AMR), is proposed for EIT image reconstruction. The attention mechanism is utilized in the generator to learn channel dependencies, and a residual connection is employed by the discriminator to improve training efficiency. As a result, the accuracy and interpretability of CGAN-AMR are improved compared with CNN and CGAN methods in the EIT imaging task. The imaging results indicate that the CGAN-AMR structure can effectively improve the clarity of the reconstructed lung images, and that the location and boundary of lung lesions are restored accurately.
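The abstract does not say which form of channel attention the generator uses. As a rough illustration only, the sketch below shows one common variant (a squeeze-and-excitation style gate) in NumPy. The function name `channel_attention`, the weight matrices `w1` and `w2`, and the reduction ratio are all hypothetical and are not taken from CGAN-AMR.

```python
import numpy as np


def channel_attention(features, w1, w2):
    """Re-weight the channels of a feature map (illustrative sketch only).

    features: array of shape (C, H, W)
    w1:       array of shape (C // r, C), hypothetical bottleneck weights
    w2:       array of shape (C, C // r), hypothetical expansion weights
    """
    # Squeeze: one global average per channel, shape (C,).
    squeezed = features.mean(axis=(1, 2))
    # Excite: bottleneck with ReLU, then a sigmoid gate per channel.
    hidden = np.maximum(w1 @ squeezed, 0.0)
    gate = 1.0 / (1.0 + np.exp(-(w2 @ hidden)))
    # Scale every channel by its gate value in (0, 1).
    return features * gate[:, None, None]


# Toy input: 4 channels of 2x2 features, reduction ratio r = 2.
rng = np.random.default_rng(0)
feats = rng.normal(size=(4, 2, 2))
out = channel_attention(feats, rng.normal(size=(2, 4)), rng.normal(size=(4, 2)))
```

In a trained network the two weight matrices would be learned, so the gate would encode which channels matter for a given input; here they are random and serve only to show the data flow.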
ISBN: (print) 9783031723469; 9783031723476
Image classification models often demonstrate unstable performance in real-world applications due to variations in image information, driven by differing visual perspectives of subject objects and lighting discrepancies. To mitigate these challenges, existing studies commonly incorporate additional modal information matching the visual data to regularize the model's learning process, enabling the extraction of high-quality visual features from complex image regions. Specifically, in the realm of multimodal learning, cross-modal alignment is recognized as an effective strategy, harmonizing different modal information by learning a domain-consistent latent feature space for visual and semantic features. However, this approach may face limitations due to the heterogeneity between multimodal information, such as differences in feature distribution and structure. To address this issue, we introduce a Multimodal Alignment and Reconstruction Network (MARNet), designed to enhance the model's resistance to visual noise. Importantly, MARNet includes a cross-modal diffusion reconstruction module for smoothly and stably blending information across different domains. Experiments conducted on two benchmark datasets, Vireo-Food172 and Ingredient-101, demonstrate that MARNet effectively improves the quality of image information extracted by the model. It is a plug-and-play framework that can be rapidly integrated into various image classification frameworks, boosting model performance.
The ALICE experiment at the CERN LHC (Large Hadron Collider) is undertaking a major upgrade during the LHC Long Shutdown 2 in 2019-2021, which includes a new computing system called O-2 (Online-Offline). The raw data input from the ALICE detectors will increase a hundredfold, up to 3.5 TB/s. By reconstructing the data online, it will be possible to compress the data stream down to 100 GB/s before storing it permanently. The O-2 software is a message-passing system. It will run on approximately 500 computing nodes performing reconstruction, compression, calibration and quality control of the received data stream. As a direct consequence of having a distributed computing system, locally generated data might be incomplete and could require merging to obtain complete results. This paper presents the O-2 Mergers, the software designed to match and combine partial data into complete objects synchronously to data taking. Based on a detailed study and results of extensive benchmarks, a qualitative and quantitative comparison of different merging strategies considered to reach the final design and implementation of the software is discussed.
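To make the merging idea concrete, here is a minimal Python sketch of combining partial results from several nodes into one complete object. It is not the O-2 Mergers code or API: the function `merge_partial_histograms`, the dictionary-based histograms and the bin names are all hypothetical, and the real system works on a message-passing data stream rather than an in-memory list.

```python
from collections import Counter
from typing import Dict, Iterable, Mapping


def merge_partial_histograms(partials: Iterable[Mapping[str, int]]) -> Dict[str, int]:
    """Sum bin counts from partial histograms into one complete histogram."""
    merged = Counter()
    for partial in partials:
        # Counter.update with a mapping adds counts bin by bin.
        merged.update(partial)
    return dict(merged)


# Hypothetical partial results produced on three separate nodes.
node_outputs = [
    {"bin_0": 4, "bin_1": 1},
    {"bin_0": 2, "bin_2": 7},
    {"bin_1": 3},
]
complete = merge_partial_histograms(node_outputs)
```

Summation is only one possible merge operation; it suits additive objects such as histograms, whereas other object types would need their own combination rule.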
In order to address the challenges of inaccurate crop phenotypic parameters and unclear crop growth status in crop yield prediction, this study focuses on tomato fruits in greenhouse production environments. It delves...
ISBN: (print) 9781510671553; 9781510671546
We investigated the impact of a CNN-based deep-learning (DL) image de-blurring algorithm on coronary artery calcium (CAC) detection performance in conventional CT imaging. Our approach comprises first de-noising the image with a state-of-the-art CNN-based image de-noising algorithm. With improved SNR, it is then possible to sharpen the image with a CNN-based image de-blurring algorithm. We train such networks using natural images, i.e., a large set of diverse photographs. The de-noising strength in the final image can be adjusted by blending back the estimated noise from the first step to the desired degree. To assess the impact of the de-blurring algorithm, we scanned an anthropomorphic phantom containing 100 small calcifications on a CT system using a CAC scoring protocol. Data were acquired at clinical and high dose, and subsequently reconstructed with and without the DL de-blurring algorithm, using 25% of the maximum de-noising strength. For each small CAC, detectability was defined as the ability to calculate an Agatston score (at least 3 adjacent voxels exceeding 130 HU). For the high-dose scans, CAC detectability increased from 39% for the standard reconstruction to 49% with de-blurring. The same 39% CAC detectability obtained at high dose without de-blurring was reached at routine dose with de-blurring. In this work, we also show some visual impressions of applying our DL de-blurring method to clinical cardiac data.
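The two quantitative rules in this abstract, blending back estimated noise and the detectability criterion, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the blend is assumed to be linear, "adjacent" is assumed to mean face-connected voxels, and the function names `blend_noise` and `is_detectable` are hypothetical.

```python
import numpy as np
from scipy import ndimage


def blend_noise(denoised, estimated_noise, strength):
    """Apply a fraction `strength` (0..1) of the full de-noising.

    Assumption: a linear blend. strength=1 keeps the fully de-noised image,
    strength=0 adds all of the estimated noise back.
    """
    return denoised + (1.0 - strength) * estimated_noise


def is_detectable(roi_hu, threshold_hu=130.0, min_voxels=3):
    """True if the ROI has at least `min_voxels` connected voxels above threshold."""
    mask = roi_hu > threshold_hu
    # Default structuring element: face-connected neighbours only.
    labels, n_components = ndimage.label(mask)
    if n_components == 0:
        return False
    # Component sizes, skipping label 0 (background).
    sizes = np.bincount(labels.ravel())[1:]
    return bool(sizes.max() >= min_voxels)


# 25% de-noising strength: 75% of the estimated noise is blended back.
blended = blend_noise(np.array([100.0]), np.array([8.0]), strength=0.25)

# Three voxels in a row above 130 HU: detectable.
roi_a = np.zeros((4, 4))
roi_a[1, 1:4] = 200.0

# Two adjacent voxels plus one isolated voxel: not detectable.
roi_b = np.zeros((4, 4))
roi_b[0, 0] = roi_b[0, 1] = roi_b[2, 3] = 200.0
```

Under this sketch, the detectability percentage reported in the abstract would correspond to the fraction of the 100 phantom calcifications for which `is_detectable` returns True.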
A method for tomographic reconstruction of bi-modal data sets using cross-modal regularization was developed. The method constrains the Simultaneous Algebraic Reconstruction Technique by using clustering-based segment...