ISBN: 9781665487399 (digital); 9781665487399 (print)
Nonuniform haze on remote sensing images degrades image quality and hinders many high-level tasks. In this paper, we propose a Nonuniformly Dehaze Network for removing nonuniform haze from visible remote sensing images. To extract robust haze-aware features, we propose the Nonuniformly Excite (NE) module. Inspired by the well-known gather-excite attention module, the NE module works in a map-gather-excite manner. In the map operation, we use a proposed Dual Attention Dehaze block to extract locally enhanced features. In the gather operation, we use a strided deformable convolution to process features nonuniformly and extract nonlocal haze-aware features. In the excite operation, we apply pixel-wise attention between the locally enhanced features and the nonlocal haze-aware features to obtain finer haze-aware features. Moreover, we recursively embed NE modules in a multi-scale framework, which not only significantly reduces the network's parameters but also recursively delivers and fuses haze-aware features from higher levels, making learning more efficient. Experiments demonstrate that the proposed network performs favorably against state-of-the-art methods on both synthetic and real-world images.
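As a rough illustration of the map-gather-excite flow described above, the following PyTorch sketch wires the three operations together; the Dual Attention Dehaze block is replaced by a plain residual convolution, and the layer shapes, offset predictor, and stride are assumptions rather than the authors' actual design.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.ops import DeformConv2d

class NEModuleSketch(nn.Module):
    """Illustrative map-gather-excite block; layer choices and sizes are assumed."""
    def __init__(self, channels):
        super().__init__()
        # "Map": local enhancement, a simple residual stand-in for the
        # paper's Dual Attention Dehaze block.
        self.map_conv = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 3, padding=1),
        )
        # "Gather": strided deformable convolution producing nonlocal,
        # haze-aware context (offsets predicted from the local features).
        self.offset = nn.Conv2d(channels, 2 * 3 * 3, 3, stride=2, padding=1)
        self.gather = DeformConv2d(channels, channels, 3, stride=2, padding=1)
        # "Excite": pixel-wise attention between local and nonlocal features.
        self.excite = nn.Sequential(nn.Conv2d(channels, channels, 1), nn.Sigmoid())

    def forward(self, x):
        local = x + self.map_conv(x)                        # locally enhanced features
        gathered = self.gather(local, self.offset(local))   # nonlocal features, stride 2
        gathered = F.interpolate(gathered, size=local.shape[-2:],
                                 mode="bilinear", align_corners=False)
        return local * self.excite(gathered)                # finer haze-aware features
```

The paper additionally embeds NE modules recursively across scales; the sketch covers a single scale only.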
The field of neuromorphic vision is developing rapidly, and event cameras are finding their way into more and more applications. However, the data stream from these sensors is characterised by significant noise. In th...
ISBN: 9781665487399 (digital); 9781665487399 (print)
This paper addresses the problem of automatically detecting human skin in images without reliance on color information. A primary motivation of the work has been to achieve results that are consistent across the full range of skin tones, even while using a training dataset that is significantly biased toward lighter skin tones. Previous skin-detection methods have used color cues almost exclusively, and we present a new approach that performs well in the absence of such information. A key aspect of the work is dataset repair through augmentation, applied strategically during training, with the goal of color-invariant feature learning to enhance generalization. We have demonstrated the concept using two architectures, and experimental results show improvements in both precision and recall for most Fitzpatrick skin tones on the benchmark ECU dataset. We further tested the system on the RFW dataset to show that the proposed method performs much more consistently across different ethnicities, thereby reducing the chance of bias based on skin color. To demonstrate the effectiveness of our work, extensive experiments were performed on grayscale images as well as images obtained under unconstrained illumination and with artificial filters.
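A minimal sketch of the kind of color-removal augmentation the abstract alludes to, using generic torchvision transforms; the probabilities and jitter strengths are assumed values, not the paper's strategic schedule.

```python
from torchvision import transforms

# Training-time augmentation encouraging color-invariant features: with some
# probability, strip or perturb color so the network cannot rely on skin hue.
# Probabilities and jitter strengths below are illustrative assumptions.
color_invariant_train = transforms.Compose([
    transforms.RandomApply([transforms.ColorJitter(hue=0.4, saturation=0.6)], p=0.5),
    transforms.RandomGrayscale(p=0.5),   # sometimes drop color entirely
    transforms.ToTensor(),
])

# At evaluation time the same model can be fed plain grayscale images, since it
# was trained not to depend on color cues.
grayscale_eval = transforms.Compose([
    transforms.Grayscale(num_output_channels=3),
    transforms.ToTensor(),
])
```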
ISBN: 9781665487399 (digital); 9781665487399 (print)
Due to the high human cost of annotation, it is non-trivial to curate a large-scale medical dataset that is fully labeled for all classes of interest. Instead, it would be convenient to collect multiple small partially labeled datasets from different matching sources, where the medical images may have only been annotated for a subset of classes of interest. This paper offers an empirical understanding of an under-explored problem, namely partially supervised multi-label classification (PSMLC), where a multi-label classifier is trained with only partially labeled medical images. In contrast to the fully supervised counterpart, the partial supervision caused by medical data scarcity has non-trivial negative impacts on the model performance. A potential remedy could be augmenting the partial labels. Though vicinal risk minimization (VRM) has been a promising solution to improve the generalization ability of the model, its application to PSMLC remains an open question. To bridge the methodological gap, we provide the first VRM-based solution to PSMLC. The empirical results also provide insights into future research directions on partially supervised learning under data scarcity.
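VRM is most commonly instantiated as mixup; the sketch below shows one way mixup could be combined with a masked multi-label loss so that classes without annotations contribute no gradient. The interface (label mask, Beta parameter) is an assumption for illustration, not necessarily the paper's exact formulation.

```python
import torch
import torch.nn.functional as F

def mixup_partial_multilabel(x, y, mask, alpha=0.4):
    """Vicinal (mixup) augmentation for partially labeled multi-label batches.

    x:    (B, C, H, W) images
    y:    (B, K) binary labels; entries for unannotated classes are arbitrary
    mask: (B, K) 1 where the class is annotated, 0 where it is unknown
    alpha: Beta-distribution parameter (an assumed default, not the paper's).
    """
    lam = torch.distributions.Beta(alpha, alpha).sample().item()
    perm = torch.randperm(x.size(0))
    x_mix = lam * x + (1 - lam) * x[perm]
    y_mix = lam * y + (1 - lam) * y[perm]
    # A mixed label is only trusted where *both* mixed samples were annotated for it.
    mask_mix = mask * mask[perm]
    return x_mix, y_mix, mask_mix

def masked_bce(logits, y_mix, mask_mix):
    # Per-class BCE, zeroed out wherever the (mixed) label is unknown.
    loss = F.binary_cross_entropy_with_logits(logits, y_mix, reduction="none")
    return (loss * mask_mix).sum() / mask_mix.sum().clamp(min=1)
```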
ISBN: 9798350301298 (print)
Counterfactual explanations and adversarial attacks have a related goal: flipping output labels with minimal perturbations, regardless of their characteristics. Yet, adversarial attacks cannot be used directly from a counterfactual explanation perspective, as such perturbations are perceived as noise rather than as actionable and understandable image modifications. Building on the robust learning literature, this paper proposes an elegant method to turn adversarial attacks into semantically meaningful perturbations, without modifying the classifiers to be explained. The proposed approach hypothesizes that Denoising Diffusion Probabilistic Models are excellent regularizers for avoiding high-frequency and out-of-distribution perturbations when generating adversarial attacks. The paper's key idea is to build attacks through a diffusion model, which polishes them. This allows studying the target model regardless of its robustification level. Extensive experimentation shows the advantages of our counterfactual explanation approach over the current state of the art on multiple testbeds.
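A schematic sketch of the noise-then-denoise idea: the adversarial perturbation is optimized through a diffusion denoiser so that off-manifold, high-frequency changes are suppressed. The `denoiser` interface, step counts, and magnitudes below are hypothetical placeholders, not the paper's implementation.

```python
import torch
import torch.nn.functional as F

def diffusion_regularized_attack(x, target_label, classifier, denoiser,
                                 steps=30, step_size=0.02, noise_level=0.3):
    """Sketch of an adversarial-attack loop routed through a diffusion denoiser.

    `denoiser(noisy_image, noise_level)` is a *hypothetical* differentiable wrapper
    around a pretrained DDPM (noise-then-denoise); `classifier` is the frozen model
    to explain. Step counts and magnitudes are illustrative assumptions.
    """
    delta = torch.zeros_like(x, requires_grad=True)
    for _ in range(steps):
        # Perturb, re-noise, and denoise: off-manifold, high-frequency changes are
        # washed out, so the surviving perturbation tends to be semantic.
        x_adv = (x + delta).clamp(0, 1)
        x_polished = denoiser(x_adv + noise_level * torch.randn_like(x_adv), noise_level)
        loss = F.cross_entropy(classifier(x_polished), target_label)
        grad, = torch.autograd.grad(loss, delta)
        with torch.no_grad():
            delta -= step_size * grad.sign()     # move toward the counterfactual class
    # The returned counterfactual is the polished version of the attacked image.
    x_adv = (x + delta).detach().clamp(0, 1)
    return denoiser(x_adv + noise_level * torch.randn_like(x_adv), noise_level)
```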
ISBN: 9781665487399 (digital); 9781665487399 (print)
Human action recognition (AR) in the dark is a subtask that is gaining a lot of traction and holds a significant place in the field of computer vision. Its applications include self-driving at night, human pose estimation, night surveillance, etc. Solutions such as DLN have emerged for AR, but progress on AR in the dark has been slow because accuracy remains poor even when leveraging large datasets and complex architectures. In this paper, we propose a novel and straightforward method, Z-Domain Entropy Adaptable Flex (Z-DEAF). It builds on an R(2+1)D neural network architecture and includes (i) a self-attention mechanism, which combines and extracts corresponding and complementary features from the dual pathways; (ii) Zero-DCE low-light image enhancement, which improves the quality of the enhanced images; and (iii) the FlexMatch method, which generates pseudo-labels flexibly. With the help of pseudo-labels from FlexMatch, the proposed Z-DEAF method facilitates learning the desired classification boundaries. It works by alternately Expanding Entropy and Shrinking Entropy, aiming to solve the problem of unclear classification boundaries between categories. Our model obtains superior performance in experiments and achieves state-of-the-art results on ARID.
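The sketch below illustrates the two most algorithmic ingredients in the description above: FlexMatch-style per-class confidence thresholds for pseudo-labeling, and an entropy term whose sign alternates between an expanding and a shrinking phase. The thresholds, the alternation schedule, and the loss weighting are assumptions, not the authors' settings.

```python
import torch
import torch.nn.functional as F

def flexible_pseudo_labels(logits_weak, class_thresholds):
    """FlexMatch-style pseudo-labels: per-class confidence thresholds that adapt to
    each class's learning status (passed in precomputed here, as an assumption)."""
    probs = logits_weak.softmax(dim=-1)
    conf, pseudo = probs.max(dim=-1)
    keep = conf > class_thresholds[pseudo]      # per-class threshold lookup
    return pseudo[keep], keep

def entropy(logits):
    p = logits.softmax(dim=-1)
    return -(p * p.clamp(min=1e-8).log()).sum(dim=-1).mean()

def z_deaf_style_loss(logits_weak, logits_strong, class_thresholds, expand_phase):
    pseudo, keep = flexible_pseudo_labels(logits_weak.detach(), class_thresholds)
    sup = F.cross_entropy(logits_strong[keep], pseudo) if keep.any() else logits_strong.sum() * 0
    # Alternate phases: expand entropy to explore ambiguous regions, then shrink it
    # to sharpen the decision boundaries between action classes.
    ent = entropy(logits_strong)
    return sup - ent if expand_phase else sup + ent
```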
ISBN: 9781665487399 (print)
The analysis of the multi-layer structure of wild forests is an important challenge of automated large-scale forestry. While modern aerial LiDARs offer geometric information across all vegetation layers, most datasets and methods focus only on the segmentation and reconstruction of the top of the canopy. We release WildForest3D, which consists of 29 study plots and over 2,000 individual trees across 47,000 m² with dense 3D annotation, along with occupancy and height maps for 3 vegetation layers: ground vegetation, understory, and overstory. We propose a 3D deep network architecture that, for the first time, predicts both 3D pointwise labels and high-resolution layer occupancy rasters simultaneously. This allows us to produce a precise estimation of the thickness of each vegetation layer as well as the corresponding watertight meshes, therefore meeting most forestry purposes. Both the dataset and the model are released in open access: https://***/ekalinicheva/multi_layer_vegetation.
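As an illustration of what per-layer occupancy and height rasters look like computationally, here is a small NumPy sketch that bins labeled points into a 2D grid; the resolution, grid size, and layer encoding are assumed values, not the dataset's actual specification.

```python
import numpy as np

def rasterize_layers(points, labels, xy_min, resolution=0.5, n_layers=3, grid_shape=(200, 200)):
    """points: (N, 3) xyz coordinates; labels: (N,) in {0: ground vegetation,
    1: understory, 2: overstory}. Returns per-layer occupancy (0/1) and height
    (max z) rasters. Resolution and grid size here are illustrative assumptions."""
    occupancy = np.zeros((n_layers, *grid_shape), dtype=np.uint8)
    height = np.zeros((n_layers, *grid_shape), dtype=np.float32)
    cols = ((points[:, 0] - xy_min[0]) / resolution).astype(int)
    rows = ((points[:, 1] - xy_min[1]) / resolution).astype(int)
    inside = (rows >= 0) & (rows < grid_shape[0]) & (cols >= 0) & (cols < grid_shape[1])
    for r, c, z, l in zip(rows[inside], cols[inside], points[inside, 2], labels[inside]):
        occupancy[l, r, c] = 1                      # a point of this layer hits the cell
        height[l, r, c] = max(height[l, r, c], z)   # keep the tallest point per cell
    return occupancy, height
```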
ISBN: 9781665487399 (digital); 9781665487399 (print)
Skin lesion image datasets have gained popularity in recent years with the successes of the ISIC datasets and challenges. While the user base of these datasets is growing, the Dark Corner Artifact (DCA) phenomenon remains underexplored. This paper provides a better understanding of how and why DCA occurs, describes the types of DCAs, and investigates DCA within a curated ISIC image dataset. We introduce new image-artifact labels on a curated, balanced dataset of 9,810 images and identify 2,631 images with different intensities of DCA. We then improve the quality of this dataset by introducing automated DCA detection and removal methods. We evaluate the performance of our methods with image quality metrics on an unseen dataset (Dermofit) and achieve better SSIM scores at every DCA intensity level. Further, we study the effects of DCA removal on a binary classification task (melanoma vs. non-melanoma). Although deep learning performance on this task shows only marginal differences, we demonstrate that DCA removal helps shift the network activations toward the skin lesions. All artifact labels and code are available at: https://***/mmu-dermatologyresearch/dark_corner_artifact_removal.
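A minimal sketch of one plausible DCA detection-and-removal pipeline: flag dark pixels far from the image centre, then inpaint them from surrounding skin. The thresholds and the OpenCV inpainting operator are generic choices for illustration, not necessarily the authors' method.

```python
import cv2
import numpy as np

def detect_dca_mask(image_bgr, dark_thresh=40, inner_frac=0.65):
    """Heuristic DCA mask: dark pixels lying outside a central circular region.
    The intensity threshold and radius fraction are assumed values."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    h, w = gray.shape
    yy, xx = np.mgrid[0:h, 0:w]
    radius = np.sqrt((yy - h / 2) ** 2 + (xx - w / 2) ** 2)
    outside_centre = radius > inner_frac * min(h, w) / 2   # corners and borders only
    dark = gray < dark_thresh
    return ((dark & outside_centre) * 255).astype(np.uint8)

def remove_dca(image_bgr, mask, inpaint_radius=15):
    """Fill the masked dark corners from surrounding skin via generic inpainting."""
    return cv2.inpaint(image_bgr, mask, inpaint_radius, cv2.INPAINT_TELEA)
```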
ISBN: 9798350353006 (print)
Recent advances in deep learning bring the possibility of assisting pathologists in predicting patients' survival from whole-slide pathological images (WSIs). However, most prevalent methods work only on patches sampled from specifically or randomly selected tumor areas of WSIs, which have very limited capability to capture the complex interactions between a tumor and its surrounding micro-environment components. In fact, a tumor is supported and nurtured by the heterogeneous tumor micro-environment (TME), and detailed analysis of the TME and its correlation with the tumor is important for an in-depth understanding of the mechanisms of cancer development. In this paper, we consider the spatial interactions among the tumor and its two major TME components (i.e., lymphocytes and stromal fibrosis) and present a Tumor Micro-Environment Interactions Guided Graph Learning (TMEGL) algorithm for the prognosis prediction of human cancers. Specifically, we first select different types of patches as nodes to build a graph for each WSI. Then, a novel TME neighborhood organization guided graph embedding algorithm is proposed to learn node representations that preserve their topological structure information. Finally, a Gated Graph Attention Network is applied to capture the survival-associated interactions among the tumor and different TME components for clinical outcome prediction. We tested TMEGL on three cancer cohorts derived from The Cancer Genome Atlas (TCGA), and the experimental results indicate that TMEGL not only outperforms existing WSI-based survival analysis models but also offers good explainability for survival prediction.
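The graph-construction step lends itself to a short sketch: patch centres become nodes, spatial k-nearest neighbours define edges, and an attention layer aggregates neighbour features. The plain attention layer below is a simplified stand-in for the paper's Gated Graph Attention Network, and k is an assumed value.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.neighbors import kneighbors_graph

def build_patch_graph(coords, k=8):
    """coords: (N, 2) patch centres in the WSI; connect each patch to its k nearest
    spatial neighbours (k=8 is an assumed value)."""
    adj = kneighbors_graph(coords, n_neighbors=k, mode="connectivity")
    src, dst = adj.nonzero()
    return torch.as_tensor(np.stack([src, dst]), dtype=torch.long)   # (2, E) edge index

class SimpleGraphAttention(nn.Module):
    """Plain attention aggregation over edges, a simplified stand-in for the
    gated graph attention used in the paper."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(2 * dim, 1)
        self.proj = nn.Linear(dim, dim)

    def forward(self, x, edge_index):
        src, dst = edge_index                    # messages flow from src patches to dst
        e = self.score(torch.cat([x[dst], x[src]], dim=-1)).squeeze(-1)
        alpha = torch.zeros_like(e)
        for node in dst.unique():                # softmax over each node's incoming edges
            m = dst == node
            alpha[m] = e[m].softmax(dim=0)
        out = torch.zeros_like(x)
        out.index_add_(0, dst, alpha.unsqueeze(-1) * self.proj(x[src]))
        return out                               # interaction-aware node features
```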
ISBN: 9781665487399 (digital); 9781665487399 (print)
The goal of meta-learning is to generalize to new tasks and goals as quickly as possible. Ideally, we would like approaches that generalize to new goals and tasks on the first attempt. Requiring a policy to perform a new task on the first attempt, without even a single example trajectory, is a zero-shot problem formulation. When tasks are identified by goal images, they can be considered visually goal-directed. In this work, we explore the problem of visual goal-directed zero-shot meta-imitation learning. Inspired by several popular approaches to meta-RL, we combine several core ideas related to task embedding and planning by gradient descent to explore this problem. To evaluate these approaches, we adapted the Meta-World benchmark tasks to create 24 distinct visual goal-directed manipulation tasks. We found that 7 of the 24 tasks could be successfully completed on the first attempt by at least one of the approaches we tested. We also demonstrated that goal-directed zero-shot approaches can transfer to a physical robot, with a demonstration based on Jenga block manipulation tasks using a Kinova Jaco robotic arm.
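A sketch of the "planning by gradient descent" ingredient mentioned above: an action sequence is optimized through a learned latent dynamics model so that the predicted final state matches the goal-image embedding. `encoder`, `dynamics`, and all hyperparameters are hypothetical components for illustration, not part of any specific released codebase.

```python
import torch
import torch.nn.functional as F

def plan_by_gradient_descent(state0, goal_image, encoder, dynamics,
                             horizon=10, action_dim=4, steps=100, lr=0.05):
    """Zero-shot, goal-image-conditioned planning sketch.

    state0: latent encoding of the current observation (e.g. encoder(first_image)).
    encoder(image) -> latent goal embedding; dynamics(state, action) -> next state.
    Both are hypothetical learned modules; horizon, step counts, and learning rate
    are assumed values.
    """
    goal = encoder(goal_image).detach()
    actions = torch.zeros(horizon, action_dim, requires_grad=True)
    opt = torch.optim.Adam([actions], lr=lr)
    for _ in range(steps):
        state = state0
        for t in range(horizon):
            state = dynamics(state, actions[t])       # roll the candidate plan forward
        loss = F.mse_loss(state, goal)                # match the goal embedding
        opt.zero_grad()
        loss.backward()
        opt.step()
    return actions.detach()
```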