The utilization of web-based applications has greatly increased in recent years. This, in turn, has raised security concerns and data breaches leading to losses of millions of dollars. A large part of these attacks is...
ISBN: (print) 9798350365474
We present a novel holistic deep-learning approach for multi-task learning from a single indoor panoramic image. Our framework, named MultiPanoWise, extends vision transformers to jointly infer multiple pixel-wise signals, such as depth, normals, and semantic segmentation, as well as signals from intrinsic decomposition, such as reflectance and shading. Our solution combines a transformer-based encoder-decoder with multiple heads and introduces, in particular, a novel context adjustment approach to enforce knowledge distillation between the various signals. Moreover, at training time we introduce a hybrid loss scalarization method based on an augmented Chebychev/hypervolume scheme. We illustrate the capabilities of the proposed architecture on public-domain synthetic and real-world datasets. We demonstrate performance improvements with respect to the most recent methods specifically designed for single tasks, such as individual depth estimation or semantic segmentation. To our knowledge, this is the first architecture capable of achieving state-of-the-art performance on the joint extraction of heterogeneous signals from single indoor omnidirectional images.
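The abstract above does not give the form of its hybrid scalarization. As a rough illustration only, the sketch below shows the standard augmented Chebyshev scalarization of several per-task losses (with the reference point assumed to be zero). The function name, the weights, and the `rho` value are assumptions for this example, and the hypervolume part of the hybrid scheme is not reproduced.

```python
def augmented_chebyshev(losses, weights, rho=0.05):
    """Combine per-task losses into one scalar.

    Standard augmented Chebyshev form (reference point assumed to be 0):
        max_i(w_i * l_i) + rho * sum_i(w_i * l_i)
    The max term focuses on the worst weighted task; the small sum term
    keeps every task contributing to the result.
    """
    if len(losses) != len(weights):
        raise ValueError("losses and weights must have the same length")
    weighted = [w * l for w, l in zip(weights, losses)]
    return max(weighted) + rho * sum(weighted)


# Hypothetical per-task losses, e.g. depth, normals, segmentation.
task_losses = [1.0, 2.0, 4.0]
task_weights = [1.0, 1.0, 0.5]
total = augmented_chebyshev(task_losses, task_weights, rho=0.5)
```

With these illustrative numbers the weighted losses are 1.0, 2.0 and 2.0, so the scalar is 2.0 + 0.5 * 5.0 = 4.5.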
Remotely controlled aerial vehicles such as drones are used for military applications such as surveillance, intelligence and target acquisition. Real-time object detection and classification is one of the most recurre...
Hyperspectral imaging is one of the most promising techniques for intraoperative tissue characterisation. Snapshot mosaic cameras, which can capture hyperspectral data in a single exposure, have the potential to make a real-time hyperspectral imaging system for surgical decision-making possible. However, optimal exploitation of the captured data requires solving an ill-posed demosaicking problem and applying additional spectral corrections. In this work, we propose a supervised learning-based image demosaicking algorithm for snapshot hyperspectral images. Due to the lack of publicly available medical images acquired with snapshot mosaic cameras, a synthetic image generation approach is proposed to simulate snapshot images from existing medical image datasets captured by high-resolution, but slow, hyperspectral imaging devices. Image reconstruction is achieved using convolutional neural networks for hyperspectral image super-resolution, followed by spectral correction using a sensor-specific calibration matrix. The results are evaluated both quantitatively and qualitatively, showing clear improvements in image quality compared to a baseline demosaicking method using linear interpolation. Moreover, the fast processing time of our algorithm, 45 ms per image to obtain super-resolved RGB or oxygenation saturation maps for a state-of-the-art snapshot mosaic camera, demonstrates the potential for its seamless integration into real-time surgical hyperspectral imaging applications.
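To make the idea of simulating snapshot images from a full hyperspectral cube concrete, here is a minimal sketch. It assumes a k x k mosaic in which the pixel at row r, column c keeps only band (r % k) * k + (c % k). That band layout, the function name, and the nested-list data format are assumptions for this example. The abstract does not describe the actual simulation pipeline, which would also need to account for the sensor's spectral response.

```python
def simulate_mosaic(cube, k):
    """Sample a 2D snapshot mosaic from a hyperspectral cube.

    cube[b][r][c] is the value of band b at pixel (r, c), with k * k bands.
    Each output pixel keeps a single band chosen by its position inside
    the repeating k x k tile (an assumed layout, not a real sensor's).
    """
    if len(cube) != k * k:
        raise ValueError("cube must contain exactly k * k bands")
    rows, cols = len(cube[0]), len(cube[0][0])
    return [
        [cube[(r % k) * k + (c % k)][r][c] for c in range(cols)]
        for r in range(rows)
    ]


# Toy cube: 4 bands of size 2 x 4, where band b is filled with the value b.
toy_cube = [[[b] * 4 for _ in range(2)] for b in range(4)]
mosaic = simulate_mosaic(toy_cube, k=2)
```

In the toy example the mosaic rows are `[0, 1, 0, 1]` and `[2, 3, 2, 3]`, which shows that each pixel carries one band only. Demosaicking is the inverse problem of recovering all bands at every pixel.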
ISBN: (print) 9798350377712; 9798350377705
Loop Closure Detection (LCD) is an essential component of Simultaneous Localization and Mapping (SLAM), helping to correct drift errors, facilitate map merging, or both by identifying previously observed scenes. Despite its importance, traditional LCD algorithms based on a single sensor such as a camera or LiDAR exhibit degraded performance in challenging scenarios due to their inherent limitations. To address this issue, we propose a novel LCD method based on camera-LiDAR fusion, exploiting the rich textural information from cameras and the accurate geometric data from LiDAR to ensure robustness and speed in challenging environments. Specifically, we first employ deep hashing learning to encode deep image features into binary image descriptors for extremely fast loop candidate (LC) retrieval. Then, LiDAR points are augmented with image color for accurate geometric verification. Finally, we incorporate a spatial-temporal consistency check that mandates an LC to have consistently matched neighbors to be accepted as true. Our method is extensively verified and compared with the state-of-the-art methods on various datasets encompassing both indoor and outdoor environments. Experimental results demonstrate that our method obtains the best performance, increasing the maximum recall rate at 100% precision by a significant margin of 20% while operating in real-time at an average speed of 30 fps.
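Binary descriptors make candidate retrieval fast because two descriptors can be compared with an XOR followed by a bit count (the Hamming distance). The sketch below illustrates that general idea only. The descriptor values, the function names, and the brute-force ranking are assumptions for this example, and it does not reproduce the learned hashing network or the later verification steps.

```python
def hamming(a: int, b: int) -> int:
    """Number of differing bits between two binary descriptors."""
    return bin(a ^ b).count("1")


def retrieve_candidates(query: int, database: list, top_k: int = 2) -> list:
    """Return indices of the top_k database descriptors closest to query.

    Ties are broken by the lower index, so the result is deterministic.
    """
    order = sorted(
        range(len(database)),
        key=lambda i: (hamming(query, database[i]), i),
    )
    return order[:top_k]


# Hypothetical 8-bit descriptors of previously visited frames.
past_frames = [0b11110000, 0b11110001, 0b00001111, 0b11110011]
candidates = retrieve_candidates(0b11110011, past_frames, top_k=2)
```

Here the query matches frame 3 exactly and differs from frame 1 by one bit, so `candidates` is `[3, 1]`. In a full system these candidates would still have to pass geometric verification and the consistency check described above.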
Deep learning (DL) models have become prevalent and consistently exhibit outstanding performance across diverse domains, particularly in information security. As a subset of machine learning (ML), DL has proven adept ...
ISBN: (print) 9798350387780; 9798350387797
Deep reinforcement learning (DRL)-based path planning algorithms have gained significant attention in recent years due to their end-to-end processing and robustness. Nevertheless, they face challenges in scenarios with long distances and dense obstacles because they only consider local environmental information. This paper presents a hybrid path planning and following approach that combines path planning based on soft actor-critic (SAC) with path following. Initially, a path is generated using the sampling-based method Adaptively Informed Trees (AIT*), and the path subsequently serves as tracking points for the planner. This approach guides the agent to move faster and more effectively toward the goal, and continuously updates the tracking point in real time. We conducted an experiment to evaluate the training process and performance of this hybrid path planning approach based on DRL while comparing it to the original DRL-based approach. The experimental results illustrate the superiority of the presented approach in both the training process and task performance.
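The abstract does not state how the tracking point is updated. One simple rule, shown here purely as an assumed illustration, is to advance along the precomputed global path whenever the agent comes within a fixed radius of the current tracking point. The function name and the `reach_radius` parameter are hypothetical.

```python
import math


def update_tracking_index(path, position, index, reach_radius=0.5):
    """Advance the tracking point along a precomputed path.

    path is a list of (x, y) waypoints, for example from a global planner.
    The index moves forward while the agent is within reach_radius of the
    current tracking point, and it never moves past the final waypoint.
    """
    while index < len(path) - 1 and math.dist(position, path[index]) <= reach_radius:
        index += 1
    return index


# Hypothetical straight-line path and agent position.
global_path = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
next_index = update_tracking_index(global_path, (0.9, 0.0), index=1)
```

In this example the agent is 0.1 away from waypoint 1, so the tracking point advances to waypoint 2, which is still 1.1 away. The local DRL policy would then steer toward `global_path[next_index]`.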
Cotton is one of the most vital cash crops cultivated around the globe, and its yield directly influences the economy and the livelihood of a huge number of people associated with it. The major loss of cotton occurs due ...
Whole slide imaging (WSI) has become an essential tool in pathological diagnosis, owing to its convenience for remote and collaborative review. However, bringing the sample to the optimal axial position and imaging it without defocusing artefacts remains a challenge, as traditional methods are either not universal or time-consuming. Recently, deep learning has been shown to be effective for autofocusing by predicting the defocus distance. Here, we apply quantized spiral phase modulation in the Fourier domain of the captured images before feeding them into a lightweight neural network. This significantly reduces the average prediction error to below that of any previous work on an open dataset. In addition, the high prediction speed indicates that the method can be applied on an edge device for real-time tasks with limited computational resources and memory footprint. (C) 2022 Optica Publishing Group under the terms of the Optica Open Access Publishing Agreement
This article proposes a quantum spatial graph convolutional neural network (QSGCN) model that is implementable on quantum circuits, providing a novel avenue for processing non-Euclidean data on state-of-the-art parameterized quantum circuit (PQC) computing platforms. Four basic blocks are constructed to formulate the whole QSGCN model: the quantum encoding, the quantum graph convolutional layer, the quantum graph pooling layer, and the network optimization. In particular, the trainability of the QSGCN model is analyzed through discussions on the barren plateau phenomenon. Simulation results from various types of graph data are presented to demonstrate the learning, generalization, and robustness capabilities of the proposed quantum neural network (QNN) model.