Smoke detection plays a crucial role in the safe production of petrochemical enterprises and in fire prevention. Image-based machine learning and deep learning methods have been widely studied. Recently, many works hav...
Boundary equilibrium generative adversarial networks (BEGANs) are an improved version of generative adversarial networks (GANs). In this paper, an improved BEGAN with a skip-connection technique in the generator and the discriminator is ***, an alternative time-scale update rule is adopted to balance the learning rate of the generator and the ***, the performance of the proposed method is quantitatively evaluated by Fréchet inception distance (FID) and inception score (IS). The test results show that the proposed method performs better than the original BEGAN.
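For readers unfamiliar with the baseline, the sketch below shows the loss bookkeeping of the original BEGAN formulation, which the abstract takes as its starting point. It is not the improved method described above: the skip connections and the time-scale update rule are not modelled, and the function name `began_step` and the default values of `gamma` and `lambda_k` are illustrative assumptions.

```python
def began_step(loss_real, loss_fake, k, gamma=0.5, lambda_k=0.001):
    """One bookkeeping step of the original BEGAN objective (illustrative).

    loss_real: autoencoder reconstruction loss L(x) on real samples
    loss_fake: reconstruction loss L(G(z)) on generated samples
    k:         current balance coefficient k_t, kept in [0, 1]
    gamma, lambda_k: assumed hyperparameter values, not taken from the abstract
    """
    d_loss = loss_real - k * loss_fake          # discriminator objective
    g_loss = loss_fake                          # generator objective
    balance = gamma * loss_real - loss_fake     # equilibrium error
    k_next = min(max(k + lambda_k * balance, 0.0), 1.0)
    convergence = loss_real + abs(balance)      # global convergence measure
    return d_loss, g_loss, k_next, convergence
```

In a training loop, `d_loss` and `g_loss` would be minimised by separate optimizers. Giving those two optimizers different learning rates is one common way to realise a time-scale update rule, though the abstract does not state the exact scheme used.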
Fault information of rotating machinery is often drowned in strong noise, so it is crucial to accurately identify faults from high-intensity noise signals. In this article, an end-to-end fault diagnosis model is developed, which consists of a multi-stage selection filter based on wavelet packets and a 2D-CNN. First, the original measured mechanical signals are processed by three-level wavelet packet decomposition to obtain eight sub-bands with coefficient matrices. Second, the signal is reconstructed using different numbers of sub-bands, where the number is increased by one at a time, to obtain eight different multi-stage reconstructed signals. Third, the reconstructed signals are reorganized into 2D signal maps, and a parallel training network is constructed using the signal maps and the 2D-CNN to achieve fault classification. Then, guided by the training results, the eight parallel classification results are compared, so as to train the best fault diagnosis model. Finally, a simulation experiment based on a bearing data set shows that the proposed multi-stage selection filter is effective and feasible in application.
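The first three steps of this pipeline can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract does not name the wavelet, the order in which sub-bands are added, or the map size, so the Haar wavelet, natural band ordering, and all function names here are hypothetical choices. The 2D-CNN stage is omitted.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_split(x):
    """One Haar analysis step: (approximation, detail), each half the length."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_merge(approx, detail):
    """Inverse of haar_split."""
    out = []
    for a, d in zip(approx, detail):
        out.extend([(a + d) / SQRT2, (a - d) / SQRT2])
    return out

def wp_decompose(x, levels=3):
    """Full wavelet packet tree: returns 2**levels sub-bands (natural order).

    len(x) must be divisible by 2**levels.
    """
    bands = [list(x)]
    for _ in range(levels):
        nxt = []
        for band in bands:
            nxt.extend(haar_split(band))
        bands = nxt
    return bands

def wp_reconstruct(bands):
    """Merge sibling sub-bands pairwise until one signal remains."""
    bands = [list(b) for b in bands]
    while len(bands) > 1:
        bands = [haar_merge(bands[i], bands[i + 1])
                 for i in range(0, len(bands), 2)]
    return bands[0]

def multistage_signals(x, levels=3):
    """Reconstruct with the first 1, 2, ..., 2**levels sub-bands kept."""
    bands = wp_decompose(x, levels)
    stages = []
    for k in range(1, len(bands) + 1):
        kept = [b if i < k else [0.0] * len(b) for i, b in enumerate(bands)]
        stages.append(wp_reconstruct(kept))
    return stages

def to_map(signal, width):
    """Reorganize a 1D signal into rows of `width` samples (a 2D signal map)."""
    return [signal[i:i + width] for i in range(0, len(signal), width)]
```

With three levels, `multistage_signals` returns eight signals, the last of which reproduces the input up to floating-point error. Each signal would then be reshaped with `to_map` and fed to its own classifier branch, and the branch results compared as the abstract describes.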
Monocular RGB-based category-level object pose estimation is more practical and cost-effective for robotics. However, existing methods do not fully exploit the rich semantic and contextual information in multimodal da...
ISBN: (digital) 9781665410205
ISBN: (print) 9781665410212
As human-machine interaction continues to evolve, the capacity for environmental perception is becoming increasingly crucial. Integrating the two most common types of sensory data, images and point clouds, can enhance detection accuracy. Currently, there is no existing model capable of detecting an object's position in both point clouds and images while also determining their corresponding relationship. This information is invaluable for human-machine interactions, offering new possibilities for their enhancement. In light of this, this paper introduces an end-to-end Consistency Object Detection (COD) algorithm framework that requires only a single forward inference to simultaneously obtain an object's position in both point clouds and images and establish their correlation. Furthermore, to assess the accuracy of the object correlation between point clouds and images, this paper proposes a new evaluation metric, Consistency Precision (CP). To verify the effectiveness of the proposed framework, an extensive set of experiments has been conducted on the KITTI and DAIR-V2X datasets. The study also explored how the proposed consistency detection method performs on images when the calibration parameters between images and point clouds are disturbed, compared to existing post-processing methods. The experimental results demonstrate that the proposed method exhibits excellent detection performance and robustness, achieving end-to-end consistency detection. The source code will be made publicly available at https://***/xifen523/COD.
Domain adaptive semantic segmentation enables robust pixel-wise understanding in real-world driving scenes. Source-free domain adaptation, as a more practical technique, addresses the concerns of data privacy and stor...
This paper introduces the task of Auditory Referring Multi-Object Tracking (AR-MOT), which dynamically tracks specific objects in a video sequence based on audio expressions and appears as a challenging problem in aut...
High-quality panoramic images with a Field of View (FoV) of 360° are essential for contemporary panoramic computer vision tasks. However, conventional imaging systems come with sophisticated lens designs and heavy optical components. This disqualifies their usage in many mobile and wearable applications where thin, portable, minimalist imaging systems are desired. In this paper, we propose a Panoramic Computational Imaging Engine (PCIE) to achieve minimalist and high-quality panoramic imaging. With fewer than three spherical lenses, a Minimalist Panoramic Imaging Prototype (MPIP) is constructed based on the design of the Panoramic Annular Lens (PAL), but with low-quality imaging results due to aberrations and small image plane size. We propose two pipelines, i.e., Aberration Correction (AC) and Super-Resolution and Aberration Correction (SR&AC), to solve the image quality problems of MPIP, with imaging sensors of small and large pixel size, respectively. To leverage the prior information of the optical system, we propose a Point Spread Function (PSF) representation method to produce a PSF map as an additional modality. A PSF-aware Aberration-image Recovery Transformer (PART) is designed as a universal network for the two pipelines, in which the self-attention calculation and feature extraction are guided by the PSF map. We train PART on synthetic image pairs from simulation and put forward the PALHQ dataset to fill the gap of real-world high-quality PAL images for low-level vision. A comprehensive variety of experiments on synthetic and real-world benchmarks demonstrates the impressive imaging results of PCIE and the effectiveness of the PSF representation. We further deliver heuristic experimental findings for minimalist and high-quality panoramic imaging, in terms of the choices of prototype and pipeline, network architecture, training strategies, and dataset construction. Our dataset and code will be available at https://***/zju-jiangqi/PCIE-PART.
Light field cameras are capable of capturing intricate angular and spatial details. This allows for acquiring complex light patterns and details from multiple angles, significantly enhancing the precision of image semantic segmentation. However, two significant issues arise: (1) The extensive angular information of light field cameras contains a large amount of redundant data, which is overwhelming for the limited hardware resources of intelligent agents. (2) A relative displacement difference exists in the data collected by different micro-lenses. To address these issues, we propose an Omni-Aperture Fusion model (OAFuser) that leverages dense context from the central view and extracts the angular information from sub-aperture images to generate semantically consistent results. To simultaneously streamline the redundant information from the light field cameras and avoid feature loss during network propagation, we present a simple yet very effective Sub-Aperture Fusion Module (SAFM). This module efficiently embeds sub-aperture images in angular features, allowing the network to process each sub-aperture image with a minimal computational demand of only about 1 GFLOPs. Furthermore, to address the mismatched spatial information across viewpoints, we present a Center Angular Rectification Module (CARM) to realize feature resorting and prevent feature occlusion caused by misalignment. The proposed OAFuser achieves state-of-the-art performance on four UrbanLF datasets in terms of all evaluation metrics and sets a new record of 84.93% in mIoU on the UrbanLF-Real Extended dataset, with a gain of +3.69%. The source code for OAFuser is available at https://***/FeiBryantkit/OAFuser. Impact Statement: To solve the data abundance problem, we have reduced the significant computational consumption of light field cameras while not introducing any additional parameters. The proposed method has practical value for the deployment and application of light field cameras. The proposed...
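The headline figure above is reported in mIoU (mean intersection over union). As a reminder of what that metric measures, here is a minimal sketch of its usual definition. The function name `mean_iou` and the convention of skipping classes that appear in neither the prediction nor the ground truth are assumptions; the benchmark's own evaluation protocol may differ.

```python
def mean_iou(pred, target, num_classes):
    """Mean intersection over union for flat sequences of integer labels.

    Classes absent from both pred and target are skipped (assumed convention).
    """
    inter = [0] * num_classes
    union = [0] * num_classes
    for p, t in zip(pred, target):
        if p == t:
            inter[p] += 1
            union[p] += 1
        else:
            # A mismatched pixel enlarges the union of both classes involved
            union[p] += 1
            union[t] += 1
    ious = [i / u for i, u in zip(inter, union) if u > 0]
    if not ious:
        raise ValueError("no labelled pixels to evaluate")
    return sum(ious) / len(ious)
```

For example, with predictions `[0, 0, 1, 1]` and ground truth `[0, 1, 1, 1]`, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is 7/12.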