ISBN: 0819421030 (print)
The reconstruction of 3D road centerlines is posed as a physical problem: solving an energy-minimizing 3D B-spline shape model. The reconstruction is described as a process in which a 3D road centerline shape model is deformed gradually, driven by forces arising from object space (internal energy) and image sequences (external energy). Recent test results demonstrate that this approach functions reliably even when navigation errors exist and road conditions are far from ideal.
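The gradual deformation described in this abstract can be illustrated with a minimal sketch. It is not the authors' method: the external energy here is a squared distance to hypothetical 3D target points (standing in for image-derived forces), the internal energy is a second-difference smoothness penalty on the control points, and the names `bspline_basis`, `second_difference`, `deform`, and the weight `lam` are all assumptions made for illustration.

```python
import numpy as np

def bspline_basis(num_ctrl, num_samples):
    """Uniform cubic B-spline basis matrix, shape (num_samples, num_ctrl)."""
    num_seg = num_ctrl - 3
    t = np.linspace(0.0, num_seg, num_samples, endpoint=False)
    seg = t.astype(int)          # index of the spline segment for each sample
    u = t - seg                  # local parameter in [0, 1)
    weights = np.stack([
        (1 - u) ** 3,
        3 * u ** 3 - 6 * u ** 2 + 4,
        -3 * u ** 3 + 3 * u ** 2 + 3 * u + 1,
        u ** 3,
    ], axis=1) / 6.0
    basis = np.zeros((num_samples, num_ctrl))
    for k in range(4):
        basis[np.arange(num_samples), seg + k] = weights[:, k]
    return basis

def second_difference(n):
    """Second-difference operator acting on n control points."""
    d = np.zeros((n - 2, n))
    for i in range(n - 2):
        d[i, i:i + 3] = [1.0, -2.0, 1.0]
    return d

def deform(ctrl, targets, basis, lam=0.1, iters=500):
    """Gradient descent on E = |B c - y|^2 + lam * |D c|^2 (assumed energy)."""
    d2 = second_difference(len(ctrl))
    hessian = 2.0 * (basis.T @ basis + lam * d2.T @ d2)
    step = 1.0 / np.linalg.eigvalsh(hessian).max()   # stable step size
    energies = []
    for _ in range(iters):
        residual = basis @ ctrl - targets
        energies.append(np.sum(residual ** 2) + lam * np.sum((d2 @ ctrl) ** 2))
        gradient = hessian @ ctrl - 2.0 * basis.T @ targets
        ctrl = ctrl - step * gradient                # one small deformation step
    return ctrl, energies
```

Starting from any initial control polygon, each iteration moves the curve a little toward the targets while the smoothness term resists sharp bends, so the recorded energy does not increase from one step to the next.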
ISBN: 9781665475921 (print)
This demo paper presents a real-time learned image codec on an FPGA. Using a Xilinx VCU128 board, the proposed system achieves 720p coding at 30 fps, which is 7.76x faster than prior work.
ISBN: 9781728180687 (print)
Perceptual organization is the process of assigning each part of a scene to a group of associated features that belong to the same organization. In the twentieth century, Gestalt psychologists formalized how image features tend to be grouped by giving a set of organizing principles. In this paper, we propose an approach for detecting perceptual groups in an image. We are mainly interested in features grouped by the Gestalt law of proximity. We design an object-based model within a stochastic framework using a marked point process (MPP). We use a Bayesian learning method to extract perceptual groups in a scene. Tested on synthetic images, the proposed model detects perceptual groups efficiently in noisy images.
ISBN: 9781728185514 (print)
This paper demonstrates a model-based reinforcement learning framework for training a self-flying drone. We implement the Dreamer proposed in a prior work as an environment model that responds to the action taken by the drone by predicting the next video frame as a new state signal. The Dreamer is a conditional video sequence generator. This model-based environment avoids time-consuming interactions between the agent and the environment, greatly speeding up the training process. This demonstration showcases for the first time the application of the Dreamer to train an agent that can finish the racing task in the AirSim simulator.
ISBN: 9781665475921 (print)
Glass reflection is a problem when taking photos through glass windows or showcases. Because removing reflections improves the visual quality of captured images, we develop an intelligent reflection-elimination imaging device based on a polarizer to minimize the effect of reflection on the images. The system mainly consists of a polarizing module, an image analysis module, and a reflection removal module. Users can hold the device and capture images with minimal reflection, whether by day or by night. The demo video is available at: https://***/10.6084/***.19687830.v1.
ISBN: 9781728180687 (print)
This paper addresses the problem of image-based localization. The goal is to quickly and accurately find the relative pose between a query taken with a stereo camera and a map obtained using visual SLAM, which contains poses and 3D points associated with descriptors. In this paper we introduce a new method that leverages stereo vision by adding geometric information to visual descriptors. This method can be used when the vertical direction of the camera is known (for example, on a wheeled robot). The new geometric visual descriptor can be used with several image-based localization algorithms that rely on visual words. We test the approach on different datasets (indoor, outdoor) and show experimentally that the new geometric visual descriptor improves standard image-based localization approaches.
ISBN: 9781728185514 (print)
This paper presents a deep learning-based audio-in-image watermarking scheme. Audio-in-image watermarking is the process of covertly embedding audio watermarks in a cover image and extracting them from it. Using audio watermarks can open up possibilities for different downstream applications. To implement audio-in-image watermarking that adapts to the demands of increasingly diverse situations, a neural network architecture is designed to automatically learn the watermarking process in an unsupervised manner. In addition, a similarity network is developed to recognize the audio watermarks under distortions, thereby providing robustness to the proposed method. Experimental results show high fidelity and robustness of the proposed blind audio-in-image watermarking scheme.
ISBN: 0819421030 (print)
The problem of motion segmentation in RGB image sequences is addressed. The proposed algorithm is based on local motion modeling and a pixel labeling approach. The information vector used for labeling consists of six components: three color components and three color differences. To develop the labeling algorithm, a statistical model of the motion sequence based on a six-variate Gaussian distribution is chosen. Moreover, a hidden Markov random field (MRF) framework is proposed to make the segmentation more accurate. Experimental results from applying the method to an RGB sequence showing a woman's turning head are included and discussed.
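As a rough illustration of labeling with a six-component vector and a six-variate Gaussian, the sketch below assigns each pixel to the class with the highest Gaussian log-likelihood. Several points are assumptions rather than details from the abstract: the three color differences are taken to be differences between consecutive frames, the class means and covariances are supplied by hand, the MRF prior is omitted entirely, and the function names are hypothetical.

```python
import numpy as np

def feature_vectors(prev_frame, curr_frame):
    """Per-pixel 6-vector: 3 colour components + 3 inter-frame colour differences.

    Treating the differences as temporal is an assumption of this sketch.
    """
    curr = curr_frame.astype(np.float64)
    diff = curr - prev_frame.astype(np.float64)
    return np.concatenate([curr, diff], axis=-1)      # shape (H, W, 6)

def gaussian_log_likelihood(x, mean, cov):
    """Log-density of a multivariate Gaussian at each row of x."""
    d = x.shape[-1]
    centered = x - mean
    solved = np.linalg.solve(cov, centered.T).T       # cov^{-1} (x - mean)
    mahalanobis = np.sum(centered * solved, axis=-1)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * (mahalanobis + logdet + d * np.log(2.0 * np.pi))

def label_pixels(features, means, covs):
    """Maximum-likelihood class label per pixel (no MRF smoothing)."""
    h, w, d = features.shape
    flat = features.reshape(-1, d)
    scores = np.stack(
        [gaussian_log_likelihood(flat, m, c) for m, c in zip(means, covs)],
        axis=1,
    )
    return np.argmax(scores, axis=1).reshape(h, w)
```

In a hidden MRF formulation, these per-pixel log-likelihoods would act as the data term, with a neighborhood prior added to favor spatially coherent labels.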
ISBN: 9781728185514 (print)
Learning-based compression systems have shown great potential for multi-task inference from their latent-space representation of the input image. In such systems, the decoder is supposed to be able to perform various analyses of the input image, such as object detection or segmentation, besides decoding the image. At the same time, privacy concerns around visual analytics have grown in response to the increasing capabilities of such systems to reveal private information. In this paper, we propose a method to make latent-space inference more privacy-friendly using mutual information-based criteria. In particular, we show how organizing and compressing the latent representation of the image according to task-specific mutual information can make the model maintain high analytics accuracy while becoming less able to reconstruct the input image and thereby reveal private information.
ISBN: 9798331529543; 9798331529550 (print)
Quanta image sensors are a novel paradigm in image sensor technology. Their direct application to quanta image sensor-based imaging systems is challenging because a bit-plane image is a set of binary images. In this paper, we introduce spatiotemporal priors based on the intensity invariance and smoothness characteristics of the motion vector. Specifically, we model the observation that when the image sequences are aligned with the correct motion vector, the spatiotemporal structure becomes more consistent. Moreover, the spatial smoothness prior is incorporated by applying a smoothing filter to the evaluation metrics of the motion vector candidates. The experimental results show that the proposed method is more effective than conventional methods.
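The idea of smoothing the evaluation metrics of motion vector candidates before choosing one can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not name the metric or the filter, so the input is a generic per-candidate cost map (lower is better), the filter is a plain box filter with edge replication, and `box_filter` and `select_motion` are hypothetical names.

```python
import numpy as np

def box_filter(img, radius):
    """Mean filter with edge replication, built from shifted copies."""
    padded = np.pad(img, radius, mode="edge")
    h, w = img.shape
    size = 2 * radius + 1
    out = np.zeros((h, w), dtype=np.float64)
    for dy in range(size):
        for dx in range(size):
            out += padded[dy:dy + h, dx:dx + w]
    return out / size ** 2

def select_motion(cost_volume, radius=1):
    """Pick, per pixel, the candidate with the lowest spatially smoothed cost.

    cost_volume has shape (K, H, W): one cost map per motion vector candidate.
    """
    smoothed = np.stack([box_filter(c, radius) for c in cost_volume])
    return np.argmin(smoothed, axis=0), smoothed
```

Smoothing each cost map lets neighboring pixels vote together, so an isolated noisy cost value is less likely to flip the selected candidate at a single pixel.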