ISBN: (print) 9781728185514
This paper demonstrates a model-based reinforcement learning framework for training a self-flying drone. We implement the Dreamer, proposed in prior work, as an environment model that responds to the drone's action by predicting the next video frame as the new state signal. The Dreamer is a conditional video sequence generator. This model-based environment avoids time-consuming interactions between the agent and the environment, greatly speeding up training. This demonstration showcases, for the first time, the application of the Dreamer to train an agent that can finish the racing task in the AirSim simulator.
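The idea of training against a learned environment model can be illustrated with a minimal sketch. Everything named here is an assumption for illustration and not taken from the paper: `ToyWorldModel` is a hypothetical stand-in for the video-prediction model (it returns a 2-D position instead of a frame), and the policy, reward, and gate position are placeholders.

```python
import math
from typing import Callable, List, Tuple

State = List[float]
Action = Tuple[float, float]

GATE = (5.0, 0.0)  # hypothetical racing-gate position


class ToyWorldModel:
    """Hypothetical stand-in for a learned environment model.

    A real model would predict the next video frame conditioned on the
    action; here the "frame" is a 2-D position so the sketch stays runnable.
    """

    def predict(self, state: State, action: Action) -> State:
        # Toy dynamics: the state moves by the action vector.
        return [state[0] + action[0], state[1] + action[1]]


def toward_gate(state: State) -> Action:
    """Placeholder policy: step at most 1 unit per axis toward the gate."""
    dx = max(-1.0, min(1.0, GATE[0] - state[0]))
    dy = max(-1.0, min(1.0, GATE[1] - state[1]))
    return (dx, dy)


def negative_distance(state: State) -> float:
    """Placeholder reward: closer to the gate is better."""
    return -math.hypot(GATE[0] - state[0], GATE[1] - state[1])


def imagined_rollout(
    model: ToyWorldModel,
    policy: Callable[[State], Action],
    reward_fn: Callable[[State], float],
    start: State,
    horizon: int,
) -> Tuple[List[State], float]:
    """Roll the policy forward inside the model, never touching a simulator."""
    state = list(start)
    trajectory = [state]
    total_reward = 0.0
    for _ in range(horizon):
        action = policy(state)
        state = model.predict(state, action)  # model replaces the environment
        total_reward += reward_fn(state)
        trajectory.append(state)
    return trajectory, total_reward


trajectory, total_reward = imagined_rollout(
    ToyWorldModel(), toward_gate, negative_distance, [0.0, 0.0], horizon=5
)
```

Each step queries only the model, which is where the speed-up described in the abstract comes from: no simulator call is made inside the loop.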
With the advent of the era of big data, people are exposed to a variety of data information every day. With the rise of high technology in the computer industry, data models are becoming more and more complex and chan...
Denoising helps improve image quality and recover important information from noisy pictures. This work introduces a novel denoising method. GGIF, WLS, and 2D Bilateral Filtering are used in a hybrid filter. The hybrid...
Image captioning is one of the most prevalent and difficult challenges in Natural Language Processing and Computer Vision: given an image, a written description of the image must be developed. The counterpart of the t...
ISBN: (print) 9783031243660; 9783031243677
With the development of communication technology, people are exposed to more and more graphics and images in daily life and work. Digital devices such as cameras and scanners can capture images, but they obtain only two-dimensional information about objects, which is insufficient in the many fields where three-dimensional information is necessary. In this paper, the 3D printing design of ceramic products is simulated based on 3D image reproduction technology. User satisfaction with the visual effect and hand-held comfort of ceramics produced with 3D image reproduction simulation technology is investigated by questionnaire, and the results are compared with computer vision technology and stereo matching technology. The results show that more than 85% of users are very satisfied with the visual effect and hand-held comfort of ceramics produced with 3D image reproduction simulation technology, and fewer than 5% are not satisfied. For ceramics produced with computer vision technology and stereo matching technology, satisfaction with the visual effect is below 60% and satisfaction with hand-held comfort is below 70%.
ISBN: (print) 9781728185514
Learning-based compression systems have shown great potential for multi-task inference from their latent-space representation of the input image. In such systems, the decoder is supposed to be able to perform various analyses of the input image, such as object detection or segmentation, besides decoding the image. At the same time, privacy concerns around visual analytics have grown in response to the increasing capabilities of such systems to reveal private information. In this paper, we propose a method to make latent-space inference more privacy-friendly using mutual information-based criteria. In particular, we show how organizing and compressing the latent representation of the image according to task-specific mutual information can make the model maintain high analytics accuracy while becoming less able to reconstruct the input image and thereby reveal private information.
RDPlot is an open source GUI application for plotting Rate-Distortion (RD)-curves and calculating Bjontegaard Delta (BD) statistics [1]. It supports parsing the output of commonly used reference software packages, par...
ISBN: (print) 9789811903908; 9789811903892
At present, visual parking assistance systems in intelligent driving generally suffer from unclear parking images and high hardware cost. To reduce the difficulty of parking and improve adaptability to the environment, this paper proposes a vehicle assistance system based on parking image enhancement. First, the Retinex algorithm is used to balance the image illumination and enhance color saturation, so that the system can adapt to more complex environmental conditions. Second, the Ackermann steering principle is used to draw the dynamic parking guide line, which is output to the vehicle screen through coordinate transformation. The adaptability and effectiveness of the developed system are verified experimentally.
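A dynamic guide line of this kind can be sketched from steering geometry. The abstract does not give the paper's formulation, so the sketch below assumes the common single-track (bicycle) approximation of Ackermann steering, in which the rear-axle center follows a circle of radius R = L / tan(δ) for wheelbase L and steering angle δ. The function names and the ground-plane frame (x forward, y to the left, origin at the rear-axle center) are assumptions, and the projection to screen coordinates is omitted.

```python
import math
from typing import List, Tuple


def turning_radius(wheelbase_m: float, steer_rad: float) -> float:
    """Signed radius of the rear-axle path under the bicycle approximation."""
    return wheelbase_m / math.tan(steer_rad)


def reverse_guide_line(
    wheelbase_m: float, steer_rad: float, length_m: float, n_points: int = 20
) -> List[Tuple[float, float]]:
    """Ground-plane points the rear-axle center would trace while reversing.

    Points are spaced evenly by arc length along the predicted path.
    """
    points = []
    for i in range(n_points + 1):
        s = length_m * i / n_points  # arc length travelled so far
        if abs(steer_rad) < 1e-6:
            # Wheels straight: the path is a straight line backwards.
            points.append((-s, 0.0))
        else:
            r = turning_radius(wheelbase_m, steer_rad)
            theta = s / r  # angle swept around the turn center at (0, r)
            points.append((-r * math.sin(theta), r * (1.0 - math.cos(theta))))
    return points


# Example: a 2.7 m wheelbase with the steering angle chosen so that R = 5 m.
steer = math.atan(2.7 / 5.0)
quarter_turn = reverse_guide_line(2.7, steer, 5.0 * math.pi / 2.0)
```

Recomputing the points whenever the steering angle changes is what makes the line dynamic; a real system would then map each ground point to pixel coordinates with the camera calibration.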
ISBN: (print) 9781728185514
Learning-based image compression has reached the performance of classical methods such as BPG. One common approach is to use an autoencoder network to map the pixel information to a latent space and then approximate the symbol probabilities in that space with a context model. During inference, the learned context model provides symbol probabilities, which the entropy encoder uses to produce the bitstream. Currently, the most effective context models use autoregression, but autoregression results in very high decoding complexity because the data must be processed serially. In this work, we propose a method to parallelize the autoregressive process used for image compression. In our experiments, we achieve a decoding speed more than 8 times faster than the standard autoregressive context model with almost no loss in compression performance.
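The abstract does not describe how the parallelization works, so the following is not the paper's method. It is a generic wavefront schedule, shown only to illustrate why a causal context still leaves room for parallel decoding. It assumes a hypothetical context limited to four neighbors (left, top-left, top, top-right); under that assumption, position (r, c) can be decoded at step c + 2r, and all positions sharing a step are independent of each other.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Pos = Tuple[int, int]


def causal_neighbors(r: int, c: int, height: int, width: int) -> List[Pos]:
    """Assumed context: left, top-left, top, top-right (inside the grid)."""
    candidates = [(r, c - 1), (r - 1, c - 1), (r - 1, c), (r - 1, c + 1)]
    return [(i, j) for i, j in candidates if 0 <= i < height and 0 <= j < width]


def wavefront_schedule(height: int, width: int) -> List[List[Pos]]:
    """Group latent positions so each group depends only on earlier groups."""
    steps: Dict[int, List[Pos]] = defaultdict(list)
    for r in range(height):
        for c in range(width):
            steps[c + 2 * r].append((r, c))  # decoding step of position (r, c)
    return [steps[t] for t in sorted(steps)]


def is_valid(schedule: List[List[Pos]], height: int, width: int) -> bool:
    """Check that every position's context is decoded in a strictly earlier group."""
    done = set()
    for group in schedule:
        for r, c in group:
            if any(n not in done for n in causal_neighbors(r, c, height, width)):
                return False
        done.update(group)
    return len(done) == height * width


schedule = wavefront_schedule(4, 4)
```

For a 4×4 grid this schedule takes 10 steps instead of the 16 needed by fully serial decoding, and the gap widens as the latent grid grows.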
Neurons in the medial superior temporal (MSTd) region of the visual cortex of the brain can efficiently recognize the firing patterns from the neurons in the MT region. The process is similar to sparse coding in non-n...