ISBN: 9781665475921 (print)
This demo paper presents a real-time learned image codec on FPGA. Running on a Xilinx VCU128, the proposed system achieves 720P@30fps coding, which is 7.76x faster than prior work.
ISBN: 0819444111 (print)
The application of human perceptual models in image and video coding is motivated by the fact that non-perceptual distortion metrics (such as mean square error) do not correlate well with perceived quality at lower bit-rates, despite an acceptable signal-to-noise ratio. In this paper, we propose a novel approach for indexing the visual content of images based on the human perceptual thresholds employed for encoding. In other words, the thresholds that are employed in perceptual coding also serve as an index. These thresholds depend on the overall luminance, the frequency/orientation, and the variety of patterns in an image, and can therefore serve as indexing features with the potential to retrieve perceptually similar images in response to a query image. Detailed simulations have been carried out using the proposed indexing concept in the DCT compressed domain. Here, the indices have been computed using the DCTune coding technique, which has been shown to provide superior visual quality in encoding images. Simulation results demonstrate that superior retrieval performance can be achieved for specific classes of images, while comparable performance is obtained for other image classes.
ISBN: 0819444111 (print)
In this paper we present a method of creating domain-based multiple descriptions of images and video. Descriptions are created by partitioning the domain of the signal into sets whose points are maximally separated from each other. This property enables simple error concealment methods to produce good estimates of lost signal samples. We present the approach in the context of Internet transmission of subband-coded images and scalable motion-compensated 3-D subband-coded video, but applications are not limited to these scenarios.
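The idea of partitioning the signal domain so that lost samples have surviving neighbours can be illustrated with a minimal sketch. It assumes a plain 2-D pixel grid, a 2x2 lattice partition, and 4-neighbour averaging for concealment; these choices, and the function names `partition` and `reconstruct`, are illustrative assumptions and not the partitioning or concealment method of the paper, which operates on subband-coded data.

```python
def partition(image, step=2):
    """Split samples into step*step descriptions by lattice coset.

    Within one description, samples are `step` pixels apart, so every
    sample's 4-neighbours fall into other descriptions.
    (Hypothetical partition, chosen only for illustration.)
    """
    descriptions = {}
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            key = (r % step, c % step)
            descriptions.setdefault(key, {})[(r, c)] = value
    return descriptions


def reconstruct(received, height, width):
    """Merge the received descriptions and conceal lost samples.

    A lost sample is estimated as the mean of its available
    4-neighbours (assumed concealment rule); 0.0 if none survive.
    """
    known = {}
    for samples in received:
        known.update(samples)

    out = [[0.0] * width for _ in range(height)]
    for r in range(height):
        for c in range(width):
            if (r, c) in known:
                out[r][c] = known[(r, c)]
                continue
            neighbours = [
                known[p]
                for p in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                if p in known
            ]
            out[r][c] = sum(neighbours) / len(neighbours) if neighbours else 0.0
    return out
```

Dropping one of the four descriptions before calling `reconstruct` simulates the loss of a packet; because the lost samples are spread out, each one still has received neighbours to interpolate from.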
ISBN: 9781728180687 (print)
Perceptual organization is the process of assigning each part of a scene to an association of features that belong to the same organization. In the twentieth century, Gestalt psychologists formalized how image features tend to be grouped by giving a set of organizing principles. In this paper, we propose an approach for the detection of perceptual groups in an image. We are mainly interested in features grouped by the Gestalt law of proximity. We design an object-based model within a stochastic framework using a marked point process (MPP), and we use a Bayesian learning method to extract perceptual groups in a scene. Tests on synthetic images show that the proposed model detects perceptual groups efficiently in noisy images.
ISBN: 0819444111 (print)
Image retrieval and image compression are both very active fields of research. Unfortunately, in the past they were pursued independently, leading to image indexing methods that are both efficient and effective but restricted to uncompressed images. In this paper we introduce an image retrieval technique that operates in the compressed domain of vector quantised images. Vector quantisation (VQ) achieves compression by representing image blocks as indices into a codebook of prototype blocks. Observing that, if images are coded with their own VQ codebook, much of the image information is contained in the codebook itself, we propose comparing the codebooks, based on a Modified Hausdorff distance, as a novel method for compressed domain image retrieval. Experiments based on an image database comprising many colourful pictures show that this technique gives excellent results, outperforming classical colour indexing techniques.
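Codebook comparison of this kind can be sketched as follows. The sketch assumes the commonly used definition of the modified Hausdorff distance (the larger of the two directed mean nearest-neighbour distances) with Euclidean distance between codewords; the abstract does not state which variant the paper uses, and the function names and the ranking helper are hypothetical.

```python
import math


def directed_distance(a, b):
    """Mean distance from each codeword in `a` to its nearest codeword in `b`."""
    return sum(min(math.dist(x, y) for y in b) for x in a) / len(a)


def modified_hausdorff(a, b):
    """Modified Hausdorff distance between two codebooks.

    Each codebook is a non-empty sequence of equal-length numeric vectors.
    (Assumed variant: max of the two directed mean-min distances.)
    """
    return max(directed_distance(a, b), directed_distance(b, a))


def rank_by_codebook(query_codebook, database_codebooks):
    """Return database indices ordered from most to least similar codebook."""
    scores = [modified_hausdorff(query_codebook, cb) for cb in database_codebooks]
    return sorted(range(len(scores)), key=scores.__getitem__)
```

Because only the codebooks are compared, no image has to be decoded to rank the database against a query.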
ISBN: 9781728185514 (print)
This paper demonstrates a model-based reinforcement learning framework for training a self-flying drone. We implement the Dreamer, proposed in prior work, as an environment model that responds to the action taken by the drone by predicting the next video frame as a new state signal. The Dreamer is a conditional video sequence generator. This model-based environment avoids the time-consuming interactions between the agent and the environment, greatly speeding up the training process. This demonstration shows, for the first time, the Dreamer being applied to train an agent that can finish the racing task in the AirSim simulator.
ISBN: 0819444111 (print)
In this paper, a further study based on the results of [4] is conducted to achieve a near-optimum design of stream shuffling and error concealment for error-resilient transmission of wireless video. In particular, a two-phase stream-shuffling scheme is systematically scheduled in conjunction with embedded channel codes, while a new error detection method and an adaptive error concealment algorithm are devised to effectively mitigate the effects of residual errors on visual quality. Compared with the algorithms proposed in [4], the new methodology may provide a better, more systematic understanding of the optimum design of stream shuffling, yielding improved performance in wireless video communication.
ISBN: 9781665475921 (print)
Glass reflection is a problem when taking photos through glass windows or showcases. As the visual quality of captured images can be enhanced by removing reflections, we develop an intelligent reflection elimination imaging device based on a polarizer to minimize the effect of reflections on the images. The system mainly consists of a polarizing module, an image analysis module and a reflection removal module. Users can hold the device and capture images with minimal reflection by day or night. The demo video is available at: https://***/10.6084/***.19687830.v1.
ISBN: 9781728180687 (print)
This paper addresses the problem of image-based localization. The goal is to find, quickly and accurately, the relative pose between a query taken with a stereo camera and a map obtained using visual SLAM, which contains poses and 3D points associated with descriptors. In this paper we introduce a new method that leverages stereo vision by adding geometric information to visual descriptors. This method can be used when the vertical direction of the camera is known (for example, on a wheeled robot). The new geometric visual descriptor can be used with several image-based localization algorithms based on visual words. We test the approach on different datasets (indoor, outdoor) and show experimentally that the new geometric visual descriptor improves standard image-based localization approaches.