An embedded still image coding algorithm with rate-distortion optimization based on human visual weighting is proposed. The rate-distortion optimized embedding codes coefficients in order of decreasing R-D slope, so the first ...
ISBN: (print) 0780331923
The most important visual artefacts that result from DCT-based image compression are mosquito noise and blocking. Mosquito noise becomes visible because the quantization noise is distributed uniformly over blocks that contain both textured and flat regions, which makes the noise clearly visible in the flat regions. Spatial noise shaping is an encoder-based technique that can reduce mosquito noise by 'hiding' the noise in the part of the block where it is least visible. We propose an efficient noise-shaping algorithm which gives accurate control of the spatial noise distribution. The encoder-based technique is fully compatible with the JPEG and MPEG standards. Experiments show that noise shaping leads to a significant mosquito noise reduction.
ISBN: (print) 9781538636497
Visual Question Answering (VQA) has received increasing attention due to the success of computer vision and natural language processing. The computer is required to understand the image, comprehend the question, and reply to it. The image modality makes these questions harder to answer than purely textual ones. In general, since VQA tasks use Convolutional Neural Networks (CNNs) to extract image features, a better CNN model is preferred for obtaining better image representations. In this paper, the Static Correlative Filter (SCF), an advanced technique for convolutional layers, is employed for VQA, as the convolutional layer is the major component of a CNN. The effectiveness of SCF for VQA is demonstrated by experiments on the COCO-QA benchmark dataset with two baseline image question answering models.
ISBN: (print) 9781424419982
Watermarking is used for copyright protection, with a logo, image, stamp, or text serving as the watermark. Recently, embedding binary images into color images has been studied extensively and has given very promising results. In this work, we take text with small characters and embed it into a color image as an image. We use the Discrete Wavelet Transform to embed portions of the watermark into different frequency bands of the cover image at two or more decomposition levels, which is very useful for both copyright protection and information hiding. We obtained very promising results after common attacks.
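The wavelet-domain embedding described in this abstract can be illustrated with a minimal sketch. It is not the authors' scheme: it assumes a single grayscale channel, a one-level Haar transform, additive embedding in one detail band, and non-blind extraction that compares against the original cover. The names `haar2d`, `ihaar2d`, `embed`, `extract`, and the strength `alpha` are hypothetical.

```python
# Sketch only: one-level Haar DWT, additive embedding, non-blind extraction.
# All function names and the strength alpha are illustrative assumptions.

def haar2d(img):
    """One-level orthonormal 2-D Haar transform of an even-sized image."""
    h, w = len(img) // 2, len(img[0]) // 2
    bands = {k: [[0.0] * w for _ in range(h)] for k in ("LL", "LH", "HL", "HH")}
    for i in range(h):
        for j in range(w):
            a, b = img[2 * i][2 * j], img[2 * i][2 * j + 1]
            c, d = img[2 * i + 1][2 * j], img[2 * i + 1][2 * j + 1]
            bands["LL"][i][j] = (a + b + c + d) / 2  # local average
            bands["LH"][i][j] = (a - b + c - d) / 2  # left-right difference
            bands["HL"][i][j] = (a + b - c - d) / 2  # top-bottom difference
            bands["HH"][i][j] = (a - b - c + d) / 2  # diagonal difference
    return bands

def ihaar2d(bands):
    """Inverse of haar2d."""
    h, w = len(bands["LL"]), len(bands["LL"][0])
    img = [[0.0] * (2 * w) for _ in range(2 * h)]
    for i in range(h):
        for j in range(w):
            ll, lh = bands["LL"][i][j], bands["LH"][i][j]
            hl, hh = bands["HL"][i][j], bands["HH"][i][j]
            img[2 * i][2 * j] = (ll + lh + hl + hh) / 2
            img[2 * i][2 * j + 1] = (ll - lh + hl - hh) / 2
            img[2 * i + 1][2 * j] = (ll + lh - hl - hh) / 2
            img[2 * i + 1][2 * j + 1] = (ll - lh - hl + hh) / 2
    return img

def embed(cover, bits, alpha=4.0, band="HH"):
    """Add +alpha for a 1 bit and -alpha for a 0 bit to the chosen band."""
    bands = haar2d(cover)
    for i, row in enumerate(bits):
        for j, bit in enumerate(row):
            bands[band][i][j] += alpha if bit else -alpha
    return ihaar2d(bands)  # values are left as floats, without rounding or clipping

def extract(marked, cover, shape, band="HH"):
    """Recover bits from the sign of the coefficient change (needs the cover)."""
    bm, bc = haar2d(marked)[band], haar2d(cover)[band]
    return [[1 if bm[i][j] > bc[i][j] else 0 for j in range(shape[1])]
            for i in range(shape[0])]

# Usage: an 8x8 synthetic cover and a 4x4 binary watermark.
cover = [[(7 * x + 13 * y) % 256 for x in range(8)] for y in range(8)]
bits = [[(x + y) % 2 for x in range(4)] for y in range(4)]
marked = embed(cover, bits)
recovered = extract(marked, cover, (4, 4))
```

Spreading portions of the watermark over several bands or levels, as the abstract describes, would amount to calling `haar2d` again on the `LL` band and choosing a different `band` for each portion.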
ISBN: (print) 9781467365406
Humans can view an image and immediately determine what it is trying to convey. While this is easy for humans, it is still considerably difficult for a computer to do of its own accord. The challenge broadly lies in developing an automatic process to complement and supplant human visual and neural systems. In this paper, we address the core issue of giving an image the ability to caption itself automatically. We propose a hybrid engine that combines feature detection algorithms with a context-free grammar to create a model that semantically and logically describes an image in its entirety. Our hybrid engine model has an F1 score of 94.33% and a unigram score of 75% when evaluated on a novel dataset trained on human-annotated images.
ISBN: (print) 9781479928132
In this work, we investigated the processes required for visual extraction and remote control of a KUKA KR-125 industrial robot manipulator. For this purpose, the robot controller communicates with the external system over an Ethernet cable (IEEE 802.3), and the data are exchanged using the TCP/IP protocol. To do this, we first built a client/server application covering all relevant motion control. Second, we set up a Kinect near the robot for object detection (shape recognition, position determination, etc.). Finally, we applied the system to a practical example: we programmed the robot to stack objects, relying on the reliability of the visual processing.
ISBN: (print) 9798350349405; 9798350349399
Wireless communication systems operating in fading channels often demand pilots for channel estimation and data recovery, leading to substantial transmission overhead. In this paper, we propose a novel pilot-free semantic communication system designed for transmitting images over multi-user MIMO (MU-MIMO) fading channels. Specifically, our method involves extracting multi-scale semantic features from the source image at the transmitter, effectively embedding pilot-like information. At the receiver, we extract channel features from these semantic features at each scale, enabling the reconstruction of the source image without requiring explicit channel estimation and signal detection. To enhance the image reconstruction process, we introduce a novel module, called Resnet Transformer, which combines multi-head self-attention (MHSA) with Resnet block. Our experimental results demonstrate that this pilot-free system outperforms existing pilot-aided semantic communication methods in terms of perceptual quality and transmission efficiency.
ISBN: (print) 9780769532783
We propose a novel blind, robust watermarking scheme that embeds a digital watermark into the DCT coefficients of the host image based on human visual system (HVS) models. Watermark extraction uses Independent Component Analysis (ICA), a blind source separation technique that requires neither the host image nor the original watermark. Our method is a step toward showing that Kerckhoffs' principle can be applied in the watermarking framework. A series of experiments shows that the proposed method is effective against attacks such as JPEG compression, image resizing, and filtering.
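To make the idea of blind extraction from DCT coefficients concrete, here is a minimal sketch. It deliberately swaps in a different and much simpler technique than the one in the abstract: quantization index modulation (QIM) on one mid-frequency coefficient of an 8x8 block, with no ICA and no HVS model. The names `dct2`, `idct2`, `embed_bit`, `extract_bit`, and the parameters `pos` and `step` are hypothetical.

```python
import math

N = 8  # block size, as in JPEG-style 8x8 DCT blocks

def _scale(k):
    # Normalization factors of the orthonormal DCT-II.
    return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

def _basis(n, k):
    return math.cos((2 * n + 1) * k * math.pi / (2 * N))

def dct2(block):
    """Orthonormal 2-D DCT-II of an 8x8 block (direct, unoptimized)."""
    return [[_scale(u) * _scale(v) * sum(block[x][y] * _basis(x, u) * _basis(y, v)
                                         for x in range(N) for y in range(N))
             for v in range(N)] for u in range(N)]

def idct2(coef):
    """Inverse of dct2."""
    return [[sum(_scale(u) * _scale(v) * coef[u][v] * _basis(x, u) * _basis(y, v)
                 for u in range(N) for v in range(N))
             for y in range(N)] for x in range(N)]

def embed_bit(block, bit, pos=(3, 2), step=16.0):
    """Move one coefficient to the centre of a quantization cell whose
    index parity equals the bit (QIM, an assumed stand-in technique)."""
    coef = dct2(block)
    u, v = pos
    q = math.floor(coef[u][v] / step)
    if q % 2 != bit:
        q += 1
    coef[u][v] = (q + 0.5) * step
    return idct2(coef)

def extract_bit(block, pos=(3, 2), step=16.0):
    """Blind extraction: only the marked block, pos, and step are needed."""
    u, v = pos
    return math.floor(dct2(block)[u][v] / step) % 2

# Usage: embed and recover one bit in a synthetic block.
host = [[(11 * x + 5 * y) % 256 for y in range(N)] for x in range(N)]
marked = embed_bit(host, 1)
bit = extract_bit(marked)
```

A perceptual model could in principle set `step` per coefficient so that the change stays below visibility thresholds; this sketch uses a single fixed step.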
An integrated methodology for detection of cracks and surface degradation evaluation is presented in this paper. Structural health monitoring along with the techniques of image processing are used here for the strengt...
ISBN: (print) 9781479948741
As video traffic over the Internet and service provider networks peaks, the need for image quality measurement is obvious. Such measurements must be made in real time, which requires "No-Reference" measurements, i.e., measurements that do not use the uncompressed (raw) images. Today, the great majority of video streams are digital. Digital video streams are compressed, and the compression is DCT-based. In such streams, quality degradation results from reduced bitrates, and since the DCT algorithm is block-based, the degradation manifests itself as "Blockiness". In this work, we present a blockiness measurement method (called RED) that gives results closer to human perception than methods in the literature. One of the main contributions of RED is the analytic relationship it establishes between its automatically produced blockiness metric and the scores obtained from human testers. RED optimizes the parameters of this relationship through "Regression". Another feature that distinguishes RED from the literature is that it skips some DCT blocks with the help of "Edge Detection".
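A no-reference blockiness measure of the general kind discussed in this abstract can be sketched as follows. This is not RED: it has no regression against human scores and no edge detection, and it assumes an 8x8 block grid aligned with the image origin. It simply compares pixel differences across block boundaries with differences inside blocks. The function name `blockiness` and the `block` parameter are hypothetical.

```python
def blockiness(img, block=8):
    """Ratio of the mean absolute neighbour difference across block
    boundaries to the mean difference inside blocks.

    img is a 2-D list of grayscale values. A ratio near 1 suggests no
    visible block structure; larger values suggest stronger blocking.
    """
    h, w = len(img), len(img[0])
    boundary, interior = [], []
    # Horizontal neighbours (x, x + 1); a boundary lies after every block-th column.
    for y in range(h):
        for x in range(w - 1):
            d = abs(img[y][x] - img[y][x + 1])
            (boundary if (x + 1) % block == 0 else interior).append(d)
    # Vertical neighbours (y, y + 1); a boundary lies after every block-th row.
    for y in range(h - 1):
        for x in range(w):
            d = abs(img[y][x] - img[y + 1][x])
            (boundary if (y + 1) % block == 0 else interior).append(d)
    if not boundary or not interior:
        raise ValueError("image must span more than one block")
    eps = 1e-9  # avoids division by zero for perfectly flat block interiors
    return (sum(boundary) / len(boundary)) / (sum(interior) / len(interior) + eps)

# Usage: a smooth ramp versus an image made of flat 8x8 tiles.
ramp = [[x + y for x in range(16)] for y in range(16)]
tiles = [[50 * (x // 8 + y // 8) for x in range(16)] for y in range(16)]
score_ramp = blockiness(ramp)
score_tiles = blockiness(tiles)
```

Because it needs only the decoded frame, a measure like this can run without access to the raw source, which is the property the abstract calls "No-Reference".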