ISBN: 0819437034 (print)
This paper describes a novel filtering method for reconstructing an arbitrarily focused image from two differently focused images. Assuming that the scene has two layers (foreground and background), the method uses a linear imaging model that relates the two acquired, differently focused images to the desired image, in which the blurring effect is manipulated independently in each layer. The linear equation that holds between these images, derived from the imaging model, can be formulated as an image restoration problem. This paper shows that the solution to this problem exists as an inverse filter, so the desired image can be reconstructed by linear filtering alone. As a result, highly accurate reconstruction and fast processing can be achieved. Experiments using real images are shown.
ISBN: 9783030638221; 9783030638238 (print)
Most existing attention-based image captioning methods focus on the current visual and text information at each step to generate the next word, without considering the coherence within the visual information and within the text information itself. We propose a sufficient visual information (SVI) module to supplement the visual information already contained in the network, and a sufficient text information (STI) module that predicts more text words to supplement the text information contained in the network. The SVI module embeds the attention values from the past two steps into the current attention to match human visual coherence. The STI module predicts the next three words in one step and uses their probabilities jointly for inference. Finally, this paper combines the two modules into an image captioning algorithm based on a sufficient visual information and text information model (SVITI), which further integrates existing visual information and future text information in the network and thereby improves captioning performance. The three methods are applied to a classic image captioning algorithm and achieve significant performance improvements compared with the latest method on the MS COCO dataset.
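As a rough illustration of the idea of embedding the attention from the past two steps into the current attention, the sketch below blends three attention vectors and renormalizes the result. The abstract does not give the actual formulation, so the function name `blend_attention` and the mixing weights are purely hypothetical and are not the authors' SVI module.

```python
def blend_attention(current, previous, before_previous, weights=(0.6, 0.3, 0.1)):
    """Mix the current attention with the two preceding attention maps.

    Each argument is a list of non-negative attention weights over the same
    image regions. The mixing weights are illustrative assumptions only.
    """
    if not (len(current) == len(previous) == len(before_previous)):
        raise ValueError("attention vectors must have the same length")
    w0, w1, w2 = weights
    mixed = [w0 * c + w1 * p + w2 * b
             for c, p, b in zip(current, previous, before_previous)]
    total = sum(mixed)
    if total == 0:
        raise ValueError("attention vectors must not all be zero")
    # Renormalize so the blended attention is again a distribution.
    return [m / total for m in mixed]


if __name__ == "__main__":
    # The current step looks at region 0; the two earlier steps looked at region 1.
    print(blend_attention([1.0, 0.0], [0.0, 1.0], [0.0, 1.0]))
```

With these assumed weights, attention from earlier steps still contributes to the current step, which is one simple way to model temporal coherence.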
ISBN: 0819444111 (print)
Earlier research on perceptual image coding was mainly developed within the traditional sequential coding framework, where the codestream is neither rate scalable nor resolution scalable. In this paper, our earlier embedded subband/wavelet image coding algorithm, EZBC, is further developed for highly scalable image coding applications. Special attention is given to perceptual image coding under varying viewing and display conditions, a common situation in typical scalable coding environments. Unlike the conventional perceptual image coding approach, all the perceptually coded images (each targeted at particular viewing conditions) are decoded from a single compressed bitstream file. The experimental results show that the bitrate savings of the proposed algorithm are significant, particularly for coding high-definition (HD) images.
ISBN: 0819410187 (print)
In this paper we describe an architecture and an implementation method for multipoint teleconference systems, one of the most important applications of image communications. We studied a centralized architecture that uses Multipoint Control Units (MCUs) as service providers. To apply this architecture to large-scale systems, we adopted a hierarchical star configuration for inter-MCU connections. We have also developed a composite mechanism that combines international standard protocols with our original protocol for high-performance services. We have built a prototype teleconference system that provides a variety of services, including several procedures for opening conferences and various types of conference modes. These services can be used not only by our original videoconference terminals but also by international standard terminals and ordinary audio telephones.
ISBN: 9781509028603 (print)
This paper introduces a simple yet effective framework for object retrieval and localization. Our method is based on min-Hash and uses an object representation that preserves compositional structure. Traditional hash-based content-based image retrieval (CBIR) systems suffer from low recall because their image representations are insufficiently discriminative; in comparison, our method makes three contributions. First, a new image feature, Pair of Geometric Coupled Words (PGCW), is presented to impose spatial context on visual words and generate very discriminative hash functions. Second, we select a batch of hash functions by learning from a number of supervised retrievals; the sketches are then generated by selecting hash functions from the constructed object model. Finally, in the hash sketch matching step, we introduce an auxiliary offset space in which the object location can be estimated by clustering. Our approach is validated on popular public image databases and outperforms state-of-the-art methods.
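For readers unfamiliar with min-Hash, the sketch below shows the generic technique on sets of visual-word identifiers: one minimum hash value per seeded hash function, grouped into short tuples ("sketches"). It does not implement PGCW features, the learned selection of hash functions, or the offset-space localization; the function names and the use of seeded BLAKE2b as the hash family are assumptions made for illustration.

```python
import hashlib


def _hash(word, seed):
    """Deterministic 64-bit hash of a visual word under a given seed."""
    digest = hashlib.blake2b(f"{seed}:{word}".encode(), digest_size=8).digest()
    return int.from_bytes(digest, "big")


def minhash_signature(words, num_hashes=64):
    """Return one minimum hash value per seeded hash function.

    `words` must be a non-empty collection of visual-word identifiers.
    """
    unique = set(words)
    if not unique:
        raise ValueError("words must not be empty")
    return [min(_hash(w, seed) for w in unique) for seed in range(num_hashes)]


def estimated_jaccard(sig_a, sig_b):
    """The fraction of agreeing positions estimates the Jaccard similarity."""
    matches = sum(a == b for a, b in zip(sig_a, sig_b))
    return matches / len(sig_a)


def sketches(signature, size=2):
    """Group consecutive min-hash values into fixed-size tuples."""
    return [tuple(signature[i:i + size])
            for i in range(0, len(signature) - size + 1, size)]


if __name__ == "__main__":
    query = minhash_signature({"w1", "w2", "w3", "w4"})
    candidate = minhash_signature({"w1", "w2", "w3", "w9"})
    print(estimated_jaccard(query, candidate))
```

Two images are treated as retrieval candidates when at least one sketch tuple matches exactly, which is why more discriminative hash functions can be expected to help recall.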
Effective image coding techniques are crucial for digital image storage and transmission. Traditional methods struggle to maintain high visual quality at low bitrates. In this paper, we present MobileViT-GAN, a novel ...
ISBN: 9781538661192 (print)
In this paper, we propose a novel hierarchical subspace regression algorithm based on the edge orientations of compressed face image patches. The algorithm has two phases: training and restoration. In the training phase, the distribution of face edge orientations (EO) is used to classify the image patches into shallow subspaces. Then k-means clustering is used to form the deep subspaces within each EO-based shallow subspace, and a linear mapping is trained for each deep subspace. In the restoration phase, an appropriate linear mapping, selected according to the EO of the compressed input image patch, is applied to generate the restored output patch. The experimental results show that the peak signal-to-noise ratio (PSNR) and structural similarity index (SSIM) are better than those of existing popular algorithms, and that the method effectively removes blocking artifacts and zigzag effects, improving visual quality.
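The PSNR metric mentioned above follows the standard definition, 10 * log10(peak^2 / MSE). The sketch below computes it for two grayscale images given as lists of rows. The 8-bit peak value of 255 is an assumption, since the abstract does not state the bit depth or the exact evaluation protocol.

```python
import math


def psnr(reference, restored, peak=255.0):
    """PSNR in dB between two equally sized grayscale images (lists of rows)."""
    if len(reference) != len(restored) or any(
        len(ref_row) != len(res_row)
        for ref_row, res_row in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")
    squared_error = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        for r, s in zip(ref_row, res_row):
            squared_error += (r - s) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")
    mse = squared_error / count
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


if __name__ == "__main__":
    original = [[10, 20], [30, 40]]
    decoded = [[11, 21], [31, 41]]  # every pixel is off by one
    print(round(psnr(original, decoded), 2))
```

A higher PSNR means the restored patch is numerically closer to the uncompressed reference.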
ISBN: 9780819482341 (print)
The commercial success and acceptability of 3D technology will depend critically on the overall visual quality of the rendered images. Depth Image Based Rendering (DIBR) is therefore a crucial component of the 3D system chain. In this paper, we describe a high-quality DIBR system for view synthesis. In particular, we outline a procedure for processing the so-called mixed pixels at object boundaries. A Layered Depth Image (LDI) representation of the scene is obtained from an image and its corresponding depth map. In the process, all significant mixed pixels in the image are automatically separated into their local foreground and local background. Our results show superior rendering quality, especially at object edges.
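The basic warping step that underlies DIBR can be sketched as follows: each pixel is shifted horizontally by a disparity that is inversely proportional to its depth, and nearer pixels overwrite farther ones. This is a generic forward warp of a single scanline, not the mixed-pixel or LDI processing the paper describes. The parameter names, the rectified-camera relation disparity = focal length * baseline / depth, and the shift direction are assumptions.

```python
def warp_scanline(colors, depths, focal_px, baseline):
    """Forward-warp one image row to a horizontally shifted virtual view.

    Returns a row of the same width; positions that no source pixel maps to
    stay None (disocclusion holes that a full system would have to fill).
    """
    width = len(colors)
    out = [None] * width
    out_depth = [float("inf")] * width
    for x, (color, z) in enumerate(zip(colors, depths)):
        disparity = focal_px * baseline / z   # larger for nearer pixels
        target = x - int(round(disparity))    # assumed shift direction
        # Depth test: keep the pixel closest to the camera.
        if 0 <= target < width and z < out_depth[target]:
            out[target] = color
            out_depth[target] = z
    return out


if __name__ == "__main__":
    row = ["a", "b", "c", "d"]
    depth = [10.0, 10.0, 1.0, 10.0]  # "c" belongs to a near object
    print(warp_scanline(row, depth, focal_px=1.0, baseline=1.0))
```

In this example the near pixel "c" moves one position left and occludes "b", leaving a hole where it used to be. Object boundaries are exactly where such holes and occlusions appear, which is why boundary pixels need special care.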
ISBN: 9798350350463; 9798350350456 (print)
Perceptual quality metrics derived from deep features have advanced the modelling of how the Human Visual System (HVS) perceives the quality of visual content. In this work, we study the effectiveness of fine-tuning three standard convolutional neural networks (CNNs), namely ResNet50, VGG16 and MobileNetV2, to predict the quality of stereoscopic images in the no-reference setting. This work also aims to understand the impact of using disparity maps for quality prediction. Interestingly, our experiments demonstrate that disparity maps do not significantly improve perceptual quality estimation in the deep learning framework. To the best of our knowledge, this is the first study to explore the impact of disparity together with the chosen models for Stereoscopic Image Quality Assessment. We present a detailed study of our experiments with various architectural configurations on the LIVE Phase I and II datasets. Further, our results demonstrate the innate capability of deep features for quality prediction. Finally, simple fine-tuning of the models yields solutions that compete with state-of-the-art patch-based stereoscopic image quality assessment methods.
ISBN: 9781665436496 (print)
One of the most important problems in biomedical image analysis is the small amount of available data and the cost to researchers of accessing labeled data. To address this problem, this paper synthesizes microscopic fluorescence in situ hybridization (FISH) images with a generative adversarial network. The generative adversarial network is trained to synthesize FISH images from mask images. The trained model was applied to 150 test images, and its performance was both presented with visual results and evaluated quantitatively using performance metrics. Evaluation of the synthesized FISH images in terms of image quality and structural features indicates that they can be used to address the lack of data.