ISBN: 9781479961399 (print)
Many users want to preserve a visual record of moments they wish to commemorate. Nonetheless, it remains difficult to recall the actual emotional feeling of a moment merely by looking at an old picture. Existing methods either tag the image or hide a message within it. However, the former attaches additional data to the image and the latter degrades image quality, and it is difficult to avoid both tradeoffs. In this paper, we propose D-mago to preserve a moment as an image that consists of both the visual information and the emotional feeling, without binding extra data or degrading the quality of the image. To verify the benefit of the proposed algorithm, we conducted a series of evaluation studies. The results indicate that D-mago overcomes the preceding tradeoffs while maintaining a PSNR above 40 dB.
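For readers unfamiliar with the quality figure quoted above, the following is a minimal sketch of how PSNR is conventionally computed for 8-bit images. It is not taken from the D-mago paper; the function name `psnr` and the default peak value of 255 are assumptions made for illustration.

```python
import numpy as np


def psnr(reference, distorted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    `peak` is the maximum possible pixel value (255 assumed for 8-bit data).
    """
    ref = np.asarray(reference, dtype=np.float64)
    dist = np.asarray(distorted, dtype=np.float64)
    if ref.shape != dist.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((ref - dist) ** 2)  # mean squared error over all pixels
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)
```

As a point of reference, an 8-bit image in which every pixel differs from the original by exactly one gray level has an MSE of 1 and therefore a PSNR of about 48.13 dB, so a figure above 40 dB corresponds to very small pixel-level changes.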
ISBN: 9781538615010 (print)
Inpainting applications include object removal in images and videos, crack filling, error concealment, and texture synthesis. This paper analyses the use of inpainting for image coherence and perspective emphasis on video frames in a 2D image-to-video conversion system. In addition, the performance of different techniques in object removal and image reconstruction is compared using visual experiments and quality metrics.
ISBN: 9781665475921 (print)
Magnetic Resonance Imaging (MRI) is widely used for medical diagnosis, staging and follow-up of disease. However, MRI images may contain artifacts caused by factors such as patient movement or machine distortion, which may be unintentionally introduced during medical image acquisition, processing, etc. These artifacts may reduce the effectiveness of diagnosis or even cause false diagnosis. To solve this problem, we propose a general medical image quality assessment (MIQA) methodology, including subjective MIQA procedures and objective MIQA algorithms. In this paper we apply this methodology to MRI images, owing to the widespread use of MRI in practical applications. We first establish a magnetic resonance imaging quality assessment (MRIQA) database, which contains 3809 MRI images. A subjective image quality assessment experiment is then conducted by expert doctors according to the diagnostic value of these images, which splits the MRI images into 1285 low quality images and 2524 high quality images. We then conduct a baseline deep learning experiment and propose an attention based MIQANet model to automatically separate MRI images into high quality and low quality based on their diagnostic value. Our proposed method achieves a quality assessment accuracy of 96.59%. The constructed MRIQA database and the proposed MIQA model will be made publicly available to further promote medical IQA research.
ISBN: 9781538615010 (print)
In this study, a new method is proposed for inserting advertisement visuals into images automatically and without disturbing the image content. In this method, important areas are determined using deep learning based object, face and text detection, edge and saliency maps are obtained, and this information is used to identify the best location for inserting the advertisement visual. Shape and color features are used to select the best available advertisement visual from an advertisement pool.
ISBN: 9781467328210; 9781467328203 (print)
This paper presents an audio visual (AV) person identification system using a Linear Regression-based Classifier (LRC). Class specific models are created by stacking q-dimensional speech and image vectors from the training data. The person identification task is treated as a linear regression problem, i.e., a test (speech or image) feature vector is expressed as a linear combination of the (speech or image) model of the class it belongs to. The Euclidean distances between a test feature vector and the estimated response vectors for all the class specific models are used as matching scores. These matching scores from both modalities are normalized using the min-max score normalization technique and then combined using the sum rule of fusion. The system was tested on 88 subjects from the AusTalk AV database. Experimental results show that the identification accuracy after AV fusion is higher than that of either individual modality.
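The pipeline described above (per-class regression, distance scores, min-max normalization, sum-rule fusion) can be sketched as follows. This is an illustrative reading of the abstract, not the authors' implementation; the function names `lrc_distances`, `min_max` and `fuse_and_identify` are hypothetical, and each class model is assumed to be a q × n matrix whose columns are that class's training vectors.

```python
import numpy as np


def lrc_distances(class_models, y):
    """Distance from y to its least-squares reconstruction by each class model."""
    dists = []
    for X in class_models:  # X: (q, n_train) matrix of stacked training vectors
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)  # regression coefficients
        dists.append(np.linalg.norm(y - X @ beta))    # Euclidean distance to estimate
    return np.array(dists)


def min_max(scores):
    """Min-max normalization of a score vector to the range [0, 1]."""
    lo, hi = scores.min(), scores.max()
    if hi == lo:
        return np.zeros_like(scores)
    return (scores - lo) / (hi - lo)


def fuse_and_identify(speech_models, image_models, y_speech, y_image):
    """Sum-rule fusion of normalized distances; the smallest fused score wins."""
    fused = (min_max(lrc_distances(speech_models, y_speech))
             + min_max(lrc_distances(image_models, y_image)))
    return int(np.argmin(fused)), fused
```

Because the scores are distances, the identified class is the one with the smallest fused score. Normalizing before summing keeps one modality from dominating merely because its raw distances are on a larger scale.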
ISBN: 9781424471379 (print)
This paper is concerned with investigating, experimenting with, and validating a local adaptive threshold system with compound motion analysis. The motivation is to analyze moving objects in outdoor/indoor video frames with respect to movement detection, object segmentation, and feature extraction, in addition to DFT-based velocity computation. The underlying methodology exhibits a single correlation of the behavioral-mathematical model of the examined image sequences, identifying the images as time-varying functions that can be processed through the 2D Discrete Fourier Transform (DFT). The method is justified through output data, human visual inspection, and histogramming, which show appreciable accuracy, a lower level of noise, and shorter segmentation time in comparison with some available standard techniques. Applications of the presented method may include security control, industry and traffic control, surveillance, and general civil and military fields.
ISBN: 9781479948741 (print)
Exposure Fusion is a popular multi-exposure image fusion method which blends a set of differently exposed low dynamic range images of a scene to obtain another low dynamic range but contrast rich image. This approach carries out the integration process by using three local quality measures, namely contrast, saturation and exposedness. Our aim in this study is to extend the exposure fusion method by incorporating a novel visual saliency based quality measure. This new measure captures the parts of the scene that grab our attention and gives more prominence to these salient regions, which is not possible with the previous measures in use. Our experiments show that, compared to the exposure fusion method, our saliency-guided approach gives more vivid results and leads to sharp boundaries in the output images.
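The weighting idea described above can be illustrated with a simplified sketch. It assumes the commonly used definitions of the three measures (absolute Laplacian response for contrast, per-pixel channel standard deviation for saturation, a Gaussian around mid-gray for exposedness) and treats the saliency map as an externally supplied, hypothetical input, since the abstract does not specify how saliency is computed. It also blends per pixel rather than with the multiresolution pyramid normally used in exposure fusion, so it should be read as an illustration of the weights only.

```python
import numpy as np


def laplacian_abs(gray):
    """Absolute response of a 4-neighbour Laplacian (edges replicated)."""
    p = np.pad(gray, 1, mode="edge")
    lap = p[:-2, 1:-1] + p[2:, 1:-1] + p[1:-1, :-2] + p[1:-1, 2:] - 4.0 * gray
    return np.abs(lap)


def quality_weights(img, saliency=None, sigma=0.2, eps=1e-12):
    """Per-pixel weight map for a float RGB image in [0, 1], shape (H, W, 3)."""
    contrast = laplacian_abs(img.mean(axis=2))
    saturation = img.std(axis=2)
    exposedness = np.prod(np.exp(-((img - 0.5) ** 2) / (2.0 * sigma ** 2)), axis=2)
    # eps keeps the weights positive in flat or gray regions (an assumption).
    w = (contrast + eps) * (saturation + eps) * exposedness
    if saliency is not None:  # hypothetical extra measure with values in [0, 1]
        w = w * (saliency + eps)
    return w


def fuse(images, saliencies=None):
    """Naive per-pixel weighted blend of differently exposed images."""
    if saliencies is None:
        saliencies = [None] * len(images)
    weights = np.stack([quality_weights(i, s) for i, s in zip(images, saliencies)])
    weights /= weights.sum(axis=0, keepdims=True)  # weights sum to 1 at each pixel
    fused = np.sum(weights[..., None] * np.stack(images), axis=0)
    return fused, weights
```

Multiplying the measures means a pixel must score reasonably on all of them to receive a high weight; the saliency factor then shifts weight toward whichever exposure renders the attention-grabbing regions best.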
ISBN: 0780367251 (print)
Visual presentation of a talking person requires the generation of image frames showing the speaker in various views while pronouncing various phonemes. The existing approaches mostly use either a complex 3D geometric model to reconstruct a desired image or a set of 2D images for each viewpoint to select from. We propose a new system which utilizes facial feature detection and image-based transformation to create any talking frame using only one given image from the desired viewpoint and a set of reference images from one standard view. The proposed approach, together with optical flow-based view morphing and a customizable concatenative Text-To-Speech, makes a personalized visual speech generation system which can be used for moving/talking head applications where an optimal trade-off between computational complexity and image database requirements is necessary.
ISBN: 9781728180687 (print)
Rapid 3D reconstruction of dynamic scenes is very useful in 3D object structure analysis, accident avoidance for UAVs, and other visual applications. For dynamic scenes, coded structured light methods have been proposed to obtain the depth information of an object in the 3D world, and most of them are based on spatial codification. In practice, two or more cameras and projectors at different viewpoints are needed to measure the dynamic scene simultaneously for rapid 3D reconstruction. However, when two traditional patterns, especially binary ones, overlap, the interference between them poses a new challenge to 3D reconstruction. Traditional patterns can hardly be separated from each other, which degrades the quality of the 3D reconstruction. To eliminate the interference problem, we propose a scheme of orthogonal coded multi-view structured light systems, which can obtain accurate depth maps for a scene. We also test the stability of the orthogonal patterns by establishing three different scenes and making comparisons with traditional patterns. Our scheme obtains new state-of-the-art results in the experiments.
ISBN: 0780367251 (print)
We present a robust and portable vision-based skin and face detection system developed for use in a multiple speaker teleconferencing system that employs both audio and video cues. An omni-directional video sensor is used to provide a view of the entire visual hemisphere, thereby allowing for multiple dynamic views of all the participants. Regions of skin are detected using simple statistical methods, along with histogram color models for both skin and non-skin color classes. Regions of skin belonging to the same person are grouped together, and the position of each person's face is inferred using simple spatial properties. Preliminary results suggest that the system is capable of detecting human faces present in an omni-directional image despite the poor resolution inherent in such an omni-directional sensor.
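One common way to use histogram color models for the two classes is a per-pixel likelihood-ratio test, sketched below. The abstract does not state which decision rule the system uses, so the ratio test, the bin count `BINS`, the add-one smoothing and the threshold are all assumptions made for illustration, and the function names are hypothetical.

```python
import numpy as np

BINS = 8  # assumed number of bins per RGB channel


def color_histogram(pixels, bins=BINS):
    """Normalized 3D RGB histogram from an (N, 3) uint8 array of labeled pixels."""
    idx = (np.asarray(pixels, dtype=np.int64) * bins) // 256
    hist = np.ones((bins, bins, bins), dtype=np.float64)  # add-one smoothing
    np.add.at(hist, (idx[:, 0], idx[:, 1], idx[:, 2]), 1.0)
    return hist / hist.sum()


def skin_mask(image, skin_hist, nonskin_hist, threshold=1.0):
    """Mark pixels whose skin/non-skin likelihood ratio exceeds the threshold."""
    bins = skin_hist.shape[0]
    idx = (np.asarray(image, dtype=np.int64) * bins) // 256
    r, g, b = idx[..., 0], idx[..., 1], idx[..., 2]
    ratio = skin_hist[r, g, b] / nonskin_hist[r, g, b]
    return ratio > threshold
```

The resulting boolean mask would then feed the grouping step, where connected skin regions are assigned to individual people. Raising the threshold trades missed skin pixels for fewer false detections.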