An increasing number of image processing applications require automated prediction of the quality of visual content as perceived by humans. Since sparse coding is suggested to be an underlying strategy of the brain's neural system, it is logical to assume that specific tasks such as quality assessment also adhere to this strategy. However, existing perceptual quality predictors, which often mimic the different stages of the human visual system and deploy machine learning strategies such as neural networks, rarely integrate the concept of sparse coding in their design. In this paper, we first investigate the validity of this assumption by performing an empirical analysis of the relation between the structural information of the scene (captured via sparseness significance) and perceptual quality. Subsequently, we propose a new approach to integrating the significance of sparse coding features in future image quality measure (IQM) designs. We use the Fourier transform as a case study, which leads to a new IQM called the sparseness significance ranking measure (SSRM). This measure deploys a Fourier basis for sparse coding, a ranking mechanism based on the amplitudes of the sparse coefficients, and a complex correlation metric that assesses the correspondence between the ranked coefficient amplitude profiles of the reference and the distorted image. Moreover, we introduce a new methodology, separation ratio analysis, to assess the prediction quality of individual features or quality predictors given a target perceptual quality. The quality predictions of the proposed SSRM show excellent compatibility with perceptual quality scores. A set of routine benchmarking experiments on the LIVE, CSIQ, IVC, and TID2008 databases indicates highly competitive performance with state-of-the-art IQMs. Moreover, SSRM delivers this performance at a low computational cost.
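The three stages named in the abstract (Fourier coding, amplitude ranking, correlation of the ranked profiles) can be illustrated with a minimal sketch. This is not the authors' SSRM: it works on a 1-D signal with a naive DFT, and it substitutes a plain Pearson correlation for the paper's complex correlation metric, whose definition the abstract does not give. The function names and the `keep` parameter are hypothetical.

```python
import cmath
import math


def dft(signal):
    """Naive O(n^2) discrete Fourier transform of a real-valued sequence."""
    n = len(signal)
    return [
        sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(signal))
        for k in range(n)
    ]


def pearson(a, b):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def ranked_amplitude_similarity(reference, distorted, keep=None):
    """Correlate amplitude profiles after ranking by the reference amplitudes.

    Assumption: Pearson correlation stands in for the paper's correlation metric.
    `keep` (hypothetical) limits the comparison to the strongest coefficients.
    """
    ref_amp = [abs(c) for c in dft(reference)]
    dis_amp = [abs(c) for c in dft(distorted)]
    # Rank coefficient indices by descending reference amplitude.
    order = sorted(range(len(ref_amp)), key=lambda k: ref_amp[k], reverse=True)
    if keep is not None:
        order = order[:keep]
    return pearson([ref_amp[k] for k in order], [dis_amp[k] for k in order])


reference = [0, 1, 4, 2, 7, 3, 5, 1]
distorted = [0, 1, 4, 2, 7, 3, 5, 9]
score_same = ranked_amplitude_similarity(reference, reference)
score_distorted = ranked_amplitude_similarity(reference, distorted)
```

A real image quality measure would operate on the 2-D spectrum of the image; the 1-D toy signal only keeps the sketch short.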
Sparse coding has received an increasing amount of interest in recent years. It is an unsupervised learning algorithm that finds a basis set capturing high-level semantics in the data and learns sparse coordinates in terms of that basis set. Originally applied to modeling the human visual cortex, sparse coding has been shown to be useful for many applications. However, most existing approaches to sparse coding fail to consider the geometrical structure of the data space. In many real applications, the data is more likely to reside on a low-dimensional submanifold embedded in the high-dimensional ambient space. It has been shown that the geometrical information of the data is important for discrimination. In this paper, we propose a graph-based algorithm, called graph regularized sparse coding, to learn sparse representations that explicitly take into account the local manifold structure of the data. By using the graph Laplacian as a smoothing operator, the obtained sparse representations vary smoothly along the geodesics of the data manifold. Extensive experimental results on image classification and clustering demonstrate the effectiveness of the proposed algorithm.
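One common way to write a graph-regularized sparse coding objective is sketched below. The symbols are assumptions, since the abstract defines none: X is the data matrix with n samples as columns, B the basis with k atoms b_j, S the sparse codes with columns s_i, W a symmetric nearest-neighbor affinity matrix, D its diagonal degree matrix, L = D - W the graph Laplacian, and alpha, beta, c are tuning constants.

```latex
\min_{B,\,S}\; \lVert X - BS \rVert_F^2
  \;+\; \alpha\, \operatorname{Tr}\!\left( S L S^{\top} \right)
  \;+\; \beta \sum_{i=1}^{n} \lVert s_i \rVert_1
\quad \text{s.t.} \quad \lVert b_j \rVert_2^2 \le c, \;\; j = 1, \dots, k,
\qquad
\operatorname{Tr}\!\left( S L S^{\top} \right)
  = \tfrac{1}{2} \sum_{i,j=1}^{n} W_{ij}\, \lVert s_i - s_j \rVert_2^2 .
```

The trace term is what makes the codes vary smoothly: it penalizes differences between the codes of samples that are neighbors in the graph.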
ISBN (print): 9780992862633
Oriented edges in images commonly occur in co-linear and co-circular arrangements, obeying the "good continuation law" of Gestalt psychology. The human visual system appears to exploit this property of images, with contour detection, line completion, and grouping performance well predicted by such an "association field" between edge elements [1, 2]. In this paper, we show that an association field of this type can be used to enhance the sparse representation of natural images. First, we define the sparseLets framework as an efficient representation of images based on a discrete wavelet transform. Second, we extract second-order information about edge co-occurrences from a set of images of natural scenes. Finally, we incorporate this prior information into our framework and show that it allows for the extraction of features relevant to natural scenes, such as a round shape. This novel approach points the way to practical computer vision algorithms with human-like performance.
ISBN (print): 9780769531656
Nuclear magnetic resonance (NMR) spectroscopy is a technique for the analysis of complex biochemical materials, in which the identification of known sub-patterns is important. These measurements require accurate preprocessing and analysis to meet clinical standards. Here we present a method for an appropriate sparse encoding of NMR spectral data, combined with a fuzzy classification system that allows the identification of sub-patterns, including mixtures thereof. The method is evaluated against an alternative approach using simulated metabolic spectra.
ISBN (print): 9781728197661
High Efficiency Video Coding - Screen Content Coding (HEVC-SCC) is an extension to HEVC that adds sophisticated compression methods for computer-generated content. A video frame is usually split into blocks that are predicted and subtracted from the original, which leaves a residual. These residual blocks are transformed by the integer discrete sine transform (IntDST) or the integer discrete cosine transform (IntDCT), quantized, and entropy coded into a bitstream. In contrast to camera-captured content, screen content contains many similar and repeated blocks. The HEVC-SCC tools exploit these similarities in various ways. After these tools are executed, the remaining signals are handled by IntDST/IntDCT, which is designed to code camera-captured content. Fortunately, in sparse coding, the dictionary learning process that uses these residuals adapts much better, and the outcome is significantly sparser than for camera-captured content. This paper proposes a sparse coding scheme that takes advantage of the similar and repeated intra prediction residuals and targets low to mid frequency/energy blocks with a low-sparsity setup. We also apply an approach that splits the common test conditions (CTC) sequences into categories for training and testing purposes. The scheme is integrated into the HEVC-SCC test model (HM) HM-16.18+SCM-8.7 as an alternate transform, where the selection between the traditional transform and the proposed method is based on a rate-distortion optimization (RDO) decision. Experimental results show that the proposed method achieves a Bjontegaard rate difference (BD-rate) of up to 4.6% in an extremely computationally demanding setup for the "all intra" configuration, compared with HM-16.18+SCM-8.7.
ISBN (print): 9781479946129
Recent research has shown that a speaker's lip shape and movement contain rich identity-related information and can be adopted for speaker identification and authentication. Among all the static lip features, the lip texture (intensity variation inside the outer lip contour) has high discriminative power for differentiating speakers. However, existing lip texture feature representations cannot describe the texture information adequately and provide unsatisfactory identification results. In this paper, a sparse representation of the lip texture is proposed and a corresponding visual speaker identification scheme is presented. In the training stage, a sparse dictionary is built for each speaker from that speaker's texture samples. In the testing stage, for any lip image investigated, the lip texture information is extracted and the reconstruction error is calculated for each speaker's dictionary. The lip image is assigned to the speaker with the minimum reconstruction error. The experimental results show that the proposed sparse coding based scheme achieves much better identification accuracy (91.37% for isolated images and 98.21% for image sequences) than several state-of-the-art methods when considering the lip texture information only.
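The testing stage described above (code the sample with every speaker's dictionary, then pick the smallest reconstruction error) can be sketched as follows. The abstract names neither the sparse solver nor the feature extraction, so this sketch assumes plain matching pursuit over unit-norm atoms and uses toy 4-dimensional vectors with hand-made dictionaries in place of learned ones. All function names and the speaker labels are hypothetical.

```python
import math


def normalize(vector):
    """Scale a vector to unit Euclidean norm."""
    norm = math.sqrt(sum(x * x for x in vector))
    return [x / norm for x in vector]


def matching_pursuit_error(sample, atoms, n_nonzero):
    """Greedy matching pursuit over unit-norm atoms.

    Returns the squared norm of the residual after `n_nonzero` greedy steps.
    """
    residual = list(sample)
    for _ in range(n_nonzero):
        # Correlate every atom with the current residual and keep the strongest.
        scores = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda j: abs(scores[j]))
        coeff = scores[best]
        residual = [r - coeff * a for r, a in zip(residual, atoms[best])]
    return sum(r * r for r in residual)


def identify_speaker(sample, dictionaries, n_nonzero=2):
    """Return the speaker whose dictionary reconstructs the sample best."""
    errors = {
        speaker: matching_pursuit_error(sample, atoms, n_nonzero)
        for speaker, atoms in dictionaries.items()
    }
    return min(errors, key=errors.get), errors


# Toy dictionaries (assumption: a real system would learn these from texture samples).
dictionaries = {
    "speaker_a": [normalize([1, 0, 0, 0]), normalize([0, 1, 0, 0])],
    "speaker_b": [normalize([0, 0, 1, 0]), normalize([0, 0, 0, 1])],
}
sample = [0.9, 0.4, 0.1, 0.0]
predicted, errors = identify_speaker(sample, dictionaries)
```

In this toy case most of the sample's energy lies in the span of the first dictionary, so its reconstruction error is the smaller one and the sample is assigned to `speaker_a`.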
ISBN (print): 9781728119045
In orthogonal frequency division multiplexing (OFDM) systems, frequency-domain pilot-aided channel estimation is based on interpolating a down-sampled version of the channel frequency response. This is achieved by transforming the channel frequency response to the time domain, eliminating time-domain channel coefficients beyond a given delay spread, and transforming back to the frequency domain. A sliding window is used to identify the most dominant channel coefficients within a prescribed delay spread; these are retained, while the others are eliminated. This setting assumes a consecutive channel tap distribution and overlooks possible channel sparsity. To take advantage of this sparsity, we propose a method that obtains the channel taps through a sparse recovery process. The proposed method is shown to substantially improve the channel estimation quality. The improvement is commensurate with the sparsity of the channel.
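Treating the channel taps as a sparse recovery problem can be illustrated with a small noise-free sketch. The abstract does not name the recovery algorithm, so matching pursuit is used here as an assumption; the subcarrier count, pilot spacing, tap positions, and function names are likewise hypothetical. Each candidate delay contributes one atom: the DFT response of a unit tap at that delay, sampled at the pilot subcarriers.

```python
import cmath
import math


def pilot_atoms(n_subcarriers, pilot_indices, max_delay):
    """One unit-norm atom per candidate delay, sampled at the pilot subcarriers."""
    scale = 1.0 / math.sqrt(len(pilot_indices))
    return [
        [scale * cmath.exp(-2j * math.pi * p * delay / n_subcarriers)
         for p in pilot_indices]
        for delay in range(max_delay)
    ]


def recover_taps(pilot_response, atoms, n_taps):
    """Matching pursuit: greedily pick the delays that best explain the pilots."""
    residual = list(pilot_response)
    taps = [0j] * len(atoms)
    scale = 1.0 / math.sqrt(len(pilot_response))
    for _ in range(n_taps):
        # Complex inner product <atom, residual> for every candidate delay.
        scores = [sum(a.conjugate() * r for a, r in zip(atom, residual))
                  for atom in atoms]
        best = max(range(len(atoms)), key=lambda d: abs(scores[d]))
        coeff = scores[best]
        # Undo the atom normalization to express the tap in channel units.
        taps[best] += coeff * scale
        residual = [r - coeff * a for r, a in zip(residual, atoms[best])]
    return taps


# Toy setup: 64 subcarriers, a pilot on every 8th, two nonzero taps.
n_subcarriers = 64
pilots = list(range(0, n_subcarriers, 8))
true_taps = {1: 1.0 + 0j, 5: 0.5j}
pilot_response = [
    sum(h * cmath.exp(-2j * math.pi * p * d / n_subcarriers)
        for d, h in true_taps.items())
    for p in pilots
]
atoms = pilot_atoms(n_subcarriers, pilots, max_delay=len(pilots))
estimated = recover_taps(pilot_response, atoms, n_taps=2)
```

With uniformly spaced pilots and no more candidate delays than pilots, the atoms are mutually orthogonal, so two greedy steps recover both taps in this toy case. With noise or non-uniform pilots, an orthogonalized or regularized solver would usually be preferred.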
Given the explosive growth of online videos, it is becoming increasingly important to relieve the tedious work of browsing and managing the video content of interest. Video summarization aims to provide such a technique by transforming one or multiple videos into a compact one. However, conventional multi-video summarization methods often fail to produce satisfying results because they ignore the users' search intents. To this end, this paper proposes a novel query-aware approach that formulates multi-video summarization in a sparse coding framework, where the web images returned for a query are taken as the important preference information that reveals the query intent. To provide a user-friendly summarization, this paper also develops an event-keyframe presentation structure that presents keyframes in groups of specific events related to the query, using an unsupervised multi-graph fusion method. Moreover, we release a new public dataset named MVS1K, which contains about 1000 videos from 10 queries together with their video tags, manual annotations, and associated web images. Extensive experiments on the MVS1K and TVSum datasets demonstrate that our approaches produce competitive objective and subjective results. (C) 2018 Published by Elsevier Inc.
Transfer learning can transfer knowledge from a source domain to a target domain, promoting the performance of the model learned from the source data. Sparse coding can make the representation of a model more succinct and easier to manipulate. Existing transfer sparse coding methods assume that the data from the source and target domains are accurate and can provide useful information. However, in many real applications, the data in the source and target domains may contain noise and useless information, which could severely degrade the performance of the learned model. In this paper, we propose a transfer robust sparse coding based on graph and joint distribution adaption for image representation. A noise matrix model is used to handle noise and useless information in the transfer sparse coding. Moreover, the differences in marginal distribution and conditional distribution are reduced simultaneously in the transfer robust sparse coding. Extensive experiments on six benchmark datasets show that the proposed method can effectively deal with noise and useless information and therefore outperforms several state-of-the-art transfer learning methods on cross-distribution domains. (c) 2018 Elsevier B.V. All rights reserved.
Sparse coding is a prevalent method for image inpainting and feature extraction, which can repair corrupted images or improve data processing efficiency, and has numerous applications in computer vision and signal ***, several memristor-based in-memory computing systems have been proposed to enhance the efficiency of sparse coding ***, the variations and low precision of the devices will deteriorate the dictionary, causing inevitable degradation in the accuracy and reliability of the *** this work, a digital-analog hybrid memristive sparse coding system is proposed utilizing a multilevel Pt/Al₂O₃/AlOₓ/W memristor, which employs the forward stagewise regression algorithm: the approximate cosine distance calculation is conducted in the analog part to speed up the computation, followed by high-precision coefficient updates performed in the digital *** determine that four states of the aforementioned memristor are sufficient for the processing of natural ***, through dynamic adjustment of the mapping ratio, the precision requirement for the digital-to-analog converters can be reduced to 4 *** to the previous system, our system achieves higher image reconstruction quality of the 38 dB peak-signal-to-noise ***, in the context of image inpainting, images containing 50% missing pixels can be restored with a reconstruction error of 0.0424 root-mean-squared error.
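The forward stagewise regression loop mentioned above can be sketched in software as follows. This is the generic textbook form of the algorithm, not a model of the memristive system: an exact dot product stands in for the approximate analog similarity computation, and the coefficient update is an ordinary floating-point step. The step size, iteration cap, toy dictionary, and function name are assumptions.

```python
def forward_stagewise(signal, atoms, step=0.05, max_iters=1000):
    """Forward stagewise regression over unit-norm atoms.

    Each iteration finds the atom most correlated with the residual and moves
    its coefficient by a small fixed step in the direction of the correlation.
    """
    coeffs = [0.0] * len(atoms)
    residual = list(signal)
    for _ in range(max_iters):
        # Similarity search (done approximately in analog hardware in the paper).
        corr = [sum(a * r for a, r in zip(atom, residual)) for atom in atoms]
        best = max(range(len(atoms)), key=lambda j: abs(corr[j]))
        if abs(corr[best]) < step:
            break  # no atom explains the residual by at least one step
        # Small coefficient update (done in high precision in the paper).
        delta = step if corr[best] > 0 else -step
        coeffs[best] += delta
        residual = [r - delta * a for r, a in zip(residual, atoms[best])]
    return coeffs, residual


# Toy dictionary: the standard basis of R^3 (assumption, for readability).
atoms = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
signal = [0.5, 0.0, -0.25]
coeffs, residual = forward_stagewise(signal, atoms)
```

Because the atoms have unit norm, ranking them by dot product with the residual gives the same order as ranking by cosine similarity. Each coefficient ends within one step of its target value, which is the trade-off that the step size controls.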