Traditional computer vision methods cannot match human performance on fabric smoothness classification, because the task is a subjective assessment based on sparse, comprehensive and low-cost visual perception. This paper reports a new method for assessing fabric smoothness appearance, comprising feature design and wrinkle classification. A multidimensional feature was designed by generalizing vector quantization of dense scale-invariant feature transform (SIFT) descriptors to sparse coding and max pooling. Sparse coding provides a clear understanding of the receptive fields of visual neurons and can build a codebook from the low-level features. A one-against-rest linear support vector machine (SVM) was used to classify the nine grades of smoothness, with grades quantized to the 0.1 level by space distance. Results showed that the proposed approach achieved remarkable classification accuracy in comparison with the Bag-of-Features (BOF) and Spatial Pyramid Matching (SPM) algorithms.
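As a rough illustration of the max-pooling step mentioned in this abstract, the sketch below pools per-descriptor sparse codes into a single image-level feature by taking, for each codebook atom, the largest absolute response. The function name `max_pool_codes` and the toy code values are assumptions made for illustration; the abstract does not specify the pooling details, so this is not the authors' implementation.

```python
def max_pool_codes(codes):
    """Pool per-descriptor sparse codes into one image-level feature.

    codes: list of equal-length lists, one sparse code per local descriptor.
    Returns, for each dictionary atom, the largest absolute response.
    """
    if not codes:
        raise ValueError("need at least one code vector")
    n_atoms = len(codes[0])
    return [max(abs(code[j]) for code in codes) for j in range(n_atoms)]


# Hypothetical codes for three descriptors over a four-atom codebook.
codes = [
    [0.0, 0.9, 0.0, 0.0],
    [0.0, 0.0, -0.4, 0.0],
    [0.2, 0.3, 0.0, 0.0],
]
feature = max_pool_codes(codes)
print(feature)  # [0.2, 0.9, 0.4, 0.0]
```

The pooled vector has one entry per atom regardless of how many descriptors the image produced, which is what makes it usable as a fixed-length input to a linear classifier.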
Automatic annotation of images with descriptive words is a challenging problem with vast applications in image search and retrieval. It can be viewed as a label-assignment problem in which a classifier deals with a very large set of labels, i.e., the vocabulary set. We propose a novel annotation method that employs two layers of sparse coding and performs coarse-to-fine labeling. Themes extracted from the training data are treated as coarse labels. Each theme is a set of training images that share a common subject in their visual and textual contents. Our system extracts coarse labels for training and test images without requiring any prior knowledge. Vocabulary words are the fine labels to be associated with images. Most annotation methods achieve low recall because of the large number of available fine labels, i.e., vocabulary words. These systems also tend to achieve high precision only for highly frequent words. On the other hand, the text mining literature discusses a general trend in which relatively rare or moderately frequent words are more important for the search and retrieval process than extremely frequent words. Our system not only outperforms various previously proposed annotation systems, but also achieves a symmetric response in terms of precision and recall. It maintains high precision for words across a wide range of frequencies. This behavior is achieved by intelligently reducing the number of fine labels, or words, available for each image based on the coarse labels assigned to it. (C) 2017 Elsevier B.V. All rights reserved.
In this paper, we propose a novel method to detect anomalies in videos based on sparse reconstruction. Unlike traditional methods, it employs two kinds of dictionaries for anomaly detection: a global dictionary and an online one. The global dictionary is first trained on training samples and then used for local online dictionary learning and anomaly detection. A novel updating scheme is proposed for the local online dictionary learning to make anomaly detection more accurate. Experiments on public databases show that our method can effectively detect abnormal events in complex scenes.
Sparse coding, which aims at finding appropriate sparse representations of data with an overcomplete dictionary set, is a well-established signal processing methodology with good efficiency in various areas. The choice of sparse constraint can greatly influence the performance of sparse coding algorithms. However, commonly used sparse regularization may not be robust under high-coherence conditions. In this paper, inspired by the independently interpretable lasso (IILasso), which takes the coherence of the sensing matrix columns into account in the constraint in order to select uncorrelated variables, we propose a new regularization by introducing the l(p) norm (0 < p < 1) into the regularization part of IILasso. The new regularization can efficiently improve performance in obtaining sparse and accurate coefficients. To solve the optimization problem with the new regularization, we propose to use the coordinate descent algorithm with a weighted l(1) norm, named independently interpretable weighted lasso (IIWLasso), and the proximal operator, named independently interpretable iterative shrinkage thresholding algorithm (II-ISTA) and independently interpretable proximal operator for l(2/3) norm regularization (II2/3PO). We present synthetic data experiments and gene expression data experiments to validate the performance of the proposed algorithms. The experimental results show that all independently interpretable algorithms perform better than their original counterparts under different coherence conditions. Among them, IIWLasso achieves the best performance in terms of relative norm error and support error on synthetic data and misclassification error in tenfold cross-validation on gene expression data.
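For orientation, the following is a minimal sketch of the plain iterative shrinkage thresholding algorithm (ISTA) for the standard l(1)-regularized problem, i.e., the baseline that the abstract's II-ISTA extends. It does not include the coherence-aware or l(p) terms proposed in the paper, whose exact form the abstract does not give. The names `ista` and `soft_threshold` and the toy data are illustrative assumptions.

```python
import numpy as np


def soft_threshold(v, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def ista(D, y, lam, n_iter=200):
    """Minimize 0.5 * ||y - D x||^2 + lam * ||x||_1 with plain ISTA."""
    L = np.linalg.norm(D, 2) ** 2  # Lipschitz constant of the gradient
    x = np.zeros(D.shape[1])
    for _ in range(n_iter):
        grad = D.T @ (D @ x - y)  # gradient of the quadratic data term
        x = soft_threshold(x - grad / L, lam / L)
    return x


# With an identity dictionary the solution is the soft-thresholded signal.
y = np.array([3.0, -0.5, 1.5])
x = ista(np.eye(3), y, lam=1.0)
print(x)
```

With the identity dictionary used here, the small entry is set to zero and the two large entries are each shrunk by `lam`, which makes the effect of the l(1) penalty easy to check by hand.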
Sparse coding, which aims at finding appropriate sparse representations of data with an overcomplete dictionary set, has become a mature class of methods with good efficiency in various areas, but it faces limitations in immediate processing tasks such as real-time video denoising. Unsupervised deep neural network structured sparse coding (DNN-SC) algorithms can improve the efficiency of iterative sparse coding algorithms to achieve this goal. In this paper, we first propose a sparse coding algorithm that adds weighting to the iterative shrinkage thresholding algorithm (ISTA), named WISTA, which benefits from the l(p) norm (0 < p < 1) sparsity constraint. Then, we propose two novel DNN-SC algorithms by combining deep learning with WISTA and with the iterative half thresholding algorithm (IHTA), which is the l(0.5) norm sparse coding algorithm. Furthermore, we show that, by changing the loss function, the DNN can be trained in either a supervised or an unsupervised manner. Unsupervised learning is the key to allowing the DNN to be trained online during processing, which enables the use of the DNN-SC algorithms in applications that lack labels for signals. Synthetic data experiments show that WISTA can outperform ISTA and IHTA. Moreover, the DNN-structured WISTA successfully reaches the converged results of WISTA. In real-world data experiments, the procedure for using DNN-SC algorithms in image denoising is presented first. All DNN-SC algorithms achieve at least a 45-fold speedup while maintaining PSNR results compared with their corresponding sparse coding algorithms. Finally, the strategy for using DNN-SC algorithms in real-time video denoising is presented. The video-denoising experiments show that the DNN-structured ISTA and WISTA can perform real-time denoising of gray-scale videos at 25 frames/s and 360 x 480 pixels.
With the opening of the electricity market, interaction between grids and users is becoming more and more frequent. Household electricity demand estimation is a significant and indispensable part of the precise demand response that will be needed in the future. Large-scale coverage of the Advanced Metering Infrastructure provides a large volume of user electricity data and brings opportunities for residential electricity consumption forecasting, but it also puts tremendous pressure on the communication link and the data computing center. This paper proposes an efficient edge sparse coding method based on the K-singular value decomposition (K-SVD) algorithm to extract hidden usage behavior patterns (UBPs) from load datasets and to reduce the cost of communication, storage, and computation. The load of representative household appliances is introduced as the initial dictionary of the K-SVD algorithm so that the UBPs more closely reflect residents' daily electricity consumption. Then, a linear support vector machine (SVM)-based method with UBPs is used to predict household electricity demand in the subsequent interval. The experimental results show that the proposed algorithm can effectively follow the trend of the real load curve and accurately forecast peak electricity demand. (C) 2018 Institute of Electrical Engineers of Japan. Published by John Wiley & Sons, Inc.
The functional system of the human brain can be viewed as a complex network. Among the various features of the brain functional network, community structure has attracted significant interest in recent years. Increasing evidence has revealed that most realistic complex networks have an overlapping community structure. However, the overlapping community structure of the brain functional network has not been adequately studied. In this paper, we propose a novel method called sparse symmetric non-negative matrix factorization (ssNMF) to detect the overlapping community structure of the brain functional network. Specifically, it is formulated by combining the techniques of non-negative matrix factorization and sparse coding. In addition, non-negative adaptive sparse representation is applied to construct the whole-brain functional network, on which ssNMF is then performed to detect the community structure. Both simulated and real functional magnetic resonance imaging data are used to evaluate ssNMF. The experimental results demonstrate that the proposed ssNMF method is capable of accurately and stably detecting the underlying overlapping community structure. Moreover, the physiological interpretation of the overlapping community structure detected by ssNMF is straightforward. We believe this framework provides an effective tool for studying overlapping community structure and facilitates the understanding of the network organization of the functional human brain.
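To make the factorization idea concrete, here is a small sketch of plain symmetric non-negative matrix factorization, which approximates a symmetric non-negative network matrix A by H Hᵀ with H ≥ 0 so that each row of H can be read as soft community memberships. It uses a commonly used damped multiplicative update and omits the sparsity term, because the abstract does not state the ssNMF objective. The function name, the damping value `beta`, and the toy network are assumptions for illustration only.

```python
import numpy as np


def symmetric_nmf(A, k, n_iter=500, beta=0.5, eps=1e-12, seed=0):
    """Approximate a symmetric non-negative A by H @ H.T with H >= 0."""
    rng = np.random.default_rng(seed)
    H = rng.random((A.shape[0], k))  # random non-negative start
    for _ in range(n_iter):
        numer = A @ H
        denom = H @ (H.T @ H) + eps  # eps guards against division by zero
        # Damped multiplicative update; the factor is positive, so H stays >= 0.
        H = H * (1.0 - beta + beta * numer / denom)
    return H


# Toy network: two groups of three fully connected nodes.
A = np.zeros((6, 6))
A[:3, :3] = 1.0
A[3:, 3:] = 1.0
H = symmetric_nmf(A, k=2)
residual = np.linalg.norm(A - H @ H.T)
print(residual)
print(np.round(H, 2))
```

Because memberships are continuous rather than hard labels, a node can carry non-negligible weight in more than one column of H, which is how a factorization of this kind can express overlapping communities.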
ISBN: (print) 9781450357739
In-loop filtering is an important task in video coding, as it refines both the reconstructed signal for display and the pictures used for inter-prediction. Machine learning based methods are assumed to be beneficial for removing coding artifacts, as they exploit prior knowledge of the characteristics of raw images. In this contribution, a dictionary learning / sparse coding based in-loop filter and a frequency adaptation model based on the l(p)-ball energy in the spectral domain are proposed. The dictionary is trained on raw data, and the algorithms are controlled mainly by the sparsity parameter. The frequency adaptation model further improves the sparse coding based loop filter. Experimental results show that the proposed method yields coding gains of up to -4.6 % at peak and -1.74 % on average against HEVC in a Random Access coding configuration.
Predicting human gaze is important for efficiently processing and understanding the large volume of incoming visual information in first-person videos (FPVs). Even though people continuously gaze in noisy environments, most existing gaze prediction algorithms are based on saliency mapping, which is sensitive to noisy surroundings in the real world. Sparsity-based saliency detection algorithms perform favorably against state-of-the-art methods. In this paper, we apply a novel saliency detection method based on sparse coding with the l(1/2)-norm to predict human gaze in FPVs. Image boundaries are first extracted via superpixels as bases for a dictionary, from which a sparse representation model is constructed. For each superpixel, we first compute sparse reconstruction errors. Then, a saliency map is updated based on the reconstruction errors. The most widely used sparse constraint for obtaining the sparse reconstruction errors is the l(1)-norm. However, the l(1)-norm over-penalizes large components of a sparse vector. We employ the l(1/2)-norm for sparse coding, which can lead to a sparser solution and more accurate gaze prediction than the l(1)-norm. We transform the complex nonconvex optimization of sparse coding with the l(1/2)-norm into a number of one-dimensional minimization problems. In this way, we obtain closed-form solutions efficiently. The experimental results on a real-world gaze dataset demonstrate that the proposed algorithm performs better than state-of-the-art methods of gaze prediction for FPVs.
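The contrast between the two penalties can be seen on a single scalar problem of the kind that arises when a sparse coding objective is split into one-dimensional subproblems. The sketch below minimizes 0.5 (x - z)^2 + lam |x|^(1/2) by brute-force grid search and compares it with the closed-form soft threshold for the l(1) penalty. The grid search is a stand-in chosen for transparency; it is not the closed-form solution the abstract refers to, and the function names and parameter values are illustrative assumptions.

```python
def prox_half_grid(z, lam, n_grid=10001):
    """Brute-force minimizer of 0.5 * (x - z)**2 + lam * abs(x)**0.5.

    The minimizer shares the sign of z and its magnitude lies in [0, |z|],
    so a grid search over that interval is enough for an illustration.
    """
    best_x, best_val = 0.0, 0.5 * z * z  # objective value at x = 0
    for i in range(1, n_grid):
        m = abs(z) * i / (n_grid - 1)  # candidate magnitude
        val = 0.5 * (m - abs(z)) ** 2 + lam * m ** 0.5
        if val < best_val:
            best_x, best_val = m, val
    return best_x if z >= 0 else -best_x


def soft_threshold(z, lam):
    """Closed-form minimizer of 0.5 * (x - z)**2 + lam * abs(x)."""
    m = max(abs(z) - lam, 0.0)
    return m if z >= 0 else -m


for z in (0.1, 5.0):
    print(z, soft_threshold(z, 1.0), prox_half_grid(z, 1.0))
```

With lam = 1, both penalties map the small input 0.1 to zero, but for the large input 5.0 the l(1) solution is 4.0 while the l(1/2) solution stays at about 4.77, which illustrates the weaker shrinkage of large components.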
In multi-instance learning problems, samples are represented by multisets called bags. Each bag contains a set of feature vectors called instances. This distinguishes multi-instance learning problems from classical supervised learning problems. In this paper, to convert a multi-instance learning problem into a supervised learning problem, fixed-size feature vectors of bags are computed using a dissimilarity based method. Then, dictionary learning based bagging and random subspace ensemble classification models are proposed to exploit the underlying discriminative structure of the dissimilarity based features. Experimental results are obtained on 11 datasets from different multi-instance learning problem domains. The proposed random subspace based dictionary ensemble algorithm gives the best results on 8 datasets in terms of classification accuracy and area under the curve. (C) 2019 Elsevier Ltd. All rights reserved.