In recent years, designing the coding and pooling structures in layered networks has been shown to be a useful method for learning high-level feature representations for visual data. Yet, such learning structures have...
详细信息
ISBN:
(纸本)9781622767595
In recent years, designing the coding and pooling structures in layered networks has been shown to be a useful method for learning high-level feature representations for visual data. Yet, such learning structures have not been extensively studied for audio signals. In this paper, we investigate different pooling strategies based on the sparse coding scheme and propose a temporal pyramid pooling method to extract discriminative and shift-invariant feature representations. We demonstrate the superiority of our new feature representation over traditional features on the acoustic event classification task.
Cloud computing is an emerging technology developed for providing various computing and storage services over the Internet. In this paper, we proposed a privacy-preserving cloud-aware scenario for compressive multimed...
详细信息
ISBN:
(纸本)9781467325882;9781467325875
Cloud computing is an emerging technology developed for providing various computing and storage services over the Internet. In this paper, we proposed a privacy-preserving cloud-aware scenario for compressive multimedia applications, including multimedia compression, adaptation, editing/manipulation, enhancement, retrieval, and recognition. In the proposed framework, we investigate the applicability of our/existing compressive sensing (CS)-based multimedia compression and securely compressive multimedia "trans-sensing" techniques based on sparse coding for securely delivering compressively sensed multimedia data over a cloud-aware scenario. Moreover, we also investigate the applicability of our/existing sparse coding-based frameworks for several multimedia applications by leveraging the strong capability of a media cloud. More specifically, to consider several fundamental challenges for multimedia cloud computing, such as security and network/device heterogeneities, we investigate the applications of CS and sparse coding techniques in multimedia delivery and applications. As a result, we can build a unified cloud-aware framework for privacy-preserving multimedia applications via sparse coding.
The enhancement analysis in video forensics is used to enhance the. clarity of video frames of a video exhibit. The enhanced version of these video frames is important as to assist law enforcement agency for investiga...
详细信息
ISBN:
(纸本)9781467351171;9781467351188
The enhancement analysis in video forensics is used to enhance the. clarity of video frames of a video exhibit. The enhanced version of these video frames is important as to assist law enforcement agency for investigation or to be tended as evidence in court. The most significant problem observed in the analysis is the enhancement of objects under probe in video. In many cases, the probes appeared to be in low-resolution and degraded with noise, lens blur and compression artifacts. The enhancement of these, low quality probes via conventional method of denoising and resizing has proven to further degrade the quality of the probe, The. objective. of this paper is to propose an enhancement analysis algorithm based on super-resolution. Hence, we present an solution which is a single-frame solution for super-resolution. For that purpose, our proposed method incorporates sparse coding with Non-Negative Matrix Factorization in order to improve hallucination of probes in video. sparse coding is employed in learning a localized pari-based subspace which synthesizes higher resolution with respect to overcomplete patch dictionaries. We test our proposed method and compare with state-of-the-art methods namely resampling and super-resolution method, by enhancing probes in exhibit videos. We measure the image quality using peak-signal-to-noise-ratio. The result shows that our proposed method outperforms state-of the-art methods after enhancing, probes in exhibit videos.
Linear subspace learning has achieved great success in feature extraction, and it aims to map high dimensional data into low dimensional feature space which can reflect the important inherent structure of original dat...
详细信息
ISBN:
(纸本)9781467329644;9781467329637
Linear subspace learning has achieved great success in feature extraction, and it aims to map high dimensional data into low dimensional feature space which can reflect the important inherent structure of original data. In this paper, a novel approach termed Discrimination Preserving Projection (DPP) based on sparse coding is proposed, which mainly focus on combining locality supervised linear subspace learning with sparse coding. In our approach, we decompose images into two parts including more discrimination part and less discrimination part via dictionary learning and sparse coding firstly. Then, a locality supervised criterion which preserves the more discrimination part components while weaken the less discrimination part components is presented. Extensive experiments on publicly available databases are conducted to verify the effectiveness of the proposed algorithm and corroborate the above claims.
In this work we present a sparse dictionary learning method, specifically tuned to solve the tracking problem. Recently, sparse representation has drawn much attention because of its genuineness and strong mathematica...
详细信息
ISBN:
(纸本)9781467356046
In this work we present a sparse dictionary learning method, specifically tuned to solve the tracking problem. Recently, sparse representation has drawn much attention because of its genuineness and strong mathematical background. In this paper we present an online method for dictionary learning which is desirable for problems such as tracking. Online learning methods are preferable because the whole data are not available at the current time. The presented method tries to use the advantages of the generative and discriminative models to achieve better performance. The experimental results show our method can overcome many tracking challenges.
This paper propose a non-local sparse model for SAR image despeckling. sparse coding models and non-local means have been both proven very effective in natural image restoration tasks. While self-similarities exist wi...
详细信息
ISBN:
(纸本)9781467312745;9781467312721
This paper propose a non-local sparse model for SAR image despeckling. sparse coding models and non-local means have been both proven very effective in natural image restoration tasks. While self-similarities exist widely in SAR images, which encourages combining these two approaches together for SAR image despeckling tasks. A grouped-sparsity regularizer is imposed to enforce similar image patches to admit similar estimates. Image adaptive dictionary is learned by block-coordinate descent algorithm. Considering the importance of point targets, a new term is integrated into sparse coding models for preserving of point targets. Experimental results show the effectiveness of the proposed algorithm in SAR image despeckling task.
Reading text in scene images is a challenging task and is still an active research nowadays. The difficulties come from low resolution, complex background, non uniform lightning or blurring effects of scene images. Th...
详细信息
ISBN:
(纸本)9781467356046
Reading text in scene images is a challenging task and is still an active research nowadays. The difficulties come from low resolution, complex background, non uniform lightning or blurring effects of scene images. This paper focuses on recognizing characters in scene images based on the feature learning method proposed in [6] and the conclusion on comparison between sparse coding and vector quantization in [8] to build better feature representations before training the model by using SVM We asset the performance of the proposed method on some popular scene image datasets such as ICDAR 2003(3) and Chars 74k(4) Experimental results show that our proposed system has reached an encouraging recognition rate for both ICDAR 2003 and Chars74k datasets. More specially, our system archived 83.8% (62-class problem), 87% (36-class problem) of recognition rate on ICDAR 2003 Sample subset (698 images), and 73.8% accuracy on GoodImg subset (7705 images) of Chars74K dataset. in this work, our contribution is that we applied the ideas as well as the conclusions in [8] for scene text recognition problem and the experimental results show that our system outperforms other existing methods.
Super resolution reconstruction produces a higher resolution image based on a set of low resolution images, taken from the same scene. Recently, many papers have been published, proposing a variety algorithms of video...
详细信息
ISBN:
(纸本)9781467350518
Super resolution reconstruction produces a higher resolution image based on a set of low resolution images, taken from the same scene. Recently, many papers have been published, proposing a variety algorithms of video super resolution. This paper presents a new approach to video super resolution, based on sparse coding and belief propagation. First, find the candidate pixels on multiple frames using sparse coding and belief propagation. Second, exploit the similarities of candidate pixels using the Non-local Means method to average out the noise among similar patches. The experimental results show the effectiveness of our method and demonstrate its robustness to other super resolution methods.
Real-time super-resolution within surveillance video streams is a powerful tool for security and crime prevention allowing for example, events, faces or objects such number-plates and luggage to be more accurately ide...
详细信息
ISBN:
(纸本)9780819492876
Real-time super-resolution within surveillance video streams is a powerful tool for security and crime prevention allowing for example, events, faces or objects such number-plates and luggage to be more accurately identified on the fly and from a distance. However, many of the state of the art approaches to super-resolution are computationally too expensive to be suitable for real-time applications within a surveillance context. We consider one particular contemporary method based on sparse coding, 1 and show how, by relaxing some model constraints, it can be sped up significantly compared to the reference implementation, and thus approach real-time performance with visually indistinct reduction in fidelity. The final computation is three orders of magnitude faster than the reference implementation. The quality of the output is maintained: PSNR of the super-resolved images compared to ground truth is not significantly different to the reference implementation, while maintaining a noticeable improvement over baseline bicubic-interpolation approach.
We develop an intermediate representation for deformable part models and show that this representation has favorable performance characteristics for multi-class problems when the number of classes is high. Our model u...
详细信息
ISBN:
(纸本)9783642337086
We develop an intermediate representation for deformable part models and show that this representation has favorable performance characteristics for multi-class problems when the number of classes is high. Our model uses sparse coding of part filters to represent each filter as a sparse linear combination of shared dictionary elements. This leads to a universal set of parts that are shared among all object classes. Reconstruction of the original part filter responses via sparse matrix-vector product reduces computation relative to conventional part filter convolutions. Our model is well suited to a parallel implementation, and we report a new GPU DPM implementation that takes advantage of sparse coding of part filters. The speed-up offered by our intermediate representation and parallel computation enable real-time DPM detection of 20 different object classes on a laptop computer.
暂无评论