Object recognition and localization remain very challenging problems despite recent advances in deep learning (DL) approaches, especially for objects with varying shapes and appearances. Statistical models, such as an Active Shape Model (ASM), rely on a parametric model of the object, which allows prior knowledge about shape and appearance to be incorporated easily and in a principled way. To take advantage of these benefits, this paper proposes a new ASM framework that addresses two tasks: (i) comparing the performance of several image features used to extract observations from an input image; and (ii) improving the performance of the model fitting by relying on a probabilistic framework that allows the use of multiple observations and is robust to the presence of outliers. The goal in (i) is to maximize the quality of the observations by exploring a wide set of handcrafted features (HOG, SIFT, and texture templates) and more recent DL-based features. Regarding (ii), we use the generalized expectation-maximization algorithm to deal with outliers and to extend the fitting process to multiple observations. The proposed framework is evaluated in the context of facial landmark fitting and the segmentation of the endocardium of the left ventricle in cardiac magnetic resonance volumes. We experimentally observe that the proposed approach is robust not only to outliers, but also to adverse initialization conditions and to large search regions (the image regions from which the observations are extracted). Furthermore, the results of the proposed combination of the ASM with DL-based features are competitive with more recent DL approaches (e.g., FCN [1], U-Net [2], and CNN Cascade [3]), showing that it is possible to combine the benefits of statistical models and DL into a new deep ASM probabilistic framework.
State-space models (SSMs) have been widely used for analyzing time-series data in the fields of economics and bioinformatics to express the dynamic behavior of data. Recently, filtering and smoothing algorithms for linear discrete SSMs with skewed and heavy-tailed measurement noise have been proposed to obtain a more appropriate model, because measurement noise often does not follow a Gaussian distribution. In this paper, we propose a linear SSM with skew-t measurement noise for predicting blood test values, along with a method that uses a generalized expectation-maximization (EM) algorithm to estimate its parameter values so that they are consistent with the data. To validate the effectiveness of the proposed model and method, we analyze time-series blood test data using both Gaussian and skew-t measurement noise and compare their prediction accuracy for future values. We then predict the future blood test values of an unhealthy participant under his current and improved lifestyles. By comparing the predicted results under these different lifestyles, we demonstrate that he is predicted to overcome lifestyle-related diseases with the improved lifestyle. (c) 2021 Elsevier Inc. All rights reserved.
ISBN: 9781479974504 (print)
Audio inpainting and audio declipping are important problems in audio signal processing that are encountered in various practical applications. A number of approaches have been proposed in the literature to address these problems, the most successful of which are based on the sparsity of audio signals in certain dictionary representations. Non-negative matrix factorization (NMF) is another powerful tool that has been successfully used in applications such as audio source separation. In this paper we propose a new algorithm that makes use of a low-rank NMF model to perform audio inpainting and declipping. In addition to utilizing the NMF model for the first time to perform audio inpainting in the presence of arbitrary losses in the time domain, the proposed approach also introduces a novel way to enforce additional constraints on the signal magnitude in order to improve performance in declipping applications. The proposed approach is shown to perform comparably to state-of-the-art dictionary-based methods while providing a number of advantages.
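As a rough illustration of what a low-rank NMF model is, the sketch below factorizes a nonnegative matrix V (for example, a magnitude or power spectrogram) into two nonnegative factors W and H using the standard multiplicative updates for the Euclidean cost. This is an assumption made for illustration only: the abstract does not state which cost function or update rule the paper uses, the function name `nmf` and its parameters are hypothetical, and the sketch does not handle missing or clipped time-domain samples.

```python
import numpy as np

def nmf(V, rank, n_iter=200, eps=1e-9, seed=0):
    """Approximate a nonnegative matrix V (freq x frames) as W @ H.

    Illustrative only: multiplicative updates for the Euclidean cost,
    not necessarily the cost or updates used in the paper.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    # Random nonnegative initialization of spectral patterns and activations.
    W = rng.random((n_freq, rank)) + eps
    H = rng.random((rank, n_frames)) + eps
    for _ in range(n_iter):
        # Each update multiplies by a nonnegative ratio, so W and H stay nonnegative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H
```

Calling `W, H = nmf(V, rank=2)` on a nonnegative array `V` returns factors whose product `W @ H` has rank at most 2; the small rank is what makes the model "low rank".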
ISBN: 9781479999880 (print)
Nonnegative matrix or tensor factorization is a very popular approach for audio source separation. One important problem with nonnegative tensor factorization (NTF) in the context of user-guided audio source separation is the need to manually assign the NTF components to audio sources in order to enforce prior information on the sources during the estimation process. In this paper, two new approaches to NTF-based source separation are proposed that do not require any manual assignment of components to sources, but instead estimate the underlying assignment automatically. Both algorithms use the prior information on the source samples in the estimation process, along with either a limit on the minimum number of components each source uses or a restriction that each component is used by only a sparse set of sources. The proposed methods are shown to outperform the classic approach, in which the components are manually distributed equally among the sources.
ISBN: 9781479999880 (print)
Audio declipping consists of recovering so-called clipped audio samples, that is, samples that have been set to a maximum or minimum threshold. Many different approaches have been proposed to solve this problem for single-channel (mono) recordings. However, while most audio recordings are multichannel nowadays, there is no method designed specifically for multichannel audio declipping, where the inter-channel correlations may be efficiently exploited for a better declipping result. In this work we propose such a multichannel audio declipping method for the first time. Our method is based on representing a multichannel audio recording as a convolutive mixture of several audio sources, and on modeling the source power spectrograms and mixing filters by a nonnegative tensor factorization model and full-rank covariance matrices, respectively. A generalized expectation-maximization algorithm is proposed to estimate the model parameters. It is shown experimentally that the proposed multichannel audio declipping algorithm outperforms, on average and in most cases, a state-of-the-art single-channel declipping algorithm applied to each channel independently.
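The clipping model described in this abstract can be sketched in a few lines. The code below assumes symmetric hard clipping at a known threshold, and adds a commonly used consistency check (a recovered sample at a clipped position should lie at or beyond the threshold, with the same sign). The function names are hypothetical and the check is an assumption for illustration; the abstract does not say how the paper's algorithm uses such constraints.

```python
import numpy as np

def hard_clip(signal, threshold):
    """Set samples beyond +/- threshold to the threshold; also return a clipped-sample mask."""
    clipped = np.clip(signal, -threshold, threshold)
    mask = np.abs(signal) >= threshold  # True where the sample was clipped
    return clipped, mask

def is_consistent(estimate, clipped, mask, threshold):
    """Check that a declipped estimate agrees with the clipped observation.

    Unclipped samples must be unchanged; clipped samples must have magnitude
    at least `threshold` and keep the sign of the observed sample.
    """
    reliable_ok = np.allclose(estimate[~mask], clipped[~mask])
    magnitude_ok = np.all(np.abs(estimate[mask]) >= threshold)
    sign_ok = np.all(np.sign(estimate[mask]) == np.sign(clipped[mask]))
    return bool(reliable_ok and magnitude_ok and sign_ok)
```

For example, clipping `np.array([0.1, 0.9, -1.2, 0.5, 1.5])` at a threshold of 1.0 leaves the first, second, and fourth samples untouched and marks the third and fifth as clipped; a declipping method then has to estimate only those two samples.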
ISBN: 9781479974504 (print)
The paradigm of using a very simple encoder and a sophisticated decoder for signal compression became popular with the theory of distributed coding, and it has been applied to the compression of various types of signals such as images and video. The theory of compressive sampling later introduced a similar concept, but with a focus on guarantees of signal recovery using sparse and low-rank priors in a domain that is incoherent with the sampling domain. In this paper, we bring together the concepts introduced in distributed coding and compressive sampling with informed source separation, in which the goal is to efficiently compress the audio sources so that they can be decoded with knowledge of the mixture of the sources. The proposed framework uses a very simple time-domain sampling scheme to encode the sources, and a sophisticated decoding algorithm that makes use of a low-rank non-negative tensor factorization model of the distribution of short-time Fourier transform coefficients to recover the sources, which is a direct application of the principles of both compressive sampling and distributed coding.