Heart sound signals captured with a digital stethoscope are often distorted by environmental and physiological noise, which alters their salient and critical properties. The problem is exacerbated in crowded, low-resource hospital settings, where high noise levels degrade diagnostic performance. In this study, we present a novel deep encoder-decoder-based denoising architecture (LU-Net) to suppress ambient noise and internal lung sound noise. Training is done using a large benchmark PCG dataset mixed with physiological noise, i.e., breathing sounds. Two different noisy datasets were prepared for experimental evaluation by mixing unseen lung sounds and hospital ambient noises with the clean heart sound recordings. We also used the inherently noisy portion of the PASCAL heart sound dataset for evaluation. The proposed framework effectively suppressed background noise in both unseen real-world data and synthetically generated noisy heart sound recordings, improving the signal-to-noise ratio (SNR) by 5.575 dB on average using only 1.32 M parameters. The proposed model outperforms the current state-of-the-art U-Net model with an average SNR improvement of 5.613 dB and 5.537 dB in the presence of lung sound and unseen hospital noise, respectively. LU-Net also outperformed the state-of-the-art Fully Convolutional Network (FCN) by 1.750 dB and 1.748 dB for lung sound and unseen hospital noise conditions, respectively. In addition, the proposed denoising model improves classification accuracy by 38.93% on the noisy portion of the PASCAL heart sound dataset. The results indicate that the proposed architecture delivers robust denoising performance across datasets with diverse noise levels and characteristics. The proposed deep learning-based PCG denoising approach is a pioneering study that can significantly improve the accuracy of computer-aided auscultation systems for detecting cardiac diseases in noisy, low-
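The abstract above reports results as SNR improvement on recordings built by mixing noise into clean heart sounds. A minimal sketch of that bookkeeping follows, assuming a plain power-ratio definition of SNR; the abstract does not state the exact formula or mixing procedure. The function names, the synthetic sine "recording", and the stand-in "denoiser" that simply halves the residual are all hypothetical and are not the LU-Net method.

```python
import math
import random

def power(x):
    """Mean squared amplitude of a signal given as a list of floats."""
    return sum(v * v for v in x) / len(x)

def snr_db(clean, estimate):
    """SNR of `estimate` against `clean` in dB; the residual counts as noise."""
    residual = [e - c for c, e in zip(clean, estimate)]
    return 10.0 * math.log10(power(clean) / power(residual))

def mix_at_snr(clean, noise, target_snr_db):
    """Scale `noise` so that clean + noise has the requested SNR, then mix."""
    gain = math.sqrt(power(clean) / (power(noise) * 10.0 ** (target_snr_db / 10.0)))
    return [c + gain * n for c, n in zip(clean, noise)]

random.seed(0)
# Hypothetical stand-ins: a sine wave for the clean signal, Gaussian noise for the interference.
clean = [math.sin(2 * math.pi * 5 * t / 1000) for t in range(1000)]
noise = [random.gauss(0.0, 1.0) for _ in range(1000)]

noisy = mix_at_snr(clean, noise, 0.0)
# Hypothetical "denoiser": halves the residual amplitude (not a real model).
denoised = [c + 0.5 * (n - c) for c, n in zip(clean, noisy)]

snr_in = snr_db(clean, noisy)
snr_out = snr_db(clean, denoised)
improvement = snr_out - snr_in  # SNR improvement in dB
```

Halving the residual amplitude cuts the noise power by a factor of four, so the improvement in this toy setup is 20·log10(2), roughly 6 dB.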
We introduce a novel mathematical formulation for the training of feed-forward neural networks with (potentially non-smooth) proximal maps as activation functions. This formulation is based on Bregman distances; a key advantage is that its partial derivatives with respect to the network's parameters do not require the computation of derivatives of the network's activation functions. Instead of estimating the parameters with a combination of a first-order optimisation method and back-propagation (as is the state of the art), we propose the use of non-smooth first-order optimisation methods that exploit the specific structure of the novel formulation. We present several numerical results demonstrating that these training approaches can be as well suited as, or even better suited than, more conventional training frameworks for training neural network-based classifiers and (denoising) autoencoders with sparse coding.
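To make "proximal maps as activation functions" concrete, the sketch below checks two standard facts numerically: ReLU is the proximal map of the indicator function of the non-negative half-line, and soft-thresholding is the proximal map of a scaled absolute value (the usual sparsity penalty). It illustrates the notion of a proximal map only and is not the Bregman-based training formulation of the abstract. The helper `prox_by_grid` and the test values are hypothetical.

```python
def relu(x):
    """ReLU activation."""
    return max(x, 0.0)

def soft_threshold(x, lam):
    """Soft-thresholding: shrinks x towards zero by lam."""
    if x > lam:
        return x - lam
    if x < -lam:
        return x + lam
    return 0.0

def prox_by_grid(x, penalty, lo=-5.0, hi=5.0, steps=100001):
    """Brute-force argmin over z of 0.5*(z - x)**2 + penalty(z) on a uniform grid."""
    best_z, best_val = lo, float("inf")
    for i in range(steps):
        z = lo + (hi - lo) * i / (steps - 1)
        val = 0.5 * (z - x) ** 2 + penalty(z)
        if val < best_val:
            best_z, best_val = z, val
    return best_z

lam = 0.5
l1_penalty = lambda z: lam * abs(z)                          # scaled absolute value
nonneg_indicator = lambda z: 0.0 if z >= 0.0 else float("inf")  # 0 on z >= 0, +inf otherwise

points = [-2.0, -0.3, 0.4, 1.7]
relu_gaps = [abs(prox_by_grid(x, nonneg_indicator) - relu(x)) for x in points]
soft_gaps = [abs(prox_by_grid(x, l1_penalty) - soft_threshold(x, lam)) for x in points]
```

Both gap lists stay within the grid resolution, which is the sense in which these activations can be evaluated as solutions of small optimisation problems rather than differentiated.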
Deep neural network algorithms have shown promising results for music source signal separation. Most existing methods rely on deep networks, where billions of parameters need to be trained. In this paper, we propose a novel autoencoder framework with a reduced number of parameters to separate the drum signal component from a music signal mixture. A denoising autoencoder with a U-Net architecture and direct skip connections was employed, with a dense block included in the bottleneck of the autoencoder. This technique was tested on both the demixing secret data (DSD) and MUSDB databases. The source-to-distortion ratio (SDR) of the proposed method was on par with that of other state-of-the-art methods, while the number of parameters required was much lower, making it computationally more efficient. Separating the drum signal with the proposed method yielded an average SDR of 5.71 on DSD and 6.45 on MUSDB while using only 0.32 million parameters.
Context-aware recommender systems consider the circumstances under which a user encounters an item in order to provide better-personalized recommendations. Users receive points of interest, movies, products, and various online resources as suggestions. Classical collaborative filtering algorithms perform satisfactorily in a variety of recommendation tasks, but often cannot capture complicated interactions between items and users, and they suffer from sparsity and cold-start constraints. Hence, there has been a surge in applying deep learning-based recommender models, owing to their dynamic modeling potential and sustained success in other fields of application. In this work, a trust-based attentive contextual denoising autoencoder (TACDA) for enhanced Top-N context-aware recommendation is proposed. Specifically, the TACDA model takes the user's sparse preferences, integrated with trust data, as input to the autoencoder to overcome the cold-start and sparsity obstacles, and efficiently incorporates the context conditions into the model via an attention framework. The attention technique encodes context features into a latent space of the user's trust data integrated with their preferences, which links personalized context circumstances with the active user's choices to deliver recommendations suited to that user. Experiments conducted on the Epinions, Ciao, and LibraryThing datasets show that the TACDA model consistently outperforms state-of-the-art methods.
ISBN: (print) 9789819620708; 9789819620715
Wearable electrocardiogram (ECG) measurement using dry electrodes suffers from high-intensity noise distortion, so a robust noise reduction method is required. However, the overlapping frequency bands of ECG and noise make noise reduction difficult. It is therefore necessary to provide a mechanism that changes the characteristics of the noise based on its intensity and type. This study proposes a convolutional neural network (CNN) model with an additional wavelet transform layer that extracts the specific frequency features of a clean ECG. Testing confirms that the proposed method effectively predicts accurate ECG behavior with reduced noise by accounting for all frequency domains. In an experiment, noisy signals with a signal-to-noise ratio (SNR) in the range of -10 to 10 are evaluated, demonstrating that the efficiency of the proposed method is higher when the SNR is low.
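As a rough illustration of what a wavelet transform step computes, the sketch below applies a single-level Haar transform, splitting a signal into a low-frequency approximation and a high-frequency detail band and then reconstructing it exactly. The abstract names neither the wavelet nor the layer design, so the Haar choice, the function names, and the sample values are assumptions for illustration only.

```python
import math

SQRT2 = math.sqrt(2.0)

def haar_dwt(x):
    """Single-level Haar transform of an even-length list: (approximation, detail)."""
    approx = [(x[i] + x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    detail = [(x[i] - x[i + 1]) / SQRT2 for i in range(0, len(x), 2)]
    return approx, detail

def haar_idwt(approx, detail):
    """Inverse of haar_dwt: interleaves the reconstructed sample pairs."""
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / SQRT2)
        x.append((a - d) / SQRT2)
    return x

signal = [4.0, 2.0, 5.0, 5.0]  # hypothetical samples, not ECG data
approx, detail = haar_dwt(signal)
restored = haar_idwt(approx, detail)
```

The detail band is zero wherever neighbouring samples are equal, which is why high-frequency noise tends to show up there while slower waveform components concentrate in the approximation band.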
The use of computer-aided image analysis for disease diagnosis and prognosis has dramatically increased during the past 10 years. The introduction of computer-assisted image analysis of images produced by equipme...
ISBN: (print) 9798350359329; 9798350359312
Human behavior anomaly detection in video aims to identify unusual behaviors, which is crucial for public safety. Recently, there has been an increase in reconstruction- or prediction-based methods that integrate diverse modal features to enhance anomaly detection. However, these methods use multimodal features independently or fuse them directly, without fully considering the collaborative potential between modalities, and are therefore susceptible to interference from semantic differences, which impacts detection performance. In contrast, we design a collaborative framework using multimodal data and adaptive noise for behavior anomaly detection. Our framework detects anomalies by analyzing the contrastive differences between two modalities alongside single-frame reconstruction errors. Specifically, we first learn the correlation between RGB and skeletal modalities for normal behavior through contrastive learning and use the inter-modal contrast difference to detect motion anomalies. Additionally, we propose a single-frame reconstruction network that adaptively adds noise based on the importance of foreground features to detect appearance anomalies. Anomalies often occur in the motion foreground, and increasing noise in this area makes anomalies more difficult to reconstruct. Extensive experiments validate the state-of-the-art performance of our method on three public datasets.
Because missing values (MVs) are ubiquitous in real-world datasets, MV imputation, which aims to recover MVs, is an important and fundamental data preprocessing step that allows various data analytics and mining tasks to achieve good performance. To impute MVs, a typical idea is to explore the correlations amongst the attributes of the data. However, those correlations are usually complex and thus difficult to identify. Accordingly, we develop a new deep learning model called MIssing Data Imputation denoising autoencoder (MIDIA) that effectively imputes the MVs in a given dataset by exploring non-linear correlations between missing values and non-missing values. Additionally, by considering various data missing patterns, we propose two effective MV imputation approaches based on the proposed MIDIA model, namely MIDIA-Sequential and MIDIA-Batch. MIDIA-Sequential imputes the MVs attribute by attribute, training an independent MIDIA model for each incomplete attribute. By contrast, MIDIA-Batch imputes the MVs in one batch by training a uniform MIDIA model. Finally, we evaluate the proposed approaches experimentally against existing MV imputation algorithms. The experimental results demonstrate that both MIDIA-Sequential and MIDIA-Batch achieve significantly higher imputation accuracy than existing solutions, and that the proposed approaches are capable of handling various data missing patterns and data types. Specifically, MIDIA-Sequential performs better than MIDIA-Batch for data with a monotone missing pattern, while MIDIA-Batch performs better than MIDIA-Sequential for data with a general missing pattern.
The goal of this article is to investigate what singing voice separation approaches based on neural networks learn from the data. We examine the mapping functions of neural networks based on the denoising autoencoder (DAE) model that are conditioned on the mixture magnitude spectra. To approximate the mapping functions, we propose an algorithm inspired by knowledge distillation, denoted the neural couplings algorithm (NCA). The NCA yields a matrix that expresses the mapping of the mixture to the target source magnitude information. Using the NCA, we examine the mapping functions of three fundamental DAE-based models in music source separation: one with a single-layer encoder and decoder, one with a multi-layer encoder and a single-layer decoder, and one using skip-filtering connections (SF) with a single-layer encoder and decoder. We first train these models with realistic data to estimate the singing voice magnitude spectra from the corresponding mixture. We then use the optimized models and test spectral data as input to the NCA. Our experimental findings show that approaches based on the DAE model learn scalar filtering operators, exhibiting a predominantly diagonal structure in their corresponding mapping functions, which limits the exploitation of the inter-frequency structure of music data. In contrast, skip-filtering connections are shown to assist the DAE model in learning filtering operators that exploit richer inter-frequency structures.
Spoofed speech detection has recently been gaining researchers' attention, as speaker verification has been shown to be vulnerable to spoofing attacks such as voice conversion, speech synthesis, replay, and impersonation. Although various methods have been proposed to detect spoofed speech, their performance decreases dramatically under mismatched conditions caused by additive or reverberant noise. Conventional speech enhancement methods fail to close the performance gap, so more advanced techniques seem necessary to solve the noisy spoofed speech detection problem. In this work, a denoising autoencoder (DAE) is used to obtain clean estimates of i-vectors from their noisy versions. The ASVspoof 2015 database is used in the experiments with five different noise types, added to the original utterances at signal-to-noise ratios (SNR) of 0, 10, and 20 dB. The experimental results verify that the DAE provides more robust spoof detection where the conventional methods fail.