We introduce a novel mathematical formulation for the training of feed-forward neural networks with (potentially non-smooth) proximal maps as activation functions. The formulation is based on Bregman distances; a key advantage is that its partial derivatives with respect to the network's parameters do not require the computation of derivatives of the network's activation functions. Instead of estimating the parameters with a combination of a first-order optimisation method and back-propagation (the current state of the art), we propose the use of non-smooth first-order optimisation methods that exploit the specific structure of the novel formulation. We present several numerical results demonstrating that these training approaches can be equally well or even better suited to training neural network-based classifiers and (denoising) autoencoders with sparse coding than more conventional training frameworks.
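As background for the phrase "proximal maps as activation functions", two common activations can be written as proximal maps: ReLU is the projection onto the non-negative orthant, and soft-thresholding is the proximal map of the scaled l1 norm. The sketch below only illustrates these two maps; the function names are illustrative and it does not reproduce the Bregman-based training formulation of the abstract.

```python
def relu(x):
    """Proximal map of the indicator function of the non-negative orthant.

    Equivalent to projecting each entry onto [0, inf).
    """
    return [max(v, 0.0) for v in x]


def soft_threshold(x, lam):
    """Proximal map of lam * ||.||_1 (soft-thresholding).

    Entries with |v| <= lam are set to zero; the others are shrunk
    towards zero by lam. The map is non-smooth at |v| == lam.
    """
    out = []
    for v in x:
        if abs(v) > lam:
            sign = 1.0 if v > 0 else -1.0
            out.append(sign * (abs(v) - lam))
        else:
            out.append(0.0)
    return out
```

Both maps are continuous but not differentiable everywhere, which is the situation the abstract refers to with "potentially non-smooth".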
Deep neural network algorithms have shown promising results for music source signal separation. Most existing methods rely on deep networks, where billions of parameters need to be trained. In this paper, we propose a novel autoencoder framework with a reduced number of parameters to separate the drum signal component from a music signal mixture. A denoising autoencoder with a U-Net architecture and direct skip connections is employed, with a dense block in the bottleneck of the autoencoder stage. The technique was tested on both the Demixing Secrets Dataset (DSD) and the MUSDB database. The source-to-distortion ratio (SDR) of the proposed method was on par with that of other state-of-the-art methods, while the number of parameters required was much lower, making it computationally more efficient. Separating the drum signal with the proposed method yielded an average SDR of 5.71 on DSD and 6.45 on MUSDB while using only 0.32 million parameters.
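For readers unfamiliar with the metric, the sketch below computes a simplified source-to-distortion ratio as a log energy ratio between the reference signal and the estimation error. This is an assumption-laden illustration: source-separation benchmarks typically use a more elaborate BSS Eval style SDR, so values from this function are not comparable to the figures quoted in the abstract. The function name `simple_sdr_db` is hypothetical.

```python
import math


def simple_sdr_db(reference, estimate):
    """Simplified SDR in dB: 10 * log10(||s||^2 / ||s - s_hat||^2).

    `reference` and `estimate` are equal-length sequences of samples.
    This is NOT the full BSS Eval definition used by benchmark toolkits.
    """
    if len(reference) != len(estimate):
        raise ValueError("signals must have the same length")
    signal_energy = sum(s * s for s in reference)
    error_energy = sum((s - e) ** 2 for s, e in zip(reference, estimate))
    if error_energy == 0.0:
        return math.inf  # perfect reconstruction
    return 10.0 * math.log10(signal_energy / error_energy)
```

A higher value means the estimate is closer to the reference; an estimate that is off by 10% in amplitude gives roughly 20 dB under this definition.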
Context-aware recommender systems aim to account for the circumstances under which a user encounters an item in order to provide better-personalized recommendations. Users receive points of interest, movies, products, and various online resources as suggestions. Classical collaborative filtering algorithms perform satisfactorily in a variety of recommendation tasks, but often cannot capture complicated interactions between items and users, and they suffer from sparsity and cold-start constraints. Hence there has been a surge of interest in deep learning-based recommender models, owing to their dynamic modeling potential and sustained success in other fields of application. In this work, a trust-based attentive contextual denoising autoencoder (TACDA) for enhanced Top-N context-aware recommendation is proposed. Specifically, the TACDA model takes the user's sparse preferences, integrated with trust data, as input to the autoencoder to overcome the cold-start and sparsity obstacles, and efficiently incorporates the context condition into the model via an attention framework. The attention technique is used to encode context features into a latent space of the user's trust data integrated with their preferences, which links personalized context circumstances with the active user's choices to deliver recommendations suited to that user. Experiments conducted on the Epinions, Ciao, and LibraryThing datasets show that the TACDA model consistently outperforms state-of-the-art methods.
ISBN (print): 9789819620708; 9789819620715
Wearable electrocardiogram (ECG) measurement using dry electrodes suffers from high-intensity noise distortion, so a robust noise reduction method is required. However, the overlapping frequency bands of ECG and noise make noise reduction difficult. Hence, it is necessary to provide a mechanism that changes the characteristics of the noise based on its intensity and type. This study proposes a convolutional neural network (CNN) model with an additional wavelet transform layer that extracts the specific frequency features of a clean ECG. Testing confirms that the proposed method effectively predicts accurate ECG behavior with reduced noise by accounting for all frequency domains. In an experiment, noisy signals with signal-to-noise ratios (SNRs) ranging from -10 to 10 are evaluated, demonstrating that the efficiency of the proposed method is higher when the SNR is small.
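Evaluations over a range of SNRs are usually built by scaling a noise recording so that the mixture reaches a target ratio. The sketch below shows one common way to do this, assuming the SNR is expressed in decibels (the abstract does not state the unit); the helper names are hypothetical and the signals are plain lists of samples.

```python
import math


def mean_power(x):
    """Average power of a sampled signal."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(clean, noise, target_db):
    """Add `noise` to `clean`, scaled so the mixture has SNR `target_db` (dB)."""
    # Solve 10*log10(P_clean / (scale^2 * P_noise)) = target_db for scale.
    scale = math.sqrt(mean_power(clean) / (mean_power(noise) * 10 ** (target_db / 10)))
    return [c + scale * n for c, n in zip(clean, noise)]


def measured_snr_db(clean, noisy):
    """SNR in dB of `noisy` relative to the known `clean` signal."""
    residual = [y - c for c, y in zip(clean, noisy)]
    return 10 * math.log10(mean_power(clean) / mean_power(residual))
```

At -10 dB the noise carries ten times the power of the signal, which is the regime where the abstract reports the largest benefit.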
The use of computer-aided image analysis for disease diagnosis and prognosis has dramatically increased during the past 10 years. The introduction of computer-assisted image analysis of images produced by equipme...
ISBN (print): 9798350359329; 9798350359312
Human behavior anomaly detection in video aims to identify unusual behaviors that are crucial for public safety. Recently, there has been an increase in reconstruction-based or prediction-based methods that integrate diverse modal features to enhance anomaly detection. However, these methods fuse multimodal features independently or directly, without fully considering the collaborative potential between them, which leaves them susceptible to interference from semantic differences and thereby degrades detection performance. In contrast, we design a collaborative framework using multimodal data and adaptive noise for behavior anomaly detection. Our framework detects anomalies by analyzing the contrastive differences between two modalities alongside single-frame reconstruction errors. Specifically, we first learn the correlation between RGB and skeletal modalities for normal behavior through contrastive learning and use the inter-modal contrast difference to detect motion anomalies. Additionally, we propose a single-frame reconstruction network that adaptively adds noise based on the importance of foreground features to detect appearance anomalies. Anomalies often occur in the motion foreground, and increasing noise in this area makes anomalies more difficult to reconstruct. Extensive experiments validate the state-of-the-art performance of our method on three public datasets.
Due to the ubiquitous presence of missing values (MVs) in real-world datasets, MV imputation, which aims to recover MVs, is an important and fundamental data preprocessing step that allows various data analytics and mining tasks to achieve good performance. To impute MVs, a typical idea is to explore the correlations amongst the attributes of the data. However, those correlations are usually complex and thus difficult to identify. Accordingly, we develop a new deep learning model called MIssing Data Imputation denoising autoencoder (MIDIA) that effectively imputes the MVs in a given dataset by exploring non-linear correlations between missing values and non-missing values. Additionally, by considering various data missing patterns, we propose two effective MV imputation approaches based on the proposed MIDIA model, namely MIDIA-Sequential and MIDIA-Batch. MIDIA-Sequential imputes the MVs attribute by attribute, training an independent MIDIA model for each incomplete attribute. By contrast, MIDIA-Batch imputes the MVs in one batch by training a single unified MIDIA model. Finally, we evaluate the proposed approaches experimentally against existing MV imputation algorithms. The experimental results demonstrate that both MIDIA-Sequential and MIDIA-Batch achieve significantly higher imputation accuracy than existing solutions, and that the proposed approaches are capable of handling various data missing patterns and data types. Specifically, MIDIA-Sequential performs better than MIDIA-Batch for data with a monotone missing pattern, while MIDIA-Batch performs better than MIDIA-Sequential for data with a general missing pattern.
The goal of this article is to investigate what singing voice separation approaches based on neural networks learn from the data. We examine the mapping functions of neural networks based on the denoising autoencoder (DAE) model that are conditioned on the mixture magnitude spectra. To approximate the mapping functions, we propose an algorithm inspired by knowledge distillation, denoted the neural couplings algorithm (NCA). The NCA yields a matrix that expresses the mapping of the mixture to the target source magnitude information. Using the NCA, we examine the mapping functions of three fundamental DAE-based models in music source separation: one with a single-layer encoder and decoder, one with a multi-layer encoder and a single-layer decoder, and one using skip-filtering connections (SF) with a single-layer encoder and decoder. We first train these models with realistic data to estimate the singing voice magnitude spectra from the corresponding mixture. We then use the optimized models and test spectral data as input to the NCA. Our experimental findings show that approaches based on the DAE model learn scalar filtering operators, exhibiting a predominantly diagonal structure in their corresponding mapping functions, which limits the exploitation of the inter-frequency structure of music data. In contrast, skip-filtering connections are shown to assist the DAE model in learning filtering operators that exploit richer inter-frequency structures.
ISBN (print): 9798400716751
The behavioral symptoms of patients with mild to moderate depression (MMD) are usually not obvious, which poses a challenge to MMD recognition research. A three-level feature construction strategy for facial expression is proposed to fully characterize the differences in facial activity between MMD patients and healthy controls (HCs). Level 1: construct geometric features to give a preliminary description of facial activity. Level 2: input the geometric features into a denoising autoencoder (DAE) to generate a new hidden-layer representation. Level 3: use a Gaussian mixture model (GMM) to further characterize the hidden-layer features. In parallel, a speech feature set is constructed from the fundamental frequency (F0), pitch intensity, mel frequency cepstral coefficients (MFCCs), and syllable pauses of speech. Finally, facial expression and speech are fused at the feature layer, and MMD recognition is carried out with four classic classification algorithms. The experimental results show that the MMD recognition accuracy for the male and female groups reaches 70.3% and 68.3%, respectively.
Estimation of Distribution Algorithms (EDAs) are metaheuristics where learning a model and sampling new solutions replaces the variation operators recombination and mutation used in standard Genetic Algorithms. The choice of these models as well as the corresponding training processes are subject to the bias/variance tradeoff, also known as under- and overfitting: simple models cannot capture complex interactions between problem variables, whereas complex models are susceptible to modeling random noise. This paper suggests using denoising autoencoders (DAEs) as generative models within EDAs (DAE-EDA). The resulting DAE-EDA is able to model complex probability distributions. Furthermore, overfitting is less harmful, since DAEs overfit by learning the identity function. This overfitting behavior introduces unbiased random noise into the samples, which is no major problem for the EDA but just leads to higher population diversity. As a result, DAE-EDA runs for more generations before convergence and searches promising parts of the solution space more thoroughly. We study the performance of DAE-EDA on several combinatorial single-objective optimization problems. In comparison to the Bayesian Optimization Algorithm, DAE-EDA requires a similar number of evaluations of the objective function but is much faster and can be parallelized efficiently, making it the preferred choice especially for large and difficult optimization problems.
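The basic EDA loop described above (select good solutions, learn a model of them, sample the next population from the model) can be sketched with a much simpler model than a denoising autoencoder. The example below swaps in independent per-bit probabilities, i.e. the univariate marginal distribution algorithm (UMDA), on the OneMax toy problem. It is not DAE-EDA; all names, defaults, and the choice of problem are assumptions made for illustration.

```python
import random


def onemax(bits):
    """Toy objective: number of ones in the bit string."""
    return sum(bits)


def umda(n_bits=20, pop_size=100, n_select=50, generations=30, seed=0):
    """Minimal univariate EDA: the 'model' is one probability per bit."""
    rng = random.Random(seed)
    probs = [0.5] * n_bits  # initial model: uniform over bit strings
    best = None
    for _ in range(generations):
        # Sampling step: draw a new population from the current model.
        population = [
            [1 if rng.random() < p else 0 for p in probs]
            for _ in range(pop_size)
        ]
        population.sort(key=onemax, reverse=True)
        if best is None or onemax(population[0]) > onemax(best):
            best = population[0]
        # Selection step: keep the fittest individuals (truncation selection).
        selected = population[:n_select]
        # Model-learning step: re-estimate per-bit frequencies.
        probs = [sum(ind[i] for ind in selected) / n_select for i in range(n_bits)]
        # Keep probabilities away from 0 and 1 so no bit is fixed forever.
        low, high = 1.0 / n_bits, 1.0 - 1.0 / n_bits
        probs = [min(max(p, low), high) for p in probs]
    return best
```

In DAE-EDA, according to the abstract, the model-learning and sampling steps are carried out by a denoising autoencoder instead, which can represent interactions between variables that independent per-bit probabilities cannot.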