ISBN (print): 9783030882105; 9783030882099
The use of deep learning in medical image analysis is hindered by insufficient annotated data and the inability of models to generalize between different imaging settings. We address these problems using a novel variational style-transfer neural network that can sample various styles from a computed latent space to generate images from a broader domain than what was observed. We show that using our generative approach for ultrasound data augmentation and domain adaptation during training improves the performance of the resulting deep learning models, even when tested within the observed domain.
ISBN (print): 9781728162423
Reinforcement Learning (RL) is known to be often unsuccessful in environments with sparse extrinsic rewards. A possible countermeasure is to endow RL agents with an intrinsic reward function, or 'intrinsic motivation', which rewards the agent based on certain features of the current sensor state. An intrinsic reward function based on the principle of empowerment assigns rewards proportional to the amount of control the agent has over its own sensors. We implemented a variation on a recently proposed intrinsically motivated agent, which we refer to as the 'curious' agent, and an empowerment-inspired agent. The former leverages sensor state encoding with a variational autoencoder, while the latter predicts the next sensor state via a variational information bottleneck. We compared the performance of both agents to that of an advantage actor-critic baseline in four sparse reward grid worlds. Both the empowerment agent and its curious competitor seem to benefit to similar extents from their intrinsic rewards. This provides some experimental support to the conjecture that empowerment can be used to drive exploration.
ISBN (print): 9780791885444
This paper brings together rigid body kinematics and machine learning to create a novel approach to path synthesis of linkage mechanisms under practical constraints, such as the location of pivots. We model the coupler curve and constraints as probability distributions of image pixels and employ a Convolutional Neural Network (CNN) based variational autoencoder (VAE) architecture to capture and predict the features of the mechanism. Plausible solutions are found by performing informed latent space exploration so as to minimize the changes to the input coupler curve while seeking user-defined pivot locations. Traditionally, kinematic synthesis problems are solved using the precision point approach, wherein the input path is represented as a set of points and a set of equations in terms of design parameters is formulated. Generally, this problem is solved via optimization, wherein a measure of error between the given path and the coupler curve is minimized. A limitation of this approach is that the existing formulations depend on the type of mechanism, do not admit practical constraints in a unified way, and provide a limited number of solutions. However, in the machine design pipeline, kinematic synthesis problems are concept generation problems, where designers care more about obtaining a large number of plausible and practical solutions than about the precision of the input or the solutions. The image-based approach proposed in this paper alleviates the difficulty associated with inherently uncertain inputs and constraints.
ISBN (print): 9781665421744
Text matching is an important method for judging the semantic similarity of different sentences. Improving the efficiency and accuracy of text matching is a major focus in the field of information matching. In recent years, deep learning has been widely applied to text matching tasks and has achieved good results. However, different models have different limitations: CNNs cannot learn global semantic information well, RNNs cannot be parallelized well, and large pre-trained language models have too many parameters to be deployed easily on hardware. To address these problems, this paper proposes a self-attention based text matching model with generative pre-training. A self-attention mechanism is adopted to learn the semantic information between words in a sentence, and it also allows better parallelization. We use depth-wise separable convolutions to obtain local features. In the pre-training stage of this model, a generative model, the variational autoencoder, is used to learn the semantic relationship between similar sentences. In the downstream text matching model, we employ a Siamese network structure, combine depth-wise separable convolutions and the self-attention mechanism for feature extraction, and use an attention mechanism for text interaction; the parameters from the pre-training phase are shared with this model. Finally, we evaluate our model on three datasets: LCQMC, QQP, and a securities dataset. Experimental results show that our method achieves good performance.
ISBN (print): 9781728176055
People can easily imagine the potential sound while seeing an event. This natural synchronization between audio and visual signals reveals their intrinsic correlations. To this end, we propose to learn the audio-visual correlations from the perspective of cross-modal generation in a self-supervised manner; the learned correlations can then be readily applied in multiple downstream tasks such as audio-visual cross-modal localization and retrieval. We introduce a novel variational autoencoder (VAE) framework that consists of Multiple encoders and a Shared decoder (MS-VAE) with an additional Wasserstein distance constraint to tackle the problem. Extensive experiments demonstrate that the optimized latent representation of the proposed MS-VAE can effectively learn the audio-visual correlations and can be readily applied in multiple audio-visual downstream tasks to achieve competitive performance, even without any label information during training.
ISBN (print): 9781665400190
This paper presents a novel speech emotion recognition (SER) method to capture the uncertainty in predicting emotional attributes using the true distribution of scores provided by annotators as ground truth (i.e., soft-labels). Reliable, generalizable, and scalable SER systems are important in areas such as healthcare, customer service, security, and defense. A barrier to building these systems is the lack of quality labels due to the expensive annotation process, leading to poor generalization. To address this limitation, this study proposes a semi-supervised generative modeling approach using a variational autoencoder (VAE) with an emotional regressor at the bottleneck trained with soft-labels of emotional attributes. We demonstrate that estimating uncertainties in predicting emotional attribute scores is possible with soft-labels. We analyze the benefits of uncertainty estimation with a reject option formulation, where the model can abstain from predicting emotion when it is less confident. At 60% test coverage, we achieve relative improvements in concordance correlation coefficient (CCC) of up to 16.85% for valence, 7.12% for arousal, and 8.01% for dominance. Furthermore, we propose an uncertainty transfer learning strategy where uncertainties learned from one attribute are used as a sample re-ordering criterion for another attribute, achieving additional improvements in prediction performance for valence. We also demonstrate the generalization power of our method in comparison to other uncertainty estimating methods using cross-corpus evaluations. Finally, we demonstrate that our method has lower computational complexity than alternative approaches.
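The evaluation described in this abstract (CCC measured at a fixed test coverage under a reject option) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `concordance_cc` and `ccc_at_coverage` are hypothetical, the CCC follows Lin's standard definition, and the reject option is assumed to keep the fraction of samples with the lowest per-sample uncertainty, which the abstract does not spell out.

```python
import numpy as np

def concordance_cc(y_true, y_pred):
    """Lin's concordance correlation coefficient between two 1-D arrays.

    Undefined (division by zero) if both arrays are constant and equal in mean.
    """
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mean_t, mean_p = y_true.mean(), y_pred.mean()
    # Population covariance, matching the population variances used below.
    covariance = np.mean((y_true - mean_t) * (y_pred - mean_p))
    return 2.0 * covariance / (y_true.var() + y_pred.var() + (mean_t - mean_p) ** 2)

def ccc_at_coverage(y_true, y_pred, uncertainty, coverage=0.6):
    """CCC on the `coverage` fraction of samples with the lowest uncertainty.

    Assumption: the model abstains on the most uncertain samples first.
    """
    n_keep = max(2, int(round(coverage * len(y_true))))
    keep = np.argsort(uncertainty)[:n_keep]  # indices of most confident samples
    return concordance_cc(np.asarray(y_true)[keep], np.asarray(y_pred)[keep])
```

With `coverage=0.6`, the score is computed on the 60% of test samples the model is most confident about; comparing it with `concordance_cc` on the full set gives the kind of relative improvement the abstract reports.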
ISBN (print): 9783030871994; 9783030871987
Gastrointestinal (GI) cancer precursors require frequent monitoring for risk stratification of patients. Automated segmentation methods can help to assess risk areas more accurately, and assist in therapeutic procedures or even removal. In clinical practice, in addition to conventional white-light imaging (WLI), complementary modalities such as narrow-band imaging (NBI) and fluorescence imaging are used. While most segmentation approaches today are supervised and concentrate on a single-modality dataset, this work uses a target-independent unsupervised domain adaptation (UDA) technique that is capable of generalizing to an unseen target modality. In this context, we propose a novel UDA-based segmentation method that couples the variational autoencoder and U-Net with a common EfficientNet-B4 backbone, and uses a joint loss for latent-space optimization for target samples. We show that our model can generalize to the unseen NBI (target) modality when trained using only the WLI (source) modality. Our experiments on both upper and lower GI endoscopy data show the effectiveness of our approach compared to a naive supervised approach and state-of-the-art UDA segmentation methods.
The objective of this paper is to propose a novel deep learning methodology to gain pragmatic insights into the behavior of an insured person using unsupervised variable importance. It lays the groundwork for understanding how insights can be gained into the fraudulent behavior of an insured person with minimum effort. Starting with a preliminary investigation of the limitations of the existing fraud detection models, we propose a new variable importance methodology incorporated with two prominent unsupervised deep learning models, namely, the autoencoder and the variational autoencoder. Each model's dynamics are discussed to show how the models can be adapted for fraud detection and how their results should be interpreted. Both qualitative and quantitative performance evaluations are conducted, although a greater emphasis is placed on qualitative evaluation. To broaden the frame of reference for the fraud detection setting, various metrics are used in the qualitative evaluation.
The growing societal dependence on social media and user generated content for news and information has increased the influence of unreliable sources and fake content, which muddles public discourse and lessens trust in the media. Validating the credibility of such information is a difficult task that is susceptible to confirmation bias, leading to the development of algorithmic techniques to distinguish between fake and real news. However, most existing methods are challenging to interpret, making it difficult to establish trust in predictions, and make assumptions that are unrealistic in many real-world scenarios, e.g., the availability of audiovisual features or provenance. In this work, we focus on fake news detection of textual content using interpretable features and methods. In particular, we have developed a deep probabilistic model that integrates a dense representation of textual news using a variational autoencoder and bi-directional Long Short-Term Memory (LSTM) networks with semantic topic-related features inferred from a Bayesian admixture model. Extensive experimental studies with three real-world datasets demonstrate that our model achieves comparable performance to state-of-the-art competing models while facilitating model interpretability from the learned topics. Finally, we have conducted model ablation studies to justify the effectiveness and accuracy of integrating neural embeddings and topic features both quantitatively by evaluating performance and qualitatively through separability in lower dimensional embeddings.
ISBN (print): 9781450397773
The current maintenance of aerospace equipment generally uses regular maintenance, scheduled maintenance, seasonal maintenance, after-the-fact maintenance, and replacement maintenance. These methods are ill-timed, time-consuming, and wasteful of materials. Monitoring the reliability and healthy operating status of each embedded computer electronic component is essential, and maintenance staff will benefit greatly from a data-driven approach to anomaly detection. With such an approach, maintenance can shift from "repair afterward" to "repair as necessary" and from "repair regularly" to "repair at any time", solving the practical problems that arise in maintenance. In this paper, a variational autoencoder (VAE) trained on component storage aging acceleration data is used to model the component's normal operating status and perform anomaly detection. The precision and recall of this anomaly detection method are 0.950 and 0.977, respectively. This method evaluates the operating status and reliability of each component, improves the reliability and service life of the computer, and establishes the technological framework for the next generation of computer Prognostics and Health Management (PHM) systems.
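For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how precision and recall are computed for a threshold-based anomaly detector. It is illustrative only: the function name `precision_recall`, the use of a per-sample anomaly score (for example a VAE reconstruction error), and the fixed threshold are all assumptions, since the abstract does not describe the scoring rule.

```python
import numpy as np

def precision_recall(is_anomaly, scores, threshold):
    """Precision and recall when samples with score > threshold are flagged.

    is_anomaly: ground-truth labels (truthy = anomalous component state).
    scores:     hypothetical anomaly scores, e.g. reconstruction errors.
    """
    is_anomaly = np.asarray(is_anomaly, dtype=bool)
    flagged = np.asarray(scores, dtype=float) > threshold
    tp = np.sum(flagged & is_anomaly)    # anomalies correctly flagged
    fp = np.sum(flagged & ~is_anomaly)   # normal samples wrongly flagged
    fn = np.sum(~flagged & is_anomaly)   # anomalies that were missed
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return float(precision), float(recall)
```

Raising the threshold typically trades recall for precision, so a reported pair such as 0.950 and 0.977 corresponds to one particular operating point.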