Fraud detection and prevention have received considerable attention from the research community because of their high impact on financial institutions' revenues and reputation. The increased use of the web and the provision of online services expose these systems to numerous threats and jeopardize their effective functioning. Consequently, financial frauds have increased in number and form, imposing various requirements for their efficient and immediate detection. These requirements relate to the performance of the adopted models as well as the timely response of the decision-making mechanism. Machine learning and data mining are two research domains that provide a number of techniques and algorithms for fraud detection and pave the way for mitigation actions. However, these methods still need to be improved with respect to the detection of unknown fraud patterns and the incorporation of big data processing mechanisms. This paper presents our attempt to build a hybrid system, i.e., a sequential scheme that combines two deep learning models to detect potential financial frauds efficiently. We elaborate on the combination of an autoencoder and a Long Short-Term Memory Recurrent Neural Network trained on datasets that are processed with an oversampling technique. Oversampling is adopted to handle heavily imbalanced datasets, which are the 'natural' scenario given the limited number of frauds compared to the enormous volume of transactions. The proposed approach tends to capture many more fraud events than conventional ML techniques. Our experimental evaluation shows that the model performs well in terms of recall and precision.
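The abstract does not say which oversampling technique is used. As a minimal sketch, assuming plain random oversampling with replacement (not necessarily the authors' method), the balancing step could look like the following. The function name `random_oversample` and the convention that label 1 marks fraud are illustrative assumptions.

```python
import numpy as np

def random_oversample(X, y, minority_label=1, seed=0):
    """Duplicate minority-class rows (sampled with replacement) until both classes have equal size.

    Assumption: binary labels, with `minority_label` marking the rare (fraud) class.
    """
    rng = np.random.default_rng(seed)
    X = np.asarray(X)
    y = np.asarray(y)
    minority_idx = np.flatnonzero(y == minority_label)
    majority_idx = np.flatnonzero(y != minority_label)
    n_extra = len(majority_idx) - len(minority_idx)
    # Nothing to do if the data is already balanced or has no minority rows.
    if n_extra <= 0 or len(minority_idx) == 0:
        return X, y
    # Draw the extra minority rows at random, with replacement.
    extra_idx = rng.choice(minority_idx, size=n_extra, replace=True)
    keep = np.concatenate([np.arange(len(y)), extra_idx])
    return X[keep], y[keep]

# Toy data: 8 legitimate transactions and 2 frauds.
X_toy = np.arange(20).reshape(10, 2)
y_toy = np.array([0] * 8 + [1] * 2)
X_bal, y_bal = random_oversample(X_toy, y_toy)
```

Oversampling should be applied to the training split only, so that duplicated fraud cases do not leak into the evaluation data.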
Recent advances in remote sensing techniques provide a new horizon for monitoring the spatiotemporal variations of harmful algal blooms (HABs) in inland water using hyperspectral data. In this study, a hierarchical concatenated variational autoencoder (HCVAE) is proposed as an efficient and accurate deep learning (DL) based bio-optical model. To demonstrate its usefulness in retrieving algal pigments, the HCVAE is applied to bloom-prone regions in Daecheong Lake, South Korea. By abstracting the similarity between highly related features using layer-wise clique-based latent-feature extraction, HCVAE reduces the computational load of deriving outputs while preventing performance degradation. Graph-based clique detection uses information-theory-based criteria to group the related reflectance spectra. Consequently, six latent features were extracted from 79 spectral bands to form the multilevel hierarchy of HCVAE, which can simultaneously estimate concentrations of chlorophyll-a (Chl-a) and phycocyanin (PC). Despite the parsimonious model architecture, the Chl-a and PC concentrations estimated by HCVAE closely agree with the measured concentrations, with test R² values of 0.76 and 0.82, respectively. In addition, spatial distribution maps of algal pigments obtained from HCVAE using drone-borne reflectance successfully capture the blooming spots. Based on its multilevel hierarchical architecture, HCVAE can provide the importance of latent features along with their individual wavelengths using Shapley additive explanations. The most important latent features covered the spectral regions associated with both Chl-a and PC. The lightweight neural network DNNsel, which uses only the spectral bands of highest importance in latent-feature extraction, performed comparably to HCVAE. The study results demonstrate the utility of the multilevel hierarchical architecture as a comprehensive assessment model for near-real-time drone-borne sensing of HABs. Moreover, HCVAE is applica
Vehicle driving behavior analysis and detection tasks have become an indispensable part of intelligent transportation systems. Accurate pattern recognition of potential anomalies during the movement of entities is crucial for improving transportation efficiency. Current methods typically analyze vehicle trajectories independently without considering potential interactions among vehicles. To address this limitation, some studies have integrated graph attention mechanisms to capture the influence of neighboring vehicles during the aggregation process. However, Graph Attention Networks (GATs) are constrained by the univariate nature of attention heads and coefficients, thus lacking flexibility. In this work, we not only consider the social dynamics among neighboring vehicles but also delve into the limitations of GAT models. We propose a Vehicular Social Dynamics Anomaly Detection (VSD-AD) model based on the Recurrent Multi-Mask Aggregator (MMA) enabled variational autoencoder (VAE) architecture to maximize the learning of relational embeddings among neighbors in a highway vehicle network. Furthermore, we apply Node Feature Quantisation (NFQ) to the encoder output to mitigate the complexity of neighbor relationships. Our model is flexible and customizable for different highway scenarios, suitable for large-scale highway vehicle video data. To validate real-world applicability, we further assess its performance on both the simulated dataset and real-world traffic dataset, where our model outperforms other mainstream methods in terms of detection performance.
The field of Integrated Computational Materials Engineering (ICME) combines a broad range of methods to study materials' responses over a spectrum of length scales. A relatively unexplored aspect of microstructure-sensitive materials design is uncertainty propagation and quantification (UP/UQ) of materials' microstructure, as well as establishing process-structure-property (PSP) relationships for inverse material design. In this study, an efficient UP technique built on the idea of changing probability measures and a deep generative unsupervised representative machine learning method for microstructure-based design of thermal conductivity of materials are proposed. Probability measures are used to represent microstructure space, and Wasserstein metrics are used to test the efficiency of the UP method. Using a deep variational autoencoder (VAE), we identify the correlations between the material/process parameters and the thermal conductivity of heterogeneous dual-phase microstructures. Through high-throughput screening, UP, and the deep-generative VAE method, PSP relationships that are otherwise too complex to identify can be revealed by exploring the materials' design space with an emphasis on microstructures. Finally, we show that generative machine learning serves as a useful tool for inverse microstructure-centered materials design by examining the inverse design of thermal conductivity in nano-structured materials. The results reveal the effects of morphology, volume fraction, characteristic length scale, and the individual thermal diffusivity of phases on the thermal conductivity of dual-phase alloys. Our findings emphasize the advantages of high-throughput phase-field modeling and generative deep learning for linking PSP and inverse microstructure-centered materials design.
The online quality variables of soft sensors contribute greatly to obtaining immediate process information. The complex correlations among a large number of process variables, inherited from the dynamic and nonlinear characteristics of chemical processes, pose additional challenges for constructing soft-sensor models. Previously developed steady-state soft sensors are not reliable for dynamic operating systems. Unequal sampling rates for the process and quality data cause missing values of quality data at some time points. This paper proposes a semi-supervised latent dynamic variational autoencoder to learn features between the process and quality data. A prediction network is constructed to generate artificial quality values for model training. The process and quality data are then compressed into the latent space, and the temporal relation is modeled in the clean latent space. The proposed method is compared with the conventional method for quality prediction in a numerical case and an industrial case. © 2022 Elsevier Ltd. All rights reserved.
To discover the uncharted chemical space of ionic liquids (ILs) for CO₂ dissolution, a reliable generative framework combining a re-balanced variational autoencoder (VAE), an artificial neural network (ANN), and particle swarm optimization (PSO) is developed based on a comprehensive experimental solubility database from *** re-balanced VAE transforms the chemical space of ILs into a continuous latent space, which is demonstrated by t-distributed stochastic neighbor embedding (t-SNE) visualization and sampled ions of the latent *** is connected with the re-balanced VAE to predict the CO₂ solubility, and the resultant VAE-ANN model achieves a low mean absolute error (MAE) of 0.022 on the test ***, the PSO algorithm is employed to search the latent space for optimal IL structures with the highest predicted solubility. A total of 5120 ILs are generated and optimized through 10 parallel runs of *** CO₂ solubilities are predicted and compared to those of the 3735 ILs formed by combining the already-known cations and anions in the CO₂ solubility database at 298.15 K and 100 *** results demonstrate a notably larger share of higher CO₂ solubilities among the optimized ILs after PSO, which points out the significance of, and directions for, exploring the wide IL chemical space.
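The latent-space search described above can be illustrated with a generic particle swarm optimizer. This is a sketch under stated assumptions, not the authors' implementation. The function `pso_maximize`, its hyperparameters (inertia, `c1`, `c2`, bounds), and the toy objective are all hypothetical; in the paper's setting the objective would be the VAE-ANN solubility predictor evaluated on a latent vector.

```python
import numpy as np

def pso_maximize(objective, dim, n_particles=32, n_iters=100,
                 bounds=(-3.0, 3.0), inertia=0.7, c1=1.5, c2=1.5, seed=0):
    """Generic particle swarm optimization that maximizes `objective` over a box."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    pos = rng.uniform(lo, hi, size=(n_particles, dim))   # particle positions
    vel = np.zeros_like(pos)                              # particle velocities
    pbest_pos = pos.copy()                                # per-particle best positions
    pbest_val = np.array([objective(p) for p in pos])
    g = int(np.argmax(pbest_val))
    gbest_pos, gbest_val = pbest_pos[g].copy(), float(pbest_val[g])
    for _ in range(n_iters):
        r1 = rng.random(pos.shape)
        r2 = rng.random(pos.shape)
        # Velocity update: inertia + pull toward personal best + pull toward global best.
        vel = inertia * vel + c1 * r1 * (pbest_pos - pos) + c2 * r2 * (gbest_pos - pos)
        pos = np.clip(pos + vel, lo, hi)
        vals = np.array([objective(p) for p in pos])
        improved = vals > pbest_val
        pbest_pos[improved] = pos[improved]
        pbest_val[improved] = vals[improved]
        g = int(np.argmax(pbest_val))
        if pbest_val[g] > gbest_val:
            gbest_pos, gbest_val = pbest_pos[g].copy(), float(pbest_val[g])
    return gbest_pos, gbest_val

def toy_objective(z):
    """Hypothetical stand-in for a learned predictor; its maximum is at z = (1, ..., 1)."""
    return -float(np.sum((z - 1.0) ** 2))

best_z, best_val = pso_maximize(toy_objective, dim=4)
```

In a generative workflow, the best latent vector would then be decoded back into a candidate structure. Running several independent swarms with different seeds, as the abstract's parallel runs suggest, reduces the risk of all particles settling in one local optimum.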
The variational autoencoder (VAE) has garnered extensive attention in the field of soft sensor modeling due to its superior capabilities in probabilistic data description and feature extraction. However, a single-layer VAE struggles to extract higher-level features from strongly nonlinear process data. This paper proposes a gated stacked target-supervised VAE with variable weights (W-GSTVAE) to improve the prediction performance of VAE-based models. First, a stacked VAE is employed to enhance the feature extraction capability. In the pretraining phase, to strengthen the correlation between the features and the target variable, feature learning is guided by incorporating the prediction error of target values into the loss function and by calculating the maximum information coefficient between input and target variables. In the fine-tuning phase, to make full use of shallow features, gated linear units are used to integrate the output features of each layer, exploiting the information from all layers. Finally, the effectiveness and superiority of the proposed model are demonstrated through two real industrial cases.
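For readers unfamiliar with gated linear units, the sketch below shows the standard GLU form, a linear projection multiplied elementwise by a sigmoid gate, applied to concatenated per-layer features. How W-GSTVAE actually wires its gates is not given in the abstract. The concatenation step, the shapes, and the names `glu`, `h1`, and `h2` are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def glu(x, W, b, V, c):
    """Gated linear unit: (xW + b) multiplied elementwise by sigmoid(xV + c)."""
    return (x @ W + b) * sigmoid(x @ V + c)

rng = np.random.default_rng(0)
h1 = rng.normal(size=(5, 3))   # hypothetical features from a shallow layer
h2 = rng.normal(size=(5, 2))   # hypothetical features from a deeper layer
x = np.concatenate([h1, h2], axis=1)   # assumed fusion input: all layers side by side

W = rng.normal(size=(5, 4))
b = np.zeros(4)
V = np.zeros((5, 4))           # zero gate weights -> every gate equals sigmoid(0) = 0.5
c = np.zeros(4)
fused = glu(x, W, b, V, c)
```

With the gate weights at zero, each gate is exactly 0.5, so the output is half the linear projection. Training moves the gates away from 0.5 so that informative layers pass through and uninformative ones are suppressed.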
The raw data used to train machine learning models faces a potential threat from membership inference attacks. To mitigate this risk, employing synthetic data instead of real data has proved effective in desensitizing the information. We introduce a novel generative model, combining a variational autoencoder and a Generative Adversarial Network, to enhance privacy protection by generating synthetic data. In our approach, discrete variables are encoded by conditional generators, and sampling training is employed to ensure that the distribution of synthetic data closely aligns with that of the real data. The modification of the model structure prompts a refinement of the loss function. We leverage the Wasserstein distance with gradient penalty and SNorm to keep the training process stable. Experimental results demonstrate that our model surpasses existing state-of-the-art models in terms of data utility metrics. Notably, under membership inference attacks, the similarity of the results indicates how difficult it is to distinguish real data from synthetic data. This suggests that our model has notable capabilities for privacy protection.
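For reference, the commonly used form of the Wasserstein critic loss with gradient penalty is written out below. The abstract does not state the refined loss itself, so this is the standard WGAN-GP objective rather than the authors' exact formulation. Here $D$ is the critic, $P_r$ and $P_g$ are the real and generated data distributions, and $\lambda$ is the penalty weight.

```latex
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\left[D(\tilde{x})\right]
    - \mathbb{E}_{x \sim P_r}\left[D(x)\right]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}\left[\left(\lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1\right)^2\right],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

The penalty term pushes the critic's gradient norm toward 1 at points interpolated between real and generated samples, which softly enforces the 1-Lipschitz constraint that the Wasserstein formulation requires.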
The advent of single-cell RNA sequencing (scRNA-seq) has revolutionized gene expression research at the single-cell level, enabling the study of cellular heterogeneity and identification of rare cell populations. Deep clustering is crucial for analyzing scRNA-seq datasets by assigning cells to subpopulations. However, inherent sparsity and variability in gene expression pose challenges to clustering accuracy. To address these issues, a novel unsupervised deep clustering approach named single-cell Combined graph Attentional clustering (scCAT) is introduced. The method uses a dual-branch joint dimensionality reduction (JDR) module to learn gene expression. This strategy preserves key variance while capturing complex nonlinear relationships, effectively addressing the high-dimensionality challenges of single-cell data. Additionally, a zero-inflated negative binomial (ZINB) distribution is integrated within the JDR to tackle significant noise, sparsity, and zero-inflation challenges. A graph attention autoencoder (GATE) then processes the graph-structured data, enhancing the integration of cellular topological relationships and gene expression. Finally, a k-means-based self-optimization technique refines clustering while synchronizing with representation learning. Experimental evaluations on eight scRNA-seq datasets demonstrate that scCAT significantly improves clustering performance and mitigates the impact of inherent data defects. The embeddings learned from both linear and non-linear perspectives eliminate inter-component interactions, uncovering complex underlying structures and enhancing the recognition of cellular topological relationships when combined with graph neural networks.
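To make the ZINB component concrete, the sketch below evaluates the zero-inflated negative binomial probability mass function in its usual mean/dispersion parameterization. It illustrates the distribution only; how scCAT embeds it in the JDR module is not described in the abstract. The parameter names `mu` (mean), `theta` (dispersion), and `pi` (dropout probability) are conventional but assumed here.

```python
import math

def nb_log_pmf(x, mu, theta):
    """Log-pmf of a negative binomial with mean mu > 0 and dispersion theta > 0."""
    return (math.lgamma(x + theta) - math.lgamma(theta) - math.lgamma(x + 1)
            + theta * math.log(theta / (theta + mu))
            + x * math.log(mu / (theta + mu)))

def zinb_pmf(x, mu, theta, pi):
    """ZINB pmf: with probability pi the count is a structural zero, else it is NB-distributed."""
    nb = math.exp(nb_log_pmf(x, mu, theta))
    if x == 0:
        return pi + (1.0 - pi) * nb   # zeros come from dropout or from the NB itself
    return (1.0 - pi) * nb

# Sanity check on hypothetical parameters: the probabilities should sum to 1.
zinb_total = sum(zinb_pmf(k, mu=2.0, theta=1.5, pi=0.3) for k in range(200))
p_zero = zinb_pmf(0, mu=2.0, theta=1.5, pi=0.3)
```

In a ZINB-based autoencoder, the negative log of this probability, with the three parameters predicted per gene by the decoder, typically serves as the reconstruction loss for raw count data.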
One of the main challenges in deep learning-based underwater image enhancement is the limited availability of high-quality training data. Underwater images are often difficult to capture and typically suffer from distortion, colour loss, and reduced contrast, complicating the training of supervised deep learning models on large and diverse datasets. This limitation can adversely affect the performance of the model. In this paper, we propose an alternative approach to supervised underwater image enhancement. Specifically, we introduce a novel framework called Uncertainty Distribution Network (UDnet), which adapts to uncertainty distribution during its unsupervised reference map (label) generation to produce enhanced output images. UDnet enhances underwater images by adjusting contrast, saturation, and gamma correction. It incorporates a statistically guided multicolour space stretch module (SGMCSS) to generate a reference map, which is utilized by a U-Net-like conditional variational autoencoder module (cVAE) for feature extraction. These features are then processed by a Probabilistic Adaptive Instance Normalization (PAdaIN) block that encodes the feature uncertainties for the final image enhancement. The SGMCSS module ensures visual consistency with the input image and eliminates the need for manual human annotation. Consequently, UDnet can learn effectively with limited data and achieve state-of-the-art results. We evaluated UDnet on eight publicly available datasets, and the results demonstrate that it achieves competitive performance compared to other state-of-the-art methods in both quantitative and qualitative metrics. Our code is publicly available at https://***/alzayats/UDnet.