K-nearest neighbor (KNN) is one of the most fundamental methods for unsupervised outlier detection because of its various advantages, e.g., ease of use and relatively high ***, most data analytic tasks need to deal with high-dimensional data, and the KNN-based methods often fail due to “the curse of dimensionality”. Autoencoder-based methods have recently been introduced to use reconstruction errors for outlier detection on high-dimensional data, but the direct use of an autoencoder typically does not preserve the data proximity relationships well for outlier *** this study, we propose to combine KNN with autoencoder for outlier ***, we propose the Nearest Neighbor autoencoder (NNAE) by preserving the original data proximity in a much lower dimension that is more suitable for performing ***, we propose the K-nearest reconstruction neighbors (K NRNs) by incorporating the reconstruction errors of NNAE with the K-distances of KNN to detect ***, we develop a method to automatically choose better parameters for optimizing the structure of ***, using five real-world datasets, we experimentally show that our proposed approach NNAE+K NRN is much better than existing methods, i.e., KNN, Isolation Forest, a traditional autoencoder using reconstruction errors (autoencoder-RE), and Robust autoencoder.
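The idea of pairing K-distances with reconstruction errors can be illustrated with a minimal sketch. The abstract does not say how the two signals are combined or how NNAE is trained, so `combined_score`, its `alpha` weight, and the min-max normalization below are hypothetical, and the reconstruction errors are treated as given inputs with made-up values.

```python
import math

def k_distance(points, k):
    """Distance from each point to its k-th nearest neighbor (itself excluded)."""
    scores = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        scores.append(dists[k - 1])
    return scores

def min_max(values):
    """Rescale values to [0, 1]; a constant list maps to all zeros."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]

def combined_score(k_dists, recon_errors, alpha=0.5):
    """Hypothetical combination: weighted sum of the two normalized signals."""
    kd = min_max(k_dists)
    re = min_max(recon_errors)
    return [alpha * a + (1 - alpha) * b for a, b in zip(kd, re)]

# Four clustered points and one distant point.
points = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1), (0.1, 0.1), (5.0, 5.0)]
# Made-up reconstruction errors standing in for an autoencoder's output.
recon_errors = [0.02, 0.03, 0.02, 0.03, 0.90]

k_dists = k_distance(points, k=2)
scores = combined_score(k_dists, recon_errors)
most_outlying = max(range(len(points)), key=scores.__getitem__)
```

In this toy data the distant point receives the highest score on both signals, so the choice of `alpha` does not change the ranking; on real data the weighting would need tuning.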
Nowadays, personalized recommendation has become a research hotspot for addressing information *** this, generating effective recommendations from sparse data remains a ***, auxiliary information has been widely used to address data sparsity, but most models using auxiliary information are linear and have limited *** to the advantages of feature extraction and no-label requirements, autoencoder-based methods have become quite ***, most existing autoencoder-based methods discard the reconstruction of auxiliary information, which poses huge challenges for better representation learning and model *** address these problems, we propose Serial-autoencoder for Personalized Recommendation (SAPR), which aims to reduce the loss of critical information and enhance the learning of feature ***, we first combine the original rating matrix and item attribute features and feed them into the first autoencoder for generating a higher-level representation of the ***, we use a second autoencoder to enhance the reconstruction of the data representation of the prediction rating *** output rating information is used for recommendation *** experiments on the MovieTweetings and MovieLens datasets have verified the effectiveness of SAPR compared to state-of-the-art models.
This paper presents an IC implementation of an on-chip learning neuromorphic autoencoder unit in the form of a rate-based spiking neural network. With a current-mode signaling scheme embedded in a 500 x 500 6b SRAM-based memory, the proposed architecture achieves simultaneous processing of multiplications and accumulations. In addition, a transposable memory read for both forward and backward propagation and a virtual lookup table are proposed to perform unsupervised learning of a restricted Boltzmann machine. The IC is fabricated in a 28-nm CMOS process and is verified in a three-layer network of an encoder-decoder pair for training and recovery of two-dimensional images of 16 x 16 pixels. With a dataset of 50 digits, the IC shows a normalized root mean square error of 0.078. Measured energy efficiencies are 4.46 pJ per synaptic operation for inference and 19.26 pJ per synaptic weight update for learning. The learning performance is also estimated by simulation for the case where the proposed hardware architecture is extended to batch training on 60 000 MNIST datasets.
Dynamic community detection is significant for controlling and capturing the temporal features of networks. The evolutionary clustering framework provides a temporal smoothness constraint for simultaneously maximizing the clustering quality at the current time step and minimizing the clustering deviation between two successive time steps. Based on this framework, some existing methods, such as evolutionary spectral clustering and evolutionary nonnegative matrix factorization, look for a low-dimensional representation by mapping reconstruction. However, such reconstruction does not address the nonlinear characteristics of networks. In this paper, we propose a semi-supervised algorithm (sE-autoencoder) to overcome the effects of the nonlinear property on the low-dimensional representation. Our proposed method extends the typical nonlinear reconstruction model to the dynamic network by constructing a temporal matrix. More specifically, the potential community characteristics and the previous clustering, as prior information, are incorporated into the loss function as a regularization term. Experimental results on synthetic and real-world datasets demonstrate that the proposed method is effective and superior to other methods for dynamic community detection.
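The trade-off that the evolutionary clustering framework describes is often written as a weighted sum of two costs. The form below is a commonly used formulation and not necessarily the exact objective of sE-autoencoder; the symbols $CS_t$ (snapshot cost, low when the clustering quality at time step $t$ is high), $CT_t$ (temporal cost, the deviation from the clustering at step $t-1$), and $\alpha$ are assumed notation.

```latex
\mathrm{Cost}_t = \alpha \, CS_t + (1 - \alpha) \, CT_t, \qquad 0 \le \alpha \le 1
```

With $\alpha = 1$ the history is ignored and each time step is clustered on its own; smaller values of $\alpha$ penalize deviation from the previous clustering more heavily.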
Due to the increasing cyber-attacks, various Intrusion Detection Systems (IDSs) have been proposed to identify network *** existing machine learning-based IDSs learn patterns from the features extracted from network traffic flows, and the deep learning-based approaches can learn data distribution features from the raw data to differentiate normal and anomalous network *** having been used in the real world widely, the above methods are vulnerable to some types of *** this paper, we propose a novel attack framework, Anti-Intrusion Detection autoencoder (AIDAE), to generate features to disable the *** the proposed framework, an encoder transforms features into a latent space, and multiple decoders reconstruct the continuous and discrete features, ***, a generative adversarial network is used to learn the flexible prior distribution of the latent *** correlation between continuous and discrete features can be kept by using the proposed training *** conducted on NSL-KDD, UNSW-NB15, and CICIDS2017 datasets show that the generated features indeed degrade the detection performance of existing IDSs dramatically.
Shield tunneling machines are paramount underground engineering equipment and play a key role in tunnel *** the shield construction process, the “mud cake” formed by the difficult-to-remove clay attached to the cutterhead severely affects the shield construction efficiency and is harmful to the healthy operation of a shield tunneling *** this study, we propose an enhanced transformer-based detection model for detecting the cutterhead clogging status of shield tunneling ***, the working state data of shield machines are selected from historical excavation data, and a long short-term memory-autoencoder neural network module is constructed to remove ***, variational mode decomposition and wavelet transform are employed to denoise the *** the preprocessing, nonoverlapping rectangular windows are used to intercept the working state data to obtain the time slices used for analysis, and several time-domain features of these periods are *** to the data imbalance in the original dataset, the k-means-synthetic minority oversampling technique algorithm is adopted to oversample the extracted time-domain features of the clogging data in the training set to balance the dataset and improve the model ***, an enhanced transformer-based neural network is constructed to extract essential implicit features and detect cutterhead clogging *** collected from actual tunnel construction projects are used to verify the proposed *** results show that the proposed model achieves accurate detection of shield machine cutterhead clogging status, with 98.85% accuracy and a 0.9786 F1 ***, the proposed model significantly outperforms the comparison models.
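The windowing step in this abstract can be illustrated with a short sketch. The abstract does not list which time-domain features are extracted, so the three computed here (mean, root mean square, peak absolute value) are assumed examples, and the function names, window size, and readings are hypothetical.

```python
import math

def nonoverlapping_windows(signal, size):
    """Split a sequence into consecutive windows of `size`; drop any trailing remainder."""
    return [signal[i:i + size] for i in range(0, len(signal) - size + 1, size)]

def time_domain_features(window):
    """Compute a few common time-domain statistics for one window."""
    n = len(window)
    mean = sum(window) / n
    rms = math.sqrt(sum(x * x for x in window) / n)
    peak = max(abs(x) for x in window)
    return {"mean": mean, "rms": rms, "peak": peak}

# Made-up sensor readings; seven samples with a window of three leaves one sample unused.
readings = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
windows = nonoverlapping_windows(readings, size=3)
features = [time_domain_features(w) for w in windows]
```

Because the windows do not overlap, each sample contributes to at most one feature vector, which keeps the resulting time slices independent of one another.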
ISBN: (print) 9798350343557
Controlling the events occurring in network traffic and detecting malicious activities are of great importance for the security and sustainability of the system. For this reason, it is necessary to accurately detect the different types of attacks that may occur in network traffic. It is also very important, from a security standpoint, to be able to distinguish types of attacks that have not been seen before. In this paper, a two-step procedure is proposed that can both correctly classify known attack types and distinguish unknown attack types. In the first stage, incoming traffic is classified by supervised learning with an autoencoder. In the second stage, the reliability of this classification is checked with the help of the autoencoder and Extreme Value Theory (EVT). Depending on the reliability value, incoming traffic may instead be assigned to the unknown class. The simulation results were obtained with the IDS 2017 data set, which is widely used in the literature.
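The second-stage check can be sketched as a rejection rule applied to the first-stage prediction. This is only an illustration under stated assumptions: the abstract does not describe how the reliability value is computed, so the sketch uses the autoencoder's reconstruction error as the reliability signal and a plain empirical quantile as a stand-in for an EVT tail fit. The function names, the `quantile` parameter, the `"DoS"` label, and the error values are all hypothetical.

```python
def empirical_threshold(train_errors, quantile=0.95):
    """Stand-in for an EVT tail fit: an empirical quantile of the training errors."""
    ordered = sorted(train_errors)
    index = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[index]

def classify_with_rejection(predicted_label, recon_error, threshold):
    """Keep the first-stage label unless the reconstruction error is too large."""
    return "unknown" if recon_error > threshold else predicted_label

# Made-up reconstruction errors observed on known traffic: 0.01, 0.02, ..., 1.00.
train_errors = [0.01 * i for i in range(1, 101)]
threshold = empirical_threshold(train_errors, quantile=0.95)

accepted = classify_with_rejection("DoS", recon_error=0.10, threshold=threshold)
rejected = classify_with_rejection("DoS", recon_error=2.00, threshold=threshold)
```

A proper EVT approach would model the tail of the error distribution rather than read off a quantile, which matters most when the threshold has to extrapolate beyond the observed errors.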
In the past decade, recommender systems have been widely used to provide users with personalized products and ***, most traditional recommender systems are still facing a challenge in dealing with the huge volume, complexity, and dynamics of *** tackle this challenge, many studies have been conducted to improve recommender systems by integrating deep learning *** an unsupervised deep learning method, autoencoder has been widely used for its excellent performance in data dimensionality reduction, feature extraction, and data ***, recent research has shown the high efficiency of autoencoder in information retrieval and recommendation *** autoencoder on recommender systems would improve the quality of recommendations due to its better understanding of users' demands and characteristics of *** paper reviews the recent research on autoencoder-based recommender *** differences between autoencoder-based recommender systems and traditional recommender systems are presented in this *** last, some potential research directions of autoencoder-based recommender systems are discussed.
Single-cell RNA sequencing (scRNA-seq) technology has become an effective tool for high-throughput transcriptomic study, which circumvents the averaging artifacts corresponding to bulk RNA-seq technology, yielding new perspectives on the cellular diversity of potential superficially homogeneous *** various sequencing techniques have decreased the amplification bias and improved capture efficiency caused by the low amount of starting material, the technical noise and biological variation are inevitably introduced into the experimental process, resulting in high dropout events, which greatly hinder the downstream *** the bimodal expression pattern and the right-skewed characteristic present in normalized scRNA-seq data, we propose a customized autoencoder based on a two-part generalized-gamma distribution (AE-TPGG) for scRNA-seq data analysis, which takes mixed discrete-continuous random variables of scRNA-seq data into account using a two-part model and utilizes the generalized gamma (GG) distribution for fitting the positive and right-skewed continuous *** adopted autoencoder enables AE-TPGG to capture the inherent relationship between *** addition to the ability of achieving low-dimensional representation, the AE-TPGG model also provides a denoised imputation according to the statistical characteristics of gene *** on real datasets demonstrate that our proposed model is competitive with current imputation methods and ameliorates a diverse set of typical scRNA-seq data analyses.
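A two-part model of this kind can be sketched as a point mass at zero plus a generalized gamma density for positive values. The abstract does not give the parameterization used by AE-TPGG, so the sketch below assumes the Stacy form of the generalized gamma distribution with scale `a` and shape parameters `d` and `p`; the names `pi_zero`, `gen_gamma_pdf`, and `two_part_log_likelihood`, and the expression values, are hypothetical.

```python
import math

def gen_gamma_pdf(x, a, d, p):
    """Generalized gamma density (Stacy parameterization) for x > 0."""
    return (p / a ** d) * x ** (d - 1) * math.exp(-((x / a) ** p)) / math.gamma(d / p)

def two_part_log_likelihood(values, pi_zero, a, d, p):
    """Zeros contribute log(pi_zero); positives contribute log((1 - pi_zero) * pdf)."""
    total = 0.0
    for x in values:
        if x == 0:
            total += math.log(pi_zero)
        else:
            total += math.log(1.0 - pi_zero) + math.log(gen_gamma_pdf(x, a, d, p))
    return total

# Made-up normalized expression values for one gene across five cells.
expression = [0.0, 0.0, 1.2, 0.4, 3.1]
ll = two_part_log_likelihood(expression, pi_zero=0.4, a=1.0, d=2.0, p=1.5)
```

With `p = 1` this density reduces to an ordinary gamma distribution with shape `d` and scale `a`, and with `d = p = 1` to an exponential distribution, which gives two quick sanity checks on the formula.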
In cybersecurity, intrusion detection systems (IDSs) play a crucial role in identifying potential vulnerability exploits, thus reinforcing the network's defense infrastructure. Integrating machine learning models into IDS development has improved detection of complex and evolving intrusion patterns. However, imbalanced training data hampers model effectiveness, leading to classification inaccuracies and false alarms. This study proposes an IDS model using a dual-stage deep learning approach to address class imbalance. Initially, a sparse autoencoder (SAE) detects anomalies and extracts features. The subsequent stage employs a layered deep learning model combining convolutional neural network (CNN) and bidirectional long short-term memory (Bi-LSTM) architectures for multiclass classification. The model uses a cross-entropy loss function with proportional class weights. Evaluation on the NSL-KDD dataset demonstrates significant enhancements in overall accuracy, recall rate, and false positive rate, particularly for minority classes, showcasing its competitiveness against baseline models and other approaches.
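A class-weighted cross-entropy loss can be sketched as follows. The abstract does not define "proportional class weights", so this sketch assumes weights inversely proportional to class frequency, a common heuristic. The function names, the toy labels, and the predicted probabilities are hypothetical.

```python
import math
from collections import Counter

def inverse_frequency_weights(labels):
    """Assumed weighting: n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}

def weighted_cross_entropy(probs, labels, weights):
    """Mean of -w[y] * log p(y) over the samples."""
    losses = [-weights[y] * math.log(p[y]) for p, y in zip(probs, labels)]
    return sum(losses) / len(losses)

# Toy imbalanced batch: three majority-class samples and one minority-class sample.
labels = ["normal", "normal", "normal", "u2r"]
weights = inverse_frequency_weights(labels)

# Made-up predicted class probabilities for each sample.
probs = [
    {"normal": 0.9, "u2r": 0.1},
    {"normal": 0.8, "u2r": 0.2},
    {"normal": 0.7, "u2r": 0.3},
    {"normal": 0.6, "u2r": 0.4},
]
loss = weighted_cross_entropy(probs, labels, weights)
```

Under this weighting the single minority sample counts three times as much as each majority sample, so a misclassified rare attack contributes more to the loss than a misclassified normal flow.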