ISBN: (print) 9783031363351; 9783031363368
Providing timely assistance to students in intelligent tutoring systems is a challenging research problem. In this study, we address this problem by determining when to provide proactive help, using autoencoder-based feature learning and a deep reinforcement learning (DRL) model. To increase generalizability, we use only domain-independent features for the policy. The proposed pedagogical policy provides next-step proactive hints based on the prediction of the DRL model. We conduct a study to examine the effectiveness of the new policy in an intelligent logic tutor. Our findings provide insight into the use of DRL policies that rely on autoencoder-based feature learning to determine when to provide proactive help to students.
ISBN: (print) 9798350386813; 9798350386820
This study proposes a method for improved image compression that combines the discrete cosine transform (DCT) with an autoencoder. Images typically contain large amounts of data that require significant storage space, making them difficult to store and transmit. Compressing images is a practical solution to this issue, as it reduces memory usage and enables faster transmission to the receiver. In this approach, we use the DCT as a preprocessing step before training an autoencoder model to compress the image while retaining all essential information. The proposed method involves a convolutional neural network (CNN) that performs down-sampling and up-sampling operations on the DCT-processed input data. The performance of the proposed method is evaluated and compared with traditional image compression techniques such as JPEG, JPEG 2000, and BPG. Experimental results demonstrate that the proposed approach outperforms the traditional techniques in terms of compression ratio and image quality.
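The DCT preprocessing step can be illustrated with a minimal sketch. The abstract does not say how the DCT is applied, so the 8x8 block size, the orthonormal DCT-II variant, and the function names `dct_matrix`, `blockwise_dct`, and `blockwise_idct` are all assumptions made for illustration; the autoencoder itself is not shown.

```python
import numpy as np

def dct_matrix(n: int) -> np.ndarray:
    """Return the orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n).reshape(-1, 1)  # frequency index (rows)
    i = np.arange(n).reshape(1, -1)  # sample index (columns)
    mat = np.sqrt(2.0 / n) * np.cos(np.pi * (2 * i + 1) * k / (2 * n))
    mat[0, :] = np.sqrt(1.0 / n)     # DC row has its own scaling
    return mat

def blockwise_dct(image: np.ndarray, block: int = 8) -> np.ndarray:
    """Apply a 2-D DCT to each non-overlapping block (block size is an assumption)."""
    h, w = image.shape
    assert h % block == 0 and w % block == 0, "image must tile evenly"
    d = dct_matrix(block)
    # Rearrange (h, w) into a grid of tiles with shape (h/b, w/b, b, b).
    tiles = image.reshape(h // block, block, w // block, block).transpose(0, 2, 1, 3)
    coeffs = d @ tiles @ d.T         # matmul broadcasts over the tile grid
    return coeffs.transpose(0, 2, 1, 3).reshape(h, w)

def blockwise_idct(coeffs: np.ndarray, block: int = 8) -> np.ndarray:
    """Invert blockwise_dct using the transpose of the orthonormal basis."""
    h, w = coeffs.shape
    d = dct_matrix(block)
    tiles = coeffs.reshape(h // block, block, w // block, block).transpose(0, 2, 1, 3)
    pixels = d.T @ tiles @ d
    return pixels.transpose(0, 2, 1, 3).reshape(h, w)
```

In a pipeline of the kind described, the output of `blockwise_dct` would be fed to the autoencoder in place of raw pixels, and `blockwise_idct` would be applied to the decoder output.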
Recently, the autoencoder framework has shown great potential in reducing the feedback overhead of the downlink channel state information (CSI). In this work, we further find that the user equipment in practical systems occasionally moves in a relatively stable area for a long time, and the corresponding communication environment is relatively stable. A user-centric online training strategy is proposed to further improve CSI feedback performance using the above characteristics. The key idea of the proposed method is to train a new encoder for a specific area without changes to the decoder at the base station. Given that the CSI training samples are insufficient, two data augmentation strategies, including random erasing and random phase shift, are introduced to improve the neural network generalization. In addition, the proposed user-centric online training framework is extended to the multi-user scenario for considerable performance improvement via gossip learning, which is a fully decentralized distributed learning framework and can use crowd intelligence. The simulation results show that the proposed user-centric online gossip training offers a more substantial increase in the feedback accuracy and can considerably improve autoencoder generalization.
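The two augmentation strategies named above can be sketched as follows. The abstract gives no details, so this assumes a CSI sample stored as a 2-D complex NumPy array, a single global phase rotation, and a rectangular erased region; the function names and the `max_frac` parameter are hypothetical.

```python
import numpy as np

def random_phase_shift(csi: np.ndarray, rng: np.random.Generator) -> np.ndarray:
    """Rotate the whole CSI matrix by one random phase (magnitudes unchanged)."""
    theta = rng.uniform(0.0, 2.0 * np.pi)
    return csi * np.exp(1j * theta)

def random_erasing(csi: np.ndarray, rng: np.random.Generator,
                   max_frac: float = 0.25) -> np.ndarray:
    """Zero out one randomly placed rectangle (max_frac is an assumed limit)."""
    h, w = csi.shape
    # Rectangle height and width, each at least 1 entry.
    eh = int(rng.integers(1, max(2, int(h * max_frac) + 1)))
    ew = int(rng.integers(1, max(2, int(w * max_frac) + 1)))
    top = int(rng.integers(0, h - eh + 1))
    left = int(rng.integers(0, w - ew + 1))
    out = csi.copy()                 # leave the original sample untouched
    out[top:top + eh, left:left + ew] = 0
    return out
```

Each call produces a new training sample from an existing one, which is one way to enlarge a small area-specific CSI dataset before retraining the encoder.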
The development of an optimized deep learning intruder detection model that can be executed on IoT devices with limited hardware support has several advantages, such as reducing communication energy, lowering latency, and protecting data privacy. Motivated by these benefits, this research aims to design a lightweight autoencoder deep model that has a shallow architecture with a small number of input features and a few hidden neurons. To achieve this objective, an efficient two-layer optimizer is used to evolve a lightweight deep autoencoder model by simultaneously selecting the input features, the training instances, and the number of hidden neurons. The construction of the optimized deep model is guided by both the accuracy of a K-nearest neighbor (KNN) classifier and the complexity of the autoencoder model. To evaluate the performance of the proposed optimized model, it has been applied to the N-BaIoT intrusion detection dataset. Reported results show that the proposed model achieved an anomaly detection accuracy of 99% with a lightweight autoencoder using, on average, about 30 input features and only 2 hidden neurons. In addition, the proposed two-layer optimizer outperformed several optimizers, such as the Arithmetic Optimization Algorithm (AOA), Particle Swarm Optimization (PSO), and Reinforcement Learning-based Memetic Particle Swarm Optimization (RLMPSO).
Emotion recognition from speech has its fair share of applications, and consequently extensive research has been done in this field over the past few years. However, many of the existing solutions are not yet ready for real-time applications. In this work, we propose a compact representation of audio using conventional autoencoders for dimensionality reduction, and test the approach on two publicly available benchmark datasets. Such compact and simple classification systems, where the computing cost is low and memory is managed efficiently, may be more useful for real-time applications. The system is evaluated on the Ryerson Audio-Visual Database of Emotional Speech and Song (RAVDESS) and the Toronto Emotional Speech Set (TESS). Three classifiers, namely support vector machines (SVM), a decision tree classifier, and convolutional neural networks (CNN), have been implemented to judge the impact of the approach. The results obtained by attempting classification with AlexNet and ResNet50 are also reported. Observations showed that introducing autoencoders can indeed improve the classification accuracy of the emotion in the input audio files. It can be concluded that, in emotion recognition from speech, the choice and application of dimensionality reduction of audio features affects the results achieved; therefore, by working on this aspect of the general speech emotion recognition model, it may be possible to make great improvements in the future.
Hyperspectral X-ray analysis is used in many industrial pipelines, from quality control to detection of low-density contaminants in food. Unfortunately, the signal acquired by X-ray sensors is often affected by a great amount of noise. This hinders the performance of most of the applications built on top of these acquisitions (e.g., detection of food contaminants). Therefore, a good denoising pipeline is necessary. This article compares three autoencoder variants: the Variational autoencoder, the Augmented autoencoder, and a plain vanilla autoencoder. All the networks are trained in an unsupervised fashion to denoise a given noisy spectrum. Focusing on the specific application of recognizing possible food contaminants, we force the latent space of the networks to have just two parameters, as suggested by the Lambert-Beer law. We validate our experiments on a synthetic dataset composed of roughly 15 million spectra. Results suggest that the Augmented autoencoder is the best network configuration for this task, showing excellent performance without suffering from the nondeterministic behavior of the Variational autoencoder.
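For reference, the Lambert-Beer law in its common single-material form is given below. The symbols are the usual textbook ones and are not taken from the article; the abstract does not say which two quantities the latent space encodes, so no mapping is assumed here.

```latex
% I_0(E): incident intensity at photon energy E
% \mu(E): linear attenuation coefficient of the material
% d:      thickness of material traversed
I(E) = I_0(E)\, e^{-\mu(E)\, d}
```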
Recent studies verified that a genetic algorithm can discover efficient and innovative wind turbines by using image encoding and decoding techniques. To accelerate the optimization, this work proposes the ResidualRecursion Autoencoder (RRAE) to extract low-dimensional latent codes from rotors' cross-section images while keeping reconstruction accuracy as high as possible. As a kind of neural network framework, RRAE has three advantages: 1) RRAE can wrap different kinds of autoencoders and improve their performance; 2) RRAE is compatible with different kinds of loss functions and works well with very low-dimensional latent codes; 3) RRAE is easy to use and efficient in decoding latent codes, which is important to the rapid convergence of the genetic algorithm. The experimental results show that the reconstruction loss decreased by 30.56% on a recursive autoencoder and by 11.40% to 29.34% on different feedforward autoencoders. Two RRAE-accelerated optimizations have been carried out in this work. One used only 14% of the calculation required by the baseline method without any deterioration in rotor performance. The other used 52.33% of the calculation and increased the rotor performance by 7.59%.
Fully connected deep neural networks (DNN) often include redundant weights leading to overfitting and high memory requirements. Additionally, in tabular data classification, DNNs are challenged by the often superior performance of traditional machine learning models. This paper proposes periodic perturbations (prune and regrow) of DNN weights, especially at the self-supervised pretraining stage of deep autoencoders. The proposed weight perturbation strategy outperforms dropout learning or weight regularization (L1 or L2) for four out of six tabular data sets in downstream classification tasks. Unlike dropout learning, the proposed weight perturbation routine additionally achieves 15% to 40% sparsity across six tabular data sets, resulting in compressed pretrained models. The proposed pretrained model compression improves the accuracy of downstream classification, unlike traditional weight pruning methods that trade off performance for model compression. Our experiments reveal that a pretrained deep autoencoder with weight perturbation can outperform traditional machine learning in tabular data classification, whereas baseline fully-connected DNNs yield the worst classification accuracy. However, traditional machine learning models are superior to any deep model when a tabular data set contains uncorrelated variables. Therefore, the performance of deep models with tabular data is contingent on the types and statistics of constituent variables. © 2022 Elsevier Ltd. All rights reserved.
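One prune-and-regrow step of the kind described above might look like the sketch below. The abstract does not state the pruning or regrowth criteria, so magnitude-based pruning, uniformly random regrowth, and zero initialization of regrown weights are assumptions, as are the function name and the `prune_frac` and `regrow_frac` parameters.

```python
import numpy as np

def prune_and_regrow(weights: np.ndarray, mask: np.ndarray,
                     prune_frac: float, regrow_frac: float,
                     rng: np.random.Generator):
    """One perturbation step on a weight matrix and its boolean keep-mask.

    Prunes the smallest-magnitude active weights, then reactivates a random
    subset of connections that were already inactive before this step.
    """
    flat_w = weights.ravel()
    flat_mask = mask.ravel().copy()
    active = np.flatnonzero(flat_mask)
    inactive_before = np.flatnonzero(~flat_mask)

    # Prune: deactivate the n_prune active weights with the smallest magnitude.
    n_prune = int(prune_frac * active.size)
    smallest = active[np.argsort(np.abs(flat_w[active]))[:n_prune]]
    flat_mask[smallest] = False

    # Regrow: reactivate a fraction of that count among older inactive slots.
    n_regrow = min(int(regrow_frac * n_prune), inactive_before.size)
    regrown = np.empty(0, dtype=np.intp)
    if n_regrow > 0:
        regrown = rng.choice(inactive_before, size=n_regrow, replace=False)
        flat_mask[regrown] = True

    new_w = flat_w * flat_mask       # pruned weights become exactly zero
    new_w[regrown] = 0.0             # regrown weights restart from zero (assumption)
    return new_w.reshape(weights.shape), flat_mask.reshape(mask.shape)
```

Applied periodically during pretraining, with the mask re-applied after each gradient update, a step like this keeps the layer sparse whenever `regrow_frac` is below 1.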
The self-regulated recognition of human activities from time-series smartphone sensor data is a growing research area in smart and intelligent health care. Deep learning (DL) approaches have exhibited improvements over traditional machine learning (ML) models in various domains, including human activity recognition (HAR). Several issues are involved with traditional ML approaches; these include handcrafted feature extraction, which is a tedious and complex task requiring expert domain knowledge, and the use of a separate dimensionality reduction module to overcome overfitting and hence provide model generalization. In this article, we propose a DL-based approach for activity recognition with smartphone sensor data, i.e., accelerometer and gyroscope data. Convolutional neural networks (CNNs), autoencoders (AEs), and long short-term memory (LSTM) networks possess complementary modeling capabilities: CNNs are good at automatic feature extraction, AEs are used for dimensionality reduction, and LSTMs are adept at temporal modeling. In this study, we take advantage of the complementarity of CNNs, AEs, and LSTMs by combining them into a unified architecture. We explore the proposed architecture, namely "ConvAE-LSTM", on four different standard public datasets (WISDM, UCI, PAMAP2, and OPPORTUNITY). The experimental results indicate that our approach is practical and improves on existing state-of-the-art smartphone-based HAR methods in terms of computational time, accuracy, F1-score, precision, and recall.
ISBN: (print) 9789819970186; 9789819970193
This paper proposes a new image caption generative model for memes called GUMI-AE. In this paper, a meme denotes a humorous short sentence suitable for a given image. An image caption generative model usually consists of an image encoder and a sentence decoder. Furthermore, most conventional models use a pre-trained neural network model for the image encoder, e.g., ResNet152 trained using ImageNet. However, pre-trained ResNet152 may not be effective as an encoder for extracting features from arbitrary images. The training samples for the meme generative model can be obtained from "Bokete" (in Japanese), a website where people post images and humorous short sentences associated with those images. Images posted on Bokete include a wide variety of images, such as illustrations and text-only images, which may fall outside the training images of ImageNet. This paper therefore proposes an image caption generative model that incorporates an autoencoder (AE) as the image encoder. The AE can be trained with the training samples obtained from Bokete without image annotation. This enables the proposed method to generate humorous short sentences for memes. Finally, the proposed model is compared with the conventional one, and the evaluation of the proposed GUMI-AE is discussed.