To deal with very large datasets, a mini-batch version of the Monte Carlo Markov Chain Stochastic Approximation Expectation-Maximization (MCMC-SAEM) algorithm for general latent variable models is proposed. For exponential models, the algorithm is shown to converge under classical conditions as the number of iterations increases. Numerical experiments illustrate the performance of the mini-batch algorithm in various models. In particular, we highlight that mini-batch sampling yields a substantial speed-up in the convergence of the sequence of estimators generated by the algorithm. Moreover, insights into the effect of the mini-batch size on the limiting distribution are presented. Finally, we illustrate how to use mini-batch sampling in practice to improve results when a constraint on computing time is given.
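As an illustrative sketch only (not the paper's algorithm), the core mechanics of a mini-batch stochastic-approximation EM can be shown on a toy exponential-family model: each iteration runs the E-step on a mini-batch only, updates running sufficient statistics with a decaying step size satisfying the classical conditions, and recovers the parameters in closed form. The two-component Gaussian setting, step-size schedule, and all variable names below are hypothetical choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: two 1-D Gaussian clusters (known unit variance), unknown means.
X = np.concatenate([rng.normal(-2.0, 1.0, 5000), rng.normal(3.0, 1.0, 5000)])
rng.shuffle(X)

K, n = 2, X.shape[0]
mu = np.array([-1.0, 1.0])          # initial component means
pi = np.full(K, 1.0 / K)            # mixture weights
# Running averages of the sufficient statistics (per component):
s0 = np.full(K, 1.0 / K)            # E[responsibility]
s1 = mu * s0                        # E[responsibility * x]

batch_size = 100
for k in range(1, 501):
    xb = X[rng.integers(0, n, batch_size)]      # mini-batch sampling
    # E-step on the mini-batch: responsibilities under current parameters.
    logp = -0.5 * (xb[:, None] - mu[None, :]) ** 2 + np.log(pi)[None, :]
    r = np.exp(logp - logp.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    # Stochastic-approximation update of the sufficient statistics.
    gamma = 1.0 / k ** 0.7                      # decaying step size
    s0 = (1 - gamma) * s0 + gamma * r.mean(axis=0)
    s1 = (1 - gamma) * s1 + gamma * (r * xb[:, None]).mean(axis=0)
    # M-step: parameters in closed form from the averaged statistics.
    pi = s0 / s0.sum()
    mu = s1 / s0

print(np.sort(mu))   # approaches the true means (-2, 3)
```

Because only a mini-batch is touched per iteration, each step is cheap, while the averaged statistics still drive the estimators toward the full-data solution.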
Machine learning and deep learning advancements have boosted Brain-Computer Interface (BCI) performance, but their wide-scale applicability is limited by factors such as individual health, hardware variations, and cultural differences affecting neural data. Studies often focus on single-site experiments in uniform settings, producing high performance that may not translate well to real-world diversity. Deep learning models aim to enhance BCI classification accuracy, and transfer learning has been suggested to adapt models to individual neural patterns using a base model trained on others' data. This approach promises better generalizability and reduced overfitting, yet challenges remain in handling diverse and imbalanced datasets from different equipment, subjects, multiple centres in different countries, and both healthy and patient populations for effective model transfer and tuning. In a setting characterized by maximal heterogeneity, we propose P300 wave detection in BCIs using a convolutional neural network fitted with adaptive transfer learning based on Poisson Disk Sampling (PDS), called Active Sampling (AS), which flexibly adjusts the transition from the source data to the target domain. For subject-adaptive transfer with 40% adaptive fine-tuning, the averaged classification accuracy improved by 5.36% and the standard deviation was reduced by 12.22% on two distinct, internationally replicated datasets. These results outperform the alternatives in classification accuracy, computational time, and training efficiency, mainly due to the proposed Active Sampling (AS) method for transfer learning.
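The Poisson-disk idea behind sample selection can be sketched in a few lines. The following is a hedged illustration, not the paper's Active Sampling method: it greedily keeps a candidate only if it lies at least a minimum radius from every sample already kept, which yields a spatially diverse subset of target-domain examples for fine-tuning. The function name, radius, and the random feature matrix standing in for EEG epoch embeddings are all hypothetical.

```python
import numpy as np

def poisson_disk_select(features, radius, rng):
    """Greedy Poisson-disk-style selection: keep a candidate only if it is
    at least `radius` away (Euclidean) from every sample kept so far."""
    order = rng.permutation(len(features))
    kept = []
    for i in order:
        if all(np.linalg.norm(features[i] - features[j]) >= radius
               for j in kept):
            kept.append(i)
    return np.array(kept)

rng = np.random.default_rng(1)
feats = rng.normal(size=(300, 8))   # stand-in for EEG epoch embeddings
chosen = poisson_disk_select(feats, radius=3.0, rng=rng)
print(len(chosen), "spatially diverse samples selected for fine-tuning")
```

Selecting well-spread samples, rather than random ones, is one way to avoid redundant fine-tuning data when only a fraction (e.g., 40%) of the target domain is used.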
The application of intelligent reflecting surfaces (IRSs) depends on knowledge of channel state information (CSI) and has been hindered by the heavy overhead of channel training, estimation, and feedback in fast-changing channels. This paper presents a new two-timescale beamforming approach to maximize the average achievable rate (AAR) of IRS-assisted MIMO systems, where the IRS is configured relatively infrequently based on statistical CSI (S-CSI) while the base station precoder and power allocation are updated frequently based on quickly outdated instantaneous CSI (I-CSI). The key idea is that we first reveal that the optimal small-timescale power allocation based on outdated I-CSI yields a water-filling structure. Given the optimal power allocation, a new mini-batch sampling (mbs)-based particle swarm optimization (PSO) algorithm is developed to optimize the large-timescale IRS configuration with fewer channel samples. In addition, we develop a model-driven PSO algorithm that optimizes the IRS configuration by maximizing a lower bound of the AAR using only the S-CSI, eliminating the need for channel samples. The model-driven PSO serves as a dependable lower bound for the mbs-PSO. Simulations corroborate the superiority of the new two-timescale beamforming strategy over its alternatives in terms of AAR and efficiency, and demonstrate the benefits of the IRS.
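The water-filling structure referred to above admits a compact numerical sketch. The toy example below (hypothetical channel gains and power budget, not the paper's system model) allocates a total power budget across effective eigen-channel gains by bisection on the water level, pouring power into strong channels first and skipping channels whose inverse gain exceeds the water level.

```python
import numpy as np

def water_filling(gains, p_total):
    """Classic water-filling: maximize sum(log2(1 + p_i * g_i)) subject to
    sum(p_i) = p_total, p_i >= 0, via bisection on the water level mu."""
    lo, hi = 0.0, p_total + 1.0 / gains.min()
    for _ in range(100):
        mu = 0.5 * (lo + hi)
        # Power on channel i is max(mu - 1/g_i, 0): fill up to the water level.
        if np.maximum(mu - 1.0 / gains, 0.0).sum() > p_total:
            hi = mu
        else:
            lo = mu
    return np.maximum(0.5 * (lo + hi) - 1.0 / gains, 0.0)

gains = np.array([2.0, 1.0, 0.25, 0.05])   # effective eigen-channel gains
p = water_filling(gains, p_total=4.0)
print(np.round(p, 3), "sum =", round(p.sum(), 3))
```

With these gains and a budget of 4, the two strongest channels receive 2.25 and 1.75 while the weakest two (inverse gains 4 and 20, above the water level 2.75) get zero.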
ISBN: (print) 9798400701030
In-batch contrastive learning is a state-of-the-art self-supervised method that brings semantically similar instances close while pushing dissimilar instances apart within a mini-batch. Its key to success is the negative-sharing strategy, in which every instance serves as a negative for the others within the mini-batch. Recent studies aim to improve performance by sampling hard negatives within the current mini-batch, whose quality is bounded by the mini-batch itself. In this work, we propose to improve contrastive learning by sampling mini-batches from the input data. We present BatchSampler to sample mini-batches of hard-to-distinguish (i.e., hard and true negatives to each other) instances. To give each mini-batch fewer false negatives, we design a proximity graph of randomly selected instances. To form the mini-batch, we leverage random walk with restart on the proximity graph to sample hard-to-distinguish instances. BatchSampler is a simple and general technique that can be directly plugged into existing contrastive learning models in vision, language, and graphs. Extensive experiments on datasets of three modalities show that BatchSampler can consistently improve the performance of powerful contrastive models, as shown by significant improvements of SimCLR on ImageNet-100, SimCSE on STS (language), and GraphCL and MVGRL on graph datasets.
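The proximity-graph-plus-random-walk idea can be sketched as follows. This is a minimal illustration under assumed choices (kNN proximity graph, Euclidean distances, uniform transitions, restart probability 0.3), not BatchSampler's actual implementation: a random walk with restart from a seed node collects nearby, hence hard-to-distinguish, instances into one mini-batch, while the restart keeps the walk local to the seed.

```python
import numpy as np

def rwr_batch(features, start, batch_size, k=10, restart=0.3, rng=None):
    """Sample a mini-batch of mutually close instances: build a kNN proximity
    graph over embeddings, then run a random walk with restart from `start`
    and collect the first `batch_size` distinct nodes visited."""
    rng = rng or np.random.default_rng()
    d = np.linalg.norm(features[:, None] - features[None, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    nbrs = np.argsort(d, axis=1)[:, :k]        # k nearest neighbours per node
    batch, node, steps = {start}, start, 0
    while len(batch) < batch_size and steps < 100 * batch_size:
        steps += 1
        # Restart keeps the walk near the seed; otherwise hop to a neighbour.
        node = start if rng.random() < restart else rng.choice(nbrs[node])
        batch.add(int(node))
    return np.array(sorted(batch))

rng = np.random.default_rng(2)
emb = rng.normal(size=(200, 16))               # stand-in for instance embeddings
batch = rwr_batch(emb, start=0, batch_size=32, rng=rng)
print(batch.shape)
```

Such a batch is plugged into the usual in-batch negative sharing: because its members are close in embedding space, the shared negatives are harder than those of a uniformly random batch.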
Authors: Lee, Si Woon; Kim, Ha Young
Affiliations: Ajou Univ, Dept Artificial Intelligence & Data Sci, Worldcupro 206, Suwon 16499, South Korea; Yonsei Univ, Grad Sch Informat, Yonsei Ro 50, Seoul 03722, South Korea
Forecasting stock market indexes is an important issue for market participants, because even a small improvement in forecast accuracy may lead to better trading decisions than those of other participants. Rising interest in deep learning has led to its application in stock market forecasting. However, it is still challenging to use market-size time-series data to predict composite index prices. In this study, we propose a new stock market forecasting framework, NuNet, which can successfully learn high-level features from super-high dimensional time-series data. NuNet is an end-to-end integrated neural network framework consisting of two feature extractor modules: a super-high dimensional market information feature extractor and a target index feature extractor. In addition, we propose a mini-batch sampling technique, trend sampling, which probabilistically samples more recent data during training. Furthermore, we propose a novel regularization method, called column-wise random shuffling, which is a data augmentation technique that can be applied to convolutional neural networks. The experiments are comprehensively carried out in three aspects for three indexes, namely S&P500, KOSPI200, and FTSE100. The results demonstrate that the proposed model outperforms all baseline models. Specifically, for the S&P500, KOSPI200, and FTSE100, the overall mean squared error of our proposed model NuNet(DA, T) is 60.79%, 51.29%, and 43.36% lower than that of the baseline model SingleNet(R), respectively. Moreover, we employ trading simulations with realistic transaction costs. Our proposed model also outperforms the buy-and-hold strategy, being on average 2.57 times more profitable across the three indexes. (c) 2020 Elsevier Ltd. All rights reserved.
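Recency-biased mini-batch sampling of the kind trend sampling describes can be sketched briefly. This is a hedged illustration under an assumed geometric decay, not the paper's exact sampling distribution: each row's sampling weight decays with its age, so batches skew toward recent observations while older data still appears occasionally.

```python
import numpy as np

def trend_sample(n, batch_size, decay=0.999, rng=None):
    """Draw mini-batch indices with probability decaying geometrically in the
    age of the observation; higher index = more recent row."""
    rng = rng or np.random.default_rng()
    age = np.arange(n)[::-1]                 # 0 for the newest row
    w = decay ** age                         # assumed geometric decay in age
    return rng.choice(n, size=batch_size, replace=False, p=w / w.sum())

rng = np.random.default_rng(3)
idx = trend_sample(5000, batch_size=256, rng=rng)
print(idx.mean())   # well above 2500: batches skew toward recent rows
```

The decay constant trades off emphasis on the latest market regime against retention of older patterns; with `decay=0.999` the effective window here is roughly the most recent thousand rows.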
ISBN: (print) 9783030260606; 9783030260613
In Automatic Speech Recognition (ASR), the acoustic model (AM) is modeled by a Deep Neural Network (DNN). The DNN learns a posterior probability in a supervised fashion utilizing input features and ground-truth labels. Current approaches combine a DNN with a Hidden Markov Model (HMM) in a hybrid approach, which has achieved good results in recent years. Similar approaches using a discrete version, i.e., a Discrete Hidden Markov Model (DHMM), have been disregarded in the recent past. Our approach revisits the idea of a discrete system, more precisely the so-called Deep Neural Network Quantizer (DNNQ), demonstrating how a DNNQ is created and trained. We introduce a novel approach to train a DNNQ in a supervised fashion with an arbitrary output layer size even though suitable target values are not available. The proposed method provides a mapping function exploiting fixed ground-truth labels. Consequently, we are able to apply frame-based cross-entropy (CE) training. Our experiments demonstrate that the DNNQ reduces the Word Error Rate (WER) by 17.6% on monophones and by 2.2% on triphones, compared to a continuous HMM-Gaussian Mixture Model (GMM) system.
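One plausible reading of such a mapping function can be sketched on toy data. The example below is speculative and not the paper's method: it maps each fixed ground-truth label to the quantizer output unit that label most frequently maximizes, which turns the label sequence into frame-level targets in the (larger) quantizer output space and so enables frame-based cross-entropy training. All sizes, names, and the synthetic label/unit correlation are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy stand-ins: 1000 frames, 3 ground-truth phone labels, a quantizer with
# 8 output units (arbitrary size, larger than the number of labels).
n_frames, n_labels, n_units = 1000, 3, 8
labels = rng.integers(0, n_labels, n_frames)      # fixed ground-truth labels
logits = rng.normal(size=(n_frames, n_units))     # untrained quantizer outputs
logits[np.arange(n_frames), labels] += 2.0        # synthetic correlation (toy)

# Mapping function (sketch): each ground-truth label is mapped to the output
# unit it most frequently maximizes, giving CE targets in the quantizer space.
winners = logits.argmax(axis=1)
mapping = np.array([np.bincount(winners[labels == c], minlength=n_units).argmax()
                    for c in range(n_labels)])
targets = mapping[labels]                          # frame-level CE targets

# Frame-based cross entropy against the mapped targets.
p = np.exp(logits - logits.max(axis=1, keepdims=True))
p /= p.sum(axis=1, keepdims=True)
ce = -np.log(p[np.arange(n_frames), targets]).mean()
print(mapping, round(float(ce), 3))
```

The point of the sketch is only the shape of the problem: supervised CE training becomes possible once the fixed labels are mapped into the quantizer's arbitrary-size output space.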