Recently, the abstractive dialogue summarization task has been gaining a lot of attention from researchers. Unlike news articles and documents with well-structured text, dialogue differs in that it often comes from two or more interlocutors exchanging information with each other, and it has an inherent hierarchical structure based on the sequence of utterances by different speakers. This paper proposes a simple but effective hybrid approach that consists of two modules and uses transfer learning by leveraging pretrained language models (PLMs) to generate an abstractive summary. The first module highlights important utterances, capturing the utterance-level relationship by adapting an auto-encoding model such as BERT in an unsupervised or supervised manner. The second module then generates a concise abstractive summary by adapting encoder-decoder models such as T5, BART, and PEGASUS. Experimental results on benchmark datasets show that our approach achieves state-of-the-art performance by adapting to dialogue scenarios and can also be helpful in low-resource settings for domain adaptation.
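As a rough sketch of the first module's unsupervised variant, the snippet below scores utterances by cosine similarity to the dialogue centroid, with simple bag-of-words counts standing in for the BERT embeddings the paper actually adapts; `highlight` and the toy dialogue are illustrative names, not the authors' code.

```python
from collections import Counter
import math

def bow(text):
    # Bag-of-words counts; a stand-in for a BERT sentence embedding.
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[w] * b[w] for w in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

def highlight(utterances, k=2):
    # Score each utterance against the centroid of the whole dialogue
    # and keep the top-k, preserving the original dialogue order.
    centroid = Counter()
    for u in utterances:
        centroid.update(bow(u))
    ranked = sorted(utterances, key=lambda u: cosine(bow(u), centroid), reverse=True)
    keep = set(ranked[:k])
    return [u for u in utterances if u in keep]
```

The highlighted utterances would then be concatenated and fed to the encoder-decoder PLM for abstractive rewriting.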
Developing accurate models for batteries, capturing ageing effects and nonlinear behaviors, is critical for efficient and effective performance. Due to the inherent difficulties in developing physics-based models, data-driven techniques have been gaining popularity. However, most machine learning methods are black boxes, lacking interpretability and requiring large amounts of labeled data. In this paper, we propose a physics-informed encoder-decoder model that learns from unlabeled data to separate slow-changing battery states, such as state of charge (SOC) and state of health (SOH), from fast transient responses, thereby increasing interpretability compared to conventional methods. By integrating physics-informed loss functions and modified architectures, we map the encoder output to quantifiable battery states without needing explicit SOC and SOH labels. Our proposed approach is validated on a lithium-ion battery ageing dataset capturing dynamic discharge profiles that aim to mimic electric vehicle driving profiles. The model is trained and validated on sparse intermittent cycles (6%-7% of all cycles), accurately estimating SOC and SOH while providing accurate multi-step-ahead voltage predictions across single-cell and multi-cell training scenarios.
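A minimal sketch of what a physics-informed loss for this setting could look like, under two assumptions not detailed in the abstract: SOC is anchored to a coulomb-counting estimate, and SOH is penalized for increasing across cycles (ageing is monotone). All function names are hypothetical.

```python
import numpy as np

def coulomb_counting_soc(current_a, dt_s, capacity_ah, soc0=1.0):
    # Physics anchor: SOC from integrated current (discharge positive).
    return soc0 - np.cumsum(current_a * dt_s) / (3600.0 * capacity_ah)

def physics_informed_loss(soc_pred, soc_physics, soh_pred, recon_err,
                          w_soc=1.0, w_soh=1.0):
    # Tie the encoder's SOC output to the coulomb-counting estimate...
    soc_term = np.mean((soc_pred - soc_physics) ** 2)
    # ...and penalize any *increase* in predicted SOH over cycles.
    soh_term = np.mean(np.maximum(np.diff(soh_pred), 0.0) ** 2)
    return recon_err + w_soc * soc_term + w_soh * soh_term
```

Minimizing such a loss on unlabeled voltage/current data is one way the latent states can be made quantifiable without explicit SOC/SOH labels.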
Person re-identification (re-id) is the task of identifying a person across non-overlapping cameras. Most current techniques apply deep learning and achieve significant accuracy. However, learning a deep model that generalizes well against the challenges of pose variation, occlusion, illumination changes, and low resolution is a difficult task. Toward this, we propose a deep reconstruction re-id network, comprising an encoder and a multi-resolution decoder, which can learn embeddings invariant to pose, occlusion, illumination, and low resolution. In our model, the encoder acts as a conventional deep re-id network and outputs a discriminative feature embedding. The output feature is then used as an input to the multi-resolution decoder to reconstruct input images of the same identity under different resolutions, such that they are similar in pose and illumination as well as free from occlusion. We further propose a hybrid sampling strategy to boost the effectiveness of the training loss function. In addition, we propose test-set augmentation using the reconstructed images to explicitly transform the single-query setting into a multi-query setting. In our multi-tasking approach, feature robustness is enhanced by the multi-resolution decoder, and the overall accuracy is further improved by the sampling strategy and test data augmentation. Furthermore, we empirically show that the proposed network is robust to pose variations, occlusion, and low resolution. We perform rigorous qualitative and quantitative analysis to demonstrate that we achieve state-of-the-art person re-id accuracy.
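One plausible reading of the multi-resolution decoder's training signal is a reconstruction loss summed over several target resolutions; the sketch below illustrates that idea with average-pooling downsampling on single-channel images. The function names and the pooling choice are assumptions, not the paper's implementation.

```python
import numpy as np

def downsample(img, factor):
    # Average-pool a (H, W) image by an integer factor.
    h, w = img.shape
    return img.reshape(h // factor, factor, w // factor, factor).mean(axis=(1, 3))

def multi_resolution_loss(recons, target, factors=(1, 2, 4)):
    # Each decoder head reconstructs the same identity at a coarser
    # resolution; sum the per-resolution MSE terms.
    return sum(np.mean((r - downsample(target, f)) ** 2)
               for r, f in zip(recons, factors))
```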
Modeling and predicting stock prices is an important and challenging task in the field of financial markets. Due to the high volatility of stock prices, traditional data mining methods cannot identify the most relevant and critical market data for predicting the stock price trend. This paper proposes a stock price trend predictive model (TPM) based on an encoder-decoder framework that predicts the stock price movement and its duration adaptively. The model consists of two phases. First, a dual feature extraction method based on different time spans is proposed to obtain more information from the market data: while traditional methods only extract features at specific time points, the proposed model applies the PLR method and a CNN to extract long-term temporal features and short-term spatial features from market data. Then, in the second phase of the proposed TPM, a dual-attention-based encoder-decoder framework is used to select and merge the relevant dual features and predict the stock price trend. To evaluate the proposed TPM, we collected high-frequency market data for the stock indexes CSI 300, SSE 50, and CSI 500, and conducted experiments on these three datasets. The experimental results show that the proposed TPM outperforms existing state-of-the-art methods, including SVR, LSTM, CNN, LSTM_CNN, and TPM_NC, in terms of prediction accuracy.
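The dual attention mechanism presumably weighs candidate features against the decoder state before merging them; a bare-bones dot-product attention, with hypothetical names, illustrates the select-and-merge step:

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())  # shift for numerical stability
    return e / e.sum()

def attend(query, features):
    # Score each candidate feature row against the decoder state (query),
    # then return the attention-weighted mixture and the weights.
    scores = features @ query
    w = softmax(scores)
    return w @ features, w
```

In the full model, one such mechanism would operate over the PLR-derived temporal features and another over the CNN-derived spatial features, with the two contexts merged before decoding.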
In this work, we first analyze the memory behavior of three recurrent neural network (RNN) cells, namely the simple RNN (SRN), the long short-term memory (LSTM), and the gated recurrent unit (GRU), where memory is defined as a function that maps previous elements of a sequence to the current output. Our study shows that all three suffer from rapid memory decay. Then, to alleviate this effect, we introduce trainable scaling factors that act like an attention mechanism to adjust memory decay adaptively. The new design is called the extended LSTM (ELSTM). Finally, to design a system that is robust to previous erroneous predictions, we propose a dependent bidirectional recurrent neural network (DBRNN). Extensive experiments are conducted on different language tasks to demonstrate the superiority of the proposed ELSTM and DBRNN solutions. The ELSTM achieves up to a 30% increase in the labeled attachment score (LAS) compared to LSTM and GRU on the dependency parsing (DP) task. Our models also outperform other state-of-the-art models such as bi-attention [1] and convolutional sequence to sequence (convseq2seq) [2] by close to 10% in LAS. (C) 2019 Elsevier B.V. All rights reserved.
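The rapid-decay observation and the scaling-factor remedy can be illustrated numerically: with constant forget gates below 1, an input's contribution to the current cell state shrinks geometrically with distance, and per-position scales can counteract that. This toy calculation is illustrative, not the ELSTM cell itself.

```python
import numpy as np

def memory_contribution(forget_gates):
    # Contribution of the input at step t to the final cell state is the
    # product of the forget gates from step t to the end; with gates < 1
    # this decays geometrically for distant inputs.
    return np.cumprod(forget_gates[::-1])[::-1]

def scaled_memory_contribution(forget_gates, scales):
    # ELSTM-style idea: trainable per-position scaling factors multiply
    # each contribution, letting the network counteract the decay.
    return scales * memory_contribution(forget_gates)
```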
ISBN: (Print) 9781450388658
Image noise is an inherent issue in low-dose CT (LDCT). Increasing the radiation dose can alleviate this problem to some extent, but it also brings potential risks to patients. Thus, LDCT denoising has raised increasing attention from researchers. Currently, many deep learning based LDCT denoising methods, such as encoder-decoder networks, have been proposed with success. In this paper, we propose a novel multi-scale hierarchical feature fusion based encoder-decoder network within the GAN framework for LDCT denoising. Specifically, a four-stage multi-scale dilated block is introduced to integrate low-level features with high-level features. Compared with the conventional skip connection, which ignores the semantic gap between low-level and high-level features, the advantage of our method is the effective use of low-level information. In addition, residual learning is adopted to boost the training of the network. Experimental results on a public dataset demonstrate the superiority of our method over the state-of-the-art methods under comparison in both visual quality and quantitative evaluation.
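To illustrate the multi-scale dilated idea, the sketch below runs parallel 1-D dilated convolutions at several rates, averages them, and adds a residual path. The real network operates on 2-D feature maps with learned kernels, so everything here (names, fusion by averaging, the 1-D setting) is a simplification.

```python
import numpy as np

def dilated_conv1d(x, kernel, dilation):
    # 'same'-padded 1-D convolution with gaps of (dilation - 1) samples
    # between taps; larger dilation = larger receptive field, same cost.
    k = len(kernel)
    span = (k - 1) * dilation
    xp = np.pad(x, (span // 2, span - span // 2))
    return np.array([sum(kernel[j] * xp[i + j * dilation] for j in range(k))
                     for i in range(len(x))])

def multi_scale_block(x, kernel, dilations=(1, 2, 4, 8)):
    # Fuse parallel dilated branches by averaging, then add a residual path.
    fused = np.mean([dilated_conv1d(x, kernel, d) for d in dilations], axis=0)
    return x + fused
```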
Analyzing the correlation between two funds can help investors control investment risks and optimize investment portfolios, which has strong practical significance for fund investment. Constructing an intelligent investment system with fund correlation analysis capabilities can help investors automatically make profits from financial markets. In previous research, many researchers have built intelligent investment systems using Bayesian networks, support vector machines (SVM), and LSTM models. However, the strong historical dependence within fund data and its high-dimensional, high-noise characteristics prevent traditional methods from obtaining excellent performance in fund analysis. This paper designs a deep learning-based intelligent fund trading system, DLIFT, which provides functions such as investment push, income prediction, and risk control. The system's data analysis module is implemented using an improved RNN model with an encoder-decoder architecture. The encoder is responsible for analyzing the funds' features, and the decoder is responsible for analyzing the dependency between historical and current correlations. LSTM and an attention mechanism are applied to both the encoder and decoder, which enables the discovery of implicit dependencies in time series data. We verify the designed system on a historical dataset containing multiple public funds. The results of comparative experiments show the superiority of our model, and the results of ablation experiments show that LSTM and the attention mechanism play a critical role in the proposed system.
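As an illustration of the quantity such a system tracks, a sliding-window Pearson correlation between two funds' return series can serve as the supervision target for the correlation model; the helper below is hypothetical, not DLIFT's code.

```python
import numpy as np

def rolling_correlation(a, b, window):
    # Pearson correlation between two funds' return series over a sliding
    # window; the sequence of these values is what a correlation model
    # would be trained to predict forward in time.
    out = []
    for i in range(len(a) - window + 1):
        out.append(np.corrcoef(a[i:i + window], b[i:i + window])[0, 1])
    return np.array(out)
```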
This paper addresses vessel segmentation and disease diagnosis in coronary angiography images and proposes an end-to-end encoder-decoder deep learning model, where the encoder is based on ResNet and deep features are extracted automatically, and the decoder produces the segmentation result via a balanced cross-entropy cost function. Furthermore, batch normalization is employed to mitigate gradient vanishing in the training process, so as to reduce the difficulty of training the deep neural network. The experimental results show that the algorithm effectively extracts feature and edge information, so the complex background disturbance is suppressed convincingly and the vessel segmentation precision is improved effectively; the segmentation precisions for three typical vessels are 0.8365, 0.8924, and 0.6297, respectively, and the F-measures are 0.8514, 0.8786, and 0.7298, respectively. In addition, the experimental results show that our proposed method can be generalized to angiography images within limits.
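A balanced cross-entropy cost of the kind the decoder is trained with can be sketched as follows, assuming the common formulation in which the positive (vessel) term is weighted by the background fraction so the sparse vessel class is not drowned out; the exact weighting in the paper may differ.

```python
import numpy as np

def balanced_cross_entropy(pred, target, eps=1e-7):
    # beta = fraction of background pixels; weight the vessel term by beta
    # and the background term by (1 - beta) to balance the two classes.
    beta = 1.0 - target.mean()
    pred = np.clip(pred, eps, 1.0 - eps)
    pos = -beta * target * np.log(pred)
    neg = -(1.0 - beta) * (1.0 - target) * np.log(1.0 - pred)
    return (pos + neg).mean()
```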
Phase-coding structured light is an important technique in 3D reconstruction. However, a great challenge is the wrapped phase, which causes geometry ambiguity. Conventional unwrapping methods such as spatial and temporal approaches face the problems of error propagation and low efficiency. In this paper, we propose to solve the phase unwrapping problem with a deep neural network. To be specific, the phase unwrapping problem is cast as a semantic segmentation task, where the wrapped phase is the input and the fringe index for every pixel is the output. An encoder-decoder architecture, which is like U-net, is adopted as the backbone. We further propose a combined loss function considering cross-entropy loss, phase consistency loss, and edge consistency loss. With 10000 artificially synthesized samples, the proposed method converges well. Experimental results demonstrate that the trained model predicts fringe orders well on both simulation data and real captured data. In addition, it unwraps every pixel independently and avoids phase error propagation, further achieving accurate 3D reconstruction.
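The per-pixel unwrapping step that follows the network's fringe-order prediction is simple arithmetic: phi_unwrapped = phi_wrapped + 2*pi*k, with k the predicted order, and no neighbor dependence means no error propagation. The sketch below shows it, with the network's prediction replaced by ground-truth orders for illustration.

```python
import numpy as np

def wrap(phase):
    # Wrap a phase map into [-pi, pi), as a phase-shifting pipeline yields.
    return (phase + np.pi) % (2.0 * np.pi) - np.pi

def unwrap_with_orders(wrapped, orders):
    # Per-pixel unwrapping: add 2*pi times the (network-predicted) fringe
    # order; each pixel is independent of its neighbors.
    return wrapped + 2.0 * np.pi * orders
```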
Despite the application of state-of-the-art fully Convolutional Neural Networks (CNNs) for semantic segmentation of very high-resolution optical imagery, their capacity has not yet been thoroughly examined for the classification of Synthetic Aperture Radar (SAR) images. The presence of speckle noise, the absence of efficient feature expression, and the limited availability of labelled SAR samples have hindered the application of state-of-the-art CNNs to the classification of SAR imagery. This is of great concern for mapping complex land cover ecosystems, such as wetlands, where backscattering/spectrally similar signatures of land cover units further complicate the matter. Accordingly, we propose a new Fully Convolutional Network (FCN) architecture that can be trained in an end-to-end scheme and is specifically designed for the classification of wetland complexes using polarimetric SAR (PolSAR) imagery. The proposed architecture follows an encoder-decoder paradigm, wherein the input data are fed into a stack of convolutional filters (encoder) to extract high-level abstract features and a stack of transposed convolutional filters (decoder) to gradually up-sample the low-resolution output to the spatial resolution of the original input image. The proposed network also benefits from recent advances in CNN design, namely the addition of inception modules and skip connections with residual units. The former component improves multi-scale inference and enriches contextual information, while the latter contributes to the recovery of more detailed information and simplifies optimization. Moreover, an in-depth investigation of the learned features via opening the black box demonstrates that the convolutional filters extract discriminative polarimetric features, thus mitigating the limitation of feature engineering in PolSAR image processing. Experimental results from full polarimetric RADARSAT-2 imagery illustrate that the proposed network outperforms the convent
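The inception-plus-residual idea can be caricatured in one dimension: parallel branches with different receptive fields are fused, and an identity skip adds the input back, which simplifies optimization. Moving averages stand in for learned convolutions here; all names are illustrative.

```python
import numpy as np

def moving_average(x, size):
    # 'same'-length moving average as a stand-in for a size-tap convolution.
    pad = size // 2
    xp = np.pad(x, (pad, size - 1 - pad), mode="edge")
    c = np.cumsum(np.concatenate([[0.0], xp]))
    return (c[size:] - c[:-size]) / size

def inception_fuse(x, sizes=(1, 3, 5)):
    # Inception-style parallel branches at several receptive fields, fused
    # by averaging; the residual skip adds the input back afterwards.
    branches = np.stack([moving_average(x, s) for s in sizes])
    return x + branches.mean(axis=0)
```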