Self-attention networks and the Transformer have dominated the machine translation and natural language processing fields, and have shown great potential in vision tasks such as image classification and object *** by the great progress of Transformer, we propose a novel general and robust voxel feature encoder for 3D object detection based on the traditional *** first investigate the permutation invariance of sequence data of the self-attention and apply it to point cloud *** we construct a voxel feature layer based on self-attention to adaptively learn a local and robust context for a voxel according to the spatial relationships and the context information exchanged between all points within the ***, we construct a general voxel feature learning framework with the voxel feature layer as the core for 3D object *** voxel feature with Transformer (VFT) can be plugged into any other voxel-based 3D object detection framework easily, and serves as the backbone for voxel feature *** results on the KITTI dataset demonstrate that our method achieves state-of-the-art performance on 3D object detection.
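The permutation property mentioned in this abstract can be illustrated with a small sketch. This is not the authors' VFT layer: it assumes a single attention head with identity projections (no learned query, key, or value weights) and assumes max pooling as the aggregation over points. The function names `self_attention` and `voxel_feature` are hypothetical.

```python
import math


def self_attention(points):
    """Single-head scaled dot-product self-attention with identity projections.

    Assumption: queries, keys and values are the raw point features themselves.
    Without positional encoding, reordering the input rows only reorders the
    output rows (permutation equivariance).
    """
    d = len(points[0])
    attended = []
    for q in points:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in points]
        top = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - top) for s in scores]
        z = sum(weights)
        attended.append(
            [sum(w * v[j] for w, v in zip(weights, points)) / z for j in range(d)]
        )
    return attended


def voxel_feature(points):
    """Pool attended point features into one vector per voxel.

    Assumption: element-wise max pooling, a symmetric function, so the pooled
    feature does not depend on the order of the points in the voxel.
    """
    attended = self_attention(points)
    return [max(column) for column in zip(*attended)]


if __name__ == "__main__":
    pts = [[0.1, 0.2, 0.3], [1.0, -0.5, 0.2], [0.4, 0.4, -0.1]]
    shuffled = [pts[2], pts[0], pts[1]]
    print(voxel_feature(pts))
    print(voxel_feature(shuffled))  # same values up to floating-point rounding
```

Because attention treats the points as an unordered set and the pooling is symmetric, the voxel feature is the same for any ordering of the points, which is the property a point cloud encoder needs.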
ISBN: 9781538646595 (print)
This paper proposes a forward attention method for the sequence-to-sequence acoustic modeling of speech synthesis. The method is motivated by the monotonic nature of the alignment from phone sequences to acoustic sequences. Only the alignment paths that satisfy the monotonic condition are taken into consideration at each decoder timestep. The modified attention probabilities at each timestep are computed recursively using a forward algorithm. A transition agent for forward attention is further proposed, which helps the attention mechanism decide whether to move forward or stay at each decoder timestep. Experimental results show that the proposed forward attention method achieves faster convergence and higher stability than the baseline attention method. In addition, forward attention with the transition agent can help improve the naturalness of synthetic speech and control its speed effectively.
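The recursive computation described in this abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation. It assumes that at each decoder step the alignment may only stay on the same phone or advance by one, and that the step's ordinary attention probabilities `y` are already available. The transition agent is omitted, and the name `forward_attention_step` is hypothetical.

```python
def forward_attention_step(prev_alpha, y):
    """One decoder step of a forward-algorithm style attention recursion.

    prev_alpha: forward attention weights from the previous decoder step.
    y:          ordinary attention probabilities for the current step.

    Assumption: only "stay at position n" and "move from n-1 to n" are valid
    transitions, which restricts the alignment to monotonic paths.
    """
    if len(prev_alpha) != len(y):
        raise ValueError("prev_alpha and y must have the same length")
    unnormalized = []
    for n in range(len(y)):
        stay = prev_alpha[n]
        move = prev_alpha[n - 1] if n > 0 else 0.0
        unnormalized.append((stay + move) * y[n])
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("no monotonic path has non-zero probability")
    return [u / total for u in unnormalized]


if __name__ == "__main__":
    alpha = [1.0, 0.0, 0.0]  # start aligned to the first phone
    for y in ([0.2, 0.3, 0.5], [1 / 3, 1 / 3, 1 / 3]):
        alpha = forward_attention_step(alpha, y)
        print(alpha)
```

In the first step the third position receives zero weight even though `y` favors it, because no monotonic path can reach it from the start in a single step.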
ISBN: 9781450385053 (print)
Recently, the abstractive dialogue summarization task has been gaining a lot of attention from researchers. Unlike news articles and documents with well-structured text, dialogue often comes from two or more interlocutors exchanging information with each other, and it has an inherent hierarchical structure based on the sequence of utterances by different speakers. This paper proposes a simple but effective hybrid approach that consists of two modules and uses transfer learning by leveraging pretrained language models (PLMs) to generate an abstractive summary. The first module highlights important utterances, capturing utterance-level relationships by adapting an auto-encoding model such as BERT with either an unsupervised or a supervised method. The second module then generates a concise abstractive summary by adapting encoder-decoder models such as T5, BART, and PEGASUS. Experimental results on benchmark datasets show that our approach achieves state-of-the-art performance by adapting to dialogue scenarios and can also be helpful in low-resource settings for domain adaptation.
ISBN: 9781450388658 (print)
Image noise is an inherent issue in low-dose CT (LDCT). Increasing the radiation dose can alleviate this problem to some extent, but it also brings potential risks to patients. Thus, LDCT denoising has attracted increasing attention from researchers. Many deep learning based LDCT denoising methods, such as encoder-decoder networks, have been proposed with success. In this paper, we propose a novel multi-scale hierarchical feature fusion based encoder-decoder network within the GAN framework for LDCT denoising. Specifically, a four-stage multi-scale dilated block is introduced to integrate low-level features with high-level features. Compared with the conventional skip connection, which ignores the semantic gap between low-level and high-level features, the advantage of our method is its effective use of low-level information. In addition, residual learning is adopted to boost the training of the network. Experimental results on a public dataset demonstrate the superiority of our method over the state-of-the-art methods under comparison in both visual quality and quantitative evaluation.
Analyzing the correlation between two funds can help investors control investment risks and optimize investment portfolios, which has strong guiding significance for real-world fund investment. Constructing an intelligent investment system with fund correlation analysis capabilities can help investors automatically make profits from financial markets. In previous research, many researchers have built intelligent investment systems using Bayesian networks, support vector machines (SVM), and LSTM models. However, the strong historical dependence in fund data and its high-dimensional, high-noise characteristics prevent traditional methods from performing well in fund analysis. This paper designs a deep learning-based intelligent fund trading system, DLIFT, which provides functions such as investment push, income prediction, and risk control. The system's data analysis module is implemented using the Improved RNN model, which employs an encoder-decoder architecture. The encoder is responsible for analyzing the fund's features, and the decoder is responsible for analyzing the dependency between the historical correlation and the current correlation. LSTM and an attention mechanism are applied to both the encoder and the decoder, which enables the discovery of implicit dependencies in time series data. The designed system is verified on a historical dataset containing multiple public funds. The results of the comparative experiments show the superiority of our model, and the results of the ablation experiments show that LSTM and the attention mechanism play a critical role in the proposed system.
ISBN: 9781450396899 (print)
Most image description generation methods in the attention-based encoder-decoder framework extract local features from images. Despite the relatively high semantic level of local features, two problems remain to be solved. One is object loss, where some important objects may be lost when generating image descriptions; the other is prediction error, where an object may be identified as the wrong class. In this paper, a G-AoANet model is proposed to solve these problems. The model uses an attention mechanism to combine global features with local features. In this way, the model can selectively focus on both object and contextual information, improving the quality of the generated descriptions. Experimental results show that the model improves the initially reported best CIDEr-D and SPICE scores on the MS COCO dataset by 9.3% and 5.1% respectively.
ISBN: 9781728123264 (print)
With the development of sensors, remote sensing has become an effective way to observe the Earth. Land cover classification is an important application of remote sensing data. Deep convolutional neural networks (DCNN) have a good capability to extract effective image features. In this paper, an automatic land cover classification method using spatial and channel attention was proposed. An encoder-decoder structure was adopted as the network architecture. We used "channel attention" to enhance the function of effective channels. By using "spatial attention" to combine high-level semantic features with low-level features, better land object boundaries were obtained. The proposed method was tested on the ISPRS Potsdam 2D Semantic Segmentation Challenge dataset. The results showed that our proposed method, which uses spatial and channel attention with atrous spatial pyramid attention networks (SA-ASPA), outperforms Deeplab_v3+ in terms of mean Intersection over Union (mIoU) by 5%.
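For readers unfamiliar with the reported metric, mean Intersection over Union can be sketched as below. This is a generic illustration on flattened label lists, not the evaluation code of the ISPRS benchmark or of the paper. Skipping classes that appear in neither the ground truth nor the prediction is an assumption, and the name `mean_iou` is hypothetical.

```python
def mean_iou(truth, pred, num_classes):
    """Mean Intersection over Union across classes for flattened label lists.

    Assumption: a class that appears in neither the ground truth nor the
    prediction is skipped rather than counted as IoU 0 or 1.
    """
    if len(truth) != len(pred):
        raise ValueError("truth and pred must have the same length")
    ious = []
    for c in range(num_classes):
        intersection = sum(1 for t, p in zip(truth, pred) if t == c and p == c)
        union = sum(1 for t, p in zip(truth, pred) if t == c or p == c)
        if union > 0:
            ious.append(intersection / union)
    if not ious:
        raise ValueError("no class is present in truth or pred")
    return sum(ious) / len(ious)


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    pred = [0, 1, 1, 1]
    # Class 0: intersection 1, union 2. Class 1: intersection 2, union 3.
    print(mean_iou(truth, pred, num_classes=2))
```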
In order to solve the problems of artifacts and noise in low-dose computed tomography (CT) images in clinical medical diagnosis, an improved image denoising algorithm under the architecture of a generative adversarial network (GAN) was ***, a noise model based on StyleGAN2 was constructed to estimate the real noise distribution, and noise information similar to the real noise distribution was generated as the experimental noise data ***, a network model with an encoder-decoder architecture as the core, based on the GAN idea, was constructed, and the network model was trained with the generated noise data set until it reached the optimal ***, the noise and artifacts in low-dose CT images could be removed by inputting low-dose CT images into the denoising *** experimental results showed that the constructed network model based on the GAN architecture improved the utilization rate of noise feature information and the stability of network training, removed image noise and artifacts, and reconstructed images with rich texture and realistic visual effect.