ISBN (print): 9781728173665
Machine Translation is a sub-field of computational linguistics that includes broad analytical models enlarged by extremely advanced semantic ***. We have converted Arabic sentences to their corresponding Bangla sentences using a Neural Machine Translation (NMT) system operated with a fixed vocabulary, based on the Sequence to Sequence (Seq2Seq) mechanism, a structured approach for automatically transferring source sequences to target sequences. From experiments, we have achieved 69.76% accuracy in this discipline.
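The "fixed vocabulary" setup mentioned in the abstract can be illustrated with a minimal sketch (the helper names and toy sentences below are hypothetical, not from the paper): source tokens are mapped to integer ids, and out-of-vocabulary words collapse to a shared `<unk>` id before being fed to a Seq2Seq encoder.

```python
# Minimal fixed-vocabulary lookup for a Seq2Seq pipeline (illustrative sketch).
# Out-of-vocabulary tokens collapse to a shared <unk> id, so the model's
# input space stays fixed regardless of what appears at inference time.

def build_vocab(sentences, specials=("<pad>", "<unk>", "<sos>", "<eos>")):
    """Assign an integer id to every special token and every seen word."""
    vocab = {tok: i for i, tok in enumerate(specials)}
    for sent in sentences:
        for word in sent.split():
            vocab.setdefault(word, len(vocab))
    return vocab

def encode(sentence, vocab):
    """Map words to ids, wrapped in <sos>/<eos>; unseen words become <unk>."""
    unk = vocab["<unk>"]
    return [vocab["<sos>"]] + [vocab.get(w, unk) for w in sentence.split()] + [vocab["<eos>"]]

vocab = build_vocab(["the cat sat", "the dog ran"])
ids = encode("the bird sat", vocab)   # "bird" was never seen, so it maps to <unk>
```

The same lookup is applied on the target side, so the decoder's output layer also has a fixed size.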
ISBN (digital): 9783030012342
ISBN (print): 9783030012342; 9783030012335
Spatial pyramid pooling modules and encoder-decoder structures are used in deep neural networks for semantic segmentation. The former networks are able to encode multi-scale contextual information by probing the incoming features with filters or pooling operations at multiple rates and multiple effective fields-of-view, while the latter networks can capture sharper object boundaries by gradually recovering the spatial information. In this work, we propose to combine the advantages of both methods. Specifically, our proposed model, DeepLabv3+, extends DeepLabv3 by adding a simple yet effective decoder module to refine the segmentation results, especially along object boundaries. We further explore the Xception model and apply the depthwise separable convolution to both the Atrous Spatial Pyramid Pooling and decoder modules, resulting in a faster and stronger encoder-decoder network. We demonstrate the effectiveness of the proposed model on the PASCAL VOC 2012 and Cityscapes datasets, achieving test set performance of 89% and 82.1%, respectively, without any post-processing. Our paper is accompanied by a publicly available reference implementation of the proposed models in TensorFlow at https://***/tensorflow/models/tree/master/research/deeplab.
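The parameter savings behind depthwise separable convolution, which DeepLabv3+ applies in its ASPP and decoder modules, can be checked with a short sketch (the channel and kernel numbers below are illustrative, not the paper's): a k×k standard convolution costs C_in·C_out·k² weights, while the depthwise-then-pointwise factorization costs C_in·k² + C_in·C_out. Atrous (dilated) filters enlarge the field of view without adding parameters, since dilation only spaces out the existing taps.

```python
# Compare parameter counts of a standard conv vs. its depthwise separable
# factorization, and show that dilation enlarges the receptive field for free.

def standard_conv_params(c_in, c_out, k):
    return c_in * c_out * k * k          # one dense k x k filter per (in, out) pair

def separable_conv_params(c_in, c_out, k):
    depthwise = c_in * k * k             # one k x k filter per input channel
    pointwise = c_in * c_out             # 1 x 1 conv that mixes channels
    return depthwise + pointwise

def effective_kernel(k, dilation):
    # A dilated filter still has k taps, spread (dilation) samples apart.
    return k + (k - 1) * (dilation - 1)

dense = standard_conv_params(256, 256, 3)    # 589,824 weights
light = separable_conv_params(256, 256, 3)   # 67,840 weights
field = effective_kernel(3, 6)               # 13-sample field of view, still 9 taps
```

For these toy numbers, the separable form uses under an eighth of the parameters of the dense convolution, which is why stacking it inside ASPP and the decoder keeps the network fast.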
Deep convolutional neural networks (CNNs) have recently made revolutionary improvements in salient object detection. However, most existing CNN-based models fail to precisely separate the whole salient object(s) from a cluttered background due to downsampling effects or patch-level operation. In this paper, we propose a multi-scale deep encoder-decoder network which learns discriminative saliency cues and computes confidence scores in an end-to-end fashion. The encoder network extracts meaningful and informative features in a global view, and the decoder network recovers lost detailed object structure in a local perspective. By taking multiple resized images as the inputs, the proposed model incorporates multi-scale features from a shared network and predicts a fine-grained saliency map at the pixel level. To easily and efficiently train the whole network, the lightweight decoder breaks through the limit of the conventional symmetric structure. In addition, a two-stage training strategy is designed to promote the robustness and accuracy of the network. Without any post-processing steps, our method significantly reduces the computational complexity while densely segmenting foreground objects from an image. Extensive experiments on six challenging datasets demonstrate that the proposed model outperforms other state-of-the-art approaches in terms of various evaluation metrics. (C) 2018 Elsevier B.V. All rights reserved.
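The "multiple resized images as the inputs" idea can be sketched with a toy image pyramid in numpy (the image size and scale factors are made-up values, not the paper's): each subsampled copy of the input would be fed through the shared encoder, and the per-scale features fused.

```python
import numpy as np

# Build a simple image pyramid by strided subsampling: the multiple resized
# inputs that a shared saliency network would consume (toy scale factors).
def pyramid(img, scales=(1, 2, 4)):
    return [img[::s, ::s] for s in scales]

img = np.arange(64 * 64, dtype=float).reshape(64, 64)
levels = pyramid(img)
shapes = [lv.shape for lv in levels]   # one resolution per scale
```

In practice a network would use proper anti-aliased resizing rather than strided slicing; the sketch only shows how one input becomes a set of multi-scale inputs.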
Accurate segmentation of the cardiac bi-ventricle (CBV) from magnetic resonance (MR) images is of great significance for analyzing and evaluating the function of the cardiovascular system. However, the complex structure of CBV images makes fully automatic segmentation a well-known challenge. In this paper, we propose an improved end-to-end encoder-decoder network for CBV segmentation at the pixel level (Cardiac-DeepIED). In our framework, we explicitly address the high variability of complex cardiac structures through an improved encoder-decoder architecture which consists of Fire dilated modules and D-Fire dilated modules. This improved encoder-decoder architecture has the advantage of obtaining a semantic task-aware representation while preserving fine-grained information. In addition, our method can dynamically capture potential spatiotemporal correlations between consecutive cardiac MR images through a specially designed convolutional long short-term memory structure; it can simulate spatiotemporal contexts between consecutive frame images. The combination of these modules enables the entire network to produce accurate, robust segmentation results. The proposed method is evaluated on 145 clinical subjects with leave-one-out cross-validation. The average dice metric (DM) is up to 0.96 (left ventricle), 0.89 (myocardium), and 0.903 (right ventricle). Our method outperforms state-of-the-art methods. These results demonstrate the effectiveness and advantages of our method for pixel-level CBV region segmentation. They also suggest that the proposed automated segmentation system can be embedded into the clinical environment to accelerate the quantification of CBV and be extended to volume analysis, regional wall thickness analysis, and analysis of three LV dimensions.
The seismic horizon is a critical input for the structure and stratigraphy modeling of reservoirs. It is extremely hard to automatically obtain an accurate horizon interpretation for seismic data in which the lateral continuity of reflections is interrupted by faults and unconformities. The process of seismic horizon interpretation can be viewed as segmenting the seismic traces into different parts, where each part is a unique object. Thus, we treat horizon interpretation as an object detection problem. We use an encoder-decoder convolutional neural network (CNN) to detect the "objects" contained in the seismic traces. The boundaries of the objects are regarded as the horizons. The training data are the seismic traces located on a user-defined coarse grid. We give a unique training label to the time window of seismic traces bounded by two manually picked horizons. To efficiently learn the waveform pattern bounded by two adjacent horizons, we use variable sizes for the convolution filters, which differs from current CNN-based image segmentation methods. Two field data examples demonstrate that our method is capable of producing accurate horizons across fault surfaces and near unconformities, which is beyond the current capability of horizon-picking methods.
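The use of variable-sized convolution filters over traces can be sketched in numpy (the trace and kernel widths below are toy values, not the paper's): 1-D filters of several widths run over the same trace and their responses are stacked, so waveform patterns bounded by closely spaced horizons and by widely spaced ones are both covered.

```python
import numpy as np

# Run 1-D filters of several widths over one trace and stack the responses,
# a toy version of using variable convolution sizes to match waveform
# patterns bounded by horizons at different spacings.
def multi_width_responses(trace, widths):
    out = []
    for w in widths:
        kernel = np.ones(w) / w                        # averaging filter of width w
        out.append(np.convolve(trace, kernel, mode="same"))
    return np.stack(out)                               # (num_widths, trace_length)

trace = np.sin(np.linspace(0, 4 * np.pi, 64))          # synthetic trace
resp = multi_width_responses(trace, widths=[3, 7, 15])
```

A learned network would train these kernel weights instead of averaging, but the mechanics of applying mixed filter sizes to one trace are the same.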
ISBN (print): 9783030042240; 9783030042233
In many areas, images can be corrupted by various types of noise, and therefore image denoising is a prerequisite. For example, medical images such as 4D-CT or ultrasound images are prone to noise and artifacts that can affect diagnostic confidence. Remote sensing is another field in which image preprocessing is mandatory to improve the quality of source images. Synthetic Aperture Radar (SAR) images are typically corrupted by multiplicative speckle noise. In this paper, a deep neural network able to deal with both additive white Gaussian and multiplicative speckle noise is developed, which also shows some blind denoising capability. Experiments on noisy images show that the proposed encoder-decoder is efficient and competitive with state-of-the-art methods.
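The two corruption models the network is trained against differ in how the noise enters: additive white Gaussian noise is summed onto the clean image, while SAR speckle multiplies it. A small numpy sketch (toy image size and noise levels, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(0)
clean = rng.uniform(0.2, 0.8, size=(32, 32))   # toy "clean" image

# Additive white Gaussian noise: y = x + n, with n ~ N(0, sigma^2).
awgn = clean + rng.normal(0.0, 0.05, size=clean.shape)

# Multiplicative speckle: y = x * s, with s fluctuating around 1
# (here a unit-mean Gamma draw, a common multi-look speckle model).
speckle = clean * rng.gamma(shape=4.0, scale=0.25, size=clean.shape)
```

Because speckle scales with intensity, bright regions are hit harder than dark ones, which is why a single additive-noise model does not transfer directly to SAR data.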
ISBN (print): 9781538636367
Interstitial lung diseases (ILD) encompass a large spectrum of diseases sharing similarities in their physiopathology and computed tomography (CT) appearance. In this paper, we propose the adaptation of a deep convolutional encoder-decoder (CED) that has shown high accuracy for image segmentation. Such architectures require annotation of the entire region with pathological findings. This is difficult to acquire, due to uncertainty in the definition and extent of disease patterns and the significant human effort required, especially for large datasets. Therefore, current methods often use patch-based implementations of convolutional neural networks, which, however, tend to produce spatially inhomogeneous segmentations due to their local contextual view. We exploit the advantages of both architectures by using the output of a patch-based classifier as a prior for a CED. Our method could advance the state of the art in lung tissue segmentation using only a small number of newly annotated images.
ISBN (print): 9781510872219
It has been known for a long time that the classic Hidden Markov Model (HMM) derivation for speech recognition contains assumptions, such as the independence of observation vectors and weak duration modeling, that are practical but unrealistic. When using the hybrid approach, this is amplified by trying to fit a discriminative model into a generative one. Hidden Conditional Random Fields (CRFs) and segmental models (e.g. Semi-Markov CRFs / Segmental CRFs) have been proposed as an alternative, but for a long time failed to gain traction until recently. In this paper, we explore different length modeling approaches for segmental models and their relation to attention-based systems. Furthermore, we show experimental results on a handwriting recognition task and, to the best of our knowledge, the first reported results on the Switchboard 300h speech recognition corpus using this approach.
In this paper, we propose a generative recurrent model for human-character interaction. Our model is an encoder-recurrent-decoder network. The recurrent network is composed of multiple layers of long short-term memory (LSTM), flanked by an encoder network before it and a decoder network after it. With the proposed model, the virtual character's animation is generated on the fly while it interacts with the human player. The character's upcoming animation is automatically generated based on the motion history of both itself and its opponent. We evaluated our model on both public motion capture databases and our own recorded motion data. Experimental results demonstrate that the LSTM layers help the character learn a long history of human dynamics to animate itself. In addition, the encoder-decoder networks significantly improve the stability of the generated animation. This method can automatically animate a virtual character responding to a human player.
ISBN (print): 9781538640159
The host workloads in cloud computing environments and the application demands of real-world computing systems are becoming so complex that they pose a major challenge to cloud infrastructure vendors. To meet service level agreements between users and cloud service vendors, it is essential to predict future host load accurately, which is also significant for improving resource allocation and utilization in cloud computing. Although various methods and models have been developed, few of them can capture long-term temporal dependencies well enough to make accurate predictions. In this paper, we apply a GRU-based encoder-decoder network (GRUED), which contains two gated recurrent neural networks (GRUs), to address these issues. Thorough empirical studies based on the Google resource usage traces and traditional Unix system load traces demonstrate that our proposed method outperforms other state-of-the-art approaches for multi-step-ahead host workload prediction in cloud computing.
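The gated recurrent unit at the core of such an encoder-decoder keeps a hidden state via update and reset gates. A minimal numpy GRU cell (random toy weights, not a trained workload model) shows the recurrence that lets an encoder summarize a load history before a decoder rolls out multi-step predictions:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_cell(x, h, W, U, b):
    """One GRU step. W: (3, d_h, d_x), U: (3, d_h, d_h), b: (3, d_h)."""
    z = sigmoid(W[0] @ x + U[0] @ h + b[0])               # update gate
    r = sigmoid(W[1] @ x + U[1] @ h + b[1])               # reset gate
    h_tilde = np.tanh(W[2] @ x + U[2] @ (r * h) + b[2])   # candidate state
    return (1.0 - z) * h + z * h_tilde                    # gated interpolation

rng = np.random.default_rng(0)
d_x, d_h = 1, 8                                           # toy dimensions
W = rng.normal(size=(3, d_h, d_x))
U = rng.normal(size=(3, d_h, d_h))
b = np.zeros((3, d_h))

h = np.zeros(d_h)
for load in [0.3, 0.5, 0.9, 0.4]:                         # a toy host-load history
    h = gru_cell(np.array([load]), h, W, U, b)            # h summarizes the history
```

In a GRUED-style setup, the final `h` would initialize a second GRU that emits one predicted load per future step.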