ISBN (print): 9781509060672
Visual attention is a dynamic search process for acquiring information. However, most previous studies have focused on predicting static attended locations. Because they ignore the temporal relationship between fixations, these models usually cannot explain dynamic saccadic behavior well. In this paper, an iterative representation learning framework is proposed to predict the saccadic scanpath. Within the proposed framework, a saccade can be explained as an iterative process of finding the most uncertain area and updating the representation of the scene. In the implementation, a deep autoencoder is employed for representation learning. The current fixation is predicted to be the most salient pixel, with saliency estimated by the reconstruction residual of the deep network. Image patches around this fixation are then sampled to update the network for the selection of subsequent fixations. Compared with existing models, the proposed model achieves state-of-the-art performance on several public data sets.
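The fixate-then-update loop described in this abstract can be illustrated with a small sketch. It is not the authors' implementation: a linear PCA basis stands in for the deep autoencoder, and the function names, patch size, stride, sampling radius, and random initialization are all assumptions made for illustration.

```python
import numpy as np

def extract_patches(image, size, stride):
    """Return flattened patches and their center coordinates on a regular grid."""
    h, w = image.shape
    patches, centers = [], []
    for r in range(0, h - size + 1, stride):
        for c in range(0, w - size + 1, stride):
            patches.append(image[r:r + size, c:c + size].ravel())
            centers.append((r + size // 2, c + size // 2))
    return np.array(patches), np.array(centers)

def fit_basis(samples, k):
    """Assumed stand-in for the autoencoder: mean plus top-k principal directions."""
    mean = samples.mean(axis=0)
    _, _, vt = np.linalg.svd(samples - mean, full_matrices=False)
    return mean, vt[:k]

def reconstruction_residual(patches, mean, basis):
    """Norm of what the current representation fails to reconstruct, per patch."""
    centered = patches - mean
    recon = centered @ basis.T @ basis
    return np.linalg.norm(centered - recon, axis=1)

def predict_scanpath(image, n_fix=5, size=8, stride=4, k=4, radius=12, seed=0):
    rng = np.random.default_rng(seed)
    patches, centers = extract_patches(image, size, stride)
    # Hypothetical initialization: a few random patches as the first scene representation.
    memory = patches[rng.choice(len(patches), size=k + 1, replace=False)]
    scanpath = []
    for _ in range(n_fix):
        mean, basis = fit_basis(memory, k)
        saliency = reconstruction_residual(patches, mean, basis)
        fix = centers[int(np.argmax(saliency))]      # most poorly explained location
        scanpath.append((int(fix[0]), int(fix[1])))
        # Sample patches around the fixation and fold them into the representation.
        near = np.linalg.norm(centers - fix, axis=1) <= radius
        memory = np.vstack([memory, patches[near]])
    return scanpath
```

Calling `predict_scanpath(image, n_fix=3)` on a 2-D grayscale array returns three `(row, col)` fixations. Nothing in this sketch prevents a location from being fixated twice; the abstract does not say how the original model handles that case.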
ISBN (print): 9783319700960; 9783319700953
Deep classification networks have shown great accuracy in classifying inputs. However, they fall prey to adversarial inputs: random inputs chosen to yield a high-confidence classification. But perception is a two-way process, involving the interplay between feedforward sensory input and feedback expectations. In this paper, we construct a predictive estimator (PE) network, incorporating generative (predictive) feedback, and show that the PE network is less susceptible to adversarial inputs. We also demonstrate some other properties of the PE network.
ISBN (print): 9781467389884
Bone suppression in lung radiographs is an important task, as it improves the results of related tasks such as nodule detection and pathology classification. In this paper, we propose two architectures that suppress bones in radiographs by treating them as noise. In the proposed methods, we create end-to-end learning frameworks that minimize noise in the images while maintaining their sharpness and detail. Our results show that the proposed noise-cancellation scheme is robust and does not introduce artifacts into the images.
ISBN (print): 9781510848764
In this paper we analyze the gate activation signals inside gated recurrent neural networks and find that the temporal structure of these signals is highly correlated with phoneme boundaries. This correlation is further verified by a set of phoneme segmentation experiments, in which better results were obtained than with standard approaches.
ISBN (print): 9781538606674
Nowadays, biometric systems are used for security and personal recognition. The palmprint is one of the most reliable physiological modalities for recognizing human identity. Its biggest advantage is that it changes little over time compared to other physical modalities such as the face. In this paper, we address the problem of human identity recognition using the palmprint. We focus on the texture information that can be extracted using texture-based descriptors such as Gabor, Wavelet, Wave Atom, Curvelet, SIFT, CNN, and LBP. Our main contribution is the use of a Sparse Auto Encoder to represent all combined feature vectors of the palmprint texture. To evaluate the proposed sparse nonlinear representation of features, several experiments were carried out on the IITD palmprint database. The proposed approach has shown promising results for human identity recognition using fusion at the decision level.
ISBN (print): 9781509021758
Brain imaging data such as EEG or MEG are high-dimensional spatiotemporal data often degraded by complex, non-Gaussian noise. For reliable analysis of brain imaging data, it is important to extract discriminative, low-dimensional intrinsic representations of the recorded data. This work proposes a new method to learn low-dimensional representations from the noise-degraded measurements. In particular, our work proposes a new deep neural network design that integrates graph information such as brain connectivity with fully-connected layers. Our work leverages efficient graph filter design using Chebyshev polynomials and recent work on convolutional nets for graph-structured data. Our approach exploits graph structure as prior side information, localized graph filters for feature extraction, and neural networks for high-capacity learning. Experiments on real MEG datasets show that our approach can extract more discriminative representations, leading to improved accuracy in a supervised classification task.
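As a rough illustration of the Chebyshev graph filtering mentioned above, the sketch below applies $y = \sum_{k=0}^{K} \theta_k T_k(\tilde{L})\,x$ using the standard recurrence $T_k = 2\tilde{L}T_{k-1} - T_{k-2}$. The choice of the combinatorial Laplacian, the rescaling by the largest eigenvalue, and all names are assumptions; the abstract does not specify the authors' exact design.

```python
import numpy as np

def chebyshev_graph_filter(adjacency, signal, theta):
    """Apply y = sum_k theta[k] * T_k(L_scaled) @ signal on a graph signal.

    adjacency: symmetric (n, n) weight matrix with at least one edge (assumed).
    signal:    length-n vector, one value per node (e.g. per sensor).
    theta:     K + 1 filter coefficients; the filter is K-hop localized.
    """
    degree = adjacency.sum(axis=1)
    laplacian = np.diag(degree) - adjacency            # combinatorial Laplacian
    lam_max = np.linalg.eigvalsh(laplacian).max()
    # Rescale so the spectrum lies in [-1, 1], where Chebyshev polynomials are defined.
    scaled = 2.0 * laplacian / lam_max - np.eye(len(adjacency))

    t_prev = signal                                    # T_0(L) x = x
    out = theta[0] * t_prev
    if len(theta) > 1:
        t_curr = scaled @ signal                       # T_1(L) x = L x
        out = out + theta[1] * t_curr
    for k in range(2, len(theta)):
        t_next = 2.0 * (scaled @ t_curr) - t_prev      # Chebyshev recurrence
        out = out + theta[k] * t_next
        t_prev, t_curr = t_curr, t_next
    return out
```

Because each term only needs one more matrix-vector product, the filter avoids an explicit eigendecomposition at filtering time (the `eigvalsh` call here is only a simple way to get the largest eigenvalue). With three coefficients, an impulse at one node influences only nodes at most two hops away.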
ISBN (print): 9781509030507
This paper proposes a novel fully automatic diagnosis method for liver cirrhosis based on the reading of high-frequency ultrasound images. The proposed method determines the cirrhosis stage via a deep-learning neural network. First, we feed an ultrasound image into an autoencoder to generate the capsule-enhanced version of the image and binarize the enhanced image. Then, we employ a partition-clustering algorithm to obtain the top-end largest-area partition cluster, which represents the upper layer of the liver, and thereby locate the final liver capsule based on least-squares polynomial fitting. After separating the parenchymal region from the image, we use the proposed residual neural network to determine the cirrhosis stage. Experimental results demonstrate the high accuracy and effectiveness of the proposed method, which outperforms five other state-of-the-art methods. The proposed method is expected to improve the efficiency and accuracy of the clinical diagnosis of liver cirrhosis.
ISBN (print): 9781538633335
In this experiment, a phoneme classification model has been developed using a Deep Neural Network based framework. The experiment is conducted in two phases. In the first phase, the phoneme classification task is performed. The deep-structured model provided a good overall classification accuracy of 87.8%. Precision and recall values are reported for all the phonemes, and a confusion matrix of all the Bengali phonemes is derived. Using the confusion matrix, the phonemes are classified into nine groups. These nine groups provided a better overall classification accuracy of 98.7%, and a new confusion matrix is derived for these nine groups. A lower confusion rate is observed this time. In the second phase of the experiment, the nine groups are reclassified into 15 groups using knowledge of the manner of articulation, and the deep-structured model is retrained. The system provided an overall classification accuracy of 98.9% this time. This result is almost equal to the overall accuracy observed for the nine groups. However, because the nine groups are redivided into 15 groups, the phoneme confusion within a single group is reduced, which leads to a better phoneme classification model.
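To see why merging confusable phonemes into groups raises the overall accuracy, the sketch below collapses a class-level confusion matrix into a group-level one. The 4-class matrix and the grouping are made-up toy values, not data from the paper, and the function names are hypothetical.

```python
import numpy as np

def group_confusion(confusion, groups):
    """Collapse a class-level confusion matrix into a group-level one.

    confusion[i, j] counts samples of true class i predicted as class j.
    groups[i] is the group index assigned to class i.
    """
    groups = np.asarray(groups)
    n_groups = groups.max() + 1
    membership = np.zeros((len(groups), n_groups))
    membership[np.arange(len(groups)), groups] = 1.0   # one-hot class-to-group map
    return membership.T @ confusion @ membership

def accuracy(confusion):
    """Overall accuracy: correctly classified samples over all samples."""
    return np.trace(confusion) / confusion.sum()

# Toy example (assumed numbers): classes 0/1 and 2/3 are often confused with each other.
toy = np.array([[8, 2, 0, 0],
                [3, 7, 0, 0],
                [0, 0, 9, 1],
                [0, 1, 1, 8]])
print(accuracy(toy))                                   # 0.8
print(accuracy(group_confusion(toy, [0, 0, 1, 1])))    # 0.975
```

Errors that fall inside a group no longer count as mistakes once the group is the prediction target, so only the cross-group confusions remain.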
ISBN (print): 9781509049516
Hyperspectral unmixing is a challenging inverse problem that involves determining the fractional abundances of the representative materials (endmembers) in each pixel. In this paper, we develop a neural network autoencoder that dynamically exploits the sparsity of the abundances and enforces the abundance sum constraint (ASC) for hyperspectral unmixing. Instead of using the conventional mean square error (MSE) objective function, we use the spectral information divergence (SID) measure. Experiments are performed using a real hyperspectral dataset, and we compare the results obtained using both MSE and SID. It is demonstrated by qualitative inspection that using SID gives significantly better results than using MSE.
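For readers unfamiliar with the two ingredients named above, the following sketch shows the usual definition of SID (the symmetric Kullback-Leibler divergence between two spectra normalized to sum to one) and one common way to satisfy the ASC. Using a softmax for the ASC is an assumption for illustration; the abstract does not say how the authors enforce it, and the function names and the `eps` smoothing term are hypothetical.

```python
import numpy as np

def sid(x, y, eps=1e-12):
    """Spectral information divergence between two non-negative spectra.

    Each spectrum is normalized to a probability distribution over bands;
    SID is then KL(p || q) + KL(q || p). eps avoids log(0) for empty bands.
    """
    p = (x + eps) / np.sum(x + eps)
    q = (y + eps) / np.sum(y + eps)
    return float(np.sum((p - q) * (np.log(p) - np.log(q))))

def softmax_abundances(logits):
    """Assumed ASC mechanism: softmax gives non-negative abundances that sum to one."""
    shifted = logits - np.max(logits, axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(shifted)
    return weights / np.sum(weights, axis=-1, keepdims=True)

def reconstruct(abundances, endmembers):
    """Linear mixing model: pixel spectrum = abundances @ endmember spectra."""
    return abundances @ endmembers
```

Unlike MSE, SID compares the shape of the spectra and ignores a uniform scaling: `sid(x, 2 * x)` is essentially zero, which may matter when illumination varies across pixels.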
ISBN (print): 9781538626405
Deep learning is a noteworthy technique that has been applied in several fields. One of the most attractive subjects in which prediction accuracy needs more attention is fraud detection. Because a deep network can gradually learn the concepts of a complicated problem, using this technique in this domain is very beneficial. To do so, we propose a deep autoencoder to extract the best features from credit card transaction information and then append a softmax network to determine the class labels. Regarding the effect of features in such data, employing an overcomplete autoencoder can map the data to a high-dimensional space, and using sparse models yields a discriminative space that is useful for classification. One benefit of this method is its generality: such networks can be used in several domains, e.g. national intelligence, cyber security, marketing, medical informatics and so on. Another advantage is the ability to handle big datasets. As the learning phase is offline, the method can be applied to a huge amount of data and what is learned can be generalized. Results reveal the advantages of the proposed method compared to the state of the art.