Real-world datasets come in many formats (audio, music, images, ...). In our case, the data come from various sources mixed together. Our mixtures are noisy audio data from which features must be extracted, compressed and analysed so that they can be presented in a standard way. The resulting data will be used for the Blind Source Separation task. In this paper, we deal with two types of autoencoders: convolutional and denoising. The novelty of our work is to reconstruct the audio signal at the output of the neural network after extracting the meaningful features that capture the clean, essential information. Simulation results show strong performance, reaching 87% for the reconstructed signals, which will be included in an automated system for real-world applications.
Purpose: For many years, deep convolutional neural networks have achieved state-of-the-art results on a wide variety of computer vision tasks. 3D human pose estimation is no exception, and results on public benchmarks are impressive. However, specialized domains, such as operating rooms, pose additional challenges. Clinical settings include severe occlusions, clutter and difficult lighting conditions. Privacy concerns of patients and staff make it necessary to use unidentifiable data. In this work, we aim to bring robust human pose estimation to the clinical domain. Methods: We propose a 2D-3D information fusion framework that makes use of a network of multiple depth cameras and strong pose priors. In a first step, probabilities of 2D joints are predicted from single depth images. This information is fused in a shared voxel space, yielding a rough estimate of the 3D pose. Final joint positions are obtained by regressing into the latent pose space of a pre-trained convolutional autoencoder. Results: We evaluate our approach against several baselines on the challenging MVOR dataset. The best results are obtained when fusing 2D information from multiple views and constraining the predictions with learned pose priors. Conclusions: We present a robust 3D human pose estimation framework based on a multi-depth camera network in the operating room. Using depth images as the only input modality makes our approach especially interesting for clinical applications, because it preserves the anonymity of patients and staff.
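The fusion step described in the Methods can be illustrated with a toy sketch. It assumes two hypothetical orthographic cameras aligned with the grid axes and combines their 2D joint probabilities by multiplication; the real system uses calibrated depth cameras, and the abstract does not state the fusion rule. The names `fuse_views` and `peaked_map` are illustrative only.

```python
from itertools import product


def fuse_views(heat_xy, heat_yz, size):
    """Fuse two 2D joint-probability maps in a shared voxel grid.

    heat_xy[x][y]: probability from a hypothetical camera looking along z.
    heat_yz[y][z]: probability from a hypothetical camera looking along x.
    Returns the voxel with the highest fused score and that score.
    """
    best_voxel, best_score = None, -1.0
    for x, y, z in product(range(size), repeat=3):
        # Orthographic assumption: each view simply drops one coordinate.
        score = heat_xy[x][y] * heat_yz[y][z]
        if score > best_score:
            best_voxel, best_score = (x, y, z), score
    return best_voxel, best_score


def peaked_map(size, peak, high=0.9, low=0.01):
    """Build a toy 2D probability map with a single confident location."""
    return [[high if (i, j) == peak else low for j in range(size)]
            for i in range(size)]


size = 4
heat_xy = peaked_map(size, (2, 1))  # joint seen at x=2, y=1
heat_yz = peaked_map(size, (1, 3))  # joint seen at y=1, z=3
voxel, score = fuse_views(heat_xy, heat_yz, size)
print(voxel)  # the rough 3D estimate, before any pose-prior refinement
```

In this sketch the two views only agree at one voxel, so the fused maximum gives the rough 3D joint estimate that a learned pose prior would then refine.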
We investigate the application of convolutional neural networks (CNNs) to accelerate quantum mechanical transport simulations (based on the nonequilibrium Green's function (NEGF) method) of double-gate MOSFETs. In particular, given a potential distribution as input data, we implement a convolutional autoencoder that is trained to predict the carrier density and local quantum capacitance distributions. The results indicate that using a single trained CNN model in the NEGF self-consistent calculation along with Poisson's equation produces accurate potentials for a wide range of gate lengths, all within a significantly shorter computational time than conventional NEGF calculations.
Author: Alotaibi, Aziz
Taif Univ, Comp Sci Dept, Coll Comp & Informat Technol, At Taif 21974, Saudi Arabia
Deep learning has played a huge role in computer vision due to its ability to extract underlying and complex features of input images. Deep learning is applied to complex vision tasks to perform image recognition and classification. Recently, apparel classification, an application of computer vision, has been intensively explored and investigated. This paper proposes an effective framework, called DeepAutoDNN, based on deep learning algorithms for apparel classification. The DeepAutoDNN framework combines a deep autoencoder with deep neural networks to extract the complex patterns and high-level features of fashion images in a supervised manner. These features are passed to a categorical classifier to assign the given image to the correct label. To evaluate the performance and investigate the efficiency of the proposed framework, several experiments have been conducted on the Fashion-MNIST dataset, which consists of 70000 images: 60000 for training and 10000 for testing. The results show that the proposed framework achieves an accuracy of 93.4%. In the future, the performance of this framework can be improved by utilizing generative adversarial networks and their variants.
The cornerstone of materials design is the design space used in solving materials-related optimization problems. Materials design strategies often involve evaluating properties of materials with selected design variables. Because a microstructure constitutes high-dimensional data, dimensionality reduction using principal component analysis or multi-dimensional scaling has become a common practice for generating low-dimensional design variables. Unfortunately, generation of microstructures is not guaranteed using design variables formulated with popular dimensionality reduction techniques. These design variables are limited to the initial dataset used for dimensionality reduction, resulting in a discontinuous design space. This shortcoming severely constrains design flexibility and can hamper the performance of design strategies. To address this limitation, we propose the use of the latent space of a convolutional autoencoder trained with synthetic dual phase (DP) microstructures as a low-dimensional and continuous design space. Once the design space is established, Bayesian optimization is adopted to search for optimal microstructures that exhibit maximal tensile strength. The strength of each microstructure is approximated using a microstructure-based finite element method. To take full advantage of the continuous design space, a second Bayesian optimization with a refined search space is adopted. The second Bayesian optimization yielded a higher maximum strength value and required fewer data points to find the optimal microstructure than Bayesian optimization without search space refinement. Furthermore, multiple microstructures exhibiting strengths comparable to that of the optimal microstructure can be identified within the refined search space, providing significant flexibility in microstructure design.
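The two-stage search with a refined search space can be sketched as follows. This is only an illustration of the refinement idea: plain random search is swapped in for Bayesian optimization, the latent space is assumed to be two-dimensional, and `strength` is a hypothetical toy objective standing in for the finite element evaluation of a decoded microstructure.

```python
import random


def strength(z):
    """Hypothetical stand-in for the finite element strength evaluation."""
    return -((z[0] - 1.0) ** 2) - ((z[1] + 0.5) ** 2)


def random_search(objective, bounds, n_samples, rng):
    """Return the best (point, score) among uniform samples inside `bounds`."""
    best_z, best_score = None, float("-inf")
    for _ in range(n_samples):
        z = tuple(rng.uniform(lo, hi) for lo, hi in bounds)
        score = objective(z)
        if score > best_score:
            best_z, best_score = z, score
    return best_z, best_score


def refine_bounds(center, bounds, half_width):
    """Shrink the search box around `center`, clipped to the original bounds."""
    return [(max(lo, c - half_width), min(hi, c + half_width))
            for c, (lo, hi) in zip(center, bounds)]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
latent_bounds = [(-3.0, 3.0), (-3.0, 3.0)]  # assumed latent-space box

# Stage 1: coarse search over the whole latent design space.
z1, s1 = random_search(strength, latent_bounds, 50, rng)

# Stage 2: search again inside a refined box around the stage-1 optimum.
refined = refine_bounds(z1, latent_bounds, 0.5)
z2, s2 = random_search(strength, refined, 50, rng)

best_z, best_score = (z2, s2) if s2 > s1 else (z1, s1)
print(best_z, best_score)
```

Because the latent space is continuous, every point in the refined box can in principle be decoded into a microstructure, which is what makes this kind of refinement possible at all.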
Atmospheric turbulence can change the path and direction of light during the imaging of a target in space due to the random motion of the turbulent medium, resulting in severe image distortion. To correct geometric distortion and reduce spatially and temporally varying blur, this paper proposes a convolutional network for blind deblurring atmospheric turbulence (BDATNet) that includes a feature extraction noise suppression block (FENSB), an asymmetric U-net, and an image reconstruction subnetwork (IRSubnetwork). A deblurring noise suppression block (DNSB) is used instead of the traditional convolution layer for the U-net. The core principle of this model is to suppress noise before deblurring. During convolutional encoding, the FENSB and DNSB can suppress noise and capture rich feature maps. To fuse information obtained from low-level and high-level features, the FENSB and IRSubnetwork are skip-connected to ensure the integrity of the former during image reconstruction. Moreover, the network is trained on data of gradually increasing difficulty, so that it converges from simple to complex cases and can deal with images severely degraded by turbulence. Experimental results on real and simulated data show that BDATNet can restore image details, sharpen edges, and suppress noise. (C) 2020 Elsevier B.V. All rights reserved.
ISBN: 9781728150543 (print)
Artificial intelligence is increasingly used to understand human creative works; however, this remains one of the most difficult tasks. Our challenge is to find ways for AI to understand four-scene comics. To this end, we used a novel dataset called the "Four-scene Comics Story Dataset", which is the first dataset made by researchers and comic artists to develop AI creations. We focused on the illustration touches of comics, which are determined by the comic artists. First, we applied convolutional autoencoder (CAE) models to this dataset to obtain distributed representations, and then applied classifiers to distinguish the different touches. The prediction offers an indirect measure of the distributed representations. We have already conducted this experiment using the original scenes. To show the advantages of the dataset, the proposed method is validated by computer simulations on data in which various image parts, such as "eyes", are removed from the scene images of the four-scene comics story dataset.
ISBN: 9789811303418; 9789811303401 (print)
Path planning is an important function of autonomous mobile robots, and many path planning methods that satisfy various constraints, such as obstacle avoidance and energy efficiency, have been proposed. However, these conventional methods are difficult to apply in practice because of instability, low reproducibility, and the need for huge training data sets. Therefore, we propose a novel robot path planning method that combines the rapidly exploring random tree (RRT) and a long short-term memory (LSTM) network. In this method, a large number of good paths are generated in the robot configuration space by the RRT method, and a combined convolutional autoencoder and LSTM network is trained on them. The proposed method overcomes a common difficulty of neural-network-based methods, i.e., "the acquisition of a large amount of training data." Moreover, a common difficulty of random-sampling-based methods, i.e., "the reproducible path generation", is resolved at high speed.
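The RRT step that generates training paths can be sketched in a few lines. This is a minimal 2D version under stated assumptions, not the authors' implementation: the configuration space is a square, obstacles are hypothetical circles given as `(cx, cy, radius)`, only new nodes (not whole edges) are collision-checked, and all parameter names and defaults are illustrative. The LSTM and autoencoder parts are not shown.

```python
import math
import random


def rrt(start, goal, obstacles, seed=0, step=0.5, goal_tol=0.5,
        goal_bias=0.2, bounds=(0.0, 10.0), max_iter=3000):
    """Grow a rapidly exploring random tree; return a path or None."""
    rng = random.Random(seed)  # private generator, so runs are repeatable
    nodes = [start]
    parent = {0: None}
    for _ in range(max_iter):
        # Occasionally sample the goal itself to pull the tree toward it.
        if rng.random() < goal_bias:
            sample = goal
        else:
            sample = (rng.uniform(*bounds), rng.uniform(*bounds))
        # Find the tree node nearest to the sample.
        i_near = min(range(len(nodes)),
                     key=lambda i: math.dist(nodes[i], sample))
        near = nodes[i_near]
        d = math.dist(near, sample)
        if d == 0.0:
            continue
        # Move at most `step` from the nearest node toward the sample.
        t = min(step, d) / d
        new = (near[0] + t * (sample[0] - near[0]),
               near[1] + t * (sample[1] - near[1]))
        # Simplified collision test: reject nodes inside any circle.
        if any(math.dist(new, (cx, cy)) <= r for cx, cy, r in obstacles):
            continue
        nodes.append(new)
        parent[len(nodes) - 1] = i_near
        if math.dist(new, goal) <= goal_tol:
            # Walk back through the parents to recover the path.
            path, i = [], len(nodes) - 1
            while i is not None:
                path.append(nodes[i])
                i = parent[i]
            return path[::-1]
    return None


path = rrt((1.0, 1.0), (9.0, 9.0), obstacles=[(5.0, 5.0, 1.5)], seed=42)
print("no path found" if path is None else f"path with {len(path)} nodes")
```

Different seeds give different paths for the same problem, which illustrates the reproducibility issue of random-sampling planners that the abstract mentions. Fixing the seed only makes a single run repeatable.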
ISBN: 9781728155845 (print)
Mobile app traffic now accounts for the majority of traffic, owing to the boom in mobile devices and mobile apps. State-of-the-art identification methods, such as DPI and flow-based classifiers, suffer from the difficulty of manually designing features and labeling samples. Motivated by the excellence of CNNs in visual object recognition, we propose the convolutional autoencoder network (CAEN), a deep learning approach to mobile app traffic identification. Our contributions are two-fold. First, we propose a novel method of converting traffic flows into vision-meaningful images, thus enabling the machine to identify the traffic in a human-like way. Based on this method, we create an open dataset named IMTD. Second, the convolutional autoencoder (CAE) algorithm is introduced into the proposed network model, enabling automatic feature extraction and learning from massive unlabeled samples. The experimental results show that the identification accuracy of our approach can reach 99.5%, which satisfies practical requirements.
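The abstract does not say how flows are turned into images. One simple possibility, shown here purely as an assumption, is to take the first bytes of a flow, pad or truncate them to a fixed length, and read each byte as a grayscale pixel. The function name `flow_to_image` and the image size are hypothetical.

```python
def flow_to_image(payload: bytes, side: int = 8):
    """Map the first side*side bytes of a flow to a side x side grayscale grid.

    Short flows are zero-padded and long flows are truncated, so every
    flow yields an image of the same shape (pixel values 0..255).
    """
    n = side * side
    padded = payload[:n].ljust(n, b"\x00")
    return [list(padded[r * side:(r + 1) * side]) for r in range(side)]


# A toy "flow": the start of an HTTP request.
image = flow_to_image(b"GET /index.html HTTP/1.1\r\n", side=4)
for row in image:
    print(row)
```

Fixed-size images like these can be fed to a convolutional autoencoder without labels, which matches the unsupervised feature learning the abstract describes.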
ISBN: 9781510625464 (print)
Cardiovascular disease (CVD) is a leading cause of death in the lung cancer screening population. Chest CT scans made in lung cancer screening are suitable for identification of participants at risk of CVD. Existing methods that analyze CT images from lung cancer screening to predict CVD events or mortality use engineered features extracted from the images, combined with patient information. In this work we propose a method that automatically predicts 5-year cardiovascular mortality directly from chest CT scans without the need for hand-crafting image features. A set of 1,583 participants of the National Lung Screening Trial was included (1,188 survivors, 395 non-survivors). Low-dose chest CT images acquired at baseline were analyzed and the follow-up time was 5 years. To limit the analysis to the heart region, the heart was first localized by our previously developed algorithm for organ localization exploiting convolutional neural networks. Thereafter, a convolutional autoencoder was used to encode the identified heart region. Finally, based on the extracted encodings, subjects were classified as survivors or non-survivors using a neural network. The performance of the method was assessed in eight cross-validation experiments with 1,433 images used for training, 50 for validation and 100 for testing. The method achieved an area under the ROC curve of 0.73. The results demonstrate that prediction of cardiovascular mortality directly from low-dose screening chest CT scans, without hand-crafted features, is feasible, allowing identification of subjects at risk of fatal CVD events.
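For readers unfamiliar with the reported metric, the area under the ROC curve equals the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative one. The sketch below computes it that way on made-up toy scores (not data from the study) and also checks that the reported counts add up.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for lab, s in zip(labels, scores) if lab == 1]
    neg = [s for lab, s in zip(labels, scores) if lab == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy example: 1 = non-survivor, 0 = survivor, with hypothetical risk scores.
toy_auc = roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])
print(toy_auc)  # 3 of the 4 positive/negative pairs are ordered correctly

# Consistency of the counts reported in the abstract.
cohort = 1188 + 395          # survivors + non-survivors
per_fold = 1433 + 50 + 100   # training + validation + test images
print(cohort, per_fold)
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported 0.73 means roughly 73% of such pairs are ordered correctly.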