This dissertation presents the development of sensorimotor primitives as a means of constructing a language-agnostic model of speech communication. Insights from major theories in speech science and linguistics are used to develop a conceptual framework for sensorimotor primitives in the context of control and information theory. Within this conceptual framework, sensorimotor primitives are defined as a system transformation that simplifies the interface to some high-dimensional and/or nonlinear system. In the context of feedback control, sensorimotor primitives take the form of a feedback transformation. In the context of communication, sensorimotor primitives are represented as a channel encoder and decoder pair. Using a high-fidelity simulation of articulatory speech synthesis, these realizations of sensorimotor primitives are applied, respectively, to feedback control of the articulators and to communication via the acoustic speech signal. Experimental results demonstrate the construction of a model of speech communication that is capable of both transmitting and receiving information, and of imitating simple utterances.
ISBN: (print) 9781728131405
Visual perception is, by and large, the main source of information used by humans when driving. It is therefore natural and appropriate to rely heavily on vision analysis for autonomous driving, as most projects do. However, there is a significant difference between the common approach to vision in autonomous driving and visual perception in humans when driving. Essentially, image analysis is often regarded as an isolated and autonomous module whose high-level output drives the control modules of the vehicle. The direction presented here is different: we take inspiration from the brain architecture that makes humans so effective at learning tasks as complex as driving. Two key theories about biological perception ground our development. The first is the view of thinking as a simulation of perception and action, as theorized by Hesslow. The second is the Convergence-Divergence Zones (CDZs) mechanism of mental simulation, which connects the process of extracting features from a visual scene to the inverse process of imagining a scene's content by decoding features stored in memory. We show how our model, based on a semi-supervised variational autoencoder, is a rather faithful implementation of these two basic neurocognitive theories.
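The CDZ-style convergence/divergence cycle described above maps naturally onto a VAE's encode-sample-decode loop. The following is a minimal illustrative sketch in numpy with made-up dimensions and plain linear maps; it is not the authors' architecture, only the shape of the mechanism:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy dimensions; the paper's actual layer sizes are not given here.
x_dim, z_dim = 16, 4
W_enc = rng.normal(scale=0.1, size=(x_dim, 2 * z_dim))  # convergence: scene -> latent features
W_dec = rng.normal(scale=0.1, size=(z_dim, x_dim))      # divergence: features -> imagined scene

def encode(x):
    # Convergence zone: compress the perceived scene into mean and log-variance.
    h = x @ W_enc
    return h[:z_dim], h[z_dim:]

def reparameterize(mu, logvar):
    # Sample z = mu + sigma * eps; in a trained VAE this keeps sampling differentiable.
    return mu + np.exp(0.5 * logvar) * rng.normal(size=mu.shape)

def decode(z):
    # Divergence zone: re-imagine a scene from the stored latent features.
    return z @ W_dec

x = rng.normal(size=x_dim)        # "perceived" scene
mu, logvar = encode(x)
z = reparameterize(mu, logvar)
x_imagined = decode(z)

assert x_imagined.shape == x.shape
assert z.shape == (z_dim,)
```

The same decoder that reconstructs a perceived scene can, in Hesslow's sense, "simulate" perception from memory alone by decoding a latent code sampled without any input image.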
Gene-expression profiling enables researchers to quantify transcription levels in cells, thus providing insight into functional mechanisms of diseases and other biological processes. However, because of the high dimensionality of these data and the sensitivity of measuring equipment, expression data often contain unwanted confounding effects that can skew analysis. For example, collecting data in multiple runs causes nontrivial differences in the data (known as batch effects), known covariates that are not of interest to the study may have strong effects, and there may be large systemic effects when integrating multiple expression datasets. Additionally, many of these confounding effects represent higher-order interactions that may not be removable using existing techniques that identify linear patterns. We created Confounded to remove these effects from expression data. Confounded is an adversarial variational autoencoder that removes confounding effects while minimizing the amount of change to the input data. We tested the model on artificially constructed data and commonly used gene expression datasets and compared against other common batch adjustment algorithms. We also applied the model to remove cancer-type-specific signal from a pan-cancer expression dataset. Our software is publicly available at https://***/jdayton3/Confounded.
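The adversarial trade-off described above (erase batch identity while changing the expression values as little as possible) can be sketched as two competing losses. This is an illustrative numpy sketch; the function names and the weighting parameter `lam` are hypothetical, not Confounded's actual API:

```python
import numpy as np

def reconstruction_loss(x, x_hat):
    # Penalize changing the expression values more than necessary.
    return np.mean((x - x_hat) ** 2)

def discriminator_loss(batch_probs, batch_labels):
    # Cross-entropy of a discriminator trying to predict which batch
    # (run, dataset, cancer type, ...) each sample came from.
    eps = 1e-9
    return -np.mean(np.log(batch_probs[np.arange(len(batch_labels)), batch_labels] + eps))

def autoencoder_objective(x, x_hat, batch_probs, batch_labels, lam=1.0):
    # The autoencoder minimizes reconstruction error while *maximizing* the
    # discriminator's error, so batch identity becomes unrecoverable from its output.
    return reconstruction_loss(x, x_hat) - lam * discriminator_loss(batch_probs, batch_labels)

# With perfect reconstruction and a chance-level (50/50) discriminator over
# two batches, the objective reduces to -log 2.
x = np.zeros((2, 3))
probs = np.full((2, 2), 0.5)
labels = np.array([0, 1])
assert np.isclose(autoencoder_objective(x, x, probs, labels), -np.log(2), atol=1e-6)
```

In training, the discriminator and the autoencoder are updated in alternation, each against the other's current state, which is what lets the model remove nonlinear, higher-order batch signal rather than only linear patterns.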
The development of data-driven approaches, such as deep learning, has led to the emergence of systems that achieve human-like performance in a wide variety of tasks. For robotic tasks, deep data-driven models make it possible to create adaptive systems without having to program them explicitly. Such adaptive systems are needed in situations where changes in the task and environment cannot be foreseen.
Convolutional neural networks (CNNs) have become the standard way to process visual data in robotics. End-to-end neural network models that cover the entire control task can perform various complex tasks with little feature engineering. However, the adaptivity of these systems goes hand in hand with the level of variation in the training data, and training end-to-end deep robotic systems requires large amounts of domain-, task-, and hardware-specific data, which are often costly to obtain.
In this work, we propose to tackle this issue by employing a deep neural network with a modular architecture, consisting of separate perception, policy, and trajectory parts. Each part of the system is trained fully on synthetic data or in simulation. The data is exchanged between parts of the system as low-dimensional representations of affordances and trajectories. The performance is then evaluated in a zero-shot transfer scenario using the Franka Panda robotic arm. Results demonstrate that a low-dimensional representation of scene affordances extracted from an RGB image is sufficient to successfully train manipulator policies.
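The modular split described above means only a small affordance vector crosses the boundary between the separately trained parts. A minimal numpy sketch of that interface, with stand-in functions and hypothetical dimensions in place of the trained networks:

```python
import numpy as np

rng = np.random.default_rng(1)

AFFORDANCE_DIM = 8   # hypothetical size of the low-dimensional affordance code
TRAJ_PARAMS = 5      # hypothetical number of trajectory parameters

def perception(rgb_image):
    # Stand-in for a CNN trained purely on synthetic images: maps an RGB
    # observation to a low-dimensional affordance representation.
    return rng.normal(size=AFFORDANCE_DIM)

def policy(affordance):
    # Stand-in for a policy trained in simulation: maps affordances to
    # trajectory parameters executed on the real arm (zero-shot transfer).
    W = np.ones((AFFORDANCE_DIM, TRAJ_PARAMS))
    return affordance @ W

image = rng.random((64, 64, 3))
affordance = perception(image)   # only this small vector crosses the module boundary
trajectory = policy(affordance)

assert affordance.shape == (AFFORDANCE_DIM,)
assert trajectory.shape == (TRAJ_PARAMS,)
```

Because each module sees only the low-dimensional interface, the perception part can be retrained for a new camera or scene without touching the policy, which is what makes the per-module synthetic training tractable.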
Over the past decade, bottleneck features within an i-Vector framework have been used for state-of-the-art language/dialect identification (LID/DID). However, traditional bottleneck feature extraction requires additional transcribed speech. To address this limitation, two types of unsupervised deep learning methods are introduced in this study. First, an unsupervised bottleneck feature extraction approach is proposed, derived from the traditional bottleneck structure but trained with estimated phonetic labels. Second, based on generative autoencoder modeling, two latent-variable learning algorithms previously considered for image processing/reconstruction are introduced for speech feature processing: a variational autoencoder and an adversarial autoencoder, each utilized at an alternative stage of speech processing. To demonstrate the effectiveness of the proposed methods, three corpora are evaluated: 1) a four-dialect Chinese dataset, 2) a five-dialect Arabic corpus, and 3) the multi-genre broadcast challenge corpus (MGB-3) for Arabic DID. The proposed features consistently outperform traditional MFCC acoustic features across all three corpora. Taken collectively, the proposed features achieve up to a relative +58% improvement in Cavg for LID/DID without the need for any secondary speech corpora.
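The idea of a bottleneck feature extractor is simply a network with a narrow middle layer whose activations replace the raw acoustic features. A forward-pass sketch in numpy, with hypothetical layer sizes and untrained random weights (in the proposed approach the network would be trained against estimated, not transcribed, phonetic labels):

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical sizes: wide input (e.g. a 39-dim MFCC-like frame), narrow bottleneck.
in_dim, hid, bottleneck = 39, 64, 13

W1 = rng.normal(scale=0.1, size=(in_dim, hid))
W2 = rng.normal(scale=0.1, size=(hid, bottleneck))

def bottleneck_features(frames):
    # Forward pass up to the narrow layer; its activations are the features
    # fed to the downstream i-Vector LID/DID system. The output layers that
    # predict (estimated) phonetic labels during training are omitted here.
    h = np.tanh(frames @ W1)
    return np.tanh(h @ W2)

frames = rng.normal(size=(100, in_dim))   # 100 frames of an utterance
feats = bottleneck_features(frames)
assert feats.shape == (100, bottleneck)
```

The bottleneck forces the network to compress each frame, so the 13-dimensional activations carry phonetically discriminative structure rather than raw spectral detail.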
Deep learning is usually applied to static datasets. When it is used for classification on data streams, non-stationarity is not easy to take into account. This thesis presents work in progress on a new method for online deep classification learning in data streams with slow or moderate drift, highly relevant for the application domain of malware detection. The method uses a combination of a multilayer perceptron and a variational autoencoder to achieve constant memory consumption by encoding past data into a generative model. This can make online learning of neural networks more accessible for independent adaptive systems with limited memory. First results on real-world malware stream data are presented, and they look promising.
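The constant-memory trick above is generative replay: instead of storing the stream, a generative model summarizes the past and is sampled to produce pseudo-data that is mixed into each training batch. A numpy sketch under a simplifying assumption, using a running diagonal Gaussian as a stand-in for the VAE (the class and function names are hypothetical):

```python
import numpy as np

rng = np.random.default_rng(3)

class GaussianReplay:
    # Stand-in for the VAE: past data are summarized by a diagonal Gaussian,
    # so memory use stays constant regardless of stream length.
    def __init__(self, dim):
        self.n = 0
        self.s = np.zeros(dim)    # running sum
        self.ss = np.zeros(dim)   # running sum of squares

    def update(self, batch):
        self.n += len(batch)
        self.s += batch.sum(axis=0)
        self.ss += (batch ** 2).sum(axis=0)

    def sample(self, k):
        # Generate k pseudo-samples resembling the data seen so far.
        mu = self.s / self.n
        var = np.maximum(self.ss / self.n - mu ** 2, 1e-9)
        return mu + np.sqrt(var) * rng.normal(size=(k, len(self.s)))

def training_batch(new_batch, replay, k):
    # Mix fresh stream data with generated pseudo-samples of the past,
    # so the online classifier does not forget earlier concepts.
    return np.vstack([new_batch, replay.sample(k)])

replay = GaussianReplay(3)
replay.update(rng.normal(size=(50, 3)))
mixed = training_batch(rng.normal(size=(10, 3)), replay, k=20)
assert mixed.shape == (30, 3)
```

A VAE plays the same role as the Gaussian here but can represent multimodal past data; under slow or moderate drift, periodically refreshing the generator keeps the replayed distribution close to the recent past.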
We present SAGNet, a structure-aware generative model for 3D shapes. Given a set of segmented objects of a certain class, the geometry of their parts and the pairwise relationships between them (the structure) are jointly learned and embedded in a latent space by an autoencoder. The encoder intertwines the geometry and structure features into a single latent code, while the decoder disentangles the features and reconstructs the geometry and structure of the 3D model. Our autoencoder consists of two branches, one for the structure and one for the geometry. The key idea is that during the analysis, the two branches exchange information between them, thereby learning the dependencies between structure and geometry and encoding two augmented features, which are then fused into a single latent code. This explicit intertwining of information enables separately controlling the geometry and the structure of the generated models. We evaluate the performance of our method and conduct an ablation study. We explicitly show that encoding of shapes accounts for both similarities in structure and geometry. A variety of quality results generated by SAGNet are presented.
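The exchange-then-fuse step in the two-branch encoder can be sketched with plain linear maps: each branch augments its own feature with a projection of the other branch's feature before the two are fused into one latent code. All dimensions and weight matrices below are illustrative stand-ins, not SAGNet's trained layers:

```python
import numpy as np

rng = np.random.default_rng(4)

g_dim = s_dim = 6    # hypothetical geometry / structure feature sizes
fused_dim = 8        # hypothetical size of the single latent code

W_gs = rng.normal(scale=0.1, size=(g_dim, s_dim))          # geometry -> structure branch
W_sg = rng.normal(scale=0.1, size=(s_dim, g_dim))          # structure -> geometry branch
W_fuse = rng.normal(scale=0.1, size=(g_dim + s_dim, fused_dim))

def exchange_and_fuse(geometry_feat, structure_feat):
    # Each branch exchanges information with the other, producing two
    # augmented features that encode geometry/structure dependencies...
    g_aug = geometry_feat + structure_feat @ W_sg
    s_aug = structure_feat + geometry_feat @ W_gs
    # ...which are then fused into a single latent code.
    return np.concatenate([g_aug, s_aug]) @ W_fuse

z = exchange_and_fuse(rng.normal(size=g_dim), rng.normal(size=s_dim))
assert z.shape == (fused_dim,)
```

Because the dependencies are baked into the augmented features before fusion, the decoder can later disentangle the code back into geometry and structure, which is what allows them to be controlled separately at generation time.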
We introduce SDM-NET, a deep generative neural network which produces structured deformable meshes. Specifically, the network is trained to generate a spatial arrangement of closed, deformable mesh parts, which respects the global part structure of a shape collection, e.g., chairs, airplanes, etc. Our key observation is that while the overall structure of a 3D shape can be complex, the shape can usually be decomposed into a set of parts, each homeomorphic to a box, and the finer-scale geometry of the part can be recovered by deforming the box. The architecture of SDM-NET is that of a two-level variational autoencoder (VAE). At the part level, a PartVAE learns a deformable model of part geometries. At the structural level, we train a Structured Parts VAE (SP-VAE), which jointly learns the part structure of a shape collection and the part geometries, ensuring the coherence between global shape structure and surface details. Through extensive experiments and comparisons with the state-of-the-art deep generative models of shapes, we demonstrate the superiority of SDM-NET in generating meshes with visual quality, flexible topology, and meaningful structures, benefiting shape interpolation and other subsequent modeling tasks.
Background: We examine the problem of clustering biomolecular simulations using deep learning techniques. Since biomolecular simulation datasets are inherently high dimensional, it is often necessary to build low dimensional representations that can be used to extract quantitative insights into the atomistic mechanisms that underlie complex biological phenomena. We use a convolutional variational autoencoder (CVAE) to learn low dimensional, biophysically relevant latent features from long time-scale protein folding simulations in an unsupervised manner. We demonstrate our approach on three model protein folding systems, namely the Fs-peptide (14 μs aggregate sampling), the villin headpiece (single trajectory of 125 μs) and the β-β-α (BBA) protein (223 + 102 μs sampling across two independent trajectories). In these systems, we show that the learned CVAE latent features correspond to distinct conformational substates along the protein folding pathways. The CVAE model predicts, on average, nearly 89% of all contacts within the folding trajectories correctly, while being able to extract folded, unfolded and potentially misfolded states in an unsupervised manner. Further, the CVAE model can be used to learn latent features of protein folding that can be applied to other independent trajectories, making it particularly attractive for identifying intrinsic features that correspond to conformational substates sharing similar structural characteristics. Taken together, we show that the CVAE model can quantitatively describe complex biophysical processes such as protein folding.
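The "~89% of contacts predicted correctly" figure corresponds to binarizing the CVAE's reconstructed contact map and comparing it entry-wise with the ground truth. A minimal sketch of that metric in numpy (the function name and the 0.5 threshold are illustrative assumptions, not the paper's exact evaluation code):

```python
import numpy as np

def contact_accuracy(true_map, reconstructed, threshold=0.5):
    # Binarize the CVAE reconstruction and report the fraction of
    # contact-map entries (contact / no contact) it gets right.
    predicted = reconstructed >= threshold
    return np.mean(predicted == true_map.astype(bool))

# Tiny worked example: every entry is on the correct side of the threshold.
true_map = np.array([[1, 0],
                     [0, 1]])
recon = np.array([[0.9, 0.2],
                  [0.4, 0.7]])
assert contact_accuracy(true_map, recon) == 1.0
```

In practice this is averaged over all frames of a trajectory; because the latent space is learned only from reconstruction, clusters of frames with similar codes correspond to conformational substates without any supervision.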
We present a generative neural network that enables us to generate plausible 3D indoor scenes in large quantities and varieties, easily and highly efficiently. Our key observation is that indoor scene structures are inherently hierarchical. Hence, our network is not convolutional; it is a recursive neural network, or RvNN. Using a dataset of annotated scene hierarchies, we train a variational recursive autoencoder, or RvNN-VAE, which performs scene object grouping during its encoding phase and scene generation during decoding. Specifically, a set of encoders are recursively applied to group 3D objects based on support, surround, and co-occurrence relations in a scene, encoding information about objects' spatial properties, semantics, and relative positioning with respect to other objects in the hierarchy. By training a variational autoencoder (VAE), the resulting fixed-length codes roughly follow a Gaussian distribution. A novel 3D scene can be generated hierarchically by the decoder from a randomly sampled code from the learned distribution. We coin our method GRAINS, for Generative Recursive autoencoders for INdoor Scenes. We demonstrate the capability of GRAINS to generate plausible and diverse 3D indoor scenes and compare with existing methods for 3D scene synthesis. We show applications of GRAINS including 3D scene modeling from 2D layouts, scene editing, and semantic scene segmentation via PointNet whose performance is boosted by the large quantity and variety of 3D scenes generated by our method.
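The recursive encoding described above folds a scene hierarchy bottom-up: one shared merger repeatedly maps two child codes to a parent code of the same size, so any hierarchy collapses to a single fixed-length root code. A numpy sketch with illustrative dimensions and a random, untrained merge weight (GRAINS additionally distinguishes support/surround/co-occurrence mergers, which this sketch collapses into one):

```python
import numpy as np

rng = np.random.default_rng(5)

code_dim = 4
W_merge = rng.normal(scale=0.1, size=(2 * code_dim, code_dim))

def merge(a, b):
    # One recursive encoder step: two child codes -> one parent code
    # of the same size, so the operation can be applied at every level.
    return np.tanh(np.concatenate([a, b]) @ W_merge)

def encode_hierarchy(node):
    # A node is either a leaf object code or a (left, right) grouping.
    if isinstance(node, tuple):
        return merge(encode_hierarchy(node[0]), encode_hierarchy(node[1]))
    return node

# Hypothetical bedroom: a bed grouped with a nightstand, next to a wardrobe.
bed, nightstand, wardrobe = (rng.normal(size=code_dim) for _ in range(3))
root = encode_hierarchy(((bed, nightstand), wardrobe))
assert root.shape == (code_dim,)
```

Decoding runs the same recursion in reverse, splitting a sampled root code into child codes until leaf object codes are reached, which is how a whole scene is generated from one Gaussian sample.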