Search Results - Inner Mongolia University Library

2016 IEEE 13th International Conference on Signal Processing (ICSP2016)

Authors: Jiahao Lai, Bo Chen, Tian Tan, Sibo Tong, Kai Yu; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Speech Lab, Department of Computer Science and Engineering, Brain Science and Technology Research Center, Shanghai Jiao Tong University

This paper investigates a new voice conversion technique using phone-aware Long Short-Term Memory Recurrent Neural Networks (LSTM-RNNs). Most existing voice conversion methods, including Joint Density Gaussian Mixture Models (JDGMMs), Deep Neural Networks (DNNs) and Bidirectional Long Short-Term Memory Recurrent Neural Networks (BLSTM-RNNs), take only acoustic information of speech as features to train models. We propose to incorporate linguistic information into the voice conversion system by using monophones generated by a speech recognizer as linguistic features. The monophones and spectral features are combined to train LSTM-RNN based voice conversion models, reinforcing the context-dependency modelling of *** results of the 1st voice conversion challenge show that our system achieves significantly higher performance than the baseline (GMM method) and was found among the most competitive scores in the similarity test. Meanwhile, the experimental results show that the phone-aware LSTM-RNN method obtains lower Mel-cepstral distortion and higher MOS scores than the baseline LSTM-RNNs.

Keywords: voice conversion voice conversion challenge long short-term memory recurrent neural networks phone

An Investigation on Deep Learning with Beta Stabilizer

2016 IEEE 13th International Conference on Signal Processing (ICSP2016)

Authors: Qi Liu, Tian Tan, Kai Yu; Key Laboratory of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, Speech Lab, Department of Computer Science and Engineering, Brain Science and Technology Research Center, Shanghai Jiao Tong University

Artificial neural networks (ANNs) have been used in many applications such as handwriting recognition and speech recognition. It is well known that the learning rate is a crucial value in the training procedure for artificial neural networks. It has been shown that the initial value of the learning rate can profoundly affect the final result, and this value is always set manually in practice. A new parameter called the beta stabilizer has been introduced to reduce the sensitivity to the initial learning rate. However, this method has only been proposed for deep neural networks (DNNs) with the sigmoid activation function. In this paper we extend the beta stabilizer to long short-term memory (LSTM) and investigate the effects of beta stabilizer parameters on different models, including LSTM and DNN with the ReLU activation *** is concluded that beta stabilizer parameters can reduce the sensitivity to the learning rate with almost the same performance on DNN with the ReLU activation function and LSTM. However, it is shown that the effects of the beta stabilizer on DNN with the ReLU activation function and LSTM are smaller than the effects on DNN with the sigmoid activation function.

Keywords: DNN An Investigation on Deep Learning with Beta Stabilizer

SRAC: Self-Reflective Risk-Aware Artificial Cognitive Models for Robot Response to Human Activities

IEEE International Conference on Robotics and Automation

Authors: Hao Zhang, Christopher Reardon, Fei Han, Lynne E. Parker; the Human-Centered Robotics Lab in the Department of Computer Science and Electrical Engineering, Colorado School of Mines, Golden, CO 80401, USA; the Distributed Intelligence Lab in the Department of Computer Science and Electrical Engineering, University of Tennessee, Knoxville, TN 37996, USA.

ISBN: (print) 9781467380270

In human-robot teaming, interpretation of human actions, recognition of new situations, and appropriate decision making are crucial abilities for cooperative robots ("co-robots") to interact intelligently with humans. Given an observation, it is important that human activities are interpreted the same way by co-robots as human peers so that robot actions can be appropriate to the activity at hand. A novel interpretability indicator is introduced to address this issue. When a robot encounters a new scenario, the pretrained activity recognition model, no matter how accurate in a known situation, may not produce the correct information necessary to act appropriately and safely in new situations. To effectively and safely interact with people, we introduce a new generalizability indicator that allows a co-robot to self-reflect and reason about when an observation falls outside the co-robot's learned model. Based on topic modeling and the two novel indicators, we propose a new Self-reflective Risk-aware Artificial Cognitive (SRAC) model, which allows a robot to make better decisions by incorporating robot action risks and identifying new situations. Experiments both using real-world datasets and on physical robots suggest that our SRAC model significantly outperforms the traditional methodology and enables better decision making in response to human behaviors.

Keywords: Models Robot Response

Effects of data presentation and perceptual speed on speed and accuracy in table reading for inventory control

Occupational Ergonomics, 2015, vol. 12, no. 3, pp. 119-129

Authors: Ziefle, Martina; Brauner, Philipp; Speicher, Frederic; Department of Communication Science, Human-Computer Interaction Center, RWTH Aachen University, Campus Boulevard 57, Aachen 52074, Germany

BACKGROUND: The increasing amount of available data in digital working environments raises considerable usability challenges. Beyond the trend for automation of such processes, strategic decisions still depend on humans in the loop who have to perceive, understand and process increasingly complex information and to make quick and correct decisions with considerable consequences for the effectiveness of the production process. OBJECTIVE: This work is concerned with a baseline experiment in which effects of data presentation and information complexity on speed and accuracy were studied, taking table reading for inventory control as an example. METHODS: Experimentally, the information complexity (number of lines per table, number of digits, specificity of labels) as well as operators' cognitive ability (perceptual speed) were examined in terms of decision speed and accuracy. In addition, learnability effects were assessed. RESULTS: Results show a significant effect of all factors on task performance. With increasing information complexity, decision speed decreases considerably. Operators' perceptual speed modulates performance. Low perceptual speed in conjunction with insufficient data presentation results in significantly lower task performance. CONCLUSIONS: Usability and user-centered information display are of vital importance for efficient operator performance and for balancing mental workload. The findings contribute to an understanding of the effects of single factors in combination for mental workload and may lead to better managerial decisions concerning the design of working conditions (e.g. by automating processes). © 2015 IOS Press and the authors. All rights reserved.

Keywords: accuracy inventory control performance speed table reading tabular data Usability

User reviews of gamepad controllers: A source of user requirements and user experience

2nd ACM SIGCHI Annual Symposium on computer-human interaction in Play, CHI PLAY 2015

Authors: Merdenyan, Burak; Petrie, Helen; Human Computer Interaction Research Group, Department of Computer Science, University of York, YO10 5GH, United Kingdom

ISBN: (print) 9781450334662

The development of the digital games industry has motivated game console makers to provide better gamepads for gamers. As gamepads provide the interaction between digital games and gamers, it is important to understand gamers' requirements for these devices. This study used content analysis to investigate whether existing gamepads satisfy gamers' requirements and provide good game experience, with a view to informing new designs. A content analysis of user reviews of four different game consoles was conducted. An emergent coding scheme with 11 categories was developed. 'Comfort' was the most frequently mentioned category, accounting for nearly 25% of all comments in the reviews. 'Material Quality' and 'Responsiveness' yielded the most negative comments. Implications for design improvements are discussed. © Copyright 2015 by the Association for Computing Machinery, Inc. (ACM).

Keywords: computer games

[POSTER] Movable spatial AR on-the-go

14th IEEE International Symposium on Mixed and Augmented Reality, ISMAR 2015

Authors: Lee, Ahyun; Lee, Joo-Haeng; Kim, Jaehong; Computer Software, Korea University of Science and Technology, Korea Republic of; Human Robot Interaction Lab, ETRI, Korea Republic of

ISBN: (print) 9781467376600

We present a movable spatial augmented reality (SAR) system that can be easily installed in a user workspace. The proposed system aims to dynamically cover a wider projection area using a portable projector attached to a simple robotic device. It has a clear advantage over a conventional SAR scenario where, for example, a projector must be installed with a fixed projection area in the workspace. In previous research [1], we proposed a data-driven kinematic control method for a movable SAR system. This method targets a SAR system integrated with a user-created robotic (UCR) device where an explicit kinematic configuration such as a CAD model is unavailable. Our contribution in this paper is to show the feasibility of the data-driven control method by developing a practical application where dynamic change of the projection area matters. We outline the control method and demonstrate an assembly guide example using a casually installed movable SAR system. © 2015 IEEE.

Keywords: Kinematics

EXPLOITING LSTM STRUCTURE IN DEEP NEURAL NETWORKS FOR SPEECH RECOGNITION

IEEE International Conference on Acoustics, Speech and Signal Processing

Authors: Tianxing He, Jasha Droppo; Key Lab. of Shanghai Education Commission for Intelligent Interaction and Cognitive Engineering, SpeechLab, Department of Computer Science and Engineering, Shanghai Jiao Tong University, China; Microsoft Research, Redmond

ISBN: (print) 9781479999897

The CD-DNN-HMM system has become the state-of-the-art system for large vocabulary continuous speech recognition (LVCSR) tasks, in which deep neural networks (DNNs) play a key role. However, DNN training suffers from the vanishing gradient problem, which limits the training of deep models. In this work, we address this problem by incorporating into the DNN the successful long short-term memory (LSTM) structure, which was proposed to help recurrent neural networks (RNNs) remember long-term dependencies. We also propose a generalized formulation of the LSTM block, which we name general LSTM (GLSTM). In our experiments, it is shown that our proposed (G)LSTM-DNN scales well with more layers, and achieves 8.2% relative word error rate reduction on the 2000-hour Switchboard data set.

Keywords: speech recognition DNN LSTM acoustic model Speech recognition Acoustic models Neural network recurrent neural nets

A bilingual graph-based semantic model for statistical machine translation

25th International Joint Conference on Artificial Intelligence, IJCAI 2016

Authors: Rui, Wang; Zhao, Hai; Ploux, Sabine; Lu, Bao-Liang; Utiyama, Masao; Department of Computer Science and Eng., Shanghai Jiao Tong University, Shanghai, China; Key Lab of Shanghai Education Commission for Intelligent Interaction and Cognitive Eng., Shanghai Jiao Tong University, Shanghai, China; Centre National de la Recherche Scientifique, CNRS-L2C2, France; National Institute of Information and Communications Technology, Kyoto, Japan

Most existing bilingual embedding methods for Statistical Machine Translation (SMT) suffer from two obvious drawbacks. First, they focus only on simple context, such as word count and co-occurrence in a document or sliding window, to build word embeddings, ignoring latent useful information from selected context. Second, word sense rather than word form is supposed to be the minimal semantic unit, yet most existing work still targets word representation. This paper presents the Bilingual Graph-based Semantic Model (BGSM) to alleviate these shortcomings. By using maximum complete sub-graphs (cliques) for context selection, BGSM is capable of effectively modeling word sense representation instead of the word form itself. The proposed model is applied to phrase pair translation probability estimation and generation for SMT. The empirical results show that BGSM can enhance SMT both in performance (up to +1.3 BLEU) and in efficiency compared with existing methods.

Keywords: computer aided language translation

Author Correction: Embodiment in a Child-Like Talking Virtual Body Influences Object Size Perception, Self-Identification, and Subsequent Real Speaking

Scientific Reports, 2018, vol. 8, no. 1, p. 4854

Authors: Ana Tajadura-Jiménez, Domna Banakou, Nadia Bianchi-Berthouze, Mel Slater; UCL Interaction Centre (UCLIC) University College London London UK. a.tajadura@ucl.ac.uk. Universidad Loyola Andalucía Department of Psychology Seville Spain. a.tajadura@ucl.ac.uk. Universidad Loyola Andalucía Human Neuroscience Lab Seville Spain. a.tajadura@ucl.ac.uk. Event Lab Department of Clinical Psychology and Psychobiology Faculty of Psychology Barcelona Spain. Institute of Neurosciences University of Barcelona Barcelona Spain. UCL Interaction Centre (UCLIC) University College London London UK. Department of Computer Science University College London London UK. melslater@ub.edu. Event Lab Department of Clinical Psychology and Psychobiology Faculty of Psychology Barcelona Spain. melslater@ub.edu. Institute of Neurosciences University of Barcelona Barcelona Spain. melslater@ub.edu. Institució Catalana de Recerca i Estudis Avançats (ICREA) Barcelona Spain. melslater@ub.edu.

A correction to this article has been published and is linked from the HTML and PDF versions of this paper. The error has not been fixed in the paper.

Public perception of V2X-technology - evaluation of general advantages, disadvantages and reasons for data sharing with connected vehicles

IEEE Symposium on Intelligent Vehicle

Authors: T. Schmidt, R. Philipsen, P. Themann, M. Ziefle; Human Computer Interaction Center (HCIC), RWTH Aachen University, Aachen, Germany; Head of the department of communication science (CS) at RWTH Aachen, RWTH Aachen University, Aachen, Germany; Manages the development of ADAS, RWTH Aachen University

ISBN: (print) 9781509018222

This work aims at an evaluation of vehicle-to-infrastructure (V2X)-technology from the users' perspective. The technical opportunities of connected vehicles are affected by the acceptance of the technology and possible drawbacks on the privacy and data-security side. With a three-tiered research approach, this work first identified lines of argument in focus group discussions, which enabled a quantitative approach to evaluate positively and negatively perceived features of V2X-technology. Gender-related differences are also shown. Further, the results of the second quantitative study indicate that although users who already have experience with driver assistance systems are more willing to share (personal) data to use V2X-technology, the overall sample is very reserved with respect to sharing driver-related data. Future research on user diversity and cultural differences is outlined.

Keywords: Vehicles Safety Data privacy Privacy Connected vehicles Data security
