We present a movable spatial augmented reality (SAR) system that can be easily installed in a user workspace. The proposed system aims to dynamically cover a wider projection area using a portable projector attached t...
ISBN: 9781479999897 (print)
A recent trend in normalizing factors extraneous to a speech recognition task has been to explicitly introduce features related to the unwanted variability into the training of Deep Neural Networks (DNNs). Typically, this is done either by perturbing the training set with models of these extraneous factors, such as vocal tract length and environmental noise, or by augmenting the conventional spectral features with auxiliary information such as i-vectors or the noise spectrum. Another emerging approach is to derive low-dimensional representations of the factors from the hidden layers of a DNN and use them to normalize the acoustic model. Almost all of these approaches focus on either speaker or environment normalization. In this paper we propose a novel approach for estimating a compact joint representation of speakers and environment by training a DNN with a bottleneck layer to classify i-vector features into speaker and environment labels using Multi-Task Learning (MTL). Another novelty is that this compact representation is learned while learning to map the i-vector of a noisy utterance to its corresponding clean speaker i-vector and noise-only i-vector. Experiments were conducted on an artificially noise-corrupted version of the WSJ corpus. The proposed compact joint speaker-environment representations show promising gains.
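As a rough illustration of the multi-task idea in this abstract, the sketch below passes an i-vector through a shared bottleneck layer and then through two softmax heads, one for speaker labels and one for environment labels, and combines their cross-entropy losses. It is a minimal forward-pass sketch only: the layer sizes, the tanh activation, the weighting `alpha`, and all function names are assumptions made for the example, not details taken from the paper, and training and the i-vector mapping task are omitted.

```python
import math
import random

def linear(x, W, b):
    """Affine map; W is a list of rows, one row per output unit."""
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]

def softmax(z):
    """Numerically stable softmax over a list of scores."""
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]

def init(n_out, n_in, rng):
    """Small random weights and zero biases for one layer."""
    W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return W, [0.0] * n_out

def forward(ivec, params):
    """Shared bottleneck followed by two task-specific softmax heads."""
    Wb, bb, Ws, bs, We, be = params
    bottleneck = [math.tanh(v) for v in linear(ivec, Wb, bb)]
    p_spk = softmax(linear(bottleneck, Ws, bs))  # speaker posteriors
    p_env = softmax(linear(bottleneck, We, be))  # environment posteriors
    return bottleneck, p_spk, p_env

def mtl_loss(p_spk, p_env, spk, env, alpha=0.5):
    """Weighted sum of the two cross-entropy losses (alpha is an assumption)."""
    return -alpha * math.log(p_spk[spk]) - (1 - alpha) * math.log(p_env[env])

rng = random.Random(0)
IVEC_DIM, BN_DIM, N_SPK, N_ENV = 20, 4, 5, 3  # hypothetical sizes
params = (*init(BN_DIM, IVEC_DIM, rng),
          *init(N_SPK, BN_DIM, rng),
          *init(N_ENV, BN_DIM, rng))
ivec = [rng.gauss(0, 1) for _ in range(IVEC_DIM)]  # stand-in for a real i-vector
bottleneck, p_spk, p_env = forward(ivec, params)
loss = mtl_loss(p_spk, p_env, spk=2, env=1)
```

In a setup like this, the `bottleneck` activations would serve as the compact joint speaker-environment representation once the network has been trained.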
ISBN: 9781450336734 (print)
Driving can be a lonely activity. While there has been a lot of research and technical invention concerning car-to-car communication and passenger entertainment, there is still little work on connecting drivers. Although tourism is very much a social activity, drive tourists and road trippers have few options for communicating with fellow travelers. Our study is placed at the intersection of tourism and driving. It aims to enhance the trip experience during driving through social interaction. This paper explores how a mobile application that allows instant messaging between travelers who share a similar context can establish a temporary, ad hoc community and enhance the road trip experience. A prototype was developed and evaluated in various user and field studies. The study's outcomes are relevant for the design of future mobile tourist guides that benefit from community design, social encounters and recommendations.
Stereoscopic displays are promising and have many applications in various fields because they can provide amazing visual effects. However, one of the inevitable problems is that they also cause visual fatigue after du...
Tangible User Interfaces (TUIs) can create opportunities for children to learn programming, which has a positive effect on children's development. TanProRobot is a tangible system designed for children in grades 1-2...
ISBN: 9781450331463 (print)
Object-oriented programming is easily accessible to beginners, since it allows real-world entities to be modeled as software objects. Storytelling is a natural way to introduce the basic concepts behind object-oriented programming. To convey object-oriented programming concepts such as object and attribute to children, we present a new tangible programming tool, TanProStory, for children in grades 1-3. Children can tell a story by arranging programming blocks to initialize a character and construct a program that controls its actions. TanProStory consists of three parts: Programming blocks, Animation Game and Sensor input module. Programming blocks in TanProStory are surface-sensitive, i.e. only the command on the top surface can be detected. We conducted a preliminary user study and analyzed the results, which can guide a better design of TanProStory.
This paper explores three persuasive strategies and their capacity to encourage biking as a low-energy mode of transportation. The strategies were designed based on: (I) triggering messages that harness social influen...
ISBN: 9781450331463 (print)
Mixed Reality (MR) has the potential to improve the quality of users' experience by immersing users in the virtual world, but the limitations of computer vision and 3D graphics techniques have made it difficult to build practical applications. In this paper we present a mixed reality application that combines a mixed reality experience with storytelling to motivate young children to engage more in reading. We describe the system design from physical space to software implementation and share our findings from 4 years of deployment. Since the first prototype was deployed at a national children's library headquartered in Korea, the accumulated number of young visitors has reached 15,000 and 20 additional children's libraries have installed the system. Our results demonstrate that mixed reality applications combined with storytelling create a pleasant and engaging user experience for young children.
Speech enhancement is an essential technique for processing degraded audio in various applications. Beamforming, which eliminates interference using sensor arrays, is the best-known method for this problem. However, traditional beamformers often suffer from magnitude incoherence in the received signals due to directional weighting. Therefore, a novel dual-channel beamformer based on time-delay compensation (TDC) and shifted principal components analysis (PCA) is presented in this work. First, our enhancement algorithm uses a TDC estimator to preserve binaural cues, including the interaural time delay and intensity difference. The estimated cues are then incorporated to improve the shifted PCA, which reduces noise by extracting the primary components. Finally, the preprocessed audio is fed to a beamformer with a post-filter to obtain the enhanced speech. Experiments demonstrate that the proposed algorithm can improve speech intelligibility over state-of-the-art methods in real scenarios.
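To make the notion of time-delay compensation between two channels concrete, the sketch below estimates the inter-channel delay with a brute-force cross-correlation search, shifts one channel to align it with the other, and averages the two. This is a generic delay-and-sum illustration under simple assumptions (integer sample delays, a toy pulse signal, hypothetical function names); it is not the TDC estimator, shifted PCA, or post-filter described in the abstract.

```python
def estimate_delay(ref, sig, max_lag):
    """Return the lag (in samples) at which sig best matches ref."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        # Cross-correlation of ref with sig shifted by `lag` samples.
        score = 0.0
        for i in range(len(ref)):
            j = i + lag
            if 0 <= j < len(sig):
                score += ref[i] * sig[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

def compensate(sig, lag):
    """Shift sig by `lag` samples so it lines up with the reference (zero-padded)."""
    n = len(sig)
    return [sig[i + lag] if 0 <= i + lag < n else 0.0 for i in range(n)]

def delay_and_sum(left, right, max_lag=8):
    """Align the right channel to the left one, then average the two channels."""
    lag = estimate_delay(left, right, max_lag)
    aligned = compensate(right, lag)
    return lag, [(a + b) / 2 for a, b in zip(left, aligned)]

# Toy example: the right channel receives the same pulse 3 samples later.
left = [0.0] * 32
left[10], left[11] = 1.0, 0.5
right = [0.0] * 32
right[13], right[14] = 1.0, 0.5

lag, out = delay_and_sum(left, right)
```

In this toy case the estimated lag is 3 samples, and after compensation the averaged output reproduces the left-channel pulse.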