Search Results - Inner Mongolia University Library

Asia-Pacific Signal and Information Processing Association Annual Summit and Conference (APSIPA ASC)

Authors: Lo, Chun-Huang; Lee, Chung-Nan (Natl Sun Yat Sen Univ, Dept Comp Sci & Engn, Kaohsiung, Taiwan)

ISBN: (print) 9798350300673

This paper presents a vision-based unmanned aerial vehicle (UAV) indoor obstacle avoidance system using deep reinforcement learning (DRL). The system consists of two parts: depth map compression and UAV control. For depth map compression, a pre-trained variational autoencoder model is used to improve obstacle avoidance and reduce the training time. The states include the UAV information, gray images, and compressed features. A dueling double deep recurrent Q-network model is used to control the UAV. The network is trained in the AirSim simulation. Simulations are conducted to validate the performance of the proposed algorithm. The results show that the proposed algorithm can avoid obstacles at high speed in a narrow space and fly through a difficult L-shaped corner in an indoor simulation.

Keywords: Unmanned Aerial Vehicle; Obstacle Avoidance; Deep Reinforcement Learning; AirSim; variational autoencoder
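
The abstract above names a dueling double deep Q-network as the controller. As a rough illustration only, the sketch below shows the two standard ingredients in plain Python: the dueling aggregation of a state value and per-action advantages, and the double-DQN bootstrap target. The function names and toy numbers are assumptions for illustration; the recurrent part, the VAE features, and the paper's actual network are not reproduced here.

```python
from typing import List


def dueling_q_values(state_value: float, advantages: List[float]) -> List[float]:
    """Combine V(s) and A(s, a) into Q(s, a), subtracting the mean advantage."""
    mean_adv = sum(advantages) / len(advantages)
    return [state_value + a - mean_adv for a in advantages]


def double_dqn_target(reward: float, done: bool, gamma: float,
                      q_online_next: List[float],
                      q_target_next: List[float]) -> float:
    """Double DQN target: the online network selects the next action,
    the target network evaluates it."""
    if done:
        return reward
    best_action = max(range(len(q_online_next)), key=lambda a: q_online_next[a])
    return reward + gamma * q_target_next[best_action]


# Toy numbers (hypothetical): three discrete UAV actions.
q_next = dueling_q_values(1.0, [0.5, -0.5, 0.0])
target = double_dqn_target(reward=1.0, done=False, gamma=0.9,
                           q_online_next=q_next,
                           q_target_next=[2.0, 5.0, 3.0])
```

Decoupling action selection from action evaluation is what distinguishes the double-DQN target from the plain max over the target network's values.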

Multivariate air quality time series analysis via a recurrent variational deep learning model

Conference on Geospatial Informatics XIII

Authors: Loughlin, Cooper; Manolakis, Dimitris; Ingle, Vinay (Northeastern Univ, 360 Huntington Ave, Boston, MA 02115, USA; MIT Lincoln Lab, 244 Wood St, Lexington, MA 02421, USA)

ISBN: (digital) 9781510661653

ISBN: (print) 9781510661646; 9781510661653

Monitoring of air pollutants across space and time is critical in understanding pollution trends and reporting air quality. The Air Quality Index (AQI) is a tool used to communicate air quality that incorporates atmospheric concentrations of five major pollution indicators: ground-level ozone, particulate matter, carbon monoxide, sulfur dioxide, and nitrogen dioxide. The ability to accurately forecast these concentrations and identify unusual levels is of particular importance. In this work, we develop a generative time series model for air quality indicators and use it for long- and short-term probabilistic forecasts. Air quality data are multivariate and exhibit high variability across indicators in both space and time. Marginal indicator distributions are typically skewed and contain substantial zeros, while indicator-wise cross-correlations can be highly non-linear. We find that hourly measurements additionally exhibit substantial temporal cross-correlation, long-term dependence, and daily periodicity. To capture these complexities, we employ a recurrent extension of the variational autoencoder (VAE) to sequential data. The VAE is a generative neural network architecture capable of learning complex, high-dimensional manifolds on which data are distributed. Furthermore, recurrent architectures can capture non-linear and long-term temporal qualities of time series data. We train the proposed time series model on historical air quality measurements at multiple locations and demonstrate its ability to capture observed indicator-wise and temporal complexities. We additionally use the trained model to compute probabilistic forecasts and credible intervals of air quality indicators.

Keywords: air quality index; multivariate time series analysis; deep latent variable model; recurrent neural network; variational autoencoder; generative time series model
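
The VAE mentioned in the abstract above rests on two standard building blocks: sampling the latent variable with the reparameterization trick, and the closed-form KL divergence between a diagonal Gaussian posterior and a standard normal prior. The following is a minimal plain-Python sketch of just those two pieces, with hypothetical function names; it does not attempt the recurrent extension or the air quality model itself.

```python
import math
import random
from typing import List


def reparameterize(mu: List[float], log_var: List[float],
                   rng: random.Random) -> List[float]:
    """Sample z = mu + sigma * eps with eps ~ N(0, 1), where sigma = exp(log_var / 2)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]


def kl_to_standard_normal(mu: List[float], log_var: List[float]) -> float:
    """KL( N(mu, diag(sigma^2)) || N(0, I) ) in closed form."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv
                     for m, lv in zip(mu, log_var))


rng = random.Random(0)  # fixed seed so the sketch is repeatable
z = reparameterize([0.3, -1.2], [-100.0, -100.0], rng)  # tiny variance: z stays near mu
kl_at_prior = kl_to_standard_normal([0.0, 0.0], [0.0, 0.0])
```

In a full VAE objective the KL term is added to a reconstruction loss; in a recurrent variant this would typically be done per time step, though how the cited work does so is not stated in the abstract.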

Generative Slate Recommendation with Reinforcement Learning

16th International Conference on Web Search and Data Mining

Authors: Deffayet, Romain; Thonet, Thibaut; Renders, Jean-Michel; de Rijke, Maarten (Naver Labs Europe, Meylan, France; Univ Amsterdam, Amsterdam, Netherlands)

ISBN: (纸本)9781450394079

Recent research has employed reinforcement learning (RL) algorithms to optimize long-term user engagement in recommender systems, thereby avoiding common pitfalls such as user boredom and filter bubbles. They capture the sequential and interactive nature of recommendations, and thus offer a principled way to deal with long-term rewards and avoid myopic behaviors. However, RL approaches are intractable in the slate recommendation scenario, where a list of items is recommended at each interaction turn, due to the combinatorial action space. In that setting, an action corresponds to a slate that may contain any combination of items. While previous work has proposed well-chosen decompositions of actions so as to ensure tractability, these rely on restrictive and sometimes unrealistic assumptions. Instead, in this work we propose to encode slates in a continuous, low-dimensional latent space learned by a variational auto-encoder. Then, the RL agent selects continuous actions in this latent space, which are ultimately decoded into the corresponding slates. By doing so, we are able to (i) relax assumptions required by previous work, and (ii) improve the quality of the action selection by modeling full slates instead of independent items, in particular by enabling diversity. Our experiments performed on a wide array of simulated environments confirm the effectiveness of our generative modeling of slates over baselines in practical scenarios where the restrictive assumptions underlying the baselines are lifted. Our findings suggest that representation learning using generative models is a promising direction towards generalizable RL-based slate recommendation.

Keywords: Slate recommendation; Reinforcement learning; variational autoencoder
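
To make the idea in the abstract above concrete, the sketch below decodes a continuous latent action into a slate by scoring every candidate item for each slot and taking the best one. This is only a toy under stated assumptions: the linear per-slot weights stand in for a learned VAE decoder, and `decode_slate` and its arguments are hypothetical names, not the authors' implementation.

```python
from typing import List


def decode_slate(latent: List[float],
                 slot_weights: List[List[List[float]]]) -> List[int]:
    """Map a latent action to one item index per slate position.

    slot_weights[k][i] is the (toy, assumed) weight vector scoring item i
    for slot k; a real system would use a trained decoder network instead.
    """
    slate = []
    for item_weights in slot_weights:
        scores = [sum(w * z for w, z in zip(weights, latent))
                  for weights in item_weights]
        slate.append(max(range(len(scores)), key=lambda i: scores[i]))
    return slate


# Two slots, three candidate items, two latent dimensions (all made up).
toy_weights = [
    [[0.1, 0.0], [0.9, 0.0], [0.5, 0.0]],   # slot 0
    [[0.0, 1.0], [-1.0, 0.0], [2.0, 0.0]],  # slot 1
]
slate = decode_slate([1.0, 0.0], toy_weights)
```

The point of the construction is that the agent only has to choose a low-dimensional vector, while the decoder handles the combinatorial mapping to item lists.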

Towards Multi-User Activity Recognition through Facilitated Training Data and Deep Learning for Human-Robot Collaboration Applications

International Joint Conference on Neural Networks (IJCNN)

Authors: Semeraro, Francesco; Carberry, Jon; Cangelosi, Angelo (Univ Manchester, Manchester Ctr Robot & Manchester Lancs England; BAE Syst Plc, BAE Syst Operat Ltd, Warton, England)

ISBN: (print) 9781665488679

Human-robot interaction (HRI) research is progressively addressing multi-party scenarios, where a robot interacts with more than one human user at the same time. Conversely, research is still at an early stage for human-robot collaboration (HRC). The use of machine learning techniques to handle this type of collaboration requires data that are less feasible to produce than in a typical HRC setup. This work outlines scenarios of concurrent tasks for non-dyadic HRC applications. Based upon these concepts, this study also proposes an alternative way of gathering data regarding multi-user activity, by collecting data related to single users and merging them in post-processing, to reduce the effort involved in producing recordings of pair settings. To validate this statement, 3D skeleton poses of the activity of single users were collected and merged in pairs. These datapoints were then used to separately train a long short-term memory (LSTM) network and a variational autoencoder (VAE) composed of spatio-temporal graph convolutional networks (STGCN) to recognise the joint activities of the pairs of people. The results showed that it is possible to use data collected in this way for pair HRC settings and obtain performance similar to that achieved with training data from groups of users recorded under the same settings, which avoids the technical difficulties involved in producing these data. The related code and collected data are publicly available.

Keywords: multi-user activity recognition; single-user training data; concurrent tasks; multi-party human-robot collaboration; non-dyadic human-robot collaboration; deep learning; long short-term memory; variational autoencoder; spatio-temporal graph convolutional network; transfer learning
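
The data-gathering idea in the abstract above, recording single users and merging the recordings into pairs afterwards, can be sketched in a few lines. The data layout (lists of frames of 3D joints), the truncation to the shorter recording, and all names below are assumptions made for illustration; the authors' published code may differ.

```python
from itertools import combinations
from typing import Dict, List, Tuple

Joint = Tuple[float, float, float]  # (x, y, z) of one skeleton joint
Frame = List[Joint]                 # all joints of one user at one time step
Sequence = List[Frame]              # one single-user recording


def merge_pair(seq_a: Sequence, seq_b: Sequence) -> List[Tuple[Frame, Frame]]:
    """Align two single-user recordings frame by frame,
    truncating to the shorter one (an assumed alignment rule)."""
    n_frames = min(len(seq_a), len(seq_b))
    return [(seq_a[t], seq_b[t]) for t in range(n_frames)]


def build_pair_dataset(recordings: Dict[str, Sequence]):
    """Create one synthetic pair sample per unordered pair of recordings."""
    return {
        (name_a, name_b): merge_pair(recordings[name_a], recordings[name_b])
        for name_a, name_b in combinations(sorted(recordings), 2)
    }


# Tiny made-up recordings with a single joint per frame.
recordings = {
    "user1_reach": [[(0.0, 0.0, 0.0)], [(0.1, 0.0, 0.0)], [(0.2, 0.0, 0.0)]],
    "user2_idle": [[(1.0, 0.0, 0.0)], [(1.0, 0.0, 0.0)]],
}
pairs = build_pair_dataset(recordings)
```

With n single-user recordings this yields n(n-1)/2 pair samples, which is where the reduction in recording effort comes from.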

Accent-VITS: Accent Transfer for End-to-End TTS

18th National Conference on Man-Machine Speech Communication-NCMMSC-Annual

Authors: Ma, Linhan; Zhang, Yongmao; Zhu, Xinfa; Lei, Yi; Ning, Ziqian; Zhu, Pengcheng; Xie, Lei (Northwestern Polytech Univ, Sch Comp Sci, Audio Speech & Language Proc Grp ASLP NPU, Xian, Peoples R China; NetEase Inc, Fuxi AI Lab, Hangzhou, Peoples R China)

ISBN: (digital) 9789819706013

ISBN: (print) 9789819706006; 9789819706013

Accent transfer aims to transfer an accent from a source speaker to synthetic speech in the target speaker's voice. The main challenge is how to effectively disentangle speaker timbre and accent, which are entangled in speech. This paper presents a VITS-based [7] end-to-end accent transfer model named Accent-VITS. Based on the main structure of VITS, Accent-VITS makes substantial improvements to enable effective and stable accent transfer. We leverage a hierarchical CVAE structure to model accent pronunciation information and acoustic features, respectively, using bottleneck features and mel spectrums as constraints. Moreover, the text-to-wave mapping in VITS is decomposed into text-to-accent and accent-to-wave mappings in Accent-VITS. In this way, the disentanglement of accent and speaker timbre becomes more stable and effective. Experiments on multi-accent and Mandarin datasets show that Accent-VITS achieves higher speaker similarity, accent similarity, and speech naturalness compared with a strong baseline (Demos: https://***/AccentVITS/).

Keywords: Text to speech; Accent transfer; variational autoencoder; Hierarchical

Data-driven fault diagnosis based on the integrated deep nonlinear dynamic system model

43rd Chinese Control Conference, CCC 2024

Authors: Tang, Xiaochu; Tao, Na; Zhang, Yi; Li, Yuan (Shenyang Aerospace University, School of Automation, Shenyang, China; Shenyang University of Chemical Technology, College of Information Engineering, Shenyang, China)

ISBN: (print) 9789887581581

Ensuring the safety and reliability of complex industrial processes is very important. Therefore, effectively extracting multiple features of the data is of great significance for improving the accuracy of fault diagnosis models. Dynamics, uncertainty, and nonlinearity are the main characteristics of industrial process data. However, it is challenging for a model to extract multiple features of process data at the same time. In this paper, an integrated nonlinear dynamic system model based on a variational autoencoder-linear dynamic system (VAE-LDS) is proposed for fault diagnosis. First, the deep learning algorithm variational autoencoder (VAE) is used to extract the nonlinear data features and learn the latent representation of the data. Furthermore, the VAE model is embedded into the linear dynamic system (LDS) so that the dynamics and uncertainty underlying the data can be extracted simultaneously. In this way, a comprehensive model integrating multiple features can be established. Finally, the proposed method is applied to the TE process for fault diagnosis and compared with other methods. The results show that the proposed method has superior performance. © 2024 Technical Committee on Control Theory, Chinese Association of Automation.

Keywords: Data-driven; Fault Detection; Fault Diagnosis; Linear Dynamic System; variational autoencoder
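
For readers unfamiliar with the linear dynamic system (LDS) half of the VAE-LDS model described above, the sketch below shows one predict-and-update step of a scalar Kalman filter, the standard inference routine for an LDS of the form z_t = a z_{t-1} + noise, x_t = c z_t + noise. It is an illustration under simplifying assumptions (one-dimensional state, hypothetical parameter names) and says nothing about how the cited work couples the LDS with the VAE.

```python
from typing import Tuple


def kalman_step(mean: float, var: float, obs: float,
                a: float, c: float, q: float, r: float) -> Tuple[float, float]:
    """One step of a scalar Kalman filter.

    mean, var: posterior of the latent state at the previous step
    obs:       new observation
    a, c:      transition and emission coefficients
    q, r:      process and observation noise variances
    """
    # Predict: push the previous posterior through the linear dynamics.
    pred_mean = a * mean
    pred_var = a * a * var + q
    # Update: correct the prediction with the new observation.
    gain = pred_var * c / (c * c * pred_var + r)
    new_mean = pred_mean + gain * (obs - c * pred_mean)
    new_var = (1.0 - gain * c) * pred_var
    return new_mean, new_var


# Made-up numbers: unit prior variance, unit observation noise.
mean, var = kalman_step(mean=0.0, var=1.0, obs=2.0, a=1.0, c=1.0, q=0.0, r=1.0)
```

The posterior variance returned at each step is the kind of uncertainty estimate that a purely deterministic feature extractor would not provide.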

Two-stage instrument timbre transfer method using RAVE

26th International Symposium on Multimedia, ISM 2024

Authors: Hu, Di; Ito, Katunobu (Hosei University, Graduate School of Computer and Information Sciences, Tokyo, Japan; Hosei University, Faculty of Computer and Information Sciences, Tokyo, Japan)

ISBN: (print) 9798331511111

Recently, the real-time audio variational autoencoder (RAVE) method was developed for high-quality audio waveform synthesis. The RAVE method is based on a variational autoencoder and employs a two-stage training strategy. However, the RAVE model still has limitations in timbre transformation, especially when converting between instruments with significantly different timbres. Issues such as pitch instability, inaccurate timbre reproduction, and severe degradation in sound quality can arise. To enhance timbre transfer performance, we propose a two-stage timbre transformation method using RAVE, which involves applying two timbre transfer models to perform a dual transformation on the original input audio. To evaluate the proposed method, we trained the model and tested its performance using audio generated from MIDI and SoundFont2 sound sources. The results demonstrate that the proposed method improves timbre transfer compared to the single-stage RAVE model. © 2024 IEEE.

Keywords: audio synthesis; timbre transfer; variational autoencoder

An Ultrasound-Based Surveillance System for Bathroom Posture and Location Estimation

2024 IEEE International Conference on Consumer Electronics, ICCE 2024

Authors: Sato, Shun; Ohara, Ryotaro; Kamarulzaman, M. Shahrul Amir; Yasuda, Yuto; Izumi, Shintaro; Kawaguchi, Hiroshi (Kobe University, School of System Informatics, Kobe, Japan; Kobe University, School of Science Technology and Innovation, Kobe, Japan)

ISBN: (print) 9798350324136

Bathrooms can be slippery, increasing the risk of falling. In addition, because people enter the bathroom alone, it is difficult to detect accidents immediately when they occur. Therefore, a system is required to quickly detect accidents and call for assistance. In this study, the location and the posture of the bather are important attributes. We propose a method to estimate the location of a bather using ultrasound correlation values as input to a deep neural network and to estimate the bather's posture. In an experiment involving an individual moving throughout the bathroom, the mean absolute error was 11.7 cm for location estimation and the accuracy was 95.3% for posture estimation. © 2024 IEEE.

Keywords: Imaging; Location Estimation; Posture Estimation; Ultrasound; variational autoencoder
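
The two figures reported in the abstract above are a mean absolute error for location and a classification accuracy for posture. As a reminder of what these metrics measure, here is a minimal sketch; it assumes scalar position values in centimetres and string posture labels, which may not match how the study actually computed its location error.

```python
from typing import List, Sequence


def mean_absolute_error(predicted_cm: Sequence[float],
                        actual_cm: Sequence[float]) -> float:
    """Average absolute difference between predicted and true positions."""
    if len(predicted_cm) != len(actual_cm) or not actual_cm:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(abs(p - a) for p, a in zip(predicted_cm, actual_cm)) / len(actual_cm)


def accuracy(predicted: List[str], actual: List[str]) -> float:
    """Fraction of samples whose predicted label matches the true label."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Made-up values, not data from the study.
mae_cm = mean_absolute_error([10.0, 20.0], [12.0, 17.0])
acc = accuracy(["sit", "stand", "sit", "lie"], ["sit", "stand", "lie", "lie"])
```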

Modeling semantic and emotional relationship in multi-turn emotional conversations using multi-task learning

APPLIED INTELLIGENCE, 2022, vol. 52, no. 4, pp. 4663-4673

Authors: Cui, Fuwei; Di, Hui; Shen, Lei; Ouchi, Kazushige; Liu, Ze; Xu, Jinan (Beijing Jiaotong Univ, Sch Elect Informat Engn, Inst Adv Control Syst, 3 Shangyuan Rd, Haidian, Beijing 100044, Peoples R China; Toshiba China Co Ltd, 19 Dongfang East Rd, Beijing 100600, Peoples R China; Univ Chinese Acad Sci, Chinese Acad Sci, Inst Comp Technol, Key Lab Intelligent Informat Proc, Beijing 100190, Peoples R China; Beijing Jiaotong Univ, Sch Comp Informat Technol, 3 Shangyuan Rd, Haidian, Beijing 100044, Peoples R China)

Recognition and expression of emotion are key factors in the success of multi-turn conversations. Emotion recognition, which can help model the relationship between query and response, has been employed in single-turn conversation models. However, little work so far has focused on infusing the emotional factor into multi-turn conversation generation. To address this problem, we propose the Multi-turn Emotional Conversation Model (MECM), which uses multi-task learning to improve the ability to represent emotions in multi-turn conversations. MECM is based on a hierarchical latent variable model that uses the context hidden state to share common information. It also contains an emotion classifier to help the model recognize the emotion in the conversation, and a conversation generator to maintain consistency of content and transformation of emotion. Experimental results show that our model significantly improves the quality of responses in terms of diversity and empathy, and maintains better performance on semantic similarity compared with baseline methods.

Keywords: Multi-turn conversation; Emotion; Multi-task learning; variational autoencoder

Deep probabilistic time series forecasting using augmented recurrent input for dynamic systems

MECHANICAL SYSTEMS AND SIGNAL PROCESSING, 2022, vol. 177, p. 1

Authors: Liu, Haitao; Liu, Changjun; Jiang, Xiaomo; Chen, Xudong; Yang, Shuhua; Wang, Xiaofang (Dalian Univ Technol, Sch Energy & Power Engn, Dalian 116024, Peoples R China; Dalian Univ Technol, Digital Twin Lab Ind Equipment, Dalian 116024, Peoples R China; Shenyang Blower Works Grp Corp, Shenyang 110869, Peoples R China)

The demand for probabilistic time series forecasting has recently risen in various dynamic system scenarios, for example, system identification and prognostics and health management of machines. To this end, we combine advances in both deep generative models and state space models (SSMs) to come up with a novel, data-driven deep probabilistic sequence model. Specifically, we follow the popular encoder-decoder generative structure to build a recurrent neural network (RNN) assisted variational sequence model on an augmented recurrent input space, which could induce rich stochastic sequence dependency. Besides, in order to alleviate the inconsistency of the posterior between training and predicting and to improve the mining of dynamic patterns, we (i) propose using a lagged hybrid output as input for the posterior at the next time step, which brings training and predicting into alignment; and (ii) further devise a generalized auto-regressive strategy that encodes all the historical dependencies for the posterior. Thereafter, we first investigate the methodological characteristics of the proposed deep probabilistic sequence model on toy cases, and then comprehensively demonstrate the superiority of our model against existing deep probabilistic SSM models through extensive numerical experiments on eight system identification benchmarks from various dynamic systems. Finally, we apply our sequence model to a real-world centrifugal compressor forecasting problem, and again verify its outstanding performance by quantifying the time series predictive distribution.

Keywords: State space model; Dynamic system; Recurrent neural networks; variational inference; variational autoencoder; Compressor
