Search Results - Inner Mongolia University Library

A Specialized variational autoencoder for Cost-Efficient Pedestrian Trajectory Prediction

IEEJ TRANSACTIONS ON ELECTRICAL AND ELECTRONIC ENGINEERING, 2025

Authors: Li, Dongchen; Lin, Zhimao; Hu, Jinglu. Affiliation: Waseda Univ, Grad Sch Informat Prod & Syst, 2-7 Hibikino, Kitakyushu, Fukuoka 8080135, Japan

The prediction of pedestrian trajectories is a crucial and widely discussed topic in the field of AI-driven traffic scenarios. It is constrained by two factors. First, pedestrians are not bound by the same traffic rules as vehicles. Second, the computational power of in-vehicle systems is limited. These constraints make traditional methods difficult to apply. Previous methods have been observed to use redundant information, which can result in feature imbalance and model overfitting. In light of these limitations, we propose a lightweight conditional variational autoencoder model with post-process (L-CVAE-P) for pedestrian prediction scenarios. The L-CVAE-P focuses on the efficient interaction of multidimensional features to achieve a comprehensive enhancement of the model for real-world use. The model was tested on two public datasets and achieved state-of-the-art performance while maintaining efficiency. The experimental results demonstrate that our work has developed and optimized a pedestrian trajectory prediction model for practical applications. (c) 2025 The Author(s). IEEJ Transactions on Electrical and Electronic Engineering published by Institute of Electrical Engineers of Japan and Wiley Periodicals LLC.

Keywords: pedestrian trajectory conditional variational autoencoder temporal sequence prediction scenarios interaction multi-dimensional feature balance
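
Several records in this list rely on the conditional variational autoencoder (CVAE). As orientation only, the textbook CVAE training objective (the evidence lower bound) is sketched below. It is not the L-CVAE-P loss, which the abstract above does not state. The notation is assumed here: $x$ is the conditioning input (for example an observed past trajectory), $y$ the target (for example a future trajectory), $z$ the latent variable, and $\theta$, $\phi$ the decoder/prior and encoder parameters.

```latex
\log p_\theta(y \mid x) \;\ge\;
\mathbb{E}_{q_\phi(z \mid x, y)}\!\left[ \log p_\theta(y \mid x, z) \right]
\;-\; D_{\mathrm{KL}}\!\left( q_\phi(z \mid x, y) \,\|\, p_\theta(z \mid x) \right)
```

The first term rewards accurate reconstruction of $y$; the second keeps the encoder's posterior close to the conditional prior, so that sampling $z$ from $p_\theta(z \mid x)$ at test time yields plausible predictions.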


Improving Multi-Agent Trajectory Prediction Using Traffic States on Interactive Driving Scenarios


IEEE ROBOTICS AND AUTOMATION LETTERS, 2023, vol. 8, no. 5, pp. 2708-2715

Authors: Vishnu, Chalavadi; Abhinav, Vineel; Roy, Debaditya; Mohan, C. Krishna; Babu, Ch. Sobhan. Affiliations: Indian Inst Technol Hyderabad, Dept Comp Sci & Engn, Hyderabad 502285, Telangana, India; Inst High Performance Comp (IHPC), Agcy Sci Technol & Res (ASTAR), Singapore City 138632, Singapore

Predicting trajectories of multiple agents in interactive driving scenarios such as intersections and roundabouts is challenging due to the high density of agents, varying speeds, and environmental obstacles. Existing approaches use relative distance and semantic maps of intersections to improve trajectory prediction. However, drivers base their driving decisions on the overall traffic state of the intersection and the surrounding vehicles. We therefore propose to use traffic states, which denote the changing spatio-temporal interaction between neighboring vehicles, to improve trajectory prediction. An example of a traffic state is a clump state, which denotes that the vehicles are moving close to each other, i.e., congestion is forming. We develop three prediction models with different architectures, namely Transformer-based (TS-Transformer), Generative Adversarial Network-based (TS-GAN), and conditional variational autoencoder-based (TS-CVAE). We show that traffic state-based models consistently predict better future trajectories than the vanilla models. TS-Transformer produces state-of-the-art results on two challenging interactive trajectory prediction datasets, namely Eye-on-Traffic (EOT) and INTERACTION. Our qualitative analysis shows that the trajectories of traffic state-based models are better aligned with the ground truth.

Keywords: Trajectory Predictive models History Hidden Markov models Transformers Roads Decoding Trajectory prediction generative adversarial networks conditional variational autoencoder transformers


Neural Motion Planning for Autonomous Parking


INTERNATIONAL JOURNAL OF CONTROL AUTOMATION AND SYSTEMS, 2023, vol. 21, no. 4, pp. 1309-1318

Authors: Kim, Dongchan; Huh, Kunsoo. Affiliation: Hanyang Univ, Dept Automot Engn, Seoul 04763, South Korea

This paper presents a hybrid motion planning strategy that combines a deep generative network with a conventional motion planning method. Existing planning methods such as A* and Hybrid A* are widely used in path planning tasks because of their ability to determine feasible paths even in complex environments; however, they have limitations in terms of efficiency. To overcome these limitations, a path planning algorithm based on a neural network, namely the neural Hybrid A*, is introduced. This paper proposes using a conditional variational autoencoder (CVAE) to guide the search algorithm by exploiting the ability of the CVAE to learn information about the planning space given the information of the parking environment. An efficient expansion strategy is utilized based on a distribution of feasible trajectories learned from the demonstrations. The proposed method effectively learns the representations of a given state and shows improvement in terms of computational time and the number of nodes expanded, both of which are related to algorithm performance.

Keywords: Autonomous parking conditional variational autoencoder efficient state expansion hybrid A* algorithm neural motion planning


Seformer: a long sequence time-series forecasting model based on binary position encoding and information transfer regularization


APPLIED INTELLIGENCE, 2023, vol. 53, no. 12, pp. 15747-15771

Authors: Zeng, Pengyu; Hu, Guoliang; Zhou, Xiaofeng; Li, Shuai; Liu, Pengjie. Affiliations: Chinese Acad Sci, Key Lab Networked Control Syst, Shenyang 110000, Peoples R China; Chinese Acad Sci, Shenyang Inst Automat, Shenyang 110000, Peoples R China; Chinese Acad Sci, Institutes Robot & Intelligent Mfg, Shenyang 110000, Peoples R China; Univ Chinese Acad Sci, Beijing 100000, Beijing, Peoples R China

Long sequence time-series forecasting (LSTF) problems, such as weather forecasting, stock market forecasting, and power resource management, are widespread in the real world. The LSTF problem requires a model with high prediction accuracy. Recent studies have shown that the transformer model architecture is the most promising model structure for LSTF problems compared with other model architectures. The transformer model has the property of permutation equivalence, which leads to the importance of sequence position encoding, an essential process in model training. Currently, the continuous dynamics models constructed for position encoding using the neural differential equations (neural ODEs) method can model sequence position information well. However, we have found that there are some limitations when neural ODEs are applied to the LSTF problem, including the time cost problem, the baseline drift problem, and the information loss problem; thus, neural ODEs cannot be directly applied to the LSTF problem. To address this problem, we design a binary position encoding-based regularization model for long sequence time-series prediction, named Seformer, which has the following structure: 1) The binary position encoding mechanism, including intrablock and interblock position encoding. For intrablock position encoding, we design a simple ODE method by discretizing the continuum dynamics model, which reduces the time cost required to compute neural ODEs while maintaining their dynamics properties to the maximum extent. In interblock position encoding, a chunked recursive form is adopted to alleviate the baseline drift problem caused by eigenvalue explosion. 2) Information transfer regularization mechanism: By regularizing the model intermediate hidden variables as well as the encoder-decoder connection variables, we can reduce information loss during the model training process while ensuring the smoothness of the position information. Extensive experimental results obtained

Keywords: Long sequence time-series forecasting Transformer Position encoding Regularization method conditional variational autoencoder


A semantic and emotion-based dual latent variable generation model for a dialogue system


CAAI Transactions on Intelligence Technology, 2023, vol. 8, no. 2, pp. 319-330

Authors: Ming Yan; Xingrui Lou; Chien Aun Chan; Yan Wang; Wei Jiang. Affiliations: State Key Laboratory of Media Convergence and Communication, Communication University of China, Beijing, China; School of Information and Communications Engineering, Communication University of China, Beijing, China; Key Laboratory of Acoustic Visual Technology and Intelligent Control System, Communication University of China, Beijing, China; Department of Electrical and Electronic Engineering, The University of Melbourne, Melbourne, Victoria, Australia; School of Data Science and Intelligent Media, Communication University of China, Beijing, China

With the development of intelligent agents pursuing humanisation, artificial intelligence must consider emotion, the most basic spiritual need in human *** emotional dialogue systems usually use an external emotional dictionary to select appropriate emotional words to add to the response or concatenate emotional tags and semantic features in the decoding step to generate appropriate ***, selecting emotional words from a fixed emotional dictionary may result in loss of the diversity and consistency of the *** propose a semantic and emotion-based dual latent variable generation model (Dual-LVG) for dialogue systems, which is able to generate appropriate emotional responses without an emotional *** from previous work, the conditional variational autoencoder (CVAE) adopts the standard transformer ***, Dual-LVG regularises the CVAE latent space by introducing a dual latent space of semantics and *** content diversity and emotional accuracy of the generated responses are improved by learning emotion and semantic features ***, the average attention mechanism is adopted to better extract semantic features at the sequence level, and the semi-supervised attention mechanism is used in the decoding step to strengthen the fusion of emotional features of the *** results show that Dual-LVG can successfully achieve the effect of generating different content by controlling emotional factors.

Keywords: conditional variational autoencoder dual latent space emotional responses latent variable generation


Knowledge Base Embedding for Sampling-Based Prediction


ACM TRANSACTIONS ON INFORMATION SYSTEMS, 2023, vol. 41, no. 2, pp. 1-25

Authors: Zhang, Richong; Kim, Jaein; Mei, Jiajie; Mao, Yongyi. Affiliations: Beihang Univ, Sch Comp Sci & Engn, SKLSDE, 37 Xueyuan Rd, Beijing, Peoples R China; Univ Ottawa, Sch Elect Engn & Comp Sci, 75 Laurier Ave East, Ottawa, ON K1N 6N5, Canada

Each link prediction task requires a different degree of answer diversity. While one link prediction task may expect up to a couple of answers, another may expect nearly a hundred answers. Given this fact, the performance of a link prediction model can be estimated more accurately if a flexible number of obtained answers are estimated instead of a predefined number of answers. Inspired by this, in this article, we analyze two evaluation criteria for link prediction tasks, namely the ranking-based protocol and the sampling-based protocol. Furthermore, we study two classes of models on the link prediction task, namely direct models and latent-variable models, to demonstrate that latent-variable models perform better under the sampling-based protocol. We then propose a latent-variable model in which the framework of the conditional variational autoencoder (CVAE) is applied. An experimental study suggests that the proposed model performs comparably to the current state-of-the-art even under the conventional rank-based protocol. Under the sampling-based protocol, the proposed model is shown to outperform various state-of-the-art models.

Keywords: Link prediction Knowledge Base Embedding conditional variational autoencoder
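
The abstract above contrasts a ranking-based with a sampling-based evaluation protocol but does not define either precisely. The following is a minimal Python sketch of how the two ideas differ, under assumptions made here: the ranking-based check is modeled as Hits@k over a score table, and the sampling-based check as precision and recall of the distinct candidates drawn from a sampler. The function names, the toy data, and the choice of metrics are hypothetical and are not taken from the paper.

```python
def hits_at_k(scores, true_answers, k):
    """Ranking-based check (assumed form): is any true answer among the
    k highest-scored candidates? The cut-off k is fixed in advance."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return any(candidate in true_answers for candidate in ranked[:k])


def sampling_precision_recall(sampler, true_answers, n_samples):
    """Sampling-based check (assumed form): draw n_samples candidates and
    compare the set of distinct draws with the true answer set, so the
    number of returned answers is not fixed in advance."""
    drawn = {sampler() for _ in range(n_samples)}
    hits = drawn & true_answers
    precision = len(hits) / len(drawn) if drawn else 0.0
    recall = len(hits) / len(true_answers) if true_answers else 0.0
    return precision, recall


# Toy data (hypothetical): four candidate entities, two correct answers.
scores = {"a": 0.9, "b": 0.5, "c": 0.4, "d": 0.1}
true_answers = {"a", "c"}

# A deterministic stand-in for a model that samples answers.
_draws = iter(["a", "c", "a", "d"])
precision, recall = sampling_precision_recall(lambda: next(_draws), true_answers, 4)
```

With this toy data, the ranking check at k = 1 succeeds because "a" is top-ranked, while the sampling check returns three distinct candidates, two of them correct, so both answers are recovered at a precision of 2/3.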


Operational safety of automated and human driving in mixed traffic environments: A perspective of car-following behavior


PROCEEDINGS OF THE INSTITUTION OF MECHANICAL ENGINEERS PART O-JOURNAL OF RISK AND RELIABILITY, 2023, vol. 237, no. 2, pp. 355-366

Authors: Li, Tao; Han, Xu; Ma, Jiaqi; Ramos, Marilia; Lee, Changju. Affiliations: Univ Calif Los Angeles, 420 Westwood Plaza, 4731G Boelter Hall, Los Angeles, CA 90095, USA; Virginia Transportat Res Council, Charlottesville, VA, USA

The advent of automated vehicles (AVs) will provide opportunities for safer, smoother, and smarter road transportation. During the transition from the current human-driven vehicle (HV) to a fully AV traffic environment, there will be a mixed traffic flow including both HVs and AVs. The impact of introducing AVs into existing traffic, however, has not yet been fully understood. In this paper, we advance this understanding by conducting mixed traffic safety evaluation from the perspective of car-following behavior using real-world AV operational data of mixed traffic. To understand how the AVs impact other vehicles on the road, we analyzed the operational behaviors of HV-following-HV, AV-following-HV, and HV-following-AV. A selected car-following model is calibrated, and results show that there are significant differences between the HV-following-HV and the other two groups, indicating safe AV behavior and changes in HV behavior (i.e. less aggressive, safer) after the introduction of AVs into the traffic. Additionally, to understand AV behavioral safety, we investigate behavior predictions (one of the most critical inputs for AVs to make car-following decisions) of AVs and their surrounding vehicles using a mature baseline model and a new conditional variational autoencoder (CVAE) framework. The result shows potential risks of inaccurate predictions of the baseline model and the necessity to consider additional factors, such as vehicle interactions and driver behavior, into the prediction for risk mitigation. Arterial vehicle trajectory data from the Lyft Level 5 Dataset is applied to test the proposed methodological framework to understand the car-following safety risks of HVs and AVs in the mixed traffic stream.

Keywords: Automated vehicles operational safety trajectory prediction conditional variational autoencoder car following behavior risk


PREDICTING PEDESTRIAN TRAJECTORIES IN ARCHITECTURAL SPACES: A GRAPH NEURAL NETWORK APPROACH


29th International Conference of the Association-for-Computer-Aided-Architectural-Design-Research-in-Asia (CAADRIA)

Authors: Yang, Runyu; Wang, Weili; Gui, Peng. Affiliation: Glodon Co Ltd, Beijing, Peoples R China

ISBN: (print) 9789887891819

This paper introduces a graph neural network-based model for predicting pedestrian trajectories in architectural spaces. Compared to traditional simulations based on physics-based models, this data-driven model has a stronger ability to learn and predict pedestrian behaviour patterns from real-world data. The model is pre-trained on the Hongqiao Railway Station Dataset, then trained and tested on the ETH Dataset and the Stanford Drone Dataset, enabling comparisons with other AI models. By creating a more intelligent model, we can establish a digital replica of the real world that can predict pedestrian flow with higher accuracy in daily life or in extreme situations such as sudden fires. Our results underscore the critical role of such models in comprehending how architectural spaces are utilized, and thus in improving architectural design and urban planning.

Keywords: Multi-agent Simulation Trajectory Prediction Graph Neural Network conditional variational autoencoder Path-finding


Spherical Image Generation From a Few Normal-Field-of-View Images by Considering Scene Symmetry


IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE, 2023, vol. 45, no. 5, pp. 6339-6353

Authors: Hara, Takayuki; Mukuta, Yusuke; Harada, Tatsuya. Affiliations: Univ Tokyo, Res Ctr Adv Sci & Technol, Tokyo 1138654, Japan; RIKEN, Ctr Adv Intelligence Project, Tokyo 1030027, Japan

Spherical images taken in all directions (360 degrees by 180 degrees) can represent an entire space including the subject, providing free direction viewing and an immersive experience to viewers. It is convenient and expands the usage scenarios to generate a spherical image from a few normal-field-of-view (NFOV) images, which are partial observations. The primary challenge is generating a plausible image and controlling the high degree of freedom involved in generating a wide area that includes all directions. We focus on scene symmetry, which is a basic property of the global structure of spherical images, such as the rotational and plane symmetries. We propose a method for generating a spherical image from a few NFOV images and controlling the generated regions using scene symmetry. We incorporate the intensity of the symmetry as a latent variable into conditional variational autoencoders to estimate the possible range of symmetry and decode a spherical image whose features are represented through a combination of symmetric transformations of the NFOV image features. Our experiments show that the proposed method can generate various plausible spherical images controlled from asymmetrically to symmetrically, and can reduce the reconstruction errors of the generated images based on the estimated symmetry.

Keywords: Image synthesis Image reconstruction Task analysis Gravity Cameras Rendering (computer graphics) Recording Spherical image image generation conditional variational autoencoder symmetry estimation symmetry control


FedBKD: Heterogenous Federated Learning via Bidirectional Knowledge Distillation for Modulation Classification in IoT-Edge System


IEEE JOURNAL OF SELECTED TOPICS IN SIGNAL PROCESSING, 2023, vol. 17, no. 1, pp. 189-204

Authors: Qi, Peihan; Zhou, Xiaoyu; Ding, Yuanlei; Zhang, Zhengyu; Zheng, Shilian; Li, Zan. Affiliations: Xidian Univ, State Key Lab Integrated Serv Networks, Xian 710071, Peoples R China; 011 Res Ctr Sci & Technol Commun Informat Secur Control Lab, Jiaxing 314033, Peoples R China

Benefiting from the rapid evolution of artificial intelligence and wireless communication technology, diverse Internet of Things (IoT) devices with edge computing ability have widely penetrated every aspect of daily human life. However, the deviations of private datasets and the heterogeneity of local models caused by differences in device composition and application scenarios have hampered the aggregation of a global recognition model in the modulation classification task, thus severely constraining the classification performance of intelligent IoT-edge devices. To address this problem, we propose a heterogenous Federated learning framework based on Bidirectional Knowledge Distillation (FedBKD) for IoT systems, which integrates knowledge distillation into the local model upload (client-to-cloud) and global model download (cloud-to-client) steps of federated learning. The client-to-cloud distillation is regarded as a process of multi-teacher knowledge distillation, and the global network is regarded as a student network that unifies the heterogeneous knowledge from multiple local teacher networks. A public dataset is generated by a conditional variational autoencoder (CVAE) and stored in the cloud server to support the acquisition of heterogeneous knowledge without sharing the private data of IoT devices. The cloud-to-client distillation is a single-teacher-multiple-students process, which distills the knowledge from the single global model back to multiple heterogeneous local networks; partial knowledge distillation is used in this process. We implement our FedBKD method in the modulation classification task, and the simulation results have proven the effectiveness of our proposed method.

Keywords: Data models Internet of Things Federated learning Modulation Knowledge engineering Adaptation models Training IoT federated learning model heterogeneity knowledge distillation conditional variational autoencoder
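
The client-to-cloud step described in the abstract above treats several local models as teachers and the global model as a student. A minimal Python sketch of one common way to express such a multi-teacher distillation loss follows. It is not the FedBKD loss, which the abstract does not specify: averaging the teachers' softened outputs, the temperature value, and the KL direction are all assumptions made here, and the function names are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Temperature-softened softmax, shifted by the maximum for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def multi_teacher_kd_loss(student_logits, teacher_logits_list, temperature=2.0):
    """KL(mean teacher distribution || student distribution) on softened outputs.

    Assumption: teachers are combined by a plain average of their softened
    probabilities; weighted or selective schemes are equally possible.
    """
    teacher_probs = [softmax(t, temperature) for t in teacher_logits_list]
    n_teachers = len(teacher_probs)
    mean_teacher = [
        sum(p[i] for p in teacher_probs) / n_teachers
        for i in range(len(student_logits))
    ]
    student = softmax(student_logits, temperature)
    return sum(t * math.log(t / s) for t, s in zip(mean_teacher, student) if t > 0)
```

The loss is zero when the student reproduces the averaged teacher distribution and grows as the two diverge. Many implementations additionally scale it by the squared temperature and mix it with a cross-entropy term on labeled data; both are omitted here for brevity.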
