Search Results - Inner Mongolia University Library

arXiv, 2025

Authors: Villar-Corrales, Angel; Behnke, Sven

Affiliations: Computer Science Institute VI – Intelligent Systems and Robotics Center for Robotics The Lamarr Institute for Machine Learning and Artificial Intelligence Germany

Predicting future scene representations is a crucial task for enabling robots to understand and interact with the environment. However, most existing methods rely on video sequences and simulations with precise action annotations, limiting their ability to leverage the large amount of available unlabeled video data. To address this challenge, we propose PlaySlot, an object-centric video prediction model that infers object representations and latent actions from unlabeled video sequences. It then uses these representations to forecast future object states and video frames. PlaySlot can generate multiple possible futures conditioned on latent actions, which can be inferred from video dynamics, provided by a user, or generated by a learned action policy, thus enabling versatile and interpretable world modeling. Our results show that PlaySlot outperforms both stochastic and object-centric baselines for video prediction across different environments. Furthermore, we show that our inferred latent actions can be used to learn robot behaviors sample-efficiently from unlabeled video demonstrations. Videos and code are available at https://***/PlaySlot/. © 2025, CC BY.

Keywords: Stochastic systems

A Translation-Tolerant Place Recognition Method by Viewpoint Unification

27th IEEE International Conference on Intelligent Transportation Systems, ITSC 2024

Authors: Zheng, Linwei; Hu, Xiangcheng; Ma, Fulong; Zhao, Guoyang; Qi, Weiqing; Ma, Jun; Liu, Ming

Affiliations: The Hong Kong University of Science and Technology Department of Electronic and Computer Engineering Hong Kong Robotics and Autonomous Systems Thrust Guangzhou China The Hong Kong University of Science and Technology Division of Emerging Interdisciplinary Areas Hong Kong

ISBN (print): 9798331505929

Place recognition serves as a fundamental component in tasks like loop closure detection and relocalization for mobile robots. Polar coordinate representations, such as Scan Context, which align with the data structure of range sensors, have become the most common data structure for point cloud descriptors in place recognition. While polar representations provide rotation invariance, they remain susceptible to translation variations. In this study, we introduce a novel approach: shifting the viewpoint of the original point cloud to construct the unified Scan Context, thereby mitigating translation variance. Our key concept focuses on identifying a stable, unified viewpoint for a given place and then pre-translating the point cloud accordingly. This naturally results in a descriptor devoid of translation variance. Importantly, within a given place, the viewpoint unification process tends to relocate the viewpoint to a similar position, irrespective of the original sensor perspective. In other words, the unified Scan Context becomes more closely associated with the place's structural characteristics than with the physical location of the sensor. We validate our method through a series of comprehensive experiments encompassing synthetic scenarios and real-world datasets, showcasing its robustness in effectively handling translation variations. © 2024 IEEE.
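The pre-translation idea in the abstract above can be illustrated with a minimal sketch. It builds a Scan Context style polar descriptor (ring and sector bins holding the maximum point height) and shifts the cloud to a common viewpoint first. The abstract does not say how the unified viewpoint is chosen, so the horizontal centroid used in `unify_viewpoint` is purely an assumption for illustration, as are the function names and bin sizes.

```python
import math

def scan_context(points, num_rings=4, num_sectors=8, max_range=8.0):
    """Polar descriptor: each (ring, sector) cell stores the max point height.

    Cells start at 0.0, so this sketch assumes non-negative heights.
    """
    desc = [[0.0] * num_sectors for _ in range(num_rings)]
    for x, y, z in points:
        r = math.hypot(x, y)
        if r >= max_range:
            continue  # ignore points beyond the descriptor range
        theta = math.atan2(y, x) % (2 * math.pi)
        ring = min(int(r / max_range * num_rings), num_rings - 1)
        sector = min(int(theta / (2 * math.pi) * num_sectors), num_sectors - 1)
        desc[ring][sector] = max(desc[ring][sector], z)
    return desc

def unify_viewpoint(points):
    """Hypothetical stand-in for viewpoint unification: recenter the cloud
    on its horizontal centroid (NOT the method from the paper)."""
    n = len(points)
    cx = sum(p[0] for p in points) / n
    cy = sum(p[1] for p in points) / n
    return [(x - cx, y - cy, z) for x, y, z in points]

# The same structure seen from two sensor positions (pure translation).
cloud = [(1.0, 0.0, 1.0), (0.0, 2.0, 0.5), (-3.0, 0.0, 2.0), (0.0, -1.0, 0.25)]
moved = [(x + 2.0, y - 1.0, z) for x, y, z in cloud]

raw_same = scan_context(cloud) == scan_context(moved)
unified_same = (scan_context(unify_viewpoint(cloud))
                == scan_context(unify_viewpoint(moved)))
```

In this toy case the raw descriptors differ between the two sensor positions, while the descriptors built after recentering coincide. A real scan pair would also differ in which points are visible, which this sketch ignores.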

Keywords: Data structures

Wekws: A Production First Small-Footprint End-to-End Keyword Spotting Toolkit

48th IEEE International Conference on Acoustics, Speech and Signal Processing, ICASSP 2023

Authors: Wang, Jie; Xu, Menglong; Hou, Jingyong; Zhang, Binbin; Zhang, Xiao-Lei; Xie, Lei; Pan, Fuping

Affiliations: Northwestern Polytechnical University School of Marine Science and Technology Xi'an China WeNet Open Source Community China Horizon Robotics Beijing China School of Computer Science Xi'an China

ISBN (print): 9781728163277

Keyword spotting (KWS) enables speech-based user interaction and is gradually becoming an indispensable component of smart devices. Recently, end-to-end (E2E) methods have become the most popular approach for on-device KWS tasks. However, there is still a gap between the research and deployment of E2E KWS methods. In this paper, we introduce WeKws, a production-quality, easy-to-build, and convenient-to-be-applied E2E KWS toolkit. WeKws contains the implementations of several state-of-the-art backbone networks, enabling it to achieve highly competitive results on three publicly available datasets. To make WeKws a pure E2E toolkit, we utilize a refined max-pooling loss to make the model learn the ending position of the keyword by itself, which significantly simplifies the training pipeline and makes WeKws very efficient to apply in real-world scenarios. The toolkit is publicly available at https://***/wenet-e2e/wekws. © 2023 IEEE.

Keywords: computer vision

Autonomous 3D Reconstruction with Small Unmanned Aerial Vehicle (UAV) using Structure from Motion: Practical Aspects

AIAA Science and Technology Forum and Exposition, AIAA SciTech Forum 2025

Authors: Rayon, Nathan; Stevens, Raymond; Sevil, Hakki Erhan

Affiliations: Department of Computer Science University of West Florida FL32514 United States Department of Electrical and Computer Engineering University of West Florida FL32514 United States Department of Intelligent Systems & Robotics University of West Florida FL32514 United States

ISBN (digital): 9781624107238

ISBN (print): 9781624107238

This paper explores the practical considerations and challenges involved in achieving autonomous 3D reconstruction utilizing small Unmanned Aerial Vehicles (UAVs) through the framework of Structure from Motion (SFM). The utilization of small UAVs equipped with lightweight cameras presents a promising resource for cost-effective and rapid 3D mapping in various domains such as disaster response, infrastructure inspection, and search and rescue. This study examines the critical practical aspects of this process, including the height level for taking images, UAV selection, number of images, image acquisition strategies, and computational time. The effectiveness and limitations of autonomous 3D reconstruction with small UAVs are summarized through practical insights, with discussions on potential advancements and areas for future research. This paper serves as a guide for researchers interested in utilizing small UAVs for autonomous 3D reconstruction applications. © 2025, American Institute of Aeronautics and Astronautics Inc, AIAA. All rights reserved.

Keywords: Unmanned aerial vehicles (UAV)

SceneSense: Diffusion Models for 3D Occupancy Synthesis from Partial Observation

IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS)

Authors: Alec Reed; Brendan Crowe; Doncey Albin; Lorin Achey; Bradley Hayes; Christoffer Heckman

Affiliations: Department of Computer Science Intelligent Robotics Laboratory University of Colorado Boulder

ISBN (digital): 9798350377705

ISBN (print): 9798350377712

When exploring new areas, robotic systems generally exclusively plan and execute controls over geometry that has been directly measured. This planning paradigm can lead to unintuitive exploration or replanning latency when entering areas that were previously obstructed from view. To address this, we present SceneSense, a real-time 3D diffusion model for synthesizing 3D occupancy information from partial observations that effectively predicts these occluded or out-of-view geometries for use in future planning and control frameworks. SceneSense uses a running occupancy map and a single RGB-D camera to generate predicted geometry around the platform at runtime, even when the geometry is occluded or out of view. Our architecture ensures that SceneSense never overwrites observed free or occupied space. By preserving the integrity of the observed map, SceneSense mitigates the risk of corrupting the observed space with generative predictions. While SceneSense is shown to operate well using a single RGB-D camera, the framework is flexible enough to extend to additional modalities. Unlike existing models that necessitate multiple views and offline scene synthesis, or are focused on filling gaps in observed data, our findings demonstrate that SceneSense is an effective approach to estimating unobserved local occupancy information at runtime. Local occupancy predictions from SceneSense are shown to better represent the ground truth occupancy distribution during the test exploration trajectories than the running occupancy map. The source code can be found on our website: https://***/scenesense/
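The guarantee stated in the abstract above, that predictions never overwrite observed free or occupied space, can be pictured as a simple masking rule. The sketch below is only an illustration of that invariant on a small 2D grid; it is not the SceneSense architecture, and the cell labels and the `merge_prediction` helper are hypothetical names introduced here.

```python
# Hypothetical cell labels for a toy occupancy grid.
UNKNOWN, FREE, OCCUPIED = -1, 0, 1

def merge_prediction(observed, predicted):
    """Keep every observed cell; fill only UNKNOWN cells from the prediction."""
    return [
        [pred if obs == UNKNOWN else obs for obs, pred in zip(obs_row, pred_row)]
        for obs_row, pred_row in zip(observed, predicted)
    ]

observed = [
    [FREE, UNKNOWN],
    [OCCUPIED, UNKNOWN],
]
predicted = [
    [OCCUPIED, OCCUPIED],  # disagrees with the observed FREE cell
    [FREE, FREE],          # disagrees with the observed OCCUPIED cell
]

merged = merge_prediction(observed, predicted)
```

Here the two observed cells keep their measured values even though the prediction disagrees with them, and only the two unknown cells take predicted values.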

Keywords: Geometry; Three-dimensional displays; Runtime; Source coding; Diffusion models; Cameras; Real-time systems; Planning; Trajectory; Intelligent robots

From News to Forecast: Integrating Event Analysis in LLM-Based Time Series Forecasting with Reflection

38th Conference on Neural Information Processing Systems, NeurIPS 2024

Authors: Wang, Xinlei; Feng, Maike; Qiu, Jing; Gu, Jinjin; Zhao, Junhua

Affiliations: School of Electrical and Computer Engineering The University of Sydney Australia School of Science and Engineering The Chinese University of Hong Kong Shenzhen China Shenzhen Institute of Artificial Intelligence and Robotics for Society China

This paper introduces a novel approach that leverages Large Language Models (LLMs) and Generative Agents to enhance time series forecasting by reasoning across both text and time series data. With language as a medium, our method adaptively integrates social events into forecasting models, aligning news content with time series fluctuations to provide richer insights. Specifically, we utilize LLM-based agents to iteratively filter out irrelevant news and employ human-like reasoning to evaluate predictions. This enables the model to analyze complex events, such as unexpected incidents and shifts in social behavior, and continuously refine the selection logic of news and the robustness of the agent's output. By integrating selected news events with time series data, we fine-tune a pre-trained LLM to predict sequences of digits in time series. The results demonstrate significant improvements in forecasting accuracy, suggesting a potential paradigm shift in time series forecasting through the effective utilization of unstructured news data. © 2024 Neural information processing systems foundation. All rights reserved.

Numerical Method for Complete Solution of the Optimal Control Problem

9th International Conference on Control, Decision and Information Technologies, CoDIT 2023

Author: Diveev, Askhat

Affiliation: Federal Research Center 'Computer Science and Control' The Russian Academy of Sciences Department of Robotics Control Vavilova str. 44 Moscow119333 Russia

ISBN (print): 9798350311402

This work is devoted to the complete numerical solution of the optimal control problem. The complete solution means the solution of the optimal control problem together with the solution of the control synthesis problem, which stabilizes the movement of the control object along the optimal trajectory that was found. To solve this problem, evolutionary computations and symbolic regression are used. First, the optimal control problem in its classical formulation is solved by an evolutionary algorithm; then the control synthesis problem is solved by a symbolic regression method. The statement of the complete optimal control problem is presented. The computational experiment considers the solution of the complete optimal control problem for a quadcopter. © 2023 IEEE.

Keywords: Numerical methods

Exploring Intermittent Dynamics of Neural Activity in a Single-Pair System of Excitatory and Inhibitory Neurons

3rd International Conference on Emerging Techniques in Computational Intelligence, ICETCI 2023

Authors: Sugawara, Akio; Nobukawa, Sou; Wagatsuma, Nobuhiko; Inagaki, Keiichiro

Affiliations: Chiba Institute of Technology Department of Computer Science Narashino Japan Toho University Department of Information Science Funabashi Japan Chubu University Department of Artificial Intelligence and Robotics Kasugai Japan

ISBN (print): 9798350300604

A long-tailed property has been observed at various levels of the brain, ranging from neural population activity to the level of cognitive neurodynamics. Specifically, at the cognitive neurodynamics level, a phenomenon called perceptual alternation displays perceptual durations that follow a gamma or log-normal distribution. Moreover, this perceptual alternation becomes nondeterministic. The occurrence of alternation phenomena has also been noticed in neural populations at the cognitive neurodynamics level. Even in a system consisting of a single pair of excitatory and inhibitory neurons, a similar intermittent alternation of neural activity, referred to as chaos-chaos intermittency (CCI), emerges, involving intermittent transitions between multiple isolated attractors. In this study, our hypothesis was that nondeterminism and long-tailed properties can emerge in this single-pair system. To test this hypothesis, we evaluated the determinism of two types of dynamics in the system that couples excitatory and inhibitory neurons: 1) transitions between attractors and 2) behavior within attractors. This evaluation was performed on multiple time scales using multi-scale entropy analysis (MSE) and an iterated amplitude-adjusted Fourier transform (IAAFT) surrogate. The results demonstrated that the transitions between attractors were nondeterministic, while the behavior within attractors was deterministic. Furthermore, a long-tailed property manifested in the duration of the transitions between attractors, possibly resulting from long-term evaluation in the chaotic unstable orbit. The emergence of this long-tailed property was attributed to the low-frequency CCI around the attractor-merging bifurcation. This discovery contributes to the understanding of the long-tailed properties exhibited by neural activity across multiple levels. © 2023 IEEE.
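For readers unfamiliar with the multi-scale entropy analysis mentioned in the abstract above, a minimal sketch of the usual recipe follows: coarse-grain the series by averaging non-overlapping windows, then compute sample entropy at each scale. The parameter defaults (`m=2`, an absolute tolerance `r=0.2`) and the function names are assumptions for illustration. Published analyses typically set `r` relative to the standard deviation of the original series, and nothing here reproduces the paper's settings.

```python
import math

def coarse_grain(series, scale):
    """Average consecutive non-overlapping windows of length `scale`."""
    n = len(series) // scale
    return [sum(series[i * scale:(i + 1) * scale]) / scale for i in range(n)]

def sample_entropy(series, m=2, r=0.2):
    """SampEn = -ln(A / B), where B and A count template pairs that match
    (Chebyshev distance <= r, self-matches excluded) at length m and m + 1."""
    n = len(series)

    def count_matches(length):
        # Use the same number of templates (n - m) for both lengths.
        templates = [series[i:i + length] for i in range(n - m)]
        count = 0
        for i in range(len(templates)):
            for j in range(i + 1, len(templates)):
                dist = max(abs(a - b) for a, b in zip(templates[i], templates[j]))
                if dist <= r:
                    count += 1
        return count

    b = count_matches(m)
    a = count_matches(m + 1)
    if a == 0 or b == 0:
        return float("inf")  # undefined when no matches are found
    return -math.log(a / b)

def multiscale_entropy(series, scales, m=2, r=0.2):
    """Sample entropy of the coarse-grained series at each requested scale."""
    return [sample_entropy(coarse_grain(series, s), m, r) for s in scales]

# A strictly periodic signal is perfectly predictable at these scales.
periodic = [0.0, 1.0] * 10
mse_values = multiscale_entropy(periodic, scales=[1, 2], r=0.1)
```

For the periodic toy signal every template that matches at length `m` also matches at length `m + 1`, so the entropy is zero at both scales. Irregular signals give positive values whose trend across scales is what such an analysis inspects.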

Keywords: Brain

Securing Mobile Robots Multi-Party Authentication Technique Using Modified Elliptic Curve Cryptography

2023 International Conference on Advanced Computing and Communication Technologies, ICACCTech 2023

Authors: Haldar, Bilas; Jha, Prabin Kumar

Affiliations: The Neotia University Computer Science & Engineering West Bengal Sarisha743368 India The Neotia University Robotics & Automation West Bengal Sarisha743368 India

ISBN (print): 9798350380880

As mobile robots continue to play a pivotal role in various industries, safeguarding their operations against unauthorized access and cyber-attacks becomes increasingly critical. The burgeoning adoption of ubiquitous services across a variety of sectors has led to an exponential rise in mobile robot users. In this scenario, the authentication of mobile robots becomes a crucial requirement for ensuring data security and confidentiality. To protect the authenticity of confidential information, it is necessary to provide strong authentication systems as these robots become key participants in providing a range of services. This work proposes an innovative approach to enhance the security of mobile robots by implementing Multi-Party Authentication (MPA) techniques using a modified Elliptic Curve Cryptography (ECC) algorithm. Traditional authentication methods often prove insufficient when it comes to the unique challenges faced by mobile robots, including remote control vulnerabilities, data integrity concerns, and the potential for physical tampering. The present work is a novel, efficient, and robust cryptography technique for generating keys using modified ECC and the arithmetic greatest common divisor for multi-party communications. Additionally, the proposed work presents a novel methodology of encryption and decryption techniques for the distribution of the authentication keys using a modified ECC algorithm. The results show that the proposed technique ensures that only authorized users are able to control mobile robots and provides protection against attacks. © 2023 IEEE.

Keywords: Authentication

Multiphysics Research and Loss Calculation Considering the Fluid Regime of the Airgap Based on High-Speed PMSM

26th International Conference on Electrical Machines and Systems, ICEMS 2023

Authors: Su, Xiangdong; Zhao, Hang; Li, Fang

Affiliations: Robotics and Autonomous Systems Thrust Guangzhou China The Hong Kong University of Science and Technology Department of Electronic & Computer Engineering Hong Kong Hong Kong

ISBN (print): 9798350317589

This paper focuses on the calculation of different kinds of mechanical loss and puts forward a complete method to calculate the mechanical loss for high-speed PMSM accurately. Windage loss, bearing loss, and cooling fan loss are calculated concretely, and the flow regime of the airgap is analyzed for windage loss calculation. Then, the equivalent thermal conductivity of the airgap in the turbulent regime is calculated; it turns out to be much larger than that of stationary air and cannot be ignored in high-fidelity modeling. Later, the two-way electromagnetic-thermal coupling method is applied in the multiphysics design. The results show that the flow regime of the airgap cannot be neglected in windage loss calculation, and that the mechanical loss calculation method put forward in this paper is more complete than those in previous research. Therefore, it can predict the electromagnetic-thermal performance of high-speed PMSM more accurately. © 2023 IEEE.

Keywords: Multiphysics
