We introduce Riemannian Flow Matching Policies (RFMP), a novel model for learning and synthesizing robot sensorimotor policies. RFMP leverages the efficient training and inference capabilities of flow matching methods...
Video moment retrieval and highlight detection are two highly valuable tasks in video understanding, but only recently have they been studied jointly. Although existing studies have made impressive advancement recent...
ISBN (digital): 9798331505929
ISBN (print): 9798331505936
This paper introduces an innovative multi-agent path finding (MAPF) system specifically designed for navigating multi-Ackerman robotic systems in intricate environments. The Mars Planner, the proposed solution, enhances path planning by tackling collision-free path challenges encountered by groups of intelligent agents. Our contributions include the development of two key algorithms: the Fast Batch Path Finding (FBPF) and the Batch Spatio-Temporal Path Refinement (BSTPR). FBPF utilizes a hybrid A* approach to generate preliminary coarse paths within free configuration spaces, while BSTPR refines these paths using topological homotopy strategies to optimize time allocation and effectively resolve internal conflicts. Through simulations and physical experiments, we demonstrate significant enhancements in computational efficiency and path quality compared to existing methods. In conclusion, the Mars Planner stands as an efficient solution capable of managing large-scale complexity in real-world applications. It offers a robust and scalable framework suitable for diverse environments and scenarios.
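To make the two-stage structure concrete, here is a minimal, illustrative Python sketch of the pipeline the abstract describes: each agent in a batch first receives a coarse collision-free path on an occupancy grid (a plain grid A* standing in for FBPF's hybrid A*), after which a simple temporal pass delays agents at their start cell to remove vertex conflicts (a crude stand-in for BSTPR's spatio-temporal refinement). All function names, the grid model, and the conflict rule are hypothetical simplifications; the actual Mars Planner reasons over Ackermann kinematics and topological homotopy classes, which this sketch omits.

```python
import heapq

def astar(grid, start, goal):
    """Plain 4-connected A* on an occupancy grid (0 = free, 1 = blocked).
    Stand-in for FBPF's hybrid A*, which additionally models Ackermann kinematics."""
    rows, cols = len(grid), len(grid[0])
    open_set = [(abs(start[0] - goal[0]) + abs(start[1] - goal[1]), 0, start, [start])]
    seen = set()
    while open_set:
        _, g, node, path = heapq.heappop(open_set)
        if node == goal:
            return path
        if node in seen:
            continue
        seen.add(node)
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            r, c = node[0] + dr, node[1] + dc
            if 0 <= r < rows and 0 <= c < cols and grid[r][c] == 0 and (r, c) not in seen:
                h = abs(r - goal[0]) + abs(c - goal[1])
                heapq.heappush(open_set, (g + 1 + h, g + 1, (r, c), path + [(r, c)]))
    return None

def resolve_conflicts(paths, max_delay=20):
    """Greedy temporal refinement: delay later agents at their start cell until no two
    agents occupy the same cell at the same timestep (vertex conflicts only).
    BSTPR instead optimizes time allocation over homotopy classes; this is a toy."""
    scheduled = []
    for path in paths:
        for delay in range(max_delay + 1):
            timed = [path[0]] * delay + list(path)
            clash = any(
                t < len(other) and timed[t] == other[t]
                for other in scheduled
                for t in range(len(timed))
            )
            if not clash:
                break
        scheduled.append(timed)  # keeps the last attempt if never conflict-free
    return scheduled

# Toy usage: two agents on a small free grid whose coarse paths may collide.
grid = [[0] * 5 for _ in range(5)]
coarse = [astar(grid, (0, 0), (0, 4)), astar(grid, (2, 2), (0, 3))]
for timed_path in resolve_conflicts(coarse):
    print(timed_path)
```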
Surveillance cameras play a pivotal role across various domains, encompassing public safety, crime deterrence, and facility maintenance. Nevertheless, these systems entail certain limitations, including high costs, se...
Deep learning-based models are at the top of most driver observation benchmarks due to their remarkable accuracies, but come with a high computational cost, while resources are often limited in real-world driving scenarios. This paper presents a lightweight framework for resource-efficient driver activity recognition. We enhance 3D MobileNet, a speed-optimized neural architecture for video classification, with two paradigms for improving the trade-off between model accuracy and computational efficiency: knowledge distillation and model quantization. Knowledge distillation prevents large drops in accuracy when reducing the model size by harvesting knowledge from a large teacher model (I3D) via soft labels instead of using the original ground truth. Quantization further drastically reduces the memory and computation requirements by representing the model weights and activations with lower-precision integers. Extensive experiments on a public dataset for in-vehicle monitoring during autonomous driving show that our proposed framework leads to a 3-fold reduction in model size and a 1.4-fold improvement in inference time compared to an already speed-optimized architecture. Our code is available at https://***/calvintanama/qd-driver-activity-reco.
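As an illustration of the two paradigms, the following PyTorch-style sketch shows a standard soft-label distillation loss (teacher logits softened with a temperature and blended with the hard-label cross-entropy) and post-training dynamic quantization of the resulting student. The `teacher` and `student` modules here are tiny placeholders; the paper's actual teacher is I3D and the student a 3D MobileNet, and the exact loss weighting and quantization scheme used there may differ.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, labels, T=4.0, alpha=0.7):
    """Soft-label knowledge distillation: KL divergence between temperature-softened
    teacher and student distributions, blended with the hard-label cross-entropy."""
    soft = F.kl_div(
        F.log_softmax(student_logits / T, dim=1),
        F.softmax(teacher_logits / T, dim=1),
        reduction="batchmean",
    ) * (T * T)
    hard = F.cross_entropy(student_logits, labels)
    return alpha * soft + (1.0 - alpha) * hard

# Placeholder models standing in for I3D (teacher) and 3D MobileNet (student).
teacher = nn.Sequential(nn.Flatten(), nn.Linear(16, 10))
student = nn.Sequential(nn.Flatten(), nn.Linear(16, 10))

clips = torch.randn(8, 16)           # stand-in for a batch of video clips
labels = torch.randint(0, 10, (8,))

with torch.no_grad():
    t_logits = teacher(clips)        # teacher only provides soft targets
loss = distillation_loss(student(clips), t_logits, labels)
loss.backward()

# Post-training dynamic quantization: weights stored as int8, shrinking the model.
quantized_student = torch.quantization.quantize_dynamic(
    student, {nn.Linear}, dtype=torch.qint8
)
```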
High-precision simultaneous localization and mapping (SLAM) is one of the core technologies of unmanned driving. LiDAR-based SLAM algorithms are often complex and computationally intensive, and are usually deployed on high-performance CPU or GPU computing architectures with high power consumption and a low energy-efficiency ratio, which is not conducive to vehicle-level applications. In this paper, we design and implement a low-power CPU and FPGA hybrid computing architecture for accelerating the key algorithms of a LiDAR-based localization scheme. More specifically, we propose a software and hardware co-design strategy: (1) we first propose the chain representation, a new type of map representation that segments the point cloud data at depth-discontinuity regions. Our method not only reduces the noise introduced by the down-sampling operation in point cloud representations, but also has the same computational and storage overhead as a conventional point cloud representation. (2) We further exploit the inherent parallelism in the algorithms to design a pipelined hardware architecture, which effectively improves the speed of the algorithm on the embedded platform. Deployed on the Xilinx ZCU102 platform, our system achieves 24.4x and 3.2x speedups compared to the ARM Cortex-A53 processor and the Intel i7-10700 processor, respectively, at 4.204 W power consumption, without severely degrading the final output quality.
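A minimal NumPy sketch of the chain idea: a scan line of ranges is split wherever the depth jump between consecutive points exceeds a threshold, and each resulting segment ("chain") is then down-sampled independently so that the retained samples never straddle a discontinuity. The threshold value and function names are illustrative only; the paper's actual representation and its FPGA pipeline are considerably more involved.

```python
import numpy as np

def split_into_chains(ranges, jump_threshold=0.5):
    """Split one LiDAR scan line into chains at depth discontinuities.

    ranges: 1-D array of range measurements along the scan line (metres).
    Returns a list of index arrays, one per chain.
    """
    jumps = np.abs(np.diff(ranges)) > jump_threshold   # discontinuity flags
    cut_points = np.flatnonzero(jumps) + 1             # first index of each new chain
    return np.split(np.arange(len(ranges)), cut_points)

def downsample_chains(ranges, chains, step=4):
    """Keep every `step`-th point within each chain, never crossing a chain boundary."""
    keep = np.concatenate([chain[::step] for chain in chains])
    return ranges[keep], keep

# Toy scan: a near wall, a depth jump, then a far wall.
scan = np.concatenate([np.full(20, 2.0), np.full(20, 8.0)]) + np.random.normal(0, 0.02, 40)
chains = split_into_chains(scan)
sampled, kept_idx = downsample_chains(scan, chains)
print(len(chains), "chains,", len(kept_idx), "points kept")
```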
In the era of big data, data trading significantly enhances data-driven technologies by facilitating data sharing. Despite the clear advantages often experienced by data users when incorporating multiple sources, the ...
Contrastive Learning (CL) has emerged as one of the most successful paradigms for unsupervised visual representation learning, yet it often depends on intensive manual data augmentations. With the rise of generative m...
Auto data augmentation has emerged as a promising alternative to the laborious manual parameter tuning involved in data augmentation policies. However, the existing approaches have limitations in terms of their applic...
Current Scene Change Detection (SCD) methods are widely used in various subject areas, with detection granularity mostly limited to the pixel level. However, for certain practical applications such as garbage detection and traffic monitoring, the overall changes of object-level instances are of greater concern, so fine-grained results may be unnecessary and only incur excessive computational redundancy and insufficient real-time performance. To address this issue, we propose a one-stage object-level change detection framework named Siamese Center-Based Detector with Transformer and Feature Fusion (SCTF-Det), aiming to use fewer computing resources while still obtaining object-level change information, such as the appearance or disappearance of objects. We adopt a Siamese Vision Transformer to efficiently capture global semantic features, and design differential feature fusion and multi-scale fusion to better fuse the features coming from image pairs. Instead of using a segmentation head like most SCD methods, we use a detection head to capture changed objects or regions. Moreover, we introduce a gating mechanism between image pairs and automatically mark the bounding box on the corresponding “Appear” change region. The experiments are conducted on the VL-CMU-CD and CDNet2014 datasets, with F1 scores of 78.6% and 83.6%, respectively. Our SCTF-Det substantially improves inference speed by 3–5 times compared to existing methods.
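To illustrate the fusion step the abstract describes, the sketch below computes Siamese features for an image pair with a shared backbone, forms a differential feature (absolute difference) fused with the concatenated pair features, and feeds the result to a lightweight center-style detection head. The backbone here is a tiny CNN rather than the Vision Transformer actually used by SCTF-Det, and the multi-scale fusion and gating mechanism are omitted; all layer sizes and names are illustrative.

```python
import torch
import torch.nn as nn

class SiameseDiffFusionDetector(nn.Module):
    """Toy object-level change detector: shared backbone, differential + concat fusion,
    and a center/size detection head (a simplified stand-in for SCTF-Det)."""

    def __init__(self, channels=32, num_classes=1):
        super().__init__()
        # Shared (Siamese) backbone applied to both images of the pair.
        self.backbone = nn.Sequential(
            nn.Conv2d(3, channels, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, 3, stride=2, padding=1), nn.ReLU(),
        )
        # Fuse |f0 - f1| with the concatenated pair features.
        self.fuse = nn.Conv2d(channels * 3, channels, 1)
        # Center-based head: per-location change heatmap plus box width/height.
        self.heatmap = nn.Conv2d(channels, num_classes, 1)
        self.size = nn.Conv2d(channels, 2, 1)

    def forward(self, img_t0, img_t1):
        f0, f1 = self.backbone(img_t0), self.backbone(img_t1)
        fused = torch.relu(self.fuse(torch.cat([torch.abs(f0 - f1), f0, f1], dim=1)))
        return torch.sigmoid(self.heatmap(fused)), self.size(fused)

# Usage on a dummy "before / after" image pair.
model = SiameseDiffFusionDetector()
before, after = torch.randn(1, 3, 128, 128), torch.randn(1, 3, 128, 128)
heat, wh = model(before, after)
print(heat.shape, wh.shape)   # (1, 1, 32, 32), (1, 2, 32, 32)
```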