Relying on paired synthetic data, existing learning-based Computational Aberration Correction (CAC) methods are confronted with the intricate and multifaceted synthetic-to-real domain gap, which leads to suboptimal pe...
Vision-based occupancy prediction, also known as 3D Semantic Scene Completion (SSC), presents a significant challenge in computer vision. Previous methods, confined to onboard processing, struggle with simultaneous ge...
Online High-Definition (HD) maps have emerged as the preferred option for autonomous driving, overshadowing their offline counterparts thanks to flexible update capability and lower maintenance costs. However, contemporary online HD map models embed the parameters of visual sensors into training, resulting in a significant decrease in generalization performance when applied to visual sensors with different parameters. Inspired by the inherent potential of Inverse Perspective Mapping (IPM), where camera parameters are decoupled from the training process, we design a universal map generation framework, GenMapping. The framework is built on a triadic synergy architecture comprising a principal branch and two auxiliary branches. Given a coarse road image with local distortion produced by IPM, the principal branch learns robust global features under state space models. The two auxiliary branches are a dense perspective branch and a sparse prior branch: the former exploits correlation information between static and moving objects, whereas the latter introduces prior knowledge from OpenStreetMap (OSM). A triple-enhanced merging module is crafted to synergistically integrate the distinctive spatial features from all three branches. To further improve generalization, a Cross-View Map Learning (CVML) scheme is leveraged to realize joint learning within a common space. Additionally, a Bidirectional Data Augmentation (BiDA) module is introduced to reduce reliance on the training dataset. Extensive experimental results show that the proposed model surpasses current state-of-the-art methods in both semantic mapping and vectorized mapping, while maintaining a rapid inference speed. Moreover, in cross-dataset experiments, the generalization of semantic mapping improves by 17.3% in mIoU, and that of vectorized mapping by 12.1% in mAP. The source code will be publicly available at https://***/lynn-yu/GenMappin
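Since the framework hinges on IPM decoupling camera parameters from training, a minimal sketch of flat-ground IPM is given below, assuming known intrinsics K and world-to-camera extrinsics (R, t). The `ipm_warp` helper, grid resolution, and ground-plane range are illustrative assumptions, not details from the paper.

```python
# Minimal flat-ground Inverse Perspective Mapping (IPM) sketch.
# Assumes a planar z = 0 ground and a calibrated pinhole camera.
import numpy as np
import cv2

def ipm_warp(image, K, R, t, bev_size=(400, 400),
             bev_range=(-10.0, 10.0, 0.0, 20.0)):
    """Warp a perspective image onto a bird's-eye-view grid on the ground plane.

    K: 3x3 intrinsics; R, t: world-to-camera rotation and translation.
    bev_range: (x_min, x_max, y_min, y_max) in metres on the ground plane.
    """
    h, w = bev_size
    x_min, x_max, y_min, y_max = bev_range
    # Regular grid of ground-plane points (z = 0); top row = far range.
    xs = np.linspace(x_min, x_max, w)
    ys = np.linspace(y_max, y_min, h)
    gx, gy = np.meshgrid(xs, ys)
    ground = np.stack([gx, gy, np.zeros_like(gx)], axis=-1).reshape(-1, 3)
    # Project ground points into the camera: p ~ K (R X + t).
    cam = ground @ R.T + t
    behind = cam[:, 2] <= 1e-6
    pix = (cam @ K.T)[:, :2] / np.clip(cam[:, 2:3], 1e-6, None)
    pix[behind] = -1.0  # points behind the camera fall into the border fill
    map_x = pix[:, 0].reshape(h, w).astype(np.float32)
    map_y = pix[:, 1].reshape(h, w).astype(np.float32)
    # Sample the source image at the projected pixel locations.
    return cv2.remap(image, map_x, map_y, interpolation=cv2.INTER_LINEAR,
                     borderMode=cv2.BORDER_CONSTANT)
```

Because only K, R, and t enter the warp, swapping cameras only changes the remap tables, not any learned weights, which is the generalization property the abstract attributes to IPM.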
The success of deep neural networks (DNNs) has promoted the widespread application of person re-identification (ReID). However, ReID systems inherit the vulnerability of DNNs to malicious attacks of visually inconspic...
Vision sensors are widely applied in vehicles, robots, and roadside infrastructure. However, due to limitations in hardware cost and system size, camera Field-of-View (FoV) is often restricted and may not provide suff...
Simultaneous Localization And Mapping (SLAM) has become a crucial aspect in the fields of autonomous driving and robotics. One key component of visual SLAM is the Field-of-View (FoV) of the camera, as a larger FoV...
We propose a high-performance glass-plastic hybrid minimalist aspheric panoramic annular lens (ASPAL) to overcome several major limitations of the traditional panoramic annular lens (PAL), such as large size, high weight...
Learning driving policies with an end-to-end network has proven to be a promising solution for autonomous driving. Due to the lack of a benchmark driver behavior dataset that contains both visual and LiDAR data, existing works focus solely on learning driving from visual sensors. Besides, most works are limited to predicting the steering angle and neglect the more challenging vehicle speed control problem. In this paper, we propose a novel end-to-end network, FlowDriveNet, which jointly exploits sequential visual data and LiDAR data to predict steering angle and vehicle speed. The main challenges of this problem are how to efficiently extract driving-related information from images and point clouds, and how to fuse them effectively. To tackle these challenges, we introduce the concept of point flow and argue that image optical flow and LiDAR point flow are significant motion cues for driving policy learning. Specifically, we first create an enhanced dataset consisting of images, point clouds, and the corresponding human driver behaviors. Then, in FlowDriveNet, a deep yet efficient visual feature extraction module and a point feature extraction module extract spatial features from optical flow and point flow, respectively. Additionally, a novel temporal fusion and prediction module is designed to fuse temporal information from the extracted spatial feature sequences and predict vehicle driving commands. Finally, a series of ablation experiments verifies the importance of optical flow and point flow, and comparison experiments show that our flow-based method outperforms existing image-based approaches on the task of driving policy learning.
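To make the flow-based idea concrete, below is a minimal sketch of the two ingredients the abstract names: dense optical flow between consecutive frames and a temporal head that fuses feature sequences into driving commands. The Farneback flow call is a standard OpenCV routine; the `TemporalFusionHead` module, its GRU-based fusion, and all feature dimensions are illustrative assumptions rather than the authors' FlowDriveNet design.

```python
# Sketch: optical flow extraction plus a temporal fusion/prediction head.
import cv2
import torch
import torch.nn as nn

def optical_flow(prev_gray, next_gray):
    """Dense Farneback optical flow between two grayscale frames -> (H, W, 2)."""
    return cv2.calcOpticalFlowFarneback(prev_gray, next_gray, None,
                                        pyr_scale=0.5, levels=3, winsize=15,
                                        iterations=3, poly_n=5, poly_sigma=1.2,
                                        flags=0)

class TemporalFusionHead(nn.Module):
    """Fuse per-frame visual and point features over time, regress commands."""
    def __init__(self, vis_dim=256, pt_dim=128, hidden=256):
        super().__init__()
        self.gru = nn.GRU(vis_dim + pt_dim, hidden, batch_first=True)
        self.out = nn.Linear(hidden, 2)  # [steering angle, vehicle speed]

    def forward(self, vis_seq, pt_seq):
        # vis_seq: (B, T, vis_dim); pt_seq: (B, T, pt_dim)
        fused = torch.cat([vis_seq, pt_seq], dim=-1)
        _, h = self.gru(fused)       # h: (num_layers, B, hidden)
        return self.out(h[-1])       # (B, 2)
```

Here `pt_seq` stands in for features extracted from LiDAR point flow; since point flow is the paper's own construct, any scene-flow-style motion estimate between consecutive sweeps could play that role in this sketch.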
This article presents a framework for quadrotors that integrates planning and control, which employs a heuristic depth-first search (HDFS) with data-driven model predictive control (MPC). The proposed framework intends...
Temporal information plays a pivotal role in Bird's-Eye-View (BEV) driving scene understanding, as it can alleviate visual information sparsity. However, an indiscriminate temporal fusion method will cause the b...