ISBN (digital): 9798350374490
ISBN (print): 9798350374506
Operating heavy machinery is challenging and can pose safety hazards for the operator and bystanders. Although commonly used augmented reality (AR) devices, such as head-mounted or head-up displays, can provide occupational support to operators, they can also cause problems. Particularly in off-highway scenarios, i.e., when driving machines in bumpy environments, the usefulness of current AR devices and the willingness of operators to wear them are limited. Therefore, we explore how laser-projection-based AR can support operators in their tasks and enhance safety. For this, we present a compact hardware unit and introduce a flexible and declarative software system. Furthermore, we examine the calibration process for a camera-projector setup and outline a process for creating images suitable for display by a laser projector from a set of line segments. Finally, we showcase the system's ability to provide efficient instructions to operators and bystanders and propose concrete applications for our setup.
ISBN (digital): 9798350377705
ISBN (print): 9798350377712
For efficient and safe autonomous driving, it is essential that autonomous vehicles can predict the motion of other traffic agents. While highly accurate, current motion prediction models often impose significant challenges in terms of training resource requirements and deployment on embedded hardware. We propose a new efficient motion prediction model, which achieves highly competitive benchmark results while training for only a few hours on a single GPU. Due to our lightweight architectural choices and the focus on reducing the required training resources, our model can easily be applied to custom datasets. Furthermore, its low inference latency makes it particularly suitable for deployment in autonomous applications with limited computing resources.
ISBN (digital): 9798350384574
ISBN (print): 9798350384581
Understanding actions of other agents increases the efficiency of autonomous mobile robots (AMRs) since they encompass intention and indicate future movements. We propose a new method that allows us to infer vehicle actions using a shallow image-based classification model. The actions are classified via bird’s-eye view scene crops, where we project the detections of a 3D object detection model onto a context map. We learn map context information and aggregate temporal sequence information without requiring object tracking. This results in a highly efficient classification model that can easily be deployed on embedded AMR hardware. To evaluate our approach, we create new large-scale synthetic datasets showing warehouse traffic based on real vehicle models and geometry.
While the majority of recent Multi-View Stereo Networks estimate a depth map per reference image, their performance is then only evaluated on the fused 3D model obtained from all images. This approach makes a lot of sense since ultimately the point cloud is the result we are mostly interested in. On the flip side, it often leads to a burdensome manual search for the right fusion parameters in order to score well on the public benchmarks. In this work, we tackle the aforementioned problem with HAMMER, a Hierarchical And Memory-efficient MVSNet with Entropy-filtered Reconstructions. We propose to learn a filtering mask based on entropy, which, in combination with a simple two-view geometric verification, is sufficient to generate high-quality 3D models of any input scene. Unlike existing works, our method does not require a tedious manual parameter search for the fusion step. Furthermore, we take several precautions to keep the memory requirements of our method very low in both the training and the inference phase. Our method requires only 6 GB of GPU memory during training, while 3.6 GB is enough to process 1920×1024 images during inference. Experiments show that HAMMER ranks amongst the top published methods on the DTU and Tanks and Temples benchmarks in the official metrics, especially when keeping the fusion parameters fixed.
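As a rough illustration of what entropy-based filtering of depth estimates can look like, the sketch below computes the normalized Shannon entropy of each pixel's depth-hypothesis distribution and keeps only low-entropy (confident) pixels. This is not the HAMMER implementation: the abstract describes a learned mask, whereas this sketch swaps in a fixed threshold, and the names `normalized_entropy`, `entropy_mask`, and the `threshold` value are hypothetical.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of a discrete distribution, scaled to [0, 1]."""
    n = len(probs)
    if n < 2:
        return 0.0
    # Zero-probability hypotheses contribute nothing to the entropy.
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(n)


def entropy_mask(prob_volume, threshold=0.5):
    """Keep pixels whose depth distribution has low entropy.

    prob_volume[y][x] is a list of probabilities over depth hypotheses.
    The fixed threshold is an assumption made for this sketch only.
    """
    return [
        [normalized_entropy(pixel) <= threshold for pixel in row]
        for row in prob_volume
    ]


# Toy 1x2 "image" with four depth hypotheses per pixel:
# the first pixel is sharply peaked, the second is uniform (uncertain).
volume = [[[0.97, 0.01, 0.01, 0.01], [0.25, 0.25, 0.25, 0.25]]]
mask = entropy_mask(volume)
```

In this toy input the peaked pixel is kept and the uniform pixel is rejected; a learned mask would replace the hand-picked threshold with a prediction from the network.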
Semantic image segmentation facilitates a multitude of real-world applications, ranging from autonomous driving through industrial process supervision to vision aids for human beings. These models are usually tr...
Computer vision techniques are on the rise for industrial applications, like process supervision and autonomous agents, e.g., in the healthcare domain and dangerous environments. While the general usability of these t...
State-of-the-art Multiple Object Tracking (MOT) approaches have shown remarkable performance when trained and evaluated on current benchmarks. However, these benchmarks primarily consist of clear weather scenarios, ov...
Virtual Reality (VR) applications constantly strive for more realism, immersion and intuitive user experiences. Traditional VR controllers can hinder full immersion, since they form an additional barrier between the u...
Temporal action segmentation in untrimmed videos has gained increased attention recently. However, annotating action classes and frame-wise boundaries is extremely time consuming and cost intensive, especially on larg...