Search Results - Inner Mongolia University Library

HAMMER: Learning Entropy Maps to Create Accurate 3D Models in Multi-View Stereo

IEEE Workshop on Applications of Computer Vision (WACV)

Authors: Rafael Weilharter; Friedrich Fraundorfer. Institute of Computer Graphics and Vision, Graz University of Technology

While the majority of recent Multi-View Stereo Networks estimate a depth map per reference image, their performance is then only evaluated on the fused 3D model obtained from all images. This approach makes a lot of sense since ultimately the point cloud is the result we are mostly interested in. On the flip side, it often leads to a burdensome manual search for the right fusion parameters in order to score well on the public benchmarks. In this work, we tackle the aforementioned problem with HAMMER, a Hierarchical And Memory-efficient MVSNet with Entropy-filtered Reconstructions. We propose to learn a filtering mask based on entropy, which, in combination with a simple two-view geometric verification, is sufficient to generate high quality 3D models of any input scene. Distinct from existing works, a tedious manual parameter search for the fusion step is not required. Furthermore, we take several precautions to keep the memory requirements for our method very low in the training as well as in the inference phase. Our method only requires 6 GB of GPU memory during training, while 3.6 GB are enough to process 1920×1024 images during inference. Experiments show that HAMMER ranks amongst the top published methods on the DTU and Tanks and Temples benchmarks in the official metrics, especially when keeping the fusion parameters fixed.
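As a rough illustration of what an entropy-based filtering mask measures, the sketch below computes the normalized Shannon entropy of a per-pixel probability distribution over depth hypotheses and keeps only low-entropy (confident) pixels. This is not the HAMMER network, which learns its mask; the function names, the toy probability volume, and the threshold of 0.5 are all assumptions made for illustration.

```python
import math


def normalized_entropy(probs):
    """Shannon entropy of one pixel's depth distribution, scaled to [0, 1]."""
    n = len(probs)
    if n < 2:
        return 0.0
    h = -sum(p * math.log(p) for p in probs if p > 0.0)
    return h / math.log(n)  # divide by the maximum entropy log(n)


def entropy_mask(prob_volume, threshold=0.5):
    """Mark pixels whose normalized entropy is below `threshold`.

    `prob_volume[y][x]` is a list of probabilities over depth hypotheses.
    The threshold is a hypothetical value, not one taken from the paper.
    """
    return [[normalized_entropy(px) < threshold for px in row]
            for row in prob_volume]


# Toy 1x2 "image" with 4 depth hypotheses per pixel:
# the first pixel is peaked (confident), the second is uniform (uncertain).
volume = [[[0.97, 0.01, 0.01, 0.01], [0.25, 0.25, 0.25, 0.25]]]
mask = entropy_mask(volume)
```

In this toy case the peaked pixel is kept and the uniform pixel is rejected, which is the behavior one would expect from any entropy-based confidence filter.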

Into the Fog: Evaluating Robustness of Multiple Object Tracking

arXiv, 2024

Authors: Kirillova, Nadezda; Mirza, M. Jehanzeb; Bischof, Horst; Possegger, Horst. Institute of Computer Graphics and Vision, Graz University of Technology, Graz, Austria

State-of-the-art Multiple Object Tracking (MOT) approaches have shown remarkable performance when trained and evaluated on current benchmarks. However, these benchmarks primarily consist of clear weather scenarios, overlooking adverse atmospheric conditions such as fog, haze, smoke and dust. As a result, the robustness of trackers against these challenging conditions remains underexplored. To address this gap, we introduce a physics-based volumetric fog simulation method for arbitrary MOT datasets, utilizing frame-by-frame monocular depth estimation and a fog formation optical model. We enhance our simulation by rendering both homogeneous and heterogeneous fog and propose to use the dark channel prior method to estimate atmospheric light, showing promising results even in night and indoor scenes. We present the leading benchmark MOTChallenge (third release) augmented with fog (smoke for indoor scenes) of various intensities and conduct a comprehensive evaluation of MOT methods, revealing their limitations under fog and fog-like challenges. © 2024, CC BY.
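The abstract does not spell out its fog formation model. A minimal sketch, assuming the widely used atmospheric scattering model I = J·t + A·(1 − t) with transmission t = exp(−β·d), shows how per-pixel depth drives homogeneous fog. The function name, the attenuation coefficient `beta`, and the `airlight` value are hypothetical; heterogeneous fog and dark-channel airlight estimation are not covered here.

```python
import math


def add_homogeneous_fog(pixel, depth_m, beta=0.05, airlight=(0.8, 0.8, 0.8)):
    """Blend a clear RGB pixel (values in [0, 1]) with atmospheric light.

    Assumes the standard scattering model: I = J * t + A * (1 - t),
    where t = exp(-beta * depth). `beta` and `airlight` are illustrative.
    """
    t = math.exp(-beta * depth_m)  # transmission falls off with distance
    return tuple(c * t + a * (1.0 - t) for c, a in zip(pixel, airlight))


clear = (0.2, 0.4, 0.6)
near = add_homogeneous_fog(clear, depth_m=0.0)     # no fog at zero depth
far = add_homogeneous_fog(clear, depth_m=1000.0)   # almost pure airlight
```

Under this model a pixel at zero depth is unchanged, while very distant pixels converge to the atmospheric light, which is why a depth estimate per frame is enough to fog an arbitrary video.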

Keywords: Object tracking

Joint Non-Linear MRI Inversion with Diffusion Priors

arXiv, 2023

Authors: Erlacher, Moritz; Zach, Martin. Graz University of Technology, Institute of Computer Graphics and Vision, Inffeldgasse 16/II, 8010 Graz, Austria

Magnetic resonance imaging (MRI) is a potent diagnostic tool, but suffers from long examination times. To accelerate the process, modern MRI machines typically utilize multiple coils that acquire sub-sampled data in parallel. Data-driven reconstruction approaches, in particular diffusion models, recently achieved remarkable success in reconstructing these data, but typically rely on estimating the coil sensitivities in an off-line step. This suffers from potential movement and misalignment artifacts and limits the application to Cartesian sampling trajectories. To obviate the need for off-line sensitivity estimation, we propose to jointly estimate the sensitivity maps with the image. In particular, we utilize a diffusion model — trained on magnitude images only — to generate high-fidelity images while imposing spatial smoothness of the sensitivity maps in the reverse diffusion. The proposed approach demonstrates consistent qualitative and quantitative performance across different sub-sampling patterns. In addition, experiments indicate a good fit of the estimated coil sensitivities. © 2023, CC BY.

Keywords: Magnetic resonance imaging

Efficient Motion Prediction: A Lightweight & Accurate Trajectory Prediction Model With Fast Training and Inference Speed

arXiv, 2024

Authors: Prutsch, Alexander; Bischof, Horst; Possegger, Horst. The Institute of Computer Graphics and Vision, Graz University of Technology, Austria

For efficient and safe autonomous driving, it is essential that autonomous vehicles can predict the motion of other traffic agents. While highly accurate, current motion prediction models often impose significant challenges in terms of training resource requirements and deployment on embedded hardware. We propose a new efficient motion prediction model, which achieves highly competitive benchmark results while training only a few hours on a single GPU. Due to our lightweight architectural choices and the focus on reducing the required training resources, our model can easily be applied to custom datasets. Furthermore, its low inference latency makes it particularly suitable for deployment in autonomous applications with limited computing resources. © 2024, CC BY.

Keywords: Prediction models

TAEC: Unsupervised Action Segmentation with Temporal-Aware Embedding and Clustering

26th Computer Vision Winter Workshop, CVWW 2023

Authors: Lin, Wei; Kukleva, Anna; Possegger, Horst; Kuehne, Hilde; Bischof, Horst. Institute of Computer Graphics and Vision, Graz University of Technology, Austria; Christian Doppler Laboratory for Semantic 3D Computer Vision, Austria; Max-Planck-Institute for Informatics, Germany; Goethe University Frankfurt, Germany

Temporal action segmentation in untrimmed videos has gained increased attention recently. However, annotating action classes and frame-wise boundaries is extremely time-consuming and cost-intensive, especially on large-scale datasets. To address this issue, we propose an unsupervised approach for learning action classes from untrimmed video sequences. In particular, we propose a temporal embedding network that combines relative time prediction, feature reconstruction, and sequence-to-sequence learning, to preserve the spatial layout and sequential nature of the video features. A two-step clustering pipeline on these embedded feature representations then allows us to enforce temporal consistency within, as well as across videos. Based on the identified clusters, we decode the video into coherent temporal segments that correspond to semantically meaningful action classes. Our evaluation on three challenging datasets shows the impact of each component and, furthermore, demonstrates our state-of-the-art unsupervised action segmentation results. © 2023 Copyright for this paper by its authors.

Keywords: Large dataset

Learned Discretization Schemes for the Second-Order Total Generalized Variation

9th International Conference on Scale Space and Variational Methods in Computer Vision, SSVM 2023

Authors: Bogensperger, Lea; Chambolle, Antonin; Effland, Alexander; Pock, Thomas. Institute of Computer Graphics and Vision, Graz University of Technology, Graz, Austria; CEREMADE, CNRS & Université Paris-Dauphine, PSL, Paris, France; Institute for Applied Mathematics, University of Bonn, Bonn, Germany

ISBN: (print) 9783031319747

The total generalized variation extends the total variation by incorporating higher-order smoothness. Thus, it can also suffer from similar discretization issues related to isotropy. Inspired by the success of novel discretization schemes of the total variation, there has been recent work to improve the second-order total generalized variation discretization, based on the same design idea. In this work, we propose to extend this to a general discretization scheme based on interpolation filters, for which we prove variational consistency. We then describe how to learn these interpolation filters to optimize the discretization for various imaging applications. We illustrate the performance of the method on a synthetic data set as well as for natural image denoising. © 2023, The Author(s), under exclusive license to Springer Nature Switzerland AG.
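To make the isotropy issue concrete, the sketch below implements the classical forward-difference discretization of the isotropic total variation, not the learned interpolation-filter scheme proposed in the paper. The function name and the zero-difference boundary handling are assumptions. Rotating a small test image by 180 degrees changes the computed value, which is the kind of discretization artifact that motivates improved schemes.

```python
import math


def total_variation(img):
    """Isotropic TV with forward differences and zero differences at the border."""
    h, w = len(img), len(img[0])
    tv = 0.0
    for y in range(h):
        for x in range(w):
            dx = img[y][x + 1] - img[y][x] if x + 1 < w else 0.0
            dy = img[y + 1][x] - img[y][x] if y + 1 < h else 0.0
            tv += math.hypot(dx, dy)  # per-pixel gradient magnitude
    return tv


corner = [[1.0, 0.0],
          [0.0, 0.0]]
rotated = [[0.0, 0.0],
           [0.0, 1.0]]  # the same image rotated by 180 degrees

tv_corner = total_variation(corner)    # single diagonal gradient: sqrt(2)
tv_rotated = total_variation(rotated)  # two axis-aligned gradients: 2
```

A rotation-invariant functional would assign both images the same value; the gap between the two results comes purely from the choice of finite differences.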

Keywords: Image denoising

Deep Learning-Based Point Cloud Registration for Augmented Reality-Guided Surgery

arXiv, 2024

Authors: Weber, Maximilian; Wild, Daniel; Kleesiek, Jens; Egger, Jan; Gsaxner, Christina. Institute of Computer Graphics and Vision, Graz University of Technology, Austria; Germany

Point cloud registration aligns 3D point clouds using spatial transformations. It is an important task in computer vision, with applications in areas such as augmented reality (AR) and medical imaging. This work explores the intersection of two research trends: the integration of AR into image-guided surgery and the use of deep learning for point cloud registration. The main objective is to evaluate the feasibility of applying deep learning-based point cloud registration methods for image-to-patient registration in augmented reality-guided surgery. We created a dataset of point clouds from medical imaging and corresponding point clouds captured with a popular AR device, the HoloLens 2. We evaluate three well-established deep learning models in registering these data pairs. While we find that some deep learning methods show promise, we show that a conventional registration pipeline still outperforms them on our challenging dataset. Copyright © 2024, The Authors. All rights reserved.
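As a small illustration of what a registration result is and how it can be scored, the sketch below applies a rigid transform (a rotation about the z-axis plus a translation) to a source cloud and measures the root-mean-square error against a target. It assumes known one-to-one point correspondences, which real image-to-patient data does not provide, and it does not estimate the transform; all names are hypothetical and unrelated to the models evaluated in the paper.

```python
import math


def apply_rigid(points, yaw_rad, translation):
    """Rotate 3D points about the z-axis by `yaw_rad`, then translate them."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    tx, ty, tz = translation
    return [(c * x - s * y + tx, s * x + c * y + ty, z + tz)
            for x, y, z in points]


def rmse(cloud_a, cloud_b):
    """Root-mean-square distance between corresponding points."""
    sq = sum((p - q) ** 2
             for pa, pb in zip(cloud_a, cloud_b)
             for p, q in zip(pa, pb))
    return math.sqrt(sq / len(cloud_a))


source = [(1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
true_pose = (math.pi / 2, (0.5, 0.0, -1.0))   # illustrative ground truth
target = apply_rigid(source, *true_pose)

error_before = rmse(source, target)                          # unaligned
error_after = rmse(apply_rigid(source, *true_pose), target)  # aligned
```

A registration method, learned or conventional, is then judged by how close its estimated transform brings this error to zero.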

Keywords: Augmented reality

SAda-Net: A Self-Supervised Adaptive Stereo Estimation CNN For Remote Sensing Image Data

arXiv, 2024

Authors: Hirner, Dominik; Fraundorfer, Friedrich. Graz University of Technology, Institute of Computer Graphics and Vision, Austria; Germany

Stereo estimation has made many advancements in recent years with the introduction of deep learning. However, the traditional supervised approach to deep learning requires the creation of accurate and plentiful ground-truth data, which is expensive to create and not available in many situations. This is especially true for remote sensing applications, where there is an excess of available data without proper ground truth. To tackle this problem, we propose a self-supervised CNN with self-improving adaptive abilities. In the first iteration, the created disparity map is inaccurate and noisy. Leveraging the left-right consistency check, we get a sparse but more accurate disparity map which is used as an initial pseudo ground-truth. This pseudo ground-truth is then adapted and updated after every epoch in the training step of the network. We use the sum of inconsistent points in order to track the network convergence. The code for our method is publicly available at: https://***/thedodo/SAda-Net © 2024, CC BY.
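The left-right consistency check mentioned above can be sketched as follows. This is a generic version of the technique, not the SAda-Net code: a left-image pixel at column x with disparity d is compared with the right-image disparity at column x − d, and pixels that disagree by more than a tolerance are dropped. The function name, the tolerance of one pixel, and the use of `None` for rejected pixels are assumptions.

```python
def left_right_check(disp_left, disp_right, tol=1.0):
    """Return a sparse disparity map and the number of inconsistent pixels.

    Maps are row-major lists of lists; rejected pixels are set to None.
    """
    height, width = len(disp_left), len(disp_left[0])
    sparse = [[None] * width for _ in range(height)]
    inconsistent = 0
    for y in range(height):
        for x in range(width):
            d = disp_left[y][x]
            xr = x - int(round(d))  # matching column in the right image
            if 0 <= xr < width and abs(d - disp_right[y][xr]) <= tol:
                sparse[y][x] = d
            else:
                inconsistent += 1
    return sparse, inconsistent


left = [[2.0, 1.0, 1.0, 3.0]]
right = [[1.0, 1.0, 5.0, 0.0]]
pseudo_gt, n_bad = left_right_check(left, right)
```

The surviving entries play the role of a sparse pseudo ground truth, and the count of rejected pixels is the kind of quantity the abstract describes using to monitor convergence.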

Keywords: Self-supervised learning

Bigger Isn’t Always Better: Towards a General Prior for Medical Image Reconstruction

46th Annual Conference of the German Association for Pattern Recognition, DAGM-GCPR 2024

Authors: Glaszner, Lukas; Zach, Martin. Institute of Computer Graphics and Vision, Graz University of Technology, Inffeldgasse 16/II, 8010 Graz, Austria

ISBN: (print) 9783031851803

Diffusion models have been successfully applied to many inverse problems, including MRI and CT reconstruction. Researchers typically re-purpose models originally designed for unconditional sampling without modifications. Using two different posterior sampling algorithms, we show empirically that such large networks are not necessary. Our smallest model, effectively a ResNet, performs almost as well as an attention U-Net on in-distribution reconstruction, while being significantly more robust towards distribution shifts. Furthermore, we introduce models trained on natural images and demonstrate that they can be used in both MRI and CT reconstruction, outperforming models trained on medical images in out-of-distribution cases. As a result of our findings, we strongly caution against simply re-using very large networks and encourage researchers to adapt the model complexity to the respective task. Moreover, we argue that a key step towards a general diffusion-based prior is training on natural images. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: Inverse problems