Leveraging available annotated data is an essential component of many modern methods for medical image analysis. In particular, approaches making use of the "neighbourhood" structure between images for this purpose have shown significant potential. Such techniques achieve high accuracy in analysing an image by propagating information from its immediate "neighbours" within an annotated database. Despite their success in certain applications, wide use of these methods is limited due to the challenging task of determining the neighbours for an out-of-sample image. This task is either computationally expensive due to large database sizes and costly distance evaluations, or infeasible due to distance definitions over semantic information, such as ground truth annotations, which is not available for out-of-sample images. This article introduces Neighbourhood Approximation Forests (NAFs), a supervised learning algorithm providing a general and efficient approach for the task of approximate nearest neighbour retrieval for arbitrary distances. Starting from an image training database and a user-defined distance between images, the algorithm learns to use appearance-based features to cluster images, approximating the neighbourhood structure induced by the distance. NAF is able to efficiently infer nearest neighbours of an out-of-sample image, even when the original distance is based on semantic information. We perform experimental evaluation in two different scenarios: (i) age prediction from brain MRI and (ii) patch-based segmentation of unregistered, arbitrary field-of-view CT images. The results demonstrate the performance, computational benefits, and potential of NAF for different image analysis applications. (C) 2013 Elsevier B.V. All rights reserved.
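The retrieval idea described above, learning appearance-based splits that approximate the neighbourhood structure of an arbitrary distance, then ranking training images by how often they fall into the same leaf as a query, can be sketched as a toy example. This is a simplified illustration, not the authors' implementation: `train_stumps`, `within_var`, and `affinity` are hypothetical names, depth-1 "trees" stand in for full decision trees, and within-leaf target variance stands in for the paper's cluster-compactness criterion under the user-defined distance.

```python
import numpy as np

def within_var(y, mask):
    """Size-weighted within-leaf variance of the targets (compactness proxy)."""
    v = 0.0
    for part in (y[mask], y[~mask]):
        if part.size:
            v += part.size * part.var()
    return v

def train_stumps(X, y, n_stumps=200, seed=0):
    """Each 'tree' is a depth-1 split on a random appearance feature; the
    threshold is the candidate that minimises within-leaf target variance."""
    rng = np.random.default_rng(seed)
    stumps = []
    for _ in range(n_stumps):
        f = int(rng.integers(X.shape[1]))
        cands = rng.choice(X[:, f], size=5, replace=False)
        thr = min(cands, key=lambda t: within_var(y, X[:, f] <= t))
        stumps.append((f, thr))
    return stumps

def affinity(stumps, X_train, x):
    """For each training image: fraction of stumps placing it in the same
    leaf as the query x -- higher means 'closer' under the learned forest."""
    score = np.zeros(len(X_train))
    for f, thr in stumps:
        score += (X_train[:, f] <= thr) == (x[f] <= thr)
    return score / len(stumps)

rng = np.random.default_rng(1)
X = rng.standard_normal((60, 4))   # appearance features
y = X[:, 0]                        # proxy for the semantic quantity (e.g. age)
stumps = train_stumps(X, y, n_stumps=100)
scores = affinity(stumps, X, X[7])  # query with a known training image
```

A query identical to a training image shares every leaf with it, so its affinity is exactly 1; out-of-sample queries receive graded scores from which the approximate nearest neighbours are read off.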
We present a massively parallel algorithm for the fused lasso, powered by multiple graphics processing units (GPUs). Our method is suitable for a class of large-scale sparse regression problems in which a two-dimensional lattice structure is imposed among the coefficients. This structure is important in many statistical applications, including image-based regression, in which a set of images is used to locate image regions predictive of a response variable such as human behavior. Such large datasets are increasingly common. In our study, we employ the split Bregman method and the fast Fourier transform, which jointly have a high degree of data-level parallelism that is distinct in a two-dimensional setting. Our multi-GPU parallelization achieves remarkably improved speed. Specifically, we obtained as much as 433 times the speed of the reference CPU implementation. We demonstrate the speed and scalability of the algorithm using several datasets, including 8100 samples of 512 x 512 images. Compared to its single-GPU counterpart, our method also showed improved computing speed as well as high scalability. We describe the various elements of our study as well as our experience with the subtleties of selecting an existing algorithm for parallelization. It is critical that memory bandwidth be carefully considered for multi-GPU algorithms. Supplementary material for this article is available online.
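The FFT ingredient can be illustrated on the quadratic subproblem that split Bregman produces at each iteration: with a two-dimensional lattice difference penalty under periodic boundary conditions, the linear system (I + μL)x = b (L the 2-D Laplacian) is diagonalised by the 2-D FFT and solved in closed form. A minimal NumPy sketch under the periodic-boundary assumption; the paper's exact boundary handling and subproblem coefficients may differ:

```python
import numpy as np

def solve_quadratic_fft(b, mu):
    """Solve (I + mu*L) x = b for the periodic 2-D Laplacian L, using the
    fact that circulant operators are diagonalised by the 2-D FFT."""
    m, n = b.shape
    # eigenvalues of the 1-D periodic second-difference operator, combined
    # separably for the 2-D Laplacian
    wi = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(m) / m)
    wj = 2.0 - 2.0 * np.cos(2.0 * np.pi * np.arange(n) / n)
    lam = wi[:, None] + wj[None, :]
    return np.real(np.fft.ifft2(np.fft.fft2(b) / (1.0 + mu * lam)))

# verify: apply (I + mu*L) to the solution and compare with b
rng = np.random.default_rng(0)
b = rng.standard_normal((32, 32))
mu = 0.7
x = solve_quadratic_fft(b, mu)
Lx = (4 * x - np.roll(x, 1, 0) - np.roll(x, -1, 0)
            - np.roll(x, 1, 1) - np.roll(x, -1, 1))
residual = np.linalg.norm(x + mu * Lx - b)
```

Because the FFT reduces the solve to an elementwise division over the lattice, every pixel is independent, which is the data-level parallelism that maps naturally onto one or more GPUs.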
Abstractions of the game of football serve as well-known challenges in AI research. A particularly accessible abstraction is the game of Foosball, where one team is operated by an AI agent while the other side is controlled by humans. In Foosball, the dynamics can be described by a few descriptive parameters, namely the shift and rotation of the corresponding rods plus the position of the ball. In this work, we present a Computer Vision-based real-time game state detector in a real-world setup with an automated Foosball table (constructed by Bosch Rexroth AG). More precisely, we train an object detector network based on YOLOX to detect the positions of the figures and an image regressor network based on ResNet18 to predict the rotation angles. For the derivation of the training data we propose a semi-supervised labeling approach based on classical Computer Vision. We evaluate the proposed approach and find that our methodology works as a proof of concept. The resulting prototype generated promising results with low inference times, meeting our real-time requirement of 60 fps.
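Once figure positions are detected, recovering the rod-shift parameter is simple geometry: the shift is the common displacement of the detected figure centres from their nominal resting positions along the rod axis. A toy sketch of this step; `rod_shift` and the coordinates are illustrative assumptions, not the paper's code:

```python
def rod_shift(detected, nominal):
    """Estimate a rod's translation as the mean displacement of the detected
    figure centres from their nominal positions along the rod axis.
    Sorting both lists pairs each detection with its nominal slot."""
    pairs = zip(sorted(detected), sorted(nominal))
    return sum(d - n for d, n in pairs) / len(nominal)

# three figures on one rod, each detected 10 px beyond its nominal position
shift = rod_shift([110.0, 210.0, 310.0], [100.0, 200.0, 300.0])
```

Together with the per-rod rotation angle regressed by the ResNet18 branch and the detected ball position, such per-rod shifts make up the compact game state the abstract describes.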