ISBN: (Print) 9781665448994
Live demonstration setup. (Left) The setup consists of a DAVIS346B event camera, connected to a standard consumer laptop, that undergoes some motion. (Right) The motion estimates are plotted in red and, for rotation-like motions, the angular velocities provided by the camera IMU are also plotted in blue. This plot exemplifies an event camera undergoing large rotational motions (up to ~1000 deg/s) around the (a) x-axis, (b) y-axis and (c) z-axis. Overall, the incremental motion estimation method follows the IMU measurements. Optionally, the resultant global optical flow can also be shown, along with the generated events accumulated onto the image plane (bottom left corner).
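As a rough illustration of what "accumulating events onto the image plane" can mean, the sketch below sums event polarities per pixel. The event layout `(x, y, timestamp, polarity)`, the function name `accumulate_events` and the tiny sensor size are assumptions made for this example; they are not taken from the demonstration itself.

```python
from typing import Iterable, List, Tuple

# Hypothetical event layout: (x, y, timestamp_seconds, polarity), polarity in {+1, -1}.
Event = Tuple[int, int, float, int]


def accumulate_events(events: Iterable[Event], width: int, height: int) -> List[List[int]]:
    """Sum event polarities per pixel to form a 2D event image (rows indexed by y)."""
    image = [[0] * width for _ in range(height)]
    for x, y, _timestamp, polarity in events:
        # Events outside the image bounds are ignored.
        if 0 <= x < width and 0 <= y < height:
            image[y][x] += polarity
    return image


# Toy input: two positive events at pixel (1, 0), one negative event at (2, 1),
# and one out-of-bounds event that is dropped.
events = [(1, 0, 0.001, 1), (1, 0, 0.002, 1), (2, 1, 0.003, -1), (9, 9, 0.004, 1)]
frame = accumulate_events(events, width=4, height=2)
print(frame)  # [[0, 2, 0, 0], [0, 0, -1, 0]]
```

Summing signed polarities is only one possible choice; counting events regardless of polarity would be an equally simple variant.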
ISBN: (Digital) 9781538661000
ISBN: (Print) 9781538661000
We introduce the first benchmark for a new problem, recognizing human action adverbs (HAA): "Adverbs Describing Human Actions" (ADHA). We demonstrate some key features of ADHA: a semantically complete set of adverbs describing human actions, a set of common, describable human actions, and an exhaustive labelling of simultaneously emerging actions in each video. We conduct an in-depth analysis of how current effective models for action recognition and image captioning perform on adverb recognition, and the results reveal that such methods are unsatisfactory. Furthermore, we propose a novel three-stream hybrid model to tackle the HAA problem, which achieves better performance with relatively promising results.
ISBN: (Digital) 9781538661000
ISBN: (Print) 9781538661000
The production of thematic maps depicting land cover is one of the most common applications of remote sensing. To this end, several semantic segmentation approaches based on deep learning have been proposed in the literature, but land cover segmentation is still considered an open problem due to challenges specific to remote sensing imaging. In this paper we propose a novel approach to the problem of modelling the multiscale contexts surrounding pixels of different land cover categories. The approach leverages the computation of a heteroscedastic measure of uncertainty when classifying individual pixels in an image. This classification uncertainty measure is used to define a set of memory gates between layers that provide a principled way to select the optimal decision for each pixel.
ISBN: (Print) 9781467367592
Detecting abnormal events in video sequences is a challenging task that has been broadly investigated over the last decade. The main challenges come from the lack of a clear definition of abnormality and from the scarcity, often absence, of abnormal training samples. To address these two shortages, the computer vision community has made use of generative models to learn normal behavioral patterns in videos. Then, for each test observation, a (crowd) commotion measure is computed, quantifying the deviation from the normal model. In this paper, we evaluated two different families of generative models, namely topic models, representing the standard choice, and the more recent Counting Grids, which have never been considered for this task. Moreover, we also extended the 2D Counting Grid, introduced for the analysis of images, to three dimensions, making the model able to capture the spatio-temporal relationships of the videos. In the experimental section, we compared all the approaches on five challenging sequences, showing the superiority of the 3D Counting Grid.
ISBN: (Digital) 9781665487399
ISBN: (Print) 9781665487399
We propose a simple yet effective proposal-free architecture for lidar panoptic segmentation. We jointly optimize both semantic segmentation and class-agnostic instance classification in a single network using a pillar-based bird's-eye view representation. The instance classification head learns pairwise affinity between pillars to determine whether the pillars belong to the same instance or not. We further propose a local clustering algorithm to propagate instance ids by merging semantic segmentation and affinity predictions. Our experiments on the nuScenes dataset show that our approach outperforms previous proposal-free methods and is comparable to proposal-based methods, which require extra annotation from object detection.
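The idea of propagating instance ids by merging semantic and affinity predictions can be pictured with a generic union-find sketch. This is not the clustering algorithm of the paper: the grid layout, the `affinity` dictionary keyed by pairs of cells, the `threshold` of 0.5 and the function name `cluster_pillars` are all assumptions chosen for illustration.

```python
from typing import Dict, List, Set, Tuple

Cell = Tuple[int, int]  # (row, col) of a pillar in a hypothetical bird's-eye-view grid


def find(parent: Dict[Cell, Cell], cell: Cell) -> Cell:
    """Return the root of a cell's set, shortening the path along the way."""
    while parent[cell] != cell:
        parent[cell] = parent[parent[cell]]
        cell = parent[cell]
    return cell


def cluster_pillars(
    semantic: List[List[int]],
    affinity: Dict[Tuple[Cell, Cell], float],
    thing_classes: Set[int],
    threshold: float = 0.5,  # assumed cut-off, not a value from the paper
) -> List[List[int]]:
    """Merge pillars that share a semantic class and have high pairwise affinity."""
    rows, cols = len(semantic), len(semantic[0])
    # Only pillars of countable ("thing") classes take part in instance clustering.
    parent = {
        (r, c): (r, c)
        for r in range(rows)
        for c in range(cols)
        if semantic[r][c] in thing_classes
    }
    for (a, b), score in affinity.items():
        same_class = a in parent and b in parent and semantic[a[0]][a[1]] == semantic[b[0]][b[1]]
        if same_class and score >= threshold:
            root_a, root_b = find(parent, a), find(parent, b)
            if root_a != root_b:
                parent[root_b] = root_a
    # Assign consecutive instance ids in row-major order; 0 means "no instance".
    ids: Dict[Cell, int] = {}
    instance = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if (r, c) in parent:
                root = find(parent, (r, c))
                instance[r][c] = ids.setdefault(root, len(ids) + 1)
    return instance


semantic = [[1, 1, 0], [2, 1, 1]]  # class 0 is treated as background here
affinity = {
    ((0, 0), (0, 1)): 0.9,   # same class, high affinity: merged
    ((0, 1), (1, 1)): 0.2,   # same class, low affinity: kept apart
    ((1, 1), (1, 2)): 0.8,   # same class, high affinity: merged
    ((1, 0), (1, 1)): 0.95,  # different classes: never merged
}
instances = cluster_pillars(semantic, affinity, thing_classes={1, 2})
print(instances)  # [[1, 1, 0], [2, 3, 3]]
```

Requiring agreement between the semantic label and the affinity score is what keeps the two class-1 groups in the toy grid from fusing through the low-affinity link.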
ISBN: (Print) 9781665448994
In this paper we present a unified formulation for a large class of relative pose problems with radial distortion and varying calibration. For minimal cases, we show that one can reduce the number of parameters to between one and three. The relative pose can then be expressed using varying calibration constraints on the fundamental matrix, with entries that are polynomial in the parameters. We can then apply standard techniques based on the action matrix and Sturm sequences to construct our solvers. This enables efficient solvers for a large class of relative pose problems with radial distortion, using a common framework. We evaluate a number of these solvers for robust two-view inlier and epipolar geometry estimation, used as minimal solvers in RANSAC.
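For readers unfamiliar with Sturm sequences, the sketch below shows the textbook use of the technique: counting the distinct real roots of a univariate polynomial in an interval. It illustrates the general idea only and is not the solver construction of the paper; the helper names and the example polynomial are assumptions made for this example. The count is valid when the polynomial has degree at least one and does not vanish at either end of the interval.

```python
from fractions import Fraction
from typing import List, Sequence

Poly = List[Fraction]  # coefficients, highest degree first


def strip(p: Sequence[Fraction]) -> Poly:
    """Drop leading zero coefficients, keeping at least one coefficient."""
    i = 0
    while i < len(p) - 1 and p[i] == 0:
        i += 1
    return list(p[i:])


def derivative(p: Poly) -> Poly:
    n = len(p) - 1
    return [c * (n - i) for i, c in enumerate(p[:-1])] or [Fraction(0)]


def remainder(a: Poly, b: Poly) -> Poly:
    """Remainder of polynomial long division a / b (leading coefficient of b is nonzero)."""
    a = list(a)
    while len(a) >= len(b):
        factor = a[0] / b[0]
        for i in range(len(b)):
            a[i] -= factor * b[i]
        a.pop(0)  # the leading term has been cancelled
    return strip(a) if a else [Fraction(0)]


def sturm_sequence(coeffs: Sequence[int]) -> List[Poly]:
    """p0 = p, p1 = p', p_{k+1} = -rem(p_{k-1}, p_k), stopping at a zero remainder."""
    p = strip([Fraction(c) for c in coeffs])
    seq = [p, derivative(p)]
    while True:
        r = remainder(seq[-2], seq[-1])
        if not any(r):
            return seq
        seq.append([-c for c in r])


def evaluate(p: Poly, x: Fraction) -> Fraction:
    value = Fraction(0)
    for c in p:  # Horner's rule
        value = value * x + c
    return value


def sign_changes(seq: List[Poly], x: Fraction) -> int:
    values = [v for v in (evaluate(p, x) for p in seq) if v != 0]
    return sum(1 for u, v in zip(values, values[1:]) if (u < 0) != (v < 0))


def count_real_roots(coeffs: Sequence[int], lo: int, hi: int) -> int:
    """Number of distinct real roots in (lo, hi], assuming p(lo) != 0 and p(hi) != 0."""
    seq = sturm_sequence(coeffs)
    return sign_changes(seq, Fraction(lo)) - sign_changes(seq, Fraction(hi))


# Example polynomial x^3 - 3x + 1, which has three real roots, all inside (-2, 2).
cubic = [1, 0, -3, 1]
print(count_real_roots(cubic, -2, 2))  # 3
print(count_real_roots(cubic, 0, 1))   # 1
```

Exact rational arithmetic via `Fraction` avoids sign errors from floating-point round-off, at the cost of speed; a practical solver would typically use floating point and then refine the isolated roots numerically.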
ISBN: (Print) 9781728193601
Previous research on localizing a target region in an image referred to by a natural language expression has occurred within an object-centric paradigm. However, in practice, there may not be any easily named or identifiable objects near a target location. Instead, references may need to rely on basic visual attributes, such as color or geometric clues. An expression like "a red something beside a blue vertical line" could still pinpoint a target location. As such, we begin to explore the open challenge of computational object-agnostic reference by constructing a novel dataset and by devising a new set of algorithms that can identify a target region in an image when given a referring expression containing only basic conceptual features.
ISBN: (Digital) 9781665487399
ISBN: (Print) 9781665487399
This paper describes the third Affective Behavior Analysis in-the-wild (ABAW) Competition, held in conjunction with the IEEE International Conference on Computer Vision and Pattern Recognition (CVPR), 2022. The 3rd ABAW Competition is a continuation of the Competitions held at the ICCV 2021, IEEE FG 2020 and IEEE CVPR 2017 conferences, and aims at automatically analyzing affect. This year the Competition encompasses four Challenges: i) uni-task Valence-Arousal Estimation, ii) uni-task Expression Classification, iii) uni-task Action Unit Detection, and iv) Multi-Task Learning. All the Challenges are based on a common benchmark database, Aff-Wild2, which is a large-scale in-the-wild database and the first one to be annotated in terms of valence-arousal, expressions and action units. In this paper, we present the four Challenges with the utilized Competition corpora, outline the evaluation metrics, and present both the baseline systems and the top-performing teams per Challenge. Finally, we illustrate the obtained results of the baseline systems and of all participating teams.
ISBN: (Print) 9781665448994
In this paper we present an extensive evaluation of instance segmentation in the context of images containing clothes. We propose a multi-level evaluation that complements the classical overlap criterion given by IoU. In particular, we quantify both the contour and the color content accuracy of the predicted segmentation masks. Through experiments conducted on five state-of-the-art instance segmentation methods, we demonstrate that the proposed evaluation framework yields meaningful insights into model performance.
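The classical overlap criterion mentioned here, intersection over union (IoU), can be computed for two binary masks as in the minimal sketch below. The nested-list mask layout, the function name `mask_iou` and the convention of returning 1.0 when both masks are empty are assumptions made for this example; the contour and color measures proposed in the paper are not reproduced.

```python
from typing import List


def mask_iou(pred: List[List[int]], target: List[List[int]]) -> float:
    """Intersection over union of two binary masks with identical shape."""
    intersection = 0
    union = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                intersection += 1
            if p or t:
                union += 1
    # Assumed convention: two empty masks count as a perfect match.
    return intersection / union if union else 1.0


pred = [[1, 1, 0], [0, 1, 0]]
target = [[1, 0, 0], [0, 1, 1]]
print(mask_iou(pred, target))  # 0.5 (2 shared pixels out of 4 covered pixels)
```

IoU depends only on how many pixels the masks share and cover, so two predictions with the same score can differ in boundary quality; that is the kind of gap a contour-level measure is meant to expose.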
ISBN: (Print) 9798350302493
Image anonymization is widely adopted in practice to comply with privacy regulations in many regions. However, anonymization often degrades the quality of the data, reducing its utility for computer vision development. In this paper, we investigate the impact of image anonymization for training computer vision models on key computer vision tasks (detection, instance segmentation, and pose estimation). Specifically, we benchmark the recognition drop on common detection datasets, where we evaluate both traditional and realistic anonymization for faces and full bodies. Our comprehensive experiments show that traditional image anonymization substantially impacts final model performance, particularly when anonymizing the full body. Furthermore, we find that realistic anonymization can mitigate this decrease in performance, with our experiments showing a minimal performance drop for face anonymization. Our study demonstrates that realistic anonymization can enable privacy-preserving computer vision development with minimal performance degradation across a range of important computer vision benchmarks.