ISBN (digital): 9781665490627
ISBN (print): 9781665490627
Recognizing human-object interactions is challenging due to their spatio-temporal changes. We propose the Spatio-Temporal Interaction Transformer-based (STIT) network to reason about such changes. Specifically, spatial transformers learn the context of humans and objects at a specific frame time. A temporal transformer then learns higher-level relations between the spatial context representations at different time steps, capturing long-term dependencies across frames. We further investigate multiple hierarchy designs for learning human interactions. We achieve superior performance on the Charades, Something-Something v1 and CAD-120 datasets compared to baseline models that do not learn human-object relations and to prior graph-based networks. We also achieve a state-of-the-art accuracy of 95.93% on the CAD-120 dataset [1] using RGB data only.
ISBN (print): 3540252703
Irregular pyramids are made of a stack of successively reduced graphs embedded in the plane. Such pyramids are often used within the segmentation and connected component analysis frameworks to detect meaningful objects together with their spatial and topological relationships. The graphs reduced in the pyramid may be region adjacency graphs, dual graphs or combinatorial maps. With any of these graphs, each vertex of a reduced graph encodes a region of the image. With simple graphs, one edge between two vertices encodes the existence of a common boundary between two regions. With dual graphs and combinatorial maps, each connected boundary segment between two regions is associated with one edge. Moreover, special edges called loops may be used to differentiate a special type of adjacency in which one region surrounds the other. We show in this article that the loop information does not make it possible to distinguish the inside from the outside of the loop by local computations. We provide a method based on the combinatorial pyramid framework that uses the orientation explicitly encoded by combinatorial maps to determine inside and outside with local computations.
ISBN (print): 9783642208447
In this paper we describe modifications of irregular image segmentation pyramids based on user interaction. We first build a hierarchy of segmentations with the minimum spanning tree based method; regions from different (granularity) levels are then combined into a final (better) segmentation, with user-specified operations guiding the segmentation process. Based on these operations, users can produce a final image segmentation that best suits their applications. This work can be used in applications that need accurate image segmentation, such as annotating images or creating ground truth, among others.
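As a rough illustration of how a minimum spanning tree can induce a hierarchy of segmentations, the sketch below runs Kruskal-style merging on a 4-neighbour pixel graph and records one label map per weight threshold. It is only a sketch under assumptions: the edge weight (absolute intensity difference), the thresholds, and the names `grid_edges` and `mst_hierarchy` are hypothetical and are not taken from the paper, whose pyramid construction and user operations may differ.

```python
from itertools import product

def grid_edges(img):
    """Yield (weight, u, v) for 4-neighbour pixel pairs (assumed weight: |intensity difference|)."""
    h, w = len(img), len(img[0])
    for r, c in product(range(h), range(w)):
        if c + 1 < w:
            yield abs(img[r][c] - img[r][c + 1]), r * w + c, r * w + c + 1
        if r + 1 < h:
            yield abs(img[r][c] - img[r + 1][c]), r * w + c, (r + 1) * w + c

def find(parent, x):
    """Union-find root lookup with path halving."""
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def mst_hierarchy(img, thresholds):
    """Return one flat label map per threshold; larger thresholds give coarser regions."""
    h, w = len(img), len(img[0])
    parent = list(range(h * w))
    edges = sorted(grid_edges(img))
    levels, i = [], 0
    for t in sorted(thresholds):
        # Kruskal step: process edges in increasing weight and merge up to weight t.
        while i < len(edges) and edges[i][0] <= t:
            _, u, v = edges[i]
            ru, rv = find(parent, u), find(parent, v)
            if ru != rv:
                parent[rv] = ru
            i += 1
        levels.append([find(parent, p) for p in range(h * w)])
    return levels

img = [
    [10, 11, 50, 52],
    [10, 12, 51, 90],
    [11, 12, 53, 91],
]
fine, coarse = mst_hierarchy(img, thresholds=[3, 40])
print(len(set(fine)), len(set(coarse)))  # prints "3 1"
```

Because the levels are nested, every fine region lies inside exactly one coarse region, which is the property that would let a user pick regions from different levels for the final segmentation.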
In recent years graph embedding has emerged as a promising solution for enabling the expressive, convenient, powerful but computationally expensive graph-based representations to benefit from mature, less expensive and ...
ISBN (digital): 9783030200817
ISBN (print): 9783030200800
This book constitutes the refereed proceedings of the 12th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition, GbRPR 2019, held in Tours, France, in June 2019. The 22 full papers included in this volume, together with an invited talk, were carefully reviewed and selected from 28 submissions. The papers discuss research results and applications at the intersection of pattern recognition, image analysis, and graph theory. They cover topics such as graph edit distance, graph matching, machine learning for graph problems, network and graph embedding, spectral graph problems, and parallel algorithms for graph problems.
ISBN (digital): 9783642382215
ISBN (print): 9783642382208
This book constitutes the refereed proceedings of the 9th IAPR-TC-15 International Workshop on Graph-Based Representations in Pattern Recognition, GbRPR 2013, held in Vienna, Austria, in May 2013.
The 24 papers presented in this volume were carefully reviewed and selected from 27 submissions. They are organized in topical sections named: finding subregions in graphs; graph matching; classification; graph kernels; properties of graphs; topology; graph representations, segmentation and shape; and search in graphs.
ISBN (print): 3540648585
The proceedings contain 113 papers. The special focus in this conference is on Structural Matching, Grammatical Inference and Recognition of 2D and 3D Objects. The topics include: error-tolerant graph matching; semantic content based image retrieval using object-process diagrams; pattern recognition methods in image and video databases; efficient matching with invariant local descriptors; integrating numerical and syntactic learning models for pattern recognition; synthesis of function-described graphs; marked subgraph isomorphism of ordered graphs; distance evaluation in pattern matching based on frontier topological graph; syntactic interpolation of fractal sequences; minimizing the topological structure of line images; genetic algorithms for structural editing; the noisy subsequence tree recognition problem; object recognition from large structural libraries; acquisition of 2-D shape models from scenes with overlapping objects using string matching; a taxonomy of occlusion in view signature II representations; a survey of non-thinning based vectorization methods; a benchmark for raster to vector conversion systems; network-based recognition of architectural symbols; recovering image structure by model-based interaction map; an improved scheme to fingerprint classification; character recognition with k-head finite array automata; using semantics in matching cursive Chinese handwritten annotations; concavity detection using a binary mask-based approach; structural indexing of line pictures with feature generation models; nonlinear covariance for multi-band image data; a neural network for image smoothing and segmentation; prototyping structural descriptions and neural network based learning of local compatibilities for segment grouping.
ISBN (print): 9783540729020
We introduce a method for computing the homology groups of a 2D image and their generators using a hierarchical structure, i.e., an irregular graph pyramid. Starting from an image, a hierarchy of the image is built by two operations that preserve the homology of each region. Instead of computing homology generators at the base, where the number of entities (cells) is large, we first reduce the number of cells with a graph pyramid. Homology generators are then computed efficiently at the top level of the pyramid, where the number of cells is small, and a top-down process is used to deduce homology generators at any level of the pyramid, including the base level, i.e., the initial image. We show that the new method produces valid homology generators and present some experimental results.
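To make concrete what the homology groups of a 2D binary image measure, the sketch below counts connected components (Betti number b0) and holes (Betti number b1) by flood fill. It does not implement the pyramid-based method of the abstract and it computes no generators; it is a plain substitute technique, assuming 8-connectivity for the foreground and 4-connectivity for the background. The names `count_components` and `betti_numbers` are hypothetical.

```python
N4 = [(-1, 0), (1, 0), (0, -1), (0, 1)]
N8 = N4 + [(-1, -1), (-1, 1), (1, -1), (1, 1)]

def count_components(cells, neighbours):
    """Count connected components of a set of (row, col) cells by flood fill."""
    cells, count = set(cells), 0
    while cells:
        stack = [cells.pop()]
        count += 1
        while stack:
            r, c = stack.pop()
            for dr, dc in neighbours:
                n = (r + dr, c + dc)
                if n in cells:
                    cells.remove(n)
                    stack.append(n)
    return count

def betti_numbers(img):
    """Return (b0, b1) of the foreground of a binary image given as a list of rows."""
    h, w = len(img), len(img[0])
    fg = [(r, c) for r in range(h) for c in range(w) if img[r][c]]
    # Pad the background by one pixel so that the outside forms a single component.
    bg = [(r, c) for r in range(-1, h + 1) for c in range(-1, w + 1)
          if not (0 <= r < h and 0 <= c < w and img[r][c])]
    b0 = count_components(fg, N8)       # foreground components (assumed 8-connected)
    b1 = count_components(bg, N4) - 1   # background components minus the outside = holes
    return b0, b1

ring_and_dot = [
    [1, 1, 1, 0, 0],
    [1, 0, 1, 0, 1],
    [1, 1, 1, 0, 0],
]
print(betti_numbers(ring_and_dot))  # prints (2, 1)
```

A flood fill over all pixels works at the base level, where the number of cells is largest; the abstract's point is that the same information can be obtained on a much smaller top level and propagated down.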
ISBN (digital): 9783030200817
ISBN (print): 9783030200817; 9783030200800
A digital image can be perceived as a 2.5D surface consisting of the pixel coordinates, with the intensity of each pixel as the height of the corresponding point on the surface. Such surfaces can be efficiently represented by a pair of dual plane graphs: the neighborhood (primal) graph and its dual. By defining the orientation of edges in the primal graph and using Local Binary Patterns (LBPs), we can categorize the vertices corresponding to the pixels into critical points (maximum, minimum, saddle) or slope points. The basic operations of contraction and removal of edges in the primal graph result in graph configurations with different combinations of critical and non-critical points. The faces of the graph resemble a slope region after restoration of the continuous surface by successive monotone cubic interpolation. In this paper, we define the orientation of edges in the dual graph such that it remains consistent with the primal graph. Further, we give necessary and sufficient conditions for merging two adjacent slope regions.
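The categorization of vertices by Local Binary Patterns can be sketched for interior pixels of a 4-connected grid. The sketch assumes that no neighbour has the same intensity as the centre (no plateaus) and that the neighbours are visited in circular order N, E, S, W; the function name `classify` and the string labels are hypothetical. The number of 0/1 changes around the circle decides the category: none means an extremum, two a slope point, four a saddle.

```python
def classify(img, r, c):
    """Classify interior pixel (r, c) as 'maximum', 'minimum', 'saddle' or 'slope'."""
    center = img[r][c]
    # 4-neighbours in circular order: north, east, south, west.
    nbrs = [img[r - 1][c], img[r][c + 1], img[r + 1][c], img[r][c - 1]]
    # LBP bit: True where the neighbour is higher than the centre (ties assumed absent).
    bits = [n > center for n in nbrs]
    # Count bit changes between circularly adjacent neighbours.
    transitions = sum(bits[i] != bits[(i + 1) % 4] for i in range(4))
    if transitions == 0:
        return "minimum" if bits[0] else "maximum"
    return "slope" if transitions == 2 else "saddle"

peak   = [[1, 2, 1], [2, 9, 2], [1, 2, 1]]
pit    = [[5, 4, 5], [4, 0, 4], [5, 4, 5]]
saddle = [[0, 5, 0], [1, 3, 1], [0, 5, 0]]
ramp   = [[1, 2, 3], [4, 5, 6], [7, 8, 9]]
print([classify(m, 1, 1) for m in (peak, pit, saddle, ramp)])
```

Border pixels have fewer than four neighbours and would need separate handling, which this sketch leaves out.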
ISBN (print): 9783540729020
We compare different statistical characterizations of a set of strings for three different histogram-based distances. Given a distance, a set of strings may be characterized by its generalized median, i.e., the string (over the set of all possible strings) that minimizes the sum of distances to every string of the set, or by its set median, i.e., the string of the set that minimizes the sum of distances to every other string of the set. For the first two histogram-based distances, we show that the generalized median string can be computed efficiently; for the third one, which biases histograms with individual substitution costs, we conjecture that this is an NP-hard problem, and we introduce two different heuristic algorithms for approximating it. We experimentally compare the relevance of the three histogram-based distances, and of the different statistical characterizations of sets of strings, for classifying images that are represented by strings.
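The set median defined above can be sketched directly from its definition. The abstract does not specify the three histogram-based distances, so the sketch assumes one simple choice: the L1 distance between symbol-count histograms. The names `hist_distance` and `set_median` are hypothetical, and the generalized median, which ranges over all possible strings, is not computed here.

```python
from collections import Counter

def hist_distance(a, b):
    """Assumed distance: L1 difference between the symbol-count histograms of a and b."""
    ha, hb = Counter(a), Counter(b)
    return sum(abs(ha[s] - hb[s]) for s in ha.keys() | hb.keys())

def set_median(strings):
    """Return the string of the set that minimizes the sum of distances to the others."""
    return min(strings, key=lambda s: sum(hist_distance(s, t) for t in strings))

strings = ["aab", "abb", "aaa", "abc"]
print(set_median(strings))  # prints "aab" (distance sums: aab 6, abb 8, aaa 10, abc 8)
```

Including the distance of a string to itself adds zero to every sum, so it does not change which string is selected. The search is quadratic in the number of strings, whereas the generalized median cannot be found by enumeration because its candidate set is unbounded.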