ISBN: 9781424439942 (print)
In this paper, we focus on face recognition over image sets, where each set is represented by a linear subspace. Linear Discriminant Analysis (LDA) is adopted for discriminative learning. After investigating the relation between regularization on the Fisher Criterion and the Maximum Margin Criterion, we present a unified framework for regularized LDA. Within this framework, the ratio-form maximization of regularized Fisher LDA can be reduced to a difference-form optimization with an additional constraint. By incorporating the empirical loss as the regularization term, we introduce a generalized Square Loss based Regularized LDA (SLR-LDA), along with suggestions for parameter setting. Our approach outperforms state-of-the-art methods on face recognition. Its effectiveness is also verified in general object and object category recognition experiments.
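For orientation, the two criteria named in the abstract are commonly written as below. This is a sketch using standard textbook forms, not the paper's own notation: the symbols S_b and S_w (between-class and within-class scatter), the ridge parameter \lambda, and the orthonormality constraint are assumptions, and the additional constraint derived in the paper may differ.

```latex
% Standard textbook forms (assumed notation, not taken from the paper).
% S_b, S_w: between- and within-class scatter matrices; W: projection matrix.
\begin{align}
  \text{Regularized Fisher criterion (ratio form):}\quad
    & \max_{W}\ \operatorname{tr}\!\left( \left( W^{\top} (S_w + \lambda I)\, W \right)^{-1} W^{\top} S_b W \right) \\
  \text{Maximum Margin Criterion (difference form):}\quad
    & \max_{W}\ \operatorname{tr}\!\left( W^{\top} (S_b - S_w)\, W \right)
      \quad \text{s.t. } W^{\top} W = I
\end{align}
```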
ISBN: 9781665448994 (print)
In this paper we present our approach to Track 1 of the 2021 AI City Challenge. The goal of the challenge track is to analyse footage captured with traffic cameras by counting the number of vehicles performing various pre-defined motions of interest. Our approach is based on the CenterTrack object detection and tracking neural network, used in conjunction with a simple IoU-based tracking algorithm. On the public evaluation server our system achieved an S1 score of 0.8449, placing it 8th on the public leaderboard.
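The abstract does not spell out the IoU-based tracking step, so the following is only a minimal sketch of what such an association step might look like. The box format (x1, y1, x2, y2), the greedy matching strategy, the 0.5 threshold, and the names `iou` and `match_tracks` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def match_tracks(
    tracks: List[Box], detections: List[Box], threshold: float = 0.5
) -> List[Tuple[int, int]]:
    """Greedily pair each track with its best unused detection.

    Returns (track_index, detection_index) pairs whose IoU is at least
    `threshold` (a hypothetical default, not a value from the paper).
    """
    pairs: List[Tuple[int, int]] = []
    used = set()
    for ti, track_box in enumerate(tracks):
        best_j, best_score = -1, threshold
        for dj, det_box in enumerate(detections):
            if dj in used:
                continue
            score = iou(track_box, det_box)
            if score >= best_score:
                best_j, best_score = dj, score
        if best_j >= 0:
            used.add(best_j)
            pairs.append((ti, best_j))
    return pairs
```

Greedy matching is chosen here only for brevity; an optimal assignment (for example, the Hungarian algorithm) is a common alternative when tracks compete for the same detection.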
ISBN: 9781665448994 (print)
This paper introduces a novel dataset for video enhancement and studies the state-of-the-art methods of the NTIRE 2021 challenge on quality enhancement of compressed video. The challenge is the first NTIRE challenge in this direction, with three competitions, hundreds of participants and tens of proposed solutions. Our newly collected Large-scale Diverse Video (LDV) dataset is employed in the challenge. In our study, we analyze the solutions of the challenge and several representative methods from previous literature on the proposed LDV dataset. We find that the NTIRE 2021 challenge advances the state of the art in quality enhancement of compressed video.
ISBN: 0769523722 (print)
Our objective is to model the visual manifold of object appearance corresponding to geometric transformation. We learn a generative model for object appearance where the appearance of the object at each new frame is a function that maps from a conceptual representation of the geometric transformation space into the visual manifold. By learning such a generative model, we can infer the geometric transformation (track) directly from the tracked object appearance. As a result, tracking can be achieved in closed form and can therefore be done very efficiently.
ISBN: 9781479943098 (print)
Recent interest in developing online computer vision algorithms is spurred in part by a growth of applications capable of generating large volumes of images and videos. These applications are rich sources of images and video streams. Online vision algorithms for managing, processing and analyzing these streams need to rely upon streaming concepts, such as pipelines, to ensure timely and incremental processing of data. This paper is a first attempt at defining a formal stream algebra that provides a mathematical description of vision pipelines and describes the distributed manipulation of image and video streams. We also show how our algebra can effectively describe the vision pipelines of two state-of-the-art techniques.
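The abstract does not list the operators of the stream algebra, so the sketch below only illustrates the general idea of incremental, pipelined stream processing with Python generators. The operator names `smap`, `sfilter`, and `pipeline`, and the use of integers as stand-ins for frames, are assumptions for illustration and are not the paper's algebra.

```python
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def smap(f: Callable[[T], U], stream: Iterable[T]) -> Iterator[U]:
    """Apply f to each item as it arrives (lazy, one item at a time)."""
    for item in stream:
        yield f(item)


def sfilter(pred: Callable[[T], bool], stream: Iterable[T]) -> Iterator[T]:
    """Pass through only the items that satisfy pred."""
    for item in stream:
        if pred(item):
            yield item


def pipeline(*stages: Callable[[Iterable], Iterable]) -> Callable[[Iterable], Iterable]:
    """Compose stream-to-stream stages, applied left to right."""
    def run(stream: Iterable) -> Iterable:
        for stage in stages:
            stream = stage(stream)
        return stream
    return run


# Hypothetical pipeline: integers stand in for frames.
double_then_keep_multiples_of_3 = pipeline(
    lambda s: smap(lambda x: x * 2, s),
    lambda s: sfilter(lambda x: x % 3 == 0, s),
)
```

Because each stage is a generator, no stage waits for the full stream; items flow through one at a time, which is the incremental behaviour the abstract refers to.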
ISBN: 9781538661000 (digital)
ISBN: 9781538661000 (print)
In this paper we address the problem of unconstrained Word Spotting in scene images. We train a Fully Convolutional Network to produce heatmaps of all the character classes. Then, we employ the Text Proposals approach and, via a rectangle classifier, detect the most likely rectangle for each query word based on the character attribute maps. We evaluate the proposed method on ICDAR2015 and show that it is capable of identifying and recognizing query words in natural scene images.
Author: Caglioti, V
Affiliation: Politecn Milan, Dipartimento Elettron & Informazione, AI & Robot Project, I-20133 Milan, Italy
ISBN: 0769506623 (print)
The space requirements for indexing under perspective projections are addressed. It is known that the surface representing the set of possible images of a model point set within the index space must be three-dimensional [1]. Under affine projections, the representing surface can be factorized as the Cartesian product of lower-dimensional surfaces: these are obtained by projecting the representing surface onto orthogonal subspaces of the index space [2] [5]. This paper shows that, under perspective, such a factorization does not exist, yielding a negative answer to a question left open in [1]. However, it is shown that there exist subspaces of the index space onto which the projection of the representing surface is two-dimensional.
ISBN: 0769506623 (print)
Towards the goal of realizing a generic automatic human activity recognition system, a new formalism is proposed. Activities are described by a chained hierarchical representation using three types of entities: image features, mobile object properties and scenarios. Taking image features of tracked moving regions from an image sequence as input, mobile object properties are first computed by specific methods, while noise is suppressed by statistical methods. Scenarios are recognized from mobile object properties based on Bayesian analysis. A sequential occurrence of several scenarios is recognized by an algorithm using a probabilistic finite-state automaton (a variant of structured HMM). The demonstration of the optimality of these recognition methods is discussed. Finally, the validity and the effectiveness of our approach are demonstrated on both real-world and perturbed data.
ISBN: 9781538661000 (digital)
ISBN: 9781538661000 (print)
Building footprints (BFP) provide useful visual context for users of digital maps when navigating in space. This paper proposes a method for extracting and symbolizing building footprints from satellite imagery using a convolutional neural network (CNN). The CNN architecture outputs rotated rectangles, providing a symbolized approximation that works well for small buildings. Experiments are conducted on the four cities in the DeepGlobe Challenge dataset (Las Vegas, Paris, Shanghai, Khartoum). Our method performs best on suburbs consisting of individual houses. These experiments show that either large buildings or buildings without clear delineation produce weaker results in terms of precision and recall.
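To make the rotated-rectangle output concrete, here is a minimal sketch that converts one such rectangle into its four corner points. The parameterization (center, width, height, rotation angle in radians) and the function name `rotated_rect_corners` are assumptions for illustration; the abstract does not state how the network encodes its rectangles.

```python
import math
from typing import List, Tuple


def rotated_rect_corners(
    cx: float, cy: float, w: float, h: float, angle_rad: float
) -> List[Tuple[float, float]]:
    """Return the four corners of a rectangle rotated about its center.

    Assumed parameterization: center (cx, cy), width w, height h, and a
    counter-clockwise rotation of angle_rad in a standard x-right, y-up frame.
    """
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    # Corner offsets of the unrotated rectangle, relative to the center.
    offsets = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each offset by the angle, then translate to the center.
    return [(cx + dx * c - dy * s, cy + dx * s + dy * c) for dx, dy in offsets]
```

The resulting polygon can then be drawn on a map or compared against ground-truth footprints; in image coordinates with the y axis pointing down, the same formula yields a clockwise rotation on screen.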
ISBN: 9781665487399 (digital)
ISBN: 9781665487399 (print)
Adversarial Training (AT) is crucial for obtaining deep neural networks that are robust to adversarial attacks, yet recent works found that it could also make models more vulnerable to privacy attacks. In this work, we further reveal this unsettling property of AT by designing a novel privacy attack that is practically applicable to the privacy-sensitive Federated Learning (FL) systems. Using our method, the attacker can exploit AT models in the FL system to accurately reconstruct users' private training images even when the training batch size is large. Code is available at https://***/zjysteven/PrivayAttack_AT_FL.