ISBN: 9781665487399 (print)
The aim of this paper is to demonstrate that a state-of-the-art feature matcher (LoFTR) can be made more robust to rotations by simply replacing the backbone CNN with a steerable CNN that is equivariant to translations and image rotations. Experiments show that this gain is obtained without reducing performance on ordinary illumination and viewpoint matching sequences.
ISBN: 9781665448994 (print)
In the context of variational auto-encoders, learning disentangled latent variable representations remains a challenging problem. In this abstract, we consider the semi-supervised setting, in which the factors of variation are labelled for a small fraction of our samples. We examine how the quality of learned representations is affected by the dimension of the unsupervised component of the latent space. We also consider a variational lower bound for the mutual information between the data and the semi-supervised component of the latent space, and analyze its role in the context of disentangled representation learning.
ISBN: 9781665448994 (print)
Self-attention is a cornerstone of transformer models. However, our analysis shows that self-attention in vision transformer inference is extremely sparse. When applying a sparsity constraint, our experiments on image (ImageNet-1K) and video (Kinetics-400) understanding show that we can achieve 95% sparsity on the self-attention maps while keeping the performance drop below 2 points. This motivates us to rethink the role of self-attention in vision transformer models.
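To make the notion of attention-map sparsity concrete, the following is a minimal sketch that keeps only the largest weights in a single attention row and renormalizes them. The abstract does not state how the sparsity constraint is applied, so the top-k rule, the function names `softmax` and `sparsify_row`, and the toy scores are all assumptions made for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sparsify_row(attn_row, sparsity):
    """Keep the largest (1 - sparsity) fraction of weights and renormalize.

    Assumption: a simple top-k rule; the paper's constraint may differ.
    """
    n = len(attn_row)
    keep = max(1, round(n * (1.0 - sparsity)))
    # Indices of the `keep` largest attention weights.
    top = sorted(range(n), key=lambda i: attn_row[i], reverse=True)[:keep]
    kept = set(top)
    pruned = [w if i in kept else 0.0 for i, w in enumerate(attn_row)]
    total = sum(pruned)
    return [w / total for w in pruned]


# Toy attention row over 20 tokens (hypothetical scores).
row = softmax([float(i) for i in range(20)])
sparse_row = sparsify_row(row, sparsity=0.95)
nonzero = sum(1 for w in sparse_row if w > 0.0)
```

With 20 tokens and 95% sparsity, a single weight survives per row; a real model would apply the same pruning to every row of every head.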
ISBN: 9781479943098 (print)
During the performance optimization of a computer vision system, developers frequently run into platform-level inefficiencies and bottlenecks that cannot be addressed by traditional methods. OpenVX is designed to address such system-level issues by means of a graph-based computation model. This approach differs from the traditional acceleration of one-off functions, and exposes optimization possibilities that might not be available or obvious with traditional computer vision libraries such as OpenCV.
ISBN: 9781538607336 (print)
Event-based vision, as realized by bio-inspired Dynamic Vision Sensors (DVS), is becoming increasingly popular because it combines high temporal resolution, wide dynamic range, and power efficiency. Potential applications include surveillance, robotics, and autonomous navigation under uncontrolled environmental conditions. In this paper, we deal with event-based vision for 3D reconstruction of dynamic scene content by using two stationary DVS in a stereo configuration. We focus on a cooperative stereo approach and suggest an improvement over a previously published algorithm that reduces the measured mean error by over 50 percent. An available ground truth data set for stereo event data is used to analyze the algorithm's sensitivity to parameter variation and to compare it with competing techniques.
ISBN: 9781479943098 (print)
This work analyzes the problem of homography estimation for robust target matching in the context of real-time mobile vision. We present a device-friendly implementation of the Gaussian Elimination algorithm and show that our optimized approach can significantly improve the homography estimation step in a hypothesize-and-verify scheme. Experiments are performed on image sequences in which both speed and accuracy are evaluated and compared with conventional homography estimation schemes.
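As an illustration of how Gaussian elimination enters homography estimation, the sketch below solves the standard 8x8 linear system that arises from four point correspondences. It is a textbook formulation, not the device-friendly implementation the abstract refers to; fixing the last homography entry to 1 and the function names `solve_linear`, `homography_from_4_points`, and `apply_homography` are assumptions made for this example.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix [a | b], copied so the inputs are not modified.
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    # Back substitution.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_4_points(src, dst):
    """Return the 3x3 homography mapping src[i] to dst[i], with h33 fixed to 1."""
    a, b = [], []
    for (x, y), (u, v) in zip(src, dst):
        a.append([x, y, 1.0, 0.0, 0.0, 0.0, -u * x, -u * y])
        b.append(u)
        a.append([0.0, 0.0, 0.0, x, y, 1.0, -v * x, -v * y])
        b.append(v)
    h = solve_linear(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h, point):
    """Map a 2D point through h using homogeneous coordinates."""
    x, y = point
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)


# Hypothetical correspondences: unit square scaled by 2 and shifted by (3, 5).
src = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)]
dst = [(3.0, 5.0), (5.0, 5.0), (5.0, 7.0), (3.0, 7.0)]
H = homography_from_4_points(src, dst)
center = apply_homography(H, (0.5, 0.5))
```

In a hypothesize-and-verify scheme such as RANSAC, a solver of this kind would be called once per sampled set of four matches, which is why its cost matters on mobile devices.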
ISBN: 9780769549903 (print)
In this paper we present a new approach for the evaluation of event-based Silicon Retina stereo matching results. The evaluation of stereo matching algorithm results is a necessary task for the development, comparison, and improvement of depth-generating camera systems. In contrast to conventional frame-based cameras, silicon retina sensors deliver asynchronous events instead of synchronous intensity or color images. The polarity of an event represents either an increase (on-event) or a decrease (off-event) in the brightness of the projected scene point. For this reason, existing ground truth data and evaluation platforms are not suitable for testing silicon retina stereo camera systems. To analyze the proposed evaluation method, we use an area-based (sum of absolute differences) algorithm for the event-driven sensor system. A conventional video camera stereo vision system is used to produce reference data. The results show that the presented method offers new opportunities for the evaluation of stereo matching results computed from silicon retina stereo data.
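The area-based matching cost mentioned above can be sketched on a single scanline as follows. This is a generic sum-of-absolute-differences (SAD) search over plain intensity values, not the paper's event-driven variant; the function names `sad` and `best_disparity`, the window size, and the toy rows are assumptions made for illustration.

```python
def sad(left, right, x, d, half):
    """SAD between a window centered at x in `left` and at x - d in `right`."""
    return sum(abs(left[x + k] - right[x + k - d])
               for k in range(-half, half + 1))


def best_disparity(left, right, x, max_disp, half=1):
    """Return the disparity in [0, max_disp] with the lowest SAD cost at x."""
    # Only disparities whose right-image window stays inside the row are tried.
    candidates = [d for d in range(max_disp + 1) if x - d - half >= 0]
    return min(candidates, key=lambda d: sad(left, right, x, d, half))


# Toy scanlines: the pattern 9, 5, 7 appears 2 pixels further right in `left`.
right_row = [0, 0, 9, 5, 7, 0, 0, 0, 0, 0]
left_row = [0, 0, 0, 0, 9, 5, 7, 0, 0, 0]
disparity = best_disparity(left_row, right_row, x=5, max_disp=4)
```

For event data, the window contents would have to be built from accumulated events rather than intensities, which is one reason frame-based ground truth does not transfer directly.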
ISBN: 9781665487399 (digital)
ISBN: 9781665487399 (print)
Understanding the complex relationship between emotions and facial expressions is important for both psychologists and computer scientists. A large body of research in psychology investigates facial expressions, emotions, and how emotions are perceived from facial expressions. As computer scientists look to incorporate this research into automatic emotion perception systems, it is important to understand the nature and limitations of human emotion perception. These principles of emotion science affect the way datasets are created, methods are implemented, and results are interpreted in automated emotion perception. This paper aims to distill and align prior work in automated and human facial emotion perception to facilitate future discussions and research at the intersection of the two disciplines.
ISBN: 9781665487399 (digital)
ISBN: 9781665487399 (print)
Trajectory prediction is an important task in autonomous driving. State-of-the-art trajectory prediction models often use attention mechanisms to model the interaction between agents. In this paper, we show that the attention information from such models can also be used to measure the importance of each agent with respect to the ego vehicle's future planned trajectory. Our experimental results on the nuPlans dataset show that our method can effectively find and rank surrounding agents by their impact on the ego's plan.
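One simple way to turn attention information into an agent ranking is sketched below. The abstract does not specify how attention is aggregated, so averaging the ego-to-agent weights over heads, the function name `rank_agents`, and the agent identifiers and weights are all hypothetical.

```python
def rank_agents(attention, agent_ids):
    """Rank agents by mean attention weight from the ego query.

    attention[h][j] is the weight that head h assigns to agent j.
    Assumption: head-averaged attention serves as the importance score.
    """
    n_heads = len(attention)
    scores = [sum(head[j] for head in attention) / n_heads
              for j in range(len(agent_ids))]
    order = sorted(range(len(agent_ids)), key=lambda j: scores[j], reverse=True)
    return [(agent_ids[j], scores[j]) for j in order]


# Hypothetical attention weights from two heads over three surrounding agents.
attention = [[0.1, 0.7, 0.2],
             [0.2, 0.5, 0.3]]
agent_ids = ["car_a", "car_b", "ped_c"]
ranking = rank_agents(attention, agent_ids)
ranked_ids = [agent for agent, _ in ranking]
```

Here `ranked_ids` places `car_b` first because both heads attend to it most strongly; other aggregation choices (for example, taking the maximum over heads or layers) would be equally plausible.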
ISBN: 9781538661000 (digital)
ISBN: 9781538661000 (print)
Object recognition in satellite images is one of the most relevant and popular topics in pattern recognition. Several factors have driven this: the large number of satellites providing high-resolution imagery, significant progress in computer vision (in particular the breakthrough in convolutional neural networks), a wide range of industry applications, and a market that is still largely open. Roads are among the most popular objects for recognition. In this article, we present a combination of a neural network and a postprocessing algorithm that yields not only the coverage mask but also vectors for all individual roads present in the image, which can be used to address higher-level tasks in the future. This approach was used to solve the DeepGlobe Road Extraction Challenge.