ISBN (print): 9781665448994
The NTIRE 2021 workshop features a Multi-modal Aerial View Object Classification Challenge. Its focus is on multi-sensor imagery classification in order to improve the performance of automatic target recognition (ATR) systems. In this paper we describe our entry in this challenge, a method focused on efficiency and low computational time while maintaining a high level of accuracy. The method is a convolutional neural network with 11 convolutions, 1 max pooling layer, and 3 residual blocks, with a total of 373,130 parameters. The method ranks 3rd in Track 2 (SAR+EO) of the challenge.
ISBN (print): 0818672587
In stereo algorithms with more than two cameras, improved accuracy is often reported because they are robust against noise. However, another important aspect of polynocular stereo, the ability to detect occlusion, has received less attention. We intensively analyzed occlusion in the camera matrix stereo (SEA) and developed a simple but effective method to detect the presence of occlusion and to eliminate its effect in the correspondence search. By considering several statistics on occlusion and accuracy in the SEA, we derived a few base masks that represent occlusion patterns and are effective for detecting occlusion. Several experiments using typical indoor scenes showed quite good performance in obtaining dense and accurate depth maps, even at the occluding boundaries of objects.
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
Understanding the complex relationship between emotions and facial expressions is important for both psychologists and computer scientists. A large body of research in psychology investigates facial expressions, emotions, and how emotions are perceived from facial expressions. As computer scientists look to incorporate this research into automatic emotion perception systems, it is important to understand the nature and limitations of human emotion perception. These principles of emotion science affect the way datasets are created, methods are implemented, and results are interpreted in automated emotion perception. This paper aims to distill and align prior work in automated and human facial emotion perception to facilitate future discussions and research at the intersection of the two disciplines.
ISBN (print): 0818684976
In this paper we develop a representation for the temporal structure inherent in human actions and demonstrate an effective method for using that representation to detect the occurrence of actions. The temporal structure of the action, sub-actions, events, and sensor information is described using a constraint network based on Allen's interval algebra. We map these networks onto a simpler, 3-valued domain (past, now, fut) network - a PNF-network - to allow fast detection of actions and sub-actions. The occurrence of an action is computed by considering the minimal domain of its PNF-network, under constraints imposed by the current state of the sensors and the previous states of the network. We illustrate the approach with examples, showing that a major advantage of PNF propagation is the detection and removal of *** situations.
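The idea of narrowing past/now/future domains under sensor constraints can be illustrated with a small sketch. This is not the paper's algorithm: it is a plain arc-consistency loop, and the `BEFORE` and `DURING` tables are our own derivation under the assumed reading that P means an interval has finished, N that it is happening, and F that it has not started. The interval names (`reach`, `grasp`, `pick`) and the function `propagate` are hypothetical.

```python
from typing import Dict, Set, Tuple

Pair = Tuple[str, str]

# Allowed (A, B) value pairs when interval A is "before" B or "during" B.
# Assumed reading: P = already over, N = happening now, F = not yet started.
BEFORE: Set[Pair] = {("P", "P"), ("P", "N"), ("P", "F"), ("N", "F"), ("F", "F")}
DURING: Set[Pair] = {("F", "F"), ("F", "N"), ("N", "N"), ("P", "N"), ("P", "P")}


def propagate(
    domains: Dict[str, Set[str]],
    constraints: Dict[Tuple[str, str], Set[Pair]],
) -> Dict[str, Set[str]]:
    """Shrink each domain until every value has support in every constraint."""
    domains = {name: set(values) for name, values in domains.items()}
    changed = True
    while changed:
        changed = False
        for (a, b), allowed in constraints.items():
            # Keep only the values of a that are compatible with some value of b.
            new_a = {x for x in domains[a]
                     if any((x, y) in allowed for y in domains[b])}
            # Then keep only the values of b supported by the surviving values of a.
            new_b = {y for y in domains[b]
                     if any((x, y) in allowed for x in new_a)}
            if new_a != domains[a] or new_b != domains[b]:
                domains[a], domains[b] = new_a, new_b
                changed = True
    return domains


# Hypothetical action "pick" with sub-actions "reach" and "grasp".
constraints = {
    ("reach", "pick"): DURING,
    ("reach", "grasp"): BEFORE,
    ("grasp", "pick"): DURING,
}
# A sensor reports that "grasp" is happening now; the rest is unknown.
initial = {"reach": {"P", "N", "F"}, "grasp": {"N"}, "pick": {"P", "N", "F"}}
result = propagate(initial, constraints)
```

In this example the single sensor reading is enough to conclude that `reach` is over and `pick` is in progress. A domain that becomes empty would mean the constraints cannot all hold for the given sensor values.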
ISBN (print): 9781479943098
This work analyzes the problem of homography estimation for robust target matching in the context of real-time mobile vision. We present a device-friendly implementation of the Gaussian Elimination algorithm and show that our optimized approach can significantly improve the homography estimation step in a hypothesize-and-verify scheme. Experiments are performed on image sequences in which both speed and accuracy are evaluated and compared with conventional homography estimation schemes.
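As a rough illustration of how Gaussian elimination enters homography estimation, the sketch below solves the standard 8x8 linear system for a homography from four point correspondences (with the last matrix entry fixed to 1). It is a textbook formulation in plain Python, not the authors' device-optimized implementation; the function names are our own.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def gaussian_solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(map(float, row)) + [float(rhs)] for row, rhs in zip(a, b)]
    for col in range(n):
        # Pick the row with the largest pivot to limit round-off error.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def homography_from_4_points(
    src: Sequence[Point], dst: Sequence[Point]
) -> List[List[float]]:
    """Return the 3x3 homography mapping each src point to its dst point."""
    a: List[List[float]] = []
    b: List[float] = []
    for (x, y), (u, v) in zip(src, dst):
        # Two linear equations per correspondence, with h33 fixed to 1.
        a.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        b.append(u)
        a.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        b.append(v)
    h = gaussian_solve(a, b)
    return [h[0:3], h[3:6], [h[6], h[7], 1.0]]


def apply_homography(h: List[List[float]], p: Point) -> Point:
    """Map a point through h, dividing by the projective coordinate."""
    x, y = p
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    return ((h[0][0] * x + h[0][1] * y + h[0][2]) / w,
            (h[1][0] * x + h[1][1] * y + h[1][2]) / w)
```

In a hypothesize-and-verify scheme, a solver like this would be called once per sampled set of four matches, which is why its cost matters on mobile hardware.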
ISBN (print): 0818658274
This paper presents a Markov random field (MRF) model for object recognition in high-level vision. The labeling state of a scene in terms of a model object is considered as an MRF or coupled MRFs. Within the Bayesian framework, the optimal solution is defined as the maximum a posteriori (MAP) estimate of the MRF. The posterior distribution is derived based on sound mathematical principles from theories of MRF and probability, which is in contrast to heuristic formulations. An experimental result is presented.
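For readers unfamiliar with the MAP-MRF formulation, its generic form can be written as follows. The symbols are our own notation, not necessarily the paper's: f is a labeling, d the observed data, U(f) the prior energy of the MRF (a Gibbs distribution by the Hammersley-Clifford theorem), T a temperature constant, and Z the normalizing constant.

```latex
\begin{align}
  f^{*} &= \arg\max_{f} P(f \mid d)
         = \arg\max_{f} \, p(d \mid f)\, P(f), \\
  P(f)  &= Z^{-1} \exp\!\left(-\tfrac{1}{T}\, U(f)\right), \\
  f^{*} &= \arg\min_{f} \left[ U(f) - T \ln p(d \mid f) \right].
\end{align}
```

The third line follows from the first two by taking the negative logarithm and dropping the constant Z, so maximizing the posterior is equivalent to minimizing a posterior energy.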
ISBN (print): 0769512720
We describe a monocular real-time computer vision system that identifies shopping groups by detecting and tracking multiple people as they wait in a checkout line or at a service counter. Our system segments each frame into foreground regions which contain multiple people. Foreground regions are further segmented into individuals using a temporal segmentation of foreground and motion cues. Once a person is detected, an appearance model based on color and edge density in conjunction with a mean-shift tracker is used to recover the person's trajectory. People are grouped together as a shopping group by analyzing interbody distances. The system also monitors the cashier's activities to determine when shopping transactions start and end. Experimental results demonstrate the robustness and real-time performance of the algorithm.
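Grouping by interbody distance can be sketched in a few lines. The abstract does not say how the distances are analyzed, so the version below makes a strong simplifying assumption: people in a single frame are linked whenever their distance is at most a threshold, and groups are the resulting connected components. The function name `group_by_distance` and the threshold `max_dist` are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def group_by_distance(positions: Sequence[Point], max_dist: float) -> List[List[int]]:
    """Return groups of person indices linked by chains of close neighbours."""
    parent = list(range(len(positions)))

    def find(i: int) -> int:
        # Follow parent links to the group representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(positions)):
        for j in range(i + 1, len(positions)):
            if math.dist(positions[i], positions[j]) <= max_dist:
                parent[find(i)] = find(j)  # merge the two groups

    groups: Dict[int, List[int]] = {}
    for i in range(len(positions)):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Five tracked people along a checkout line (made-up coordinates).
people = [(0.0, 0.0), (1.0, 0.0), (10.0, 0.0), (10.5, 0.0), (30.0, 0.0)]
shopping_groups = group_by_distance(people, max_dist=2.0)
```

A real system would more plausibly accumulate distances along the tracked trajectories rather than decide from one frame, since strangers can stand close together briefly in a queue.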
ISBN (print): 9780769549903
Several papers have addressed ellipse detection as a first step for computer vision applications, but most of the proposed solutions are too slow to be applied in real time on large images or with limited hardware resources, as in the case of mobile devices. This demo is based on a novel algorithm for fast and accurate ellipse detection. The proposed algorithm relies on a careful selection of arcs that are candidates to form ellipses and on the use of the Hough transform to estimate parameters in a decomposed space. The demo will show it working on a commercial smartphone.
ISBN (print): 0780342364
We present a new approach to the tracking of highly non-rigid patterns of motion, such as water flowing down a stream. The algorithm is based on a "disturbance map," which is obtained by linearly subtracting the temporal average of the previous frames from the new frame. Every local motion creates a disturbance having the form of a wave, with a "head" at the present position of the motion and a historical "tail" that indicates the previous locations of that motion. These disturbances serve as loci of attraction for "tracking particles" that are scattered throughout the image. The algorithm is very fast and can be performed in real time. We provide excellent tracking results on various complex sequences, using both stabilized and moving cameras, showing a busy ant column, waterfalls, rapids and flowing streams, shoppers in a mall, and cars in a traffic intersection.
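The disturbance map described above can be sketched directly from its definition. The version below assumes grayscale frames stored as nested lists and an unweighted mean over a window of previous frames; the abstract does not specify the averaging scheme, so that choice and the function names are assumptions.

```python
from typing import List, Sequence

Frame = List[List[float]]


def temporal_average(frames: Sequence[Frame]) -> Frame:
    """Per-pixel mean over a non-empty sequence of equally sized frames."""
    if not frames:
        raise ValueError("need at least one previous frame")
    rows, cols = len(frames[0]), len(frames[0][0])
    return [[sum(f[r][c] for f in frames) / len(frames) for c in range(cols)]
            for r in range(rows)]


def disturbance_map(previous_frames: Sequence[Frame], new_frame: Frame) -> Frame:
    """Subtract the temporal average of the previous frames from the new frame."""
    avg = temporal_average(previous_frames)
    return [[new_frame[r][c] - avg[r][c] for c in range(len(avg[0]))]
            for r in range(len(avg))]


# Two previous 2x2 frames and a new frame in which one pixel brightens.
history = [[[10, 10], [10, 10]], [[20, 20], [20, 20]]]
current = [[15, 25], [15, 15]]
dmap = disturbance_map(history, current)
```

Only the pixel that departs from its recent average gets a non-zero value, which is the "head" of a disturbance; as the average absorbs it over later frames, the decaying values left behind form the "tail".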
ISBN (print): 0818684976
Manufacturing flaws of all types, shapes, and sizes can be exhaustively detected as abnormal pixels if process and noise variations can be learned at every pixel in the inspection area. This statistical template approach to automated visual inspection is extremely fast, effective, and flexible, while achieving a false negative rate below 10^-6. Critical to this approach are the following novel features: 1) represent both geometry *** process information in a model template; 2) align 3D surfaces with subpixel accuracy; 3) compensate for local deformation and texture; 4) estimate bimodal distribution robustly. This novel paradigm was applied to the automatic screening of X-ray images of turbine blades. It has been validated with over 50,000 images and shown to outperform regular inspectors looking at high-pass filtered images.
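The core idea of learning variation at every pixel and flagging outliers can be sketched as follows. This is a deliberately simplified stand-in: it models each pixel with a single mean and standard deviation, whereas the abstract mentions robust bimodal estimation, alignment, and deformation compensation, none of which are attempted here. The threshold `k`, the noise floor `min_std`, and the function names are assumptions.

```python
from statistics import mean, pstdev
from typing import List, Sequence, Tuple

Image = List[List[float]]


def learn_template(images: Sequence[Image]) -> Tuple[Image, Image]:
    """Return per-pixel (means, standard deviations) over aligned good images."""
    rows, cols = len(images[0]), len(images[0][0])
    means = [[mean(img[r][c] for img in images) for c in range(cols)]
             for r in range(rows)]
    stds = [[pstdev([img[r][c] for img in images]) for c in range(cols)]
            for r in range(rows)]
    return means, stds


def abnormal_pixels(image: Image, means: Image, stds: Image,
                    k: float = 4.0, min_std: float = 1.0) -> List[Tuple[int, int]]:
    """List (row, col) of pixels deviating more than k standard deviations."""
    flagged = []
    for r, row in enumerate(image):
        for c, value in enumerate(row):
            # The floor keeps pixels with zero learned variance from firing on noise.
            sigma = max(stds[r][c], min_std)
            if abs(value - means[r][c]) > k * sigma:
                flagged.append((r, c))
    return flagged


# Three aligned "good" 1x3 images, then a test image with one bright flaw.
training = [[[10, 20, 30]], [[12, 20, 30]], [[11, 20, 30]]]
means, stds = learn_template(training)
flaws = abnormal_pixels([[11, 20, 45]], means, stds)
```

Because the test is applied independently at each pixel, the check is a single pass over the image, which is consistent with the speed claimed for the template approach.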