In this paper we present a practical patternrecognition system that is invariant with respect to translation, scale and rotation of objects. The system is also insensitive to large variations of the threshold used. A...
详细信息
ISBN:
(纸本)0818658274
In this paper we present a practical patternrecognition system that is invariant with respect to translation, scale and rotation of objects. The system is also insensitive to large variations of the threshold used. As feature vectors, Zernike moments are used and we compare them with Hu's seven moment invariants. For a practical machine vision system, three key issues are discussed: pattern normalization, fast computation of Zernike moments, and classification using k-N N rule. As testing results, the system recognizes a set of 62 alphanumeric machine-printed characters with different sizes, at arbitrary orientations, and with different thresholds where the size of the characters varies from 10 × 10 to 512 × 512 pixels.
We propose an automated approach to modeling drainage channels-and, more generally, linear features that lie on the terrain-from multiple images, which results not only in high-resolution, accurate and consistent mode...
详细信息
ISBN:
(纸本)0780342364
We propose an automated approach to modeling drainage channels-and, more generally, linear features that lie on the terrain-from multiple images, which results not only in high-resolution, accurate and consistent models of the features, but also of the surrounding terrain. In our specific case, we have chosen to exploit the fact that rivers flow downhill and lie at the bottom of local depressions in the terrain, valley floors tend to be ''U'' shaped, and the drainage pattern appears as a network of linear features that can be visually detected in single gray level images. Different approaches have explored individual facets of this problem. Ours unifies these elements in a common framework. We accurately model terrain and features as 3-dimensional objects from several information sources that may be in error and inconsistent with one another This approach allows us to generate models that are faithful to sensor data, internally consistent and consistent with physical constraints.
We propose SCVRL, a novel contrastive-based framework for self-supervised learning for videos. Differently from previous contrast learning based methods that mostly focus on learning visual semantics (e.g., CVRL), SCV...
详细信息
ISBN:
(数字)9781665487399
ISBN:
(纸本)9781665487399
We propose SCVRL, a novel contrastive-based framework for self-supervised learning for videos. Differently from previous contrast learning based methods that mostly focus on learning visual semantics (e.g., CVRL), SCVRL is capable of learning both semantic and motion patterns. For that, we reformulate the popular shuffling pretext task within a modern contrastive learning paradigm. We show that our transformer-based network has a natural capacity to learn motion in self-supervised settings and achieves strong performance, outperforming CVRL on four benchmarks.
Line art plays a fundamental role in illustration and design, and allows for iteratively polishing designs. However, as they lack color, they can have issues in conveying final designs. In this work, we propose an int...
详细信息
ISBN:
(纸本)9781665448994
Line art plays a fundamental role in illustration and design, and allows for iteratively polishing designs. However, as they lack color, they can have issues in conveying final designs. In this work, we propose an interactive colorization approach based on a conditional generative adversarial network that takes both the line art and color hints as inputs to produce a high-quality colorized image. Our approach is based on a U-net architecture with a multi-discriminator framework. We propose a Concatenation and Spatial Attention module that is able to generate more consistent and higher quality of line art colorization from user given hints. We evaluate on a large-scale illustration dataset and comparison with existing approaches corroborate the effectiveness of our approach.
We propose a simple yet effective proposal-free architecture for lidar panoptic segmentation. We jointly optimize both semantic segmentation and class-agnostic instance classification in a single network using a pilla...
详细信息
ISBN:
(数字)9781665487399
ISBN:
(纸本)9781665487399
We propose a simple yet effective proposal-free architecture for lidar panoptic segmentation. We jointly optimize both semantic segmentation and class-agnostic instance classification in a single network using a pilla-rbased bird's-eye view representation. The instance classification head learns pairwise affinity between pillars to determine whether the pillars belong to the same instance or not. We further propose a local clustering algorithm to propagate instance ids by merging semantic segmentation and affinity predictions. Our experiments on nuScenes dataset show that our approach outperforms previous proposal-free methods and is comparable to proposal-based methods which requires extra annotation from object detection.
Live demonstration setup. (Left) The setup consists of a DAVIS346B event camera connected to a standard consumer laptop and undergoes some motion. (Right) The motion estimates are plotted in red and, for rotation-like...
详细信息
ISBN:
(纸本)9781665448994
Live demonstration setup. (Left) The setup consists of a DAVIS346B event camera connected to a standard consumer laptop and undergoes some motion. (Right) The motion estimates are plotted in red and, for rotation-like motions, the angular velocities provided by the camera IMU are also plotted in blue. This plot exemplifies an event camera undergoing large rotational motions (up to ~ 1000 deg/s) around the (a) x-axis, (b) y-axis and (c) z-axis. Overall, the incremental motion estimation method follows the IMU measurements. Optionally, the resultant global optical flow can also be shown, as well as the corresponding generated events by accumulating them onto the image plane (bottom left corner).
Distribution shift can have fundamental consequences such as signaling a change in the operating environment or significantly reducing the accuracy of downstream models. Thus, understanding such distribution shifts is...
详细信息
ISBN:
(数字)9781665487399
ISBN:
(纸本)9781665487399
Distribution shift can have fundamental consequences such as signaling a change in the operating environment or significantly reducing the accuracy of downstream models. Thus, understanding such distribution shifts is critical for examining and hopefully mitigating the effect of such a shift. Most prior work has focused on either natively handling distribution shift (e.g., Domain Generalization) or merely detecting a shift while assuming any detected shift can be understood and handled appropriately by a human operator. For the latter, we hope to aid in these manual mitigation tasks by explaining the distribution shift to an operator. To this end, we suggest two methods: providing a set of interpretable mappings from the original distribution to the shifted one or providing a set of distributional counterfactual examples. We provide preliminary experiments on these two methods, and discuss important concepts and challenges for moving towards a better understanding of image-based distribution shifts.
Honey fraud and adulteration are an increasing concern globally. Hyperspectral imaging and machine learning can detect adulterated honey within a known set of honey, where we have captured data at different sugar conc...
详细信息
ISBN:
(数字)9781665487399
ISBN:
(纸本)9781665487399
Honey fraud and adulteration are an increasing concern globally. Hyperspectral imaging and machine learning can detect adulterated honey within a known set of honey, where we have captured data at different sugar concentrations. Previous work in this area has used a minimal number of honey types, as sample preparation and data capture is a time-consuming process. This paper develops a new approach using variational autoencoders (VAEs) for generating adulterated honey data for unseen honey types. The results show that the binary adulteration detector can achieve on average 81.3% accuracy on unseen honey types by adding the generated data to the existing training data. Without including the generated data while training, the classifier can only achieve 44% on unseen honey types.
A robust approach for super resolution is presented, which is especially valuable in the presence of outliers. Such outliers may be due to motion err-os, inaccurate blur models, noise, moving objects, motion blur etc....
详细信息
ISBN:
(纸本)0769512720
A robust approach for super resolution is presented, which is especially valuable in the presence of outliers. Such outliers may be due to motion err-os, inaccurate blur models, noise, moving objects, motion blur etc. This robustness is needed since super-resolution methods are very sensitive to such errors. A robust median estimator is combined in an iterative process to achieve a super resolution algorithm. This process can increase resolution even in regions with outliers, where other super. resolution methods actually degrade the image.
Automatic video production of sports aims at producing an aesthetic broadcast of sporting events. We present a new video system able to automatically produce a smooth and pleasant broadcast of Basketball games using a...
详细信息
ISBN:
(数字)9781728193601
ISBN:
(纸本)9781728193601
Automatic video production of sports aims at producing an aesthetic broadcast of sporting events. We present a new video system able to automatically produce a smooth and pleasant broadcast of Basketball games using a single fixed 4K camera. The system automatically detects and localizes players, ball and referees, to recognize main action coordinates and game states yielding to a professional cameraman-like production of the basketball event. We also release a fully annotated dataset consisting of single 4K camera and twelve-camera videos of basketball games.
暂无评论