ISBN: 0780342364 (print)
We propose an automated approach to modeling drainage channels (and, more generally, linear features that lie on the terrain) from multiple images, which results in high-resolution, accurate and consistent models not only of the features but also of the surrounding terrain. In our specific case, we have chosen to exploit the facts that rivers flow downhill and lie at the bottom of local depressions in the terrain, that valley floors tend to be U-shaped, and that the drainage pattern appears as a network of linear features that can be visually detected in single gray-level images. Different approaches have explored individual facets of this problem. Ours unifies these elements in a common framework. We accurately model terrain and features as 3-dimensional objects from several information sources that may be in error and inconsistent with one another. This approach allows us to generate models that are faithful to sensor data, internally consistent and consistent with physical constraints.
ISBN: 9781665487399 (digital)
ISBN: 9781665487399 (print)
We propose SCVRL, a novel contrastive framework for self-supervised learning on videos. Unlike previous contrastive-learning-based methods, which mostly focus on learning visual semantics (e.g., CVRL), SCVRL is capable of learning both semantic and motion patterns. To that end, we reformulate the popular shuffling pretext task within a modern contrastive learning paradigm. We show that our transformer-based network has a natural capacity to learn motion in self-supervised settings and achieves strong performance, outperforming CVRL on four benchmarks.
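As an illustration only, the idea of folding a shuffling pretext task into a contrastive objective can be sketched with a standard InfoNCE loss in which a temporally shuffled copy of a clip acts as a negative. This is one plausible reading of the abstract, not the SCVRL objective itself. The toy `embed` function, the scalar "frame features" and the temperature value are all assumptions made for the example.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def info_nce(anchor, positive, negatives, temperature=0.1):
    """Standard InfoNCE loss for a single anchor embedding."""
    pos = math.exp(cosine(anchor, positive) / temperature)
    neg = sum(math.exp(cosine(anchor, n) / temperature) for n in negatives)
    return -math.log(pos / (pos + neg))

def embed(clip):
    """Hypothetical order-sensitive embedding of a clip of scalar frame features:
    (mean value, mean frame-to-frame difference)."""
    mean = sum(clip) / len(clip)
    diffs = [b - a for a, b in zip(clip, clip[1:])]
    return [mean, sum(diffs) / len(diffs)]

# Toy data (assumed): two clips with the same motion, plus a reordered copy.
anchor_clip = [1.0, 2.0, 3.0, 4.0]
positive_clip = [2.0, 3.0, 4.0, 5.0]
shuffled_clip = [anchor_clip[i] for i in (3, 2, 1, 0)]  # temporal order destroyed

loss = info_nce(embed(anchor_clip), embed(positive_clip), [embed(shuffled_clip)])
```

Because the shuffled clip keeps the appearance statistics (the mean) but changes the motion statistic, the loss is small only when the embedding is sensitive to temporal order.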
ISBN: 9781665448994 (print)
Learned lossy image compression has demonstrated impressive progress via end-to-end neural network training. However, this end-to-end training belies the fact that lossy compression is inherently not differentiable, due to the necessity of quantisation. To overcome this difficulty in training, researchers have used various approximations to the quantisation step. However, little work has studied the mechanism of quantisation approximation itself. We address this issue, identifying three gaps arising in the quantisation approximation problem. We visualise these gaps and show the effect of applying different quantisation approximation methods. Following this analysis, we propose a Soft-STE quantisation approximation method, which closes these gaps and demonstrates better performance than other quantisation approaches on the Kodak dataset.
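For orientation, two widely used approximations to the quantisation step are additive uniform noise and the straight-through estimator (STE). The sketch below shows both in plain Python with a hand-written backward pass. It does not implement the Soft-STE method named in the abstract, whose details are not given here, and the function names are chosen for this example.

```python
import random

def hard_round(y):
    """True quantisation. Its derivative is zero almost everywhere, so it
    blocks gradient-based training. Note: Python rounds halves to even."""
    return float(round(y))

def noisy_proxy(y, rng):
    """Training-time surrogate: replace rounding by additive uniform noise
    of width 1, which keeps the operation differentiable in y."""
    return y + rng.uniform(-0.5, 0.5)

def ste_forward_backward(y, upstream_grad):
    """Straight-through estimator: use hard rounding in the forward pass,
    but pass the gradient through unchanged (as if d round(y)/dy = 1)."""
    forward = hard_round(y)
    grad_wrt_y = upstream_grad * 1.0
    return forward, grad_wrt_y

rng = random.Random(0)
latent = 1.7
quantised = hard_round(latent)          # value used at test time
surrogate = noisy_proxy(latent, rng)    # value used during training (noise variant)
forward, grad = ste_forward_backward(latent, upstream_grad=3.0)
```

The two surrogates differ in where the mismatch appears: the noise variant changes the forward value seen during training, while the STE keeps the forward value exact and makes the gradient approximate.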
ISBN: 9781424423392 (print)
In this paper, we examine the problem of internet video categorization. Specifically, we explore the representation of a video as a "bag of words" using various combinations of spatial and temporal descriptors. The descriptors incorporate both spatial and temporal gradients as well as optical flow information. We achieve state-of-the-art results on a standard human activity recognition database and demonstrate promising category recognition performance on two new databases of approximately 1000 and 1500 online user-submitted videos, which we will be making available to the community.
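A "bag of words" video representation can be sketched as follows: each local descriptor is assigned to its nearest entry in a codebook, and the video is summarised by the normalised histogram of assignments. This is a minimal generic sketch. The two-dimensional descriptors and the two-entry codebook are toy assumptions, and the abstract does not specify how the codebook is built.

```python
def nearest_codeword(descriptor, codebook):
    """Index of the codebook entry closest to the descriptor (squared L2)."""
    def sq_dist(u, v):
        return sum((a - b) ** 2 for a, b in zip(u, v))
    return min(range(len(codebook)), key=lambda k: sq_dist(descriptor, codebook[k]))

def bag_of_words(descriptors, codebook):
    """Normalised histogram of codeword assignments for one video."""
    counts = [0] * len(codebook)
    for d in descriptors:
        counts[nearest_codeword(d, codebook)] += 1
    total = sum(counts)
    if total == 0:
        return [0.0] * len(codebook)
    return [c / total for c in counts]

# Toy data (assumed): 2-D descriptors and a 2-word codebook.
codebook = [[0.0, 0.0], [1.0, 1.0]]
descriptors = [[0.1, 0.0], [0.9, 1.0], [1.0, 0.8], [0.0, 0.2]]
histogram = bag_of_words(descriptors, codebook)
```

The resulting fixed-length histogram can be fed to any standard classifier, regardless of how many descriptors the video produced.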
ISBN: 9781665448994 (print)
Line art plays a fundamental role in illustration and design, and allows for iteratively polishing designs. However, because it lacks color, it can struggle to convey final designs. In this work, we propose an interactive colorization approach based on a conditional generative adversarial network that takes both the line art and color hints as inputs to produce a high-quality colorized image. Our approach is based on a U-net architecture with a multi-discriminator framework. We propose a Concatenation and Spatial Attention module that generates more consistent and higher-quality line art colorization from user-given hints. We evaluate on a large-scale illustration dataset, and comparisons with existing approaches corroborate the effectiveness of our approach.
ISBN: 9781665487399 (digital)
ISBN: 9781665487399 (print)
We propose a simple yet effective proposal-free architecture for lidar panoptic segmentation. We jointly optimize both semantic segmentation and class-agnostic instance classification in a single network using a pillar-based bird's-eye view representation. The instance classification head learns pairwise affinity between pillars to determine whether the pillars belong to the same instance or not. We further propose a local clustering algorithm to propagate instance ids by merging semantic segmentation and affinity predictions. Our experiments on the nuScenes dataset show that our approach outperforms previous proposal-free methods and is comparable to proposal-based methods, which require extra annotation from object detection.
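To make the merging step concrete, here is a minimal sketch of how instance ids could be propagated from pairwise affinities: neighbouring pillars that are both predicted as "thing" classes and whose affinity exceeds a threshold are joined with a union-find structure. This is a generic stand-in, not the local clustering algorithm proposed in the paper. The input layout (`is_thing`, `neighbors`, `affinity`) and the threshold of 0.5 are assumptions for the example.

```python
def cluster_pillars(is_thing, neighbors, affinity, threshold=0.5):
    """Assign an instance id to every 'thing' pillar and -1 to all others.

    is_thing:  list of bools, one per pillar (from semantic segmentation)
    neighbors: list of (i, j) index pairs of adjacent pillars
    affinity:  dict mapping (i, j) to the predicted same-instance probability
    """
    parent = list(range(len(is_thing)))

    def find(i):
        # Follow parent links to the set representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in neighbors:
        if is_thing[i] and is_thing[j] and affinity[(i, j)] >= threshold:
            parent[find(i)] = find(j)  # merge the two groups

    ids, labels = {}, []
    for i, thing in enumerate(is_thing):
        if not thing:
            labels.append(-1)
            continue
        root = find(i)
        labels.append(ids.setdefault(root, len(ids)))  # consecutive ids from 0
    return labels

# Toy scene (assumed): five pillars in a row, pillar 3 is background.
is_thing = [True, True, True, False, True]
neighbors = [(0, 1), (1, 2), (2, 3), (3, 4)]
affinity = {(0, 1): 0.9, (1, 2): 0.2, (2, 3): 0.8, (3, 4): 0.9}
labels = cluster_pillars(is_thing, neighbors, affinity)
```

In this toy scene, pillars 0 and 1 share an instance, pillar 2 is split off by its low affinity to pillar 1, and pillar 4 stays separate because its only neighbour is background.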
ISBN: 0818658274 (print)
When a particular point is fixated by an active stereo system, different portions of the world are brought into interocular alignment. This region is known as the horopter. Through an examination of the horopter under different viewing conditions, it is demonstrated that for certain binocular tasks it is desirable to manipulate the horopter by rotating (torquing) the cameras about their optical axes. This manipulation can be passive for operations such as stereo-based obstacle detection for mobile robots, or active for active binocular heads. Techniques for both situations are presented.
ISBN: 9781665448994 (print)
Live demonstration setup. (Left) The setup consists of a DAVIS346B event camera, connected to a standard consumer laptop, that undergoes some motion. (Right) The motion estimates are plotted in red and, for rotation-like motions, the angular velocities provided by the camera IMU are also plotted in blue. This plot exemplifies an event camera undergoing large rotational motions (up to ~1000 deg/s) around the (a) x-axis, (b) y-axis and (c) z-axis. Overall, the incremental motion estimation method follows the IMU measurements. Optionally, the resultant global optical flow can also be shown, as well as the corresponding events accumulated onto the image plane (bottom left corner).
We present a novel algorithm to reconstruct the geometry and photometry of a scene with occlusions from a collection of defocused images. The presence of a finite lens aperture allows us to recover portions of the scene that would be occluded in a pin-hole projection, thus "uncovering" the occlusion. We estimate the shape of each object (a surface, including the occluding boundaries), and its radiance (a positive function defined on the surface, including portions that are occluded by other objects).
ISBN: 9781665487399 (digital)
ISBN: 9781665487399 (print)
Distribution shift can have fundamental consequences such as signaling a change in the operating environment or significantly reducing the accuracy of downstream models. Thus, understanding such distribution shifts is critical for examining and hopefully mitigating the effect of such a shift. Most prior work has focused on either natively handling distribution shift (e.g., Domain Generalization) or merely detecting a shift while assuming any detected shift can be understood and handled appropriately by a human operator. For the latter, we hope to aid in these manual mitigation tasks by explaining the distribution shift to an operator. To this end, we suggest two methods: providing a set of interpretable mappings from the original distribution to the shifted one or providing a set of distributional counterfactual examples. We provide preliminary experiments on these two methods, and discuss important concepts and challenges for moving towards a better understanding of image-based distribution shifts.