ISBN (digital): 9781538661000
ISBN (print): 9781538661000
The land cover classification task of the DeepGlobe Challenge presents significant obstacles even to state-of-the-art segmentation models due to a small amount of data, incomplete and sometimes incorrect labeling, and highly imbalanced classes. In this work, we show an approach based on the U-Net architecture with the Lovász-Softmax loss that successfully alleviates these problems; we compare several different convolutional architectures for U-Net encoders.
ISBN (digital): 9781538661000
ISBN (print): 9781538661000
We introduce the first benchmark for a new problem, recognizing human action adverbs (HAA): "Adverbs Describing Human Actions" (ADHA). We demonstrate some key features of ADHA: a semantically complete set of adverbs describing human actions, a set of common, describable human actions, and an exhaustive labelling of simultaneously emerging actions in each video. We conduct an in-depth analysis of how current effective models for action recognition and image captioning perform on adverb recognition, and the results reveal that such methods are unsatisfactory. Furthermore, we propose a novel three-stream hybrid model to tackle the HAA problem, which achieves better performance with relatively promising results.
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
Understanding the complex relationship between emotions and facial expressions is important for both psychologists and computer scientists. A large body of research in psychology investigates facial expressions, emotions, and how emotions are perceived from facial expressions. As computer scientists look to incorporate this research into automatic emotion perception systems, it is important to understand the nature and limitations of human emotion perception. These principles of emotion science affect the way datasets are created, methods are implemented, and results are interpreted in automated emotion perception. This paper aims to distill and align prior work in automated and human facial emotion perception to facilitate future discussions and research at the intersection of the two disciplines.
ISBN (print): 9781665448994
Existing computer vision research on artwork struggles with fine-grained attribute recognition and a lack of curated annotated datasets, which are costly to create. In this work, we use CLIP (Contrastive Language-Image Pre-Training) [12] to train a neural network on a variety of art image and text pairs, so that it can learn directly from raw descriptions of images or, if available, curated labels. The model's zero-shot capability allows predicting the most relevant natural language description for a given image without directly optimizing for the task. Our approach aims to solve two challenges: instance retrieval and fine-grained artwork attribute recognition. We use the iMet Dataset [20], which we consider the largest annotated artwork dataset. Our code and models will be available at https://***/KeremTurgutlu/clip_art
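The zero-shot prediction step mentioned in this abstract can be illustrated with a minimal sketch: an image embedding is compared against embeddings of candidate descriptions, and the most similar description is chosen. This is not the authors' code. The vectors below are hand-made toy values standing in for encoder outputs, and the function names `cosine_similarity` and `zero_shot_predict` are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def zero_shot_predict(image_embedding, text_embeddings):
    """Pick the description whose embedding is closest to the image embedding.

    text_embeddings maps each candidate description to its embedding.
    Returns the best description and the full score table.
    """
    scores = {
        description: cosine_similarity(image_embedding, embedding)
        for description, embedding in text_embeddings.items()
    }
    best = max(scores, key=scores.get)
    return best, scores


# Toy embeddings (assumption): in practice these would come from the
# text and image encoders of a trained image-text model.
text_embeddings = {
    "an oil painting of a landscape": [0.9, 0.1, 0.0],
    "a marble sculpture": [0.1, 0.9, 0.1],
    "a woven textile": [0.0, 0.2, 0.9],
}
image_embedding = [0.8, 0.2, 0.1]

best, scores = zero_shot_predict(image_embedding, text_embeddings)
print(best)
```

No task-specific classifier is trained in this sketch; changing the set of candidate descriptions changes the label space, which is what makes the prediction "zero-shot".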
ISBN (print): 9780769549903
The use of 3D technologies to represent elements and interact with them is an open and interesting research area. In this article we discuss a novel human-computer interaction method that integrates mobile computing and 3D visualization techniques, with applications in free-viewpoint visualization and 3D rendering for interactive and realistic environments. The approach focuses in particular on augmented reality and home entertainment, and it was developed and tested on mobile devices, particularly tablet computers. Finally, a mechanism for evaluating the accuracy of this interaction system is presented.
ISBN (print): 9798350302493
Image anonymization is widely adopted in practice to comply with privacy regulations in many regions. However, anonymization often degrades the quality of the data, reducing its utility for computer vision development. In this paper, we investigate the impact of image anonymization on training computer vision models for key computer vision tasks (detection, instance segmentation, and pose estimation). Specifically, we benchmark the recognition drop on common detection datasets, where we evaluate both traditional and realistic anonymization for faces and full bodies. Our comprehensive experiments show that traditional image anonymization substantially impacts final model performance, particularly when anonymizing the full body. Furthermore, we find that realistic anonymization can mitigate this decrease in performance, with our experiments showing a minimal performance drop for face anonymization. Our study demonstrates that realistic anonymization can enable privacy-preserving computer vision development with minimal performance degradation across a range of important computer vision benchmarks.
ISBN (print): 9781424439942
This paper addresses large-displacement-diffeomorphic mapping registration from an optimal control perspective. This viewpoint leads to two complementary formulations. One approach requires the explicit computation of coordinate maps, whereas the other is formulated strictly in the image domain (thus making it also applicable to manifolds which require multiple coordinate charts). We discuss their intrinsic relation as well as the advantages and disadvantages of the two approaches. Further, we propose a novel formulation for unbiased image registration, which naturally extends to the case of time series of images. We discuss numerical implementation details and carefully evaluate the properties of the alternative algorithms.