ISBN (digital): 9781538661000
ISBN (print): 9781538661000
We introduce the first benchmark for a new problem, recognizing human action adverbs (HAA): "Adverbs Describing Human Actions" (ADHA). We demonstrate some key features of ADHA: a semantically complete set of adverbs describing human actions, a set of common, describable human actions, and an exhaustive labelling of simultaneously emerging actions in each video. We conduct an in-depth analysis of how current effective models from action recognition and image captioning perform on adverb recognition, and the results reveal that such methods are unsatisfactory. Furthermore, we propose a novel three-stream hybrid model to tackle the HAA problem, which achieves better performance and relatively promising results.
ISBN (print): 9781665448994
While makeup virtual try-on is now widespread, parametrizing a computer graphics rendering engine to synthesize images of a given cosmetics product remains a challenging task. In this paper, we introduce an inverse computer graphics method for automatic makeup synthesis from a reference image, by learning a model that maps an example portrait image with makeup to the space of rendering parameters. This method can be used by artists to automatically create realistic virtual cosmetics image samples, or by consumers to virtually try on makeup extracted from their favorite reference image.
Irrigation systems can vary widely in scale, from small-scale subsistence farming to large commercial agriculture (see Fig. 1). The heterogeneity in irrigation practices and systems across different regions adds to th...
ISBN (print): 9781665448994
AI City Challenge 2021 Task 5, Natural Language-Based Vehicle Tracking, is a natural language-based vehicle retrieval task that requires retrieving a single-camera track using a set of three natural language descriptions of the specific targets. In this paper, we present our methods to tackle the difficulties of this task. Experiments with our approaches on the competition dataset from the AI City Challenge 2021 show that our techniques achieve a Mean Reciprocal Rank (MRR) score of 0.1701 on the public test dataset and 0.1571 on the private test dataset.
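For readers unfamiliar with the metric reported above, Mean Reciprocal Rank averages 1/rank of the correct item over all queries, so a score near 0.17 roughly corresponds to the correct track appearing around rank six on average. The following is a minimal Python sketch of that standard definition, not the challenge's official evaluation code; the function name and the toy inputs in the usage note are hypothetical.

```python
from typing import Hashable, Sequence


def mean_reciprocal_rank(
    rankings: Sequence[Sequence[Hashable]],
    targets: Sequence[Hashable],
) -> float:
    """Average of 1/rank of the correct item over all queries.

    rankings[i] is the ranked candidate list returned for query i
    (best first); targets[i] is the correct item for query i.
    A query whose target is missing from its ranking contributes 0.
    """
    if len(rankings) != len(targets):
        raise ValueError("rankings and targets must have the same length")
    if not rankings:
        raise ValueError("at least one query is required")

    total = 0.0
    for ranked, target in zip(rankings, targets):
        for rank, candidate in enumerate(ranked, start=1):
            if candidate == target:
                total += 1.0 / rank
                break
    return total / len(rankings)
```

With three toy queries whose correct track is ranked first, second, and third respectively, the function returns (1 + 1/2 + 1/3) / 3.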
ISBN (print): 9781728125060
Most modern approaches for multiple people tracking rely on human appearance to exploit similarity between person detections. In this work, we propose an alternative tracking method that does not depend on visual appearance and is still capable of dealing with very dynamic motions and long-term occlusions. We make this feasible by: (i) incorporating additional information from body-worn inertial sensors, (ii) designing a neural network to relate person detections to orientation measurements and (iii) formulating a graph labeling problem to obtain a tracking solution that is globally consistent with the video and inertial recordings. We evaluate our approach on several challenging tracking sequences and achieve a very high IDF1 score of 91.2%. We outperform appearance-based baselines in scenarios where appearance is less informative and are on par in situations where people's appearance is discriminative.
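As background on the reported metric, IDF1 is the harmonic mean of identity precision and identity recall, which reduces to 2·IDTP / (2·IDTP + IDFP + IDFN). The sketch below assumes the identity-level counts have already been obtained from the usual truth-to-prediction identity matching, which is omitted here; the function name is hypothetical and the code is not the paper's evaluation pipeline.

```python
def idf1(idtp: int, idfp: int, idfn: int) -> float:
    """IDF1 from identity-level counts.

    idtp: detections assigned to the correctly matched identity
    idfp: predicted detections not matched to their identity
    idfn: ground-truth detections not matched to their identity
    """
    if min(idtp, idfp, idfn) < 0:
        raise ValueError("counts must be non-negative")
    denominator = 2 * idtp + idfp + idfn
    if denominator == 0:
        # No detections at all: define the score as 0 to avoid dividing by zero.
        return 0.0
    return 2 * idtp / denominator
```

For illustration only, with made-up counts of 912 identity true positives, 88 identity false positives, and 88 identity false negatives, the function returns 0.912.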
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
Fashion-on-demand is becoming an important concept in the fashion industry. Many attempts have been made to leverage machine learning methods to generate fashion designs tailored to customers' tastes. However, how items are assembled together (i.e., their compatibility) is crucial in designing high-quality outfits for image synthesis. Here we propose a fashion generation model, named OutfitGAN, which contains two core modules: a Generative Adversarial Network and a Compatibility Network. The generative module generates new, realistic, high-quality fashion items from a specific category, while the compatibility network ensures reasonable compatibility among all items. The experimental results show the superiority of our OutfitGAN.
ISBN (print): 9781728193601
We define a new representation for immersed surfaces in R^3 by combining the square root normal field (SRNF) and the induced surface metric. Using the L^2 metric on the space of SRNFs and the DeWitt metric on the space of surface metrics, we obtain a 3-parameter family of metrics that corresponds to the family of "elastic metrics" proposed by Jermyn et al. in [19] on the space of immersed surfaces. Similar to the original SRNF representation, this new representation results in an extrinsic distance function on the space of immersed surfaces that is easy to compute, as it is given by an explicit formula. In addition to avoiding the degeneracy of the SRNF, it allows for a data-driven choice of the parameters of the metric, while still providing for fast and accurate registration of surfaces.
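For orientation, the two ingredients named above can be written out using their standard definitions. This is only a sketch of the commonly used SRNF and its L^2 distance, with notation chosen here for illustration (f, n, q, g and the coordinates (u, v) are assumptions); the combined 3-parameter metric is not reproduced because the abstract does not state its formula.

```latex
% f : M \to \mathbb{R}^3 is an immersion with local coordinates (u, v).
% n is the unnormalized normal, q the SRNF, g the induced surface metric.
\[
  n(u,v) = f_u \times f_v, \qquad
  q(u,v) = \frac{n(u,v)}{\sqrt{\lvert n(u,v) \rvert}}, \qquad
  g = f^{*}\langle \cdot , \cdot \rangle_{\mathbb{R}^3}
\]
% Explicit extrinsic distance given by the L^2 metric on SRNFs.
\[
  d_{\mathrm{SRNF}}(f_1, f_2)
  = \lVert q_1 - q_2 \rVert_{L^2}
  = \left( \int_M \lvert q_1(u,v) - q_2(u,v) \rvert^{2} \, du \, dv \right)^{1/2}
\]
```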
ISBN (print): 9781665448994
Being able to detect irrelevant test examples with respect to deployed deep learning models is paramount to using them properly and safely. In this paper, we address the problem of rejecting such out-of-distribution (OOD) samples in a fully sample-free way, i.e., without requiring any access to in-distribution or OOD samples. We propose several indicators which can be computed alongside the prediction with little additional cost, assuming white-box access to the network. These indicators prove useful, stable and complementary for OOD detection on frequently used architectures. We also introduce a surprisingly simple, yet effective summary OOD indicator. This indicator is shown to perform well across several networks and datasets and can furthermore be easily tuned as soon as samples become available. Lastly, we discuss how to exploit this summary in real-world settings.
ISBN (digital): 9781665487399
ISBN (print): 9781665487399
Cross-view image generation has recently been proposed to generate images of one view from another, dramatically different view. In this paper, we investigate third-person (exocentric) view to first-person (egocentric) view image generation. This is a challenging task since the egocentric view can be remarkably different from the exocentric view, so transforming appearances across the two views is non-trivial. To this end, we propose a novel Parallel Generative Adversarial Network (P-GAN) with a novel cross-cycle loss to learn the shared information for generating egocentric images from the exocentric view. We also incorporate a novel contextual feature loss in the learning procedure to capture the contextual information in images. Extensive experiments on the Exo-Ego datasets [5] show that our model outperforms the state-of-the-art approaches.
ISBN (print): 9781665448994
We propose a flexible person generation framework called Dressing in Order (DiOr), which supports 2D pose transfer, virtual try-on, and several fashion editing tasks. The key to DiOr is a novel recurrent generation pipeline to sequentially put garments on a person, so that trying on the same garments in different orders will result in different looks. Our system can produce dressing effects not achievable by existing work, including different interactions of garments (e.g., wearing a top tucked into the bottom or over it), as well as layering of multiple garments of the same type (e.g., jacket over shirt over t-shirt). DiOr explicitly encodes the shape and texture of each garment, enabling these elements to be edited separately. Extensive evaluations show that DiOr outperforms other recent methods like ADGAN [18] in terms of output quality, and handles a wide range of editing functions for which there is no direct supervision.