In data analytics applications, join is a general and time-consuming operation. Optimizing join algorithms can significantly benefit query processing. The emergence of GPUs provides a massive parallelism solution f...
ISBN: 9781713845393 (print)
There are two types of deep generative models: explicit and implicit. The former defines an explicit density form that allows likelihood inference, while the latter targets a flexible transformation from random noise to generated samples. Although both classes of generative models have shown great power in many applications, each suffers from its own limitations and drawbacks when used alone. To take full advantage of both models and enable mutual compensation, we propose a novel joint training framework that bridges an explicit (unnormalized) density estimator and an implicit sample generator via Stein discrepancy. We show that our method 1) induces novel mutual regularization via kernel Sobolev norm penalization and Moreau-Yosida regularization, and 2) stabilizes the training dynamics. Empirically, we demonstrate that the proposed method helps the density estimator identify data modes more accurately and guides the generator to output higher-quality samples, compared with training either model alone. The new approach also shows promising results when the training samples are contaminated or limited.
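For readers unfamiliar with the bridging term, the commonly used (Langevin) form of Stein discrepancy is sketched below. This is the standard textbook definition and not necessarily the exact objective used in the paper; the symbols $p$ (the explicit density), $q$ (the generator's sample distribution), $f$ (a vector-valued test function), and $\mathcal{F}$ (the function class) are notation assumed here for illustration.

```latex
% Stein operator for a density p acting on a test function f : R^d -> R^d
\mathcal{A}_p f(x) = \nabla_x \log p(x)^{\top} f(x) + \nabla_x \cdot f(x)

% Stein discrepancy between the sample distribution q and the density p
\mathbb{S}(q, p) = \sup_{f \in \mathcal{F}} \; \mathbb{E}_{x \sim q}\left[ \mathcal{A}_p f(x) \right]
```

Because $\nabla_x \log p(x)$ does not depend on the normalizing constant of $p$, this quantity can be evaluated with an unnormalized density estimator, which is consistent with the "explicit (unnormalized)" setting described above.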
Human beings not only have the ability to recognize novel unseen classes, but can also incrementally incorporate new classes into their existing knowledge. However, zero-shot learning models assume that all seen...
Retrieving unlabeled videos by textual queries, known as Ad-hoc Video Search (AVS), is a core theme in multimedia data management and retrieval. The success of AVS depends on cross-modal representation learning that en...
Real-world 3D structured data such as point clouds and skeletons can often be represented as data in the 3D rotation group (denoted SO(3)). However, most existing neural networks are tailored for data in the Euclide...
Graph convolutional networks (GCNs) are a powerful deep learning approach for graph-structured data. Recently, GCNs and subsequent variants have shown superior performance in various application areas on real-world da...
ISBN: 9781713845393 (print)
Despite a large amount of effort in dealing with heavy-tailed error in machine learning, little is known about the case where moments of the error may not exist: the random noise η satisfies Pr[|η| > |y|] ≤ 1/|y|^α for some α > 0. We make the first attempt to actively handle such super heavy-tailed noise in bandit learning problems. We propose a novel robust statistical estimator, mean of medians, which estimates a random variable by computing the empirical mean of a sequence of empirical medians. We then present a generic reductionist algorithmic framework for solving bandit learning problems (including multi-armed and linear bandit problems): the mean of medians estimator can be applied to nearly any bandit learning algorithm as a black-box filter for its reward signals, and it obtains a regret bound similar to the one that holds when the reward is sub-Gaussian. We show that the regret bound is near-optimal even with very heavy-tailed noise. We also empirically demonstrate the effectiveness of the proposed algorithm, which further corroborates our theoretical results.
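The estimator described above can be illustrated with a minimal sketch. The abstract does not say how samples are grouped, so the fixed `group_size` partition, the dropping of leftover samples, and the function name `mean_of_medians` are assumptions made here for illustration, not the authors' exact procedure.

```python
from statistics import median


def mean_of_medians(samples, group_size):
    """Average the medians of consecutive, non-overlapping groups of samples.

    Assumption (not from the source): groups have a fixed size, and any
    leftover samples that do not fill a whole group are dropped.
    """
    if group_size <= 0:
        raise ValueError("group_size must be positive")
    # Split the samples into consecutive full groups.
    groups = [
        samples[i:i + group_size]
        for i in range(0, len(samples) - group_size + 1, group_size)
    ]
    if not groups:
        raise ValueError("need at least group_size samples")
    # The median of each group is insensitive to a few extreme values.
    medians = [median(group) for group in groups]
    # The empirical mean of the medians is the final estimate.
    return sum(medians) / len(medians)


if __name__ == "__main__":
    # Hypothetical rewards with two extreme outliers (1000 and -500).
    rewards = [1, 2, 1000, 3, 4, -500, 5, 6, 7]
    print(mean_of_medians(rewards, group_size=3))
```

In this toy input the group medians are 2, 3, and 6, so the estimate is 11/3, while the plain mean of the same nine values is about 58.7 because of the two outliers. In the black-box use described above, such an estimate would stand in for the raw reward passed to the bandit algorithm.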
The capability of machine learning classifiers depends largely on the scale of available training data and is limited by model overfitting in data-scarce learning tasks. To address this problem, this work proposes...
High-resolution natural image matting plays an important role in image editing, film-making, and remote sensing because it can accurately extract the foreground from a natural background. However, due to the complexity brought about by increasing resolution, existing image matting methods cannot obtain high-quality alpha mattes on high-resolution images in reasonable time. To overcome this challenge, we introduce a high-resolution image matting framework based on alpha matte refinement from low resolution to high resolution (HRIMF-AMR). The proposed framework transforms the complex high-resolution image matting problem into a low-resolution image matting problem and a high-resolution alpha matte refinement problem. The first problem is solved by adopting an existing image matting method, while the second is addressed by applying the Detail Difference Feature Extractor (DDFE) designed as part of our work. The DDFE extracts detail difference features from high-resolution images by measuring the image feature difference between high-resolution images and low-resolution images. The low-resolution alpha matte is refined according to the extracted detail difference features, yielding the high-resolution alpha matte. In addition, the Matte Detail Resolution Difference (MDRD) loss is introduced to train the DDFE; it imposes an additional constraint on the extraction of detail difference features using mattes. Experimental results show that integrating HRIMF-AMR significantly enhances the performance of existing matting methods on high-resolution images from Transparent-460 and Alphamatting. Project page: https://***/yexianmin/HRAMR-Matting.
ISBN: 9781728169262 (digital)
ISBN: 9781728169279 (print)
Because large datasets must be annotated to fit specific tasks, Zero-Shot Learning (ZSL) has attracted much attention and made significant progress in recent research, driven by the prevalence of deep neural networks. At present, ZSL is mainly solved by using auxiliary information, such as semantic attributes and text descriptions, and then employing a mapping method to bridge the gap between the visual and semantic spaces. However, because auxiliary information is not used effectively, this problem has not been solved well. Inspired by previous work, we consider that the visual space can be used as the embedding space, which gives a stronger ability to express the precise characteristics of semantic information. Meanwhile, we take into account that the annotated information of public datasets contains some noisy attributes that need to be processed. Based on these considerations, we propose an end-to-end method with a convolutional architecture, instead of the conventional linear projection, to provide a deep representation of semantic information for solving ZSL. Semantic features express more detailed and precise information after being fed into our method. Besides, we use word embeddings to generate superclasses for the original classes and propose a new loss function for these superclasses to assist training. Experiments show that our method achieves decent improvements for ZSL and Generalized Zero-Shot Learning (GZSL) on several public datasets.
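The idea of using the visual space as the embedding space can be illustrated with a minimal inference sketch: class attribute vectors are mapped into the visual feature space, and each image feature is assigned to the nearest mapped class prototype. The abstract proposes a convolutional mapping; the single linear layer with ReLU below is a deliberately simpler stand-in, and the names `embed_semantics`, `predict_classes`, and `weights` are hypothetical.

```python
import numpy as np


def embed_semantics(class_attributes, weights):
    """Map class attribute vectors into the visual feature space.

    Hypothetical stand-in: one linear layer followed by ReLU. The method
    described in the abstract uses a convolutional architecture instead.
    """
    return np.maximum(class_attributes @ weights, 0.0)


def predict_classes(visual_features, class_attributes, weights):
    """Assign each visual feature to its nearest class prototype."""
    prototypes = embed_semantics(class_attributes, weights)  # shape (C, D)
    # Squared Euclidean distance between every feature and every prototype.
    diffs = visual_features[:, None, :] - prototypes[None, :, :]
    distances = np.sum(diffs ** 2, axis=2)  # shape (N, C)
    return np.argmin(distances, axis=1)


if __name__ == "__main__":
    # Two classes described by 2-dimensional attributes (toy values).
    attributes = np.array([[1.0, 0.0], [0.0, 1.0]])
    # Toy mapping from the 2-D attribute space to a 3-D visual space.
    weights = np.array([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]])
    features = np.array([[0.9, 0.1, 0.0], [0.2, 0.8, 0.1]])
    print(predict_classes(features, attributes, weights))
```

At test time the attribute rows would describe unseen classes (ZSL) or both seen and unseen classes (GZSL); training the mapping and the superclass loss mentioned above are outside the scope of this sketch.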