ISBN: (print) 9798400706028
Despite remarkable advances in image captioning, existing models still lack the ability to generate controllable and diverse captions. As a solution, controllable image captioning (CIC) has recently gained attention, with the goal of generating image captions that satisfy the constraints of the given control signals. Current CIC methods have two main limitations: (1) They can handle only one specific control signal and cannot handle combinations of multiple control signals. (2) They depend on costly supervised learning from task-specific data, which becomes impractical as model size increases. To address these limitations, we propose an energy-based sampling method for controllable image captioning, named SamCap. Specifically, by combining various constraint functions with the log-likelihood of the image captioner into an energy function, we can generate captions that satisfy the specified constraints through gradient-based sampling. SamCap provides a learning-free, plug-and-play solution that can integrate with any existing image captioner without task-specific fine-tuning. Extensive results demonstrate that SamCap not only matches the performance of SOTA signal-specific CIC models for single control signals, but also shows significant advantages in handling combinations of multiple control signals.
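The idea of folding a log-likelihood and a constraint into one energy function, then drawing samples by following its gradient, can be sketched on a toy problem. The example below is not SamCap itself: it assumes a continuous two-dimensional variable, a standard Gaussian as a stand-in for the captioner's log-likelihood, a quadratic penalty as a stand-in constraint, and unadjusted Langevin dynamics as the gradient-based sampler. The names `energy_grad`, `langevin_sample`, `target`, and `weight` are hypothetical.

```python
import numpy as np

def energy_grad(x, target, weight):
    # Gradient of the toy energy
    #   E(x) = 0.5 * ||x||^2  +  weight * 0.5 * ||x - target||^2
    # First term: negative log-likelihood of a standard Gaussian (stand-in model).
    # Second term: a quadratic constraint pulling x toward `target` (assumed form).
    return x + weight * (x - target)

def langevin_sample(target, weight, n_chains=2000, n_steps=500, step=0.01, seed=0):
    # Unadjusted Langevin dynamics: gradient descent on E plus Gaussian noise.
    rng = np.random.default_rng(seed)
    x = rng.standard_normal((n_chains, target.shape[0]))
    for _ in range(n_steps):
        noise = rng.standard_normal(x.shape)
        x = x - step * energy_grad(x, target, weight) + np.sqrt(2.0 * step) * noise
    return x

target = np.array([2.0, -1.0])
samples = langevin_sample(target, weight=3.0)
# For this quadratic energy the samples concentrate around
# weight * target / (1 + weight), a compromise between model and constraint.
print(samples.mean(axis=0))
```

Raising `weight` moves the samples closer to the constraint target, while lowering it lets the stand-in likelihood dominate. Several constraints could be combined by summing their penalty terms in the same energy.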
Dear editor, Recent years have witnessed a rapid growth of distributed design in multi-agent networks because of its scalability, robustness, and low cost. In contrast to conventional centralized and parallel designs, all agents in a fully distributed design aim to achieve the global goal based only on local measurements and information sharing with
Predicting urban morphology based on local attributes is an important issue in urban science research. Deep generative models, represented by generative adversarial network (GAN) models, have achieved impressive results in this area. However, such methods assume that the urban morphology follows a specific probability distribution that GAN models can approximate directly, which is not a realistic assumption. As demonstrated by score-based models, a better strategy is to learn the gradient of the probability distribution and approximate the distribution implicitly. Therefore, in this paper, an urban morphology prediction method based on the conditional diffusion model is proposed. This approach decomposes the attribute-based urban morphology prediction task into two subproblems: estimating the gradient of the conditional distribution, and gradient-based sampling. During the training stage, the gradient of the conditional distribution is approximated by using a conditional diffusion model to predict the noise added to the original urban morphology. In the generation stage, the corresponding conditional distribution is parameterized based on the noise predicted by the conditional diffusion model, and the final prediction result is generated through iterative sampling. Experimental results show that, compared with GAN-based methods, our method improves the metrics of low-level pixel features, shallow structural features, and deep structural features by 5.5%, 5.9%, and 13.2%, respectively.
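The link between predicted noise, the gradient of the conditional distribution, and iterative sampling can be illustrated with a one-dimensional toy. This is a sketch under strong assumptions, not the paper's model: the data given a condition are assumed Gaussian with mean `cond` and standard deviation `DATA_STD`, so the ideal noise prediction is available in closed form and stands in for a trained conditional network. The sampler is standard DDPM ancestral sampling with a linear noise schedule. All names (`predict_noise`, `sample`, `DATA_STD`) are hypothetical.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule (assumed)
alphas = 1.0 - betas
alpha_bars = np.cumprod(alphas)
DATA_STD = 0.5                          # toy data: x0 | cond ~ N(cond, DATA_STD^2)

def predict_noise(x_t, t, cond):
    # Closed-form stand-in for a trained conditional noise predictor.
    # The noised marginal is Gaussian, so its score (gradient of the log-density)
    # is known exactly, and the noise estimate is -sqrt(1 - alpha_bar) * score.
    a_bar = alpha_bars[t]
    marginal_var = a_bar * DATA_STD**2 + (1.0 - a_bar)
    score = -(x_t - np.sqrt(a_bar) * cond) / marginal_var
    return -np.sqrt(1.0 - a_bar) * score

def sample(cond, n_samples=5000, seed=0):
    # DDPM ancestral sampling: start from pure noise and denoise step by step.
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_samples)
    for t in range(T - 1, -1, -1):
        eps_hat = predict_noise(x, t, cond)
        x = (x - betas[t] / np.sqrt(1.0 - alpha_bars[t]) * eps_hat) / np.sqrt(alphas[t])
        if t > 0:
            x = x + np.sqrt(betas[t]) * rng.standard_normal(n_samples)
    return x

generated = sample(cond=3.0)
# Sample mean and standard deviation should be close to cond and DATA_STD.
print(generated.mean(), generated.std())
```

In a real setting, `predict_noise` would be a neural network trained to recover the noise added to morphology images given the local attributes; the sampling loop would stay structurally the same.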
Factor analysis provides a canonical framework for imposing lower-dimensional structure, such as sparse covariance, in high-dimensional data. High-dimensional data on the same set of variables are often collected under different conditions, for instance in reproducing studies across research groups. In such cases, it is natural to seek to learn the shared versus condition-specific structure. Hierarchical extensions of factor analysis have been proposed, but they face practical issues including identifiability problems. To address these shortcomings, we propose a class of SUbspace Factor Analysis (SUFA) models, which characterize variation across groups at the level of a lower-dimensional subspace. We prove that the proposed class of SUFA models leads to identifiability of the shared versus group-specific components of the covariance, and study their posterior contraction properties. Taking a Bayesian approach, these contributions are developed alongside efficient posterior computation algorithms. Our sampler fully integrates out latent variables, is easily parallelizable, and has complexity that does not depend on sample size. We illustrate the methods through application to integration of multiple gene expression datasets relevant to immunology. Supplementary materials for this article are available online, including a standardized description of the materials available for reproducing the work.