This study introduces CLIP-Flow, a novel network for generating images from a given image or text. To effectively utilize the rich semantics contained in both modalities, we designed a semantics-guided methodology for image- and text-to-image synthesis. In particular, we adopted Contrastive Language-Image Pretraining (CLIP) as an encoder to extract semantics and StyleGAN as a decoder to generate images from such [...]. [...], to bridge the embedding space of CLIP and the latent space of StyleGAN, real NVP is employed and modified with activation normalization and invertible convolution. Since the images and text in CLIP share the same representation space, text prompts can be fed directly into CLIP-Flow to achieve text-to-image synthesis. We conducted extensive experiments on several datasets to validate the effectiveness of the proposed image-to-image synthesis [...]. In addition, we tested on the public dataset Multi-Modal CelebA-HQ for text-to-image synthesis. [...] validated that our approach can generate high-quality text-matching images and is comparable with state-of-the-art methods, both qualitatively and quantitatively.
Authors: Ma, Hao; Yang, Jingyuan; Huang, Hui
Shenzhen University, Visual Computing Research Center, College of Computer Science and Software Engineering, Shenzhen, China (GRID: grid.263488.3) (ISNI: 0000 0001 0472 9649)
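The CLIP-Flow abstract names real NVP as the bridge between the CLIP embedding space and the StyleGAN latent space. As a rough illustration of why such a flow is invertible, the sketch below implements a single affine coupling layer in plain Python. The `scale_and_shift` function is a hypothetical stand-in for the learned scale and translation networks, and nothing here reproduces the authors' architecture, activation normalization, or invertible convolution.

```python
import math


def scale_and_shift(x1, size):
    """Toy stand-in for the learned networks s(.) and t(.) (hypothetical)."""
    total = sum(x1)
    s = [math.tanh(0.5 * total + i) for i in range(size)]
    t = [0.1 * total - i for i in range(size)]
    return s, t


def coupling_forward(x):
    """y1 = x1, y2 = x2 * exp(s(x1)) + t(x1); also returns log|det J|."""
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s, t = scale_and_shift(x1, len(x2))
    y2 = [v * math.exp(si) + ti for v, si, ti in zip(x2, s, t)]
    # The Jacobian is triangular, so its log-determinant is the sum of s.
    return x1 + y2, sum(s)


def coupling_inverse(y):
    """Exact inverse: x2 = (y2 - t(y1)) * exp(-s(y1))."""
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    s, t = scale_and_shift(y1, len(y2))
    x2 = [(v - ti) * math.exp(-si) for v, si, ti in zip(y2, s, t)]
    return y1 + x2


x = [0.3, -1.2, 0.8, 2.0]
y, log_det = coupling_forward(x)
x_rec = coupling_inverse(y)  # matches x up to floating-point error
```

Because the first half of the vector passes through unchanged, the inverse can recompute the same scale and shift values, which is what makes the mapping usable in both directions.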
Exemplar-based image translation involves converting semantic masks into photorealistic images that adopt the style of a given exemplar. However, most existing GAN-based translation methods fail to produce photorealistic [...]. In this study, we propose a new diffusion model-based approach for generating high-quality images that are semantically aligned with the input mask and resemble an exemplar in style. The proposed method trains a conditional denoising diffusion probabilistic model (DDPM) with a SPADE module to integrate the semantic [...]. We then used a novel contextual loss and an auxiliary color loss to guide the optimization process, resulting in images that were visually pleasing and semantically [...]. [...] demonstrate that our method outperforms state-of-the-art approaches in terms of both visual quality and quantitative metrics.
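For readers unfamiliar with DDPMs, the following minimal sketch shows only the standard forward (noising) process that such models are trained to reverse. The linear beta schedule and its endpoints are assumptions taken from common DDPM practice, not from this abstract, and the SPADE conditioning, contextual loss, and color loss are not modeled.

```python
import math
import random


def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linear noise schedule (an assumed, commonly used default)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]


def cumulative_alphas(betas):
    """alpha_bar_t = product over i <= t of (1 - beta_i)."""
    alpha_bars, running = [], 1.0
    for beta in betas:
        running *= 1.0 - beta
        alpha_bars.append(running)
    return alpha_bars


def q_sample(x0, t, alpha_bars, rng):
    """Draw x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * noise."""
    a = alpha_bars[t]
    noise = [rng.gauss(0.0, 1.0) for _ in x0]
    xt = [math.sqrt(a) * v + math.sqrt(1.0 - a) * e for v, e in zip(x0, noise)]
    return xt, noise


rng = random.Random(0)
alpha_bars = cumulative_alphas(linear_betas(1000))
x0 = [0.5, -0.25, 1.0]  # toy "image" with three pixel values
xt, noise = q_sample(x0, 500, alpha_bars, rng)
```

A denoising network is trained to predict `noise` from `xt` and `t`; in a conditional model such as the one described, the semantic mask would be supplied as an additional input.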
The interpretation of colors in visualizations is facilitated when the assignments between colors and concepts in the visualizations match human expectations, implying that the colors can be interpreted in a semantic manner. However, manually creating a dataset of suitable associations between colors and concepts for use in visualizations is costly, as such associations would have to be collected from humans for a large variety of concepts. To address the challenge of collecting this data, we introduce a method to extract color-concept associations automatically from a set of concept images. While the state-of-the-art method extracts associations from data with supervised learning, we developed a self-supervised method based on colorization that does not require the preparation of ground truth color-concept associations. Our key insight is that a set of images of a concept should be sufficient for learning color-concept associations, since humans also learn to associate colors with concepts mainly from past visual input. Thus, we propose to use an automatic colorization method to extract statistical models of the color-concept associations that appear in concept images. Specifically, we take a colorization model pre-trained on ImageNet and fine-tune it on the set of images associated with a given concept, to predict pixel-wise probability distributions in Lab color space for the images. Then, we convert the predicted probability distributions into color ratings for a given color library and aggregate them for all the images of a concept to obtain the final color-concept associations. We evaluate our method using four different evaluation metrics and via a user study. Experiments show that, although the state-of-the-art method based on supervised learning with user-provided ratings is more effective at capturing relative associations, our self-supervised method obtains overall better results according to metrics like Earth Mover's Distance (EMD) and Entropy Difference [...]
Large language models (LLMs) can perform complex reasoning by generating intermediate thoughts under zero-shot or few-shot settings. However, zero-shot prompting always encounters low performance, and the superior per...
Hybrid memory systems composed of dynamic random access memory (DRAM) and non-volatile memory (NVM) often exploit page migration technologies to take full advantage of different memory [...]. However, previous proposals usually migrate data at a granularity of 4 KB pages, and thus waste memory bandwidth and DRAM [...]. In this paper, we propose Mocha, a non-hierarchical architecture that organizes DRAM and NVM in a flat address space physically, but manages them in a cache/memory hierarchy. Since the commercial NVM device, Intel Optane DC Persistent Memory Modules (DCPMM), actually accesses the physical media at a granularity of 256 bytes (an Optane block), we manage the DRAM cache at the 256-byte size to adapt to this feature of [...]. This design not only enables fine-grained data migration and management for the DRAM cache, but also avoids write amplification for Intel Optane [...]. We also create an Indirect Address Cache (IAC) in the Hybrid Memory Controller (HMC) and propose a reverse address mapping table in the DRAM to speed up address translation and cache [...]. [...], we exploit a utility-based caching mechanism to filter cold blocks in the NVM, and further improve the efficiency of the DRAM [...]. We implement Mocha in an architectural simulator. Experimental results show that Mocha can improve application performance by 8.2% on average (up to 24.6%), and reduce energy consumption by 6.9% and data migration traffic by 25.9% on average, compared with a typical hybrid memory architecture, HSCC.
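The granularity argument in the Mocha abstract can be made concrete with a little address arithmetic. The sketch below is an illustration only: the function names and the sample access trace are hypothetical, and it does not model the IAC, the reverse address mapping table, or the utility-based filter. It simply compares how many bytes would be moved if every touched unit were migrated at 4 KB versus 256-byte granularity.

```python
PAGE_SIZE = 4096    # 4 KB page, the migration unit of earlier proposals
BLOCK_SIZE = 256    # Optane media access granularity stated in the abstract
BLOCKS_PER_PAGE = PAGE_SIZE // BLOCK_SIZE  # 16 blocks per page


def split_address(addr):
    """Decompose a byte address into (page, block within page, offset)."""
    page = addr // PAGE_SIZE
    block_in_page = (addr % PAGE_SIZE) // BLOCK_SIZE
    offset = addr % BLOCK_SIZE
    return page, block_in_page, offset


def migration_bytes(addresses, granularity):
    """Bytes moved if each distinct touched unit is migrated once."""
    units = {addr // granularity for addr in addresses}
    return len(units) * granularity


# Hypothetical trace: three distinct 256-byte blocks spread over two pages.
trace = [0x1000, 0x1008, 0x1F00, 0x5040]
page_traffic = migration_bytes(trace, PAGE_SIZE)    # 2 pages  -> 8192 bytes
block_traffic = migration_bytes(trace, BLOCK_SIZE)  # 3 blocks -> 768 bytes
```

For this sparse trace, block-level migration moves far fewer bytes than page-level migration, which is the kind of bandwidth saving the abstract attributes to fine-grained management.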
Video surveillance is widely adopted across various sectors for purposes such as law enforcement, COVID-19 isolation monitoring, and analyzing crowds for potential threats like flash mobs or violence. The vast amount ...
Music classification is a fundamental task in the field of Music Information Retrieval. This paper focuses on composer classification, a specific task within music classification. Compressive techniques are commonly e...
According to WHO reports, cancer is the leading cause of death worldwide. The second most prevalent cause of cancer-related death in both men and women is colorectal cancer (CRC). One potential approach for reducing t...
Machine learning models are increasingly being integrated into various aspects of society, impacting decision-making processes across domains such as healthcare, finance, and autonomous systems. However, as these mode...
The development of Internet of Things (IoT) technology is leading to a new era of smart applications such as smart transportation, buildings, and smart [...]. [...], these applications act as the building blocks of IoT-enabled smart cities. The high volume and high velocity of data generated by various smart city applications are sent to flexible and efficient cloud computing resources for [...]. However, there is a high computation latency due to the presence of a remote cloud [...]. Edge computing, which brings the computation close to the data source, is introduced to overcome this [...]. In an IoT-enabled smart city environment, one of the main concerns is to consume the least amount of energy while executing tasks that satisfy the delay [...]. An efficient resource allocation at the edge is helpful to address this [...]. In this paper, an energy and delay minimization problem in a smart city environment is formulated as a bi-objective edge resource allocation problem. [...], we presented a three-layer network architecture for IoT-enabled smart cities. [...], we designed a learning automata-based edge resource allocation approach considering the three-layer network architecture to solve the said bi-objective minimization problem. Learning Automata (LA) is a reinforcement-based adaptive decision-maker that helps to find the best task and edge resource [...]. An extensive set of simulations is performed to demonstrate the applicability and effectiveness of the LA-based approach in the IoT-enabled smart city environment.
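To give a feel for how a learning automaton adapts its choices, the sketch below implements the classic linear reward-inaction update for picking one of several edge resources. The abstract does not specify which LA scheme the authors use, so this scheme, the `success_prob` reward model (the chance that a resource meets a task's energy and delay targets), and all parameter values are assumptions for illustration.

```python
import random


def lri_update(probs, chosen, rewarded, rate=0.1):
    """Linear reward-inaction: shift probability to the chosen action on reward."""
    if not rewarded:
        return probs[:]  # inaction: a penalty leaves the probabilities unchanged
    return [p + rate * (1.0 - p) if i == chosen else p * (1.0 - rate)
            for i, p in enumerate(probs)]


def train(success_prob, rounds=2000, rate=0.05, seed=0):
    """Repeatedly pick a resource and reinforce it when the environment rewards it."""
    rng = random.Random(seed)
    probs = [1.0 / len(success_prob)] * len(success_prob)
    for _ in range(rounds):
        chosen = rng.choices(range(len(probs)), weights=probs)[0]
        rewarded = rng.random() < success_prob[chosen]
        probs = lri_update(probs, chosen, rewarded, rate)
    return probs


# Hypothetical environment: three edge resources with different success rates.
final_probs = train([0.2, 0.9, 0.5])
```

The update keeps the action probabilities summing to one, and over many rounds the automaton tends to concentrate probability on resources that are rewarded more often, although convergence to the best one is not guaranteed for every run.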