Language-guided fashion image editing is challenging, as fashion image editing is local and requires high precision, while natural language cannot provide precise visual information for *** this paper, we propose LucIE, a novel unsupervised language-guided local image editing method for fashion *** adopts and modifies a recent text-to-image synthesis network, DF-GAN, as its ***, the synthesis backbone often changes the global structure of the input image, making local image editing *** increase structural consistency between input and edited images, we propose the Content-Preserving Fusion Module (CPFM). Unlike existing fusion modules, CPFM prevents iterative refinement on visual feature maps and accumulates additive modifications on RGB *** achieves local image editing explicitly with language-guided image segmentation and mask-guided image blending while only using image and text *** on the DeepFashion dataset shows that LucIE achieves state-of-the-art *** with previous methods, images generated by LucIE also exhibit fewer *** provide visualizations and perform ablation studies to validate LucIE and the *** also demonstrate and analyze limitations of LucIE, to provide a better understanding of LucIE.
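The abstract names mask-guided image blending but does not give its formula. As a rough illustration only, the sketch below shows generic per-pixel alpha blending, where a soft mask decides how much of the edited image replaces the input. The function names and the nested-list image layout are assumptions made for this example and are not taken from LucIE.

```python
def blend_pixel(orig_px, edit_px, m):
    """Blend one RGB pixel: m = 0 keeps the original, m = 1 takes the edit."""
    return tuple(m * e + (1.0 - m) * o for o, e in zip(orig_px, edit_px))


def mask_blend(original, edited, mask):
    """Blend two images (rows of RGB tuples) with a same-sized mask in [0, 1].

    Generic alpha blending, used here as a stand-in; the paper's exact
    blending rule is not specified in the abstract.
    """
    return [
        [blend_pixel(o, e, m) for o, e, m in zip(row_o, row_e, row_m)]
        for row_o, row_e, row_m in zip(original, edited, mask)
    ]


# A 1x3 image: black input, white edit, mask going from "keep" to "replace".
original = [[(0.0, 0.0, 0.0)] * 3]
edited = [[(1.0, 1.0, 1.0)] * 3]
mask = [[0.0, 0.5, 1.0]]
result = mask_blend(original, edited, mask)
```

Pixels where the mask is 0 are copied from the input unchanged, which is what lets an edit stay local.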
The grey wolf optimizer (GWO), a population-based meta-heuristic algorithm, mimics the predatory behavior of grey wolf *** exploring and introducing improvement mechanisms is one of the keys to drive the development and application of GWO *** overcome the premature convergence and stagnation of GWO, the paper proposes a multiple strategy grey wolf optimization algorithm (MSGWO). Firstly, a variable weights strategy is proposed to improve convergence rate by adjusting the weights ***, this paper proposes a reverse learning strategy, which randomly reverses some individuals to improve the global search ***, the chain predation strategy is designed to allow the search agent to be guided by both the best individual and the previous ***, this paper proposes a rotation predation strategy, which regards the position of the current best individual as the pivot and rotates other members to enhance the exploitation *** verify the performance of the proposed technique, MSGWO is compared with seven state-of-the-art meta-heuristics and four variant GWO algorithms on CEC2022 benchmark functions and three engineering optimization *** results demonstrate that MSGWO has better performance on most of the benchmark functions and is competitive in solving engineering design problems.
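For orientation, the following is a minimal sketch of the standard GWO baseline that MSGWO builds on, not of MSGWO itself: the abstract does not specify the four proposed strategies in enough detail to reproduce them. The test objective `sphere`, the function name `gwo`, and all parameter defaults are assumptions chosen for this example.

```python
import random


def sphere(x):
    """Test objective (assumed for illustration): minimum 0 at the origin."""
    return sum(v * v for v in x)


def gwo(objective, dim, bounds, n_wolves=20, n_iters=200, seed=0):
    """Baseline grey wolf optimizer for minimization over a box [lo, hi]^dim."""
    if n_wolves < 3:
        raise ValueError("GWO needs at least three wolves (alpha, beta, delta)")
    rng = random.Random(seed)
    lo, hi = bounds
    wolves = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_wolves)]
    best_pos, best_val = None, float("inf")

    for t in range(n_iters):
        # The three best wolves lead the hunt in this iteration.
        ranked = sorted(wolves, key=objective)
        alpha, beta, delta = (list(w) for w in ranked[:3])
        alpha_val = objective(alpha)
        if alpha_val < best_val:
            best_pos, best_val = alpha, alpha_val

        # Control parameter a decays linearly from 2 towards 0.
        a = 2.0 * (1.0 - t / n_iters)
        for i, wolf in enumerate(wolves):
            new_pos = []
            for d in range(dim):
                guided = 0.0
                for leader in (alpha, beta, delta):
                    A = 2.0 * a * rng.random() - a   # step scale in [-a, a]
                    C = 2.0 * rng.random()           # leader emphasis in [0, 2]
                    D = abs(C * leader[d] - wolf[d])
                    guided += leader[d] - A * D
                # Average the three leader-guided positions, clamp to bounds.
                new_pos.append(min(hi, max(lo, guided / 3.0)))
            wolves[i] = new_pos

    # Account for the population produced by the last update.
    final = min(wolves, key=objective)
    final_val = objective(final)
    if final_val < best_val:
        best_pos, best_val = final, final_val
    return best_pos, best_val


pos, val = gwo(sphere, dim=3, bounds=(-10.0, 10.0))
```

In this baseline the three leaders are weighted equally when averaging, which is the step that a variable-weights strategy such as the one the abstract mentions would presumably change.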
Demand forecasting has emerged as a crucial element in supply chain management. It is essential to identify anomalous data and continuously improve the forecasting model with new data. However, existing literature fai...
Recovering 3D human meshes from monocular images is an inherently ill-posed and challenging task due to depth ambiguity, joint occlusion, and ***, most existing approaches do not model such uncertainties, typically yielding a single reconstruction for one *** contrast, the ambiguity of the reconstruction is embraced and the problem is considered as an inverse problem for which multiple feasible solutions *** address these issues, the authors propose a multi-hypothesis approach, multi-hypothesis human mesh recovery (MH-HMR), to efficiently model the multi-hypothesis representation and build strong relationships among the hypothetical ***, the task is decomposed into three stages: (1) generating a reasonable set of initial recovery results (i.e., multiple hypotheses) given a single colour image; (2) modelling intra-hypothesis refinement to enhance every single-hypothesis feature; and (3) establishing inter-hypothesis communication and regressing the final human ***, the authors take further advantage of multiple hypotheses and the recovery process to achieve human mesh recovery from multiple uncalibrated *** with state-of-the-art methods, the MH-HMR approach achieves superior performance and recovers more accurate human meshes on challenging benchmark datasets, such as Human3.6M and 3DPW, while demonstrating its effectiveness across a variety of *** code will be publicly available at https://***/faculty/likun/projects/MH-HMR.
To fulfill the explosion of multi-modal data, multi-modal sentiment analysis (MSA) emerged and attracted widespread attention. Unfortunately, conventional multi-modal research relies on large-scale datasets. On the on...
It remains an interesting and challenging problem to synthesize a vivid and realistic singing face driven by music. In this paper, we present a method for this task with natural motions for the lips, facial expression, head pose, and eyes. Due to the coupling of mixed information for the human voice and backing music in common music audio signals, we design a decouple-and-fuse strategy to tackle the challenge. We first decompose the input music audio into a human voice stream and a backing music stream. Due to the implicit and complicated correlation between the two-stream input signals and the dynamics of the facial expressions, head motions, and eye states, we model their relationship with an attention scheme, where the effects of the two streams are fused seamlessly. Furthermore, to improve the expressiveness of the generated results, we decompose head movement generation in terms of speed and direction, and decompose eye state generation into short-term blinking and long-term eye closing, modeling them separately. We have also built a novel dataset, SingingFace, to support training and evaluation of models for this task, including future work on this topic. Extensive experiments and a user study show that our proposed method is capable of synthesizing vivid singing faces, qualitatively and quantitatively better than the prior state-of-the-art.
360° videos have become increasingly popular recently, but consume much more bandwidth than non-360° videos. Usually, 360° video streaming partitions the video surface into multiple tiles and encodes the tiles inde...
Colorization research has long been a focal point in computer vision and image processing. However, due to its inherently ill-posed nature, a reasonable assessment of the quality of their outcomes remains a challenge....
Much of the information that we use is geospatially referenced. The need for homogeneous representation of global geographic themes is recognised as critical for sustainable development goals. The richness of local ge...
Several recent papers have investigated the potential of language models as knowledge bases as well as the existence of severe biases when extracting factual knowledge. In this work, we focus on the factual probing pe...