ISBN: (print) 9798331314385
Hair editing is a critical image synthesis task that aims to edit hair color and hairstyle using text descriptions or reference images, while preserving irrelevant attributes (e.g., identity, background, clothing). Many existing methods address this task with StyleGAN. However, due to the limited spatial distribution of StyleGAN, it struggles with multi-color hair editing and facial preservation. Considering the advancements in diffusion models, we utilize Latent Diffusion Models (LDMs) for hairstyle editing. Our approach introduces Multi-stage Hairstyle Blend (MHB), which effectively separates control of hair color and hairstyle in the diffusion latent space. Additionally, we train a warping module to align the hair color with the target region. To further enhance multi-color hairstyle editing, we fine-tune a CLIP model on a multi-color hairstyle dataset. Our method not only tackles the complexity of multi-color hairstyles but also addresses the challenge of preserving original colors during diffusion editing. Extensive experiments showcase the superiority of our method in editing multi-color hairstyles while preserving facial attributes, given textual descriptions and reference images.
Multimodal Sentiment Analysis is a burgeoning research area, leveraging various modalities to predict the sentiment score. Nevertheless, previous studies have disregarded the impact of noise interference on specific modal sentiments during video recording, thereby compromising the accuracy of sentiment prediction. In this paper, we propose the Guided Circular Decomposition and Cross-Modal Recombination (GCD-CMR) model, which aims to eliminate contaminated sentiment features in a fine-grained way. To achieve this, we utilize tailored global information specific to each modality to guide the circular decomposing process in the GCD module, to produce a set of sentiment prototypes. Subsequently, in the CMR module, we align cross-modal sentiment prototypes and remove the contaminated prototypes for recombination. Experimental results on two publicly available datasets demonstrate that our model surpasses state-of-the-art models, confirming the effectiveness of our proposed method. We release the code at: https://***/nianhua20/GCD-CMR.
How to perform efficient service migration in a mobile edge environment has become one of the research hotspots in the field of service computing. Most service migration approaches assume that the mobile edge network ...
Now object detection based on deep learning tries different *** uses fewer data training networks to achieve the effect of large dataset ***, the existing methods usually do not achieve the balance between network parameters and training *** makes the information provided by a small amount of picture data insufficient to optimize model parameters, resulting in unsatisfactory detection *** improve the accuracy of few-shot object detection, this paper proposes a network based on the transformer and high-resolution feature extraction (THR). High-resolution feature extraction maintains the resolution representation of the *** and spatial attention are used to make the network focus on features that are more useful to the *** addition, the recently popular transformer is used to fuse the features of the existing *** compensates for the previous network failure by making full use of existing object *** on the Pascal VOC and MS-COCO datasets prove that the THR network has achieved better results than previous mainstream few-shot object detection methods.
The self-attention mechanism of Transformers, which captures long-range contextual information, has demonstrated significant potential in image ***, their ability to learn local, contextual relationships between pixels requires further *** methods face challenges in efficiently managing multi-scale features of different granularities from the encoder backbone, leaving room for improvement in their global representation and feature extraction *** address these challenges, we propose a novel Decoder with Multi-Head Feature Receptors (DMHFR), which receives multi-scale features from the encoder backbone and organizes them into three feature groups with different granularities: coarse, fine-grained, and full *** groups are subsequently processed by Multi-Head Feature Receptors (MHFRs) after feature capture and modeling *** include two Three-Head Feature Receptors (THFRs) and one Four-Head Feature Receptor (FHFR). Each group of features is passed through these MHFRs and then fed into axial transformers, which help the model capture long-range dependencies within the *** three MHFRs produce three distinct feature *** output from the FHFR serves as auxiliary features in the prediction head, and the prediction output and their losses will eventually be *** results show that the Transformer using DMHFR outperforms 15 state-of-the-art (SOTA) methods on five public ***, it achieved significant improvements in mean Dice scores over the classic Parallel Reverse Attention Network (PraNet) method, with gains of 4.1%, 2.2%, 1.4%, 8.9%, and 16.3% on the CVC-ClinicDB, Kvasir-SEG, CVC-T, CVC-ColonDB, and ETIS-LaribPolypDB datasets, respectively.
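The Dice score used as the evaluation metric above measures the overlap between a predicted segmentation mask and the ground-truth mask. The following is a minimal sketch, assuming binary masks flattened into lists of 0/1 values; the function names `dice_score` and `mean_dice`, and the convention that two empty masks score 1.0, are illustrative assumptions and not taken from the paper.

```python
from typing import Sequence


def dice_score(pred: Sequence[int], target: Sequence[int]) -> float:
    """Dice coefficient 2|A ∩ B| / (|A| + |B|) for two binary masks."""
    if len(pred) != len(target):
        raise ValueError("masks must have the same number of pixels")
    # Pixels that are foreground in both masks.
    intersection = sum(1 for p, t in zip(pred, target) if p and t)
    # Total foreground pixels across both masks.
    total = sum(1 for p in pred if p) + sum(1 for t in target if t)
    # Assumption: two empty masks are treated as a perfect match.
    if total == 0:
        return 1.0
    return 2.0 * intersection / total


def mean_dice(preds: Sequence[Sequence[int]],
              targets: Sequence[Sequence[int]]) -> float:
    """Average the per-image Dice score over a dataset."""
    scores = [dice_score(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)
```

A reported gain such as 4.1% would then correspond to the difference between two `mean_dice` values computed on the same dataset, one per method.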
The ticket automation provides crucial support for the normal operation of IT software systems. An essential task of ticket automation is to assign experts to solve upcoming tickets. However, facing thousands of ticke...
For large-scale multitask wireless sensor networks (LSM-WSNs), the traditional data collection mode can suffer from low energy efficiency in data transmission, since large-scale multitask scenarios result in a much higher packet collision probability, especially in harsh environments. Mobile data collection is an efficient way to acquire data and prolong network lifetime in LSM-WSNs. However, mobile collectors can run short of electricity, since the limited battery capacity of a mobile collector cannot cover the energy consumed by long-distance movement and massive data collection in large-scale multitask scenarios. Deploying wireless chargers to replenish the energy of mobile collectors is a feasible solution to this shortage, but it incurs extra charger deployment cost. In this paper, we focus on how to minimize this charger deployment cost, a problem that is NP-hard. By transforming it into a minimum-cost submodular cover problem, we devise an efficient approximation algorithm with a provable approximation ratio. Extensive simulation results show that our solution outperforms the other solutions under all tested configurations.
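To illustrate the minimum-cost submodular cover formulation mentioned above, here is a minimal sketch of the standard greedy heuristic on the simplest submodular function, set coverage. It is not the authors' algorithm: the candidate "sites", the items they cover, and the function name `greedy_cover` are hypothetical, and costs are assumed to be strictly positive.

```python
from typing import Dict, FrozenSet, Hashable, List, Set


def greedy_cover(
    universe: Set[Hashable],
    coverage: Dict[str, FrozenSet[Hashable]],
    cost: Dict[str, float],
) -> List[str]:
    """Repeatedly pick the site with the best (newly covered items / cost)
    ratio until every item in `universe` is covered."""
    uncovered = set(universe)
    chosen: List[str] = []
    while uncovered:
        best_site, best_ratio = None, 0.0
        for site, items in coverage.items():
            if site in chosen:
                continue
            # Marginal gain: items this site covers that are still uncovered.
            gain = len(items & uncovered)
            ratio = gain / cost[site]
            if ratio > best_ratio:
                best_site, best_ratio = site, ratio
        if best_site is None:
            # No remaining site adds coverage, so a full cover is impossible.
            raise ValueError("remaining items cannot be covered")
        chosen.append(best_site)
        uncovered -= coverage[best_site]
    return chosen
```

For plain weighted set cover, this greedy rule is known to give a logarithmic approximation ratio; the ratio proved in the paper may differ, since its objective and constraints are specific to charger deployment.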
At present, deep learning technologies are widely used in natural language processing tasks such as text summarization. In community question answering (CQA), an answer summary helps users get a complete answer quickly. There are still some problems with current answer summarization schemes, such as semantic inconsistency and repeated words. To address these, we propose a novel scheme, Answer Summarization based on Multi-layer Attention Scheme (ASMAM). Building on the traditional Seq2Seq model, we introduce self-attention and multi-head attention during sentence encoding and text encoding, respectively, which improves the text representation ability of the model. To address the "long-distance dependence" problem of RNNs and the large parameter count of LSTMs, we use GRUs as the recurrent units on both the encoder and decoder sides. Experiments on the Yahoo! Answers dataset demonstrate that the coherence and fluency of the generated summaries are superior to those of the benchmark model under the ROUGE evaluation system.
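The ROUGE evaluation mentioned above compares n-gram overlap between a generated summary and a reference summary. The following is a minimal sketch of ROUGE-1 (unigram overlap), assuming lowercase whitespace tokenization; the function name `rouge_1` is illustrative, and published results are normally computed with an established ROUGE toolkit rather than a hand-written function like this one.

```python
from collections import Counter
from typing import Dict


def rouge_1(candidate: str, reference: str) -> Dict[str, float]:
    """ROUGE-1 precision, recall and F1 from clipped unigram counts."""
    cand = Counter(candidate.lower().split())
    ref = Counter(reference.lower().split())
    # Counter intersection keeps the minimum count per token (clipping).
    overlap = sum((cand & ref).values())
    recall = overlap / sum(ref.values()) if ref else 0.0
    precision = overlap / sum(cand.values()) if cand else 0.0
    f1 = 2 * precision * recall / (precision + recall) if overlap else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}
```

Repeated words in a generated summary lower the precision term, because clipping prevents a repeated token from being credited more often than it appears in the reference.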
Multi-stability and control of a hyperchaotic system are studied in this paper. First, a new 4D hyperchaotic system containing hidden attractors is modeled. Second, the coexistence of different attractors is confirmed, and transient hyperchaotic phenomena are found as the parameters are changed. Then, the Hamiltonian energy function of the system is calculated and an energy feedback controller is designed to control the system. Finally, numerical simulations are performed to verify the validity of the results.
ISBN: (digital) 9798350368550
ISBN: (print) 9798350368567
Serverless edge computing is emerging as an enabler for provisioning scalable and flexible Function-as-a-Service (FaaS) applications with lightweight function instances at the network edge. In serverless edge computing, function instances with inter-dependencies are scheduled to nearby edge nodes in a distributed manner. However, the heterogeneity and unpredictability of edge networks make it challenging to reach optimal scheduling decisions that guarantee the execution performance of applications without any prior knowledge. In view of this challenge, a Serverless Function Scheduling Method, named SFSM, is proposed in this paper for FaaS applications over edge computing. First, a long-term optimization problem is formulated to reduce completion time and is decoupled into time-slot sub-problems via Lyapunov optimization. To avoid redundant cross-edge data transmission between functions, a two-level graph optimization is designed with vertical and horizontal data merging. Then, SFSM incorporates an online multi-armed bandit-based scheduling algorithm that requires only the context of requests, without complete information about the edge networks. Finally, extensive experimental results based on real-world datasets demonstrate the effectiveness and superiority of SFSM.
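To give a sense of how a multi-armed bandit can choose among edge nodes without prior knowledge of the network, here is a minimal sketch of the classic UCB1 rule. It is a simplification and not the SFSM algorithm: SFSM is described as using request context, which this sketch ignores. The class name `UCB1Scheduler` and the assumption that rewards lie in [0, 1] (for example, a normalized inverse completion time) are hypothetical.

```python
import math
from typing import List


class UCB1Scheduler:
    """Chooses an edge node per request using the UCB1 bandit rule."""

    def __init__(self, num_nodes: int) -> None:
        self.counts: List[int] = [0] * num_nodes    # times each node was used
        self.means: List[float] = [0.0] * num_nodes  # running mean reward

    def select(self) -> int:
        # Try every node once before trusting any estimate.
        for node, n in enumerate(self.counts):
            if n == 0:
                return node
        total = sum(self.counts)
        # Mean reward plus an exploration bonus that shrinks with use.
        scores = [
            m + math.sqrt(2.0 * math.log(total) / n)
            for m, n in zip(self.means, self.counts)
        ]
        return max(range(len(scores)), key=scores.__getitem__)

    def update(self, node: int, reward: float) -> None:
        # Incremental update of the running mean for the chosen node.
        self.counts[node] += 1
        self.means[node] += (reward - self.means[node]) / self.counts[node]
```

In use, the scheduler would call `select()` for each incoming request, dispatch the function to the returned node, and then call `update()` with the observed reward once the function completes.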