Large Language Models (LLMs) demonstrate robust capabilities across various fields, leading to a paradigm shift in LLM-enhanced Recommender System (RS). Research to date focuses on point-wise and pair-wise recommendat...
As image manipulation technology advances rapidly, the malicious use of image tampering has alarmingly escalated, posing a significant threat to social ***. In the realm of image tampering localization, accurately localizing limited samples, multiple types, and various sizes of regions still poses a multitude of challenges. These issues impede the model's universality and generalization capability and detrimentally affect its performance. To tackle these issues, we propose FL-MobileViT, an improved MobileViT model devised for image tampering localization. The proposed model utilizes a dual-stream architecture that independently processes the RGB and noise domains, and captures richer traces of tampering through dual-stream ***. In addition, the model incorporates the Focused Linear Attention mechanism within the lightweight network (MobileViT). This substitution significantly diminishes computational complexity and resolves homogeneity problems associated with traditional Transformer attention mechanisms, enhancing feature extraction diversity and improving the model's localization ***. To comprehensively fuse the generated results from both feature extractors, we introduce the ASPP architecture for multi-scale feature ***. This facilitates a more precise localization of tampered regions of various sizes. Furthermore, to bolster the model's generalization ability, we adopt a contrastive learning method and devise a joint optimization training strategy that leverages fused features and captures the disparities in feature distribution in tampered ***. This strategy enables the learning of contrastive loss at various stages of the feature extractor and employs it as an additional constraint condition in conjunction with cross-entropy loss. As a result, overfitting issues are effectively alleviated, and the differentiation between tampered and untampered regions is ***. Evaluations on five benchmark datasets (IMD-20, CASIA, NIST-16, Columbia and Coverage) validat...
This systematic literature review explores the application of transformer models in early detection of human depression, encompassing text, audio, and video data modalities. Transformer architectures, notably BERT for...
Voice is one of the most widely used media for information transmission in human society. While high-quality synthetic voices are extensively utilized in various applications, they pose significant risks to content se...
Online multi-label streaming feature selection has gained significant interest in high-volume data applications. Neighborhood Rough Set (NRS) has emerged as a practical tool for handling multi-label feature selection....
The human brain has an easy time analyzing and processing images. The brain is able to rapidly deconstruct and distinguish an image's various components when the eye perceives it. With the Convolutional Neural Ne...
Event Relation Extraction (ERE) aims to extract various types of relations between different events within texts. Although Large Language Models (LLMs) have demonstrated impressive capabilities in many natural languag...
Human neuroimaging datasets provide rich multi-scale spatiotemporal information about the state of the brain. Most current methods, such as spectral analysis, focus on a single facet of these datasets and do not take ...
N-ary Knowledge Graphs (NKGs), where a fact can involve more than two entities, have gained increasing attention. Link Prediction in NKGs (LPN) aims to predict missing elements in facts to facilitate the completion of...
The explosive growth of social media means portrait editing and retouching are in high demand. While portraits are commonly captured and stored as raster images, editing raster images is non-trivial and requires the user to be highly skilled. Aiming at developing intuitive and easy-to-use portrait editing tools, we propose a novel vectorization method that can automatically convert raster images into a 3-tier hierarchical representation. The base layer consists of a set of sparse diffusion curves (DCs) which characterize salient geometric features and low-frequency colors, providing a means for semantic color transfer and facial expression editing. The middle level encodes specular highlights and shadows as large, editable Poisson regions (PRs) and allows the user to directly adjust illumination by tuning the strength and changing the shapes of PRs. The top level contains two types of pixel-sized PRs for high-frequency residuals and fine details such as pimples and ***. We train a deep generative model that can produce high-frequency residuals ***. Thanks to the inherent meaning in vector primitives, editing portraits becomes easy and intuitive. In particular, our method supports color transfer, facial expression editing, highlight and shadow editing, and automatic retouching. To quantitatively evaluate the results, we extend the commonly used FLIP metric (which measures color and feature differences between two images) to consider illumination. The new metric, illumination-sensitive FLIP, can effectively capture salient changes in color transfer results, and is more consistent with human perception than FLIP and other quality measures for portrait ***. We evaluate our method on the FFHQR dataset and show it to be effective for common portrait editing tasks, such as retouching, light editing, color transfer, and expression editing.