Instruction Fine-Tuning (IFT) has emerged as an essential step in training large language models to robustly carry out tasks of interest. However, there has been no systematic investigation of the underlying mechanisms of ...
Large language models (LLMs) are highly effective in various natural language processing (NLP) tasks. However, they are susceptible to producing unreliable conjectures in ambiguous contexts, a phenomenon called hallucination. This ...
Authors: 杨晓雨, 周琨, 贺欣, 张立军
Affiliations: State Key Laboratory of Integrated Optoelectronics, Key Laboratory of Automobile Materials of MOE, Key Laboratory of Material Simulation Methods & Software of MOE, and School of Materials Science and Engineering, Jilin University, Changchun 130012, China
Recent advancements in model pruning have focused on developing new algorithms and improving upon benchmarks. However, the practical application of these algorithms across various models and platforms remains a signif...
This study introduces CLIP-Flow, a novel network for generating images from a given image or *** effectively utilize the rich semantics contained in both modalities, we designed a semantics-guided methodology for image- and text-to-image *** particular, we adopted Contrastive Language-Image Pretraining (CLIP) as an encoder to extract semantics and StyleGAN as a decoder to generate images from such ***, to bridge the embedding space of CLIP and latent space of StyleGAN, real NVP is employed and modified with activation normalization and invertible *** the images and text in CLIP share the same representation space, text prompts can be fed directly into CLIP-Flow to achieve text-to-image *** conducted extensive experiments on several datasets to validate the effectiveness of the proposed image-to-image synthesis *** addition, we tested on the public dataset Multi-Modal CelebA-HQ for text-to-image *** validated that our approach can generate high-quality text-matching images and is comparable with state-of-the-art methods, both qualitatively and quantitatively.
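The abstract above names real NVP as the bridge between the CLIP embedding space and the StyleGAN latent space. The basic building block of real NVP is the affine coupling layer, which is invertible by construction. The following is a minimal sketch of that generic block in plain Python, not the CLIP-Flow implementation: the function names and the toy scale and shift functions are assumptions made for illustration (a real model would use learned networks), and the activation normalization mentioned in the abstract is omitted.

```python
import math


def _scale_and_shift(cond):
    """Toy stand-ins for the learned scale s(.) and shift t(.) networks (hypothetical)."""
    s = [math.tanh(0.5 * c) for c in cond]  # bounded log-scale
    t = [0.1 * c + 1.0 for c in cond]       # shift
    return s, t


def coupling_forward(x):
    """Affine coupling: the first half passes through, the second half is rescaled and shifted."""
    if len(x) % 2 != 0:
        raise ValueError("this sketch assumes an even-length input")
    half = len(x) // 2
    x1, x2 = x[:half], x[half:]
    s, t = _scale_and_shift(x1)
    y2 = [b * math.exp(si) + ti for b, si, ti in zip(x2, s, t)]
    log_det = sum(s)  # log|det J| of the transform is the sum of the log-scales
    return x1 + y2, log_det


def coupling_inverse(y):
    """Exact inverse: s and t are recomputed from the untouched first half."""
    if len(y) % 2 != 0:
        raise ValueError("this sketch assumes an even-length input")
    half = len(y) // 2
    y1, y2 = y[:half], y[half:]
    s, t = _scale_and_shift(y1)
    x2 = [(b - ti) * math.exp(-si) for b, si, ti in zip(y2, s, t)]
    return y1 + x2


if __name__ == "__main__":
    x = [0.3, -1.2, 2.0, 0.7]
    y, log_det = coupling_forward(x)
    x_rec = coupling_inverse(y)
    print(max(abs(a - b) for a, b in zip(x, x_rec)))  # reconstruction error, near zero
```

Because the scale and shift depend only on the half that passes through unchanged, the inverse never has to invert those functions, which is why they can be arbitrary networks in practice.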
ISBN: 9783031727085 (print); 9783031727092
Recent evaluations have highlighted the tapered posit number format as a promising alternative to the uniform precision IEEE 754 floating-point numbers, which suffer from various deficiencies. Although the posit encoding scheme offers superior coding efficiency at values close to unity, its efficiency markedly diminishes with deviation from unity. This reduction in efficiency leads to suboptimal encodings and a consequent diminution in dynamic range, thereby rendering posits suboptimal for general-purpose computer arithmetic. This paper introduces and formally proves 'takum' as a novel general-purpose logarithmic tapered-precision number format, synthesising the advantages of posits in low-bit applications with high encoding efficiency for numbers distant from unity. Takums exhibit an asymptotically constant dynamic range in terms of bit string length, which is delineated in the paper to be suitable for a general-purpose number format. It is demonstrated that takums either match or surpass existing alternatives. Moreover, takums address several issues previously identified in posits while unveiling novel and beneficial arithmetic properties.
Since different kinds of face forgeries leave similar forgery traces in videos, learning the common features from different kinds of forged faces would achieve promising generalization ability of forgery ***, to accurately detect known forgeries while ensuring high generalization ability of detecting unknown forgeries, we propose an intra-inter network (IIN) for face forgery detection (FFD) in videos with continual *** proposed IIN mainly consists of three modules, i.e., intra-module, inter-module, and forged trace masking module (FTMM). Specifically, the intra-module is trained for each kind of face forgery by supervised learning to extract special features, while the inter-module is trained by self-supervised learning to extract the common *** a result, the common and special features of the different forgeries are decoupled by the two feature learning modules, and then the decoupled common features can be utilized to achieve high generalization ability for ***, the FTMM is deployed for contrastive learning to further improve detection *** experimental results on the FaceForensics++ dataset demonstrate that the proposed IIN outperforms state-of-the-art methods in ***, the generalization ability of the IIN verified on the DFDC and Celeb-DF datasets demonstrates that the proposed IIN significantly improves the generalization ability for FFD.
The Sunway family of supercomputers has produced a series of remarkable achievements. However, the toolchains they provide are imperfect, which poses great challenges for the development of high-performance a...
Research on mass gathering events is critical for ensuring public security and maintaining social ***, most of the existing works focus on crowd behavior analysis areas such as anomaly detection and crowd counting, and there is a relative lack of research on mass gathering *** believe real-time detection and monitoring of mass gathering behaviors are essential for mitigating potential security risks and ***, it is imperative to develop a method capable of accurately identifying and localizing mass gatherings before disasters occur, enabling prompt and effective *** address this problem, we propose an innovative Event-Driven Attention Network (EDAN), which for the first time achieves image-text matching in the scenario of mass gathering events with good *** image-text retrieval methods based on global alignment have difficulty capturing the local details within complex scenes, limiting retrieval *** local alignment-based methods are more effective at extracting detailed features, they frequently process raw textual features directly, which often contain ambiguities and redundant information that can diminish retrieval efficiency and degrade model *** overcome these challenges, EDAN introduces an Event-Driven Attention Module that adaptively focuses attention on image regions or textual words relevant to the event *** calculating the semantic distance between event labels and textual content, this module significantly reduces computational complexity and enhances retrieval *** validate the effectiveness of EDAN, we construct a dedicated multimodal dataset tailored for the analysis of mass gathering events, providing a reliable foundation for subsequent *** conduct comparative experiments with other methods on our dataset; the experimental results demonstrate the effectiveness of *** the image-to-text retrieval task, EDAN achieved the best performance on the R@5 metric, w
In high-stakes sectors such as network security and IoT security, accurately distinguishing between normal and anomalous data is critical due to the significant implications for operational success and safety in decision...