Search Results - Inner Mongolia University Library

Fake Face Detection Based on Fusion of Spatial Texture and High-Frequency Noise

Chinese Journal of Electronics, 2025, Vol. 34, No. 1, pp. 212-221

Authors: Dengyong Zhang, Feifan Qi, Jiahao Chen, Jiaxin Chen, Rongrong Gong, Yuehong Tian, Lebing Zhang. Affiliations: Hunan Provincial Key Laboratory of Intelligent Processing of Big Data on Transportation, School of Computer and Communication Engineering, Changsha University of Science and Technology; School of Computer and Communication Engineering, Changsha University of Science and Technology; Changsha Social Work College; Changkuangao Beijing Technology Co. Ltd.; School of Computer and Artificial Intelligence, Huaihua University

The rapid development of the Internet has led to the widespread dissemination of manipulated facial images, significantly impacting people's daily lives. With the continuous advancement of Deepfake technology, the generated counterfeit facial images have become increasingly challenging to distinguish. There is an urgent need for a more robust and convincing detection method. Current detection methods mainly operate in the spatial domain and transform the spatial domain into other domains for analysis. With the emergence of transformers, some researchers have also combined traditional convolutional networks with transformers for detection. This paper explores the artifacts left by Deepfakes in various domains and, based on this exploration, proposes a detection method that utilizes the steganalysis rich model to extract high-frequency noise to complement spatial features. We have designed two main modules to fully leverage the interaction between these two aspects based on traditional convolutional neural networks. The first is the multi-scale mixed feature attention module, which introduces artifacts from high-frequency noise into spatial textures, thereby enhancing the model's learning of spatial texture features. The second is the multi-scale channel attention module, which reduces the impact of background noise by weighting the features. Our proposed method was experimentally evaluated on mainstream datasets, and a significant amount of experimental results demonstrate the effectiveness of our approach in detecting Deepfake forged faces, outperforming the majority of existing methods.

Keywords: Deepfakes; Adaptation models; Attention mechanisms; Noise; Transforms; Transformers; Feature extraction; Internet; Convolutional neural networks; Faces

Social Media User Behavior Mining and Sentiment Analysis Based on Computer Big Data

4th IEEE International Conference on Mobile Networks and Wireless Communications, ICMNWC 2024

Authors: Zhang, Qiang; Hou, Yutong. Affiliation: School of Artificial Intelligence and Big Data, Chongqing Metropolitan College of Science and Technology, Chongqing, China

ISBN: (print) 9798350352931

Faced with the rapid development of social networks and the enormous business opportunities they contain, data mining and analysis based on social networks has become an inevitable trend. By utilizing various technologies such as Visual C++ 6.0 and Pajek, data collection and analysis were achieved. The user ID (Identification) was added to the Sina server and relevant user information was requested through the API (Application Programming Interface). Through the data analysis module, the stored data was processed appropriately for different research objects, and network analysis tools can be used to visually display certain results. Using this model, sentiment classification was performed on Weibo samples with existing labels, and the accuracy, recall and F1 value of manual labeling were evaluated. The positive accuracy was 0.85; recall was 0.82; F1 value was 0.84. This article helps to improve user experience and explore more user value. © 2024 IEEE.

Keywords: data accuracy
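The abstract above reports a positive-class accuracy of 0.85, a recall of 0.82, and an F1 value of 0.84. As a minimal sketch of how such figures relate, the snippet below computes F1 as the harmonic mean of precision and recall. It assumes that the reported "positive accuracy" means precision on the positive class, which the abstract does not confirm; the function name `f1_score` is illustrative and does not come from the paper.

```python
def f1_score(precision: float, recall: float) -> float:
    """Return the harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Assumption: the abstract's "positive accuracy" is treated as precision.
f1 = f1_score(0.85, 0.82)
print(round(f1, 3))
```

With these rounded inputs the result is about 0.835, close to the reported 0.84; the small gap may simply reflect rounding in the published figures.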

A Survey of LLM Datasets: From Autoregressive Model to AI Chatbot

Journal of Computer Science & Technology, 2024, Vol. 39, No. 3, pp. 542-566

Authors: 杜非, 马新建, 杨婧如, 柳熠, 罗超然, 王学斌, 姜海鸥, 景翔. Affiliations: National Key Laboratory of Data Space Technology and System, Beijing 100195, China; Advanced Institute of Big Data, Beijing 100195, China; Fu Foundation School of Engineering and Applied Science, Columbia University, NY 10027, U.S.A.; School of Software and Microelectronics, Peking University, Beijing 100091, China; CCF; IEEE

Since OpenAI opened access to ChatGPT, large language models (LLMs) become an increasingly popular topic attracting researchers' attention from abundant ***, public researchers meet some problems when developing LLMs given that most of the LLMs are produced by industries and the training details are typically *** datasets are an important setup of LLMs, this paper does a holistic survey on the training datasets used in both the pre-train and fine-tune *** paper first summarizes 16 pre-train datasets and 16 fine-tune datasets used in the state-of-the-art ***, based on the properties of the pre-train and fine-tune processes, it comments on pre-train datasets from quality, quantity, and relation with models, and comments on fine-tune datasets from quality, quantity, and *** study then critically figures out the problems and research trends that exist in current LLM *** study helps public researchers train and investigate LLMs by visual cases and provides useful comments to the research community regarding data *** the best of our knowledge, this paper is the first to summarize and discuss datasets used in both autoregressive and chat *** survey offers insights and suggestions to researchers and LLM developers as they build their models, and contributes to the LLM study by pointing out the existing problems of LLM studies from the perspective of data.

Keywords: large language model (LLM); autoregressive model; AI chatbot; natural language processing (NLP) corpora; OpenAI

CrossFi: A Cross Domain Wi-Fi Sensing Framework Based on Siamese Network

IEEE Internet of Things Journal, 2025, Vol. 12, No. 12, pp. 20138-20155

Authors: Zhao, Zijian; Chen, Tingwei; Cai, Zhijie; Li, Xiaoyang; Li, Hang; Chen, Qimei; Zhu, Guangxu. Affiliations: Shenzhen Research Institute of Big Data, Shenzhen 518115, China; Sun Yat-sen University, School of Computer Science and Engineering, Guangzhou 510275, China; Shenzhen Research Institute of Big Data, Shenzhen 518115, China; Wuhan University, School of Electronic Information, Wuhan 430072, China

In recent years, Wi-Fi sensing has garnered significant attention due to its numerous benefits, such as privacy protection, low cost, and penetration ability. Extensive research has been conducted in this field, focusing on areas such as gesture recognition, people identification, and fall detection. However, many data-driven methods encounter challenges related to domain shift, where the model fails to perform well in environments different from the training data. One major factor contributing to this issue is the limited availability of Wi-Fi sensing datasets, which makes models learn excessive irrelevant information and over-fit to the training set. Unfortunately, collecting large-scale Wi-Fi sensing datasets across diverse scenarios is a challenging task. To address this problem, we propose CrossFi, a siamese network-based approach that excels in both in-domain scenario and cross-domain scenario, including few-shot, zero-shot scenarios, and even works in few-shot new-class scenario where testing set contains new categories. The core component of CrossFi is a sample-similarity calculation network called CSi-Net, which improves the structure of the siamese network by using an attention mechanism to capture similarity information, instead of simply calculating the distance or cosine similarity. Based on it, we develop an extra Weight-Net that can generate a template for each class, so that our CrossFi can work in different scenarios. Experimental results demonstrate that our CrossFi achieves state-of-the-art performance across various scenarios. In gesture recognition task, our CrossFi achieves an accuracy of 98.17% in in-domain scenario, 91.72% in one-shot cross-domain scenario, 64.81% in zero-shot cross-domain scenario, and 84.75% in one-shot new-class scenario. The code for our model is publicly available at https://***/RS2002/CrossFi. © 2014 IEEE.

Keywords: Gesture recognition
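The CrossFi abstract contrasts its attention-based CSi-Net with siamese networks that "simply" compute a distance or cosine similarity, and mentions generating one template per class. As a rough illustration of that simpler baseline only (not CrossFi itself), the sketch below assigns a query embedding to the class whose template has the highest cosine similarity. The function names, class labels, and embedding values are all hypothetical.

```python
import math
from typing import Dict, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def classify_by_template(
    embedding: Sequence[float], templates: Dict[str, Sequence[float]]
) -> str:
    """Return the label whose class template is most similar to the embedding."""
    return max(
        templates,
        key=lambda label: cosine_similarity(embedding, templates[label]),
    )


# Hypothetical class templates and query embedding, for illustration only.
templates = {"push": [1.0, 0.0, 0.2], "swipe": [0.0, 1.0, 0.1]}
query = [0.9, 0.1, 0.3]
print(classify_by_template(query, templates))
```

In a siamese setup the embeddings would come from a shared encoder applied to both inputs; according to the abstract, CrossFi replaces the fixed similarity function with a learned, attention-based one.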

A survey on multimodal large language models

National Science Review, 2024, Vol. 11, No. 12, pp. 277-296

Authors: Shukang Yin, Chaoyou Fu, Sirui Zhao, Ke Li, Xing Sun, Tong Xu, Enhong Chen. Affiliations: School of Artificial Intelligence and Data Science, University of Science and Technology of China; State Key Laboratory for Novel Software Technology, Nanjing University; School of Intelligence Science and Technology, Nanjing University; Tencent YouTu Lab

Recently, the multimodal large language model (MLLM) represented by GPT-4V has been a new rising research hotspot, which uses powerful large language models (LLMs) as a brain to perform multimodal tasks. The surprising emergent capabilities of the MLLM, such as writing stories based on images and optical character recognition–free math reasoning, are rare in traditional multimodal methods, suggesting a potential path to artificial general intelligence. To this end, both academia and industry have endeavored to develop MLLMs that can compete with or even outperform GPT-4V, pushing the limit of research at a surprising speed. In this paper, we aim to trace and summarize the recent progress of MLLMs. First, we present the basic formulation of the MLLM and delineate its related concepts, including architecture, training strategy and data, as well as evaluation. Then, we introduce research topics about how MLLMs can be extended to support more granularity, modalities, languages and scenarios. We continue with multimodal hallucination and extended techniques, including multimodal in-context learning, multimodal chain of thought and LLM-aided visual reasoning. To conclude the paper, we discuss existing challenges and point out promising research directions.

Keywords: multimodal large language model; vision language model; large language model

Unsupervised social network embedding via adaptive specific mappings

Frontiers of Computer Science, 2024, Vol. 18, No. 3, pp. 61-71

Authors: Youming GE, Cong HUANG, Yubao LIU, Sen ZHANG, Weiyang KONG. Affiliations: School of Computer Science and Engineering, Sun Yat-Sen University, Guangzhou 510275, China; Guangdong Key Laboratory of Big Data Analysis and Processing, Guangzhou 510006, China

In this paper, we address the problem of unsupervised social network embedding, which aims to embed network nodes, including node attributes, into a latent low dimensional *** recent methods, the fusion mechanism of node attributes and network structure has been proposed for the problem and achieved impressive prediction ***, the non-linear property of node attributes and network structure is not efficiently fused in existing methods, which is potentially helpful in learning a better network *** this end, in this paper, we propose a novel model called ASM (Adaptive Specific Mapping) based on encoder-decoder *** encoder, we use the kernel mapping to capture the non-linear property of both node attributes and network *** particular, we adopt two feature mapping functions, namely an untrainable function for node attributes and a trainable function for network *** the mapping functions, we obtain the low dimensional feature vectors for node attributes and network structure, ***, we design an attention layer to combine the learning of both feature vectors and adaptively learn the node *** encoder, we adopt the component of reconstruction for the training process of learning node attributes and network *** conducted a set of experiments on seven real-world social network *** experimental results verify the effectiveness and efficiency of our method in comparison with state-of-the-art baselines.

Keywords: network embedding; specific kernel mapping; attention mechanism

Real-time distance field acceleration based free-viewpoint video synthesis for large sports fields

Computational Visual Media, 2024, Vol. 10, No. 2, pp. 331-353

Authors: Yanran Dai, Jing Li, Yuqi Jiang, Haidong Qin, Bang Liang, Shikuan Hong, Haozhe Pan, Tao Yang. Affiliations: School of Telecommunications Engineering, Xidian University, Xi’an 710071, China; National Engineering Laboratory for Integrated AeroSpace-Ground-Ocean Big Data Application Technology, SAIIP, the School of Computer Science, Northwestern Polytechnical University, Xi’an 710129, China

Free-viewpoint video allows the user to view objects from any virtual perspective, creating an immersive visual *** technology enhances the interactivity and freedom of multimedia ***, many free-viewpoint video synthesis methods hardly satisfy the requirement to work in real time with high precision, particularly for sports fields having large areas and numerous moving *** address these issues, we propose a free-viewpoint video synthesis method based on distance field *** central idea is to fuse multiview distance field information and use it to adjust the search step size *** step size search is used in two ways: for fast estimation of multiobject three-dimensional surfaces, and synthetic view rendering based on global occlusion *** have implemented our ideas using parallel computing for interactive display, using CUDA and OpenGL frameworks, and have used real-world and simulated experimental datasets for *** results show that the proposed method can render free-viewpoint videos with multiple objects on large sports fields at 25 ***, the visual quality of our synthetic novel viewpoint images exceeds that of state-of-the-art neural-rendering-based methods.

Keywords: free-viewpoint video; view synthesis; camera array; distance field; sports video
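The abstract above describes using distance field information to adjust the search step size along a ray. The paper's multiview fusion is not specified here, so the following is only a minimal sketch of the general idea, using standard sphere tracing on a single analytic distance field: at each point the ray advances by the distance to the nearest surface, so steps are large in empty space and small near geometry. The scene (one sphere), the function names, and all parameter values are assumptions for illustration.

```python
import math
from typing import Callable, Optional, Tuple

Vec3 = Tuple[float, float, float]


def sphere_sdf(p: Vec3, center: Vec3 = (0.0, 0.0, 5.0), radius: float = 1.0) -> float:
    """Signed distance from point p to a sphere (negative inside)."""
    return math.dist(p, center) - radius


def march_ray(
    origin: Vec3,
    direction: Vec3,
    sdf: Callable[[Vec3], float],
    max_steps: int = 64,
    eps: float = 1e-4,
    max_dist: float = 100.0,
) -> Optional[float]:
    """Sphere tracing: step by the distance-field value until a surface is hit.

    `direction` is assumed to be unit length. Returns the ray parameter t at
    the hit point, or None if the ray leaves the scene or runs out of steps.
    """
    t = 0.0
    for _ in range(max_steps):
        point = tuple(o + t * d for o, d in zip(origin, direction))
        dist_to_surface = sdf(point)
        if dist_to_surface < eps:
            return t  # close enough to the surface
        t += dist_to_surface  # adaptive step: safe to advance this far
        if t > max_dist:
            return None
    return None


hit = march_ray((0.0, 0.0, 0.0), (0.0, 0.0, 1.0), sphere_sdf)
miss = march_ray((0.0, 0.0, 0.0), (0.0, 1.0, 0.0), sphere_sdf)
print(hit, miss)
```

A fixed-step search would need many small steps to reach the same precision; letting the distance field set the step size is what makes this kind of search suitable for real-time use.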

Abnormal Clustering and Cross Slicing Transformer for Insect Fine-Grained Image Classification

2nd International Conference on Algorithm, Image Processing and Machine Vision, AIPMV 2024

Authors: Mei, Aokun; Huo, Hua. Affiliations: Big Data and Computing Intelligence Engineering Technology Research Center, School of Information Engineering, Henan University of Science and Technology; Big Data Analysis Laboratory of Henan Medical, Luoyang 471023, China

ISBN: (print) 9798350390254

Insect fine-grained image classification is an application scenario in fine-grained image classification. It not only has the characteristics of small inter-class differences and large intra-class differences, but also has the difficulty that some categories have multiple life-stage forms, which makes the general fine-grained image classification model difficult to play a role in insect scenes. To this end, based on the Vision Transformer, we design a fine-grained classification network for insect images based on abnormal clustering and cross slicing, called ACCS-Trans. In the first stage of the model, we use segmentation and clustering operations to distinguish the special morphology of insects in those few-shot life stages. The model can avoid the interference of few-sample abnormal morphology on the class feature extraction of the current class during training. The second stage is the cross slicing module, which uses the anchor box of the image sample segmentation region in the first stage to cut the original sample image to form the main target image, which is used as the information supplement area of the original image. Finally, the image is divided into doubling patch groups by two vertical patch operations. In the third stage, we concatenate the multiplication patch group and input it into the Vision Transformer network for class feature extraction. We fully experiment with ACCS-Trans on two insect image datasets. Compared with the current mainstream fine-grained image classification models, ACCS-Trans achieves state-of-the-art effects on both datasets, and we do ablation experiments for each module. The effect of each module on our ACCS-Trans is analyzed. The excellent performance of our ACCS-Trans in insect scenes is verified in these experiments, which provide new ideas for the task of fine-grained classification of insect images. © 2024 IEEE.

Keywords: Image segmentation

MULTI-MEDIA IMAGE AND VIDEO OVERLAY TEXT EXTRACTION BASED ON BAYESIAN CLASSIFICATION ALGORITHM

Scalable Computing, 2024, Vol. 25, No. 4, pp. 2664-2670

Authors: YIN, LIANGLIANG; WANG, ZIQIANG. Affiliations: School of Big Data Science, Hebei Finance University, Baoding 071051, China; Jiangsu Ocean University, Jiangsu 222005, China

With the continuous development of Internet technology, using multimedia for virtual application has become a new way. In this paper, by introducing the Bayesian classification algorithm, multimedia graphics and video for carding, testing and implementation of edge image of fine processing, matching the corresponding text for quick positioning, at the same time use the filter for image and video background picture, further classification, distinguish between background and text, the extraction of superposition character. Simulation results show that the Bayesian classification algorithm is effective and improves the efficiency and accuracy of image and video processing. © (2024), SCPE.

Keywords: Extraction

CBPF: A Novel Method for Filtering Poisoned Data Based on Composite Backdoor Attacks

IEEE Internet of Things Journal, 2025, Vol. 12, No. 13, pp. 25136-25147

Authors: Xia, Hanfeng; Hong, Haibo; Wang, Ruili; Sun, Yiru; Ding, Hao. Affiliations: Zhejiang Gongshang University, Zhejiang Key Laboratory of Big Data and Future E-Commerce Technology, School of Computer Science and Technology, Hangzhou, China; Massey University, School of Mathematical and Computational Sciences, Auckland, New Zealand

Backdoor attacks involve the injection of a limited quantity of poisoned samples containing triggers into the training dataset. During the inference stage, backdoor attacks can uphold a high level of accuracy for normal examples, yet when presented with trigger-containing instances, the model may erroneously predict them as the targeted class designated by the attacker. This paper addresses the challenge of backdoor attacks by developing a novel method for filtering poisoned samples. We primarily leverage two key characteristics of backdoor attacks: 1) Multiple backdoors can exist simultaneously within a single model; 2) The discovery through Composite Backdoor Attack (CBA) that altering two triggers in a sample to new target labels does not compromise the original functionality of the triggers, yet enables the prediction of the data as a new target class when both triggers are present simultaneously. Therefore, a novel three-stage poisoning data filtering approach, known as Composite Backdoor Poisoning Filtering (CBPF), is proposed as an effective solution. Firstly, utilizing the identified distinctions in output between poisoned and clean samples, a subset of data is partitioned to include both poisoned and clean data. Subsequently, benign triggers are incorporated and labels are adjusted to create new target and benign target classes, thereby prompting the poisoned and clean data to be classified as distinct entities during the inference stage. The experimental results indicate that CBPF is successful in filtering out poisoned data produced by seven advanced attacks on CIFAR-10, GTSRB and ImageNet-12. On average, CBPF attains a notable filtering success rate of 99.88% for these attacks on CIFAR-10. Additionally, the model trained on the uncontaminated samples exhibits sustained high accuracy levels. © 2014 IEEE.

Keywords: Wiener filtering
