Search Results - Inner Mongolia University Library

2024 Findings of the Association for Computational Linguistics, EMNLP 2024

Authors: Chao, Wen-Shuo; Zheng, Zhi; Zhu, Hengshu; Liu, Hao. Affiliations: The Hong Kong University of Science and Technology, Guangzhou, China; School of Data Science, University of Science and Technology of China, China; Computer Network Information Center, Chinese Academy of Sciences, China

ISBN: (print) 9798891761681

Large Language Models (LLMs) demonstrate robust capabilities across various fields, leading to a paradigm shift in LLM-enhanced Recommender System (RS). Research to date focuses on point-wise and pair-wise recommendation paradigms, which are inefficient for LLM-based recommenders due to high computational costs. However, existing list-wise approaches also fall short in ranking tasks due to misalignment between ranking objectives and next-token prediction. Moreover, these LLM-based methods struggle to effectively address the order relation among candidates, particularly given the scale of ratings. To address these challenges, this paper introduces the large language model framework with Aligned Listwise Ranking Objectives (ALRO). ALRO is designed to bridge the gap between the capabilities of LLMs and the nuanced requirements of ranking tasks. Specifically, ALRO employs explicit feedback in a listwise manner by introducing soft lambda loss, a customized adaptation of lambda loss designed for optimizing order relations. This mechanism provides more accurate optimization goals, enhancing the ranking process. Additionally, ALRO incorporates a permutation-sensitive learning mechanism that addresses position bias, a prevalent issue in generative models, without imposing additional computational burdens during inference. Our evaluative studies reveal that ALRO outperforms both existing embedding-based recommendation methods and LLM-based recommendation baselines. © 2024 Association for Computational Linguistics.

Keywords: Computational linguistics

Research on Improved MobileViT Image Tamper Localization Model

Computers, Materials & Continua, 2024, Vol. 80, No. 8, pp. 3173-3192

Authors: Jingtao Sun; Fengling Zhang; Huanqi Liu; Wenyan Hou. Affiliations: School of Computer Science and Technology, Xi’an University of Posts and Telecommunications, Xi’an, 710121, China; Shaanxi Key Laboratory of Network Data Analysis and Intelligent Processing, Xi’an University of Posts and Telecommunications, Xi’an, 710121, China

As image manipulation technology advances rapidly, the malicious use of image tampering has alarmingly escalated, posing a significant threat to social *** the realm of image tampering localization, accurately localizing limited samples, multiple types, and various sizes of regions remains a multitude of *** issues impede the model’s universality and generalization capability and detrimentally affect its *** tackle these issues, we propose FL-MobileViT, an improved MobileViT model devised for image tampering *** proposed model utilizes a dual-stream architecture that independently processes the RGB and noise domain, and captures richer traces of tampering through dual-stream ***, the model incorporates the Focused Linear Attention mechanism within the lightweight network (MobileViT). This substitution significantly diminishes computational complexity and resolves homogeneity problems associated with traditional Transformer attention mechanisms, enhancing feature extraction diversity and improving the model’s localization *** comprehensively fuse the generated results from both feature extractors, we introduce the ASPP architecture for multi-scale feature *** facilitates a more precise localization of tampered regions of various ***, to bolster the model’s generalization ability, we adopt a contrastive learning method and devise a joint optimization training strategy that leverages fused features and captures the disparities in feature distribution in tampered *** strategy enables the learning of contrastive loss at various stages of the feature extractor and employs it as an additional constraint condition in conjunction with cross-entropy *** a result, overfitting issues are effectively alleviated, and the differentiation between tampered and untampered regions is *** evaluations on five benchmark datasets (IMD-20, CASIA, NIST-16, Columbia and Coverage) validat

Keywords: image tampering localization; focused linear attention mechanism; MobileViT; contrastive loss

Systematic Literature Review of Transformer Model Implementations in Detecting Depression

6th International Conference on Computer and Informatics Engineering, IC2IE 2023

Authors: Nanggala, Kenjovan; Pardamean, Bens; Elwirehardja, Gregorius Natanael. Affiliations: Bina Nusantara University, Binus Graduate Program - Master of Computer Science, Computer Science Department, Jakarta, Indonesia; Bioinformatics and Data Science Research Center, Bina Nusantara University, Jakarta, Indonesia

ISBN: (print) 9798350345162

This systematic literature review explores the application of transformer models in early detection of human depression, encompassing text, audio, and video data modalities. Transformer architectures, notably BERT for text, have proven adept at capturing crucial contextual and linguistic patterns associated with depression. For audio and video data, hybrid approaches that combine transformer models with other architectures are prevalent. Key features considered include eye gaze, head pose, facial muscle movements, and audio characteristics such as MFCC and Log-mel Spectrogram, along with text embeddings. Performance comparisons underscore the superiority of text-based data in consistently delivering the most promising results, followed by audio and video modalities when utilizing transformer models. The fusion of multiple modalities emerges as an effective strategy for enhancing predictive accuracy, with the amalgamation of audio, video, and text data yielding the most precise outcomes. However, it is noteworthy that unimodal approaches also exhibit potential, with text data exhibiting superior performance over audio and video data. Nevertheless, several challenges persist in this research domain, including imbalanced datasets, the limited availability of comprehensive and diverse samples, and the inherent complexities in interpreting visual cues. Addressing these challenges remains imperative for the continued advancement of depression detection using transformer-based models across various modalities. © 2023 IEEE.

Keywords: deep learning; major depressive disorder; multimodal; transformer; unimodal

What's the Real: A Novel Design Philosophy for Robust AI-Synthesized Voice Detection

32nd ACM International Conference on Multimedia, MM 2024

Authors: Hai, Xuan; Liu, Xin; Tan, Yuan; Liu, Gang; Li, Song; Niu, Weina; Zhou, Rui; Zhou, Xiaokang. Affiliations: School of Information Science and Engineering, Lanzhou University, Lanzhou, China; The State Key Laboratory of Blockchain and Data Security, Zhejiang University, Hangzhou, China; School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu, China; Faculty of Business Data Science, Kansai University, Osaka, Japan

ISBN: (print) 9798400706868

Voice is one of the most widely used media for information transmission in human society. While high-quality synthetic voices are extensively utilized in various applications, they pose significant risks to content security and trust building. Numerous studies have concentrated on AI-synthesized voice detection to mitigate these risks, with many claiming to achieve promising performance. However, recent research has demonstrated that fake voice detectors suffer from serious overfitting to speaker-irrelative features (SiFs) and cannot be used in real-world scenarios. In this paper, we analyze the limitations of existing fake voice detectors and propose a new design philosophy, guiding the detection model to prioritize learning human voice features rather than the difference between the human voice and the synthetic voice. Based on this philosophy, we propose a novel AI-synthesized voice detection framework named SiFSafer, which uses pre-trained speech representation models to enhance the learning of feature distribution in human voices and the adapter fine-tuning to optimize the performance. The evaluation shows that the average EERs of existing fake voice detectors in the ASVspoof datasets can exceed 20% if the SiFs like silence segments are removed, while SiFSafer achieves an EER of less than 8%, indicating that SiFSafer is robust to SiFs and strongly resistant to existing attacks. © 2024 ACM.

Keywords: Fake detection

Online multi-label streaming feature selection based on neighborhood rough set with label correlation

3rd International Conference on Artificial Intelligence and Education, ICAIE 2024

Authors: Pan, Siping; Lin, Yaojin; Mao, Yu; Lin, Shaojie. Affiliations: School of Computer Science, Minnan Normal University, Fujian, Zhangzhou, China; Key Laboratory of Data Science and Intelligence Application, Minnan Normal University, Fujian, Zhangzhou, China

ISBN: (print) 9798400712692

Online multi-label streaming feature selection has gained significant interest in high-volume data applications. Neighborhood Rough Set (NRS) has emerged as a practical tool for handling multi-label feature selection. However, the majority of existing works have been concentrated on situations where labels are treated as independent and unrelated entities, disregarding the genuine context of interdependence and correlation among labels. To address this issue, this paper introduces a novel approach for online multi-label streaming feature selection, incorporating NRS and Label Correlation (LC). In our approach, we propose the concept of strongly related label subsets based on NRS. To account for label correlation, we compute the similarity between different labels and assign different weights to each label. This integrated method enhances the effectiveness of feature selection by leveraging the interdependencies among labels. The reliability of the proposed algorithm is validated experimentally. © 2024 Copyright held by the owner/author(s).

Keywords: Feature extraction

Calligraphy Alphabet Perception Using Artificial Intelligence

9th International Conference on Science, Technology, Engineering and Mathematics, ICONSTEM 2024

Authors: Venkatesh, S.; Jeevitha, D.; Gnanaselvi, J. Anitha. Affiliations: SRM Institute of Science and Technology, Department of Data Science and Business Systems, School of Computing, Chennai, Kattankulathur, India; Jeppiaar Engineering College, Department of Computer Science and Engineering, Chennai, India

ISBN: (print) 9798350365092

The human brain has an easy time analyzing and processing images. The brain is able to rapidly deconstruct and distinguish an image's various components when the eye perceives it. With the Convolutional Neural Network (CNN) as its foundation, this research suggests deep learning conceptual models. When the algorithms are compared, it becomes clear that CNN-based classification of handwritten alphabets performs better than other algorithms in terms of accuracy. The Manual Net, AlexNet, and LeNet architectures are among the CNN algorithms employed in this research. The convolutional layer, max pooling, flattening, feature assortment, rectified linear unit, and fully connected softmax layers are each components of the aforementioned designs. The proposed network is tested using an image dataset comprising 530 training photos and 2756 testing images. The top precision and cost-efficient model will be used in the Django context to build a handler line for supplying the appeal to be recognized and obtaining the productivity outcome of recognized appeal. © 2024 IEEE.

Keywords: Convolutional neural networks

Large Language Model-Based Event Relation Extraction with Rationales

31st International Conference on Computational Linguistics, COLING 2025

Authors: Hu, Zhilei; Li, Zixuan; Jin, Xiaolong; Bai, Long; Guo, Jiafeng; Cheng, Xueqi. Affiliations: CAS Key Laboratory of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences; School of Computer Science and Technology, University of Chinese Academy of Sciences

ISBN: (print) 9798891761964

Event Relation Extraction (ERE) aims to extract various types of relations between different events within texts. Although Large Language Models (LLMs) have demonstrated impressive capabilities in many natural language processing tasks, existing ERE methods based on LLMs still face three key challenges: (1) Time Inefficiency: The existing pairwise method of combining events and determining their relations is time-consuming for LLMs. (2) Low Coverage: When dealing with numerous events in a document, the limited generation length of fine-tuned LLMs restricts the coverage of their extraction results. (3) Lack of Rationale: Essential rationales concerning the results that could enhance the reasoning ability of the model are overlooked. To address these challenges, we propose LLMERE, an LLM-based approach with rationales for the ERE task. LLMERE transforms ERE into a question-and-answer task that may have multiple answers. By extracting all events related to a specified event at once, LLMERE reduces time complexity from O(n^2) to O(n), compared to the pairwise method. Subsequently, LLMERE enhances the coverage of extraction results by employing a partitioning strategy that highlights only a portion of the events in the document at a time. In addition to the extracted results, LLMERE is also required to generate corresponding rationales/reasons behind them, in terms of event coreference information or transitive chains of event relations. Experimental results on three widely used datasets show that LLMERE achieves significant improvements over baseline methods. © 2025 Association for Computational Linguistics.

Keywords: Computational linguistics

Multilayer Network Analysis of Brain Signals for Detecting Alzheimer’s Disease

12th International Conference on Computational Advances in Bio and Medical Sciences, ICCABS 2023

Authors: Nguyen, Sean M.; Basiri, Mohammad Amin; Khanmohammadi, Sina. Affiliations: School of Computer Science, University of Oklahoma, Norman, OK 73019, United States; Data Science and Analytics Institute, University of Oklahoma, Norman, OK 73019, United States

ISBN: (print) 9783031827679

Human neuroimaging datasets provide rich multi-scale spatiotemporal information about the state of the brain. Most current methods, such as spectral analysis, focus on a single facet of these datasets and do not take full advantage of the inherent spatiotemporal information. Here, we consider a multilayer cross-frequency functional connectivity analysis to capture the complex spatiotemporal features of neural datasets at multiple scales and show that such features could potentially provide a better description of the neural activity. We demonstrate the effectiveness of this approach by applying the proposed method to capture disruptions of cross-frequency brain connections in Alzheimer’s patients. More specifically, we compared the multi-scale features extracted from electroencephalogram (EEG) data with traditional features in a machine learning framework to distinguish Alzheimer’s patients from control subjects. Our results show that such multi-scale features improve the prediction accuracy when compared to traditional feature extraction methods in EEG analysis. © The Author(s), under exclusive license to Springer Nature Switzerland AG 2025.

Keywords: Multilayer neural networks

Inductive Link Prediction in N-ary Knowledge Graphs

31st International Conference on Computational Linguistics, COLING 2025

Authors: Wei, Jiyao; Guan, Saiping; Jin, Xiaolong; Guo, Jiafeng; Cheng, Xueqi. Affiliations: School of Computer Science and Technology, University of Chinese Academy of Sciences; Key Laboratory of Network Data Science and Technology, Institute of Computing Technology, Chinese Academy of Sciences, China

ISBN: (print) 9798891761964

N-ary Knowledge Graphs (NKGs), where a fact can involve more than two entities, have gained increasing attention. Link Prediction in NKGs (LPN) aims to predict missing elements in facts to facilitate the completion of NKGs. Current LPN methods implicitly operate under a closed-world assumption, meaning that the sets of entities and roles are fixed. These methods focus on predicting missing elements within facts composed of entities and roles seen during training. However, in reality, new facts involving unseen entities and roles frequently emerge, and these facts also need to be completed. Thus, this paper proposes a new task, Inductive Link Prediction in NKGs (ILPN), which aims to predict missing elements in facts involving unseen entities and roles in emerging NKGs. To address this task, we propose a Meta-learning-based N-ary knowledge Inductive Reasoner (MetaNIR), which employs a graph neural network with meta-learning mechanisms to embed unseen entities and roles adaptively. The obtained embeddings are used to predict missing elements in facts involving unseen elements. Since no existing dataset supports this task, three datasets are constructed to evaluate the effectiveness of MetaNIR. Extensive experimental results demonstrate that MetaNIR consistently outperforms representative models across all datasets. © 2025 Association for Computational Linguistics.

Keywords: Knowledge graph

Hierarchical vectorization for facial images

Computational Visual Media, 2024, Vol. 10, No. 1, pp. 97-118

Authors: Qian Fu; Linlin Liu; Fei Hou; Ying He. Affiliations: School of Computer Science and Engineering, Nanyang Technological University, 639798, Singapore, Singapore; Data61, Commonwealth Scientific and Industrial Research Organisation, Sydney, 2015, Australia; Interdisciplinary Graduate School, Nanyang Technological University and Alibaba Group, 639798, Singapore, Singapore; State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, 100190, China; University of Chinese Academy of Sciences, Beijing, 100049, China

The explosive growth of social media means portrait editing and retouching are in high *** portraits are commonly captured and stored as raster images, editing raster images is non-trivial and requires the user to be highly *** at developing intuitive and easy-to-use portrait editing tools, we propose a novel vectorization method that can automatically convert raster images into a 3-tier hierarchical *** base layer consists of a set of sparse diffusion curves (DCs) which characterize salient geometric features and low-frequency colors, providing a means for semantic color transfer and facial expression *** middle level encodes specular highlights and shadows as large, editable Poisson regions (PRs) and allows the user to directly adjust illumination by tuning the strength and changing the shapes of *** top level contains two types of pixel-sized PRs for high-frequency residuals and fine details such as pimples and *** train a deep generative model that can produce high-frequency residuals *** to the inherent meaning in vector primitives, editing portraits becomes easy and *** particular, our method supports color transfer, facial expression editing, highlight and shadow editing, and automatic *** quantitatively evaluate the results, we extend the commonly used FLIP metric (which measures color and feature differences between two images) to consider *** new metric, illumination-sensitive FLIP, can effectively capture salient changes in color transfer results, and is more consistent with human perception than FLIP and other quality measures for portrait *** evaluate our method on the FFHQR dataset and show it to be effective for common portrait editing tasks, such as retouching, light editing, color transfer, and expression editing.

Keywords: face editing; vectorization; Poisson editing; color transfer; illumination editing; expression editing
