Search Results - Inner Mongolia University Library

23rd China National Conference on Computational Linguistics, CCL 2024

Authors: Zhang, Chunkang; Cao, Boxi; Lu, Yaojie; Lin, Hongyu; Cao, Liu; Zeng, Ke; Wan, Guanglu; Cai, Xunliang; Han, Xianpei; Sun, Le
Affiliations: Chinese Information Processing Laboratory, Beijing, China; State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences, Beijing, China; University of Chinese Academy of Sciences, Beijing, China; Meituan, Beijing, China

ISBN: (print) 9789819783663

Instruction Fine-Tuning (IFT) has emerged as an essential step in training large language models to robustly carry out tasks of interest. However, the underlying mechanisms of instruction fine-tuning have not been systematically investigated, particularly the forgetting phenomenon after IFT, known as the alignment tax. Therefore, to understand the mechanism of IFT from the forgetting perspective, we investigate the alteration of text patterns and knowledge within models throughout the entire IFT process. Specifically, we restore fine-tuned models to their base version by training them on data that shares a similar distribution with the pre-training corpus, and compare their results. Our experiments indicate that there is a stage transition of forgetting during the IFT process: (1) Pseudo Forgetting: in this stage, models mainly shift their familiar text pattern away from the pre-training data format while world knowledge is preserved. Consequently, models recover their original performance when they are restored to the base version. (2) Actual Forgetting: in this stage, models forget the acquired knowledge as well. Therefore, they fail to reach the original performance even if they are restored to the base version. © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2025.

Keywords: Restoration

Benchmarking Hallucination in Large Language Models based on Unanswerable Math Word Problem

Joint 30th International Conference on Computational Linguistics and 14th International Conference on Language Resources and Evaluation, LREC-COLING 2024

Authors: Sun, Yuhong; Yin, Zhangyue; Guo, Qipeng; Wu, Jiawen; Qiu, Xipeng; Zhao, Hui
Affiliations: Software Engineering Institute, East China Normal University, China; School of Computer Science, Fudan University, China; Shanghai AI Laboratory, China; Shanghai Key Laboratory of Trustworthy Computing, Shanghai, China

ISBN: (print) 9782493814104

Large language models (LLMs) are highly effective in various natural language processing (NLP) tasks. However, they are susceptible to producing unreliable conjectures in ambiguous contexts, a phenomenon called hallucination. This paper presents a new method for evaluating LLM hallucination in Question Answering (QA) based on the unanswerable math word problem (MWP). To support this approach, we develop a dataset called Unanswerable Math Word Problem (UMWP), which comprises 5200 questions across five categories. We developed an evaluation methodology combining text similarity and mathematical expression detection to determine whether an LLM considers the question unanswerable. The results of extensive experiments conducted on 31 LLMs, including GPT-3, InstructGPT, LLaMA, and Claude, demonstrate that in-context learning and reinforcement learning with human feedback (RLHF) training significantly enhance the model's ability to avoid hallucination. We show that utilizing MWP is a reliable and effective approach to assess hallucination. Our code and data are available at https://***/Yuki-Asuuna/UMWP. © 2024 ELRA Language Resource Association: CC BY-NC 4.0.

Keywords: Reinforcement learning
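The abstract above mentions an evaluation that combines text similarity with mathematical expression detection. The following is only a minimal sketch of that general idea, not the authors' implementation: the template phrases, the regular expression, the 0.6 threshold, and the function names are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher

# Hypothetical template phrases signalling "unanswerable"; not the paper's list.
UNANSWERABLE_TEMPLATES = [
    "the question cannot be answered",
    "there is not enough information to answer",
    "the problem is unanswerable",
]

# A crude pattern for an arithmetic expression such as "3 + 4 = 7".
MATH_EXPR = re.compile(r"\d+(?:\.\d+)?\s*[-+*/=]\s*\d+(?:\.\d+)?")


def max_similarity(sentence: str) -> float:
    """Highest character-level similarity between a sentence and any template."""
    s = sentence.lower().strip()
    return max(SequenceMatcher(None, s, t).ratio() for t in UNANSWERABLE_TEMPLATES)


def judged_unanswerable(response: str, threshold: float = 0.6) -> bool:
    """True if a sentence resembles a template and no expression is computed."""
    # A response that works out an expression is treated as an attempted answer.
    if MATH_EXPR.search(response):
        return False
    sentences = [p for p in re.split(r"[.!?]\s*", response) if p.strip()]
    return any(max_similarity(p) >= threshold for p in sentences)
```

A response such as "The question cannot be answered. The price is missing." is flagged as a refusal, while "She has 3 + 4 = 7 apples." is not, because it contains a computed expression.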

Machine Learning Methods and Applications: Computational Design of Optoelectronic Semiconductor Materials

Science China Materials, 2024, Vol. 67, No. 4, pp. 1042-1081

Authors: Xiaoyu Yang (杨晓雨); Kun Zhou (周琨); Xin He (贺欣); Lijun Zhang (张立军)
Affiliations: State Key Laboratory of Integrated Optoelectronics, Key Laboratory of Automobile Materials of MOE, Key Laboratory of Material Simulation Methods & Software of MOE, and School of Materials Science and Engineering, Jilin University, Changchun 130012, China

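High-throughput computation and materials databases have driven the development of data-driven machine learning methods. Machine learning has become an important approach in computational materials research, showing great potential in analyzing materials data, accelerating materials computation, predicting material properties, and advancing the discovery, screening, and design of new materials. Numerous machine learning methods, models, and frameworks that intersect with computational materials science continue to emerge. This article reviews recent progress and applications of machine learning methods in the computational design of optoelectronic semiconductor materials. It introduces the workflow and types of machine learning; shallow models, ensemble models, and deep neural networks based on different material representations; and related materials databases and tools. We also discuss the application of these models to predicting material stability and optoelectronic properties, inverse materials design, and establishing structure-property relationships. Finally, the article summarizes and discusses the opportunities and challenges that machine learning methods currently face, namely data quantity and quality, material representation, and inverse materials design.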

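Keywords: materials databases; machine learning; deep neural networks; inverse design; data-driven; computational materials; ensemble models; optoelectronic properties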

ONNXPruner: ONNX-Based General Model Pruning Adapter

IEEE Transactions on Pattern Analysis and Machine Intelligence, 2025, Vol. 47, No. 7, pp. 5806-5817

Authors: Ren, Dongdong; Li, Wenbin; Ding, Tianyu; Wang, Lei; Fan, Qi; Huo, Jing; Pan, Hongbing; Gao, Yang
Affiliations: Nanjing University, School of Electronic Science and Engineering, China; Nanjing University, State Key Laboratory for Novel Software Technology, School of Intelligence Science and Technology, China; Microsoft Corporation, Applied Sciences Group, Redmond, WA 98052, United States; University of Wollongong, School of Computing and Information Technology, NSW 2522, Australia

Recent advancements in model pruning have focused on developing new algorithms and improving upon benchmarks. However, the practical application of these algorithms across various models and platforms remains a significant challenge. To address this challenge, we propose ONNXPruner, a versatile pruning adapter designed for the ONNX format models. ONNXPruner streamlines the adaptation process across diverse deep learning frameworks and hardware platforms. A novel aspect of ONNXPruner is its use of node association trees, which automatically adapt to various model architectures. These trees clarify the structural relationships between nodes, guiding the pruning process, particularly highlighting the impact on interconnected nodes. Furthermore, we introduce a tree-level evaluation method. By leveraging node association trees, this method allows for a comprehensive analysis beyond traditional single-node evaluations, enhancing pruning performance without the need for extra operations. Experiments across multiple models and datasets confirm ONNXPruner's strong adaptability and increased efficacy. Our work aims to advance the practical application of model pruning. © 1979-2012 IEEE.

Keywords: Adaptation Models Training Filters Deep Learning Computational Modeling Software Interoperability Load Modeling Software Algorithms Electronic Mail General Model Pruning ONNX Tree Level Evaluation Deep Neural Network Model Pruning Deep Learning Model Architecture Deep Learning Framework Hardware Platform Superior Performance Deep Neural Network Feature Maps PyTorch ImageNet Training Images Sparse Matrix Root Node Type Of Operation Single Node Deep Neural Network Model Validation Images Extra Components Importance Of Filtering Pruning Method SqueezeNet
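The abstract describes node association trees that expose which interconnected nodes are affected when a node is pruned. The abstract does not give the construction, so the sketch below is only a toy illustration of the general dependency idea on a hand-written graph. The graph layout, the `ABSORBING_OPS` set, and the function names are hypothetical and are not ONNXPruner's API.

```python
from typing import Dict, List, Tuple

# Toy graph: node name -> (operator type, list of consumer node names).
# Hypothetical structure for illustration; it is not read from an ONNX file.
GRAPH: Dict[str, Tuple[str, List[str]]] = {
    "conv1": ("Conv", ["bn1"]),
    "bn1": ("BatchNormalization", ["relu1"]),
    "relu1": ("Relu", ["conv2"]),
    "conv2": ("Conv", ["pool"]),
    "pool": ("GlobalAveragePool", ["fc"]),
    "fc": ("Gemm", []),
}

# Assumption: these operators absorb an upstream channel change in their
# input channels, so the effect of pruning does not travel past them.
ABSORBING_OPS = {"Conv", "Gemm"}


def association_tree(graph: Dict[str, Tuple[str, List[str]]], root: str) -> dict:
    """Nested dict of nodes affected when output channels of `root` are pruned."""
    def expand(node: str) -> dict:
        children = {}
        for consumer in graph[node][1]:
            op_type = graph[consumer][0]
            # Absorbing ops become leaves; other ops pass the change along.
            children[consumer] = {} if op_type in ABSORBING_OPS else expand(consumer)
        return children
    return {root: expand(root)}


def flatten(tree: dict) -> List[str]:
    """List every node in the tree in depth-first order."""
    names: List[str] = []
    for name, subtree in tree.items():
        names.append(name)
        names.extend(flatten(subtree))
    return names
```

Calling `flatten(association_tree(GRAPH, "conv1"))` lists the nodes that would need consistent channel changes if filters of `conv1` were removed in this toy graph.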

CLIP-Flow: Decoding images encoded in CLIP space

Computational Visual Media, 2024, Vol. 10, No. 6, pp. 1157-1168

Authors: Hao Ma; Ming Li; Jingyuan Yang; Or Patashnik; Dani Lischinski; Daniel Cohen-Or; Hui Huang
Affiliations: Visual Computing Research Center, College of Computer Science and Software Engineering, Shenzhen University, Shenzhen 518060, China; Department of Computer Science, Tel Aviv University, Tel Aviv 6997801, Israel; School of Computer Science and Engineering, the Hebrew University of Jerusalem, Jerusalem 91904, Israel

This study introduces CLIP-Flow, a novel network for generating images from a given image or *** effectively utilize the rich semantics contained in both modalities, we designed a semantics-guided methodology for image- and text-to-image *** particular, we adopted Contrastive Language-Image Pretraining (CLIP) as an encoder to extract semantics and StyleGAN as a decoder to generate images from such ***, to bridge the embedding space of CLIP and latent space of StyleGAN, real NVP is employed and modified with activation normalization and invertible *** the images and text in CLIP share the same representation space, text prompts can be fed directly into CLIP-Flow to achieve text-to-image *** conducted extensive experiments on several datasets to validate the effectiveness of the proposed image-to-image synthesis *** addition, we tested on the public dataset Multi-Modal CelebA-HQ, for text-to-image *** validated that our approach can generate high-quality text-matching images, and is comparable with state-of-the-art methods, both qualitatively and quantitatively.

Keywords: image-to-image; text-to-image; contrastive language-image pretraining (CLIP); flow; StyleGAN
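The abstract states that real NVP bridges the CLIP embedding space and the StyleGAN latent space. As a rough illustration of why such a flow is invertible, the sketch below implements one generic affine coupling layer in plain Python. It is not the authors' modified architecture (no activation normalization is included), and `scale_and_shift` is a hypothetical toy stand-in for the learned networks.

```python
import math
from typing import List, Tuple

Vector = List[float]


def scale_and_shift(x_a: Vector) -> Tuple[Vector, Vector]:
    """Toy stand-in for the learned scale s(.) and shift t(.) networks."""
    total = sum(x_a)
    s = [math.tanh(0.5 * total + i) for i in range(len(x_a))]
    t = [0.1 * total - i for i in range(len(x_a))]
    return s, t


def coupling_forward(x: Vector) -> Tuple[Vector, float]:
    """Map x to y and return y with the log-determinant of the Jacobian."""
    assert len(x) % 2 == 0, "this sketch expects an even-length vector"
    half = len(x) // 2
    x_a, x_b = x[:half], x[half:]
    s, t = scale_and_shift(x_a)
    # The first half passes through unchanged; the second half is rescaled
    # and shifted using values computed only from the first half.
    y_b = [xb * math.exp(si) + ti for xb, si, ti in zip(x_b, s, t)]
    return x_a + y_b, sum(s)


def coupling_inverse(y: Vector) -> Vector:
    """Recover x from y exactly, reusing the unchanged first half."""
    half = len(y) // 2
    y_a, y_b = y[:half], y[half:]
    s, t = scale_and_shift(y_a)
    x_b = [(yb - ti) * math.exp(-si) for yb, si, ti in zip(y_b, s, t)]
    return y_a + x_b
```

Because the first half is left untouched, the inverse can recompute the same scale and shift, so `coupling_inverse(coupling_forward(x)[0])` returns `x` up to floating-point error.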

Beating Posits at Their Own Game: Takum Arithmetic

5th International Conference for Next Generation Arithmetic

Authors: Hunhold, Laslo
Affiliations: Univ Cologne, Parallel & Distributed Syst Grp, Cologne, Germany

ISBN: (print) 9783031727085; 9783031727092

Recent evaluations have highlighted the tapered posit number format as a promising alternative to the uniform precision IEEE 754 floating-point numbers, which suffer from various deficiencies. Although the posit encoding scheme offers superior coding efficiency at values close to unity, its efficiency markedly diminishes with deviation from unity. This reduction in efficiency leads to suboptimal encodings and a consequent diminution in dynamic range, thereby rendering posits suboptimal for general-purpose computer arithmetic. This paper introduces and formally proves 'takum' as a novel general-purpose logarithmic tapered-precision number format, synthesising the advantages of posits in low-bit applications with high encoding efficiency for numbers distant from unity. Takums exhibit an asymptotically constant dynamic range in terms of bit string length, which is delineated in the paper to be suitable for a general-purpose number format. It is demonstrated that takums either match or surpass existing alternatives. Moreover, takums address several issues previously identified in posits while unveiling novel and beneficial arithmetic properties.

Keywords: takum arithmetic; tapered number format; logarithmic number system; dynamic range; posit arithmetic

IIN-FFD: Intra-Inter Network for Face Forgery Detection

Tsinghua Science and Technology, 2024, Vol. 29, No. 6, pp. 1839-1850

Authors: Qihua Zhou; Zhili Zhou; Zhipeng Bao; Weina Niu; Yuling Liu
Affiliations: School of Software, Nanjing University of Information Science and Technology, Nanjing 210044, China; Institute of Artificial Intelligence, Guangzhou University, Guangzhou 510006, China; School of Computer Science, University of Electronic Science and Technology of China, Chengdu 611731, Sichuan, China; School of Computer Science, Hunan University, Changsha 410082, Hunan, China

Since different kinds of face forgeries leave similar forgery traces in videos, learning the common features from different kinds of forged faces would achieve promising generalization ability of forgery ***, to accurately detect known forgeries while ensuring high generalization ability of detecting unknown forgeries, we propose an intra-inter network (IIN) for face forgery detection (FFD) in videos with continual *** proposed IIN mainly consists of three modules, i.e., intra-module, inter-module, and forged trace masking module (FTMM). Specifically, the intra-module is trained for each kind of face forgery by supervised learning to extract special features, while the inter-module is trained by self-supervised learning to extract the common *** a result, the common and special features of the different forgeries are decoupled by the two feature learning modules, and then the decoupled common features can be utilized to achieve high generalization ability for ***, the FTMM is deployed for contrastive learning to further improve detection *** experimental results on the FaceForensics++ dataset demonstrate that the proposed IIN outperforms the state-of-the-arts in ***, the generalization ability of the IIN verified on DFDC and Celeb-DF datasets demonstrates that the proposed IIN significantly improves the generalization ability for FFD.

Keywords: deep learning; information security; image classification; neural networks; face forgery; face forgery detection

SWPFOPLD: A Profiling and Optimizing Loader for SW26010Pro Processors

8th International Conference on Computer and Communication Systems, ICCCS 2023

Authors: Qian, Hong; Wu, Wei; Zhu, Qi; Wang, Fei; Zhao, Jinwei; Zheng, Yan
Affiliations: School of Computer Science and Technology, University of Science and Technology of China, Hefei, China; Jiangnan Institute of Computing Technology, System Software Research Department, Wuxi, China; Tsinghua University, Department of Computer Science and Technology, Beijing, China; National Research Center of Parallel Computer Engineering and Technology, Beijing, China

ISBN: (print) 9781665456128

The Sunway family of supercomputers has achieved a series of remarkable results. However, the toolchains provided with them are not perfect, which poses great challenges to the development of high-performance application software. In this paper, a profiling and optimizing tool is proposed to help developers analyze and optimize the performance of their programs. SWPFOPLD is independent of the application program and gathers runtime performance data through the PMUs; based on this data, it then automatically rearranges hot functions as an optimization. The evaluation shows that SWPFOPLD can be easily and effectively used to analyze and optimize the performance of application programs. © 2023 IEEE.

Keywords: Supercomputers

Event-Driven Attention Network: A Cross-Modal Framework for Efficient Image-Text Retrieval in Mass Gathering Events

Computers, Materials & Continua, 2025, Vol. 83, No. 5, pp. 3277-3301

Authors: Kamil Yasen; Heyan Jin; Sijie Yang; Li Zhan; Xuyang Zhang; Ke Qin; Ye Li
Affiliations: School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 611731, China; Kashi Institute of Electronics and Information Industry, Kashi 844508, China

Research on mass gathering events is critical for ensuring public security and maintaining social ***, most of the existing works focus on crowd behavior analysis areas such as anomaly detection and crowd counting, and there is a relative lack of research on mass gathering *** believe real-time detection and monitoring of mass gathering behaviors are essential for mitigating potential security risks and ***, it is imperative to develop a method capable of accurately identifying and localizing mass gatherings before disasters occur, enabling prompt and effective *** address this problem, we propose an innovative Event-Driven Attention Network (EDAN), which achieves image-text matching in the scenario of mass gathering events with good results for the first *** image-text retrieval methods based on global alignment struggle to capture the local details within complex scenes, limiting retrieval *** local alignment-based methods are more effective at extracting detailed features, they frequently process raw textual features directly, which often contain ambiguities and redundant information that can diminish retrieval efficiency and degrade model *** overcome these challenges, EDAN introduces an Event-Driven Attention Module that adaptively focuses attention on image regions or textual words relevant to the event *** calculating the semantic distance between event labels and textual content, this module significantly reduces computational complexity and enhances retrieval *** validate the effectiveness of EDAN, we construct a dedicated multimodal dataset tailored for the analysis of mass gathering events, providing a reliable foundation for subsequent *** conduct comparative experiments with other methods on our dataset, and the experimental results demonstrate the effectiveness of *** the image-to-text retrieval task, EDAN achieved the best performance on the R@5 metric, w...

Keywords: Mass gathering events; image-text retrieval; attention mechanism

Dissect Black Box: Interpreting for Rule-Based Explanations in Unsupervised Anomaly Detection

38th Conference on Neural Information Processing Systems, NeurIPS 2024

Authors: Zhang, Yu; Li, Ruoyu; Wu, Nengwu; Li, Qing; Lin, Xinhan; Hu, Yang; Li, Tao; Jiang, Yong
Affiliations: Shanghai Artificial Intelligence Laboratory, China; Peng Cheng Laboratory, China; College of Computer Science and Software Engineering, Shenzhen University, China; Tsinghua University, China; Tsinghua Shenzhen International Graduate School, China; Hunan University of Science and Technology, China

In high-stakes sectors such as network security and IoT security, accurately distinguishing between normal and anomalous data is critical due to the significant implications for operational success and safety in decision-making. The complexity is exacerbated by the presence of unlabeled data and the opaque nature of black-box anomaly detection models, which obscure the rationale behind their predictions. In this paper, we present a novel method to interpret the decision-making processes of these models, which are essential for detecting malicious activities without labeled attack data. We put forward the Segmentation Clustering Decision Tree (SCD-Tree), designed to dissect and understand the structure of normal data distributions. The SCD-Tree integrates predictions from the anomaly detection model into its splitting criteria, enhancing the clustering process with the model's insights into anomalies. To further refine these segments, the Gaussian Boundary Delineation (GBD) algorithm is employed to define boundaries within each segmented distribution, effectively delineating normal from anomalous data points. This approach addresses the curse of dimensionality by segmenting high-dimensional data and ensures resilience to data variability and perturbations through flexible boundary fitting. We transform the intricate operations of anomaly detection into an interpretable rule format, constructing a comprehensive set of rules for understanding. Our method's evaluation on diverse datasets and models demonstrates superior explanation accuracy, fidelity, and robustness over existing methods, proving its efficacy in environments where interpretability is paramount. © 2024 Neural Information Processing Systems Foundation. All rights reserved.
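The abstract describes fitting boundaries inside each segment of normal data and turning them into readable rules. The abstract does not specify the GBD algorithm, so the sketch below only illustrates one simple possibility under stated assumptions: a per-feature Gaussian fit with a mean +/- k standard deviations interval. The feature names, the value of `k`, and all function names are hypothetical, and the SCD-Tree segmentation step is not shown.

```python
from statistics import mean, pstdev
from typing import Dict, List, Tuple

Interval = Tuple[float, float]
Point = Dict[str, float]


def gaussian_bounds(segment: List[Point], k: float = 3.0) -> Dict[str, Interval]:
    """Fit a Gaussian per feature on one segment and return mean +/- k*std."""
    bounds: Dict[str, Interval] = {}
    for name in segment[0].keys():
        values = [row[name] for row in segment]
        mu, sigma = mean(values), pstdev(values)
        bounds[name] = (mu - k * sigma, mu + k * sigma)
    return bounds


def is_normal(point: Point, bounds: Dict[str, Interval]) -> bool:
    """A point is normal if every feature lies inside its fitted interval."""
    return all(lo <= point[name] <= hi for name, (lo, hi) in bounds.items())


def as_rule(bounds: Dict[str, Interval]) -> str:
    """Render the intervals as a human-readable conjunction."""
    return " AND ".join(
        f"{lo:.2f} <= {name} <= {hi:.2f}" for name, (lo, hi) in bounds.items()
    )
```

Each segment then yields one rule string, and a point that violates the rule of its segment would be reported as anomalous in this simplified setting.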
