The prefrontal cortex is considered one of the key brain regions for the study of affective dysfunction in depression. The study of neural processes in this region during specific emotional tasks may enhance our un...
Ranked list truncation is of critical importance in a variety of professional information retrieval applications such as patent search or legal search. The goal is to dynamically determine the number of returned docum...
ISBN: (Print) 9781713845393
Ensemble-based debiasing methods have been shown effective in mitigating the reliance of classifiers on specific dataset bias, by exploiting the output of a bias-only model to adjust the learning target. In this paper, we focus on the bias-only model in these ensemble-based methods, which plays an important role but has not gained much attention in the existing literature. Theoretically, we prove that the debiasing performance can be damaged by inaccurate uncertainty estimations of the bias-only model. Empirically, we show that existing bias-only models fall short in producing accurate uncertainty estimations. Motivated by these findings, we propose to conduct calibration on the bias-only model, thus achieving a three-stage ensemble-based debiasing framework, including bias modeling, model calibrating, and debiasing. Experimental results on NLI and fact verification tasks show that our proposed three-stage debiasing framework consistently outperforms the traditional two-stage one in out-of-distribution accuracy.
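The three-stage framework described above can be sketched as follows. This is a minimal illustration, assuming a product-of-experts combination for the debiasing stage and temperature scaling as the calibration step; the function names and the toy logits are hypothetical, not taken from the paper.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def calibrate(bias_logits, temperature):
    """Stage 2: temperature-scale the bias-only model's logits so its
    confidence better matches its accuracy (one common calibration method)."""
    return [x / temperature for x in bias_logits]

def debiased_target(main_logits, bias_logits, temperature=2.0):
    """Stage 3: product-of-experts ensemble -- add the main-model and
    calibrated bias-only logits, so examples the bias model already
    'explains' contribute less to the learning signal."""
    calibrated = calibrate(bias_logits, temperature)
    combined = [m + b for m, b in zip(main_logits, calibrated)]
    return softmax(combined)

# Toy 3-class (NLI-style) example with an overconfident bias-only model.
main = [2.0, 0.5, -1.0]   # main model logits
bias = [4.0, -2.0, -2.0]  # raw (uncalibrated) bias-only logits
probs = debiased_target(main, bias)
```

Without stage 2, the uncalibrated bias logits would dominate the combination; temperature scaling softens them before the ensemble, which is the point the paper's theoretical analysis makes about inaccurate uncertainty estimates.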
Relevance plays a central role in information retrieval (IR) and has received extensive study since the 20th century. The definition and modeling of relevance have always been critical challenges in bo...
Question Answering (QA), a popular and promising technique for intelligent information access, faces a data dilemma, as do most other AI techniques. On one hand, modern QA methods rely on deep learning models which ...
Based on the message-passing paradigm, a substantial body of research has proposed diverse and impressive feature propagation mechanisms to improve the performance of GNNs. However, less focus has been put on featu...
ISBN: (Print) 9781665418164
With the growth in users and business volume, the business systems of Internet companies are becoming increasingly complex, resulting in a surge in the number of alarms. Large numbers of dirty alarms add a huge workload to security operations and indirectly pose many threats to business systems. At present, most systems access third-party Threat Intelligence to help operators handle alarms automatically. However, this approach suffers from lag and accuracy problems, making it difficult to meet the requirements of speed and precision. This article proposes a new method for gathering vulnerability Threat Intelligence that can obtain vulnerability information before security vendors issue advisories. By analyzing the vulnerability disclosure process, the method obtains vulnerability information from the original sources submitted by developers to open-source mailing lists. We used NLP techniques and an XGBoost model to automatically analyze the vulnerability information and finally generate FINTEL. Experimental results show that this method achieves an accuracy of 93% and can obtain vulnerability information 10 hours to 7 days before security vendors publish advisories. Its scope of application covers all open-source code repositories and some closed-source repositories.
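The triage step of such a pipeline can be sketched as below. Note the hedge: the paper trains an XGBoost model on NLP features of mailing-list messages; the hand-written keyword scorer here is only an illustrative stand-in for that learned model, and the cue list and threshold are invented for the example.

```python
import re

# Hypothetical keyword cues. In the paper an XGBoost model learns such
# signals from labelled mailing-list data; this hand-written scorer is
# only an illustrative stand-in, not the paper's method.
VULN_CUES = {"overflow": 2.0, "cve": 2.0, "exploit": 1.5,
             "vulnerability": 1.5, "patch": 1.0, "security": 1.0}

def tokenize(text):
    """Lowercase word tokenizer (bag-of-words features)."""
    return re.findall(r"[a-z0-9]+", text.lower())

def score_message(text):
    """Sum the cue weights present in the message."""
    return sum(VULN_CUES.get(tok, 0.0) for tok in set(tokenize(text)))

def triage(messages, threshold=2.5):
    """Flag messages likely to disclose a vulnerability ahead of a
    vendor advisory; flagged items would feed FINTEL generation."""
    return [m for m in messages if score_message(m) >= threshold]

mails = [
    "Heap overflow in parser, candidate CVE, patch attached",
    "Release notes for version 1.2 are ready",
]
flagged = triage(mails)
```

Replacing `score_message` with a trained classifier over richer NLP features is exactly where the paper's XGBoost model would slot in.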
Pre-training and fine-tuning have achieved remarkable success in many downstream natural language processing (NLP) tasks. Recently, pre-training methods tailored for information retrieval (IR) have also been explored,...
Limited-angle and sparse-view computed tomography (LACT and SVCT) are crucial for expanding the scope of X-ray CT applications. However, they face challenges due to incomplete data acquisition, resulting in diverse ar...
ISBN: (Digital) 9798350391367
ISBN: (Print) 9798350391374
With the development of the digital economy and the advent of the big data era, the rapid growth of textual information in cyberspace has placed higher demands on information processing technology. This paper proposes an information extraction technique based on the Text-to-Text Transfer Transformer (T5) and keyBERT models, aiming to efficiently distill key content from textual information in cyberspace. The method combines automatic summarization and keyword extraction to form information briefs. The experiments selected representative policy documents related to data elements as input data and employed information entropy and ROUGE scores as metrics to evaluate the automatic summarization model; the results were compared with the outputs of similar models. Experimental results indicate that the T5 model outperforms the other models in summarization effectiveness, and the proposed information extraction method shows significant advantages in readability and processing efficiency, demonstrating practical application value.
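The ROUGE evaluation mentioned above is straightforward to illustrate. The sketch below computes ROUGE-1 F1 (unigram overlap with clipped counts); the example sentences are invented, and full ROUGE implementations also include stemming and higher-order variants (ROUGE-2, ROUGE-L) that are omitted here.

```python
from collections import Counter

def rouge_1_f1(summary, reference):
    """ROUGE-1 F1: harmonic mean of unigram precision and recall
    between a generated summary and a reference summary."""
    sum_toks = summary.lower().split()
    ref_toks = reference.lower().split()
    if not sum_toks or not ref_toks:
        return 0.0
    # Clipped overlap: multiset intersection of the two token bags.
    overlap = sum((Counter(sum_toks) & Counter(ref_toks)).values())
    precision = overlap / len(sum_toks)
    recall = overlap / len(ref_toks)
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Toy example (invented text, not from the paper's policy-document corpus):
score = rouge_1_f1("data elements drive the digital economy",
                   "the digital economy is driven by data elements")
# 5 overlapping unigrams -> precision 5/6, recall 5/8, F1 = 5/7
```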