ISBN: (print) 9798350361513; 9798350372304
In the age of digital technology, the exponential spread of fake news has become a significant issue for society. In response, considerable advances have been made in identifying fake news using machine learning (ML) models. This literature review investigates the current state of research on detecting fake news. It emphasizes the use of ML methods such as TF-IDF, Naive Bayes, and Random Forest, as well as deep learning (DL) models such as Convolutional Neural Networks (CNNs), Long Short-Term Memory (LSTM) networks, and Transformer-based models such as BERT. The review concisely summarizes the essential findings and discusses the potential future implications of fake news identification. It also emphasizes the need for additional research to address numerous challenges, such as effective handling of multimedia content, protection against adversarial attacks, model generalizability, real-time detection, and adherence to ethical standards when developing detection systems. This review is a resource for researchers and practitioners seeking to develop effective methods for addressing the ever-growing problem of fake news.
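To make one of the classical methods named above concrete, the following is a minimal sketch of a multinomial Naive Bayes text classifier with Laplace smoothing. It is not taken from the review: the class name `NaiveBayes`, the four toy headlines, and the labels `fake` and `real` are all invented for illustration, and plain word counts stand in for TF-IDF weights to keep the example dependency-free.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Toy multinomial Naive Bayes over whitespace-separated words (illustrative only)."""

    def fit(self, docs, labels):
        self.class_counts = Counter(labels)          # number of documents per class
        self.word_counts = defaultdict(Counter)      # word frequencies per class
        for doc, label in zip(docs, labels):
            self.word_counts[label].update(doc.lower().split())
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, doc):
        total_docs = sum(self.class_counts.values())
        scores = {}
        for label, n_docs in self.class_counts.items():
            total_words = sum(self.word_counts[label].values())
            score = math.log(n_docs / total_docs)    # log prior
            for word in doc.lower().split():
                if word in self.vocab:               # ignore words never seen in training
                    # Laplace (add-one) smoothed log likelihood
                    score += math.log(
                        (self.word_counts[label][word] + 1)
                        / (total_words + len(self.vocab))
                    )
            scores[label] = score
        return max(scores, key=scores.get)


# Invented toy data; a real system would train on a labeled news corpus.
DOCS = [
    "shocking miracle cure doctors hate",
    "you will not believe this secret trick",
    "city council approves annual budget",
    "central bank holds interest rates steady",
]
LABELS = ["fake", "fake", "real", "real"]

clf = NaiveBayes().fit(DOCS, LABELS)
print(clf.predict("secret miracle trick"))
print(clf.predict("council approves budget"))
```

With so little data the classifier only memorizes vocabulary; the sketch is meant to show the mechanics of the prior and the smoothed likelihood, not a usable detector.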
Large language models (LLMs) exhibit complementary strengths in various tasks, motivating the research of LLM ensembling. However, existing work focuses on training an extra reward model or fusion model to select or c...
The data source and data collection method are described, and initial results on the data are presented. Word segmentation and stop-word removal are then used in preprocessing to clean the da...
ISBN: (print) 9781713871088
Relying on Transformers for complex visual feature learning, object tracking has reached a new state-of-the-art (SOTA) standard. However, this advance is accompanied by larger training data and longer training periods, making tracking increasingly expensive. In this paper, we demonstrate that reliance on Transformers is not necessary and that pure ConvNets remain competitive, and are even better while being more economical and friendly, in achieving SOTA tracking. Our solution is to unleash the power of multimodal vision-language (VL) tracking, simply using ConvNets. The essence lies in learning novel unified-adaptive VL representations with our modality mixer (ModaMixer) and asymmetrical ConvNet search. We show that our unified-adaptive VL representation, learned purely with ConvNets, is a simple yet strong alternative to Transformer visual features: it improves a CNN-based Siamese tracker by 14.5% in SUC on the challenging LaSOT benchmark (from 50.7% to 65.2%), even outperforming several Transformer-based SOTA trackers. Besides empirical results, we theoretically analyze our approach to support its effectiveness. By revealing the potential of VL representation, we expect the community to devote more attention to VL tracking and hope to open more possibilities for future tracking beyond Transformers. Code and models are released at https://***/JudasDie/SOTS.
With the acceleration of the digitization process of ancient literature, the automatic extraction of entity information in ancient literature can enable researchers to study ancient history and literature deep...
Tokenizer, serving as a translator to map the intricate visual data into a compact latent space, lies at the core of visual generative models. Based on the finding that existing tokenizers are tailored to image or vid...
ISBN: (digital) 9798331508913
ISBN: (print) 9798331508920
Stance detection in texts is an important task in natural language processing with diverse applications in politics, economics, and marketing. Research in this area is categorized into traditional methods, deep learning methods, and language model-based methods. However, to the best of our knowledge, none of these methods has effectively addressed context-aware stance detection. In this paper, we propose a method called CASKOW, which first collects context for a given text from Wikipedia as an external resource using information retrieval techniques and then employs the context alongside a large language model (LLM) to detect the stance. Experimental results show that CASKOW achieves a 7% improvement in F1-measure when the label “none” is included and a 10% improvement when the label “none” is excluded, compared to models that do not utilize contextual information. Additionally, a qualitative interpretation of the results reveals key insights, allowing for the identification of errors and challenges in stance detection datasets.
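The retrieve-then-prompt flow described in this abstract can be sketched roughly as follows. This is not the CASKOW implementation: the word-overlap ranking stands in for whatever retrieval technique the authors use, the in-memory `PASSAGES` list stands in for Wikipedia, the function names are hypothetical, and the labels "favor" and "against" are assumed (the abstract only names "none"). The LLM call itself is omitted; the sketch stops at the prompt that would be sent.

```python
def tokenize(text):
    """Lowercase and split on whitespace; a deliberately crude tokenizer."""
    return set(text.lower().split())


def retrieve_context(text, passages, k=1):
    """Rank candidate passages by word overlap with the input text (assumed stand-in for IR)."""
    query = tokenize(text)
    ranked = sorted(passages, key=lambda p: len(query & tokenize(p)), reverse=True)
    return ranked[:k]


def build_prompt(text, target, context):
    """Assemble a stance-detection prompt that includes the retrieved context."""
    context_block = "\n".join(context)
    return (
        f"Context:\n{context_block}\n\n"
        f"Text: {text}\n"
        f"Target: {target}\n"
        "What is the stance of the text toward the target? "
        "Answer with one of: favor, against, none."
    )


# Hypothetical stand-in for passages retrieved from Wikipedia.
PASSAGES = [
    "A carbon tax is a tax levied on the carbon emissions of fuels.",
    "Association football is a team sport played with a ball.",
]

text = "carbon tax will cut emissions"
context = retrieve_context(text, PASSAGES, k=1)
prompt = build_prompt(text, "carbon tax", context)
print(prompt)
```

Separating retrieval from prompt construction makes it straightforward to compare a context-free baseline (empty `context`) against the context-aware variant, which is the comparison the abstract reports.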
The Transformer architecture has become ubiquitous in the natural language processing field. To interpret the Transformer-based models, their attention patterns have been extensively analyzed. However, the Transformer arc...
ISBN: (print) 9781665483117
The rapid development of language science and computing technology, especially the popularization of broadband Internet, has caused news in all languages to spread faster and faster. Among multi-modal news such as text, image, audio, and video, text news still accounts for the largest proportion of Internet news. With more than 7,000 human languages in existence, efficiently identifying the language of text news has become a fundamental natural language processing technology, since it allows suitable language processing methods to be selected for subsequent in-depth content processing and online public opinion analysis. Based on the idea of N-grams, we designed and implemented a set of language identification methods suitable for all-language Internet news, covering two aspects, language training and language identification, and applied them to actual text news preprocessing. The language identification results on all-language Internet news show that our method has good recognition accuracy and efficiency.
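The two phases named above, language training and language identification, can be illustrated with a small character N-gram sketch. The abstract does not specify the authors' algorithm, so this example assumes a rank-order profile with an "out-of-place" distance, a common N-gram approach; the function names, the `top_k` cutoff, and the three one-sentence training samples are all invented for illustration.

```python
from collections import Counter


def ngram_profile(text, n_max=3, top_k=300):
    """Map the top_k most frequent character n-grams (n = 1..n_max) to their rank."""
    text = " ".join(text.lower().split())            # normalize case and whitespace
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(text) - n + 1):
            counts[text[i:i + n]] += 1
    ranked = [gram for gram, _ in counts.most_common(top_k)]
    return {gram: rank for rank, gram in enumerate(ranked)}


def out_of_place(doc_profile, lang_profile, penalty):
    """Sum of rank differences; n-grams missing from the language profile cost `penalty`."""
    return sum(
        abs(rank - lang_profile[gram]) if gram in lang_profile else penalty
        for gram, rank in doc_profile.items()
    )


def train(samples, top_k=300):
    """Training phase: build one n-gram profile per language."""
    return {lang: ngram_profile(text, top_k=top_k) for lang, text in samples.items()}


def identify(text, models, top_k=300):
    """Identification phase: pick the language whose profile is closest to the text."""
    doc = ngram_profile(text, top_k=top_k)
    return min(models, key=lambda lang: out_of_place(doc, models[lang], penalty=top_k))


# Invented one-sentence samples; real profiles would be built from large news corpora.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and runs through the green forest",
    "ru": "быстрая коричневая лиса прыгает через ленивую собаку и бежит по зеленому лесу",
    "el": "η γρήγορη καφέ αλεπού πηδάει πάνω από το τεμπέλικο σκυλί",
}

models = train(SAMPLES)
print(identify("the dog runs over the green fox", models))
print(identify("лиса бежит через лес", models))
```

The toy languages here use different scripts, so even tiny profiles separate them; distinguishing languages that share a script would need far larger training text per language.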
ISBN: (print) 9798350342734
In the realm of machine learning, a profound understanding of sentence semantics holds paramount importance for various applications, notably text classification. Traditionally, this comprehension has been entrusted to deep learning models, despite their computationally intensive nature, particularly when dealing with lengthy sequences. The nuanced impact of individual words within a sentence on semantic expression necessitates a strategic removal of less pertinent words to alleviate the computational burden of the model. Presently, prevailing approaches for word removal predominantly employ methods such as truncation, stop-word elimination and attention mechanisms. Regrettably, these techniques often lack a robust theoretical foundation concerning semantics and interpretability. To bridge this conceptual gap, our study introduces the concept of 'Semantic Pillar Words' (SPW) within a sentence, anchored in a Semantic Euclidean space. Here, the semantics of a word are represented as a constellation of semantic points, with a text sequence encapsulating the convex hull of these semantic points of words. We propose a novel method for Semantic Pillar Word extraction, known as 'SPW-Conv', which dynamically and interpretably prunes text segments, striving to preserve the semantic pillars inherent in the original text. Our extensive experimentation encompasses three diverse text classification datasets, revealing that SPW-Conv outperforms existing methods. Remarkably, it becomes evident that retaining less than 80% of the words within a sentence suffices to capture its semantics adequately, all while achieving classification accuracy levels comparable to those obtained using the entire original text.