In today's fast-paced world, machine communication plays a vital role, and human-machine interaction has become essential. To support it, natural language processing techniques are widely used. ...
ISBN:
(Print) 9781954085527
The recent GPT-3 model (Brown et al., 2020) achieves remarkable few-shot performance solely by leveraging a natural-language prompt and a few task demonstrations as input context. Inspired by their findings, we study few-shot learning in a more practical scenario, where we use smaller language models for which fine-tuning is computationally efficient. We present LM-BFF (better few-shot fine-tuning of language models), a suite of simple and complementary techniques for fine-tuning language models on a small number of annotated examples. Our approach includes (1) prompt-based fine-tuning together with a novel pipeline for automating prompt generation; and (2) a refined strategy for dynamically and selectively incorporating demonstrations into each context. Finally, we present a systematic evaluation for analyzing few-shot performance on a range of NLP tasks, including classification and regression. Our experiments demonstrate that our methods combine to dramatically outperform standard fine-tuning procedures in this low-resource setting, achieving up to 30% absolute improvement, and 11% on average across all tasks. Our approach makes minimal assumptions on task resources and domain expertise, and hence constitutes a strong task-agnostic method for few-shot learning.
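The prompt-based input format the abstract describes can be sketched as follows: a classification example is recast as a cloze task, with optional demonstrations prepended to the context. This is a minimal illustration only; the template text and label-word mapping here are assumptions for the example, whereas the paper's pipeline searches for them automatically.

```python
# Illustrative label-word mapping (assumed, not the automatically searched one).
LABEL_WORDS = {"positive": "great", "negative": "terrible"}

def build_prompt(sentence: str, demonstrations=()) -> str:
    """Recast an input as a cloze prompt, optionally prepending demonstrations.

    Each demonstration is a (sentence, label) pair rendered with its label word
    filled in; the query sentence ends with a [MASK] slot for the model to score.
    """
    demo_part = " ".join(
        f"{s} It was {LABEL_WORDS[y]}." for s, y in demonstrations
    )
    prompt = f"{sentence} It was [MASK]."
    return f"{demo_part} {prompt}".strip()

prompt = build_prompt(
    "A gripping, beautifully shot film.",
    demonstrations=[("The plot was a mess.", "negative")],
)
print(prompt)
# The plot was a mess. It was terrible. A gripping, beautifully shot film. It was [MASK].
```

A masked language model would then compare the probabilities of the label words at the [MASK] position instead of training a new classification head.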
Graphs have long been proposed as a tool to browse and navigate in a collection of documents in order to support exploratory search. Many techniques to automatically extract different types of graphs, showing for exam...
Author:
Bassi, A., Univ Chile, Fac Ciencias Fis & Matemat, Dept Ciencias Computac, Santiago, Chile
ISBN:
(Print) 0769508103
This paper presents a semantic model based on well-known psycholinguistic theories of human memory. It is centered on a spreading activation network, but it departs from classical models by representing associations between structured units instead of atomic nodes. Network units have an activity level that evolves according to their expected contextual relevance. Spreading activation explains the predictive top-down effect of knowledge. It supports a general heuristic that may be used as the first step of more elaborate methods. This model is suited to deal with the interaction between semantic and episodic memories, as well as many other practical issues regarding natural language processing, including the retroactive effect of semantics over perception and operation in open worlds.
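The core spreading-activation mechanism the abstract builds on can be sketched with a toy weighted association graph: each unit holds an activity level that decays over time, while activation flows to associated units in proportion to edge weight. The graph, decay factor, and unit names below are illustrative assumptions, not details from the paper.

```python
def spread(activation, edges, decay=0.5, steps=2):
    """Propagate activation through a weighted association graph.

    activation: dict mapping unit -> current activity level
    edges: dict mapping (source, target) -> association weight
    Each step, a unit's new activity is its decayed old activity plus
    the activation flowing in from its associated sources.
    """
    for _ in range(steps):
        incoming = {}
        for (src, dst), w in edges.items():
            incoming[dst] = incoming.get(dst, 0.0) + activation.get(src, 0.0) * w
        units = set(activation) | set(incoming)
        activation = {
            u: activation.get(u, 0.0) * decay + incoming.get(u, 0.0)
            for u in units
        }
    return activation

edges = {("doctor", "nurse"): 0.8, ("nurse", "hospital"): 0.6}
result = spread({"doctor": 1.0}, edges)
```

After two steps, "hospital" is activated even though it has no direct link to "doctor", which is the predictive top-down effect the model relies on.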
State-of-the-art pretrained contextualized models (PCMs), e.g. BERT, use tasks such as WiC and WSD to evaluate their word-in-context representations. This inherently assumes that performance in these tasks reflects how wel...
Applying differential privacy (DP) by means of the DP-SGD algorithm to protect individual data points during training is becoming increasingly popular in NLP. However, the choice of granularity at which DP is applied ...
Several recent works seek to develop foundation models specifically for medical applications, adapting general-purpose large language models (LLMs) and vision-language models (VLMs) via continued pretraining on public...
ISBN:
(Print) 9798891760615
One challenge in speech translation is that plenty of spoken content is long-form, but short units are necessary for obtaining high-quality translations. To address this mismatch, we adapt large language models (LLMs) to split long ASR transcripts into segments that can be independently translated so as to maximize the overall translation quality. We overcome the tendency of hallucination in LLMs by incorporating finite-state constraints during decoding; these eliminate invalid outputs without requiring additional training. We discover that LLMs are adaptable to transcripts containing ASR errors through prompt-tuning or fine-tuning. Relative to a state-of-the-art automatic punctuation baseline, our best LLM improves the average BLEU by 2.9 points for English-German, English-Spanish, and English-Arabic TED talk translation in 9 test sets, just by improving segmentation.
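The finite-state constraint idea can be sketched in miniature: at each decoding step, the only legal outputs are the next transcript word or a segment-break symbol, so hallucinated tokens are impossible by construction. The break symbol, scoring function, and example are assumptions for illustration; `score` stands in for the LLM's next-token preference.

```python
def constrained_segment(words, score):
    """Greedy decoding under a finite-state constraint.

    At each state, the only legal tokens are the next input word or <brk>
    (a segment break); leading and doubled breaks are forbidden. Invalid
    outputs are eliminated before scoring, not filtered afterwards.
    """
    out, i = [], 0
    while i < len(words):
        legal = [words[i]]
        if out and out[-1] != "<brk>":
            legal.append("<brk>")
        prev = out[-1] if out else None
        tok = max(legal, key=lambda t: score(prev, t))  # best *legal* token
        out.append(tok)
        if tok != "<brk>":
            i += 1
    return out

def score(prev, tok):
    """Toy stand-in for an LLM: prefer a break after sentence-final punctuation."""
    if tok == "<brk>":
        return 1.0 if prev and prev.endswith(".") else 0.0
    return 0.5

segs = constrained_segment(["hello", "world.", "how", "are", "you"], score)
print(segs)
# ['hello', 'world.', '<brk>', 'how', 'are', 'you']
```

Because every output is a copy of the input words in order, stripping the break symbols always recovers the original transcript exactly.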
ISBN:
(Print) 9798350328837; 9798350328844
Advancements in mobile technology make it easier to communicate in real time, but at the cost of a wider potential attack surface for phishing. While there has been research related to email and SMS, instant messaging lags behind. The widespread use of instant messengers by individuals of all ages further motivates the addition of software security features in this context. This research aims to detect phishing in mobile instant messages by analysing the language of the message with the help of natural language processing to detect keywords pointing towards phishing. We built machine learning models using 3 different methods for feature extraction and 3 classification algorithms. Our tests showed that balancing the data with random oversampling increased the classifiers' performance, which achieved an accuracy of up to 99.2%.
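The random oversampling step mentioned above can be sketched as follows: minority-class examples are resampled with replacement until every class matches the majority-class count. The messages and labels below are illustrative placeholders, not data from the study.

```python
import random
from collections import Counter

def random_oversample(examples, labels, seed=0):
    """Balance classes by duplicating minority-class examples with replacement."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())  # size of the majority class
    out_x, out_y = list(examples), list(labels)
    for cls, n in counts.items():
        pool = [x for x, y in zip(examples, labels) if y == cls]
        for _ in range(target - n):   # top the class up to the target size
            out_x.append(rng.choice(pool))
            out_y.append(cls)
    return out_x, out_y

X = ["msg1", "msg2", "msg3", "msg4"]
y = ["ham", "ham", "ham", "phish"]
Xb, yb = random_oversample(X, y)
```

Oversampling is applied only to the training split in practice; duplicating examples before splitting would leak copies of training messages into the test set and inflate accuracy.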
Large vision-language models (LVLMs) are prone to hallucinations, where certain contextual cues in an image can trigger the language module to produce overconfident and incorrect reasoning about abnormal or hypothetic...