ISBN:
(Print) 9789819794393; 9789819794409
Low-resource language translation remains a significant challenge in natural language processing, particularly for the Mongolian-Chinese language pair under the "Belt and Road" initiative. Existing translation systems struggle with this pair due to the scarcity of high-quality data. This paper addresses these challenges by combining multilingual k-nearest-neighbor machine translation (kNN-MT) with Chinese-centric methods. We constructed a robust multilingual datastore and introduced an incomplete-trust loss function to effectively manage low-quality data. Additionally, we implemented re-ranking techniques to further enhance the robustness and accuracy of the translation model. The experimental results indicate that this combined approach significantly improves Mongolian-Chinese translation quality on the mBART model, with a BLEU score increase of 3.81 points and a TER decrease of 0.0531. Our findings demonstrate that integrating kNN-MT with Chinese-centric methods and employing advanced loss functions and re-ranking techniques can effectively address data scarcity and quality issues, leading to substantial improvements in translation performance for low-resource language pairs.
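For readers unfamiliar with the retrieval mechanism the abstract builds on, below is a minimal sketch of a single kNN-MT decoding step in Python with NumPy: the base model's token distribution is interpolated with a distribution induced by the nearest neighbours in a datastore of (decoder hidden state, target token) pairs. The function name, hyperparameter values, and fixed interpolation weight are illustrative assumptions, not the paper's exact setup.

```python
# Minimal kNN-MT decoding step: interpolate the base model's token
# distribution with a distribution induced by nearest neighbours in a
# datastore of (decoder hidden state -> target token) pairs.
import numpy as np

def knn_mt_step(hidden, model_probs, datastore_keys, datastore_tokens,
                k=8, temperature=10.0, lam=0.5):
    """hidden: (d,) decoder state; model_probs: (V,) base MT distribution;
    datastore_keys: (N, d); datastore_tokens: (N,) target-token ids."""
    # L2 distances from the query state to every stored key.
    dists = np.linalg.norm(datastore_keys - hidden, axis=1)
    nn = np.argsort(dists)[:k]                      # k nearest neighbours
    # Turn negative distances into neighbour weights.
    weights = np.exp(-dists[nn] / temperature)
    weights /= weights.sum()
    # Scatter neighbour weights onto the vocabulary.
    knn_probs = np.zeros_like(model_probs)
    for w, tok in zip(weights, datastore_tokens[nn]):
        knn_probs[tok] += w
    # Final distribution: interpolation of retrieved and parametric knowledge.
    return lam * knn_probs + (1.0 - lam) * model_probs
```

The interpolation weight is what lets the datastore compensate for contexts the base model has rarely seen, which is why this style of retrieval suits low-resource pairs such as Mongolian-Chinese.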
Pre-trained chemical language models (CLMs) excel in the field of molecular property prediction, utilizing string-based molecular descriptors such as SMILES for learning universal representations. However, such string...
Topic-Dependent Argument Mining (TDAM), that is, extracting and classifying argument components for a specific topic from large document sources, is an inherently difficult task for machine learning models and humans a...
ISBN:
(Print) 9798891760615
Large Language Models (LLMs) have demonstrated impressive performance on a range of natural language processing (NLP) tasks. Unfortunately, the immense amount of computation and memory access required for LLM training makes such training prohibitively expensive in terms of hardware cost, and thus challenging to deploy in use cases such as on-device learning. In this paper, motivated by the observation that LLM training is memory-bound, we propose a novel dynamic quantization strategy, termed Dynamic Stashing Quantization (DSQ), which puts a special focus on reducing memory operations while also enjoying the other benefits of low-precision training, such as reduced arithmetic cost. We conduct a thorough study on two translation tasks (trained from scratch) and three classification tasks (fine-tuning). DSQ reduces the amount of arithmetic operations by 20.95x and the number of DRAM operations by 2.55x on IWSLT17 compared to standard 16-bit fixed-point training.
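The underlying idea, quantizing activations before they are stashed to DRAM for the backward pass, can be sketched in a few lines. The following is a generic quantize-before-stash illustration in NumPy, not the paper's exact DSQ scheme (which varies precision dynamically over training); the function names and the symmetric per-tensor scheme are assumptions for illustration.

```python
# Generic "quantize before stashing" sketch: activations saved for the
# backward pass are stored in int8 with a per-tensor scale, cutting the
# DRAM traffic that dominates memory-bound LLM training.
import numpy as np

def stash(activation, num_bits=8):
    """Quantize an fp32 activation tensor for cheap storage."""
    qmax = 2 ** (num_bits - 1) - 1
    scale = np.abs(activation).max() / qmax + 1e-12   # symmetric per-tensor scale
    q = np.clip(np.round(activation / scale), -qmax - 1, qmax).astype(np.int8)
    return q, scale

def unstash(q, scale):
    """Dequantize when the backward pass needs the activation again."""
    return q.astype(np.float32) * scale
```

Since every stashed tensor moves through DRAM twice (store on forward, load on backward), shrinking it from 16-bit to 8-bit directly halves that traffic, which is the lever the paper's reported DRAM-operation savings rest on.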
The escalating complexity of modern software systems has rendered the management of requirements increasingly arduous, often plagued by redundancy, inconsistency, and inefficiency. Traditional manual methods prove inadequate for addressing the intricacies of dynamic, large-scale datasets. In response, this research introduces SQUIRE (Semantic Quick Requirements Engineering), a cutting-edge automated framework leveraging advanced natural language processing (NLP) techniques, specifically Sentence-BERT (SBERT) embeddings and hierarchical clustering, to semantically organize requirements into coherent functional clusters. SQUIRE is meticulously designed to enhance modularity, mitigate redundancy, and strengthen traceability within requirements engineering processes. Its efficacy is rigorously validated using real-world datasets from diverse domains, including attendance management, e-commerce systems, and school operations. Empirical evaluations reveal that SQUIRE outperforms conventional clustering methods, demonstrating superior intra-cluster cohesion and inter-cluster separation, while significantly reducing manual intervention. This research establishes SQUIRE as a scalable and domain-agnostic solution, effectively addressing the evolving complexities of contemporary software development. By streamlining requirements management and enabling software teams to focus on strategic initiatives, SQUIRE advances the state of NLP-driven methodologies in Requirements Engineering, offering a robust foundation for future innovations.
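A minimal sketch of the embed-then-cluster pipeline the abstract describes, using the sentence-transformers and SciPy libraries: each requirement is encoded with an SBERT model, and agglomerative clustering over cosine distances groups semantically similar requirements. The checkpoint name, example requirements, and distance threshold are illustrative; the paper specifies SBERT embeddings and hierarchical clustering but not these particulars.

```python
# Sketch of the SBERT-embed-then-cluster pipeline: encode each requirement,
# then group semantically similar ones with agglomerative clustering.
from sentence_transformers import SentenceTransformer
from scipy.cluster.hierarchy import linkage, fcluster

requirements = [
    "The system shall record student attendance daily.",
    "Teachers can view attendance reports per class.",
    "Customers may add items to a shopping cart.",
    "The store shall process credit card payments.",
]

# Model choice is illustrative; the paper names SBERT but not a checkpoint.
model = SentenceTransformer("all-MiniLM-L6-v2")
embeddings = model.encode(requirements, normalize_embeddings=True)

# Average-linkage hierarchical clustering on cosine distance.
tree = linkage(embeddings, method="average", metric="cosine")
labels = fcluster(tree, t=0.5, criterion="distance")  # threshold is illustrative
for req, lab in zip(requirements, labels):
    print(lab, req)
```

Cutting the dendrogram with a distance threshold, rather than a fixed cluster count, is what keeps such a pipeline domain-agnostic: the number of functional clusters falls out of the data instead of being chosen per project.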
Prompt-based learning is susceptible to intrinsic bias present in pre-trained language models (LMs), leading to sub-optimal performance in prompt-based zero/few-shot settings. In this work, we propose a null-input pro...
Recent advances in Large Language Models (LLMs) have sparked wide interest in validating and comprehending the human-like cognitive-behavioral traits LLMs may capture and exhibit. These cognitive-behavioral traits include typica...
Open Information Extraction (OpenIE) represents a crucial NLP task aimed at deriving structured information from unstructured text, unrestricted by relation type or domain. This survey paper provides an overview of OpenIE tech...
ISBN:
(Print) 9798891760615
Legal practice is intrinsically rooted in the fabric of language, yet legal practitioners and scholars have been slow to adopt tools from natural language processing (NLP). At the same time, the legal system is experiencing an access-to-justice crisis, which could be partially alleviated with NLP. In this position paper, we argue that the slow uptake of NLP in legal practice is exacerbated by a disconnect between the needs of the legal community and the focus of NLP researchers. In a review of recent trends in the legal NLP literature, we find limited overlap between the legal NLP community and legal academia. Our interpretation is that some of the most popular legal NLP tasks fail to address the needs of legal practitioners. We discuss examples of legal NLP tasks that promise to bridge disciplinary disconnects and highlight interesting areas for legal NLP research that remain underexplored.
ISBN:
(Print) 9798891760615
The success of ChatGPT validates the potential of large language models (LLMs) in artificial general intelligence (AGI). Subsequently, the release of LLMs has sparked the open-source community's interest in instruction tuning, which is deemed to accelerate the replication of ChatGPT. However, research on instruction-tuning LLMs in Chinese, the world's most spoken language, is still in its early stages. Therefore, this paper presents an in-depth empirical study of instruction-tuning LLMs in Chinese, which can serve as a cookbook of valuable findings for effectively customizing LLMs to better respond to Chinese instructions. Specifically, we systematically explore the impact of LLM bases, parameter-efficient methods, and instruction data types, the three most important elements for instruction tuning. Besides, we also conduct experiments to study the impact of other factors, e.g., chain-of-thought data and human-value alignment. We hope that this empirical study can make a modest contribution to the open Chinese version of ChatGPT. This paper releases a powerful Chinese LLM that is comparable to ChatGLM. The code and data are available at https://github.com/PhoebusSi/Alpaca-CoT.
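As one illustration of the "parameter-efficient methods" dimension the study compares, the sketch below fine-tunes a small causal LM on a single Chinese instruction example with LoRA via the peft library. The base model, target modules, and hyperparameters are placeholders, not the paper's configuration.

```python
# Parameter-efficient instruction tuning with LoRA: only small adapter
# matrices train, while the frozen base model provides the capability.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

base = "bigscience/bloom-560m"   # placeholder small base model
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(base)

lora = LoraConfig(r=8, lora_alpha=16, lora_dropout=0.05,
                  target_modules=["query_key_value"],  # BLOOM attention proj
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()   # only the LoRA adapters train

# A Chinese instruction example formatted for supervised fine-tuning.
prompt = "指令：把下面的句子翻译成英文。\n输入：今天天气很好。\n输出："
batch = tokenizer(prompt + "The weather is nice today.", return_tensors="pt")
loss = model(**batch, labels=batch["input_ids"]).loss
loss.backward()   # gradients flow only through the adapter weights
```

Because only the adapter weights receive gradients, swapping LLM bases or instruction data types, the other two elements the study varies, requires no change to this training loop.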