WordNet is a lexical database for English that is supplied in a variety of formats, including one compatible with the Prolog programming language. Given the success and usefulness of WordNet, wordnets for other languages have been developed, including Spanish. The Spanish WordNet, like others, does not provide a Prolog-compatible version. This work aims to fill that gap by translating the Multilingual Central Repository (MCR) version of the Spanish WordNet into a Prolog-compatible format. The translation yields a set of Spanish lexical databases that allow WordNet information to be accessed using declarative techniques and the deductive capabilities of Prolog. It also facilitates the development of other programs that analyze the resulting information. Notably, we adapted differential testing, a technique from software testing, to verify the correctness of the conversion. In addition, to ensure the consistency of both the generated Prolog databases and the databases from which we started, we carried out a comprehensive series of integrity constraint tests. These tests uncovered inconsistencies in the MCR databases that carry over into the generated Prolog databases; we have reported them to the owners of those databases.
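As a rough illustration of how differential testing can check a conversion of this kind, the sketch below answers the same query from two independent representations (the source rows and the generated Prolog facts) and reports any disagreement. It is not the authors' implementation: the row layout, the three-argument `s/3` fact shape, the sample words, and every function name are assumptions made for this example, and the actual MCR and Prolog schemas may differ.

```python
import re

# Hypothetical source rows (synset id, word, part of speech).
# Invented sample data for illustration; not taken from the MCR.
SOURCE_ROWS = [
    (100001, "perro", "n"),
    (100001, "can", "n"),
    (200002, "correr", "v"),
    (300003, "d'artagnan", "n"),
]


def to_prolog_fact(synset_id, word, pos):
    """Render one row as a Prolog fact (assumed s/3 shape)."""
    # Inside a quoted Prolog atom, a single quote is written as two quotes.
    escaped = word.replace("'", "''")
    return f"s({synset_id},'{escaped}',{pos})."


# Matches the assumed fact shape, allowing doubled quotes inside the atom.
FACT_RE = re.compile(r"^s\((\d+),'((?:[^']|'')*)',([a-z])\)\.$")


def parse_prolog_fact(line):
    """Read a fact produced by to_prolog_fact back into a tuple."""
    match = FACT_RE.match(line)
    if match is None:
        raise ValueError(f"not a recognized fact: {line!r}")
    word = match.group(2).replace("''", "'")
    return (int(match.group(1)), word, match.group(3))


def words_in_synset_source(rows, synset_id):
    """Answer the query directly from the source rows."""
    return sorted(w for sid, w, _ in rows if sid == synset_id)


def words_in_synset_prolog(facts, synset_id):
    """Answer the same query from the generated Prolog facts."""
    parsed = (parse_prolog_fact(f) for f in facts)
    return sorted(w for sid, w, _ in parsed if sid == synset_id)


def differential_test(rows):
    """Return (synset id, source answer, Prolog answer) for each mismatch."""
    facts = [to_prolog_fact(*row) for row in rows]
    mismatches = []
    for synset_id in sorted({row[0] for row in rows}):
        expected = words_in_synset_source(rows, synset_id)
        actual = words_in_synset_prolog(facts, synset_id)
        if expected != actual:
            mismatches.append((synset_id, expected, actual))
    return mismatches


if __name__ == "__main__":
    print(differential_test(SOURCE_ROWS))
```

An empty result means the two representations agree on every query tried; a converter that, for example, failed to escape apostrophes would surface as a parse error or a mismatch for the affected synset.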
ISBN: (print) 9798331528423; 9798331528430
This position paper discusses the profound impact of Large Language Models (LLMs) on semantic change, emphasizing the need for comprehensive monitoring and visualization techniques. Building on linguistic concepts, we examine the interdependency between mental models and language models, highlighting how LLMs and human cognition influence each other within societal contexts. We introduce three primary theories to conceptualize these influences: (T1) Recontextualization, (T2) Standardization, and (T3) Semantic Dementia, illustrating how LLMs drive, standardize, and potentially degrade language semantics. Our subsequent review categorizes methods for visualizing semantic change into frequency-based, embedding-based, and context-based techniques and is the first to assess their effectiveness in capturing linguistic evolution. Embedding-based methods are highlighted as crucial for detailed semantic analysis because they reflect both broad trends and specific linguistic changes. We underscore the need for novel visualization tools that explain LLM-induced semantic changes, help preserve linguistic diversity, and mitigate biases, while providing essential insights for research on semantic change visualization and on the dynamic nature of language evolution in the era of LLMs.