ISBN (print): 9783031705625; 9783031705632
Data scarcity is still a major challenge in machine translation. The performance of state-of-the-art deep learning architectures, such as the Transformer, on under-resourced languages is well below that on high-resource languages. This precludes access to information for millions of speakers across the globe. Previous research has shown that the Transformer is highly sensitive to hyperparameters in low-resource conditions. One such hyperparameter is the size of the model's subword vocabulary. In this paper, we show that using smaller vocabularies, as small as 1k tokens, instead of the default value of 32k, is preferable in a diverse array of low-resource conditions. We experiment with different vocabulary sizes on English-Akkadian, Lower Sorbian-German, and English-Manipuri to obtain models that are faster to train, smaller, and better performing than the default setting. These models achieve improvements of up to 322% in ChrF score, while being up to 66% smaller and up to 17% faster to train.
ISBN (print): 9783031610332; 9783031610349
The vehicle-to-grid feature of today's electric vehicles suggests using them as batteries for stabilizing the power grid in addition to fulfilling mobility needs. In the context of car-sharing, the provider may thus pursue two goals: they may be interested in stabilizing the grid and ensuring the use of as much green energy as possible, while at the same time trying to maximize the satisfaction of customers' requests. As such, each car-sharing provider has to implement a policy on how to react to booking requests. Customers, in turn, may react to how their mobility needs are fulfilled and adapt their booking strategy. In this paper, we study how to model the elements of car-sharing providers as well as those of customers in a multi-agent simulation. We identify the principal elements and targets while leaving concrete simulations as future work.
ISBN (print): 9783031705656; 9783031705663
The evaluation of automatic speech transcriptions relies heavily on metrics such as Word Error Rate (WER) and Character Error Rate (CER). However, these metrics have faced criticism for their limited correlation with human perception and their inability to capture linguistic and semantic nuances accurately. Despite the introduction of metric-based embeddings to approximate human perception, their interpretability remains challenging compared to traditional metrics. In this article, we introduce a novel paradigm aimed at addressing these limitations. Our approach integrates a chosen metric to derive Minimum Edit Distance (minED), which serves as an indicator of the rate of serious errors in automatic speech transcriptions. Unlike conventional metrics, minED offers a more nuanced understanding of errors, accounting for both linguistic complexities and human perception. Furthermore, our paradigm facilitates the measurement of error severity from both intrinsic and extrinsic perspectives.
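For readers unfamiliar with the conventional metrics this abstract contrasts itself with, the following is a minimal sketch of WER and CER computed from Levenshtein edit distance. It illustrates only the standard definitions (edit distance divided by reference length) and does not implement the minED paradigm, whose details the abstract does not give. The function names `edit_distance`, `wer`, and `cer` are illustrative choices, and whitespace tokenization for words is an assumption.

```python
from typing import Sequence


def edit_distance(ref: Sequence, hyp: Sequence) -> int:
    """Levenshtein distance: substitutions, insertions, and deletions each cost 1."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from ref[:i] to an empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference item
                curr[j - 1] + 1,     # insertion of a hypothesis item
                prev[j - 1] + cost,  # substitution or exact match
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word Error Rate, assuming whitespace tokenization."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character Error Rate over raw characters (spaces included)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


if __name__ == "__main__":
    print(wer("the cat sat on the mat", "the cat sit on mat"))
    print(cer("kitten", "sitting"))
```

Both metrics weight every edit equally, which is the property the abstract criticizes: a substitution that changes the meaning of an utterance counts the same as a harmless one.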
ISBN (print): 9783031642982; 9783031642999
General-purpose Language Models (LMs) bypass the need for task-specific model training by allowing textual prompts to specify a downstream task (e.g., assessment, feedback generation). One of the main benefits of using a prompt-based learning method is that it circumvents the need for supervised data and training on the downstream task. However, in high-stakes settings like education, LMs need to be evaluated rigorously on the specific downstream tasks before putting them in front of the students. Unlike traditional supervised learning models that are evaluated for a specific task, LMs are often evaluated on benchmark data and tasks that may not reflect the downstream use in education. Hence, we first present arguments for contextual evaluation of LMs. Next, we present a framework for behavior analysis - an alternative approach for model evaluation. Behavior analysis involves defining LM behaviors and designing tests (e.g., invariance to irrelevant perturbations). Using a case study of assessing science ideas in student essays, with past data from ecologically valid contexts, we illustrate how behavior analysis allowed for the identification of LM failures that are likely to go unnoticed in tests for generalization. By making the LMs more transparent for scrutiny, this study suggests a way to improve LM reliability and trustworthiness. Future studies will work with education stakeholders in translating their implicit expectations of desired model behaviors into explicitly defined tests, thereby building their agency and trust in educational AI.
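The invariance test mentioned in this abstract can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' framework: the harness `invariance_failures`, the perturbation `add_irrelevant_sentence`, and the two toy scorers are all hypothetical stand-ins for a real LM-based assessor, chosen so that the example runs without any model.

```python
from typing import Callable, Iterable, List, Tuple


def invariance_failures(
    model: Callable[[str], int],
    inputs: Iterable[str],
    perturb: Callable[[str], str],
) -> List[Tuple[str, str, int, int]]:
    """Return every case where an irrelevant perturbation changed the model output."""
    failures = []
    for text in inputs:
        perturbed_text = perturb(text)
        original, perturbed = model(text), model(perturbed_text)
        if original != perturbed:
            failures.append((text, perturbed_text, original, perturbed))
    return failures


def add_irrelevant_sentence(text: str) -> str:
    """Hypothetical perturbation: append a sentence with no science content."""
    return text + " Thank you for reading my essay."


def toy_keyword_scorer(essay: str) -> int:
    """Hypothetical scorer: 1 if the idea 'energy' is mentioned, else 0."""
    return int("energy" in essay.lower())


def toy_length_scorer(essay: str) -> int:
    """Hypothetical flawed scorer: rewards essays longer than 8 words."""
    return int(len(essay.split()) > 8)


if __name__ == "__main__":
    essays = ["Plants get energy from sunlight.", "The sun is hot."]
    for scorer in (toy_keyword_scorer, toy_length_scorer):
        failed = invariance_failures(scorer, essays, add_irrelevant_sentence)
        print(scorer.__name__, "failures:", len(failed))
```

The length-based scorer fails the test on both essays even though it might agree with human labels on a held-out set, which is the kind of failure the abstract says can go unnoticed in tests for generalization.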
ISBN (print): 9789819705825; 9789819705832
Numerous mainland Chinese neologisms have been introduced to Hong Kong due to the increased communication between Hong Kong and mainland China. The purpose of this paper is to examine the influence of neologisms from mainland China on Hong Kong Cantonese by observing the use of these neologisms in local newspapers in Hong Kong. It has been shown that the local newspapers began to use the mainland neologisms at an early stage, which has resulted in semantic changes in Cantonese lexical items. There are three ways in which lexical semantic change can occur: 1) redefining the Cantonese-Mandarin homographs to make the senses conveyed by the mainland neologisms the only interpretation; 2) replacing the Cantonese synonyms with mainland neologisms; and 3) providing new senses for the mainland neologisms to transform them into local neologisms.
ISBN (print): 9783031705656; 9783031705663
The advancement of human-machine interaction and the expectation of interacting with devices in a natural way necessitate the development of multi-modal dialogue systems that can process and respond to various input modalities. Although effective recognition methods exist for the different modalities, their integration into reusable frameworks often falls short, impeding rapid development and the long-term maintenance of systems that incorporate multiple modalities. This paper introduces stepDP, a novel and open-source dialogue platform designed to overcome these challenges by facilitating the quick and efficient implementation of multi-modal dialogue systems following a model-driven design paradigm. Well-researched concepts and algorithms were integrated into a framework and fine-tuned to work together seamlessly. One core concept is that the dialogue logic is abstracted from actual input modalities, allowing for the generalisation of dialogue behavior and seamless integration across domains. Emphasizing modularity and flexibility, stepDP allows for the easy integration of new features, for example, the combination of traditional NLU techniques with innovative LLMs, without requiring extensive system modifications. Our platform not only accelerates the development process but also promotes the exploration of new concepts and techniques in human-machine interaction.
ISBN (print): 9783031610332; 9783031610349
Truly realistic models for policy making must cover multiple aspects of life, exhibit realistic social behaviour, and be able to simulate millions of agents. Current state-of-the-art agent-based models achieve only two of these requirements. Models that prioritise realistic social behaviour are not easily scalable because their complex deliberation takes into account all information available to each agent at each time step. Our framework uses context to considerably narrow down the information that has to be considered. A key property of the framework is that it can dynamically slide between fast deliberation and complex deliberation: context is expanded only when necessary. We introduce the elements of the framework, describe the architecture, and show a proof-of-concept implementation. We take first steps towards validation using this implementation.
ISBN (print): 9783031643019; 9783031643026
We have developed a pedagogical approach wherein learners acquire systems thinking skills and content knowledge by constructing qualitative representations. In this paper, we focus on how learners learn about the biological mechanisms of calcium regulation by constructing such a representation, how they interact with the software, and the effect on learning outcomes. The software contains various functionalities to support learners, and a workbook guides them through the process. Cluster analysis of learners' use of the software categorizes them into three styles, which we have labelled: exploratory, comprehensive, and efficient. Learning outcomes are evaluated through pre- and post-tests and show overall improvement on systems thinking skills and content knowledge. No significant differences in outcome are observed between the interaction styles of learners. This implies that constructing qualitative representations effectively increases learners' systems thinking skills and understanding of calcium regulation, regardless of their interaction style.
ISBN (print): 9783031607301; 9783031607318
The rapid development of electric vertical take-off and landing (eVTOL) vehicles is a response to increasingly congested urban ground traffic, low positioning accuracy, and the many hidden risks of traversing flight. The safety interval standard is therefore of great significance in ensuring the safe operation of these vehicles. To investigate the safety interval of multi-rotor eVTOL vehicles in low-altitude airspace, we improve the Event collision model for lateral, longitudinal, and vertical collisions by modifying the shape characteristics. Specifically, we replace the original rectangular collision box with a round table body (truncated cone) collision box, which reduces computational redundancy. We also calculate the relative velocity and overlap probability while considering the error distribution. Finally, we present a numerical example using the EH 216-S model to calculate the safety interval under different safety target levels and navigation accuracies. When the safety target level is 1 × 10⁻⁷ collisions per flight hour and the navigation accuracy is high, the minimum longitudinal, lateral, and vertical safety intervals are 42.1 m, 24.5 m, and 14.4 m, respectively. Compared to the original model, the improved model reduces the longitudinal, lateral, and vertical collision risks by 43.4%, 23%, and 58.4%, respectively. The results can serve as a reference for developing interval standards for multi-rotor eVTOL vehicles. If a vehicle is detected at a separation smaller than the safety interval standard, this can be promptly displayed on the human-computer interface to alert the supervisors.
ISBN (print): 9783031642982; 9783031642999
This study investigates how automated and personalized feedback can support elementary teachers' learning within digital teaching simulations. We interviewed 15 participants to elicit their understanding and evaluation of feedback about a teacher's facilitation of a science discussion, which was generated using natural language processing (NLP). Findings indicate that participants: (a) perceived strong alignment between the feedback and the discussion facilitated, (b) used the feedback to identify future instructional moves, and (c) noted how the feedback can support reflection.