Search Results - Inner Mongolia University Library

arXiv, 2023

Authors: Herold, Christian; Gao, Yingbo; Zeineldeen, Mohammad; Ney, Hermann

Affiliations: Human Language Technology and Pattern Recognition Group, Computer Science Department, RWTH Aachen University, Aachen D-52056, Germany

The integration of language models for neural machine translation has been extensively studied in the past. It has been shown that an external language model, trained on additional target-side monolingual data, can help improve translation quality. However, there has always been the assumption that the translation model also learns an implicit target-side language model during training, which interferes with the external language model at decoding time. Recently, some works on automatic speech recognition have demonstrated that, if the implicit language model is neutralized in decoding, further improvements can be gained when integrating an external language model. In this work, we transfer this concept to the task of machine translation and compare with the most prominent way of including additional monolingual data - namely back-translation. We find that accounting for the implicit language model significantly boosts the performance of language model fusion, although this approach is still outperformed by back-translation. Copyright © 2023, The Authors. All rights reserved.

Keywords: Neural machine translation

Improving Long Context Document-Level Machine Translation

arXiv, 2023

Authors: Herold, Christian; Ney, Hermann

Affiliations: Human Language Technology and Pattern Recognition Group, Computer Science Department, RWTH Aachen University, Aachen D-52056, Germany

Document-level context for neural machine translation (NMT) is crucial to improve the translation consistency and cohesion, the translation of ambiguous inputs, as well as several other linguistic phenomena. Many works have been published on the topic of document-level NMT, but most restrict the system to only local context, typically including just the one or two preceding sentences as additional information. This might be enough to resolve some ambiguous inputs, but it is probably not sufficient to capture some document-level information like the topic or style of a conversation. When increasing the context size beyond just the local context, there are two challenges: (i) the memory usage increases exponentially (ii) the translation performance starts to degrade. We argue that the widely-used attention mechanism is responsible for both issues. Therefore, we propose a constrained attention variant that focuses the attention on the most relevant parts of the sequence, while simultaneously reducing the memory consumption. For evaluation, we utilize targeted test sets in combination with novel evaluation techniques to analyze the translations in regards to specific discourse-related phenomena. We find that our approach is a good compromise between sentence-level NMT vs attending to the full context, especially in low resource scenarios. Copyright © 2023, The Authors. All rights reserved.

Keywords: Neural machine translation

Using the interactive software FossilSketch to teach micropaleontology to undergraduate students

Journal of Geoscience Education, 2025, Vol. 73, No. 2, pp. 133-153

Authors: Stepanova, Anna; Belanger, Christina; Anwar, Saira; Stanley, Christine; Nath, Ankur; Cherian, Josh; Hammond, Tracy

Affiliations: Department of Computer Science and Engineering, Texas A&M University, College Station, TX, United States; Department of Geology and Geophysics, Texas A&M University, College Station, TX, United States; Department of Multidisciplinary Engineering, Texas A&M University, College Station, TX, United States; Department of Educational Administration and Human Resource Development, Texas A&M University, College Station, TX, United States; Institute for Engineering Education and Innovation, Texas A&M University, College Station, TX, United States

Micropaleontology is a critical tool for determining the ages of geologic records, reconstructing ancient environments, and monitoring modern ecosystem health. However, most students are not exposed to micropaleontology in their college coursework. To enable non-expert instructors to integrate microfossil identification training in their undergraduate courses, we developed FossilSketch, an interactive web-based educational tool that introduces students to the basics of micropaleontology and guides students through a scaffolded learning experience that develops microfossil identification skills. Here we test the impact of FossilSketch on students’ ability to learn micropaleontology skills, such as identification of microfossils to genus level and basics of fossil data analysis, using data on students’ performance and survey responses collected in an undergraduate paleontology course for geology majors at a large public university. A total of 112 students took part in this study. Analysis of classroom assessments showed that junior and senior geology majors who used FossilSketch were better able to understand the process of microfossil identification, recognize morphological characteristics, and achieve a correct identification than those who did not use FossilSketch. Students who used FossilSketch needed to ask the teaching assistant fewer questions and felt better prepared for specimen-based work than students who did not use FossilSketch. These results suggest that FossilSketch improves students’ understanding of the microfossil identification process. © 2024 National Association of Geoscience Teachers.

Keywords: micropaleontology; Taxonomy; undergraduate education

GMOCSO: Multi-objective Cat Swarm Optimization Algorithm based on a Grid System

arXiv, 2025

Authors: Ahmed, Aram M.; Hassan, Bryar A.; Rashid, Tarik A.; Noori, Kaniaw A.; Saeed, Soran Ab.M.; Ahmed, Omed H.; Umar, Shahla U.

Affiliations: Computer Science and Engineering Department, University of Kurdistan Hewlêr, Erbil, Iraq; Department of Computer Science, College of Science, Charmo University, Chamchamal, Sulaimani, Iraq; Database Technology Department, Technical College of Informatics, Sulaimani Polytechnic University, Sulaimani, Iraq; Department of Information Technology, University of Human Development, Sulaymaniyah, Iraq; Network Department, College of Computer Science and Information Technology, Kirkuk University, Kirkuk, Iraq

This paper presents a multi-objective version of the Cat Swarm Optimization Algorithm called the Grid-based Multiobjective Cat Swarm Optimization Algorithm (GMOCSO). Convergence and diversity preservation are the two main goals pursued by modern multi-objective algorithms to yield robust results. To achieve these goals, we first replace the roulette wheel method of the original CSO algorithm with a greedy method. Then, two key concepts from Pareto Archived Evolution Strategy Algorithm (PAES) are adopted: the grid system and double archive strategy. Several test functions and a real-world scenario called the Pressure vessel design problem are used to evaluate the proposed algorithm's performance. In the experiment, the proposed algorithm is compared with other well-known algorithms using different metrics such as Reversed Generational Distance, Spacing metric, and Spread metric. The optimization results show the robustness of the proposed algorithm, and the results are further confirmed using statistical methods and graphs. Finally, conclusions and future directions were presented. © 2025, CC BY.

Keywords: Pressure vessels

Trade-Offs Between Fairness and Privacy in Language Modeling

arXiv, 2023

Authors: Matzken, Cleo; Eger, Steffen; Habernal, Ivan

Affiliations: Trustworthy Human Language Technologies, Department of Computer Science, Technical University of Darmstadt, Germany; Natural Language Learning Group, Faculty of Technology, Universität Bielefeld, Germany

Protecting privacy in contemporary NLP models is gaining in importance. So does the need to mitigate social biases of such models. But can we have both at the same time? Existing research suggests that privacy preservation comes at the price of worsening biases in classification tasks. In this paper, we explore the extent to which this tradeoff really holds when we incorporate both privacy preservation and debiasing techniques into training text generation models. How does improving the model along one dimension affect the other dimension as well as the utility of the model? We conduct an extensive set of experiments that include bias detection, privacy attacks, language modeling, and performance on downstream tasks. © 2023, CC BY.

Keywords: Economic and social effects

New Ontology structure for intelligent controlling of traffic signals

26th International Conference on Knowledge-Based and Intelligent Information and Engineering Systems, KES 2022

Authors: Mohammad, Mahmud Abdulla; Manguri, Kamaran H.; Abdulsamad, Taib Shamsadin; Faeq Al-Talabani, Abdulbasit K.; Abdulrahman, Akam Aziz

Affiliations: Computer Science Department, College of Basic Education, University of Raparin, Kurdistan Region, Ranya, Iraq; College of Science and Technology, University of Human Development, Kurdistan Region, Sulaymaniyah, Iraq; Department of Software Engineering, Koya University, Kurdistan Region, Koya, Iraq

This article proposes a novel ontology design for intelligent controlling of traffic signals, considering the investigated factors, crowded factors, road factors, visibility conditions, and emergency situations. Essentially, the proposed method uses video-based knowledge and key feature from a monocular video camera only, capturing footage from either a traffic signals perspective or the top of the road lane. The key factors and entities in the traffic scene are formed into an ontology, which has been evaluated using synthetic datasets to interpret challenging cases. Semantic features related to the key factors in the scene are obtained and fed to the ontology. The experimental results indicate that the proposed method is capable of controlling traffic signals more efficiently than the fixed intervals protocol. © 2022 The Authors. Published by Elsevier B.V.

Keywords: Semantics

Effects of Customized Aggression Reduction Interventions with Male Adolescents: A Single-Case Research Design

Journal of Child and Adolescent Counseling, 2023, Vol. 9, No. 1, pp. 50-65

Authors: Raychelle Cassada Lohmann; Stanley B. Baker; ClarLynda R. Williams-DeVane

Affiliations: (a) Department of Educational Leadership, Policy, and Human Development, North Carolina State University, Raleigh, North Carolina, USA; (b) Department of Mathematics and Computer Science, Fisk University, Nashville, Tennessee, USA

Aggression during adolescence can lead to unhealthy outcomes. Prior research suggests that youth with disruptive behaviors filter information in a distorted manner and struggle with social information processing skills. Teaching effective social processing skills can help reduce aggressive behaviors. The current study aimed to investigate the effects of customized aggression interventions on adolescent males. We used a theory-informed framework to guide the development of the interventions using an N = 1/ABA single-case research design with four male adolescents aged 13–14 who volunteered to participate in our study. A female licensed clinical mental health counselor designed and delivered the interventions and collected the outcome data. Participants completed a series of temporal assessments examining proactive, reactive, and total aggression. We hypothesized that customized interventions would be an effective means to address and reduce problematic aggressive behaviors. The data produced small to large effect sizes for three of the four participants, and statistically significant differences were observed between phases. The results have implications for the contributions of utilizing social information processing theory-informed customized aggression interventions with adolescents using single-case research design methodology.

Keywords: Aggression; social information processing; single-case research design

Robust Knowledge Distillation from RNN-T Models with Noisy Training Labels Using Full-Sum Loss

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Mohammad Zeineldeen; Kartik Audhkhasi; Murali Karthick Baskar; Bhuvana Ramabhadran

Affiliations: Computer Science Department, Human Language Technology and Pattern Recognition, RWTH Aachen University, Aachen, Germany; Google LLC, New York

This work studies knowledge distillation (KD) and addresses its constraints for recurrent neural network transducer (RNN-T) models. In hard distillation, a teacher model transcribes large amounts of unlabelled speech to train a student model. Soft distillation is another popular KD method that distills the output logits of the teacher model. Due to the nature of RNN-T alignments, applying soft distillation between RNN-T architectures having different posterior distributions is challenging. In addition, bad teachers having high word-error-rate (WER) reduce the efficacy of KD. We investigate how to effectively distill knowledge from variable quality ASR teachers, which has not been studied before to the best of our knowledge. We show that a sequence-level KD, full-sum distillation, outperforms other distillation methods for RNN-T models, especially for bad teachers. We also propose a variant of full-sum distillation that distills the sequence discriminative knowledge of the teacher leading to further improvement in WER. We conduct experiments on public datasets namely SpeechStew and LibriSpeech, and on in-house production data.

Keywords: Knowledge engineering; Training; Recurrent neural networks; Transducers; Production; Signal processing; Acoustics

End-To-End Training of a Neural HMM with Label and Transition Probabilities

IEEE Workshop on Automatic Speech Recognition and Understanding

Authors: Daniel Mann; Tina Raissi; Wilfried Michel; Ralf Schlüter; Hermann Ney

Affiliations: AppTek GmbH, Aachen, Germany; Machine Learning and Human Language Technology, Computer Science Department, RWTH Aachen University, Aachen, Germany

We investigate a novel modeling approach for end-to-end neural network training using hidden Markov models (HMM) where the transition probabilities between hidden states are modeled and learned explicitly. Most contemporary sequence-to-sequence models allow for from-scratch training by summing over all possible label segmentations in a given topology. In our approach there are explicit, learnable probabilities for transitions between segments as opposed to a blank label that implicitly encodes duration. We implement a GPU-based forward-backward algorithm that enables the simultaneous training of label and transition probabilities. We investigate recognition results and additionally Viterbi alignments of our models. We find that while the transition model training does not improve recognition performance, it has a positive impact on the alignment quality. The generated alignments are shown to be viable targets in state-of-the-art Viterbi trainings.

Lattice-Free Sequence Discriminative Training for Phoneme-Based Neural Transducers

International Conference on Acoustics, Speech, and Signal Processing (ICASSP)

Authors: Zijian Yang; Wei Zhou; Ralf Schlüter; Hermann Ney

Affiliations: Computer Science Department, Human Language Technology and Pattern Recognition, RWTH Aachen University, Aachen, Germany; AppTek GmbH, Aachen, Germany

Recently, RNN-Transducers have achieved remarkable results on various automatic speech recognition tasks. However, lattice-free sequence discriminative training methods, which obtain superior performance in hybrid models, are rarely investigated in RNN-Transducers. In this work, we propose three lattice-free training objectives, namely lattice-free maximum mutual information, lattice-free segment-level minimum Bayes risk, and lattice-free minimum Bayes risk, which are used for the final posterior output of the phoneme-based neural transducer with a limited context dependency. Compared to criteria using N-best lists, lattice-free methods eliminate the decoding step for hypotheses generation during training, which leads to more efficient training. Experimental results show that lattice-free methods gain up to 6.5% relative improvement in word error rate compared to a sequence-level cross-entropy trained model. Compared to the N-best-list based minimum Bayes risk objectives, lattice-free methods gain 40% - 70% relative training time speedup with a small degradation in performance.

Keywords: Training; Degradation; Transducers; Error analysis; Signal processing; Decoding; Speech processing
