We present a word sense disambiguation approach with application to machine translation from Arabic to English. The approach consists of two main steps: first, a natural language processing method that handles the rich morphology of the Arabic language; second, the translation itself, including word sense disambiguation. The main innovative features of this approach are the adaptation of the Naive Bayesian approach with new features that account for properties of the Arabic language, and the exploitation of a large parallel corpus to find the correct sense based on its cohesion with words in the training corpus. We expect that the resulting system will overcome the problem of absent vowel signs, which is the main reason for translation ambiguity between Arabic and other languages.
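As a rough illustration of the Naive Bayesian idea mentioned in this abstract, the sketch below picks the sense whose training contexts best match the words around an ambiguous token. It is a generic bag-of-words classifier with add-one smoothing, not the authors' adapted feature set. The function names, the sense labels, and the toy English context words are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count senses and context words. `examples` is a list of (sense, context_words)."""
    sense_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for sense, words in examples:
        sense_counts[sense] += 1
        for word in words:
            word_counts[sense][word] += 1
            vocab.add(word)
    return sense_counts, word_counts, vocab


def disambiguate(context, model):
    """Return the sense with the highest log-posterior under a Naive Bayes model."""
    sense_counts, word_counts, vocab = model
    total = sum(sense_counts.values())
    best_sense, best_score = None, float("-inf")
    for sense, count in sense_counts.items():
        score = math.log(count / total)  # log prior of the sense
        denom = sum(word_counts[sense].values()) + len(vocab)
        for word in context:
            # Add-one smoothing keeps unseen words from zeroing the product.
            score += math.log((word_counts[sense][word] + 1) / denom)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense


# Toy, hypothetical training data: two senses of one unvowelled written form.
examples = [
    ("books", ["library", "shelf", "read"]),
    ("books", ["read", "library", "many"]),
    ("wrote", ["letter", "he", "yesterday"]),
    ("wrote", ["he", "letter", "pen"]),
]
model = train(examples)
chosen = disambiguate(["he", "letter"], model)
```

In a translation setting, the training contexts would presumably come from the parallel corpus, with each sense identified by its English rendering; that mapping is not specified in the abstract.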
ISBN (digital): 9781728185262
ISBN (print): 9781728185279
In this work, we focus on end-to-end speech recognition for a less-resourced language, Amharic. The result can be integrated with other tasks such as spoken content retrieval. We explored three models, which consist of Convolutional Neural Networks, Recurrent Neural Networks, and Connectionist Temporal Classification, for end-to-end speech recognition on a less-resourced language. Further, we studied the possibility of building an end-to-end system with 1-best output while keeping the network parameters and computational resources minimal. The paper focuses on finding a more suitable sub-lexical unit for the Amharic end-to-end speech recognition system, one that can also be used as an audio indexing unit. We present the first results comparing grapheme-, phoneme-, and syllable-based end-to-end speech recognition systems for our target language. The models are evaluated on an Amharic speech corpus of approximately 52 hours containing read speech, audiobooks, and multi-genre radio programs. On the test set, we report a character error rate (CER) of 19.21% and a syllable error rate (SER) of 39.98% for a syllable-based end-to-end model without an integrated lexicon or language model.
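For readers unfamiliar with the two metrics reported here, both CER and SER are conventionally computed as the edit distance between reference and hypothesis, divided by the reference length, with characters or syllables as the unit. The sketch below follows that standard definition; it is not taken from the paper, and the function names and placeholder tokens are assumptions.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (substitutions, insertions, deletions)."""
    prev = list(range(len(hyp) + 1))
    for i, ref_unit in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_unit in enumerate(hyp, start=1):
            cost = 0 if ref_unit == hyp_unit else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def error_rate(ref_units, hyp_units):
    """Edit distance normalised by reference length; units may be characters or syllables."""
    if not ref_units:
        raise ValueError("reference must not be empty")
    return edit_distance(ref_units, hyp_units) / len(ref_units)


# Character-level example with placeholder strings.
cer = error_rate(list("abcd"), list("abed"))
# Syllable-level example with hypothetical syllable tokens.
ser = error_rate(["sy1", "sy2", "sy3"], ["sy1", "sy3"])
```

Because one wrong character makes the whole enclosing syllable wrong, SER is typically higher than CER on the same output, which is consistent with the figures reported above.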
Author: Beel, Joeran (Trinity College Dublin, Department of Computer Science and Statistics, Knowledge and Data Engineering Group, ADAPT Centre, Ireland)
In this position paper, we question the current practice of calculating evaluation metrics for recommender systems as single numbers (e.g. precision p=.28 or mean absolute error MAE = 1.21). We argue that single numbe...
ISBN (print): 9781424492190
Policy engineering is the process of authoring IT management policies, detecting and resolving policy conflicts, and revising existing policies to accommodate changing IT resources, business goals, and business processes. Policy authoring is often followed by policy enforcement, where the actions specified by subjects are performed on targets (resources). In this paper, we study the use of semantically enhanced techniques, such as ontologies, to model resources and their corresponding actions, coupled with a mechanism that can accommodate frequent organizational change to model policy subjects. For the modeling of policy subjects, the rule-based Community-based Policy management will be used. This integration falls into the category of combining Description Logics (DL) and Logic Programs (LP). We aim to study this integration primarily in terms of overall system expressivity, but also in terms of minimizing the cognitive load perceived by policy authors. Such an evaluation can help identify shortfalls in the design of the software system or of the policy model used. To study the balance between modeling with DL and with LP techniques, part of the Trinity College Dublin statutes, a sufficiently complex real-world example, will be encoded.
ISBN (digital): 9781728185262
ISBN (print): 9781728185279
Convolutional Neural Networks have been shown to achieve state-of-the-art performance in computer vision. They have also become increasingly popular in speech recognition and other natural language processing tasks. In this study, we aim to design a lightweight Convolutional Neural Network architecture for the under-resourced end-to-end speech recognition task. We present a carefully designed 1-dimensional convolutional deep neural network architecture that achieves accuracy reasonable enough to be cascaded with spoken content retrieval systems. We explored the use of Convolutional Neural Networks with Connectionist Temporal Classification under resource-constrained conditions. We also show that an end-to-end system can deliver the best decoding result while keeping the network parameters and computation time to a minimum. The paper presents results for an Amharic syllable-based end-to-end speech recognition system that implements the designed model. The architecture is trained and evaluated on ≈70 hours of Amharic read speech, audiobooks, and multi-genre radio programs. On the development set, we report a character error rate of 12.60% and a syllable error rate of 27.28% without an integrated language model. Likewise, on the test set, a character error rate of 18.38% and a syllable error rate of 27.71% are reached.
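One common way to obtain a single best output from a Connectionist Temporal Classification model without a language model is greedy (best-path) decoding: take the highest-scoring label in each frame, collapse consecutive repeats, and drop the blank symbol. The abstract does not say which decoder the authors used, so the sketch below is only an assumption about what such a step could look like; the label set and frame scores are made up.

```python
def ctc_greedy_decode(frame_scores, labels, blank=0):
    """Best-path CTC decoding: argmax per frame, collapse repeats, remove blanks.

    frame_scores: one list of per-label scores for each time frame.
    labels: output units (e.g. syllables), with labels[blank] as the CTC blank.
    """
    best_path = [max(range(len(scores)), key=scores.__getitem__)
                 for scores in frame_scores]
    decoded = []
    previous = None
    for index in best_path:
        # Emit a label only when it differs from the previous frame and is not blank.
        if index != previous and index != blank:
            decoded.append(labels[index])
        previous = index
    return decoded


# Hypothetical three-symbol output layer and five frames of scores.
labels = ["<blank>", "a", "b"]
frames = [
    [0.1, 0.8, 0.1],    # a
    [0.1, 0.7, 0.2],    # a (repeat, collapsed)
    [0.9, 0.05, 0.05],  # blank (separates the two a's)
    [0.2, 0.7, 0.1],    # a
    [0.1, 0.1, 0.8],    # b
]
result = ctc_greedy_decode(frames, labels)
```

The blank frame between the two runs of `a` is what allows a genuinely repeated unit to survive the collapse step.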
Information technology (IT) has had a major impact, in both sociological and commercial terms. In the commercial world, the pursuit of new technology and working practices has often come at the expense of equal regard for the correct methods of managing the new technology. Contemporary IT techniques and methods include electronic patient records (EPR), which are normally implemented at the level of a local practice or hospital. However, EPR implementation has major organisational and cultural implications, which form the main focus of the paper.
Automatic analysis of sentiments expressed in large scale online reviews is very important for intelligent business applications. Sentiment classification is the most popular task of sentiment analysis, which is more ...
Arabizi is an informal written form of dialectal Arabic transcribed in Latin alphanumeric characters. It has a proven popularity on chat platforms and social media, yet it suffers from a severe lack of natural languag...
ISBN (print): 9783885791881
Social bookmark tools are rapidly emerging on the Web. In such systems, users set up lightweight conceptual structures called folksonomies. The reason for their immediate success is that no specific skills are needed to participate. In this paper, we specify a formal model for folksonomies, briefly describe our own system, BibSonomy, which allows for sharing both bookmarks and publication references, and discuss first steps towards emergent semantics.