While there has been lots of interest in code-switching in informal text such as tweets and online content, we ask whether code-switching occurs in the proceedings of multilingual institutions. We focus on the Canadia...
In this paper we present the first application of Native Language Identification (NLI) to Arabic learner data. NLI, the task of predicting a writer’s first language from their writing in other languages, has been most...
This paper deals with the problem of automatic language identification of noisy texts, which represents an important task in natural language processing. Actually, there exist several works in this field, which are ba...
A multilingual person writing a sentence or a piece of text tends to switch between languages s/he is proficient in. This alternation between languages, commonly known as code-switching, presents us with the problem of...
A mechanism of machine understanding for processing and resolving problems that users formulate in natural language is considered. The theory described is based on Minsky's model of thinking. An architecture and a software implementation of a computer system based on the described algorithm are presented.
Arabic on social media has all the properties of any language on social media that make it tough for natural language processing, plus some specific problems. These include diglossia, the use of an alternative alphabe...
This paper describes the CLAS system, which accepts natural language queries in the domain of music theory to perform passage retrieval from a musical score. This system was produced for participation in the C@merata MediaEval 2014 shared task. The system uses a domain-specific parser to interpret the query and answer generation methods based on feature unification. Performance on this task was encouraging, with 0.76 precision and 0.96 recall.
Twitter users demonstrate many characteristics via their online presence. Connections, community memberships, and communication patterns reveal both idiosyncratic and general properties of users. In addition, the cont...
Most representation learning algorithms for language and image processing are local, in that they identify features for a data point based on surrounding points. Yet in language processing, the correct meaning of a wo...
Supervised methods have been the dominant approach for Chinese word segmentation. The performance can drop significantly when the test domain is different from the training domain. In this paper, we study the problem ...