ISBN: (print) 9798400713170
Stemmers are usually created manually by native speakers of the target language, on the basis of rules for suffix stripping or replacement. In such cases, linguistic knowledge of the target language is required. There are also approaches for the automatic creation of stemmers. These latter approaches are usually based on statistical processing of text corpora to calculate occurrence counts and other metrics for stems and affixes. In such cases, linguistic knowledge of the target language is not needed. We have devised a semi-automatic methodology (with a corresponding implementation) for stemmer generation that requires neither linguistic knowledge of the target language nor statistical processing of corpora; only Information Retrieval expertise is needed. Here we evaluate our approach against the rule-based (linguistic-knowledge-based) approach. To do so, we have a native speaker of the target language (Polish) evaluate two stemmers. The first stemmer is rule-based (it incorporates linguistic knowledge of Polish) and the second is based on our methodology (it incorporates only the experience of Information Retrieval experts). The results are interesting: they favor our methodology.
This study investigates the prediction of taxi trip durations in New York City using machine learning (ML) models and neural networks (NN). Three models: Linear Regression, Random Forest Regressor, and a Neural Network...
Printed Electronics (PE) technology has emerged as a promising alternative to silicon-based computing. It offers attractive properties such as on-demand ultra-low-cost fabrication, mechanical flexibility, and conforma...
Printed Electronics (PE) provide a mechanically flexible and cost-effective solution for machine learning (ML) circuits, compared to silicon-based technologies. However, due to large feature sizes, printed classifiers...
In the evolving domain of Human Activity Recognition (HAR) using Internet of Things (IoT) devices, there is an emerging interest in employing Deep Generative Models (DGMs) to address data scarcity, enhance data qualit...
The integration of machine learning with augmented reality (AR) is vital for traditional batik preservation, especially in real-time pattern recognition. Traditional AR applications often rely on external servers, inc...
The modern networking world is exposed to more risks, more frequently, every day. Most systems rely heavily on remaining anonymous throughout the whole endpoint exploitation process. Covert channels represent ...
This study proposes an autonomous system for detecting objects in the form of a ball which is assumed to be the trajectory of the PENSHIP ship robot. In following the trajectory, the ship robot is prohibited from bein...
Classification is the process of grouping classes and defining a class and determining the relationship between these classes. Landsat imagery with the distribution of residential areas and agricultural areas can be u...
The wheeled soccer robot competition is held annually in Indonesia and internationally through the Indonesian Robot Contest and Robocup Middle Size League (MSL) competitions. The rate of occurrence of the ball moving ...