This article presents a survey of models of rough neurocomputing that have their roots in rough set theory. Historically, rough neurocomputing has three main threads: training set production, calculus of granules, and...
This paper presents work in progress on the development of an automatic term recognition (ATR) system built around the Corpus Científico-Técnico (CCT). Terms are modeled using three non-correlated dimensions: unithood, domainhood and usage, applied to a set of n-grams automatically extracted from the corpus. These dimensions are combined with a supervised machine-learning algorithm in order to classify n-grams as terms or non-terms. Results for both noise and silence are promising given the paucity of data employed for training. Moreover, error analysis on noise reveals that other information dimensions can be used to reduce noise significantly.
ISBN (print): 3540440259
A software concept for the automated design of a multi-spindle drilling gear machine used in the furniture production process is proposed. Finding an optimised design of the target machine means finding the minimum number of supports and gears as well as the optimised configuration of the multi-spindle drilling gears; an automated system based on pattern identification, knowledge discovery and an automated decision process is explained for this purpose. Transferring acquired manual design experience from the human expert to a software strategy for solving the multi-criteria optimisation problem will achieve cost reductions during machine design.
In this paper we propose the use of the dominant point method for Chinese character recognition. We compare the performance of three classifiers on the same inputs: a statistical linear classifier, a machine-learning C4.5 ...
We describe the multilingual Named Entity Recognition and Classification (NERC) subpart of an e-retail product comparison system which is currently under development as part of the EU-funded project CROSSMARC. The system must be rapidly extensible, both to new languages and to new domains. To achieve this aim we use XML as our common exchange format, and the monolingual NERC components use a combination of rule-based and machine-learning techniques. It has been challenging to process web pages which contain heavily structured data where text is intermingled with HTML and other code. Our preliminary evaluation results demonstrate the viability of our approach.
ISBN (print): 1853129259
One of the fundamental challenges for data mining is to enable inductive learning algorithms to operate on very large databases. Ensemble learning techniques such as bagging have been applied successfully to improve the accuracy of classification models by generating multiple models from replicate training sets and aggregating them to form a composite model. In this paper, we adapt the bagging approach for scaling up and also study the effects of data partitioning, sampling, and aggregation techniques for mining very large databases. Our recent work developed SORCER, a learning system that induces a near-minimal rule set from a data set represented as a second-order decision table (a database relation in which rows have sets of atomic values as components). Despite its simplicity, experiments show that SORCER is competitive with other state-of-the-art induction systems. Here we apply SORCER using two instance subset selection procedures (random partitioning and sampling with replacement) and two aggregation procedures (majority voting and selecting the model that performs best on a validation set). We experiment with the GIS data set from the UCI KDD Repository, which contains 581,012 instances of 30x30 meter cells with 54 attributes for classifying forest cover types. Performance results are reported, including results from mining the entire training data set using different compression algorithms in SORCER and published results from neural net and decision tree learners.
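The two subset selection procedures and two aggregation procedures named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: SORCER is not reproduced here, and `train_stump` is a hypothetical stand-in learner (a threshold on a single numeric feature with labels 0/1). All function names are assumptions made for illustration.

```python
import random
from collections import Counter


def random_partition(data, k, rng):
    """Random partitioning: split data into k disjoint subsets of near-equal size."""
    shuffled = data[:]
    rng.shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def bootstrap_samples(data, k, size, rng):
    """Sampling with replacement: draw k samples, each containing `size` instances."""
    return [[rng.choice(data) for _ in range(size)] for _ in range(k)]


def train_stump(subset):
    """Hypothetical placeholder learner (NOT SORCER).

    Picks the threshold and polarity on a single numeric feature that
    make the fewest errors on the subset, and returns a predict function.
    """
    best = None
    for t, _ in subset:
        for above in (0, 1):
            errors = sum((above if x >= t else 1 - above) != y for x, y in subset)
            if best is None or errors < best[0]:
                best = (errors, t, above)
    _, t, above = best
    return lambda x: above if x >= t else 1 - above


def majority_vote(models, x):
    """Aggregation 1: return the label predicted by most models."""
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]


def accuracy(predict, data):
    """Fraction of (x, y) pairs for which predict(x) == y."""
    return sum(predict(x) == y for x, y in data) / len(data)


def best_on_validation(models, validation):
    """Aggregation 2: keep the single model with the highest validation accuracy."""
    return max(models, key=lambda model: accuracy(model, validation))


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic toy data (an assumption for the demo): label is 1 when x >= 0.5.
    data = [(x, int(x >= 0.5)) for x in (rng.random() for _ in range(400))]
    train, validation = data[:300], data[300:]

    models = [train_stump(part) for part in random_partition(train, 5, rng)]
    print("majority vote:", accuracy(lambda x: majority_vote(models, x), validation))
    print("best single model:", accuracy(best_on_validation(models, validation), validation))
```

An odd number of models is used so that a two-class majority vote cannot tie. Swapping `random_partition` for `bootstrap_samples` changes only how the training subsets are drawn; both aggregation procedures work unchanged.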
ISBN (print): 9783540440253
The proceedings contain 89 papers. The special focus in this conference is on data mining and Knowledge Engineering. The topics include: mining frequent sequential patterns under a similarity constraint; pre-pruning classification trees to reduce overfitting in noisy domains; data mining for fuzzy decision tree structure with a genetic program; co-evolutionary data mining to discover rules for fuzzy resource management; discovering temporal rules from temporally ordered data; automated personalisation of internet users using self-organising maps; data abstractions for numerical attributes in data mining; calculating aggregates with range-encoded bit-sliced index; a classification algorithm for data mining; a hierarchical model to support Kansei mining process; indexing and mining of the local patterns in sequence database; a knowledge discovery by fuzzy rule based hopfield network; fusing partially inconsistent expert and learnt knowledge in uncertain hierarchies; organisational information management and knowledge discovery in email within mailing lists; design of multi-drilling gear machines by knowledge processing and machine simulation; approach based on hierarchically structured subject domain; a knowledge-based information extraction system for semi-structured labeled documents; measuring semantic similarity between words using lexical knowledge and neural networks; extraction of hidden semantics from web pages; self-organising maps for hierarchical tree view document clustering using contextual information; schema discovery of the semi-structured and hierarchical data; indexing and retrieving web document using computational and linguistic techniques; and an intelligent mobile commerce system with dynamic contents builder and mobile products browser.
Many practical applications are related to frequent sequential pattern mining, ranging from Web Usage Mining to Bioinformatics. To ensure an appropriate extraction cost for useful mining tasks, a key issue is to push ...
ISBN (print): 3540438181
The proceedings contain 32 papers. The special focus in this conference is on Bagging, Boosting, Ensemble learning and Neural Networks. The topics include: Support vector machines, kernel logistic regression and boosting; multiple classification systems in the context of feature extraction and selection; boosted tree ensembles for solving multiclass problems; distributed pasting of small votes; bagging and boosting for the nearest mean classifier; highlighting hard patterns via adaboost weights evolution; using diversity with three variants of boosting; multistage neural network ensembles; forward and backward selection in regression hybrid network; types of multinet system; discriminant analysis and factorial multiple splits in recursive partitioning for data mining; new measure of classifier dependency in multiple classifier systems; a discussion on the classifier projection space for classifier combining; on the general application of the tomographic classifier fusion methodology; post-processing of classifier outputs in multiple classifier systems; trainable multiple classifier schemes for handwritten character recognition; generating classifiers ensembles from multiple prototypes and its application to handwriting recognition; adaptive feature spaces for land cover classification with limited ground truth; stacking with multi-response model trees; on combining one-class classifiers for image database retrieval; bias-variance analysis and ensembles of SVM; an experimental comparison of fixed and trained rules for crisp classifiers outputs; reduction of the boasting bias of linear experts; analysis of linear and order statistics combiners for fusion of imbalanced classifiers; boosting and classification of electronic nose data; and content-based classification of digital photos.
Some challenges for Website designers are to provide correct and useful information to individual users with different backgrounds and interests, as well as to increase user satisfaction. Intelligent Web agents offer ...