ISBN: (print) 3540440259
In this paper we present a method for the automatic discovery and tuning of term similarities. The method is based on the automatic extraction of significant patterns in which terms tend to appear. In addition, we use lexical and functional similarities between terms to define a hybrid similarity measure as a linear combination of the three similarities (pattern-based, lexical and functional). We then present a genetic algorithm approach to supervised learning of the parameters used in this linear combination. We used a domain-specific ontology to evaluate the generated similarity measures and set the direction of their convergence. The approach has been tested and evaluated in the domain of molecular biology.
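As a rough illustration of the linear combination described above, the following Python sketch combines three similarity scores with a weight per component. The function name `hybrid_similarity`, the argument names and the example weights are assumptions made for illustration; the abstract does not give the actual formula, the value ranges or the weights found by the genetic algorithm.

```python
def hybrid_similarity(contextual, lexical, functional, weights):
    """Linear combination of three term-similarity scores.

    `weights` is a hypothetical (alpha, beta, gamma) triple; in the
    approach described above, such parameters would be the ones tuned
    by the genetic algorithm. The values used below are illustrative.
    """
    alpha, beta, gamma = weights
    return alpha * contextual + beta * lexical + gamma * functional


# Illustrative scores for one pair of terms (made-up numbers).
score = hybrid_similarity(0.8, 0.5, 0.2, weights=(0.5, 0.3, 0.2))
print(round(score, 3))
```

A supervised tuning procedure would evaluate candidate weight triples against a reference such as the domain-specific ontology and keep the ones that agree with it best; that search is not shown here.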
A software concept for the automated design of a multi-spindle drilling gear machine used in furniture production is proposed. Finding an optimised design of the target machine means finding the minimum number of supports and gears as well as the optimised configuration of the multi-spindle drilling gears. To this end, an automated system based on pattern identification, knowledge discovery and an automated decision process is explained. Transferring acquired manual design experience from the human expert to a software strategy for solving the multi-criteria optimisation problem will reduce costs during machine design.
Mobile commerce (MC) is expected to occupy an important part of electronic commerce despite a number of problems such as device restrictions, limited content and expensive charging systems. Functions that automatically extend the limited content provided for MC, and that efficiently support commercial transactions under an expensive charging system, are essential for the uptake of MC. In this paper we propose a next-generation intelligent mobile commerce system that enables a series of e-commerce activities such as searching, ordering and settlement on the mobile device, including the functions mentioned above. The proposed system has been designed and implemented, and its effectiveness has been confirmed through experiments.
For a data warehouse to be useful, it must be correct and consistent. Hence, when selecting the data sources for building the data warehouse, it is essential to know exactly the concept and structure of all possible data sources and the dependencies between them. In a perfect world, this knowledge stems from an integrated, enterprise-wide data model. In reality, however, an explicit model is often not available. This paper proposes an approach for identifying data sources for a data warehouse even without detailed knowledge of the interdependencies between data sources. Furthermore, the approach allows the number of potential data sources to be restricted. It thereby reduces the time needed to build and maintain a data warehouse and increases the quality of its data.
Most inherently distributed systems require self-diagnosis and on-line monitoring. This is especially true in the domains of power transmission and mobile communication. Much effort has been expended in developing on-site monitoring systems for distributed power transformers and mobile communication base stations. In this paper, a new approach is employed to implement autonomous self-diagnosis and on-site monitoring using multi-agents on mobile communication base stations.
The automatic induction of classification rules from examples in the form of a classification tree is an important technique used in data mining. One of the problems encountered is the overfitting of rules to training data. In some cases this can lead to an excessively large number of rules, many of which have very little predictive value for unseen data. This paper describes a means of reducing overfitting known as J-pruning, based on the J-measure, an information-theoretic means of quantifying the information content of a rule. It is demonstrated that using J-pruning generally leads to a substantial reduction in the number of rules generated and an increase in predictive accuracy. The advantage gained becomes more pronounced as the proportion of noise increases.
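For readers unfamiliar with the J-measure mentioned above, the following Python sketch computes it for a single rule of the form "if Y = y then X = x", using the standard definition J = p(y) * [p(x|y) log2(p(x|y)/p(x)) + (1 - p(x|y)) log2((1 - p(x|y))/(1 - p(x)))]. The function and parameter names are assumptions for illustration, and the sketch covers only the measure itself, not the pruning procedure, whose details the abstract does not give.

```python
import math


def j_measure(p_y, p_x, p_x_given_y):
    """J-measure of the rule 'if Y = y then X = x'.

    p_y          probability that the rule antecedent holds
    p_x          prior probability of the rule consequent
    p_x_given_y  probability of the consequent given the antecedent
    """
    if not 0.0 < p_x < 1.0:
        raise ValueError("prior p_x must lie strictly between 0 and 1")

    def term(posterior, prior):
        # By convention 0 * log(0) is taken to be 0.
        if posterior == 0.0:
            return 0.0
        return posterior * math.log2(posterior / prior)

    # Cross-entropy between the posterior and prior class distributions.
    j = term(p_x_given_y, p_x) + term(1.0 - p_x_given_y, 1.0 - p_x)
    return p_y * j


# A rule that fires half the time and is always right about a 50/50 class.
print(j_measure(p_y=0.5, p_x=0.5, p_x_given_y=1.0))
```

A rule whose posterior equals the prior carries no information and scores zero, which is why a low J value can serve as a signal that further specialising a rule is unlikely to help.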
Identifying and quantifying the relevance of input features is particularly useful in data mining when dealing with ill-understood problems defined by real-world data. Conventional methods, such as statistics and correlation analysis, appear to be less effective because the data for such problems usually contain high levels of noise and the actual distributions of attributes are unknown. This paper presents a neural-network-based method to identify relevant input features and quantify their general and specified relevance. An application to a real-world problem, osteoporosis prediction, demonstrates that the method is able to quantify the impact of risk factors and then select the most salient ones to train neural networks for improved prediction accuracy.
This paper describes and evaluates T3, an algorithm that builds trees of depth at most three and achieves high accuracy whilst keeping the size of the tree reasonably small. T3 is an improvement over T2 in that it builds larger trees and adopts a less greedy approach. T3 gave better results than both T2 and C4.5 when run against publicly available data sets: T3 decreased classification error on average by 47% and generalisation error by 29% compared to T2; and T3 resulted in 46% smaller trees and 32% less classification error compared to C4.5. Due to its way of handling unknown values, T3 outperforms C4.5 in generalisation by 99% to 66% on a specific medical dataset.
We present the functional and architectural specification of BIKMAS, a Bioinformatics Knowledge Management System. BIKMAS contains an interactive user interface, a database in which several sources of knowledge are registered, and a knowledge management nucleus implemented with an algorithm that filters scientific information and assists the user in the task of using knowledge. BIKMAS is an active information system capable of retrieving, processing and filtering scientific information, checking it for consistency, and structuring the relevant information for efficient distribution and convenient use. Two of the most important aspects of BIKMAS are that the system is based on an object-oriented database and that it has been developed in Java, tightly integrated with the Internet.
In this paper, we investigate data abstractions for mining association rules with numerical conditions and boolean consequents as a target class. Our abstraction corresponds to joining consecutive primitive intervals of a numerical attribute. If the interclass variance for two adjacent intervals is less than a given admissible upper bound epsilon, they are combined into an extended interval. Intuitively speaking, a low value of the variance means that the two intervals provide almost the same posterior class distributions. This implies that few properties or characteristics of the class would be lost by combining such intervals. We discuss a bottom-up method for finding maximally extended intervals, called the maximal appropriate abstraction. Based on such an abstraction, we can reduce the number of extracted rules while preserving almost the same quality as rules extracted without abstraction. The usefulness of our abstraction method is shown by preliminary experimental results.
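The bottom-up merging of adjacent intervals described above can be sketched in Python as follows. The abstract does not define the interclass variance, so this sketch assumes one plausible reading: the between-group variance of the boolean class indicator across the two intervals. The `Interval` type, the function names and the example data are likewise hypothetical, and the repeated left-to-right pass is a simple stand-in rather than the authors' procedure.

```python
from typing import List, NamedTuple


class Interval(NamedTuple):
    lo: float   # lower bound of the interval
    hi: float   # upper bound of the interval
    n: int      # number of records falling in the interval (assumed > 0)
    pos: int    # number of those records in the target class


def interclass_variance(a: Interval, b: Interval) -> float:
    """Assumed definition: between-group variance of the class indicator."""
    n = a.n + b.n
    pooled = (a.pos + b.pos) / n
    return (a.n * (a.pos / a.n - pooled) ** 2
            + b.n * (b.pos / b.n - pooled) ** 2) / n


def merge_intervals(intervals: List[Interval], epsilon: float) -> List[Interval]:
    """Merge adjacent intervals until no pair falls below epsilon."""
    merged = list(intervals)
    changed = bool(merged)
    while changed:
        changed = False
        out = [merged[0]]
        for nxt in merged[1:]:
            cur = out[-1]
            if interclass_variance(cur, nxt) < epsilon:
                # Class distributions are close enough: extend the interval.
                out[-1] = Interval(cur.lo, nxt.hi, cur.n + nxt.n, cur.pos + nxt.pos)
                changed = True
            else:
                out.append(nxt)
        merged = out
    return merged


# Made-up primitive intervals: the first two have the same class ratio.
primitives = [Interval(0, 10, 10, 1), Interval(10, 20, 10, 1), Interval(20, 30, 10, 9)]
for interval in merge_intervals(primitives, epsilon=0.01):
    print(interval)
```

In this example the first two intervals share the same posterior class distribution and are joined, while the third differs sharply and is kept separate, so rules would be mined over two intervals instead of three.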