In the previous chapters of this book quite different approaches to create networks based on existing data collections (Part II) have been discussed and diverse methods for network analysis have been proposed (Part II...
In this paper, we address the problem of data compression, which is critical in wireless sensor networks. We propose a novel Topology-based Data Compression (TDC) algorithm for wireless sensor networks. We utilize the...
After having been a term of reflection in philosophy as well as psychology for ages, the fascination with human wisdom finally reaches the realms of computer science. In comparison to the first philosophical definitio...
Given a multi-features data set, a best preference query (BPQ) computes the maximal preference score (MPS) that the tuples in the data set can achieve with respect to a preference function. BPQs are very useful in app...
Surprising a user with unexpected and fortunate recommendations is a key challenge for recommender systems. Motivated by the concept of bisociations, we propose ways to create an environment where such serendipitous r...
When consuming data from federated domains, it is often necessary to identify the relationships that exist between the data schemas used in each domain. Discovering the exact nature of these relationships is difficult due to data set schema heterogeneity. Prior work has focused on inter-domain class equivalence. However, it is not always possible to find an equivalent class in both schemas. This happens, for example, when instances are modeled as classes in one domain (e.g. router type) but as the attribute values of a single class in the other domain (e.g. router interface). This paper investigates whether, when classifying instances in one data set against a second schema, it may be more useful to perform the classification using some attribute (or attribute group) other than the original class type. A machine-learning based classification approach to appropriate attribute selection is presented, and its operation is evaluated using two large data sets available on the web as Linked Data. The classification problem is compounded by the less formal semantics of Linked Data when compared to full ontologies, but this also highlights the strength of our approach in dealing with noisy or under-specified data sets and schemas. The experimental results show that our attribute selection approach is capable of discovering appropriate mappings for cases where the correspondence is conditioned on one attribute, and that information gain provides a suitable scoring function for selecting correspondence patterns to describe these complex attribute-based mappings.
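As a rough illustration of how information gain can score candidate attributes, the following minimal sketch ranks attributes by how well their values separate instances into target-schema classes. It is not the authors' implementation: the function names, the record layout, and the toy router data (`interface`, `vendor`, `target_class`) are all hypothetical and chosen only to echo the router example in the abstract.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * log2(c / total) for c in counts.values())


def information_gain(rows, attribute, target):
    """Reduction in entropy of `target` after splitting `rows` on `attribute`."""
    base = entropy([row[target] for row in rows])
    groups = {}
    for row in rows:
        groups.setdefault(row[attribute], []).append(row[target])
    # Weighted average entropy of the target within each attribute value.
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder


def best_attribute(rows, candidates, target):
    """Pick the candidate attribute with the highest information gain."""
    return max(candidates, key=lambda a: information_gain(rows, a, target))


# Hypothetical instances from a source data set, each already labelled with
# the class it should map to in the target schema (assumed toy data).
rows = [
    {"interface": "ethernet", "vendor": "A", "target_class": "EdgeRouter"},
    {"interface": "ethernet", "vendor": "B", "target_class": "EdgeRouter"},
    {"interface": "fiber", "vendor": "A", "target_class": "CoreRouter"},
    {"interface": "fiber", "vendor": "B", "target_class": "CoreRouter"},
]

chosen = best_attribute(rows, ["interface", "vendor"], "target_class")
```

In this toy data the `interface` attribute fully determines the target class while `vendor` carries no information about it, so `interface` is the attribute a gain-based selector would condition the mapping on.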
Data mining is gaining societal momentum due to the ever-increasing availability of large amounts of human data, easily collected by a variety of sensing technologies. Data mining comes with unprecedented opportunities and risks: a deeper understanding of human behavior and of how our society works is overshadowed by a greater chance of privacy intrusion and of unfair discrimination based on the extracted patterns and profiles. Although methods that address privacy or discrimination in data mining independently have been proposed in the literature, we argue that privacy and discrimination risks should be tackled together, and we present a methodology for doing so when publishing frequent pattern mining results. We describe a combined pattern sanitization framework that yields patterns protected against both privacy and discrimination risks, while introducing reasonable (controlled) pattern distortion.
To enable discovery in large, heterogeneous information networks, a tool is needed that allows exploration in changing graph structures and integrates advanced graph mining methods in an interactive visualization frame...
Currently, most work on interval-valued problems focuses on attribute reduction (i.e., feature selection) using rough set technologies. Less research has been conducted on building classifiers for interval-valued problems, which makes this a promising direction. In this paper, we propose a classification approach based on interval-valued fuzzy rough sets. First, the concept of interval-valued fuzzy granules is proposed, which is the crucial notion for building the reduction framework for interval-valued databases. Second, the idea of keeping the critical value invariant before and after reduction is adopted. Third, the structure of the reduction rule is studied completely using the discernibility vector approach. After the rule inference system is described, a set of rules covering all the objects can be obtained, which is used as a rule-based classifier for future classification. Finally, numerical examples are presented to illustrate the feasibility and effectiveness of the proposed method in the application of privacy protection.
The characteristics of decisions and the evaluation of their outcome are highly complex. In this paper, we first give a short analysis of different types of decisions such as long-term and short-term decisions or dile...