The content and structure of linked information, such as sets of web pages or research paper archives, are dynamic and keep changing. Although various methods have been proposed to exploit both the link structure and ...
Data indexing is common in data mining when working with high-dimensional, large-scale data sets. Hadoop, a cloud computing project using the MapReduce framework in Java, has become of significant interest in distribu...
The global expansion of the Web brings global computing, and the growing number of increasingly complex and sophisticated problems makes collaboration desirable. In this paper, we presented a seman...
Description logics are widely used to express structured data and provide reasoning facilities for querying and integrating data from different databases. This paper presents a many-sorted description logic MDL to represent r...
Service-oriented computing is a new computing paradigm that utilizes services as fundamental elements for developing applications. Service composition plays a very important role in it. This paper focuses on service c...
Syntax-based statistical translation models have been shown to be better than phrase-based models, especially for language pairs with very different syntactic structures, such as Chinese and English. In this talk I will introduce...
This paper illustrates the ICT Statistical Machine Translation system used in the evaluation campaign of the International Workshop on Spoken Language Translation 2010. We participate in the DIALOG tasks for Chinese-t...
Randić et al. proposed a significant graphical representation for DNA sequences, which is very compact and avoids loss of information. In this paper, we build a fast algorithm for this graphical representation wit...
In this paper, we propose a novel manifold alignment method by learning the underlying common manifold with supervision of corresponding data pairs from different observation sets. Different from the previous algorith...
In this paper we introduce a compactness-based clustering algorithm. The compactness of a data class is measured by comparing its inter-subset and intra-subset distances: the class compactness of a subset is defined as the ratio of the two distances. A subset is called an isolated cluster (or icluster) if its class compactness is greater than 1. All iclusters form a containment tree. We introduce monotonic sequences of iclusters to simplify the structure of the icluster tree, and design a clustering algorithm based on the simplified tree. The algorithm is effective on data sets whose clusters are nonlinearly separated, of arbitrary shape, or of different densities. Its effectiveness is demonstrated by experiments.
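The icluster test described in this abstract can be illustrated with a minimal Python sketch. The abstract does not say how the two distances are computed or which one is the numerator, so everything concrete here is an assumption: the inter-subset distance is taken as the smallest distance from a point in the subset to a point outside it, the intra-subset distance as the longest edge of the subset's minimum spanning tree, and the compactness as inter divided by intra. The function names are hypothetical and are not taken from the paper.

```python
import math


def inter_subset_distance(points, subset):
    """ASSUMPTION: smallest distance from a subset point to an outside point."""
    outside = [i for i in range(len(points)) if i not in subset]
    if not outside:
        return math.inf  # the whole data set has nothing outside it
    return min(math.dist(points[i], points[j]) for i in subset for j in outside)


def intra_subset_distance(points, subset):
    """ASSUMPTION: longest edge of the subset's minimum spanning tree (Prim)."""
    members = list(subset)
    if len(members) < 2:
        return 0.0
    root = members[0]
    # best[i] = shortest known distance from point i to the growing tree
    best = {i: math.dist(points[root], points[i]) for i in members[1:]}
    longest = 0.0
    while best:
        nxt = min(best, key=best.get)
        longest = max(longest, best.pop(nxt))
        for i in best:
            best[i] = min(best[i], math.dist(points[nxt], points[i]))
    return longest


def class_compactness(points, subset):
    """Ratio of the two distances (ASSUMPTION: inter divided by intra)."""
    intra = intra_subset_distance(points, subset)
    inter = inter_subset_distance(points, subset)
    return math.inf if intra == 0 else inter / intra


def is_icluster(points, subset):
    """A subset is an icluster if its class compactness is greater than 1."""
    return class_compactness(points, subset) > 1


if __name__ == "__main__":
    data = [(0, 0), (1, 0), (0, 1), (10, 10), (11, 10)]
    print(is_icluster(data, {0, 1, 2}))  # tight group far from the rest
    print(is_icluster(data, {0, 3}))     # two points from different groups
```

Under these assumptions a tight group that lies far from all other points passes the test, while a subset that mixes points from distant groups does not. Building the containment tree and the monotonic sequences of iclusters is not attempted here, since the abstract gives no detail on either.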