Mobile phone data record people's calling logs in everyday life, which reflect their customs, patterns, and lifestyles. In this paper, we present approaches to urban activity analysis from real mobile phone locatio...
With the rapid development of location-sensing technologies such as GPS, a huge amount of location data is produced every day. The flood of taxi GPS data makes it possible to predict the plentitude of traffic ...
The reconfigurable manufacturing system is a cost-effective system that can accommodate the variety of equipment required by customers. However, because of the surprisingly increasing volume and semantically fuzzy natu...
The study of social networks has recently received a great deal of attention, since it lets us learn and understand many hidden properties of our society. This paper investigates the potential of social network analysis to se...
Sensor fusion is the combining of sensory data from disparate sources such that the resulting information is in some sense better than would be possible when these sources were used individually. The natural uncertain...
A greenhouse gas remote sensing monitoring system is an implementation of applied greenhouse gas remote sensing technologies. This paper discusses the business application mode, operation scheme and application technol...
Machinery and equipment descriptions play an important role in the design and manufacturing of industrial machinery devices and help reduce the design time and manufacturing costs of such devices. However, one ...
Recent years have witnessed the explosive growth of online social networks (OSNs), which provide a perfect platform for observing information propagation. Based on the theory of complex network analysis, consideri...
Schema summarization on large-scale databases is a challenge. In a typical large database schema, a large proportion of the tables are closely connected through a few high-degree tables. It is thus difficult to separate these tables into clusters that represent different topics. Moreover, because a schema can be very large, the schema summary needs to be structured into multiple levels to further improve usability. In this paper, we introduce a new schema summarization approach that uses community detection techniques from social networks. Our approach consists of three steps. First, we use a community detection algorithm to divide a database schema into subject groups, each representing a specific subject. Second, we cluster the subject groups into abstract domains to form a multi-level navigation structure. Third, we discover representative tables in each cluster to label the schema summary. We evaluate our approach on Freebase, a real-world large-scale database. The results show that our approach can identify subject groups precisely. The generated abstract schema layers are very helpful for users exploring the database.
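The first and third steps above can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration, not the authors' method: the abstract does not name its community detection algorithm, so a deterministic variant of label propagation stands in for it; the representative table is simply taken to be the highest-degree table in each group; and the toy schema (SCHEMA_EDGES) is hypothetical. The second step, grouping subject groups into abstract domains, is omitted.

```python
from collections import Counter, defaultdict

# Hypothetical toy schema: each pair is a foreign-key link between two tables.
SCHEMA_EDGES = [
    ("actor", "director"), ("actor", "film"), ("director", "film"),
    ("author", "book"), ("author", "publisher"), ("book", "publisher"),
    ("book", "film"),  # single link between the two subject areas
]


def build_graph(edges):
    """Build an undirected adjacency map from table-to-table links."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph


def label_propagation(graph, max_iters=100):
    """Step 1 (assumed algorithm): deterministic label propagation.

    Each table starts in its own group and repeatedly adopts the label
    most common among its neighbours; ties go to the smallest label.
    """
    labels = {node: node for node in graph}
    for _ in range(max_iters):
        changed = False
        for node in sorted(graph):
            counts = Counter(labels[n] for n in graph[node])
            if not counts:
                continue  # isolated table keeps its own label
            best = max(counts.values())
            new_label = min(l for l, c in counts.items() if c == best)
            if new_label != labels[node]:
                labels[node] = new_label
                changed = True
        if not changed:
            break
    groups = defaultdict(set)
    for node, label in labels.items():
        groups[label].add(node)
    return sorted(groups.values(), key=min)


def representative(graph, group):
    """Step 3 (assumed criterion): the table with the most links."""
    return max(sorted(group), key=lambda table: len(graph[table]))


graph = build_graph(SCHEMA_EDGES)
groups = label_propagation(graph)
for group in groups:
    print(representative(graph, group), sorted(group))
```

On this toy schema the sketch separates the film-related tables from the book-related tables and labels each group with its best-connected table. A real schema with a few very high-degree tables, as the abstract notes, would likely need a more robust algorithm and a better representativeness measure.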
Accurate classification of gene expression data offers great value in understanding tumor mechanisms and in effective clinical treatment. However, in real-world applications, there are often a large number of unlabeled samples and only a few labeled ones, so semi-supervised learning is applied to cancer classification. In this paper, a Local Reconstruction and Global Preserving Based Semi-Supervised Dimensionality Reduction (LRGPSSDR) method is proposed for cancer classification. LRGPSSDR makes full use of side information: it sets the edge weights of the neighborhood graph by minimizing the local reconstruction error, and it preserves both the global and the local geometric structure of the sampled data set. Experimental results on five public gene expression datasets show the superior performance of the method.
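The abstract does not give the objective used to set the edge weights. As an assumption for illustration only, a common way to set neighborhood-graph weights by minimizing local reconstruction error is the formulation used in locally linear embedding, where each sample $x_i$ is approximated by a weighted combination of its neighbors $N(i)$:

```latex
\min_{W} \; \varepsilon(W) = \sum_{i} \Bigl\| x_i - \sum_{j \in N(i)} w_{ij}\, x_j \Bigr\|^2
\quad \text{subject to} \quad \sum_{j \in N(i)} w_{ij} = 1 \ \text{for all } i .
```

Here $w_{ij}$ is the weight of the edge from $x_i$ to its neighbor $x_j$; the symbols are not taken from the paper, and the actual LRGPSSDR objective may differ.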