This paper explores heterogeneous semantic data in Web 1.0, the Semantic Web, and Web 2.0 for topic-specific crawling and search. A statistical Semantic Association Model (SAM) is proposed to support semantic interoperability among four different models: thesauruses, categories, ontologies, and folksonomies. Based on this model, a focused crawling and semantic search framework is developed. In focused crawling of potentially related textual and semantic data, URLs are ordered before crawling and irrelevant Web pages are filtered out after crawling according to SAM-based semantic relevance ranking. To make the retrieved results more semantically related to user queries, approaches for SAM-based semantic query expansion and meta-search result aggregation are designed. Experiments show that the proposed model and framework effectively integrate both keyword data and heterogeneous semantic data for topic-specific crawling and search.
We study moving object classification in traffic video. Our aim is to classify moving objects into pedestrians, bicycles and vehicles. Due to the advantage of self-organizing feature map (SOM), an unsuper...
ISBN: 9783642272776 (print)
A remote-sensing yield estimation method was used to estimate the yield of winter wheat in Jiangsu province, *** first step of this study was to extract the planting area of winter wheat from environmental satellite images and a land-use map of Jiangsu province; meanwhile, correlation analyses were performed using 8-day composite Leaf Area Index (LAI) data from the Moderate Resolution Imaging Spectroradiometer (MODIS) and the statistical yield of corresponding ***, the average LAI was calculated at the optimal growth period, and the statistical yields of wheat for all counties were collected; the former was chosen as the independent variable and the latter as the dependent variable, and the regression model was ***, the accuracy and stability of the regression model were validated using the data of another *** results indicated that the yield estimation model at the provincial level was reliable: the Root Mean Square Error (RMSE) and the Mean Absolute Error (MAE) of the model were 12.1% and 9.7%, *** addition, the yield estimation system of winter wheat in Jiangsu province was constructed and published based on ArcMap and ArcGIS Server.
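The regression and validation step described in this abstract can be illustrated with a minimal sketch. It assumes a single-predictor ordinary least squares fit of yield on average LAI; the abstract does not state the form of the regression model. It also assumes that RMSE and MAE are reported as a percentage of the mean observed yield, which is one common convention but is not confirmed by the source. All data values and function names are hypothetical.

```python
import math

def fit_line(x, y):
    """Ordinary least squares for y = a + b*x with a single predictor."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b

def relative_errors(y_true, y_pred):
    """RMSE and MAE as a percentage of the mean observed value
    (assumed normalization, not taken from the abstract)."""
    n = len(y_true)
    mean_obs = sum(y_true) / n
    rmse = math.sqrt(sum((p - t) ** 2 for t, p in zip(y_true, y_pred)) / n)
    mae = sum(abs(p - t) for t, p in zip(y_true, y_pred)) / n
    return 100 * rmse / mean_obs, 100 * mae / mean_obs

# Hypothetical county-level data: average LAI (independent variable)
# and statistical yield (dependent variable) for the fitting year.
train_lai = [2.0, 3.0, 4.0, 5.0]
train_yield = [3.0, 4.0, 5.0, 6.0]
a, b = fit_line(train_lai, train_yield)

# Hypothetical data from another year, used only for validation.
valid_lai = [2.5, 4.5]
valid_yield = [3.7, 5.3]
predicted = [a + b * lai for lai in valid_lai]
rel_rmse, rel_mae = relative_errors(valid_yield, predicted)
print(f"relative RMSE = {rel_rmse:.1f}%, relative MAE = {rel_mae:.1f}%")
```

Keeping the validation year separate from the fitting year mirrors the abstract's check of the model's accuracy and stability on independent data.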
In image/video processing software and hardware products, low-complexity interpolation algorithms, such as cubic and spline methods, are commonly used. However, these methods tend to blur textures and produce jagged edges compared with adaptive methods such as NEDI and SAI. The Tanner graph based image interpolation algorithm handles edges and textures better, but at high computational complexity. Thanks to the high-performance parallel processing capability of today's GPUs, the use of complex algorithms in real-time applications is becoming possible. In this paper, we present a fast algorithm for Tanner graph based image interpolation and its implementation on GPU. In our algorithm, the image model training process of Tanner graph based image interpolation is greatly simplified. Experimental results show that the GPU implementation can be more than 47 times as fast as the CPU implementation.
FP-growth is a well-known frequent pattern mining algorithm for working with high-dimensional, large-scale data sets. It is also known for its high memory complexity owing to its recursive processing. In gen...
In this paper, we propose a novel method for fast detection of Common Visual Patterns (CVPs). The purpose of CVP detection is to find the correspondences between the common visual regions of two given partial duplicate images. Two major components of the proposed method account for its good performance. First, we establish the Radiate-Geometric-Model (RGM). The RGM is represented by a set of radiate structures, each of which is geometrically made up of a group of matched feature pairs. By utilizing the statistical information gained from the radiate structures, the RGM can not only quickly estimate the potential pairs of common regions but also organize the scale relationship between matched pairs into a compact form, hence increasing the detection speed substantially. Second, we formulate the RGM as a graph optimization problem that can be solved by the graph-shift method, making our algorithm capable of detecting CVPs for all kinds of correspondences. Experimental results show that our algorithm is at least 40 times faster than the state of the art while also achieving better detection performance.
In the traditional fuzzy support vector machine (FSVM), a membership function established in global scope reduces the membership of support vectors, and the FSVM based on dismissing margin increases the training speed, b...
This paper surveys research on the Resource Space Model (RSM). RSM is a classification-based, multi-dimensional and content-based space model for efficiently and effectively managing various resources. As a non-relational data model, it has a rather complete theoretical basis and significant applications in faceted search and the future cyber-physical society. Applications to picture resources and email resources are introduced.
How to use the limited memory and processing speed of a computer for rapid and accurate data mining has become an important research topic in stream data cluster analysis. A stream data clustering algorithm based on the minimum spanning tree (MSTSC) is described. MSTSC is divided into online processing and offline clustering. Stream data are analyzed online using two groups of processing units. In the offline process, clusters are taken as representative objects and the minimum spanning tree algorithm is used for clustering. MSTSC can improve the clustering quality on non-spherical clusters. Experiments are carried out on both real and synthetic data sets. Results show that MSTSC not only deals with non-spherical clusters effectively, but also has better efficiency and clustering quality. In addition, MSTSC is insensitive to the order of input data and performs well on skewed class distributions.
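The offline step, minimum spanning tree clustering over representative objects, can be sketched as follows. This is a generic illustration, not the MSTSC algorithm itself. It assumes the representative objects are 2-D points, builds the tree with Prim's algorithm on Euclidean distances, and forms k clusters by removing the k-1 longest tree edges. The cutting rule and the names `minimum_spanning_tree` and `mst_clusters` are assumptions for illustration; the abstract does not specify how clusters are extracted from the tree.

```python
import math

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph.
    Returns a list of (weight, i, j) tree edges."""
    n = len(points)
    in_tree = [False] * n
    best = [math.inf] * n   # cheapest known connection cost to the tree
    parent = [-1] * n
    best[0] = 0.0
    edges = []
    for _ in range(n):
        # Pick the cheapest vertex that is not yet in the tree.
        u = min((i for i in range(n) if not in_tree[i]), key=lambda i: best[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((best[u], parent[u], u))
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best[v]:
                    best[v] = d
                    parent[v] = u
    return edges

def mst_clusters(points, k):
    """Cut the k-1 longest MST edges (assumed rule) and label the components."""
    edges = sorted(minimum_spanning_tree(points))
    kept = edges[: len(points) - k]
    label = list(range(len(points)))

    def find(i):
        # Union-find root lookup with path halving.
        while label[i] != i:
            label[i] = label[label[i]]
            i = label[i]
        return i

    for _, i, j in kept:
        label[find(i)] = find(j)
    return [find(i) for i in range(len(points))]

# Hypothetical representative objects forming two elongated groups.
reps = [(0, 0), (1, 0), (2, 0), (3, 0), (10, 0), (10, 1), (10, 2)]
labels = mst_clusters(reps, k=2)
print(labels)
```

Because clusters grow along tree edges and are not built around centroids, elongated groups such as the two chains above stay intact, which is the usual reason MST-based methods suit non-spherical clusters.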
ISBN: 9781457702099 (print)
Relevance feedback based on SVM classifiers has recently shown good performance, but the finite feedback count limited by the user's patience and the small sample size problem are not solved well. Co-SVM does a good job of solving these problems but still has some flaws. We propose three strategies to improve this algorithm: (1) different kernel functions are used to characterize the color and texture visual similarities; (2) a new method is proposed to calculate the confidence scores of the contention samples; (3) a batch of the most irrelevant images with the highest confidence scores is added to the labeled images to extend the size of the labeled data while a batch of images is chosen for user labeling. Experimental results verify the superiority of our method over Co-SVM.