With the success of the internet, more and more companies have recently started to run web-based businesses. While running e-business sites, many companies have encountered unexpected degeneration of their web server applications ...
PLSA (Probabilistic Latent Semantic Analysis) is a popular topic modeling technique for exploring document collections. Due to the increasing prevalence of large datasets, there is a need to improve the scalability of ...
There are a number of leaf recognition methods, but most of them are based on Euclidean space. In this paper, we introduce a new feature description for leaf image recognition, which represents the leaf co...
In delay tolerant networks (DTNs), messages are delivered opportunistically through store-carry-and-forward relaying, and every DTN node expects cooperation from others in forwarding data. U...
Traditional machine-learning algorithms are struggling to handle the exceedingly large amount of data being generated by the internet. In real-world applications, there is an urgent need for machine-learning algorithms that can handle large-scale, high-dimensional text data. Cloud computing involves the delivery of computing and storage as a service to a heterogeneous community of recipients. Recently, it has aroused much interest in industry and academia. Most previous work on cloud platforms focuses only on parallel algorithms for structured data. In this paper, we focus on the parallel implementation of web-mining algorithms and develop a parallel web-mining system that includes a parallel web crawler; parallel text extract, transform and load (ETL) and modeling; and parallel text mining and application subsystems. The complete system enables various real-world web-mining applications for mass data.
In this paper, we present a scalable implementation of a topic-modeling-based method (Adaptive Link-IPLSA) for online event analysis, which summarizes the gist of a massive amount of changing tweets and enables users to e...
Cross-media data is an outstanding characteristic of the age of big data, with its large scale and complicated processing tasks. This article presents five issues and briefly summarizes research progress in cross-media knowledge discovery. Furthermore, we propose a framework for cross-media semantic understanding that contains discriminative modeling, generative modeling and cognitive modeling. For cognitive modeling, we propose a new model, CAM, which is suitable for cross-media semantic understanding. Moreover, a Cross-Media Intelligent Retrieval System (CMIRS) is illustrated. Finally, research directions and the problems encountered are presented.
Searching for frequent patterns in transactional databases is considered one of the most important data mining problems, and Apriori is one of the typical algorithms for this task. Developing fast and efficient algorith...
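For readers unfamiliar with Apriori, the following is a minimal, single-machine sketch of the textbook level-wise algorithm, not the method developed in the paper above (whose abstract is truncated here). The function name `apriori`, the `min_support` parameter (an absolute transaction count), and the sample transactions are all illustrative assumptions.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {frozenset(itemset): support_count} for all frequent itemsets.

    min_support is an absolute count of transactions (an assumption of
    this sketch; many formulations use a fraction instead).
    """
    transactions = [frozenset(t) for t in transactions]

    # Level 1: count every single item.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: n for s, n in counts.items() if n >= min_support}
    result = dict(frequent)

    k = 2
    while frequent:
        prev = set(frequent)
        # Join step: unite pairs of frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        # Count step: one pass over the transactions per level.
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: n for s, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
    return result


# Hypothetical sample data, for illustration only.
baskets = [
    ["bread", "milk"],
    ["bread", "diapers", "beer"],
    ["milk", "diapers", "beer"],
    ["bread", "milk", "diapers"],
    ["bread", "milk", "beer"],
]
frequent_itemsets = apriori(baskets, min_support=3)
```

The prune step is what distinguishes Apriori from brute-force enumeration: a candidate is counted only if all of its subsets were already found frequent, which keeps the number of database passes and candidate sets manageable.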
ISBN: 9781622761715 (print)
We present a hierarchical chunk-to-string translation model, which can be seen as a compromise between the hierarchical phrase-based model and the tree-to-string model that combines the merits of both. With the help of shallow parsing, our model learns rules consisting of words and chunks while also introducing syntactic cohesion. Under the weighted synchronous context-free grammar defined by these rules, our model searches for the best translation derivation and yields the target translation simultaneously. Our experiments show that our model significantly outperforms the hierarchical phrase-based model and the tree-to-string model on English-Chinese translation tasks.
Concept learning in information systems is actually performed in the knowledge granular space of those systems. However, little attention has been paid so far to studying this knowledge granular space, and its structural characteristics are still poorly understood. In this paper, the granular space is first topologized and decomposed into granular worlds. It is then modeled as a bounded lattice. Finally, using graph theory, the resulting bounded lattice is expressed as a Hasse diagram, so that the mechanism of concept learning in information systems can be explained visually. With the related properties of topological spaces, bounded lattices and graph theory, the "mysterious" granular space can be explored more deeply. This work can form a basis for designing concept learning algorithms and can enrich the theory of granular computing.
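As a small illustration of the last step, a finite bounded lattice can be drawn as a Hasse diagram by keeping only its covering pairs, that is, pairs x < y with no element strictly between them. The sketch below is not the paper's construction: the function name `hasse_edges` is hypothetical, and the divisors of 12 ordered by divisibility stand in for a granular space purely as an assumed example of a bounded lattice (bottom 1, top 12).

```python
def hasse_edges(elements, leq):
    """Return the covering pairs (x, y) of a finite partial order.

    `leq(a, b)` must return True when a <= b in the order. A pair (x, y)
    is a covering pair when x < y and no z satisfies x < z < y.
    """
    def lt(a, b):
        return a != b and leq(a, b)

    edges = []
    for x in elements:
        for y in elements:
            if lt(x, y) and not any(lt(x, z) and lt(z, y) for z in elements):
                edges.append((x, y))
    return edges


# Assumed example lattice: divisors of 12 under divisibility.
divisors = [1, 2, 3, 4, 6, 12]
edges = hasse_edges(divisors, lambda a, b: b % a == 0)
```

The check is cubic in the number of elements, which is acceptable for small illustrative lattices; larger orders would call for a proper transitive-reduction routine.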