Traditional machine-learning algorithms are struggling to handle the exceedingly large amount of data generated by the internet. In real-world applications, there is an urgent need for machine-learning algorithms that can handle large-scale, high-dimensional text data. Cloud computing involves the delivery of computing and storage as a service to a heterogeneous community of recipients, and it has recently aroused much interest in industry and academia. Most previous work on cloud platforms focuses only on parallel algorithms for structured data. In this paper, we focus on the parallel implementation of web-mining algorithms and develop a parallel web-mining system that includes parallel web crawler; parallel text extract, transform and load (ETL) and modeling; and parallel text mining and application subsystems. The complete system enables various real-world web-mining applications on massive data.
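As an illustration only, the subsystem staging described above could be sketched as a three-stage worker-pool pipeline; the function names and bodies below are hypothetical placeholders, not the paper's implementation.

# Hypothetical sketch of a crawl -> ETL -> mine pipeline staged over a
# worker pool; the stage functions are placeholders, not the paper's code.
from multiprocessing import Pool

def crawl(url):
    # Placeholder: fetch a page and return its raw text.
    return "raw text of " + url

def etl(raw):
    # Placeholder: extract, clean, and tokenize the raw text.
    return raw.lower().split()

def mine(tokens):
    # Placeholder: count terms as a stand-in for a text-mining model.
    return {t: tokens.count(t) for t in set(tokens)}

if __name__ == "__main__":
    urls = ["http://example.com/a", "http://example.com/b"]
    with Pool(4) as pool:
        raw_pages = pool.map(crawl, urls)      # parallel crawler stage
        documents = pool.map(etl, raw_pages)   # parallel ETL stage
        models = pool.map(mine, documents)     # parallel mining stage
    print(models)

Each pool.map call parallelizes one subsystem over its inputs, mirroring the crawler, ETL, and mining flow of the system.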
Cross-media data is an outstanding characteristic of the big-data age, marked by large scale and complicated processing tasks. This article presents five issues and briefly summarizes the research progress of cross-media knowledge discovery. Furthermore, we propose a framework for cross-media semantic understanding that contains discriminative modeling, generative modeling and cognitive modeling. For cognitive modeling, we propose a new model, CAM, that is suitable for cross-media semantic understanding. Moreover, a Cross-Media Intelligent Retrieval System (CMIRS) is illustrated. Finally, the open research directions and remaining problems are presented.
PLSA(Probabilistic Latent Semantic Analysis) is a popular topic modeling technique for exploring document collections. Due to the increasing prevalence of large datasets, there is a need to improve the scalability of ...
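For reference, the PLSA model being scaled here factors P(w|d) as the sum over topics z of P(w|z)P(z|d) and is fit by EM; the toy NumPy sketch below shows the standard serial E-step and M-step, not the scalable variant this (truncated) abstract refers to.

# Standard PLSA via EM on a tiny term-document count matrix; an
# illustrative sketch only, with made-up counts.
import numpy as np

rng = np.random.default_rng(0)
n = np.array([[4, 0, 1], [0, 3, 2], [2, 1, 3]], float)  # n[d, w] counts
D, W, Z = n.shape[0], n.shape[1], 2

p_w_z = rng.random((Z, W)); p_w_z /= p_w_z.sum(1, keepdims=True)  # P(w|z)
p_z_d = rng.random((D, Z)); p_z_d /= p_z_d.sum(1, keepdims=True)  # P(z|d)

for _ in range(50):
    # E-step: P(z|d,w) proportional to P(z|d) P(w|z)
    post = p_z_d[:, :, None] * p_w_z[None, :, :]       # shape (D, Z, W)
    post /= post.sum(1, keepdims=True) + 1e-12
    # M-step: re-estimate from expected counts n(d,w) P(z|d,w)
    exp_n = n[:, None, :] * post                       # shape (D, Z, W)
    p_w_z = exp_n.sum(0); p_w_z /= p_w_z.sum(1, keepdims=True)
    p_z_d = exp_n.sum(2); p_z_d /= p_z_d.sum(1, keepdims=True)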
Concept learning in information systems is actually performed in the knowledge granular space of those systems. However, little attention has been paid to this knowledge granular space so far, and its structural characteristics remain poorly understood. In this paper, the granular space is first topologized and decomposed into granular worlds. It is then modeled as a bounded lattice. Finally, using graph theory, the resulting bounded lattice is expressed as a Hasse diagram, so the mechanism of concept learning in information systems can be visually explained. With the related properties of topological spaces, bounded lattices and graph theory, the "mysterious" granular space can be investigated more deeply. This work can form a basis for designing concept-learning algorithms and can enrich the theory of granular computing.
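To make the granule lattice concrete, the sketch below computes the partitions (granules) induced by attribute subsets of a toy information system and the refinement order between them; the objects, attributes and values are invented for illustration only.

# Granules as partitions induced by attribute subsets of a toy
# information system, ordered by refinement (the lattice order the
# paper topologizes); all names here are made up for illustration.
from itertools import combinations

table = {  # object -> attribute values (a0, a1)
    "x1": (0, 0), "x2": (0, 1), "x3": (1, 1), "x4": (1, 1),
}
attrs = [0, 1]

def partition(subset):
    """Partition objects by indiscernibility on an attribute subset."""
    blocks = {}
    for obj, vals in table.items():
        key = tuple(vals[a] for a in subset)
        blocks.setdefault(key, set()).add(obj)
    return frozenset(frozenset(b) for b in blocks.values())

def refines(p, q):
    """True if every block of p lies inside some block of q."""
    return all(any(b <= c for c in q) for b in p)

granules = {s: partition(s) for r in range(len(attrs) + 1)
            for s in combinations(attrs, r)}
for s, p in granules.items():
    print(s, sorted(sorted(b) for b in p))
# refines(granules[(0, 1)], granules[(0,)]) -> True: more attributes
# yield finer granules; these covering pairs are the edges one would
# draw in the Hasse diagram of the bounded lattice.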
ISBN (print): 9781467312882
This paper presents a novel WQA (Web Question Answering) approach based on the combination of CCG (Combinatory Categorial Grammar) and DL (Description Logic) ontology, in order to improve semantic-level accuracy through deep text-understanding capabilities. We propose DL-based semantic modeling, i.e., translating the lambda-expression encoding of question meaning into DL-based semantic representations. The advantage of this approach is the seamless exploitation of existing semantic resources encoded as DL ontologies, which are widespread in areas such as the Semantic Web and conceptual modeling. The experiments are conducted on a repository of complex Chinese questions that involve the satisfaction of object-property restrictions. The experimental results show that producing semantic representations with the combination of CCG parsing and DL reasoning is an effective approach to question understanding at the semantic level, in terms of both improved understanding accuracy and the exploitation of semantic resources.
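As a hedged illustration of the translation step, the toy function below maps a conjunctive lambda term such as λx. Movie(x) ∧ directedBy(x, Spielberg) to the DL concept Movie ⊓ ∃directedBy.{Spielberg}; the term encoding and names are ours, not the paper's representation format.

# Toy sketch of the lambda-to-DL translation step: a conjunctive
# lambda term λx.C(x) ∧ R(x, v) becomes the DL concept C ⊓ ∃R.{v}.
def to_dl(term):
    kind = term[0]
    if kind == "and":
        return " ⊓ ".join(to_dl(t) for t in term[1:])
    if kind == "class":          # C(x)   -> atomic concept C
        return term[1]
    if kind == "role":           # R(x,v) -> existential restriction
        return f"∃{term[1]}.{{{term[2]}}}"
    raise ValueError(kind)

# λx. Movie(x) ∧ directedBy(x, Spielberg)
q = ("and", ("class", "Movie"), ("role", "directedBy", "Spielberg"))
print(to_dl(q))  # Movie ⊓ ∃directedBy.{Spielberg}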
ISBN (print): 9781622761715
We present a hierarchical chunk-to-string translation model, which can be seen as a compromise between the hierarchical phrase-based model and the tree-to-string model, combining the merits of the two. With the help of shallow parsing, our model learns rules consisting of words and chunks while introducing syntactic cohesion. Under the weighted synchronous context-free grammar defined by these rules, our model searches for the best translation derivation and yields the target translation simultaneously. Our experiments show that our model significantly outperforms both the hierarchical phrase-based model and the tree-to-string model on English-Chinese translation tasks.
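To illustrate what a chunk-to-string rule looks like in use, the sketch below applies one synchronous rule whose source and target sides mix words with indexed chunk nonterminals; the rule and the chunk translations are invented examples, not rules learned by the paper's model.

# Applying one synchronous rule with chunk gaps: indexed chunk
# nonterminals on the target side are filled with the best
# translations of the corresponding source chunks.
rule_src = ["[NP,1]", "visited", "[NP,2]"]
rule_tgt = ["[NP,1]", "访问了", "[NP,2]"]

chunks = {1: "乔治", 2: "日内瓦"}  # hypothetical chunk translations

def apply_rule(tgt, chunks):
    out = []
    for tok in tgt:
        if tok.startswith("[") and tok.endswith("]"):
            idx = int(tok[1:-1].split(",")[1])
            out.append(chunks[idx])   # substitute chunk translation
        else:
            out.append(tok)           # copy terminal word
    return " ".join(out)

print(apply_rule(rule_tgt, chunks))  # 乔治 访问了 日内瓦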
We study visual learning models that can work efficiently with little ground-truth annotation and a mass of noisy unlabeled data for large-scale Web image applications, following the subroutine of semi-supervised learning (SSL) that has been deeply investigated in various visual classification tasks. However, most previous SSL approaches cannot incorporate multiple descriptions to enhance model capacity. Furthermore, sample selection on unlabeled data was not advocated in previous studies, which may lead to unpredictable risk from real-world noisy data corpora. We propose a learning strategy for solving these two problems. As a core contribution, we propose a scalable semi-supervised multiple kernel learning method (S³MKL) to deal with the first problem. The aim is to minimize an overall objective function composed of log-likelihood empirical loss, conditional expectation consensus (CEC) on the unlabeled data and group LASSO regularization on the model coefficients. We further adapt CEC into a group-wise formulation so as to better deal with the intrinsic visual properties of real-world images. We propose a fast block coordinate gradient descent method with several acceleration techniques for model solution. Compared with previous approaches, our model makes better use of large-scale unlabeled images with multiple feature representations, at lower time complexity. Moreover, to reduce the risk of using unlabeled data, we design a multiple kernel hashing scheme to identify an "informative" and "compact" unlabeled training data subset. Comprehensive experiments show that the proposed learning framework provides promising power for real-world image applications, such as image categorization and personalized Web image re-ranking with very little user interaction.
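Written out, the overall objective described above plausibly has the following shape; this is our reconstruction from the abstract (empirical loss plus CEC plus group LASSO), with the trade-off weights λ₁, λ₂ and the group structure 𝒢 assumed rather than taken from the paper.

% Assumed shape of the S³MKL objective described in the abstract:
% log-likelihood loss on labeled data L, conditional expectation
% consensus (CEC) on unlabeled data U, and a group-LASSO penalty.
\min_{\mathbf{w}} \;
  \sum_{i \in \mathcal{L}} -\log p(y_i \mid x_i; \mathbf{w})
  \; + \; \lambda_1 \, \Omega_{\mathrm{CEC}}(\mathbf{w}; \mathcal{U})
  \; + \; \lambda_2 \sum_{g \in \mathcal{G}} \lVert \mathbf{w}_g \rVert_2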
Software testing remains the key validation technique used by industry today, but it is error-prone and expensive. Automatically generating test cases from formal models of the system under test is a promising approach to cutting testing costs. This paper introduces a technique that automatically generates real-time conformance test cases from timed automata specifications. First, both the reactive system and its environment are modeled by restricted timed automata that are deterministic, input-enabled and output-urgent. We then demonstrate how to efficiently generate real-time test cases with optimal execution time from a diagnostic trace. Finally, we formally specify a user's test purpose or coverage criteria to convert the test-case generation problem into a reachability problem. The approach is implemented using model checkers as test-case generation tools, and experimental results on three different coverage-criteria specifications show the feasibility and effectiveness of our technique.
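As a rough illustration of the last step, a diagnostic trace returned by a model checker can be replayed as a test case: delays become waits, input actions become stimuli, and output actions become timed expectations. The trace encoding below is hypothetical, not the paper's format.

# Turning a (hypothetical) diagnostic trace into an executable test
# case: delays advance the clock, inputs are sent, outputs are checked.
trace = [("delay", 2), ("input", "coin"), ("delay", 5), ("output", "coffee")]

def trace_to_test(trace):
    steps, clock = [], 0
    for kind, val in trace:
        if kind == "delay":
            clock += val
        elif kind == "input":
            steps.append(f"at t={clock}: send '{val}' to the system")
        elif kind == "output":
            steps.append(f"by t={clock}: expect '{val}', else FAIL")
    return steps

for step in trace_to_test(trace):
    print(step)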
This research aims to evaluate the internal structure of concrete material configuration using an immersed ultrasonic computed tomography imaging technique. We propose a relative difference method of time of flight da...
When Particle Swarm Optimization is used to handle complex, high-dimensional functions, it suffers from slow convergence and sensitivity to local optima. We study the convergence of the particle swarm algorithm from the viewpoint of dynamic-system theory and give a condition for the convergence of the algorithm. The analysis provides qualitative guidelines for general algorithm parameter selection. Results of numerical tests show the efficiency of the approach.
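For context, the sketch below runs the standard PSO update on a sphere function; the stability condition noted in the comment (w < 1 and 0 < c1 + c2 < 2(1 + w) for the deterministic model) is a commonly cited one and stands in for the paper's specific condition, which the abstract does not state.

# Basic PSO on a sphere function. Parameters are chosen to satisfy the
# commonly cited deterministic-model stability condition
# w < 1 and 0 < c1 + c2 < 2 * (1 + w); this is our addition.
import numpy as np

rng = np.random.default_rng(1)
dim, swarm, iters = 10, 20, 200
w, c1, c2 = 0.7, 1.4, 1.4        # c1 + c2 = 2.8 < 2 * (1 + 0.7) = 3.4

x = rng.uniform(-5, 5, (swarm, dim))
v = np.zeros_like(x)
f = lambda p: (p ** 2).sum(axis=-1)   # objective: sphere function

pbest, pbest_f = x.copy(), f(x)
gbest = pbest[pbest_f.argmin()]

for _ in range(iters):
    r1, r2 = rng.random((2, swarm, dim))
    v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
    x = x + v
    fx = f(x)
    better = fx < pbest_f
    pbest[better], pbest_f[better] = x[better], fx[better]
    gbest = pbest[pbest_f.argmin()]

print(pbest_f.min())  # should approach 0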