Search results - Inner Mongolia University Library

CloudLCA: finding the lowest common ancestor in metagenome analysis using cloud computing

Protein & Cell, 2012, Vol. 3, No. 2, pp. 148-152

Authors: Guoguang Zhao, Dechao Bu, Changning Liu, Jing Li, Jian Yang, Zhiyong Liu, Yi Zhao, Runsheng Chen. Affiliations: Bioinformatics Research Group, Key Laboratory of Intelligent Information Processing, Advanced Computing Research Laboratory, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China; Bioinformatics Laboratory and National Laboratory of Biomacromolecules, Institute of Biophysics, Chinese Academy of Sciences, Beijing 100101, China; State Key Laboratory for Molecular Virology and Genetic Engineering, National Institute for Viral Disease Control and Prevention, Chinese Center for Disease Control and Prevention, Beijing 100176, China; Graduate School of the Chinese Academy of Sciences, Beijing 100190, China

Estimating taxonomic content constitutes a key problem in metagenomic sequencing data ***, extracting such content from high-throughput data of next-generation sequencing is very time-consuming with the currently available ***, we present CloudLCA, a parallel LCA algorithm that significantly improves the efficiency of determining taxonomic composition in metagenomic data *** show that CloudLCA (1) has a running time nearly linear with the increase of dataset magnitude, (2) displays linear speedup as the number of processors grows, especially for large datasets, and (3) reaches a speed of nearly 215 million reads each minute on a cluster with ten thin *** comparison with MEGAN, a well-known metagenome analyzer, the speed of CloudLCA is up to 5 more times faster, and its peak memory usage is approximately 18.5% that of MEGAN, running on a fat *** can be run on one multiprocessor node or a *** is expected to be part of MEGAN to accelerate analyzing reads, with the same output generated as MEGAN, which can be imported into MEGAN directly to finish the following ***, CloudLCA is a universal solution for finding the lowest common ancestor, and it can be applied in other fields requiring an LCA algorithm.

Keywords: CloudLCA; metagenome analysis; cloud computing
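
To illustrate the lowest-common-ancestor step that the CloudLCA abstract refers to, here is a minimal single-process sketch. It is not the paper's parallel implementation. The toy taxonomy `PARENT` and both function names are hypothetical, and the sketch assumes a read is assigned to the deepest taxon that is an ancestor of all taxa it matched.

```python
# Toy taxonomy stored as child -> parent; the root maps to None.
# All names here are illustrative assumptions, not data from the paper.
PARENT = {
    "root": None,
    "Bacteria": "root",
    "Proteobacteria": "Bacteria",
    "Escherichia": "Proteobacteria",
    "Salmonella": "Proteobacteria",
    "Firmicutes": "Bacteria",
    "Bacillus": "Firmicutes",
}


def path_to_root(taxon):
    """Return the list [taxon, parent, ..., root]."""
    path = []
    while taxon is not None:
        path.append(taxon)
        taxon = PARENT[taxon]
    return path


def lowest_common_ancestor(taxa):
    """Return the deepest taxon that is an ancestor of (or equal to) every taxon given."""
    taxa = list(taxa)
    if not taxa:
        raise ValueError("need at least one taxon")
    # Candidate ancestors of the first taxon, ordered from deepest to root.
    candidates = path_to_root(taxa[0])
    for taxon in taxa[1:]:
        ancestors = set(path_to_root(taxon))
        # Keep only candidates shared with this taxon, preserving depth order.
        candidates = [c for c in candidates if c in ancestors]
    return candidates[0]


if __name__ == "__main__":
    print(lowest_common_ancestor(["Escherichia", "Salmonella"]))
    print(lowest_common_ancestor(["Escherichia", "Bacillus"]))
```

Because each read is processed independently in this sketch, the per-read work could in principle be distributed across workers, which is the kind of parallelism the abstract describes.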

Iterative Annotation Transformation with Predict-Self Reestimation for Chinese Word Segmentation

Conference on Empirical Methods in Natural Language Processing

Authors: Wenbin Jiang, Fandong Meng, Qun Liu, Yajuan Lue. Affiliation: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences

ISBN: (print) 9781622765034

In this paper we first describe the technology of automatic annotation transformation, which is based on the annotation adaptation algorithm (Jiang et al., 2009). It can automatically transform a human-annotated corpus from one annotation guideline to another. We then propose two optimization strategies, iterative training and predict-self reestimation, to further improve the accuracy of annotation guideline transformation. Experiments on Chinese word segmentation show that the iterative training strategy together with predict-self reestimation brings significant improvement over the simple annotation transformation baseline, and leads to classifiers with significantly higher accuracy and several times faster processing than annotation adaptation does. On the Penn Chinese Treebank 5.0, it achieves an F-measure of 98.43%, significantly outperforming previous work despite using a single classifier with only local features.

Keywords: ANNOTATIONS Classifiers Iterative Word Chinese Chinese philosophy Chinese art AUTOMATIC

Mining association rules on Qing Court medical records: Semantic abstraction and standardization

2012 IEEE 12th International Conference on Computer and Information Technology, CIT 2012

Authors: Cao, Cong; Wang, Weimin; Cao, Cungen; Zang, Liangjun. Affiliations: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China; Graduate University, Chinese Academy of Sciences, Beijing, China; School of Computer Science and Engineering, Jiangsu University of Science and Technology, Zhenjiang, China

ISBN: (print) 9780769548586

To explore the association relations among disease, pathogenesis, physician, symptoms and drug, we adapt a variant of the Apriori algorithm for discovering association rules on a dataset of the Qing Court Medical Records. There are five types of semantic associations we intend to discover: Disease-Pathogenesis-Drug set (DPaD), Disease-Symptoms-Drug set (DSyD), Disease-Drug set (DD), Disease-Physician-Drug set (DPhD) and Disease-Drug Category set (DDC). To solve the synonymy problem and the data sparseness problem, we give a mapping strategy that maps pathogenesis to standardized forms and maps drugs to drug categories. With the mapping strategy, the number of frequent drug sets rises from 287 to 1184. The experimental results indicate that our method with the mapping strategy is an effective way to acquire valuable semantic association rules. © 2012 IEEE.

Keywords: Association rules
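
The effect of the mapping strategy described in this abstract can be sketched with a plain level-wise Apriori frequent-itemset search. This is a generic textbook Apriori, not the authors' adapted variant. The mapping table `DRUG_CATEGORY`, the item names, and the function names are all hypothetical and serve only to show how mapping sparse items onto shared categories can raise the number of frequent itemsets.

```python
from itertools import combinations

# Hypothetical drug -> category mapping; the names are purely illustrative.
DRUG_CATEGORY = {"drug_a1": "cat_A", "drug_a2": "cat_A", "drug_b1": "cat_B"}


def map_items(transaction):
    """Replace each drug by its category; items without a mapping are kept unchanged."""
    return frozenset(DRUG_CATEGORY.get(item, item) for item in transaction)


def frequent_itemsets(transactions, min_support):
    """Level-wise Apriori: return {itemset: count} for itemsets seen in >= min_support transactions."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = {frozenset([i]) for i in items}  # candidate 1-itemsets
    result = {}
    k = 1
    while current:
        # Count how many transactions contain each candidate.
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        frequent = {c: n for c, n in counts.items() if n >= min_support}
        result.update(frequent)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in frequent for b in frequent if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(s) in frequent for s in combinations(c, k - 1))
        }
    return result


if __name__ == "__main__":
    records = [
        {"disease_x", "drug_a1"},
        {"disease_x", "drug_a2"},
        {"disease_x", "drug_b1"},
    ]
    raw = frequent_itemsets(records, min_support=2)
    mapped = frequent_itemsets([map_items(r) for r in records], min_support=2)
    print(len(raw), len(mapped))
```

In this toy data no individual drug reaches the support threshold, while the shared category does, so the mapped run finds more frequent itemsets than the raw run.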

Functions of microRNA (in English)

Progress in Biochemistry and Biophysics (生物化学与生物物理进展), 2012, Vol. 39, No. 10, pp. 979-980

Authors: 罗海涛 (Luo Haitao), 赵屹 (Zhao Yi). Affiliations: Bioinformatics Research Group, Advanced Computing Research Laboratory, Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences; Research Center of Bioinformatics, Kunshan RNAi Institute

MicroRNAs (miRNAs) are a class of small non-coding RNAs that play important roles in post-transcriptional regulation of gene expression [1]. A large number of miRNAs have been found to be involved in a broad spectrum of biological functions such as regulation of innate and adaptive immunity, cell differentiation and development as well as

Keywords: renal cancer; IL-8; epithelial-mesenchymal transition; protein kinase C

TINAC: A fast and effective web video topic detection framework

International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

Authors: Xiang Ao, Fuzhen Zhuang, Qing He, Zhongzhi Shi. Affiliation: The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, China

Most previous approaches to web video topic detection (e.g., graph-based co-clustering methods) struggle with real-time topic detection because they suffer from high computational complexity. Fast topic detection is therefore needed to meet users' or administrators' requirements in real-world scenarios. Along this line, we propose a fast and effective topic detection framework in which video streams are first partitioned into buckets using a time-window function, then an incremental hierarchical clustering algorithm is applied, and finally a video-based fusion strategy is used to integrate information from multiple modalities. Furthermore, a series of novel similarity metrics is defined in the framework. The experimental results on three months of YouTube videos demonstrate the effectiveness and efficiency of the proposed method.

Keywords: Clustering algorithms; Measurement; Partitioning algorithms; Streaming media; Vectors; Visualization; Complexity theory
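
The first stage mentioned in this abstract, partitioning a video stream into buckets with a time-window function, might look like the sketch below. The abstract does not define the window function, so fixed-width, non-overlapping windows are an assumption here, as are the function name and the `(timestamp, video_id)` input shape.

```python
from collections import defaultdict


def bucket_by_time_window(videos, window_seconds):
    """Group (timestamp, video_id) pairs into consecutive fixed-width time windows.

    Assumes non-overlapping windows of equal width; returns the non-empty
    buckets in chronological order.
    """
    if window_seconds <= 0:
        raise ValueError("window_seconds must be positive")
    buckets = defaultdict(list)
    for timestamp, video_id in videos:
        # Integer window index: all timestamps in [k*w, (k+1)*w) share index k.
        buckets[int(timestamp // window_seconds)].append(video_id)
    return [buckets[index] for index in sorted(buckets)]


if __name__ == "__main__":
    stream = [(0, "a"), (3599, "b"), (3600, "c"), (9000, "d")]
    print(bucket_by_time_window(stream, window_seconds=3600))
```

Each bucket could then be clustered on its own and merged incrementally, which is one way to keep the per-step cost bounded as the stream grows.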

Nested granular local learning

IEEE International Conference on Granular Computing (GRC)

Authors: Hong Hu, Zhongzhi Shi. Affiliation: Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, China

Local learning approaches are especially easy to parallelize, so they are very important for cloud computing. In 1997, Lotfi A. Zadeh proposed the concept of Granular Computing (GrC). Zadeh proposed that three basic concepts underlie human cognition: granulation, organization and causation, with a granule being a clump of points (objects) drawn together by indistinguishability, similarity, proximity or functionality. In this paper, we present a novel local learning approach based on the concept of granular computing, named "nested granular local learning" (NGLL). The experiment shows that the NGLL approach outperforms probabilistic latent semantic analysis (PLSA).

Keywords: Probabilistic logic; Pattern recognition

Deadline partitioning for periodic messages transmission over Switched Ethernet with non-preemptive EDF

Journal of Computational Information Systems, 2012, Vol. 8, No. 17, pp. 7171-7180

Authors: Tan, Ming; Wei, Zhen. Affiliations: Hefei University, Key Laboratory of Network and Intelligent Information Processing, Hefei, China; Hefei University of Technology, School of Computer and Information, Hefei, China

When Switched Ethernet is applied in real-time communications, the switch and the end nodes schedule real-time messages using the Earliest Deadline First (EDF) algorithm. The problem is how to divide the deadlines of real-time messages between the transmission link (from source node to switch) and the reception link (from switch to destination node), which is referred to as the deadline partitioning problem. In this paper, an improved feasibility check method for periodic real-time messages scheduled under non-preemptive EDF is presented. In addition, a feasibility analysis of periodic real-time messages whose instances are released early is given and proved using real-time scheduling theory. In particular, we propose an algorithm for calculating the minimum non-preemptive EDF-feasible deadline of real-time messages. Moreover, a novel deadline partitioning scheme called MDPS (Deadline Partitioning Scheme based Deadline Minimization) is developed. By computing the minimum non-preemptive EDF-feasible deadlines and balancing the slack deadline of real-time messages on the transmission link and the reception link, MDPS can optimize the deadline partitioning of real-time messages. The simulation results show that MDPS performs better than the currently used deadline partitioning scheme, referred to as ADPS, in terms of aggregated switch throughput and message miss ratio. © 2012 Binary Information Press.

Keywords: Scheduling algorithms
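
The scheduling policy this abstract builds on, non-preemptive EDF on a single link, can be illustrated with a small simulation. This is only a sketch of the basic policy, not the paper's feasibility check or the MDPS scheme. The function name and the job tuple layout `(name, release, duration, deadline)` are assumptions made for the example.

```python
import heapq


def simulate_np_edf(jobs):
    """Simulate work-conserving non-preemptive EDF on one link.

    jobs: iterable of (name, release, duration, deadline) with absolute deadlines.
    Returns a list of (name, start, finish, deadline_met) in transmission order.
    """
    pending = sorted(jobs, key=lambda job: job[1])  # ordered by release time
    ready = []      # min-heap of (deadline, sequence, job)
    schedule = []
    time = 0
    index = 0
    sequence = 0    # tie-breaker so the heap never compares job tuples
    while index < len(pending) or ready:
        # Admit every message released by the current time.
        while index < len(pending) and pending[index][1] <= time:
            heapq.heappush(ready, (pending[index][3], sequence, pending[index]))
            sequence += 1
            index += 1
        if not ready:
            # Link is idle: jump to the next release.
            time = pending[index][1]
            continue
        # Pick the earliest deadline; once started, it runs to completion.
        _, _, (name, _release, duration, deadline) = heapq.heappop(ready)
        start = time
        time = start + duration
        schedule.append((name, start, time, time <= deadline))
    return schedule


if __name__ == "__main__":
    messages = [("m1", 0, 4, 10), ("m2", 1, 2, 4), ("m3", 1, 1, 12)]
    for entry in simulate_np_edf(messages):
        print(entry)
```

In the usage example, `m2` has the earliest deadline but is released while `m1` is already on the link, so it is blocked until `m1` finishes. This blocking is what makes feasibility analysis under non-preemptive EDF harder than in the preemptive case.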

Semi-supervised expert metadata extraction based on co-training style

International Conference on Fuzzy Systems and Knowledge Discovery (FSKD)

Authors: Youmin Zhang, Zhengtao Yu, Li Liu, Jianyi Guo, Cunli Mao. Affiliations: The School of Information Engineering and Automation, The Intelligent Information Processing Key Laboratory, Kunming University of Science and Technology, Kunming, China

To address the problem that supervised learning requires large amounts of labeled training data to extract expert metadata, a semi-supervised expert metadata extraction method based on co-training style is proposed. First, according to the characteristics of expert metadata, we select expert metadata features and label a certain number of metadata samples, then train two classifiers, one with maximum entropy and one with conditional random fields. Second, the two classifiers are used to label metadata items in the unlabeled expert home pages. When the classification results for one type of metadata in an expert page satisfy the confidence requirement, the differences between the labels assigned by the two classifiers to each type of metadata are analyzed; for metadata satisfying the difference requirement, the classifier that performs better on that metadata type is selected to label it, and the labeled expert home page is then taken as a labeled sample. Finally, the labeled expert home pages are used to extend the training samples and retrain two new classifiers, and the process iterates until the two classifiers converge. In the experiment, we collected 2000 expert home pages; the results indicate that the semi-supervised expert metadata extraction method based on co-training style outperforms a number of supervised methods and effectively reduces the amount of manual labeling work.

Keywords: Data mining; Accuracy; Feature extraction; Training; Classification algorithms; Organizations; Labeling

Multi-task Semi-supervised Semantic Feature Learning for Classification

IEEE International Conference on Data Mining (ICDM)

Authors: Changying Du, Fuzhen Zhuang, Qing He, Zhongzhi Shi. Affiliation: The Key Laboratory of Intelligent Information Processing, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, China

Multi-task learning has proven to be useful to boost the learning of multiple related but different tasks. Meanwhile, latent semantic models such as LSA and LDA are popular and effective methods to extract discriminative semantic features of high dimensional dyadic data. In this paper, we present a method to combine these two techniques by introducing a new matrix tri-factorization based formulation for semi-supervised latent semantic learning, which can incorporate labeled information into traditional unsupervised learning of latent semantics. Our inspiration for multi-task semantic feature learning comes from two facts: 1) multiple tasks generally share a set of common latent semantics, and 2) a semantic usually has a stable indication of categories no matter which task it is from. Thus, to make multiple tasks learn from each other, we wish to share the associations between categories and those common semantics among tasks. Along this line, we propose a novel joint nonnegative matrix tri-factorization framework with the aforesaid associations shared among tasks in the form of a semantic-category relation matrix. Our new formulation for multi-task learning can simultaneously learn (1) discriminative semantic features of each task, (2) predictive structure and categories of unlabeled data in each task, (3) common semantics shared among tasks and specific semantics exclusive to each task. We give an alternating iterative algorithm to optimize our objective and theoretically show its convergence. Finally, extensive experiments on text data, along with comparisons with various baselines and three state-of-the-art multi-task learning algorithms, demonstrate the effectiveness of our method.

Keywords: Semantics; Optimization; Feature extraction; Data mining; Data models; Joints; Iterative methods
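
For orientation, the generic single-task, unsupervised nonnegative matrix tri-factorization objective that formulations of this kind build on can be written as below. This is the standard general form, not the paper's joint multi-task objective, which the abstract does not spell out. The symbols $X$, $F$, $S$, $G$ and the dimensions $m$, $n$, $k$, $c$ are notation assumed here.

```latex
\min_{F \ge 0,\; S \ge 0,\; G \ge 0} \; \left\| X - F S G^{\top} \right\|_F^2,
\qquad
X \in \mathbb{R}_{\ge 0}^{m \times n},\;
F \in \mathbb{R}_{\ge 0}^{m \times k},\;
S \in \mathbb{R}_{\ge 0}^{k \times c},\;
G \in \mathbb{R}_{\ge 0}^{n \times c}
```

Alternating schemes for such objectives typically update one factor at a time while holding the other two fixed.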

Memory performance prediction of web server applications based on grey system theory

14th Asia Pacific Web Technology Conference, APWeb 2012

Authors: Huang, Faliang; Zhang, Shichao; Yuan, Changan; Zhong, Zhi. Affiliations: Faculty of Software, Fujian Normal University, Fuzhou 350007, China; Faculty of Engineering and Information Technology, University of Technology, Broadway, P.O. Box 123, Sydney, NSW 2007, Australia; Science Computing and Intelligent Information Processing, GuangXi Higher Education Key Laboratory, Nanning 530023, China

ISBN: (print) 9783642292521

With the success of the Internet, more and more companies have started to run web-based businesses. While running e-business sites, many companies have encountered unexpected degradation of their web server applications' performance, which may lead to loss of customers. Many managers wish to have a decision-support tool that can answer questions such as "Will my web server application's performance degrade?" and "What are the main reasons for the degradation?". In this paper we first propose a new memory performance prediction model for web server applications based on grey system theory. Then, a software system, "Memory Performance Manager" (MPM), is developed for predicting the memory performance of web server applications. Extensive experiments demonstrate the effectiveness of MPM in predicting web server memory performance. © 2012 Springer-Verlag Berlin Heidelberg.

Keywords: Managers
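
The abstract does not say which grey model the prediction is built on. As a rough illustration of grey-system forecasting in general, the sketch below implements GM(1,1), a widely used grey model, in plain Python. The function name, the sample readings, and the choice of GM(1,1) itself are assumptions, not details from the paper.

```python
import math


def gm11_forecast(series, steps):
    """Forecast `steps` future values of a positive series with a GM(1,1) grey model.

    Fits x0(k) + a * z1(k) = b by least squares, where x1 is the cumulative
    sum of the series and z1(k) is the mean of consecutive x1 values.
    Fails (division by zero) if the fitted development coefficient a is 0,
    for example on a constant series.
    """
    n = len(series)
    if n < 4:
        raise ValueError("GM(1,1) is usually fitted on at least 4 points")
    # Accumulated generating operation (cumulative sum).
    x1 = []
    total = 0.0
    for value in series:
        total += value
        x1.append(total)
    # Background values and least-squares targets.
    z = [0.5 * (x1[k] + x1[k - 1]) for k in range(1, n)]
    y = list(series[1:])
    m = n - 1
    sum_z = sum(z)
    sum_y = sum(y)
    sum_zz = sum(v * v for v in z)
    sum_zy = sum(zv * yv for zv, yv in zip(z, y))
    det = m * sum_zz - sum_z * sum_z
    # Closed-form solution of the 2x2 normal equations for y = -a*z + b.
    a = (sum_z * sum_y - m * sum_zy) / det
    b = (sum_zz * sum_y - sum_z * sum_zy) / det

    def x1_hat(k):
        """Fitted cumulative value at 0-based index k."""
        return (series[0] - b / a) * math.exp(-a * k) + b / a

    # Undo the accumulation to get forecasts of the original series.
    return [x1_hat(k) - x1_hat(k - 1) for k in range(n, n + steps)]


if __name__ == "__main__":
    memory_mb = [100.0, 110.0, 121.0, 133.1, 146.41]  # made-up readings
    print(gm11_forecast(memory_mb, steps=2))
```

GM(1,1) suits short, roughly exponential series, which is one reason grey models are often chosen when only a few monitoring samples are available.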
