Search results - Inner Mongolia University Library

A k-populations algorithm for clustering categorical data

PATTERN RECOGNITION, 2005, vol. 38, no. 7, pp. 1131-1134

Authors: Kim, DW; Lee, K; Lee, D; Lee, KH. Affiliations: Korea Adv Inst Sci & Technol, Dept BioSyst, Taejon 305701, South Korea; Korea Adv Inst Sci & Technol, Adv Informat Technol Res Ctr, Taejon 305701, South Korea; Korea Adv Inst Sci & Technol, Dept Elect Engn & Comp Sci, Taejon 305701, South Korea

In this paper, the conventional k-modes-type algorithms for clustering categorical data are extended by representing the clusters of categorical data with k-populations instead of the hard-type centroids used in the conventional algorithms. Use of a population-based centroid representation makes it possible to preserve the uncertainty inherent in data sets as long as possible before actual decisions are made. The k-populations algorithm was found to give markedly better clustering results through various experiments. (c) 2005 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
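The contrast the abstract draws between a hard centroid and a population-based one can be sketched as follows. This is an illustrative reading only, not the authors' formulation: the function names `mode_centroid`, `population_centroid`, and `dissimilarity`, and the frequency-based dissimilarity itself, are assumptions introduced here.

```python
from collections import Counter

def mode_centroid(rows):
    """Hard centroid: the single most frequent category per attribute."""
    return [Counter(col).most_common(1)[0][0] for col in zip(*rows)]

def population_centroid(rows):
    """Soft centroid: relative frequency of every category per attribute."""
    n = len(rows)
    return [{v: c / n for v, c in Counter(col).items()} for col in zip(*rows)]

def dissimilarity(row, centroid):
    """Assumed measure: sum over attributes of (1 - frequency of the row's category)."""
    return sum(1.0 - dist.get(v, 0.0) for v, dist in zip(row, centroid))

# Hypothetical toy cluster with two categorical attributes.
cluster = [("red", "small"), ("red", "large"), ("blue", "small")]
hard = mode_centroid(cluster)        # keeps only the winning category
soft = population_centroid(cluster)  # keeps the full category distribution
```

The soft centroid retains the minority categories ("blue", "large") that the mode discards, which is one way to keep uncertainty available until the final assignment.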

Keywords: clustering categorical data hierarchical algorithm k-modes algorithm fuzzy k-modes algorithm

Statistical discovery of site inter-dependencies in sub-molecular hierarchical protein structuring

EURASIP JOURNAL ON BIOINFORMATICS AND SYSTEMS BIOLOGY, 2012, vol. 2012, no. 1, pp. 1-18

Authors: Durston, Kirk K.; Chiu, David K. Y.; Wong, Andrew K. C.; Li, Gary C. L. Affiliations: 1. School of Computer Science, University of Guelph, 50 Stone Road East, Guelph, ON N1G 2W1, Canada; 2. Department of System Design Engineering, University of Waterloo, 200 University Ave. W, Waterloo, ON N2L 3G1, Canada

Background: Much progress has been made in understanding the 3D structure of proteins using methods such as NMR and X-ray crystallography. The resulting 3D structures are extremely informative, but do not always reveal which sites and residues within the structure are of special importance. Recently, there are indications that multiple-residue, sub-domain structural relationships within the larger 3D consensus structure of a protein can be inferred from the analysis of the multiple sequence alignment data of a protein family. These intra-dependent clusters of associated sites are used to indicate hierarchical inter-residue relationships within the 3D structure. To reveal the patterns of associations among individual amino acids or sub-domain components within the structure, we apply a k-modes attribute (aligned site) clustering algorithm to the ubiquitin and transthyretin families in order to discover associations among groups of sites within the multiple sequence alignment. We then observe what these associations imply within the 3D structure of these two protein families. Results: The k-modes site clustering algorithm we developed maximizes the intra-group interdependencies based on a normalized mutual information measure. The clusters formed correspond to sub-structural components or binding and interface locations. Applying this data-directed method to the ubiquitin and transthyretin protein family multiple sequence alignments as a test bed, we located numerous interesting associations of interdependent sites. These clusters were then arranged into cluster tree diagrams which revealed four structural sub-domains within the single domain structure of ubiquitin and a single large sub-domain within transthyretin associated with the interface among transthyretin monomers. In addition, several clusters of mutually interdependent sites were discovered for each protein family, each of which appear to play an important role in the molecular structure and/or function. Co
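The abstract does not state which normalization of mutual information the authors use. A minimal sketch of one common choice, mutual information divided by joint entropy, for two aligned sites (columns of a multiple sequence alignment) might look like this. The function names and the toy columns are assumptions for illustration.

```python
import math
from collections import Counter

def entropy(counts, n):
    """Shannon entropy in bits of a categorical distribution given as counts."""
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

def normalized_mutual_information(x, y):
    """I(X;Y) / H(X,Y) for two equal-length columns; returns 0.0 if H(X,Y) is 0."""
    n = len(x)
    hx = entropy(Counter(x), n)
    hy = entropy(Counter(y), n)
    hxy = entropy(Counter(zip(x, y)), n)
    if hxy == 0.0:
        return 0.0
    return (hx + hy - hxy) / hxy

# Hypothetical alignment columns: one residue per sequence at each site.
site_a = "AAGG"
site_b = "CCTT"  # varies in lockstep with site_a
site_c = "CTCT"  # varies independently of site_a
dependent = normalized_mutual_information(site_a, site_b)
independent = normalized_mutual_information(site_a, site_c)
```

Under this normalization the score lies between 0 (independent sites) and 1 (one site fully determines the other), so sites can be grouped by how strongly they co-vary.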

Keywords: k-modes algorithm; Site cluster; Associations; Ubiquitin; Transthyretin; Pattern discovery; Cluster tree; Attribute clustering; Protein structural sub-domains

Improved Clustering for Categorical Data with Genetic Algorithm

1st International Conference on Microelectronics, Computing & Communication Systems (MCCS)

Authors: Sharma, Abha; Thakur, R. S. Affiliations: Maulana Azad Natl Inst Technol, Bhopal, India

ISBN: (print) 9789811055652; 9789811055645

Clustering is the most significant unsupervised learning where the aim is to partition the data set into uniform groups called clusters. Many real-world data sets often contain categorical values, but many clustering algorithms work only on numeric values which limits its use in data mining. The k-modes algorithm is one of the very effective for proper partitions of categorical data sets, though the algorithm stops at locally optimum solution as depended on initial cluster centres. Proposed algorithm utilizes the genetic algorithm (GA) to optimize the k-modes clustering algorithm. The reason is, considering noise as cluster centres gives the high cost which will not fit for the next iteration and also not gets stuck to the suboptimal solutions. The superiority of proposed algorithm is demonstrated for several real-life data sets in terms of accuracy and proves it is efficient and can reveal encouraging results especially for the large datasets.

Keywords: Clustering; Categorical data; Genetic algorithm; k-modes algorithm

Genetic k-modes based DNA Splice Site Adjacent sequences Feature Analysis

7th World Congress on Intelligent Control and Automation

Authors: Zhang, Quanwei; Peng, Qinke; Sun, Hequan; Li, Kankan. Affiliations: Xi An Jiao Tong Univ, State Key Lab Mfg Syst Engn, Xian 710049, Peoples R China; Xi An Jiao Tong Univ, Sch Elect & Informat Engn, Xian 710049, Peoples R China

ISBN: (print) 9781424421138

DNA splice site adjacent sequences have remarkable conservative feature, and mining their underlying biological knowledge has become a key issue in the field of DNA sequences analysis. In this paper, we analyze the feature of human being's DNA splice site adjacent sequences. Firstly, we propose a kind of DNA splice site sequences clustering method based on Genetic k-modes; secondly, we analyze the frequency of various bases, di-bases and tri-bases about the experimental data set and each cluster; lastly, we propose one kind of Markov model based frequent patterns discovery algorithm and use it to mine the frequent patterns of the experimental data set and each cluster.

Keywords: clustering; genetic algorithm; k-modes algorithm; splice site; Markov model

Research on Improvement of Text Processing and Clustering Algorithms in Public Opinion Early Warning System

5th International Conference on Systems and Informatics (ICSAI)

Authors: Yang, Kongyu; Miao, Ruijie. Affiliations: Beijing Informat Sci & Technol Univ, Sch Publ Adm & Commun, Beijing, Peoples R China; Beijing Informat Sci & Technol Univ, Sch Informat Management, Beijing, Peoples R China

ISBN: (print) 9781728101200

In order to provide the necessary data for Public opinion monitoring and trend warning, this paper did some researches on text processing and clustering algorithms based on hot topics of the Weibo. Data that get from Weibo were classification data which contain two properties. To adapt this feature and meet the requirement of public opinion trends warning, hamming distance was used to do text similarity computing. By improving the traditional k-means algorithm, a new k-mode algorithm which is used to text clustering on hot topics was achieved. Simulation and results analysis indicated the text processing method was accurate and suitable to the microblog public opinion early warning.
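The abstract does not give the details of the improved algorithm. As a rough illustration of the two ingredients it names, Hamming distance over categorical records and a k-modes style assign-and-update loop, a minimal sketch could look like this. The function names, the toy records, and the choice of initial modes are all assumptions.

```python
from collections import Counter

def hamming(a, b):
    """Number of attribute positions where two categorical records differ."""
    return sum(x != y for x, y in zip(a, b))

def k_modes(rows, modes, max_iter=10):
    """Plain k-modes: assign each row to its nearest mode, then recompute modes."""
    for _ in range(max_iter):
        labels = [min(range(len(modes)), key=lambda j: hamming(r, modes[j]))
                  for r in rows]
        new_modes = []
        for j, old in enumerate(modes):
            members = [r for r, lab in zip(rows, labels) if lab == j]
            if not members:
                new_modes.append(old)  # keep the old mode for an empty cluster
                continue
            new_modes.append(tuple(Counter(col).most_common(1)[0][0]
                                   for col in zip(*members)))
        if new_modes == modes:
            break  # converged: no mode changed
        modes = new_modes
    return labels, modes

# Hypothetical records with two categorical properties each.
rows = [("sports", "pos"), ("sports", "pos"), ("sports", "neg"),
        ("politics", "neg"), ("politics", "neg")]
labels, modes = k_modes(rows, modes=[rows[0], rows[3]])
```

Because the result depends on the initial modes, a practical system would typically run several initializations and keep the lowest-cost partition.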

Keywords: Public opinion early warning; Text processing; Hamming distance; Text clustering; k-modes algorithm

The role of the O2O blended teaching model in improving the teaching effectiveness of physical education classes

JOURNAL OF INTELLIGENT SYSTEMS, 2024, vol. 33, no. 1

Authors: Qiao, Honghui. Affiliations: Henan Vocat Univ Sci & Technol, Publ Basic Educ Dept, Zhoukou 466000, Peoples R China

The deep fusion of Internet technology and education is constantly pushing forward the reform of university education. Traditional educational ideas, concepts, and models cannot keep pace with the times, and hybrid teaching has become a new way of education in colleges and universities. To improve the teaching effect of physical education classes, the study used a blended teaching model and designed a teaching evaluation and performance prediction model under the blended teaching model based on an improved cluster analysis method and attention mechanism. The lab results indicated that under the blended teaching model, students' performance increased by 12.89 points, and the level of skill mastery and proficiency increased by 26.52 and 28.55%, respectively, with grades more inclined to high score distribution. "Excellent" grade clustering increased by 77.71%, and "Good" grade clustering increased by 19.01%. The minimum error sum of squares of the improved clustering algorithm was 58.18 and 36.25% lower than the other two algorithms, and the clustering results were more relevant. The two-way attention mechanism algorithm predicted higher accuracy results and performed best on all four evaluation metrics, with a prediction accuracy of 98.23%, an accuracy of 98.42%, and an F1 value of 91.78%. This hybrid teaching model is more in line with the characteristics of the physical education teaching discipline, successfully cultivates students' independent learning ability, stimulates students' love for physical education courses, and achieves better teaching results.

Keywords: O2O blended learning k-modes algorithm predictive model evaluation metrics

Genetic k-modes based DNA Splice Site Adjacent sequences Feature Analysis

7th World Congress on Intelligent Control and Automation (WCICA 2008), vol.6

Authors: Quanwei Zhang; Qinke Peng; Hequan Sun; Kankan Li. Affiliations: State Key Laboratory for Manufacturing Systems Engineering and School of Electronic and Information Engineering, Xi'an Jiaotong University, Xi'an, China

DNA splice site adjacent sequences have remarkable conservative feature, and mining their underlying biological knowledge has become a key issue in the field of DNA sequences analysis. In this paper, we analyze the feature of human being's DNA splice site adjacent sequences. Firstly, we propose a kind of DNA splice site sequences clustering method based on Genetic k-modes;secondly, we analyze the frequency of various bases, di-bases and tri-bases about the experimental data set and each cluster;lastly, we propose one kind of Markov model based frequent patterns discovery algorithm and use it to mine the frequent patterns of the experimental data set and each cluster.

Keywords: Clustering; Genetic algorithm; k-modes algorithm; Splice site; Markov model
