Search results - Inner Mongolia University Library

Biclustering data analysis: a comprehensive survey

BRIEFINGS IN BIOINFORMATICS, 2024, Vol. 25, No. 4, article bbae342

Authors: Castanho, Eduardo N.; Aidos, Helena; Madeira, Sara C. Affiliation: Univ Lisbon, Fac Ciencias, LASIGE, Campo Grande 16, P-1749016 Lisbon, Portugal

Biclustering, the simultaneous clustering of rows and columns of a data matrix, has proved its effectiveness in bioinformatics due to its capacity to produce local instead of global models, evolving from a key technique used in gene expression data analysis into one of the most used approaches for pattern discovery and identification of biological modules, used in both descriptive and predictive learning tasks. This survey presents a comprehensive overview of biclustering. It proposes an updated taxonomy for its fundamental components (bicluster, biclustering solution, biclustering algorithms, and evaluation measures) and applications. We unify scattered concepts in the literature with new definitions to accommodate the diversity of data types (such as tabular, network, and time series data) and the specificities of biological and biomedical data domains. We further propose a pipeline for biclustering data analysis and discuss practical aspects of incorporating biclustering in real-world applications. We highlight prominent application domains, particularly in bioinformatics, and identify typical biclusters to illustrate the analysis output. Moreover, we discuss important aspects to consider when choosing, applying, and evaluating a biclustering algorithm. We also relate biclustering with other data mining tasks (clustering, pattern mining, classification, triclustering, N-way clustering, and graph mining). Thus, it provides theoretical and practical guidance on biclustering data analysis, demonstrating its potential to uncover actionable insights from complex datasets.

Keywords: biclustering; biclustering taxonomy; biclustering algorithms; biclustering evaluation; heterogeneous biclustering; biclustering-based classification

Discovery of Evolving Relationships of Software Vulnerabilities

6th IEEE International Conference on Trust, Privacy and Security in Intelligent Systems, and Applications, TPS-ISA 2024

Authors: Sparks, Hailey; Ghosh, Krishnendu. Affiliation: College of Charleston, Department of Computer Science, SC, United States

ISBN: (print) 9798350386745

Discovering the risks posed by software vulnerabilities is a challenge. Software vulnerabilities are often not listed, and studies have shown that 50.3% of the reports do not include the list of vulnerable libraries. Thus, it becomes critical to maximize the understanding of the vulnerability reports and the trends in vulnerabilities. In this work, a novel tool is created that is able to connect the vulnerabilities with time and identify subsets of vulnerabilities that are *** the first step of the model, a text network is created from the vulnerability reports of each month in a year. Then, a temporal network is constructed from text networks. Temporal network theoretic properties are evaluated on the model to understand the vulnerability trends and the evolution of relationships amongst the vulnerabilities. The analysis leverages community detection algorithms on text networks. The dynamics of evolving relationships of the software vulnerabilities are extracted using biclustering algorithms, and the statistical significance of the biclusters is evaluated. Experimental results based on vulnerability reports are analyzed and presented. © 2024 IEEE.

Keywords: biclustering algorithms; Community Detection algorithms; Temporal Networks; Text Graphs; Vulnerability Reports

RUBic: rapid unsupervised biclustering

BMC BIOINFORMATICS, 2023, Vol. 24, No. 1, p. 435

Authors: Sriwastava, Brijesh K.; Halder, Anup Kumar; Basu, Subhadip; Chakraborti, Tapabrata. Affiliations: Govt Coll Engn & Leather Technol, Comp Sci & Engn Dept, Kolkata, India; Warsaw Univ Technol, Fac Math & Informat Sci, Warsaw, Poland; Univ Warsaw, CeNT, Warsaw, Poland; Jadavpur Univ, Dept Comp Sci & Engn, Kolkata 700032, India; Alan Turing Inst, London, England; UCL, London, England

Biclustering of biologically meaningful binary information is essential in many applications related to drug discovery, like protein-protein interactions and gene expressions. However, for robust performance in recently emerging large health datasets, it is important for new biclustering algorithms to be scalable and fast. We present a rapid unsupervised biclustering (RUBic) algorithm that achieves this objective with a novel encoding and search strategy. RUBic significantly reduces the computational overhead on both synthetic and experimental datasets with respect to several state-of-the-art biclustering algorithms. In 100 synthetic binary datasets, our method took ~71.1 s to extract 494,872 biclusters. In the human PPI database of size 4085 x 4085, our method generates 1840 biclusters in ~48.6 s. On a central nervous system embryonic tumor gene expression dataset of size 712,940, our algorithm takes 101 min to produce 747,069 biclusters, while the recent competing algorithms take significantly more time to produce the same result. RUBic is also evaluated on five different gene expression datasets and shows significant speed-up in execution time with respect to existing approaches to extract significant KEGG-enriched biclusters. RUBic can operate in two modes, base and flex: base mode generates maximal biclusters, while flex mode generates fewer biclusters, and does so faster, based on their biological significance with respect to KEGG pathways. The code is available at (https://***/CMATERJU-BIOINFO/RUBic) for academic use only.

Keywords: Data mining; Algorithm design and analysis; biclustering algorithms; Computational complexity

On Biclustering of Gene Expression Data

CURRENT BIOINFORMATICS, 2010, Vol. 5, No. 3, pp. 204-216

Authors: Mukhopadhyay, Anirban; Maulik, Ujjwal; Bandyopadhyay, Sanghamitra. Affiliations: Deutsch Krebsforschungszentrum, Dept Theoret Bioinformat, D-69120 Heidelberg, Germany; Jadavpur Univ, Dept Comp Sci & Engn, Kolkata 700032, India; Indian Stat Inst, Machine Intelligence Unit, Kolkata 700108, India

Microarray technology enables the monitoring of the expression patterns of a huge number of genes across different experimental conditions or time points simultaneously. Biclustering of microarray data is an important technique to discover a group of genes that are co-regulated in a subset of experimental conditions. Traditional clustering algorithms find groups of genes/conditions over the complete feature space. Therefore, they may fail to discover the local patterns where a subset of genes has similar behaviour over a subset of conditions. Biclustering algorithms aim to discover such local patterns from the gene expression matrix, and can thus be thought of as simultaneous clustering of genes and conditions. In recent years, a large number of biclustering algorithms have been proposed in the literature. In this article, a study has been made of various issues regarding the biclustering problem, along with a comprehensive survey of available biclustering algorithms. Moreover, a survey of freely available biclustering software is also made.
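To make the idea of a local pattern concrete, the following is a minimal sketch rather than any method taken from the article. It assumes the mean squared residue as the coherence score (a commonly used bicluster measure that the abstract itself does not name), and the toy matrix and the function name `mean_squared_residue` are hypothetical. A subset of genes (rows) over a subset of conditions (columns) scores as perfectly coherent, while the full matrix does not.

```python
def mean_squared_residue(matrix, rows, cols):
    """Mean squared residue of the submatrix selected by rows and cols.

    A score of 0 means every selected row follows the same additive
    pattern across the selected columns (assumed coherence measure).
    """
    sub = [[matrix[i][j] for j in cols] for i in rows]
    n_rows, n_cols = len(sub), len(sub[0])
    row_means = [sum(r) / n_cols for r in sub]
    col_means = [sum(sub[i][j] for i in range(n_rows)) / n_rows
                 for j in range(n_cols)]
    overall = sum(row_means) / n_rows
    total = 0.0
    for i in range(n_rows):
        for j in range(n_cols):
            residue = sub[i][j] - row_means[i] - col_means[j] + overall
            total += residue ** 2
    return total / (n_rows * n_cols)


# Hypothetical expression matrix: rows are genes, columns are conditions.
expression = [
    [1.0, 2.0, 9.0, 3.0],
    [2.0, 3.0, 1.0, 4.0],
    [5.0, 6.0, 7.0, 7.0],
    [9.0, 1.0, 4.0, 2.0],
]

# Genes 0-2 share a shifted pattern over conditions 0, 1 and 3 only.
local_score = mean_squared_residue(expression, rows=[0, 1, 2], cols=[0, 1, 3])
# Scoring all genes over all conditions hides that local structure.
global_score = mean_squared_residue(expression, rows=[0, 1, 2, 3],
                                    cols=[0, 1, 2, 3])
print(local_score < global_score)
```

In this toy case the selected submatrix is coherent up to floating-point error, whereas the whole matrix is not, which is the situation in which clustering over the complete feature space can miss the pattern.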

Keywords: Microarray; gene expression; biclustering; bicluster types; biclustering algorithms; biclustering software

Multi-metric and multi-substructure biclustering analysis for gene expression data

IEEE Computational Systems Bioinformatics Conference

Authors: Kung, SY; Mak, MW; Tagkopoulos, I. Affiliation: Princeton Univ, Princeton, NJ 08544, USA

ISBN: (print) 0769523447

A good number of biclustering algorithms have been proposed for grouping gene expression data. Many of them have adopted matrix norms to define the similarity score of a bicluster. We shall show that almost all matrix metrics can be converted into vector norms while preserving the rank equivalence. Vector norms provide a much more efficient vehicle for biclustering analysis and computation. The advantages are twofold: ease of analysis and saving of computation.
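One familiar instance of a matrix norm that coincides with a vector norm is the Frobenius norm, which equals the Euclidean norm of the flattened matrix. The sketch below is illustrative only: the abstract does not say which norms the paper treats, and the function names and the two residual matrices are hypothetical. It computes the score both ways and checks that the two forms rank the candidates identically.

```python
import math


def frobenius_via_gram_trace(matrix):
    """Matrix form: sqrt(trace(A^T A)), using only the diagonal of A^T A."""
    n_cols = len(matrix[0])
    trace = sum(sum(row[j] * row[j] for row in matrix) for j in range(n_cols))
    return math.sqrt(trace)


def euclidean_norm_of_flattened(matrix):
    """Vector form: flatten the matrix, then take the ordinary 2-norm."""
    flat = [x for row in matrix for x in row]
    return math.sqrt(sum(x * x for x in flat))


# Hypothetical residual matrices of two candidate biclusters.
residual_a = [[1.0, -2.0], [2.0, 0.0]]
residual_b = [[3.0, 0.0], [0.0, 4.0]]

matrix_scores = [frobenius_via_gram_trace(m) for m in (residual_a, residual_b)]
vector_scores = [euclidean_norm_of_flattened(m) for m in (residual_a, residual_b)]

# Both forms agree, so ranking candidates by either score gives the same order.
same_ranking = (matrix_scores[0] < matrix_scores[1]) == (
    vector_scores[0] < vector_scores[1])
print(matrix_scores, vector_scores, same_ranking)
```

The vector form needs only a single pass over the entries, which is one way to read the "saving of computation" the abstract mentions.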

Keywords: cellular biophysics; genetics; medical computing; medical information systems; molecular biophysics; pattern classification; pattern clustering; biclustering algorithms; biclustering analysis; biclustering computation; biclusters identification; biologically releva; molecular biology; pattern recognition; cluster analysis; genes
