ISBN: (Print) 9781479925797
This paper concerns the trade-off between the cost of storing intermediate data and the computing cost of regenerating that data when large bioinformatics or other workflows are implemented using cloud resources. The implementation may be required to delete some data to keep storage costs within a budget, and deciding which data to delete with minimal increase in computing costs can be a complex problem. To address this, a modified form of Petri net is introduced for modeling the workflow, allowing an optimization algorithm to be applied to several types of problem that may arise. The proposed 'augmented Petri-net' simulates workflows with cost models included, thus providing a platform for an optimization procedure. Illustrations are presented to show that such optimization can achieve overall cost reductions in a number of different scenarios.
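The store-versus-regenerate trade-off described in this abstract can be illustrated with a small sketch. This is not the paper's augmented Petri-net or its optimization algorithm; it is a simple greedy heuristic substituted here for illustration only. The names `Intermediate`, `choose_to_store`, and all cost fields are hypothetical, and the sketch assumes each dataset has a positive storage cost, a fixed regeneration cost, and a known expected number of reuses.

```python
from dataclasses import dataclass


@dataclass
class Intermediate:
    """Hypothetical record for one intermediate dataset in a workflow."""
    name: str
    storage_cost: float     # cost of keeping the data for the billing period (> 0)
    regen_cost: float       # compute cost of regenerating the data once
    expected_reuses: float  # expected number of times the data is needed again


def choose_to_store(items, budget):
    """Greedy sketch: keep datasets that avoid the most compute per unit of storage cost.

    Returns the names of the datasets kept and the total storage cost spent.
    Everything not kept is assumed to be deleted and regenerated on demand.
    """
    def saving_per_cost(item):
        # Compute cost avoided by storing, per unit of storage cost.
        return (item.regen_cost * item.expected_reuses) / item.storage_cost

    kept, spent = [], 0.0
    for item in sorted(items, key=saving_per_cost, reverse=True):
        # Storing only pays off if the avoided compute exceeds the storage cost.
        worthwhile = item.regen_cost * item.expected_reuses > item.storage_cost
        if worthwhile and spent + item.storage_cost <= budget:
            kept.append(item.name)
            spent += item.storage_cost
    return kept, spent
```

A greedy ranking like this ignores dependencies between workflow steps (regenerating one dataset may require regenerating its inputs first), which is the kind of structure a workflow model such as a Petri net is meant to capture.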
Progress in experimental procedures has led to the rapid availability of omics profiles. Various open-access as well as commercial tools have been developed for storage, analysis, and interpretation of transcriptomics, proteomics, and metabolomics data. Generally, major analysis steps include data storage, retrieval, preprocessing, and normalization, followed by identification of differentially expressed features, functional annotation at the level of biological processes and molecular pathways, and interpretation of gene lists in the context of protein–protein interaction networks. In this chapter, we discuss a sequential transcriptomics data analysis workflow utilizing open-source tools, illustrated with a gene expression dataset on familial hypercholesterolemia.