Search Results - Inner Mongolia University Library

Building Containerized Workflows Using the BioDepot-Workflow-Builder

Cell Systems, 2019, vol. 9, no. 5, pp. 508-+

Authors: Hung, Ling-Hong; Hu, Jiaming; Meiss, Trevor; Ingersoll, Alyssa; Lloyd, Wes; Kristiyanto, Daniel; Xiong, Yuguang; Sobie, Eric; Yeung, Ka Yee. Univ Washington, Sch Engn & Technol, Tacoma, WA 98402, USA; Icahn Sch Med Mt Sinai, 1468 Madison Ave, New York, NY 10029, USA

We present the BioDepot-workflow-builder (Bwb), a software tool that allows users to create and execute reproducible bioinformatics workflows using a drag-and-drop interface. Graphical widgets represent Docker containers executing a modular task. Widgets are linked graphically to build bioinformatics workflows that can be reproducibly deployed across different local and cloud platforms. Each widget contains a form-based user interface to facilitate parameter entry and a console to display intermediate results. Bwb provides tools for rapid customization of widgets, containers, and workflows. Saved workflows can be shared using Bwb's native format or exported as shell scripts.

Keywords: Docker; RNA sequencing; bioinformatics workflows; reproducibility of research; software development

Using BioDepot-workflow-builder to Access Public Databases in a Containerized Environment

IEEE International Conference on Bioinformatics and Biomedicine (BIBM)

Authors: Scott, Christin; Hung, Ling-Hong; Lloyd, Wes; Yeung, Ka Yee. Univ Washington, Sch Engn & Technol, Tacoma, WA 98402, USA

ISBN: (print) 9781728118673

Modern-day analyses of biomedical data typically involve a series of computational tasks called workflows. Each of these modules could be written by different laboratories and thus potentially requires different computing environments. Reproducible analysis of biomedical data is a growing field of interest [1]. This makes it important to develop tools to facilitate analyses in a reproducible manner. Software containers have been used to increase reproducibility in bioinformatics analyses [2]. However, these software containers often use command-line tools, requiring expertise that makes them inaccessible to many biomedical scientists. The BioDepot-workflow-builder (Bwb) enables bioinformatics research by providing a user-friendly interface to create, share and reproducibly execute workflows using graphically linked user-created widgets [3]. Each widget represents a Docker container and can be linked to other widgets to produce a workflow while simultaneously creating a graphical representation of the pipeline. The use of software containers allows widgets to be shared between users while maintaining a uniform platform for recreating research results. Therefore, Bwb provides tools that eliminate issues in reproducing research based on different computing environments. This will facilitate collaboration between research teams and will allow readers to recreate the results of data analytics with confidence. Bioinformatics workflows often involve the use of public datasets. Incorporating the data download step within the workflow inside Bwb ensures a common environment for analysis, removing variability in the data downloaded when reproducing results. In this work, we demonstrate the utility of accessing external data sources using a containerized environment by building a widget to download datasets from the Gene Expression Omnibus (GEO) database. We also illustrate the use of this widget in a workflow designed to identify differentially expressed genes from gene expr

Keywords: reproducibility of research; bioinformatics workflows; software development; Docker; GEO

Using BioDepot-workflow-builder to Access Public Databases in a Containerized Environment

IEEE International Conference on Bioinformatics and Biomedicine

Authors: Christin Scott; Ka Yee Yeung; Ling-Hong Hung; Wes Lloyd. School of Engineering and Technology, University of Washington

ISBN: (print) 9781728118680

Modern-day analyses of biomedical data typically involve a series of computational tasks called workflows. Each of these modules could be written by different laboratories and thus potentially requires different computing environments. Reproducible analysis of biomedical data is a growing field of interest. This makes it important to develop tools to facilitate analyses in a reproducible manner. Software containers have been used to increase reproducibility in bioinformatics analyses. However, these software containers often use command-line tools, requiring expertise that makes them inaccessible to many biomedical scientists. The BioDepot-workflow-builder (Bwb) enables bioinformatics research by providing a user-friendly interface to create, share and reproducibly execute workflows using graphically linked user-created widgets. Each widget represents a Docker container and can be linked to other widgets to produce a workflow while simultaneously creating a graphical representation of the pipeline. The use of software containers allows widgets to be shared between users while maintaining a uniform platform for recreating research results. Therefore, Bwb provides tools that eliminate issues in reproducing research based on different computing environments. This will facilitate collaboration between research teams and will allow readers to recreate the results of data analytics with confidence. Bioinformatics workflows often involve the use of public datasets. Incorporating the data download step within the workflow inside Bwb ensures a common environment for analysis, removing variability in the data downloaded when reproducing results. In this work, we demonstrate the utility of accessing external data sources using a containerized environment by building a widget to download datasets from the Gene Expression Omnibus (GEO) database. We also illustrate the use of this widget in a workflow designed to identify differentially expressed genes from gene expression Autho

Keywords: reproducibility of research bioinformatics workflows software development Docker GEO Workflow Genetically engineered organisms computer software dockers Software design BIOMEDICAL DATA Containerizing bioinformatics Environments CONTAINERS

Challenges of Large-scale Biomedical Workflows on the Cloud - A Case Study on the Need for Reproducibility of Results

28th IEEE International Symposium on Computer-Based Medical Systems (CBMS)

Authors: Kanwal, Sehrish; Lonie, Andrew; Sinnott, Richard O.; Anderson, Charlotte. Univ Melbourne, Dept Comp & Informat Syst, Melbourne, Vic 3010, Australia

ISBN: (print) 9781467367752

Computational bioinformatics workflows are extensively used to analyse genomics data. With the unprecedented advancements in genomic sequencing technology and opportunities for personalized medicine, it is essential that analysis results are repeatable by others, especially when moving into a clinical environment. To cope with the complex computational demands of huge biological datasets, a shift to distributed compute resources is unavoidable. A case study was conducted in which three well-established bioinformatics analysis groups across Australia were assigned to analyse exome sequence data from a range of patients with a rare condition: disorder of sex development. Initially these groups used their own in-house data processing pipelines, and subsequently used a common bioinformatics workbench based upon Galaxy and offered through the Australia-wide National eResearch Collaboration Tools and Resources (NeCTAR) Research Cloud. This paper describes the experiences in this work and the variability of results. We put forward principles that should be used to ensure reproducibility of scientific results moving forward.

Keywords: bioinformatics workflows; distributed compute resources; exome; NeCTAR Research Cloud; reproducibility

Evaluating Grasp-based Cloud Dimensioning for Comparative Genomics: a Practical Approach

16th IEEE International Conference on Cluster Computing (CLUSTER)

Authors: Coutinho, Rafaelli; Drummond, Lucia; Frota, Yuri; de Oliveira, Daniel; Ocana, Kary. Fluminense Fed Univ, IC, Niteroi, RJ, Brazil; Univ Fed Rio de Janeiro, COPPE, Rio de Janeiro, Brazil

ISBN: (print) 9781479955480

Cloud computing establishes a new computing model where a wide range of computing resources are provided to several types of users. Especially for bioinformatics experiments modeled as scientific workflows, clouds provide several types of resources, such as virtual machines (VMs), storage, databases and computing power, that can be combined to support scientific workflow execution. These workflows usually require high-performance environments and parallelism techniques, since their activities are data- and compute-intensive and can execute for a long time. Some Scientific Workflow Management Systems (SWfMS) already manage the parallel execution of scientific workflows in clouds. Most of them instantiate a virtual cluster for the execution. However, they rely on the user to estimate the number of VMs to be instantiated to create this virtual cluster. Estimating the number of VMs to instantiate is therefore a crucial task to avoid the negative impact of under- or over-estimation on workflow performance. This dimensioning is also not a trivial task in clouds, due to the large number of VM types to choose from in a cloud provider. A previously proposed approach named GraspCC already provides a near-optimal estimate of the number of VMs for general applications, not scientific workflows. In this paper, we coupled GraspCC to the SciCumulus (Cloud-based Parallel Engine for Scientific Workflows) engine to estimate the necessary number of VMs for bioinformatics workflows. We evaluated GraspCC by comparing its estimates with real executions of a set of large-scale comparative genomics workflows. The evaluation showed the suitability of GraspCC for estimating the number of VMs in real bioinformatics cloud workflows.

Keywords: Cloud Computing; bioinformatics workflows; Virtual Machine Allocation

Helping Biologists Effectively Build Workflows, without Programming

7th International Conference on Data Integration in the Life Sciences

Authors: Gordon, Paul M. K.; Barker, Ken; Sensen, Christoph W. Univ Calgary, Dept Comp Sci, 2500 Univ Dr NW, Calgary, AB T2N 1N4, Canada; Univ Calgary, Dept Biochem & Mol Biol, Calgary, AB T2N 4N1, Canada

ISBN: (print) 9783642151194

Seahawk is a browser for Moby Web services, which are online tools using a shared semantic registry and data formats. To make a wider array of tools available within Seahawk, the Daggoo system helps users adapt forms on existing Web sites to Moby's specifications. Biologists were interviewed and given workflow design tasks, which revealed the types of tools present in their conceptual analysis workflows, and the types of control flow they understood. These observations were used to enhance Seahawk so that Moby and external Web tools can be browsed to create workflows "by demonstration". A follow-up user study measured how effectively biologists could 1) demonstrate a workflow for a realistic task, 2) understand the automatically generated workflow, and 3) use the workflow in the Taverna workflow editor/enactor. The results show promise that biologists without programming experience can become self-sufficient in analysis automation, using workflow-by-demonstration as a first step.

Keywords: programming by demonstration; bioinformatics workflows; user study; semantic web services

Leveraging REST Web services and their semantic extensions for bioinformatic workflows: a case study using Galaxy

Author: Ganjoo, Sumedha. University of Georgia

Degree level: Master's

The field of bioinformatics involves analysis of large sets of data. This might entail leveraging tools scattered over many Web sites. To provide experimental biologists with a common platform capable of such analysis, this thesis focuses on extending a bioinformatics framework with Web service invocation support. Galaxy, being popular for its analysis tools and workflow management capability, seemed like an ideal candidate to extend. This thesis proposes adding REST Web service support to Galaxy in a way that can be easily extended to SOAP Web services in the future. It also introduces an approach to add dynamic tools to Galaxy. To simplify the process of repetitive analyses on different sets of data, this thesis discusses enabling Web service invocation in the workflow portion of Galaxy. It also shows how semantic annotations in Web services can be leveraged to improve the user's experience when interacting with Web services.

Keywords: Web services; REST; WADL; WSDL 2.0; SAWADL; bioinformatics workflows; Galaxy; Semantic Web services

Seven variations of an alignment workflow - An illustration of agile process design and management in Bio-jETI

4th International Symposium on Bioinformatics Research and Applications

Authors: Lamprecht, Anna-Lena; Margaria, Tiziana; Steffen, Bernhard. Dortmund Univ Technol, Chair Programming Syst, D-44227 Dortmund, Germany; Dortmund Univ Technol, Cntr Appl Proteom, D-44227 Dortmund, Germany; Univ Potsdam, Chair Serv & Software Engn, D-14482 Potsdam, Germany

ISBN: (print) 9783540794493

This paper shows how the agility provided by the Bio-jETI platform helps to interactively design bioinformatics analysis processes. Bio-jETI is a platform for the integration, orchestration and provision of services. The agility in design and execution is demonstrated by developing seven variations on a multiple sequence alignment workflow.

Keywords: web services; service orchestration; model-driven development; bioinformatics workflows
