Search results - Inner Mongolia University Library

SciPipe: A workflow library for agile development of complex and dynamic bioinformatics pipelines

GIGASCIENCE, 2019, vol. 8, no. 5, p. giz044

Authors: Lampa, Samuel; Dahlo, Martin; Alvarsson, Jonathan; Spjuth, Ola. Affiliations: Uppsala Univ, Dept Pharmaceut Biosci, Box 591, S-75124 Uppsala, Sweden; Uppsala Univ, Sci Life Lab, Box 591, S-75124 Uppsala, Sweden; Stockholm Univ, Natl Bioinformat Infrastruct Sweden, Dept Biochem & Biophys, Sci Life Lab, Svante Arrhenius Vag 16C, S-10691 Solna, Sweden

Background: The complex nature of biological data has driven the development of specialized software tools. Scientific workflow management systems simplify the assembly of such tools into pipelines, assist with job automation, and aid reproducibility of analyses. Many contemporary workflow tools are specialized or not designed for highly complex workflows, such as with nested loops, dynamic scheduling, and parametrization, which is common in, e.g., machine learning. Findings: SciPipe is a workflow programming library implemented in the programming language Go, for managing complex and dynamic pipelines in bioinformatics, cheminformatics, and other fields. SciPipe helps in particular with workflow constructs common in machine learning, such as extensive branching, parameter sweeps, and dynamic scheduling and parametrization of downstream tasks. SciPipe builds on flow-based programming principles to support agile development of workflows based on a library of self-contained, reusable components. It supports running subsets of workflows for improved iterative development and provides a data-centric audit logging feature that saves a full audit trace for every output file of a workflow, which can be converted to other formats such as HTML, TeX, and PDF on demand. The utility of SciPipe is demonstrated with a machine learning pipeline, a genomics, and a transcriptomics pipeline. Conclusions: SciPipe provides a solution for agile development of complex and dynamic pipelines, especially in machine learning, through a flexible application programming interface suitable for scientists used to programming or scripting.

Keywords: scientific workflow management systems; pipelines; reproducibility; machine learning; flow-based programming; Go; Golang
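
The abstract above describes workflows assembled from self-contained components that are connected by data flow, with an audit trace kept for every output. A minimal sketch of that idea follows, written in Python with only the standard library rather than in Go with SciPipe itself. The names `component`, `run_pipeline`, and the `audit` field are hypothetical and are not part of SciPipe's API.

```python
import queue
import threading

DONE = object()  # sentinel marking the end of a stream


def component(name, func, inbox, outbox):
    """Start a thread that applies func to every packet from inbox."""
    def loop():
        while True:
            packet = inbox.get()
            if packet is DONE:
                outbox.put(DONE)  # propagate shutdown downstream
                return
            # Each packet carries its value plus the steps that produced it.
            outbox.put({"value": func(packet["value"]),
                        "audit": packet["audit"] + [name]})
    thread = threading.Thread(target=loop)
    thread.start()
    return thread


def run_pipeline(values, steps):
    """Chain the steps with bounded queues and collect the final packets."""
    queues = [queue.Queue(maxsize=4) for _ in range(len(steps) + 1)]
    threads = [component(name, func, queues[i], queues[i + 1])
               for i, (name, func) in enumerate(steps)]

    def feed():
        for value in values:
            queues[0].put({"value": value, "audit": []})
        queues[0].put(DONE)

    # Feeding happens in its own thread so the bounded queues cannot deadlock
    # while the main thread drains the last queue.
    feeder = threading.Thread(target=feed)
    feeder.start()

    results = []
    while True:
        packet = queues[-1].get()
        if packet is DONE:
            break
        results.append(packet)

    feeder.join()
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    steps = [("double", lambda x: x * 2), ("increment", lambda x: x + 1)]
    for packet in run_pipeline([1, 2, 3], steps):
        print(packet)
```

Each component only sees its input and output queues, so steps can be reordered or reused without touching their code, and the `audit` list records which steps produced each result.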

Towards Domain-specific Flow-based Languages

6th International Conference on Model-Driven Engineering and Software Development (MODELSWARD)

Authors: Zarrin, Bahram; Baumeister, Hubert; Sarjoughian, Hessam. Affiliations: Tech Univ Denmark, DTU Compute, DK-2800 Lyngby, Denmark; Arizona State Univ, Sch Comp Informat & Decis Syst Engn, Tempe, AZ 85281, USA

ISBN: 9789897582837 (print)

Due to the significant growth of the demand for data-intensive computing, in addition to the emergence of new parallel and distributed computing technologies, scientists and domain experts are leveraging languages specialized for their problem domain, i.e., domain-specific languages, to help them describe their problems and solutions, instead of using general purpose programming languages. The goal of these languages is to improve the productivity and efficiency of the development and simulation of concurrent scientific models and systems. Moreover, they help to expose parallelism and to specify the concurrency within a component or across different independent components. In this paper, we introduce the concept of domain-specific flow-based languages, which allow domain experts to use flow-based languages adapted to a particular problem domain. Flow-based programming is used to support concurrency, while the domain-specific part of these languages is used to define atomic processes and domain-specific validation rules for composite processes. We propose a modeling language that can be used to develop such domain-specific languages. Since this language allows one to define other languages, we often refer to it as a meta-modeling language.

Keywords: Domain-specific Languages; Flow-based Programming; Metamodeling Languages; Parallel Computing
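
The abstract above mentions atomic processes and domain-specific validation rules for composite processes. As a rough illustration only, and not the modeling language proposed in the paper, the sketch below treats a composite process as a set of connections between typed ports and checks one possible rule: connected ports must exist and carry the same data type. `AtomicProcess`, `validate_composite`, and the port types are all hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class AtomicProcess:
    name: str
    inputs: dict = field(default_factory=dict)   # port name -> data type
    outputs: dict = field(default_factory=dict)  # port name -> data type


def validate_composite(processes, connections):
    """Return error messages for the connections of a composite process.

    Each connection is a tuple:
    (source process, output port, target process, input port).
    """
    by_name = {p.name: p for p in processes}
    errors = []
    for src, out_port, dst, in_port in connections:
        label = f"{src}.{out_port} -> {dst}.{in_port}"
        if src not in by_name or dst not in by_name:
            errors.append(f"unknown process in {label}")
            continue
        out_type = by_name[src].outputs.get(out_port)
        in_type = by_name[dst].inputs.get(in_port)
        if out_type is None or in_type is None:
            errors.append(f"unknown port in {label}")
        elif out_type != in_type:
            errors.append(f"type mismatch in {label}: {out_type} != {in_type}")
    return errors


if __name__ == "__main__":
    processes = [
        AtomicProcess("reader", outputs={"rows": "table"}),
        AtomicProcess("model", inputs={"data": "table"},
                      outputs={"score": "number"}),
        AtomicProcess("plot", inputs={"series": "table"}),
    ]
    connections = [
        ("reader", "rows", "model", "data"),   # valid: table -> table
        ("model", "score", "plot", "series"),  # invalid: number -> table
    ]
    for error in validate_composite(processes, connections):
        print(error)
```

Running the validation before execution means a mis-wired flow is rejected at design time rather than failing partway through a computation.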

Smart transport and logistics: A Node-RED implementation

INTERNET TECHNOLOGY LETTERS, 2019, vol. 2, no. 2

Authors: Sicari, Sabrina; Rizzardi, Alessandra; Coen-Porisini, Alberto. Affiliation: Univ Insubria, Dipartimento Sci Teor & Applicate, Via G Mazzini 5, I-21100 Varese, Italy

Clever and efficient management of transport and logistics is fundamental in manufacturing companies, which are starting to adopt new methodologies inspired by the emerging Industry 4.0 principles. This behavior is influenced by the spread of the Internet of Things (IoT) paradigm, which helps to automate many, if not all, aspects of product management, from raw-material purchase orders to final delivery to customers. Small and medium industries must face design issues, and noncustomized solutions may not fit their habitual data flow. Hence the need emerges for a tool able to support designers and developers in defining the network architecture and message exchange. To this end, the use of Node-RED, a flow-based programming tool for the IoT, is proposed, with a comprehensive case study targeted at smart transport and logistics.

Keywords: flow-based programming; Internet of Things; Node-RED; smart logistics; smart transport

Reproducible Data Analysis in Drug Discovery with Scientific Workflows and the Semantic Web

Author: Samuel Lampa, Uppsala University

Degree level: Doctoral

The pharmaceutical industry is facing a research and development productivity crisis. At the same time we have access to more biological data than ever from recent advancements in high-throughput experimental methods. One suggested explanation for this apparent paradox has been that a crisis in reproducibility has also affected the reliability of the datasets that provide the basis for drug development. Advanced computing infrastructures can to some extent aid in this situation but also come with their own challenges, including increased technical debt and opaqueness from the many layers of technology required to perform computations and manage data. In this thesis, a number of approaches and methods for dealing with data and computations in early drug discovery in a reproducible way are developed. This has been done while striving for a high level of simplicity in their implementations, to improve understandability of the research done using them. Based on identified problems with existing tools, two workflow tools have been developed with the aim of making the writing of complex workflows, particularly in predictive modelling, more agile and flexible. One of the tools is based on the Luigi workflow framework, while the other is written from scratch in the Go language. We have applied these tools to predictive modelling problems in early drug discovery to create reproducible workflows for building predictive models, including for prediction of off-target binding in drug discovery. We have also developed a set of practical tools for working with linked data in a collaborative way, and publishing large-scale datasets in a semantic, machine-readable format on the web. These tools were applied to demonstrator use cases, and used for publishing large-scale chemical data. It is our hope that the developed tools and approaches will contribute towards practical, reproducible and understandable handling of data and computations in early drug discovery.

Keywords: Reproducibility; Scientific Workflow Management Systems; Workflows; Pipelines; Flow-based programming; Predictive modelling; Semantic Web; Linked Data; Semantic MediaWiki; MediaWiki; RDF; SPARQL; Golang

aFlux: Graphical flow-based data analytics

SOFTWARE IMPACTS, 2019, vol. 2

Authors: Mahapatra, Tanmaya; Prehofer, Christian. Affiliation: Tech Univ Munich, Fak Informat, Lehrstuhl Software & Syst Engn, Boltzmannstr 03, D-85748 Garching, Germany

aFlux is a graphical flow-based programming tool designed to support the modelling of data analytics applications. It supports high-level programming of Big Data applications with early-stage flow validation and automatic code generation for frameworks like Spark, Flink, Pig and Hive. The graphical programming concepts used in aFlux constitute the first approach towards supporting high-level Big Data application development by making it independent of the target Big Data frameworks. Programming at this higher level of abstraction helps to lower the complexity, and the ensuing learning curve, involved in the development of Big Data applications.

Keywords: Flow-based programming; Graphical pipelines; Mashup tools; Graphical Spark programming; Graphical Flink programming

Telemetry data processing flow model: a case study

AIRCRAFT ENGINEERING AND AEROSPACE TECHNOLOGY, 2015, vol. 87, no. 1, pp. 52-56

Authors: Wang, Guohua; Li, Qiang; Sun, Jinglin; Meng, Xiaofeng. Affiliation: Beihang Univ, Sch Instrumentat Sci & Optoelect Engn, Beijing 100191, Peoples R China

Purpose - The purpose of this paper is to develop a model of the telemetry data processing flow (TDPF) for TDPF development, together with a TDPF run-time infrastructure, to improve spacecraft health monitoring capability. Design/methodology/approach - This research develops the TDPF using the flow-based programming (FBP) method and component-based telemetry data processing software. Findings - The result of the case study is positive, reflecting the appropriateness of the suggested method. Practical implications - Applying the proposed TDPF model and the component-based telemetry data processing software may result in improved development efficiency and lower development costs. Originality/value - This paper provides an effective way to develop a TDPF without recompiling the software. It greatly facilitates TDPF development and may reduce its cost.

Keywords: Telemetry data processing flow; Spacecraft health monitoring; Flow-based programming; Component-based development; Limit checking
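
The abstract above states that a processing flow can be developed without recompiling the software, and the keywords mention limit checking. One common way to get that property, sketched below under assumptions of our own and not taken from the paper, is to describe the flow as data and look up prebuilt components by name. The component names, the JSON layout, and `build_flow` are hypothetical, and the sketch shows only the configuration aspect, not concurrent execution.

```python
import json


def scale(params):
    """Component factory: multiply the sample value by a fixed factor."""
    factor = params["factor"]
    return lambda sample: {**sample, "value": sample["value"] * factor}


def limit_check(params):
    """Component factory: flag whether the value lies within [low, high]."""
    low, high = params["low"], params["high"]
    return lambda sample: {**sample,
                           "in_limits": low <= sample["value"] <= high}


# Registry of available components; a flow refers to them by name only.
COMPONENTS = {"scale": scale, "limit_check": limit_check}


def build_flow(spec_json):
    """Build a processing function from a JSON description of the flow."""
    spec = json.loads(spec_json)
    stages = [COMPONENTS[step["component"]](step.get("params", {}))
              for step in spec["flow"]]

    def run(sample):
        for stage in stages:
            sample = stage(sample)
        return sample

    return run


if __name__ == "__main__":
    spec = """
    {"flow": [
        {"component": "scale", "params": {"factor": 0.5}},
        {"component": "limit_check", "params": {"low": 0, "high": 30}}
    ]}
    """
    flow = build_flow(spec)
    print(flow({"channel": "battery_temp", "value": 50}))
    print(flow({"channel": "battery_temp", "value": 80}))
```

Changing the limits or reordering the stages then only requires editing the JSON description, while the component code stays untouched.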

MobileFBP: Designing portable reconfigurable applications for heterogeneous systems

JOURNAL OF SYSTEMS ARCHITECTURE, 2014, vol. 60, no. 1, pp. 40-51

Authors: Hung, Shih-Hao; Tzeng, Tien-Tzong; Wu, Jyun-De; Tsai, Min-Yu; Lu, Yi-Chih; Shieh, Jeng-Peng; Tu, Chia-Heng; Ho, Wen-Jen. Affiliations: Natl Taiwan Univ, Dept Comp Sci & Informat Engn, Taipei, Taiwan; Inst Informat Ind, Smart Network Syst Inst, Taipei, Taiwan

Power efficiency has been a key issue for today's application and system design, ranging from embedded systems to data centers. While application-specific designs and optimizations may improve power efficiency, they require significant effort to co-design the hardware and software, which are difficult to re-use. On the hardware front, the trend of heterogeneous computing enables custom designs for specific applications by integrating different types of processors and reconfigurable hardware to handle compute-intensive tasks. However, what is still missing is an elegant application framework, i.e., a programming environment and a runtime system, to develop portable applications which can offload tasks or be reconfigured dynamically to run on a variety of systems efficiently. Our ongoing work, MobileFBP, provides an application framework which aims to support heterogeneous and reconfigurable systems. Using the framework, developers build portable applications with a dataflow programming paradigm, and the MobileFBP runtime system dynamically schedules the task components to run on available computing resources locally or remotely based on the application profiles. We hope that this ability produces high-level portable applications and reduces the effort and skills needed for developers to optimize their applications on a range of systems. This paper describes this work and presents our preliminary results.

Keywords: Multicore; Reconfigurable computing; Heterogeneous systems; Android; Flow-based programming; Inter-task communications

Dflows: A Flow-Based Programming Approach for a Polyglot Design-Space Exploration Framework

ACM Transactions on Reconfigurable Technology and Systems, 1000

Authors: Francesco Peverelli; Daniele Paletti; Davide Conficconi. Affiliation: Politecnico di Milano, Italy

Current architectural Design-Space Exploration (DSE) tools specify the exploration problem through annotations or pragmas. However, this approach is inherently language-dependent and limits the applicability to one specific target language and synthesis toolchain. Additionally, the rapid development of new hardware Domain-Specific Languages, programming models, and different exploration heuristics calls for a language-agnostic and modular approach. To address this need, we present a DSE formalization to facilitate the integration of new components and customized flows and leverage it to implement Dflows, a flow-based-programming DSE tool that decouples problem definition, code generation, exploration, and evaluation strategies. Dflows’s compiler-based frontend provides language-agnostic generation of design points through Abstract Syntax Tree manipulation. We show how Dflows can integrate custom performance models from complex state-of-the-art accelerators for Verilog, VHDL, Chisel, and HLS designs. We compare the runtimes of our DSE process against a state-of-the-art Chisel-based DSE tool, achieving up to 3.74× speedup while identifying the same set of optimal solutions. Additionally, we integrate in Dflows a custom exploration heuristic leveraging genetic algorithms and a novel online learning fitness function approximation methodology. This approximation yields a negligible hypervolume difference with the exhaustive search Pareto-front while improving DSE runtime by up to 2.67×.

Keywords: DSE; Polyglot; Flow-based programming; Online Learning
