Dramatic changes in the technology landscape marked by increasing scales and pervasiveness of compute and data have resulted in the proliferation of edge applications aimed at effectively processing data in a timely manner. As the levels and fidelity of instrumentation increase and the types and volumes of available data grow, new classes of applications are being explored that seamlessly combine real-time data with complex models and data analytics to monitor and manage systems of interest. However, these applications require a fluid integration of resources at the edge, the core, and along the data path to support dynamic and data-driven application workflows; that is, they need to leverage a computing continuum. In this article, we present our vision for enabling such a computing continuum and specifically focus on enabling edge-to-cloud integration to support data-driven workflows. The research is driven by an online data-driven tsunami warning use case that is supported by the deployment of large-scale national environment observation systems. This article presents our overall approach as well as current status and next steps.
ISBN: 9781538626863 (print)
Large-scale observatories are shared-use resources that provide open access to data from geographically distributed sensors and instruments. This data has the potential to accelerate scientific discovery. However, seamlessly integrating the data into scientific workflows remains a challenge. In this paper, we summarize our ongoing work in supporting data-driven and data-intensive workflows and outline our vision for how these observatories can improve large-scale science. Specifically, we present programming abstractions and runtime management services to enable the automatic integration of data in scientific workflows. Further, we show how approximation techniques can be used to address network and processing variations by studying constraint limitations and their associated latencies. We use the Ocean Observatories Initiative (OOI) as a driving use case for this work.
Cloud computing is emerging as a viable platform for scientific exploration. Elastic and on-demand access to resources (and other services), the abstraction of "unlimited" resources, and attractive pricing models provide incentives for scientists to move their workflows into clouds. Generalizing these concepts beyond a single virtualized datacenter, it is possible to create federated marketplaces where different types of resources (e.g., clouds, HPC grids, supercomputers), which may be geographically distributed, are collectively exposed as a single elastic infrastructure. This presents opportunities for optimizing the execution of application workflows with heterogeneous and dynamic requirements, and for tackling larger-scale problems. In this paper, we introduce a framework to manage the end-to-end execution of data-intensive application workflows in a dynamic software-defined resource federation. This framework enables the autonomic execution of workflows by elastically provisioning an appropriate set of resources that meet application requirements, and by adapting this set of resources at runtime as the requirements change. It also allows users to customize the scheduling policies that drive the way resources are federated and used. To demonstrate the benefits of our approach, we study the execution of two different data-intensive scientific workflows in a multi-cloud federation using different policies and objective functions.
Data-intensive applications aim at discovering valuable knowledge from large amounts of data coming from real-world sources. Typically, workflow languages are used to specify these applications, and their associated engines enable the execution of the specifications. However, as these applications become commonplace, new challenges arise. Existing workflow languages are normally platform-specific, which severely hinders their interoperability with other languages and execution engines. This also limits their reusability outside the platforms for which they were originally defined. Following the Design Science Research methodology, the paper presents SWEL (Scientific Workflow Execution Language). SWEL is a domain-specific modeling language for the specification of data-intensive workflows that follows model-driven engineering principles, covering the high-level definition of tasks, information sources, platform requirements, and mappings to the target technologies. SWEL is platform-independent, enables collaboration among data scientists across multiple domains, and facilitates interoperability. The evaluation results show that SWEL is suitable enough to represent the concepts and mechanisms of commonly used data-intensive workflows. Moreover, SWEL facilitates the development of related technologies such as editors, tools for exchanging knowledge assets between workflow management systems, and tools for collaborative workflow development.
ISBN: 9781450371087 (print)
Register automata have been used as a convenient model for specifying and verifying database-driven systems. An important problem in such systems is to provide views that hide or restructure certain information about the data or process, extending classical notions of database views. In this paper, we carry out a formal investigation of views of register automata by considering simple views that project away some of the registers. We show that classical register automata are not able to describe such projections and introduce more powerful register automata that are able to do so. We also show useful properties of these automata, such as closure under projection and decidability of verifying temporal properties of their runs.
ISBN: 9781467372879 (print)
Pervasive computational ecosystems that combine data sources and computing/communication resources in self-managed environments, such as the ones powered by Internet of Things (IoT) devices, have the potential to automate and facilitate many aspects of our lives, and impact a variety of applications, from the management of extreme events to the optimization of everyday processes. However, this vision remains mostly unrealized despite the fact that the technology to achieve it exists, largely because of the gap between our ability to collect data and our ability to gain insight from it. In this paper, we discuss the challenges associated with providing a pervasive computational ecosystem. We then present our vision of how to best support data-driven computational ecosystems and propose a conceptual architecture that leverages ideas from software-defined environments in order to combine data, computing, and communication resources. In addition, we show how this proposed architecture enables the execution of data-driven workflows on top of these resources.