ISBN (Print): 9781450376280
Big data workflow management systems (BDWFMSs) have recently emerged as popular platforms to perform large-scale data analytics in the cloud. However, the protection of data confidentiality and secure execution of workflow applications remain important and challenging problems. Although a few data analytics systems have been developed to address these problems, they are limited to specific structures such as MapReduce-style workflows and SQL queries. This paper proposes SecdataVIEW, a BDWFMS that leverages Intel Software Guard eXtensions (SGX) and AMD Secure Encrypted Virtualization (SEV) to build a heterogeneous trusted execution environment for workflows. SecdataVIEW aims to (1) provide confidentiality and integrity of code and data for workflows running on untrusted public clouds, (2) minimize the TCB size of a BDWFMS, (3) enable a trade-off between security and performance for workflows, and (4) support the execution of Java-based workflow tasks in SGX. Our experimental results show that SecdataVIEW imposes 1.69x to 2.62x overhead on workflow execution time on SGX worker nodes, 1.04x to 1.29x on SEV worker nodes, and 1.20x to 1.43x in a heterogeneous setting in which both SGX and SEV worker nodes are used.
Cloud computing is one of the critical technologies that meet the demand of various businesses for the high-capacity computational processing power needed to gain knowledge from their ever-growing business data. When utilizing cloud computing resources for big data processing, companies face the challenge of determining the optimal use of resources within their business processes. Miscalculating the necessary resources directly affects their budget and can delay the cycle time of their key processes. This study investigates the simulation of cloud resource optimization for big data workflows modeled with the Business Process Modeling Notation (BPMN). To this end, a BPMN performance evaluation framework was developed. The framework's capabilities were demonstrated on a real-world data science workflow and later evaluated on workflows consisting of 13, 52, and 104 tasks. The results show that the developed framework is adequate for estimating the overall run-time distribution and optimizing cloud resource deployment, and that BPMN can be utilized for big data processing workflows. This study therefore contributes a tool that lets BPMN practitioners apply BPMN to their big data workflows, and gives decision-makers critical insights into their key business processes. The framework source code is available at https://***/ntankovic/python-bpmn-engine.
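The run-time-distribution estimation described above can be illustrated with a small Monte Carlo sketch. This is a hypothetical example, not the paper's framework: task names, durations, and the triangular-distribution parameters are invented, and tasks form a simple chain where each starts after its predecessors finish.

```python
import random
import statistics

# Illustrative only: each task's duration is sampled from a triangular
# distribution (low, high, mode); a task starts once all predecessors finish.
TASKS = {                      # task -> (low, high, mode) duration in seconds
    "ingest": (5, 15, 8),
    "clean":  (10, 20, 12),
    "train":  (30, 90, 40),
    "report": (2, 5, 3),
}
DEPS = {                       # task -> predecessors
    "clean":  ["ingest"],
    "train":  ["clean"],
    "report": ["train"],
}

def simulate_once(rng):
    finish = {}
    for task in ["ingest", "clean", "train", "report"]:  # topological order
        start = max((finish[p] for p in DEPS.get(task, [])), default=0.0)
        finish[task] = start + rng.triangular(*TASKS[task])
    return max(finish.values())            # workflow makespan

rng = random.Random(42)
samples = [simulate_once(rng) for _ in range(10_000)]
print(f"mean makespan: {statistics.mean(samples):.1f}s, "
      f"p95: {sorted(samples)[9499]:.1f}s")
```

Repeating the simulation many times yields an empirical makespan distribution, from which percentiles can guide resource provisioning decisions.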
Data validation is about verifying the correctness of data. When organisations update and refine their data transformations to meet evolving requirements, it is imperative to ensure that the new version of a workflow still produces the correct output. We motivate the need for workflow validation and describe the implementation of a validation tool called Diftong. This tool compares two tabular databases resulting from different versions of a workflow to detect and prevent potential unwanted alterations. Row-based and column-based statistics are used to quantify the results of the database comparison. Diftong was shown to provide accurate results in test scenarios, bringing benefits to companies that need to validate the outputs of their workflows. By automating this process, the risk of human error is also eliminated. Compared to the more labour-intensive manual alternative, it has the added benefit of improved turnaround time for the validation process. Together this allows for a more agile way of updating data transformation workflows.
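The row- and column-based comparison idea can be sketched in a few lines. This is a hypothetical illustration, not Diftong itself: the statistics chosen (row count, distinct count, mean) and the table representation are invented for the example.

```python
# Compare two versions of a tabular dataset with simple summary statistics,
# flagging columns whose statistics drifted between workflow versions.

def column_stats(rows, column):
    values = [r[column] for r in rows if r[column] is not None]
    numeric = values and isinstance(values[0], (int, float))
    return {
        "count": len(values),
        "distinct": len(set(values)),
        "mean": sum(values) / len(values) if numeric else None,
    }

def compare_tables(old_rows, new_rows, columns):
    report = {"row_count_delta": len(new_rows) - len(old_rows)}
    for col in columns:
        old, new = column_stats(old_rows, col), column_stats(new_rows, col)
        if old != new:                  # only report columns that changed
            report[col] = {"old": old, "new": new}
    return report

old = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 20.0}]
new = [{"id": 1, "amount": 10.0}, {"id": 2, "amount": 25.0}]
print(compare_tables(old, new, ["id", "amount"]))
```

A changed `amount` column is flagged while the unchanged `id` column is not, which is the kind of signal a validation tool can surface for review.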
ISBN (Print): 9781728111414
Many large-scale applications in various domains are generating big data, which is increasingly processed and analyzed by MapReduce-based workflows deployed in Hadoop systems. In addition to computing time, the makespan of such data-intensive workflows is also largely affected by communication cost. In particular, there are two levels of data movement during the execution of distributed workflows in Hadoop: i) from map tasks to reduce tasks within each individual MapReduce module and ii) between each pair of adjacent modules in the workflow. Traditionally, these two aspects of network traffic have been treated separately as data locality at the task and module (or job) level, respectively. However, the interactions between these two levels of data movement may create complicated dynamics, and their compound effects remain largely unexplored. In this paper, we formulate a task scheduling problem that considers data movement at both levels to minimize the end-to-end delay of a MapReduce-based workflow. We show this problem to be NP-complete and design a storage-aware big data workflow scheduling algorithm, referred to as SA-BWS, to optimize workflow performance in Hadoop environments. The performance superiority of SA-BWS is illustrated by extensive simulations in comparison with the default workflow engine in Hadoop and existing scheduling methods.
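The two levels of data movement described above can be made concrete with a toy placement decision. This is an invented illustration, not the SA-BWS algorithm: node names and byte counts are made up, and the heuristic simply picks the node minimizing total remote fetches at both levels.

```python
# For a reduce task, count the data it must pull over the network if placed
# on a given node: map outputs on other nodes (level i) plus the upstream
# module's output if it lives elsewhere (level ii).

def fetch_cost(node, map_outputs, upstream_output):
    cost = sum(size for n, size in map_outputs if n != node)         # level i
    cost += upstream_output[1] if upstream_output[0] != node else 0  # level ii
    return cost

def place_task(candidates, map_outputs, upstream_output):
    return min(candidates,
               key=lambda n: fetch_cost(n, map_outputs, upstream_output))

# (node holding the partition, bytes); upstream module output on "n1"
maps = [("n1", 400), ("n2", 100), ("n3", 100)]
upstream = ("n1", 300)
print(place_task(["n1", "n2", "n3"], maps, upstream))  # -> n1
```

Considering only one level would weigh map outputs or the upstream module's output alone; accounting for both can change which node wins, which is the compound effect the paper targets.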
ISBN (Print): 9781479999255
Workflow makespan is the total execution time for running a workflow in the Cloud. The makespan depends significantly on how the workflow tasks and datasets are allocated and placed in a distributed computing environment such as Clouds. Incorporating data and task allocation strategies to minimize makespan delivers significant benefits to scientific users in receiving their results in time. The main goal of a task placement algorithm is to minimize the total amount of data movement between virtual machines during the execution of the workflows. In this paper, we: 1) formalize the task placement problem in big data workflows; 2) propose a task placement strategy (TPS) that considers both initial input datasets and intermediate datasets to calculate the dependency between workflow tasks; and 3) perform extensive experiments in a distributed environment to demonstrate that the proposed strategy provides an effective task distribution and placement tool.
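The dependency idea can be sketched as follows. This is a hypothetical example in the spirit of a dependency-driven placement, not the paper's TPS: task names, dataset sizes, and the greedy co-location rule are invented.

```python
from itertools import combinations

task_inputs = {                 # task -> {dataset: size in GB}
    "t1": {"d_in": 8, "d_mid": 4},
    "t2": {"d_mid": 4, "d_out": 2},
    "t3": {"d_in": 8},
}

def dependency(a, b):
    """Total size of the datasets two tasks share (initial + intermediate)."""
    shared = task_inputs[a].keys() & task_inputs[b].keys()
    return sum(task_inputs[a][d] for d in shared)

# Greedily co-locate the most data-dependent pairs on the same VM.
pairs = sorted(combinations(task_inputs, 2),
               key=lambda p: dependency(*p), reverse=True)
placement, vms = {}, {0: set(), 1: set()}
for a, b in pairs:
    vm = min(vms, key=lambda v: len(vms[v]))   # naive load balancing
    for t in (a, b):
        placement.setdefault(t, vm)
        vms[placement[t]].add(t)
print(placement)
```

The heaviest pair (`t1`, `t3`, sharing an 8 GB input) ends up on one VM, so that dataset never crosses the network between them.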
ISBN (Print): 9783031425073; 9783031425080
The objective of this work is to offer a workflow enabling both empirical and analytical studies of enzyme kinetics. For this purpose, on the one hand, we build on a series of experimental studies involving the traditional methods and techniques used when studying biochemical reactions and designing electrochemical biosensors: conductance research, spectroscopy, and electromagnetic field study. On the other hand, when studying enzyme kinetics analytically, we employ the Michaelis-Menten approach to model enzyme-substrate-inhibitor interactions and extend it to multi-substrate, multi-inhibitor complexes. Support for a traditional big data workflow is provided through the meta-analysis facilities of existing repositories of biochemical studies hosted on the BRENDA platform.
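The analytical side rests on the classical Michaelis-Menten rate law; a minimal sketch, extended with a single competitive inhibitor, is shown below. The parameter values are illustrative only, not drawn from any experiment, and the multi-substrate, multi-inhibitor extension the abstract mentions is not reproduced here.

```python
def mm_rate(s, vmax, km, i=0.0, ki=float("inf")):
    """Reaction velocity v = Vmax*[S] / (Km*(1 + [I]/Ki) + [S]).

    With no inhibitor (i = 0) this reduces to plain Michaelis-Menten;
    a competitive inhibitor raises the apparent Km by (1 + [I]/Ki).
    """
    return vmax * s / (km * (1.0 + i / ki) + s)

v0 = mm_rate(s=2.0, vmax=10.0, km=2.0)                 # [S] = Km -> v = Vmax/2
v1 = mm_rate(s=2.0, vmax=10.0, km=2.0, i=2.0, ki=2.0)  # [I] = Ki doubles Km
print(v0, v1)
```

At [S] = Km the uninhibited velocity is exactly half of Vmax, and adding inhibitor at [I] = Ki lowers it further, matching the competitive-inhibition picture.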
In the big data era, workflow systems must embrace data-parallel computing techniques for efficient data analysis and analytics. Here, an easy-to-use, scalable approach is presented to build and execute big data applications using actor-oriented modeling in data-parallel computing. Two bioinformatics use cases for next-generation sequencing data analysis demonstrate the approach's feasibility.
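The data-parallel pattern behind this kind of approach can be sketched simply: the same pure function (playing the role of an actor) is applied to partitions of the input concurrently, and the partial results are reduced. This is a hypothetical toy, not the paper's system; the GC-counting task and the reads are invented.

```python
from concurrent.futures import ThreadPoolExecutor

def count_gc(partition):
    """Count G/C bases in one partition of sequencing reads."""
    return sum(read.count("G") + read.count("C") for read in partition)

reads = ["ACGT", "GGCC", "ATAT", "CGCG"]
partitions = [reads[:2], reads[2:]]           # split input into partitions

with ThreadPoolExecutor() as pool:            # apply the same task per partition
    total = sum(pool.map(count_gc, partitions))
print(total)  # -> 10
```

Because the per-partition computation is independent, the same model scales out by adding partitions and workers without changing the task itself.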
Big data workflow management systems (BDWMSs) have recently emerged as popular platforms to conduct large-scale data analytics in the cloud. However, the protection of data confidentiality and secure execution of workflow applications remain important and challenging problems. Although a few data analytics systems, such as VC3 and Opaque, were developed to address security problems, they are limited to specific domains such as MapReduce-style and SQL-query workflows. A generic secure framework for BDWMSs is still missing. In this article, we propose SecdataVIEW, a distributed BDWMS that employs heterogeneous workers, such as Intel SGX and AMD SEV, to protect the execution of both workflows and workflow data, addressing three major security challenges: (1) reducing the TCB size of the big data workflow management system in the untrusted cloud by leveraging hardware-assisted TEEs and software attestation; (2) supporting Java-written workflow tasks to overcome SGX's lack of support for Java programs; and (3) reducing the adverse impact of SGX enclave memory-paging overhead through a "hybrid" workflow task scheduling system that selectively deploys sensitive tasks to a mix of SGX and SEV worker nodes. Our experimental results show that SecdataVIEW imposes moderate overhead on workflow execution time.
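The hybrid-scheduling trade-off between SGX's limited enclave memory and SEV's whole-VM encryption can be sketched with a toy routing rule. This is an invented heuristic, not SecdataVIEW's actual scheduler: the memory budget, task attributes, and worker labels are all assumptions for illustration.

```python
EPC_BUDGET_MB = 90   # assumed usable SGX enclave page cache, illustrative only

def pick_worker(task):
    """Route a task to a worker class by sensitivity and memory footprint."""
    if not task["sensitive"]:
        return "plain"                    # no TEE needed
    if task["mem_mb"] <= EPC_BUDGET_MB:
        return "sgx"                      # fits the enclave: no paging penalty
    return "sev"                          # large working set: avoid EPC paging

tasks = [
    {"name": "parse", "sensitive": False, "mem_mb": 50},
    {"name": "join",  "sensitive": True,  "mem_mb": 40},
    {"name": "train", "sensitive": True,  "mem_mb": 4000},
]
print({t["name"]: pick_worker(t) for t in tasks})
```

The rule captures the reported overhead profile: small sensitive tasks take the higher but bounded SGX cost, while memory-hungry sensitive tasks run under SEV, whose overhead stays low because it avoids enclave paging.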