ISBN: 9781467302685 (print)
Resource allocation policies in public Clouds are today largely agnostic to the requirements that distributed applications have of their underlying infrastructure. As a result, assumptions about data-center topology that are built into distributed data-intensive applications are often violated, impacting performance and availability goals. In this paper we describe a management system that discovers a limited amount of information about Cloud allocation decisions (in particular, VMs of the same user that are collocated on a physical machine) so that data-intensive applications can adapt to those decisions and achieve their goals. Our distributed discovery process is based on either application-level techniques (measurements) or a novel lightweight and privacy-preserving Cloud management API proposed in this paper. Using the Hadoop distributed file system as a case study, we show that VM collocation in a Cloud setup occurs in commercial platforms and that our methodologies can handle its impact in an effective, practical, and scalable manner.
ISBN: 0769512607 (print)
The success of large-scale multi-national projects like the forthcoming analysis of the LHC particle collision data at CERN relies to a great extent on the ability to efficiently utilize computing and data-storage resources at geographically distributed sites. Currently, much effort is spent on the design of Grid management software (DataGrid, Globus, etc.), while the effective integration of computing nodes has so far been largely neglected. This is the focus of our work. We present a framework for a high-performance cluster that can be used as a reliable computing node in the Grid. We outline the cluster architecture, the management of distributed data, and the seamless integration of the cluster into the Grid environment.