MapReduce is currently an attractive model for data-intensive applications due to its easy programming interface, high scalability, and fault tolerance. It is well suited to applications that process large data sets with distributed resources, such as web data analysis, bioinformatics, and high performance computing. There are many studies of job scheduling mechanisms for MapReduce in shared clusters. However, there is also a need to schedule workflow services composed of multiple MapReduce tasks with precedence dependencies across multiple processing nodes. The contribution of this paper is a scheduling mechanism for a workflow service containing multiple MapReduce jobs. The workflow application has precedence dependency constraints among multiple tasks, represented as a directed acyclic graph (DAG). Also, to reduce data transfer cost under limited bisection bandwidth, a data dependency criterion should be considered when scheduling multiple map-reduce jobs in a workflow. The proposed scheduling mechanism provides 1) scheduling of MapReduce tasks with respect to precedence constraints and 2) a pre-data placement method that considers data dependency constraints to save data transfer cost over the network.
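A workflow with precedence constraints must be dispatched in an order compatible with its DAG. As a minimal illustration (not the paper's actual mechanism), the sketch below uses Kahn's algorithm to order hypothetical MapReduce jobs so that every job runs only after all of its upstream jobs; the job names and dependency list are invented for the example.

```python
from collections import deque

def precedence_order(jobs, deps):
    """Return a valid execution order for MapReduce jobs whose
    precedence constraints form a DAG.

    jobs : iterable of job names
    deps : list of (upstream, downstream) pairs, meaning the
           downstream job consumes the upstream job's output
    """
    indegree = {j: 0 for j in jobs}
    children = {j: [] for j in jobs}
    for up, down in deps:
        children[up].append(down)
        indegree[down] += 1

    # Kahn's algorithm: repeatedly release jobs with no unmet dependency.
    ready = deque(j for j in jobs if indegree[j] == 0)
    order = []
    while ready:
        job = ready.popleft()
        order.append(job)
        for child in children[job]:
            indegree[child] -= 1
            if indegree[child] == 0:
                ready.append(child)

    if len(order) != len(indegree):
        raise ValueError("cycle detected: not a DAG")
    return order

# Example workflow: two independent map-reduce jobs feed a join job,
# whose output feeds a final aggregation job.
jobs = ["clean_logs", "clean_clicks", "join", "aggregate"]
deps = [("clean_logs", "join"), ("clean_clicks", "join"),
        ("join", "aggregate")]
print(precedence_order(jobs, deps))
```

In a real scheduler the `ready` queue would be consulted whenever cluster slots free up, so independent jobs (here, the two cleaning jobs) can run concurrently.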
MapReduce is an emerging paradigm for processing massive data over computing clusters. It provides an easy programming interface, high scalability, and fault tolerance. To achieve better performance, many scheduling issues for map-reduce jobs in shared cluster environments have been studied. In particular, there is a need to schedule workflow services composed of multiple MapReduce tasks with precedence dependencies in shared cluster environments. By using the list scheduling approach, the issue of precedence constraints can be addressed. A major factor affecting the performance of map-reduce jobs is locality constraints, which reduce data transfer cost under limited bisection bandwidth. When multiple map-reduce jobs in a workflow run in shared clusters, concurrency should also be considered alongside locality when placing data sets. The proposed scheduling approach provides 1) a data pre-placement strategy that improves locality and concurrency and 2) a scheduling algorithm that considers locality and concurrency.
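To give the pre-placement idea some shape, here is a deliberately simple greedy sketch, not the paper's strategy: each dataset is placed on the node that already holds the most data read by the same jobs (locality), with ties broken toward the least-loaded node (concurrency). All dataset, job, and node names are invented for illustration.

```python
def preplace(datasets, consumers, nodes):
    """Greedy pre-placement sketch balancing locality and concurrency.

    datasets  : {dataset name: size}
    consumers : {dataset name: set of job names that read it}
    nodes     : list of node names
    """
    load = {n: 0 for n in nodes}       # total bytes placed per node
    placed = {n: [] for n in nodes}    # datasets assigned per node
    # Place the biggest datasets first, so the largest potential
    # transfers are the ones most likely to be avoided.
    for name in sorted(datasets, key=datasets.get, reverse=True):
        def affinity(node):
            # Count shared consumer jobs with data already on the node.
            return sum(len(consumers[name] & consumers[other])
                       for other in placed[node])
        best = max(nodes, key=lambda n: (affinity(n), -load[n]))
        placed[best].append(name)
        load[best] += datasets[name]
    return placed

datasets = {"logs": 40, "clicks": 30, "dict": 5}
consumers = {"logs": {"join"}, "clicks": {"join"}, "dict": {"translate"}}
print(preplace(datasets, consumers, ["n1", "n2"]))
```

In the example, "logs" and "clicks" end up co-located because the same join job reads both, while the unrelated "dict" dataset lands on the emptier node.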
In an e-service environment, the service contract is important for assuring business interoperability and quality of service. Combining the service contract and the process model will facilitate analyzing the service process and mo...
ISBN:
(Print) 9781467312882
This paper presents a novel WQA (Web Question Answering) approach based on the combination of CCG (Combinatory Categorial Grammar) and DL (Description Logic) ontology, in order to improve semantic-level accuracy through deep text understanding capabilities. We propose DL-based semantic modeling, i.e., translating the lambda-expression encoding of a question's meaning into DL-based semantic representations. The advantage of this approach is the seamless exploitation of existing semantic resources coded as DL ontologies, which are widespread in areas such as the Semantic Web and conceptual modeling. The experiments are conducted on a repository of complex Chinese questions that involve the satisfaction of object property restrictions. The experimental results show that producing semantic representations by combining CCG parsing with DL reasoning is an effective approach to question understanding at the semantic level, in terms of both improved understanding accuracy and semantic resource exploitation.
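To make the translation step concrete, here is a tiny hypothetical sketch (not the paper's system): a conjunctive lambda-expression body, already decomposed into unary predicates on the lambda variable and (role, individual) pairs, is rendered as a DL concept in Manchester-like syntax. All predicate, role, and individual names are invented.

```python
def lambda_to_dl(unary_preds, role_fillers):
    """Render lambda x. P1(x) & ... & R1(x, a1) & ... as a DL concept.

    unary_preds  : class names applied to the lambda variable
    role_fillers : (role, individual) pairs, each becoming an
                   existential restriction onto a nominal {individual}
    """
    parts = list(unary_preds)
    parts += [f"({role} some {{{ind}}})" for role, ind in role_fillers]
    return " and ".join(parts)

# lambda x. City(x) & locatedIn(x, China)  ->  DL concept expression
print(lambda_to_dl(["City"], [("locatedIn", "China")]))
```

The resulting concept expression could then be handed to a DL reasoner to retrieve its instances from the ontology, which is the kind of resource exploitation the abstract describes.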
MapReduce is an emerging paradigm for data-intensive processing with the support of cloud computing platforms. It provides convenient programming interfaces to distribute data-intensive work across a cluster environment. The strengths of MapReduce are fault tolerance, an easy programming structure, and high scalability. A variety of applications have adopted MapReduce, including scientific analysis, web data processing, and high performance computing. Data-intensive computing systems, such as Hadoop and Dryad, should provide an efficient scheduling mechanism for enhanced utilization in a shared cluster environment. The problems of scheduling map-reduce jobs are mostly caused by locality and synchronization overhead. In addition, there is a need to schedule multiple jobs in a shared cluster under fairness constraints. After introducing the scheduling problems with regard to locality, synchronization, and fairness constraints, this paper reviews a collection of scheduling methods for handling these issues in MapReduce, and compares the different methods by evaluating their features, strengths, and weaknesses. For resolving synchronization overhead, two categories of studies, asynchronous processing and speculative execution, are introduced. For fairness constraints with locality improvement, delay scheduling in Hadoop and the Quincy scheduler in Dryad are discussed.
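The core trade-off behind delay scheduling can be sketched in a few lines. This is a simplified illustration of the idea behind Hadoop's fair scheduler, not its actual implementation: when a slot frees on a node, jobs are offered it in fairness order, and a job with no local data there is skipped, but only a bounded number of times before it accepts a non-local slot. The job records and node names are invented.

```python
from collections import namedtuple

# Minimal job record: name plus the set of nodes holding its input.
Job = namedtuple("Job", "name local_nodes")

def delay_schedule(job_queue, free_node, skip_counts, max_skips=3):
    """Delay-scheduling sketch.

    Walk the queue in fairness order when a slot frees on `free_node`.
    Returns (job, ran_locally), or (None, False) if the slot stays idle.
    `skip_counts` maps job name -> times skipped; mutated in place.
    """
    for job in job_queue:
        if free_node in job.local_nodes:
            skip_counts[job.name] = 0        # locality achieved; reset
            return job, True
        if skip_counts.get(job.name, 0) >= max_skips:
            return job, False                # waited long enough: run remotely
        skip_counts[job.name] = skip_counts.get(job.name, 0) + 1
    return None, False
```

A short wait (a handful of skipped offers) is usually enough for a local slot to appear, so a small `max_skips` buys near-optimal locality at a bounded cost to fairness.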
Since both the consumers and the data centers of a cloud service provider can be geographically distributed, the provider needs to allocate each consumer request to an appropriate data center among the distributed data centers, so that consumers are satisfied with the service in terms of fast allocation time and execution response time. In this paper, we propose an adaptive resource allocation model that allocates a consumer's job to an appropriate data center. The method for adaptively finding a proper data center is based on two evaluations: 1) the geographical distance (network delay) between a consumer and the data centers, and 2) the workload of each data center. The proposed model is implemented in an agent-based test bed that simulates a cloud computing environment adopting the proposed adaptive resource allocation model. Empirical results were obtained from simulations using the test bed. The results suggest that the proposed model can successfully allocate consumers' requests to the data center closest to each consumer. Also, the proposed model shows a better allocation response time than related resource allocation models.
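The two-criteria selection above can be sketched as a simple scoring rule. The linear weighted form, the weight value, and all data center figures below are assumptions for illustration, not the paper's model: each data center is scored by normalized network delay to the consumer plus its current utilization, and the minimum-score center wins.

```python
def pick_data_center(centers, delay_weight=0.5):
    """Pick a data center by weighted network delay and workload.

    centers      : {name: (delay_ms, utilization in [0, 1])}
    delay_weight : how much delay matters relative to workload
    """
    max_delay = max(d for d, _ in centers.values()) or 1
    def score(name):
        delay, util = centers[name]
        # Normalize delay so both criteria live on a [0, 1] scale.
        return delay_weight * (delay / max_delay) + (1 - delay_weight) * util
    return min(centers, key=score)

centers = {
    "eu":   (20.0, 0.90),   # close to the consumer, but busy
    "us":   (90.0, 0.20),   # farther away, but mostly idle
    "asia": (150.0, 0.40),
}
print(pick_data_center(centers))
```

With these numbers the idle "us" center beats the nearby but overloaded "eu" one, showing why workload has to be weighed against pure proximity.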
ISBN:
(Print) 9781457710018
The effort of the "Smart Grid" is to modernize grid infrastructure, build intelligence into power grids and delivery systems, and improve their interfaces to customer premises. However, perspectives range from an emphasis on infrastructure to an emphasis on new paradigm-shifting applications. Alternatively, the smart grid can be thought of as the advanced information technologies that enable the desired analytical applications. As a general understanding, we believe that the Smart Grid needs to integrate power system analysis, computing, and economics to enhance grid reliability, efficiency, and security, and to contribute to the climate change strategic goal. In this Supersession, the analytics that empower smart grid applications are discussed, and the practical implementation and integration challenges will be presented.