ISBN:
(Print) 9783319681795; 9783319681788
In this paper, we present a contribution to the Single Source Shortest Path Problem (SSSPP) in large-scale graphs with the A* algorithm. A* is one of the most efficient graph traversal algorithms because it is driven by a heuristic that guides the search toward the optimal path. However, the A* approach is not efficient when the graph is too large to be processed, due to its exponential time complexity. We propose a MapReduce-based approach called MRA* (MapReduce-A*), which combines the A* algorithm with the MapReduce paradigm to compute the shortest path in a parallel and distributed environment. We performed experiments on a Hadoop multi-node cluster, and our results show that the proposed approach outperforms the A* algorithm and significantly reduces the computational time.
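The abstract does not detail MRA*'s map and reduce formulation, so the following is only a minimal single-machine sketch of how one map/reduce round of shortest-path relaxation can be combined with an A*-style heuristic ordering. The toy graph, heuristic values, and function names (map_relax, reduce_min) are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only: one map/reduce round of parallel shortest-path
# relaxation with an A*-style heuristic ordering; NOT the authors' MRA*.
from collections import defaultdict

# Hypothetical toy graph: node -> list of (neighbor, edge_weight)
graph = {
    "s": [("a", 2), ("b", 5)],
    "a": [("b", 1), ("t", 7)],
    "b": [("t", 2)],
    "t": [],
}
heuristic = {"s": 4, "a": 3, "b": 2, "t": 0}  # assumed admissible estimates to "t"

def map_relax(node, dist):
    """Map phase: emit tentative distances to each neighbor."""
    yield node, dist                      # keep own distance
    for nbr, w in graph[node]:
        yield nbr, dist + w

def reduce_min(emitted):
    """Reduce phase: keep the smallest tentative distance per node."""
    best = defaultdict(lambda: float("inf"))
    for node, dist in emitted:
        best[node] = min(best[node], dist)
    return dict(best)

# Iterate rounds until distances stop improving, processing nodes in order
# of dist + heuristic, as A* would prioritize them.
dist = {"s": 0}
while True:
    frontier = sorted(dist.items(), key=lambda kv: kv[1] + heuristic[kv[0]])
    emitted = [pair for node, d in frontier for pair in map_relax(node, d)]
    new_dist = reduce_min(emitted)
    if new_dist == dist:
        break
    dist = new_dist
print(dist["t"])  # shortest s -> t distance (here: 5)
```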
ISBN:
(Print) 9781728199245
In heterogeneous computing systems, general purpose CPUs are coupled with co-processors of different architectures, like GPUs and FPGAs. Applications may take advantage of this heterogeneous device ensemble to accelerate execution. However, developing heterogeneous applications requires specific programming models, under which applications unfold into code components targeting different computing devices. OpenCL is one of the main programming models for heterogeneous applications, set apart from others by its openness, vendor independence, and support for different co-processors. In the original OpenCL application model, a heterogeneous application starts in a certain host node and then resorts to the local co-processors attached to that host. Therefore, co-processors at other nodes, networked with the host node, are inaccessible and cannot be used to accelerate the application. rOpenCL (remote OpenCL) overcomes this limitation for a significant subset of the OpenCL 1.2 API, offering OpenCL applications transparent access to remote devices through a TCP/IP-based network. This paper presents the architecture and the most relevant implementation details of rOpenCL, together with the results of a preliminary set of reference benchmarks. These confirm the stability of the current prototype and show that, in many scenarios, the network overhead is smaller than expected.
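To illustrate why transparent access to remote devices matters, below is a minimal host-side OpenCL example using the pyopencl binding (the binding choice is an assumption; the paper targets the OpenCL 1.2 C API). The point is that under rOpenCL, remote devices would simply appear in the same platform/device enumeration, leaving application code like this unchanged.

```python
# Minimal OpenCL host-side sketch (pyopencl binding assumed). Under rOpenCL,
# remote devices would show up in this very enumeration, so the code below
# would not need to change to use them.
import numpy as np
import pyopencl as cl

# Enumerate every platform/device visible to the OpenCL runtime.
for plat in cl.get_platforms():
    for dev in plat.get_devices():
        print(plat.name, "->", dev.name)

# Use the first available device (local today; possibly remote under rOpenCL).
ctx = cl.Context(devices=[cl.get_platforms()[0].get_devices()[0]])
queue = cl.CommandQueue(ctx)

src = """
__kernel void scale(__global float *x, const float a) {
    int i = get_global_id(0);
    x[i] = a * x[i];
}
"""
prg = cl.Program(ctx, src).build()

x = np.arange(8, dtype=np.float32)
buf = cl.Buffer(ctx, cl.mem_flags.READ_WRITE | cl.mem_flags.COPY_HOST_PTR, hostbuf=x)
prg.scale(queue, x.shape, None, buf, np.float32(2.0))
cl.enqueue_copy(queue, x, buf)
print(x)  # doubled values
```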
ISBN:
(Print) 9781538650356
Spatial association mining, one of the important techniques for spatial data mining, is used to discover interesting relationship patterns among spatial features, based on spatial proximity, from a large spatial database. Explosive growth in georeferenced data has emphasized the need to develop computationally efficient methods for analyzing big spatial data. Parallel and distributed computing is an effective and widely used strategy for speeding up algorithms on large-scale datasets. This work presents parallel spatial association mining on the Spark RDD framework, a specially designed in-memory parallel computing model that supports iterative algorithms. Initial experimental results show that the Spark-based algorithm performs significantly better than the MapReduce-based method for spatial association pattern mining.
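The paper's algorithm is not given in the abstract; the sketch below shows only one plausible building block on the Spark RDD API: counting co-occurring feature-type pairs per proximity neighborhood, assuming an upstream spatial join has already grouped nearby objects into neighborhoods. The input format and names are hypothetical.

```python
# Illustrative building block only (not the paper's algorithm): counting
# co-occurring spatial feature pairs per proximity neighborhood with PySpark.
from itertools import combinations
from pyspark import SparkContext

sc = SparkContext(appName="spatial-assoc-sketch")

# Assumed input: (neighborhood_id, feature_type) pairs produced by an
# upstream spatial-join/grid step that groups nearby objects together.
records = sc.parallelize([
    ("cell_1", "hospital"), ("cell_1", "pharmacy"), ("cell_1", "bus_stop"),
    ("cell_2", "hospital"), ("cell_2", "pharmacy"),
    ("cell_3", "school"),   ("cell_3", "bus_stop"),
])

pair_counts = (
    records.groupByKey()                                          # features per neighborhood
           .flatMap(lambda kv: combinations(sorted(set(kv[1])), 2))  # candidate pairs
           .map(lambda pair: (pair, 1))
           .reduceByKey(lambda a, b: a + b)                       # support count per pair
)

for pair, count in pair_counts.collect():
    print(pair, count)   # e.g. ('hospital', 'pharmacy') 2

sc.stop()
```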
ISBN:
(Print) 9783540734345
A process of Knowledge Discovery in Databases (KDD) involving large amounts of data requires a considerable amount of computational power. The process may be run on dedicated and expensive machinery or, for some tasks, one can use distributed computing techniques on a network of affordable machines. In either approach, the user usually has to specify the workflow of the sub-tasks composing the whole KDD process before execution starts. In this paper we propose a technique that we call distributed Generative Data Mining. The generative feature of the technique is its capability of generating new sub-tasks of the Data Mining analysis process at execution time. The workflow of DM sub-tasks is, therefore, dynamic. To deploy the proposed technique we extended the distributed Data Mining system HARVARD and adapted an Inductive Logic Programming system (IndLog) used in a Relational Data Mining task. As a proof of concept, the extended system was used to analyse an artificial dataset of a credit scoring problem with eighty million records.
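The HARVARD/IndLog extension itself is not described in the abstract; the following is a minimal sketch of the generative idea only: sub-tasks that may enqueue further sub-tasks at execution time, so the workflow is not fixed before execution. All names (analyse_partition, the splitting rule) are hypothetical.

```python
# Minimal sketch of a "generative" workflow: sub-tasks may create further
# sub-tasks at execution time, so the task graph is not fixed up front.
# This illustrates the idea only, not the HARVARD/IndLog system.
from concurrent.futures import ThreadPoolExecutor
from queue import Queue

def analyse_partition(partition, depth):
    """Hypothetical mining sub-task: if a partition is still too large,
    split it and generate new sub-tasks instead of processing it whole."""
    if len(partition) > 2 and depth < 3:
        mid = len(partition) // 2
        return [], [(partition[:mid], depth + 1), (partition[mid:], depth + 1)]
    return [f"pattern({partition})"], []            # leaf task: emit results

tasks = Queue()
tasks.put((list(range(10)), 0))                      # initial sub-task
results = []

with ThreadPoolExecutor(max_workers=4) as pool:
    futures = []
    while not tasks.empty() or futures:
        while not tasks.empty():
            part, depth = tasks.get()
            futures.append(pool.submit(analyse_partition, part, depth))
        done, futures = futures, []
        for fut in done:
            found, new_tasks = fut.result()
            results.extend(found)
            for t in new_tasks:                      # dynamically generated work
                tasks.put(t)

print(len(results), "leaf patterns discovered")
```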
The amount of information available through the Internet has shown significant growth in the last decade. The information can come from various sources, such as scientific experiments in particle accelerators, the recorded flight data of a commercial aircraft, or sets of documents from a given domain such as medical articles, news headlines from a newspaper, or social network contents. Due to the volume of data that must be analyzed, it is necessary to endow search engines with new tools that allow the user to obtain the desired information in a timely and accurate manner. One approach is the annotation of documents with their relevant expressions. The extraction of relevant expressions from natural language text documents can be accomplished using semantic, syntactic, or statistical techniques. Although the latter tend to be less accurate, they have the advantage of being language-independent. This investigation was performed in the context of LocalMaxs, a statistical and thus language-independent method capable of extracting relevant expressions from natural language corpora. However, due to the large volume of data involved, sequential implementations of the above techniques have severe limitations in both execution time and memory space. In this thesis we propose a distributed architecture and strategies for parallel implementations of statistical-based extraction of relevant expressions from large corpora. A methodology was developed for modeling and evaluating those strategies, based on empirical and theoretical approaches to estimate the statistical distribution of n-grams in natural language corpora. These approaches were applied to guide the design and evaluation of the behavior of LocalMaxs parallel and distributed implementations on cluster and cloud computing platforms. The implementation alternatives were compared regarding their precision and recall, and their performance metrics, namel
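As a rough illustration of the statistical, language-independent approach mentioned above, the sketch below selects n-grams whose cohesion ("glue") is a local maximum relative to their sub- and super-n-grams. The glue function and the acceptance test are simplified placeholders, not the exact LocalMaxs definition used in the thesis.

```python
# Simplified single-machine sketch of a local-maximum selection of n-grams.
# The cohesion ("glue") function and the acceptance test are placeholders,
# not the precise LocalMaxs criterion.
from collections import Counter

text = ("the united states of america signed a trade agreement with "
        "the united states of america and the european union").split()

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

counts = Counter()
for n in range(1, 5):
    counts.update(ngrams(text, n))
total = len(text)

def glue(gram):
    """Placeholder cohesion: joint frequency vs. product of part frequencies."""
    p = counts[gram] / total
    denom = 1.0
    for w in gram:
        denom *= counts[(w,)] / total
    return p / denom

def is_relevant(gram):
    """Accept an n-gram whose glue is a local maximum w.r.t. its
    (n-1)-gram parts and the (n+1)-grams that contain it."""
    n = len(gram)
    subs = [gram[:-1], gram[1:]] if n > 2 else []
    supers = [g for g in counts if len(g) == n + 1
              and (g[:-1] == gram or g[1:] == gram)]
    return (all(glue(gram) >= glue(s) for s in subs)
            and all(glue(gram) > glue(s) for s in supers))

candidates = [g for g in counts if 2 <= len(g) <= 3 and counts[g] > 1]
print([" ".join(g) for g in candidates if is_relevant(g)])
```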
ISBN:
(Print) 354040788X
This paper presents the design and experimental evaluation of two dynamic load partitioning and balancing strategies for parallel Structured Adaptive Mesh Refinement (SAMR) applications: the Level-based Partitioning Algorithm (LPA) and the Hierarchical Partitioning Algorithm (HPA). These techniques specifically address the computational and communication heterogeneity across the refinement levels of the adaptive grid hierarchy underlying SAMR methods. An experimental evaluation of the partitioning schemes is also presented.
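LPA and HPA themselves cannot be reproduced from the abstract; the sketch below only illustrates the level-based idea of balancing each refinement level's patches across processors independently, so that no processor is overloaded at any single level. The patch workloads and the greedy assignment are assumptions.

```python
# Minimal sketch of a level-based partitioning idea (not the paper's exact
# LPA/HPA algorithms): patches are balanced across processors one refinement
# level at a time, so each level's work is spread over all processors.
import heapq

# Hypothetical patches: (refinement_level, workload_units)
patches = [(0, 8), (0, 6), (1, 4), (1, 4), (1, 3), (2, 2), (2, 2), (2, 1)]
num_procs = 3

assignment = {p: [] for p in range(num_procs)}
for level in sorted({lvl for lvl, _ in patches}):
    # Greedy bin-packing within this level: heaviest patch to lightest proc.
    level_patches = sorted((w for lvl, w in patches if lvl == level), reverse=True)
    heap = [(0, p) for p in range(num_procs)]   # (load at this level, proc id)
    heapq.heapify(heap)
    for work in level_patches:
        load, proc = heapq.heappop(heap)
        assignment[proc].append((level, work))
        heapq.heappush(heap, (load + work, proc))

for proc, items in assignment.items():
    per_level = {}
    for level, work in items:
        per_level[level] = per_level.get(level, 0) + work
    print(f"proc {proc}: {per_level}")   # per-level load on each processor
```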
ISBN:
(Print) 9781467381147
A realization of the control volume method for calculating the non-stationary heat conduction problem is presented in this article. The method can be applied to simulate the solidification of metal castings in sand molds, among other applications. A distinctive feature of the developed method is the possibility of using distributed computing, which gives a good gain in calculation speed but demands a larger volume of random access memory. In the calculation example, for a grid of 1 million elements, an increase in calculation speed of 15-20% was reached when using 2 cores of an Intel Core i5 processor, and of 30% when using 3 cores. It is also possible to increase the accuracy of the calculations by increasing the number of calculated elements for the same calculation time.
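The abstract does not give the discretization or the multi-core decomposition, so the following is a minimal serial sketch of an explicit control-volume (5-point stencil) update for 2D non-stationary heat conduction on a uniform grid; the material parameters, grid size, and boundary condition are illustrative.

```python
# Minimal serial sketch of an explicit control-volume update for 2D heat
# conduction on a uniform grid; the paper's actual scheme, boundary handling
# and multi-core decomposition are not reproduced here.
import numpy as np

nx, ny = 200, 200
dx = dy = 1e-3                        # cell size [m]
alpha = 1e-5                          # assumed thermal diffusivity [m^2/s]
dt = 0.2 * min(dx, dy) ** 2 / alpha   # stable explicit time step

T = np.full((nx, ny), 20.0)           # initial temperature field [C]
T[0, :] = 700.0                       # hot boundary, e.g. molten metal side

for _ in range(500):
    # Energy balance over each interior control volume (5-point stencil).
    lap = ((T[2:, 1:-1] - 2 * T[1:-1, 1:-1] + T[:-2, 1:-1]) / dx**2 +
           (T[1:-1, 2:] - 2 * T[1:-1, 1:-1] + T[1:-1, :-2]) / dy**2)
    T[1:-1, 1:-1] += alpha * dt * lap

print(f"max interior temperature after 500 steps: {T[1:-1, 1:-1].max():.1f} C")
```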
ISBN:
(Digital) 9781119743460; 9781119743446
ISBN:
(Print) 9781118709443
Covering dependability from both software and hardware perspectives, Dependable Computing: Design and Assessment looks at both sides of the problem. This book:
Provides an in-depth examination of dependability/fault tolerance topics
Describes dependability taxonomy, and briefly contrasts classical techniques with their modern counterparts or extensions
Walks up the system stack from the hardware logic via operating systems up to software applications with respect to how they are hardened for dependability
Describes the use of measurement-based analysis of computing systems
Illustrates technology through real-life applications
Discusses security attacks and unique dependability requirements for emerging applications, e.g., smart electric power grids and cloud computing
Finally, using critical societal applications such as autonomous vehicles, large-scale clouds, and engineering solutions for healthcare, the book illustrates the emerging challenges faced in making artificial intelligence (AI) and its applications dependable and trustworthy.
This book is suitable for those studying in the fields of computer engineering and computer science. Professionals who are working within the new reality to ensure dependable computing will find helpful information to support their efforts. With the support of practical case studies and use cases from both academia and real-world deployments, the book provides a journey of developments that include the impact of artificial intelligence and machine learning on this ever-growing field. This book offers a single compendium that spans the myriad areas in which dependability has been applied, providing theoretical concepts and applied knowledge with content that will excite a beginner, and rigor that will satisfy an expert. Accompanying the book is an online repository of problem sets and solutions, as well as slides fo
ISBN:
(Digital) 9781119912965
ISBN:
(Print) 9781119912934
A comprehensive guide to the principles, algorithms, and techniques underlying resource management for clouds, big data, and sensor-based systems. Resource Management on Distributed Systems provides helpful guidance by describing algorithms and techniques for managing resources on parallel and distributed systems, including grids,