ISBN: 9781450344333 (print)
Parallel processing of big graph-shaped data still presents many challenges. Several approaches have appeared recently, and a strong trend toward understanding graph computation as iterative vertex-centric computation has emerged. Several systems follow the vertex-centric approach, for example Pregel, Giraph, GraphLab and PowerGraph. Though programs developed in these systems run efficiently in parallel, writing vertex programs usually results in code with poor readability that is full of side effects and control statements unrelated to the algorithm. In this paper we introduce "s6raph", a new vertex-centric graph processing framework with a functional interface that allows the user to write clear and concise functions. The user can choose one of several default behaviours provided for the most common graph algorithms. We discuss the design of the functional interface and introduce our prototype implementation in Erlang.
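To illustrate what a vertex-centric computation with a side-effect-free vertex function can look like, here is a minimal Python sketch of single-source shortest paths. It is not s6raph's interface (the prototype is written in Erlang and its API is not given here); the names `sssp_vertex` and `run_supersteps`, the adjacency-list graph format, and the assumption of non-negative edge weights are all hypothetical choices made for this example.

```python
import math
from typing import Dict, List, Tuple

# Assumed graph format: every vertex is a key, mapped to (neighbour, weight) pairs.
Graph = Dict[str, List[Tuple[str, float]]]


def sssp_vertex(state: float, messages: List[float]) -> float:
    """Pure vertex function: keep the smallest distance seen so far."""
    return min([state, *messages])


def run_supersteps(graph: Graph, source: str) -> Dict[str, float]:
    """Generic driver: deliver messages, apply the vertex function, repeat.

    All message passing and control flow live here, so the vertex
    function itself stays free of side effects.
    """
    state = {v: math.inf for v in graph}
    inbox: Dict[str, List[float]] = {v: [] for v in graph}
    inbox[source] = [0.0]
    while any(inbox.values()):
        outbox: Dict[str, List[float]] = {v: [] for v in graph}
        for v, msgs in inbox.items():
            if not msgs:
                continue  # vertex is inactive in this superstep
            new = sssp_vertex(state[v], msgs)
            if new < state[v]:
                state[v] = new
                # Offer the improved distance to every out-neighbour.
                for nbr, weight in graph[v]:
                    outbox[nbr].append(new + weight)
        inbox = outbox
    return state


graph = {"a": [("b", 1.0), ("c", 4.0)], "b": [("c", 2.0)], "c": [], "d": []}
distances = run_supersteps(graph, "a")
```

The driver runs the supersteps sequentially; in a real framework the loop over active vertices is the part that would be distributed.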
Algorithmic skeletons are used as building-blocks to ease the task of parallel programming by abstracting the details of parallel implementation from the developer. Most existing libraries provide implementations of skeletons that are defined over flat data types such as lists or arrays. However, skeleton-based parallel programming is still very challenging as it requires intricate analysis of the underlying algorithm and often uses inefficient intermediate data structures. Further, the algorithmic structure of a given program may not match those of list-based skeletons. In this paper, we present a method to automatically transform any given program to one that is defined over a list and is more likely to contain instances of list-based skeletons. This facilitates the parallel execution of a transformed program using existing implementations of list-based parallel skeletons. Further, by using an existing transformation called distillation in conjunction with our method, we produce transformed programs that contain fewer inefficient intermediate data structures.
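The idea of recasting a program over a flat list so that a list-based skeleton applies can be shown with a small hand-written Python sketch. It does not reproduce the automatic transformation or distillation described in the abstract: the tree type `Node`, the `flatten` function and the `map_reduce` skeleton are hypothetical names for this example, and threads are used only to keep the sketch self-contained (in CPython they do not speed up CPU-bound work).

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from functools import reduce
from typing import Callable, List, Optional, TypeVar

A = TypeVar("A")
B = TypeVar("B")


@dataclass
class Node:
    value: int
    left: Optional["Node"] = None
    right: Optional["Node"] = None


def flatten(tree: Optional[Node]) -> List[int]:
    """In-order traversal: turn the tree-shaped input into a flat list."""
    if tree is None:
        return []
    return flatten(tree.left) + [tree.value] + flatten(tree.right)


def map_reduce(f: Callable[[A], B], op: Callable[[B, B], B], unit: B,
               xs: List[A], workers: int = 4) -> B:
    """List skeleton: map f over chunks in parallel, then fold with op.

    op must be associative and unit must be its identity element,
    otherwise chunking changes the result.
    """
    size = max(1, -(-len(xs) // workers))  # ceiling division
    chunks = [xs[i:i + size] for i in range(0, len(xs), size)]

    def fold_chunk(chunk: List[A]) -> B:
        return reduce(op, map(f, chunk), unit)

    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(fold_chunk, chunks))
    return reduce(op, partials, unit)


tree = Node(2, Node(1), Node(3))
sum_of_squares = map_reduce(lambda x: x * x, lambda a, b: a + b, 0, flatten(tree))
```

Note that `flatten` builds exactly the kind of intermediate list the abstract calls inefficient; removing such structures is the role the authors assign to distillation.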
ISBN: 9781509027712 (print)
Increasingly sophisticated, complex, and energy-efficient cyber-physical systems and wireless sensor networks are emerging, facilitated by recent advances in computing and sensor technologies. Integration of cyber-physical systems and wireless sensor networks with other contemporary technologies, such as unmanned aerial vehicles and fog or edge computing, enables the creation of completely new smart solutions. We present the concept of a Smart Mobile Access Point (SMAP), which is a key building block for a smart network, and propose an efficient placement approach for such SMAPs. SMAPs predict the behavior of the network, based on information collected from the network, and select the best approach to support the network at any given time. When needed, they autonomously change their positions to obtain a better configuration from the network performance perspective. Therefore, placement of SMAPs is an important issue in such a system. Initial placement of SMAPs is an NP problem, and evolutionary algorithms provide an efficient means to solve it. Specifically, we present a parallel implementation of the imperialistic competitive algorithm and an efficient evaluation (fitness) function to solve the initial placement of SMAPs in the fog computing context.
ISBN: 9781509032914 (print)
Streaming applications analyze vast data streams and require both high throughput and low latency. They are composed of operator graphs that produce and consume data tuples, where operators may be stateful, selective and user-defined. The streaming programming model exposes task and pipeline parallelism, which makes it suitable for building parallel systems. It does not naturally expose data parallelism, which must be extracted from streaming applications. This paper presents a compiler and runtime system that automatically extract data parallelism for distributed stream processing. Our approach guarantees safety in the presence of stateful, selective and user-defined operators. Data parallelization is safe if the sequential semantics of the application are preserved. The compiler ensures safety by considering each operator's selectivity, state and partitioning, as well as its dependencies on other operators in the graph. The distributed runtime system ensures that tuples always exit parallel regions in the same order they would without data parallelism, using the most efficient strategy identified by the compiler.
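One generic way to make tuples leave a parallel region in their original order is to tag each tuple with a sequence number on entry and buffer results at the exit until the next expected number arrives. The Python sketch below shows only that idea; it is an assumption for illustration, not the set of strategies the paper's compiler chooses among. The name `ordered_merge` and the convention that `None` marks a tuple dropped by a selective operator are hypothetical.

```python
import heapq
from typing import Iterable, Iterator, List, Optional, Tuple, TypeVar

T = TypeVar("T")


def ordered_merge(results: Iterable[Tuple[int, Optional[T]]]) -> Iterator[T]:
    """Release results in sequence order regardless of arrival order.

    Each item is (seq, value) with unique, gap-free sequence numbers
    starting at 0. A value of None means a selective operator filtered
    the tuple out; its sequence number must still be reported so the
    merger does not stall waiting for it.
    """
    pending: List[Tuple[int, Optional[T]]] = []
    next_seq = 0
    for seq, value in results:
        heapq.heappush(pending, (seq, value))
        # Drain every buffered result that is now in order.
        while pending and pending[0][0] == next_seq:
            _, ready = heapq.heappop(pending)
            next_seq += 1
            if ready is not None:
                yield ready


# Results arriving out of order from parallel replicas; tuple 3 was filtered.
arrivals = [(2, "c"), (0, "a"), (3, None), (1, "b"), (4, "e")]
merged = list(ordered_merge(arrivals))
```

Because `None` is reserved as the drop marker, this sketch cannot carry `None` as a real payload; a dedicated sentinel object would remove that limitation.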
ISBN: 9781467386067 (print)
This article highlights the value of improving classical algorithms and of analyzing and optimizing the efficiency of their parallel execution, depending on the situation of use. Search algorithms are now involved in many activities across different domains, from databases and online search engines to GPS systems and emergency situations. They are also found in networks, where priority schemes and data transfer speed matter a great deal today. They are useful even in top management systems, because every decision has an important time component and the search for information relies directly on this type of algorithm. This study offers an innovative and efficient approach to Dijkstra's algorithm, the Bellman-Ford algorithm, the Floyd-Warshall algorithm and the Viterbi algorithm through parallel programming, together with an analysis of the results obtained in different tests and a comparison of these search strategies on graph systems. It may prompt new approaches to these strategies and new discussion of the need to improve older algorithms.
ISBN: 9781509036820 (print)
Recent advancements in machine learning algorithms have transformed the data analytics domain and provided innovative solutions to inherently difficult problems. However, training models at scale over large data sets remains a daunting challenge. One such problem is the detection of overlapping communities within graphs. For example, a social network can be modeled as a graph where the vertices and edges represent individuals and their relationships. As opposed to the problem of graph partitioning or clustering, an individual can be part of multiple communities which significantly increases the problem complexity. In this paper, we present and evaluate an efficient parallel and distributed implementation of a Stochastic Gradient Markov Chain Monte Carlo algorithm that solves the overlapping community detection problem. We show that the algorithm can scale and process graphs consisting of billions of edges and tens of millions of vertices on a compute cluster of 65 nodes. To the best of our knowledge, this is the first time that the problem of deducing overlapping communities has been learned for problems of such a large scale.
ISBN: 9783319321523; 9783319321516 (print)
Scheduling of programs for hierarchical architectures of Chip Multi-Processor (CMP) modules interconnected by global data networks is the subject of this paper. The CMP modules are of two kinds: architecturally specialized modules, which execute time-critical computations, and standard CMP modules, which interconnect the specialized ones. Inside application programs, so-called architecturally supported regions are identified for efficient execution on dedicated architecturally supported modules. Programs are represented by macro dataflow graphs built of architecturally supported nodes and program glue nodes. The paper proposes a new task scheduling algorithm for programs meant for execution in such CMP-based systems. The algorithm is based on list scheduling with modified ETF (Earliest Task First) heuristics. It is assessed by experiments based on simulation of program execution, which show parallel speedup improvements.
ISBN: 9781509044573 (print)
MapReduce is a widely adopted computing framework for data-intensive applications running on clusters. We propose an approach to exploit data parallelism in XML processing using MapReduce in Hadoop. Our solution seamlessly integrates data storage, labelling, indexing, and parallel queries to process a massive amount of XML data. Specifically, we introduce an SDN labelling algorithm and a distributed hierarchical index using DHTs; we develop an efficient data retrieval approach called B-SLCA. More importantly, we design an advanced two-phase MapReduce solution that is able to efficiently address the issues of labelling, indexing, and query processing on big XML data. We implemented our solution on a real-world Hadoop cluster processing real-world datasets. Our experimental results show that SDN outperforms NCIM by up to a factor of 1.36 with an average of 1.17; our B-SLCA outperforms BwdSLCA by up to a factor of 1.96 with an average of 1.2.
ISBN: 9781509032105 (print)
This paper presents the design of a multicore processor based on a RISC architecture, using the Xilinx (R) development platform for design and a Spartan-6 FPGA for implementation. A lightweight multithreaded kernel module is implemented on top of the architecture to demonstrate its parallel programming potential. A task assigned to the processor is managed by the OS, which also controls the traffic. Benchmarking and profiling results are reported at the end of the paper; they lead to the conclusion that the presented design gives better results than its counterparts.
ISBN: 9781509052141 (print)
Fortran coarrays have been used as an extension to the standard for over 20 years, mostly on Cray systems. Their appeal to users increased substantially when they were standardised in 2010. In this work we show that coarrays offer simple and intuitive data structures for 3D cellular automata (CA) modelling of material microstructures. We show how coarrays can be used together with an MPI finite element (FE) library to create a two-way concurrent hierarchical and scalable multi-scale CAFE deformation and fracture framework. Design of a coarray cellular automata microstructure evolution library CGPACK is described. A highly portable MPI FE library ParaFEM was used in this work. We show that independently CGPACK and ParaFEM programs can scale up well into tens of thousands of cores. Strong scaling of a hybrid ParaFEM/CGPACK MPI/coarray multi-scale framework was measured on an important solid mechanics practical example of a fracture of a steel round bar under tension. That program did not scale beyond 7 thousand cores. Excessive synchronisation might be one contributing factor to relatively poor scaling. Therefore we conclude with a comparative analysis of synchronisation requirements in MPI and coarray programs. Specific challenges of synchronising a coarray library are discussed.