ISBN: 1595933816 (print)
In the design of classic computers, the parallel programming concept is used to abstract HW/SW interfaces during the high-level specification of application software. The software is then adapted to existing multiprocessor platforms using a low-level software layer that implements the programming model. Unlike classic computers, the design of heterogeneous MPSoC also includes building the processors and the other kinds of hardware components required to execute the software. In this case, the programming model hides both hardware and software refinements. This paper deals with parallel programming models that abstract both hardware and software interfaces in heterogeneous MPSoC design. Different abstraction levels will be needed. In the long term, the use of higher-level programming models will open new vistas for optimization and architecture exploration, such as CPU/RTOS tradeoffs.
This paper surveys the use of mathematical programming models for controlling environmental quality. The scope includes air, water, and land quality, stemming from the first works in the 1960s. It also includes integrated models, which are generally economic equilibrium models that have an equivalent mathematical program or use mathematical programming to compute a fixed point. A primary goal of this survey is to identify interesting research avenues for people in mathematical programming with an interest in applying it to help control our environment with as little economic sacrifice as possible.
We introduce steady state policies for a class of infinite horizon, deterministic, discounted dynamic programming models with compact–convex state spaces, compact action spaces and continuous cost and transition functions. Steady state policies are stationary policies under which a system moves to a steady state in a given number of steps. Because of their simple structure, it is sometimes easy to solve problems where one considers only steady state policies. Although such a restriction is sometimes costly (e.g., in many inventory models), there are cases where it is quite reasonable. In this paper we find conditions under which there exists a good steady state policy, i.e., one whose cost differs from the optimal cost by an amount whose bound is independent of the discount factor. (Such policies are optimal with respect to the average cost criterion.) We also give examples of models which satisfy these conditions. Our results depend on Brouwer’s fixed-point theorem.
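As a rough illustration of the definition above, the following sketch evaluates the discounted cost of a stationary policy that drives a deterministic system to a steady state in finitely many steps. It is not taken from the paper: the function name `steady_state_policy_cost`, the toy inventory-style model, and all parameter values are assumptions chosen only to make the idea concrete.

```python
def steady_state_policy_cost(x0, policy, transition, cost, beta, max_steps=1000):
    """Discounted cost of a stationary policy that reaches a fixed point.

    All arguments are hypothetical callables/values for illustration:
    policy(x) -> action, transition(x, a) -> next state, cost(x, a) -> float.
    """
    x, total, discount = x0, 0.0, 1.0
    for _ in range(max_steps):
        a = policy(x)
        nxt = transition(x, a)
        if nxt == x:
            # Steady state reached: the remaining costs form a geometric
            # series, so the tail has the closed form cost / (1 - beta).
            return total + discount * cost(x, a) / (1.0 - beta)
        total += discount * cost(x, a)
        discount *= beta
        x = nxt
    raise ValueError("policy did not reach a steady state")


# Toy model (assumed): move the state one unit per step towards a target of 3.
TARGET = 3

def policy(x):
    return (x < TARGET) - (x > TARGET)  # +1, 0 or -1

def transition(x, a):
    return x + a

def cost(x, a):
    return abs(x - TARGET) + abs(a) + 1.0  # deviation + adjustment + fixed cost

total_cost = steady_state_policy_cost(0, policy, transition, cost, beta=0.5)
```

Because the policy is stationary and the system is deterministic, the cost splits into a finite transient part plus a closed-form tail, which is the structural simplicity the abstract refers to.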
The present work deals with the usual stationary decision model of dynamic programming. The imposed convergence condition on the expected total rewards is so general that both the negative (unbounded) case and the positive (unbounded) case are included. However, the gambling model studied by Dubins and Savage is not covered by the present model. In addition to the convergence condition, a continuity and compactness condition is imposed. The main result states that the supremum of the expected total rewards under all stationary policies is equal to the supremum under all (possibly randomized and non-Markovian) policies.
High-level parallel programming models (PMs) are becoming crucial in order to extract the computational power of current on-node multi-threaded parallelism. The most popular PMs, such as OpenMP or OmpSs, are directive-based: the complexity of the hardware is hidden by the underlying runtime system, improving coding productivity. The implementations of OpenMP usually rely on POSIX threads (pthreads), offering excellent performance for coarse-grained parallelism and a perfect match with the current hardware. OmpSs is a task-oriented PM based on an ad hoc runtime solution called Nanos++; it is the precursor of the tasking parallelism in the OpenMP tasking specification. A recent trend in runtimes and applications points to leveraging massive on-node parallelism in conjunction with fine-grained and dynamic scheduling paradigms. In this paper we analyze the behavior of the OpenMP and OmpSs PMs on top of the recently emerged Generic Lightweight Threads (GLT) API. GLT exposes a common API for lightweight thread (LWT) libraries that offers the possibility of running the same application over different native LWT solutions. We describe the design details of those high-level PMs implemented on top of GLT and analyze different scenarios in order to assess where the use of LWTs may benefit application performance. Our work reveals those scenarios where LWTs outperform pthread-based solutions and compares the performance between an ad hoc solution and a generic implementation. (C) 2018 Elsevier B.V. All rights reserved.
In today's multiprocessor SoCs (MPSoCs), parallel programming models are needed to fully exploit hardware capabilities and to achieve the 100 Gops/W energy efficiency target required for Ambient Intelligence Applications. However, mapping abstract programming models onto tightly power-constrained hardware architectures imposes overheads which might seriously compromise performance and energy efficiency. The objective of this work is to perform a comparative analysis of message passing versus shared memory as programming models for single-chip multiprocessor platforms. Our analysis is carried out from a hardware-software viewpoint: We carefully tune hardware architectures and software libraries for each programming model. We analyze representative application kernels from the multimedia domain, and identify application-level parameters that heavily influence performance and energy efficiency. Then, we formulate guidelines for the selection of the most appropriate programming model and its architectural support.
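To make the contrast between the two programming models concrete, here is a minimal sketch of the same reduction written both ways. It uses ordinary Python threads on a host machine rather than an MPSoC platform, so it illustrates only the programming style, not the performance or energy behaviour studied in the paper. The function names and the worker count are assumptions.

```python
import queue
import threading


def sum_shared_memory(data, workers=4):
    """Workers accumulate partial sums into one shared variable."""
    total = 0
    lock = threading.Lock()

    def work(chunk):
        nonlocal total
        partial = sum(chunk)
        with lock:  # communication is implicit, synchronisation is explicit
            total += partial

    threads = [threading.Thread(target=work, args=(data[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def sum_message_passing(data, workers=4):
    """Workers send partial sums as messages; no variable is shared."""
    mailbox = queue.Queue()

    def work(chunk):
        mailbox.put(sum(chunk))  # communication is explicit

    threads = [threading.Thread(target=work, args=(data[i::workers],))
               for i in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    # The receiver collects exactly one message per worker.
    return sum(mailbox.get() for _ in range(workers))
```

In the shared-memory version the cost sits in protecting the shared variable; in the message-passing version it sits in moving data through the mailbox. Which overhead dominates on a given platform is the kind of question the comparative analysis addresses.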
Recently, General Purpose Graphical Processing Units (GP-GPUs) have been identified as an intriguing technology to accelerate numerous data-parallel algorithms. Several GPU architectures and programming models are beginning to emerge and establish their niche in the High-Performance Computing (HPC) community. New massively parallel architectures such as Nvidia's Fermi and AMD/ATi's Radeon pack tremendous computing power into their large number of multiprocessors. Their performance is unleashed using one of the two GP-GPU programming models: Compute Unified Device Architecture (CUDA) and Open Computing Language (OpenCL). Both of them offer constructs and features that have a direct bearing on the application runtime performance. In this paper, we compare the two GP-GPU architectures and the two programming models using a two-level character recognition network. The two-level network is developed using four different Spiking Neural Network (SNN) models, each with different ratios of computation-to-communication requirements. To compare the architectures, we have chosen the two extremes of the SNN models for implementation of the aforementioned two-level network. An architectural performance comparison of the SNN application running on Nvidia's Fermi and AMD/ATi's Radeon is done using the OpenCL programming model, exhausting all of the optimization strategies plausible for the two architectures. To compare the programming models, we implement the two-level network on Nvidia's Tesla C2050 based on the Fermi architecture. We present a hierarchy of implementations, where we successively add optimization techniques associated with the two programming models. We then compare the two programming models at these different levels of implementation and also present the effect of the network size (problem size) on the performance. We report significant application speed-up, as high as 1095x for the most computation-intensive SNN neuron model, against a serial implementation on th
In recent years, processing and analysing large graphs has become a major need in many research areas. Distributed graph processing programming models and frameworks arose as a natural solution for processing large volumes of linked data, such as data originating from social media. These solutions are distributed by design and help developers to perform operations on the graph, sometimes reaching almost real-time performance even on huge graphs. Some of the available graph processing frameworks exploit generic data processing models, like MapReduce, while others were specifically built for graph processing, introducing techniques such as vertex or edge partitioning and graph-oriented programming models. In this work, we analyse the properties of recent and widely popular frameworks designed to process large-scale graphs, from the perspective of the adopted programming model, with the goal of assisting software developers and designers in choosing the most adequate tool.
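For readers unfamiliar with graph-oriented programming models, the sketch below shows the vertex-centric style in miniature: each vertex holds a label, reads the messages sent to it in the previous superstep, and sends messages to its neighbours only when its label changes. It runs on a single machine and does not follow the API of any particular framework. The function name and the choice of connected components as the example are assumptions.

```python
def connected_components(edges):
    """Label each vertex with the smallest vertex id in its component."""
    # Build an undirected adjacency structure from the edge list.
    neighbours = {}
    for u, v in edges:
        neighbours.setdefault(u, set()).add(v)
        neighbours.setdefault(v, set()).add(u)

    label = {v: v for v in neighbours}

    # Superstep 0: every vertex sends its own label to its neighbours.
    inbox = {v: [] for v in neighbours}
    for v in neighbours:
        for n in neighbours[v]:
            inbox[n].append(label[v])

    # Later supersteps: a vertex is active only if it received messages,
    # and it sends again only if its label improved.
    while any(inbox.values()):
        outbox = {v: [] for v in neighbours}
        for v, messages in inbox.items():
            if messages and min(messages) < label[v]:
                label[v] = min(messages)
                for n in neighbours[v]:
                    outbox[n].append(label[v])
        inbox = outbox  # messages are delivered at the superstep barrier
    return label


components = connected_components([(1, 2), (2, 3), (4, 5)])
```

The computation stops when no messages are in flight. In a distributed framework the vertices would be partitioned across machines and the inbox/outbox exchange would happen over the network at each superstep barrier.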
One of the most important issues in the path to the convergence of HPC and Big Data is caused by the differences in their software stacks. Despite some research efforts, the interoperability between their programming models and languages is still limited. To deal with this problem we introduce a new computing framework called IgnisHPC, whose main objective is to unify the execution of Big Data and HPC workloads in the same framework. IgnisHPC has native support for multi-language applications using JVM and non-JVM-based languages. Since MPI was used as its backbone technology, IgnisHPC takes advantage of many communication models and network architectures. Moreover, MPI applications can be directly executed in an efficient way in the framework. The main consequence is that users could combine in the same multi-language code HPC tasks (using MPI) with Big Data tasks (using MapReduce operations). The experimental evaluation demonstrates the benefits of our proposal in terms of performance and productivity with respect to other frameworks. IgnisHPC is publicly available for the Big Data and HPC research community. (c) 2022 Elsevier B.V. All rights reserved.
Weak conditions are presented for approximating dynamic programming models. For a sequence of these models, continuous convergence of the sequence of associated optimal value functions is obtained under the condition that the state and action spaces converge in the sense of Kuratowski, and that the mappings of admissible actions as well as the transition law, the discount factors and the reward functions converge continuously. Further, a relation for the associated sets of optimal actions is given. The analysis is based on results about convergence-preserving properties of supremum value functions and integrals. The approximation results are extended to so-called upper-semi-continuous convergent sequences and are related to discretization procedures by using projections.