Authors:
Shook, Eric; Wang, Shaowen
Department of Geography, University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States
University of Illinois at Urbana-Champaign, Urbana, IL 61801, United States
With recent advances in data collection technologies such as remote sensing and global positioning systems, the amount of spatial data being produced has been increasing at a staggering rate. Simultaneously, a shift i...
Virtual prototyping of parallel and embedded systems increases insight into existing computer systems. It also makes it possible to explore the properties of new systems as early as their specification phase. Virtual prototype...
ISBN:
(print) 9781538653951
In the age of Big Data, scalable algorithm implementations as well as powerful computational resources are required. For data mining and data analytics, the support of big data platforms is becoming increasingly important, since they provide algorithm implementations with all the resources needed for their execution. However, choosing the best platform may depend on several constraints, including but not limited to computational resources, storage resources, target tasks, and service costs. Sometimes it may be necessary to switch from one platform to another depending on these constraints. Consequently, it is desirable to reuse as much algorithm code as possible, so as to simplify the setup on new target platforms. Unfortunately, each big data platform has its own peculiarities, especially in how it handles parallelism. This affects algorithm implementations, which generally need to be modified before they can be executed. This work introduces functional parallel primitives to define the parallelizable parts of algorithms in a uniform way, independent of the target platform. The primitives are then transformed by a compiler into skeletons, which are finally deployed on vendor-dependent frameworks. The proposed procedure aids not only code reuse but also parallelization, because no parallelization expertise is required of the programmer: the compiler entirely manages and optimizes algorithm parallelization. The experiments performed show that the transformation process does not negatively affect algorithm performance.
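The idea of expressing the parallelizable part of an algorithm once and running it on different backends can be illustrated with a minimal Python sketch. This is not the paper's compiler or its primitives: the `Backend` interface, the two backend classes, and `sum_of_squares` are hypothetical names introduced here, and a thread pool merely stands in for a vendor-dependent framework.

```python
from concurrent.futures import ThreadPoolExecutor
from functools import reduce
from typing import Callable, Iterable, List, Protocol, TypeVar

T = TypeVar("T")
U = TypeVar("U")


class Backend(Protocol):
    """Hypothetical interface that each target platform would implement."""

    def pmap(self, fn: Callable[[T], U], items: Iterable[T]) -> List[U]: ...


class SequentialBackend:
    """Reference backend: applies the function one item at a time."""

    def pmap(self, fn, items):
        return [fn(x) for x in items]


class ThreadBackend:
    """Stand-in for a platform-specific framework, using a thread pool."""

    def __init__(self, workers: int = 4):
        self.workers = workers

    def pmap(self, fn, items):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            return list(pool.map(fn, items))


def sum_of_squares(data: Iterable[int], backend: Backend) -> int:
    """Algorithm written once against the primitive, not against a platform."""
    squares = backend.pmap(lambda x: x * x, data)   # parallelizable part
    return reduce(lambda a, b: a + b, squares, 0)   # sequential combine


if __name__ == "__main__":
    for backend in (SequentialBackend(), ThreadBackend(workers=2)):
        print(type(backend).__name__, sum_of_squares(range(5), backend))
```

Swapping the backend changes how the map step executes while `sum_of_squares` itself stays untouched, which is the kind of code reuse the abstract describes.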
The paper presents efficient, scalable algorithms for performing prefix computations (PC) and general prefix computations (GPC) on a distributed shared memory (DSM) system, with applications. PC and GPC are generic techniques that can be used to design sequential and parallel algorithms for a number of problems from diverse areas (K. Arvind et al., 1995; V. Kamakoti and C. Pandurangan, 1992).
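As a rough illustration of a prefix computation (not the paper's DSM algorithm, and not GPC), the sketch below uses the common block-wise scheme: each block is scanned independently, which is the part that could run in parallel, and a running carry then combines the blocks. The function name `blocked_prefix` and its parameters are assumptions made for this example, and the blocks are processed sequentially here for simplicity.

```python
from itertools import accumulate
import operator


def blocked_prefix(values, num_blocks, op=operator.add):
    """Inclusive prefix computation over `values` with an associative `op`."""
    n = len(values)
    if n == 0:
        return []
    num_blocks = max(1, min(num_blocks, n))
    size = -(-n // num_blocks)  # ceiling division
    blocks = [values[i:i + size] for i in range(0, n, size)]

    # Phase 1: local scans. Blocks are independent, so on a parallel
    # machine each one could be assigned to a different processor.
    local = [list(accumulate(block, op)) for block in blocks]

    # Phase 2: propagate the running total of all earlier blocks.
    result = list(local[0])
    carry = local[0][-1]
    for scanned in local[1:]:
        result.extend(op(carry, x) for x in scanned)
        carry = op(carry, scanned[-1])
    return result


if __name__ == "__main__":
    print(blocked_prefix([1, 2, 3, 4, 5], num_blocks=2))
```

Because the carry is always applied on the left, the scheme only needs `op` to be associative, not commutative.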
XML has become a widely used standard for data exchange among applications. Consequently, a large amount of data is distributed on the Web and stored in different persistence models. DBMSs provide concurrency control techniques to manage such data. However, the structure of XML data makes these techniques difficult to apply. For distributed environments, few papers are available, and all of them have limitations. This paper introduces DTX, a mechanism for distributed concurrency control for XML data. To evaluate DTX, experiments that measure its performance are presented.
ISBN:
(print) 9783030576752; 9783030576745
Convolutional Neural Networks (CNNs) are widely applied in various machine learning applications but are very time-consuming. Most of a CNN's execution time is consumed by its convolutional layers. A common approach to implementing convolutions is the FFT-based one, which can reduce the arithmetic complexity of convolutions without losing too much precision. As the performance of ARMv8 multi-core CPUs improves, they can also be used to run CNNs, as Intel x86 CPUs are. In this paper, we present a new parallel FFT-based convolution implementation on ARMv8 multi-core CPUs. The implementation makes efficient use of ARMv8 multi-core CPUs through a series of computation and memory optimizations. The experimental results on two ARMv8 multi-core CPUs demonstrate that our new implementation gives much better performance than two existing approaches in most cases.
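The principle behind FFT-based convolution can be shown with a small pure-Python sketch. It is a one-dimensional, single-threaded illustration under simplifying assumptions (non-empty inputs, a textbook recursive radix-2 FFT), not the paper's ARMv8 implementation; the function names are chosen here for the example. Both inputs are zero-padded, transformed, multiplied pointwise, and transformed back, and the result is compared against direct convolution.

```python
import cmath


def fft(a, invert=False):
    """Recursive radix-2 FFT; len(a) must be a power of two."""
    n = len(a)
    if n == 1:
        return list(a)
    even = fft(a[0::2], invert)
    odd = fft(a[1::2], invert)
    sign = 1 if invert else -1
    out = [0j] * n
    for k in range(n // 2):
        w = cmath.exp(sign * 2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + w
        out[k + n // 2] = even[k] - w
    return out


def fft_convolve(x, h):
    """Linear convolution of two non-empty real sequences via the FFT."""
    m = len(x) + len(h) - 1
    n = 1
    while n < m:  # pad to a power of two so circular wrap-around cannot occur
        n *= 2
    fx = fft([complex(v) for v in x] + [0j] * (n - len(x)))
    fh = fft([complex(v) for v in h] + [0j] * (n - len(h)))
    product = [p * q for p, q in zip(fx, fh)]
    y = fft(product, invert=True)
    return [v.real / n for v in y[:m]]  # the inverse transform needs 1/n scaling


def direct_convolve(x, h):
    """Reference O(len(x) * len(h)) convolution."""
    out = [0.0] * (len(x) + len(h) - 1)
    for i, xv in enumerate(x):
        for j, hv in enumerate(h):
            out[i + j] += xv * hv
    return out


if __name__ == "__main__":
    x, h = [1.0, 2.0, 3.0], [0.0, 1.0, 0.5]
    print(direct_convolve(x, h))
    print([round(v, 6) for v in fft_convolve(x, h)])
```

The two results agree only up to floating-point rounding, which is the small precision loss the abstract mentions in exchange for lower arithmetic complexity on large inputs.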
ISBN:
(print) 9783319589435; 9783319589428
The solution of sparse linear systems of large dimension is a critical step in problems that span a diverse range of applications. For this reason, a number of iterative solvers have been developed, among which ILUPACK integrates an inverse-based multilevel ILU preconditioner with appealing numerical properties. In this paper, we enhance the computational performance of ILUPACK by off-loading the execution of several key computational kernels to a Graphics Processing Unit (GPU). In particular, we target the preconditioned GMRES and BiCG methods for sparse general systems and the preconditioned SQMR method for sparse symmetric indefinite problems in ILUPACK. The evaluation on an NVIDIA Kepler GPU shows an appreciable reduction in execution time, while maintaining the convergence rate and numerical properties of the original ILUPACK solver.
ISBN:
(print) 9783642217135
In real-time systems, it is often more desirable for a job to produce an approximate, imprecise result by its deadline than to produce a precise result late. In this paper, we evaluate by simulation the performance of a heterogeneous distributed real-time system, where composite jobs with end-to-end deadlines are scheduled dynamically as they arrive in the system, utilizing imprecise computations. Each job is a directed acyclic graph of component tasks, where the output data of a task may be used as input by another task. If the input data of a component task are imprecise, the task's processing time is extended in order to correct the error and produce a result of acceptable quality. The impact of input error on system performance is investigated under various workloads and input error limits.
ISBN:
(electronic) 9783319219097
ISBN:
(print) 9783319219080
This book constitutes the proceedings of the 13th International Conference on Parallel Computing Technologies, PaCT 2015, held in Petrozavodsk, Russia, in August/September 2015. The 37 full papers and 14 short papers presented were carefully reviewed and selected from 87 submissions. The papers are organized in topical sections on parallel models, algorithms and programming methods; unconventional computing; cellular automata; distributed computing; special processors programming techniques; and applications.
ISBN:
(print) 9781538644270
The Internet of Things (IoT) is able to connect billions of devices and services at any time and in any place, with various applications. The IoT has recently become an emerging technology, and one of its most significant current research topics is smart car parking. A modern city has over a million cars on its roads but does not have enough parking space. Most contemporary researchers propose managing the data in the cloud. However, this approach is problematic because the raw data is sent immediately from distributed sensors to the parking area via the cloud and received back only after it has been processed. This is expensive in terms of data transmission as well as energy cost and consumption. Whereas the majority of proposed solutions address the problem of finding an unoccupied parking space and ignore other critical issues, such as information about the nearest car park and road traffic congestion, this paper goes further and proposes an alternative method. The paper proposes a smart car parking system that helps users find a parking space and minimises the time spent searching for the nearest available car park. In addition, it provides users with the road traffic congestion status. Moreover, the proposed system collects the raw data locally and extracts features by applying data filtering and fusion techniques, reducing the data transmitted over the network. The transformed data is then sent to the cloud, where it is processed and evaluated using machine learning algorithms.