Terms such as multicore, manycore, and the end of Moore's law have been around for more than a decade. How do these terms describe the current status quo of computer architecture, and what glimpse of the future do they give? In this paper, we present the status quo of current multicore/manycore processors and the expected future directions in light of several advances in both process technology and system software.
In this paper we consider the cognitive process as a set of different tasks, in particular the tasks of clustering, classification, and association search. We describe the parameters of similarity among these tasks and hypothesize that a unified methodology for cognitive systems can be created. We present a possible original architecture for the system, including its description, command and data formats, and principles of operation. We also describe the implementation of a system model for the GPU.
The efficient processing of large collections of patterns expressed as Boolean expressions over event streams plays a central role in major data-intensive applications ranging from user-centric processing and personalization to real-time data analysis. On the one hand, emerging user-centric applications, including computational advertising and selective information dissemination, demand determining and presenting to an end user the relevant content as it is published. On the other hand, applications in real-time data analysis, including push-based multi-query optimization, computational finance, and intrusion detection, demand meeting stringent subsecond processing requirements and providing high-frequency event processing. We meet these event processing requirements by exploiting the shift towards multi-core architectures: we propose a novel adaptive parallel compressed event matching algorithm (A-PCM) and an online event stream re-ordering technique (OSR) that unleash an unprecedented degree of parallelism suited to highly parallel event processing. In our comprehensive evaluation, we demonstrate the efficiency of our proposed techniques. We show that the adaptive parallel compressed event matching algorithm can sustain an event rate of up to 233,863 events/second, while state-of-the-art sequential event matching algorithms sustain only 36 events/second when processing up to five million Boolean expressions.
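As a rough illustration of the matching problem this abstract describes, the following Python sketch evaluates a small collection of Boolean expressions against one event and splits the expressions across worker threads. It is not the A-PCM or OSR algorithm. The conjunction-of-predicates representation, the helper names (`matches`, `match_partition`, `match_event`), and the round-robin partitioning are all assumptions made for illustration.

```python
from concurrent.futures import ThreadPoolExecutor
import operator

# Hypothetical representation: each subscription is a conjunction of
# (attribute, comparison operator, value) predicates.
OPS = {"=": operator.eq, "<": operator.lt, ">": operator.gt}

def matches(expression, event):
    """Return True if every predicate of the conjunction holds for the event."""
    return all(
        attr in event and OPS[op](event[attr], value)
        for attr, op, value in expression
    )

def match_partition(partition, event):
    """Return the ids of all expressions in one partition that match the event."""
    return [expr_id for expr_id, expr in partition if matches(expr, event)]

def match_event(expressions, event, workers=4):
    """Split the expressions round-robin into partitions and match them concurrently."""
    items = list(expressions.items())
    partitions = [items[i::workers] for i in range(workers)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(lambda part: match_partition(part, event), partitions)
        return sorted(expr_id for part in results for expr_id in part)

# Illustrative data only (not taken from the paper).
expressions = {
    1: [("price", "<", 100), ("category", "=", "books")],
    2: [("price", ">", 50)],
    3: [("category", "=", "music")],
}
event = {"price": 70, "category": "books"}
matched = match_event(expressions, event)
print(matched)
```

In CPython the threads mainly show the partitioning structure, since the global interpreter lock prevents CPU-bound threads from running truly in parallel; a real speedup would need processes or a native implementation.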
ISBN (digital): 9781614993810
ISBN (print): 9781614993803; 9781614993810
Parallel computing has been the enabling technology of high-end machines for many years. Now, it has finally become the ubiquitous key to the efficient use of any kind of multi-processor computer architecture, from smart phones, tablets, embedded systems and cloud computing up to exascale computers. This book presents the proceedings of ParCo2013 – the latest edition of the biennial International Conference on Parallel Computing – held from 10 to 13 September 2013, in Garching, Germany. The conference focused on several key parallel computing areas. Themes included parallel programming models for multi- and manycore CPUs, GPUs, FPGAs and heterogeneous platforms, the performance engineering processes that must be adapted to efficiently use these new and innovative platforms, novel numerical algorithms and approaches to large-scale simulations of problems in science and engineering. The conference programme also included twelve mini-symposia (including an industry session and a special PhD Symposium), which comprehensively represented and intensified the discussion of current hot topics in high performance and parallel computing. These special sessions covered large-scale supercomputing, novel challenges arising from parallel architectures (multi-/manycore, heterogeneous platforms, FPGAs), multi-level algorithms as well as multi-scale, multi-physics and multi-dimensional problems. It is clear that parallel computing – including the processing of large data sets (“Big Data”) – will remain a persistent driver of research in all fields of innovative computing, which makes this book relevant to all those with an interest in this field.
Application mapping algorithms for reconfigurable architectures are one of the major research directions in reconfigurable computing. In this paper, we analyze the data memory bank conflict problem that ACRPs (Application Customized Reconfigurable Pipelines) face when exploiting data parallelism, and we propose a conflict-free iteration-data mapping algorithm, based on the operation mapping results, to overcome it. The experiments show that ACRP can exploit CFU-, pipeline-, and data-level parallelism efficiently. More specifically, the iteration-data mapping algorithm avoids bank conflicts while exploiting data parallelism to increase throughput.
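To make the bank conflict problem concrete, the Python sketch below counts how many simultaneous accesses land in the same memory bank for different access strides. It does not reproduce the paper's iteration-data mapping algorithm. It assumes low-order interleaving (bank = address mod number of banks), four banks, and four loop iterations issued in the same cycle; all function names are hypothetical.

```python
from collections import Counter

def bank_of(address, num_banks):
    """Low-order interleaving (assumed): consecutive addresses go to consecutive banks."""
    return address % num_banks

def max_conflict(addresses, num_banks):
    """Largest number of simultaneous accesses hitting one bank (1 means conflict-free)."""
    counts = Counter(bank_of(a, num_banks) for a in addresses)
    return max(counts.values())

def parallel_addresses(base, stride, lanes):
    """Addresses touched by `lanes` loop iterations executed in the same cycle."""
    return [base + i * stride for i in range(lanes)]

NUM_BANKS = 4  # assumed bank count
LANES = 4      # assumed number of parallel iterations

# Unit stride: each iteration reads from a different bank.
unit_stride = max_conflict(parallel_addresses(0, 1, LANES), NUM_BANKS)
# Stride equal to the bank count: every iteration reads from bank 0.
stride_four = max_conflict(parallel_addresses(0, 4, LANES), NUM_BANKS)
# Remapping the data with one word of padding (stride 5) spreads the accesses again.
padded = max_conflict(parallel_addresses(0, 5, LANES), NUM_BANKS)
print(unit_stride, stride_four, padded)
```

Padding is only one simple way to change which bank each iteration touches; the abstract's algorithm derives its mapping from the operation mapping results instead.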
ISBN (print): 9783319069319
The proceedings contain 56 papers. The special focus in this conference is on Databases. The topics include: Using the model of continuous dynamical system with viscous resistance forces for improving distribution prediction based on evolution of quantiles; granular indices for HQL analytic queries; XML warehouse modelling and querying; unifying mobility data warehouse models using UML profile; reasoning with projection in multimodular description logics knowledge bases; SMAQ a semantic model for analytical queries; grouping multiple RDF graphs in the collections; optimization of approximate decision rules relative to coverage; nondeterministic decision rules in rule-based classifier; the incompleteness factor method as a support of inference in decision support systems; a novel clustering approach; time series forecasting with volume weighted support vector machines; fuzzy interface for historical monuments databases; optimization of mechanical structures using artificial immune algorithm; multivariate estimation of resource utilization bounds of any variable schedule in a computing system; an improved algorithm for fast and accurate classification of sequences; methods of gene ontology term similarity analysis in graph database environment; mining of eye movement data to discover people intentions; fast and accurate hand shape classification; protection tool for distributed denial of services attack; a keystroke dynamics based approach for continuous authentication; the extended structure of multi-resolution database; platform for storing and searching different formats of spatial data; path features in spatial data generation; importance of some topics of data management in cloud-based maritime fleet management software; preconditions for processing electronic medical records; database application in visualization of process data; a data model for heterogeneous data integration architecture; applying web 2.0 concepts to creating energy planning portal and the concept of transformation of XM
While GPUs are becoming a compelling acceleration solution for a range of scientific applications, most existing work on climate models has achieved only limited speedup. This is due to partial porting of the huge code base and the inherently memory-bound nature of these models. In this work, we design and implement a customized GPU-based acceleration of the Princeton Ocean Model (gpuPOM) based on mpiPOM, which is one of the parallel versions of the Princeton Ocean Model. Based on Nvidia's state-of-the-art GPU architectures (K20X and K40m), we rewrite the full mpiPOM model from the original Fortran version into CUDA-C. We present the GPU acceleration methods used in gpuPOM, especially the techniques that ease its memory-bound problem through better use of the GPU's memory hierarchy. The experimental results indicate that gpuPOM with one K40m GPU achieves a 6.3-fold to 16.7-fold speedup over different Intel multi-core CPUs, and with one K20X GPU a 5.8-fold to 15.5-fold speedup.
ISBN (print): 9781479951499
The processing and mining of information in large-scale graph data have proven to be *** bulk synchronous parallel (BSP) computing model is suitable for this *** this paper, we implement the multi-level step-wise partitioning (MSP) algorithm in the BSP programming model and replace the original graph partition *** results on both experimental data and real-world data show that this improvement achieves better data locality, reduces communication between worker nodes, and performs better than the original method.
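The link between graph partitioning and communication in a BSP computation can be sketched in Python as follows. This is not the MSP algorithm. It runs a simple minimum-label propagation in supersteps on a toy graph and counts messages that cross partition boundaries, comparing a hand-written locality-aware partition with a hash-style partition. The graph, both partitions, and all function names are assumptions for illustration.

```python
def edge_cut(edges, part):
    """Number of edges whose endpoints live in different partitions."""
    return sum(1 for u, v in edges if part[u] != part[v])

def bsp_min_label(edges, part):
    """Label connected components by min-label propagation in BSP supersteps.

    Returns (labels, remote_messages, supersteps)."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    label = {v: v for v in neighbors}
    active = set(neighbors)
    remote_messages = 0
    supersteps = 0
    while active:
        # Compute and communicate: every active vertex sends its current label.
        inbox = {}
        for u in active:
            for v in neighbors[u]:
                inbox.setdefault(v, []).append(label[u])
                if part[u] != part[v]:
                    remote_messages += 1  # message crosses a partition boundary
        # Barrier: messages are applied only after all sends of this superstep.
        active = set()
        for v, msgs in inbox.items():
            best = min(msgs)
            if best < label[v]:
                label[v] = best
                active.add(v)
        supersteps += 1
    return label, remote_messages, supersteps

# Toy graph: two triangles joined by the single edge (2, 3).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
locality_part = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}  # one triangle per worker
hash_part = {v: v % 2 for v in range(6)}               # ignores graph structure

labels, remote_local, _ = bsp_min_label(edges, locality_part)
_, remote_hash, _ = bsp_min_label(edges, hash_part)
print(edge_cut(edges, locality_part), edge_cut(edges, hash_part))
print(remote_local, remote_hash)
```

Both runs compute the same labels; only the number of cross-partition messages differs, which is the quantity a better partitioner aims to reduce.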
Conventional Brownian dynamics (BD) simulations with hydrodynamic interactions utilize 3n×3n dense mobility matrices, where n is the number of simulated particles. This limits the size of BD simulations, particularly on accelerators with low memory capacities. In this paper, we formulate a matrix-free algorithm for BD simulations, allowing us to scale to very large numbers of particles while also being efficient for small numbers of particles. We discuss the implementation of this method for multicore and manycore architectures, as well as a hybrid implementation that splits the workload between CPUs and Intel Xeon Phi coprocessors. For 10,000 particles, the limit of the conventional algorithm on a 32 GB system, the matrix-free algorithm is 35 times faster than the conventional matrix-based algorithm. We show numerical tests for the matrix-free algorithm with up to 500,000 particles. For large systems, our hybrid implementation using two Intel Xeon Phi coprocessors achieves a speedup of over 3.5x compared to the CPU-only case. Our optimizations also make the matrix-free algorithm faster than the conventional dense matrix algorithm for as few as 1000 particles.
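A quick arithmetic check shows why a dense 3n×3n matrix restricts problem size. The Python sketch below computes the storage for a single copy of such a matrix, assuming double-precision entries (8 bytes each); that entry size and the function name are assumptions, since the abstract does not state them.

```python
def dense_mobility_bytes(n, bytes_per_entry=8):
    """Bytes needed to store one dense 3n x 3n matrix (double precision assumed)."""
    dim = 3 * n
    return dim * dim * bytes_per_entry

# Particle counts mentioned in the abstract.
for n in (1_000, 10_000, 500_000):
    gib = dense_mobility_bytes(n) / 2**30
    print(f"n = {n:>7}: {gib:,.1f} GiB")
```

Under this assumption, one matrix copy for n = 10,000 takes 7.2 × 10^9 bytes, and storage grows with n squared, so n = 500,000 would need 2,500 times as much. The abstract does not say what else the conventional algorithm keeps in memory, so this is a lower bound on its footprint, not an account of the 32 GB limit.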
The amount of space debris has increased tremendously in the last decade, arousing the interest of experts in the field. Space surveillance is a first step in monitoring the traffic of floating objects and has several applications, such as the correction of orbit coordinates for satellites or collision avoidance. In this paper, we propose an improved and flexible framework for real-time detection of satellites using a cheap optical surveillance system. The detection method is based on the Radon transform. The satellite candidates that result from processing the Radon space are validated by imposing constraints on the satellites' length and brightness and on the stereo matching. We additionally propose a parallel approach to the Radon transform on the GPU in order to fulfill the real-time constraints. We test our method on a large and varied data set containing satellites from different orbit ranges, namely medium and high orbits. An average accuracy of over 95% was obtained for real-time satellite detection, with minimal false positives.
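The idea behind Radon-based streak detection can be sketched in plain Python: a straight streak in the image concentrates its intensity in a single (angle, offset) bin of the transform. The code below is a simplified discrete accumulation with nearest-integer offsets, not the paper's GPU implementation or validation pipeline. The synthetic image, the one-degree angular step, and all names are assumptions.

```python
import math

def radon_accumulate(image, num_angles=180):
    """Sum pixel intensities along lines x*cos(theta) + y*sin(theta) = rho.

    rho is rounded to the nearest integer; returns {(angle_index, rho): sum}."""
    accumulator = {}
    for k in range(num_angles):
        theta = math.pi * k / num_angles
        c, s = math.cos(theta), math.sin(theta)
        for y, row in enumerate(image):
            for x, value in enumerate(row):
                if value:
                    key = (k, round(x * c + y * s))
                    accumulator[key] = accumulator.get(key, 0) + value
    return accumulator

def strongest_line(image, num_angles=180):
    """Return (angle in degrees, rho, accumulated intensity) of the strongest bin."""
    acc = radon_accumulate(image, num_angles)
    (k, rho), score = max(acc.items(), key=lambda item: item[1])
    return 180.0 * k / num_angles, rho, score

# Synthetic 16x16 frame: a horizontal streak on row 5 plus one isolated bright pixel.
image = [[0] * 16 for _ in range(16)]
for x in range(2, 14):
    image[5][x] = 1
image[10][15] = 1

angle, rho, score = strongest_line(image)
print(angle, rho, score)
```

For a short streak, several neighbouring angles collect the full streak in the same offset bin, so the reported angle is only accurate to within the angular resolution that the streak length supports. Each angle is processed independently, which is what makes this kind of accumulation a natural fit for parallel hardware.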