Scalability is a key feature for big data analysis and machine learning frameworks and for applications that need to analyze very large and real-time data available from data repositories, social media, sensor networks, smartphones, and the Web. Scalable big data analysis today can be achieved by parallel implementations that exploit the computing and storage facilities of high performance computing (HPC) systems and clouds, whereas in the near future Exascale systems will be used to implement extreme-scale data analysis. Here we discuss how clouds currently support the development of scalable data mining solutions, and we outline and examine the main challenges to be addressed and solved for implementing innovative data analysis applications on Exascale systems.
In this work, novel circuits based on memristors for implementing electronic synapses and artificial neurons are designed. First, two simple synaptic circuits for implementing weighting calculations in voltage and current modes using twin memristors are proposed. The synaptic weighting operation is defined as a difference function between the twin memristors, which can be adjusted in either direction by applying programmed signals and can realize positive, zero, and negative synaptic weights. Second, two neuron circuits using the proposed memristor synapses, in which parallel computing and programming can be achieved, are designed. Finally, the performance of the proposed memristor synapses and neuron circuits, such as weight programming, neuron computing, and parallel operation, is analyzed through PSpice simulations. (C) 2018 Elsevier B.V. All rights reserved.
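The twin-memristor weighting scheme described above can be made concrete with a toy numerical model. This is only an illustrative sketch, not the paper's analog circuits (which are evaluated in PSpice): the weight is modeled as the difference between the conductances of the two memristors, so it can be positive, zero, or negative, and a neuron simply sums the currents of its parallel synapses. All function names and the unit conventions here are hypothetical.

```python
# Toy numerical model of a twin-memristor synapse (illustrative only).
# The synaptic weight is the difference between the conductances of the
# two memristors, so it can take positive, zero, or negative values.

def synapse_current(v_in, g_plus, g_minus):
    """Output of a voltage-mode twin-memristor synapse.

    weight = g_plus - g_minus (arbitrary conductance units);
    output current i = weight * v_in.
    """
    return (g_plus - g_minus) * v_in

def neuron_output(voltages, g_pairs):
    """A neuron summing the currents of its parallel synapses."""
    return sum(synapse_current(v, gp, gm)
               for v, (gp, gm) in zip(voltages, g_pairs))

# Example: one excitatory synapse (weight +1) and one inhibitory
# synapse (weight -2) whose contributions cancel exactly.
v = [1.0, 0.5]                    # input voltages
g = [(2.0, 1.0), (1.0, 3.0)]      # (g_plus, g_minus) per synapse
print(neuron_output(v, g))        # (2-1)*1.0 + (1-3)*0.5 = 0.0
```

Programming a weight then amounts to adjusting the two conductances in opposite directions, which is the "adjusted in reverse by applying programmed signals" behavior the abstract describes.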
The numerical nonreproducibility in parallel molecular dynamics (MD) simulations, which stems from the non-associative accumulation of floating-point data, poses great challenges for development, debugging, and validation. The most common solutions to this problem are using a high-precision data type or sorting operations, but these solutions come with significant computational overhead. This paper analyzes the sources of nonreproducibility in parallel MD simulations in detail. Two general solutions, namely, sorting by force component value and using an 80-bit long double data type, are implemented and evaluated in LAMMPS. To reduce the computational cost, a full-list-based method with the operation order sorted by particle distance is proposed, inspired by the spatial characteristics of MD simulations. An experiment on a system with constant-energy dynamics shows that the new method can ensure reproducibility at any degree of parallelism with an extra 50% computational overhead. (C) 2019 Published by Elsevier B.V.
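The root cause named above, non-associative floating-point accumulation, can be demonstrated in a few lines: the same three operands summed in a different grouping (as happens when partial sums are combined across threads or MPI ranks in a different order) give different results. The remedy shown afterwards, an exactly rounded summation, mirrors in spirit the paper's order-fixing solutions, though the paper itself uses sorting and extended precision rather than this particular function.

```python
import math

# Floating-point addition is not associative: the grouping of the same
# three operands changes the result, which is why parallel reductions
# that combine partial sums in nondeterministic order are irreproducible.
big = 1e100

s1 = (big + 1.0) - big   # 1.0 is far below one ulp of 1e100 and is lost
s2 = (big - big) + 1.0   # same operands, different grouping
print(s1, s2)            # 0.0 1.0

# An exactly rounded summation removes the order dependence entirely:
print(math.fsum([big, 1.0, -big]))   # 1.0, regardless of operand order
```

`math.fsum` tracks exact partial sums internally, so every permutation of the input list yields the same correctly rounded result; the trade-off, as with the paper's sorted-order and 80-bit approaches, is extra computational cost per addition.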
Current high-performance computer systems used for scientific computing typically combine shared memory computational nodes in a distributed memory environment. Extracting high performance from these complex systems r...
We present an OpenACC-based parallel implementation of stochastic algorithms for simulating biochemical reaction networks on modern GPUs (graphics processing units). To investigate the effectiveness of using OpenACC for leveraging the massive hardware parallelism of the GPU architecture, we carefully apply OpenACC's language constructs and mechanisms to implement a parallel version of the stochastic simulation algorithm on the GPU. Comparing our OpenACC implementation with both the NVidia CUDA and the CPU-based implementations, we report our initial experiences with OpenACC's performance and programming productivity in the context of GPU-accelerated scientific computing.
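For readers unfamiliar with the algorithm being parallelized: in this context the "stochastic simulation algorithm" is standardly Gillespie's direct method, which draws exponentially distributed waiting times and picks the next reaction in proportion to its propensity. The sketch below is a minimal serial version for illustration only (the paper's contribution is running many such trajectories in parallel on the GPU); the function and parameter names are this sketch's own.

```python
import math
import random

def gillespie_direct(x, reactions, propensity, t_end, rng=random.Random(0)):
    """Minimal serial Gillespie direct method (illustrative sketch).

    x          -- list of species counts (modified in place)
    reactions  -- list of state-change vectors, one per reaction
    propensity -- function (state, j) -> rate of reaction j in that state
    """
    t = 0.0
    while t < t_end:
        a = [propensity(x, j) for j in range(len(reactions))]
        a0 = sum(a)
        if a0 == 0.0:
            break                                 # no reaction can fire
        t += -math.log(1.0 - rng.random()) / a0   # exponential waiting time
        r = rng.random() * a0                     # choose j with prob a_j/a0
        j, acc = 0, a[0]
        while acc < r:
            j += 1
            acc += a[j]
        for i, d in enumerate(reactions[j]):      # apply state change
            x[i] += d
    return x

# Example: irreversible decay A -> 0 with propensity 0.5 * [A].
final = gillespie_direct([100], [[-1]], lambda s, j: 0.5 * s[0], t_end=50.0)
print(final)
```

Each trajectory is an independent stream of random draws, which is exactly what makes the method embarrassingly parallel across GPU threads, as long as each thread gets its own random number stream.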
Since it was first introduced in 2008 with the 1.0 specification, OpenCL has steadily evolved over the past decade to increase its support for heterogeneous parallel systems. In this paper, we accelerate stochastic simulation of biochemical reaction networks on modern GPUs (graphics processing units) by means of the OpenCL programming language. In implementing the OpenCL version of the stochastic simulation algorithm, we carefully apply its data-parallel execution model to exploit the hardware parallelism of modern GPUs. To evaluate our OpenCL implementation, we perform a comparative performance analysis against a CPU-based cluster implementation and an NVidia CUDA implementation. In addition to this initial report on the performance of OpenCL on GPUs, we also discuss the applicability and programmability of OpenCL in the context of GPU-based scientific computing.
Brain strokes are one of the leading causes of disability and mortality in adults in developed countries. Ischemic stroke (85% of total cases) and hemorrhagic stroke (15%) must be treated with opposing therapies, and thus the nature of the stroke must be determined quickly in order to apply the appropriate treatment. Recent studies in biomedical imaging have shown that strokes produce variations in the complex electric permittivity of brain tissues, which can be detected by means of microwave tomography. Here, we present synthetic results obtained with an experimental microwave tomography-based portable system for the early detection and monitoring of brain strokes. The determination of the electric permittivity first requires the solution of a coupled forward-inverse problem. We make use of massively parallel computation based on a domain decomposition method and of regularization techniques for the optimization. Synthetic data are obtained from electromagnetic simulations corrupted by noise derived from the measurement errors of the experimental imaging system. The results demonstrate that hemorrhagic strokes can be detected with microwave systems when the proposed reconstruction algorithm with edge-preserving regularization is applied.
Saccadic eye movements move the high-resolution fovea to point at regions of interest. Saccades can only be generated serially (i.e., one at a time). However, what remains unclear is the extent to which saccades are programmed in parallel (i.e., a series of such movements can be planned together) and how far ahead such planning occurs. In the current experiment, we investigate this issue with a saccade-contingent preview paradigm. Participants were asked to execute saccadic eye movements in response to seven small circles presented on a screen. The extent to which participants were given prior information about target locations was varied on a trial-by-trial basis: participants were aware of the location of the next target only, or of the next three, five, or all seven targets. New targets were added to the display during the saccade to the next target in the sequence. The overall time taken to complete the sequence decreased as more targets were made available, up to all seven targets. This resulted from a reduction in the number of saccades executed and a reduction in their latencies. Surprisingly, these results suggest that, when faced with a demand to saccade to a large number of target locations, saccade preparation for all target locations is carried out in parallel.
A fundamental problem in parallel and distributed processing is the partial serialization that is imposed by the need for mutually exclusive access to common resources. In this article, we investigate the problem of optimally scheduling (in terms of makespan) a set of jobs, where each job consists of the same number L of unit-duration tasks, and each task either accesses exclusively one resource from a given set of resources or accesses a fully shareable resource. We develop and establish the optimality of a fast polynomial-time algorithm that finds a schedule with the shortest makespan for any number of jobs and any number of resources for the case L = 2. In the notation commonly used for job-shop scheduling problems, this result means that the problem J | d_ij = 1, n_j = 2 | C_max is polynomially solvable, adding to the polynomial solutions known for the problems J2 | n_j <= 2 | C_max and J2 | d_ij = 1 | C_max (whereas other closely related versions, such as J2 | n_j <= 3 | C_max, J2 | d_ij in {1,2} | C_max, J3 | d_ij = 1 | C_max, and J | d_ij = 1, n_j <= 3 | C_max, are all known to be NP-complete). For the general case L > 2 (i.e., for the job-shop problem J | d_ij = 1, n_j = L > 2 | C_max), we present a competitive heuristic and provide experimental comparisons with other heuristic versions and, when possible, with the ideal integer linear programming formulation.
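The scheduling model above can be made concrete with a simple greedy list-scheduling simulation. To be clear, this is not the paper's optimal L = 2 algorithm nor its competitive heuristic, just an illustrative sketch of the problem setting: jobs are sequences of unit-duration tasks, each task either claims one exclusive resource or uses a fully shareable one (encoded here as `None`), and the makespan is the number of time steps until all jobs finish.

```python
# Illustrative greedy scheduler for the job model in the abstract
# (NOT the paper's algorithm). Each time step, every job tries to run
# its next unit task; an exclusive resource serves one task per step.

def greedy_makespan(jobs):
    """jobs: list of task sequences, e.g. [['r1', None], ['r1', 'r2']].

    Returns the number of unit time steps until all jobs complete.
    """
    progress = [0] * len(jobs)      # index of the next task of each job
    time = 0
    while any(p < len(j) for p, j in zip(progress, jobs)):
        busy = set()                # exclusive resources claimed this step
        for k, job in enumerate(jobs):
            if progress[k] >= len(job):
                continue            # job already finished
            res = job[progress[k]]
            if res is None:         # fully shareable resource: always runs
                progress[k] += 1
            elif res not in busy:   # exclusive resource: first claimant wins
                busy.add(res)
                progress[k] += 1
        time += 1
    return time

# Three jobs with L = 2 contending for exclusive resource 'r1'.
# Four unit tasks need 'r1', so no schedule can beat a makespan of 4.
print(greedy_makespan([['r1', 'r1'], ['r1', None], [None, 'r1']]))  # 4
```

In this small instance the greedy schedule happens to match the lower bound given by the busiest resource; in general, greedy list scheduling is only a heuristic, which is precisely why the paper's polynomial-time optimal algorithm for L = 2 is of interest.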
Parallel hardware is today's reality, and language extensions that ease exploiting its promised performance flourish. For most mainstream languages, one or more tailored solutions exist that address the specific need...