Search Results - Inner Mongolia University Library

12th International Scientific Conference on Parallel Computational Technologies (PCT)

Authors: Tyutlyaeva, Ekaterina; Konyukhov, Sergey; Odintsov, Igor; Moskovsky, Alexander; Zhizhin, Mikhail. Affiliations: ZAO RSC Technol Moscow Russia; Univ Colorado Inst Space Res Denver CO 80202 USA

ISBN: (print) 9783319996738; 9783319996721

The main goal of this work is to analyze the behavior of a nighttime image processing module and to obtain basic estimates of the computational time and energy consumption required for processing large data archives. As part of this work, we have refactored the most compute-intensive module in a system for detecting fishing boat lights. The algorithm is capable of detecting isolated bright spikes that are sharply visible on the sea surface at night. The refactored module has been optimized for effective usage of multi- and many-core Intel Xeon architectures. In the paper, we describe the algorithmic complexity for all computational stages of the module. Also, we have collected detailed statistical data for two data sets, different input parameter sets, and three test beds: Intel(R) Xeon(R) E5-2697A (codename Broadwell), Intel(R) Xeon(R) Gold 6148 (Skylake), and Intel(R) Xeon Phi(R) 7250 (KNL). Key correlations between module behavior and energy consumption are also included in the paper. The results of the study were used to estimate the time and energy requirements for a whole-year archive of day/night band (DNB) images from the Visible Infrared Imaging Radiometer Suite (VIIRS). Moreover, driving factors, including price and legacy software systems, are presented for discussion.

Keywords: Nighttime imaging processing; Energy consumption analysis; Nighttime image processing module; Archive processing analysis


Greed is Good: Parallel Algorithms for Bipartite-Graph Partial Coloring on Multicore Architectures


46th International Conference on Parallel Processing Workshops (ICPPW)

Authors: Tas, Mustafa Kemal; Kaya, Kamer; Saule, Erik. Affiliations: Sabanci Univ Comp Sci & Engn Istanbul Turkey; Ohio State Univ Dept Biomed Informat Columbus OH 43210 USA; Univ N Carolina Comp Sci Charlotte NC 28223 USA

ISBN: (print) 9781538610428

In parallel computing, a valid graph coloring yields a lock-free processing of the colored tasks, data points, etc., without expensive synchronization mechanisms. However, coloring is not free and the overhead can be significant. In particular, for the bipartite-graph partial coloring (BGPC) and distance-2 graph coloring (D2GC) problems, which have various use cases within the scientific computing and numerical optimization domains, the coloring overhead can be on the order of minutes with a single thread for many real-life graphs. In this work, we propose parallel algorithms for bipartite-graph partial coloring on shared-memory architectures. Compared to the existing shared-memory BGPC algorithms, the proposed ones employ greedier and more optimistic techniques that yield a better parallel coloring performance. In particular, on 16 cores, the proposed algorithms are more than 4x faster than their counterparts in the ColPack library, which is, to the best of our knowledge, the only publicly available coloring library for multicore architectures. In addition to BGPC, the proposed techniques are employed to devise parallel distance-2 graph coloring algorithms, and similar performance improvements have been observed. Finally, we propose two costless balancing heuristics for BGPC that can reduce the skewness and imbalance in the cardinality of color sets (almost) for free. The heuristics can also be used for the D2GC problem and, in general, they will probably yield a better color-based parallelization performance, especially on many-core architectures.

Keywords: Greedy graph coloring; bipartite-graph coloring; distance-2 coloring; shared-memory parallel algorithms
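
For readers unfamiliar with the BGPC problem named in this record, the following is a minimal sequential sketch of greedy partial coloring. It is not the parallel algorithm from the paper or the ColPack implementation; the function names and the adjacency-list inputs (`rows_of_col`, `cols_of_row`) are assumptions made only for illustration. Two vertices on the colored side must receive different colors whenever they share a neighbor on the other side.

```python
def greedy_bgpc(rows_of_col, cols_of_row):
    """Greedily color the 'column' side of a bipartite graph.

    rows_of_col[c] lists the row vertices adjacent to column c;
    cols_of_row[r] lists the column vertices adjacent to row r.
    Columns that share at least one row end up with different colors.
    """
    color = [-1] * len(rows_of_col)  # -1 means "not colored yet"
    for c in range(len(rows_of_col)):
        # Collect the colors already used by columns two hops away.
        forbidden = set()
        for r in rows_of_col[c]:
            for other in cols_of_row[r]:
                if other != c and color[other] != -1:
                    forbidden.add(color[other])
        # First-fit: take the smallest color that is not forbidden.
        k = 0
        while k in forbidden:
            k += 1
        color[c] = k
    return color


def is_valid_partial_coloring(color, cols_of_row):
    """Check that no row is adjacent to two columns of the same color."""
    for cols in cols_of_row:
        used = [color[c] for c in cols]
        if len(used) != len(set(used)):
            return False
    return True
```

Once such a coloring exists, all columns of one color can be processed concurrently without locks, because no two of them touch the same row.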


Accelerating Processing of Scale-Free Graphs on Massively-Parallel Architectures


17th International Conference on Algorithms and Architectures for Parallel Processing (ICA3PP)

Author: Chernoskutov, Mikhail. Affiliation: Ural Fed Univ Krasovskii Inst Math & Mech Ekaterinburg Russia

ISBN: (digital) 9783319654829

ISBN: (print) 9783319654829; 9783319654812

Processing of big scale-free graphs on parallel architectures with high parallelization opportunities is connected with a lot of overhead. Due to the skewed degree distribution, each thread receives a different amount of computational workload. In this paper we present a method that addresses this challenge by modifying the CSR data structure and redistributing work across threads. The method was implemented in breadth-first search and single-source shortest path algorithms for the GPU architecture.

Keywords: Parallel processing; Graph algorithms; Workload balancing


Supporting Energy-Efficient Computing on Heterogeneous CPU-GPU Architectures


IEEE 5th International Conference on Future Internet of Things and Cloud (FiCloud)

Authors: Siehl, Kyle; Zhao, Xinghui. Affiliation: Washington State Univ Sch Engn & Comp Sci Vancouver WA 98686 USA

ISBN: (print) 9781538620748

Modern high performance computing and cloud computing infrastructures often leverage Graphics Processing Units (GPUs) to provide accelerated, massively parallel computational power. This performance gain, however, may also introduce higher energy consumption. The energy challenge becomes more and more pronounced as the system scales. To address this challenge, we propose Archon, a framework for supporting energy-efficient computing on CPU-GPU heterogeneous architectures. Specifically, Archon takes users' programs as input, automatically distributes the workload between CPU and GPU, and dynamically tunes the distribution ratio at runtime for an energy-efficient execution. Experiments have been carried out to evaluate the effectiveness of Archon, and the results show that it can achieve considerable energy savings at runtime, without significant effort from the programmers.

Keywords: Energy Efficiency; GPGPU Computing; Heterogeneous architectures; Hybrid Computing


Multidimensional Performance and Scalability Analysis for Diverse Applications Based on System Monitoring Data


12th International Conference on Parallel Processing and Applied Mathematics (PPAM)

Authors: Neytcheva, Maya; Holmgren, Sverker; Bull, Jonathan; Dorostkar, Ali; Kruchinina, Anastasia; Nikitenko, Dmitry; Popova, Nina; Shvets, Pavel; Teplov, Alexey; Voevodin, Vadim; Voevodin, Vladimir. Affiliations: Uppsala Univ Dept Informat Technol Uppsala Sweden; Lomonosov Moscow State Univ Res Comp Ctr Moscow Russia

ISBN: (print) 9783319780245; 9783319780238

The availability of high performance computing resources enables us to perform very large numerical simulations and in this way to tackle challenging real-life problems. At the same time, in order to efficiently utilize the computational power at our disposal, the ever-growing complexity of the computer architecture poses high demands on the algorithms and their implementation. Performing large-scale high performance simulations can be done by utilizing available general libraries, writing libraries that suit particular classes of problems, or developing software from scratch. Clearly, the possibilities to enhance the efficiency of the software tools in the three cases are very different, ranging from nearly impossible to full capacity. In this work we exemplify the efficiency of the three approaches on benchmark problems, using monitoring tools that provide a very rich spectrum of data on the performance of the applied codes as well as on the utilization of the supercomputer itself.

Keywords: Supercomputing; Application efficiency analysis; Parallel program; High-performance computing


Efficient RDF dictionaries with B+ trees


ACM India Joint 5th International Conference on Data Science and 23rd Conference on Management of Data, CoDS-COMAD 2018

Authors: Singh, Gurkirat; Upadhyay, Dhawal; Atre, Medha. Affiliation: Computer Science and Engg Indian Institute of Technology Kanpur India

ISBN: (print) 9781450363419

Resource Description Framework (RDF) graphs are widely used for representing semantically linked data in various domains. Many modern RDF specific storage, indexing, and query optimization systems internally represent the node and edge labels of the RDF graphs as integer IDs. Hence they require dictionaries for converting the strings in a SPARQL query into their corresponding IDs, and the SPARQL query results in the ID form into their corresponding strings. Most of the SPARQL query processing systems have focused on the techniques for indexing of RDF graphs and the optimization of the joins in the SPARQL Basic Graph Pattern (BGP) queries, but the dictionaries that map RDF graph string labels to the IDs and back have remained a neglected component. Dictionaries are important for an "end-to-end user experience" of SPARQL query processing over large RDF graphs. Hence, in this paper, we have specifically focused on building efficient RDF dictionaries using B+ trees. Our key contributions are – (a) building an ensemble of B+ trees, instead of one giant B+ tree, to maintain a low average height across the ensemble, (b) a hashing technique for storing the string labels as search-keys to reduce the space consumption, maintain a higher B+ tree order, and more uniform search-key distribution across memory pages, (c) using multi-core parallel processing for fast dictionary construction, and (d) novel bulk reverse lookup methods. We have also presented an extensive experimental evaluation of our techniques over a set of 126,444,964 labels of a real-life DBPedia RDF graph. © 2018 Association for Computing Machinery.

Keywords: Query processing


NVIDIA GPUs Scalability to Solve Multiple (Batch) Tridiagonal Systems: Implementation of cuThomasBatch


12th International Conference on Parallel Processing and Applied Mathematics (PPAM)

Authors: Valero-Lara, Pedro; Martinez-Perez, Ivan; Sirvent, Raul; Martorell, Xavier; Pena, Antonio J. Affiliations: BSC Barcelona Spain; Univ Politecn Cataluna Barcelona Spain

ISBN: (print) 9783319780245; 9783319780238

Solving tridiagonal systems is one of the most computationally expensive parts of many applications, so multiple studies have explored the use of NVIDIA GPUs to accelerate such computation. However, these studies have mainly focused on using parallel algorithms to compute such systems, which can efficiently exploit the shared memory and are able to saturate the GPU's capacity with a low number of systems, but scale poorly when dealing with a relatively high number of systems. We propose a new implementation (cuThomasBatch) based on the Thomas algorithm. To achieve good scalability with this approach, it is necessary to transform the way the inputs are stored in memory in order to exploit coalescence (contiguous threads access contiguous memory locations). The results given in this study show that the implementation carried out in this work is able to beat the reference code when dealing with a relatively large number of tridiagonal systems (2,000-256,000), being close to 3x (in double precision) and 4x (in single precision) faster using one Kepler NVIDIA GPU.

Keywords: Tridiagonal linear systems; Scalability; Thomas algorithm; PCR; CR; Parallel processing; cuSPARSE; CUDA
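
The two ideas named in this abstract, the Thomas algorithm and an interleaved input layout, can be sketched on the CPU as follows. This is only an illustration under stated assumptions, not the cuThomasBatch code: the function names are hypothetical, the outer loop over `s` stands in for one GPU thread per system, and no pivoting is done, so the sketch assumes well-conditioned (for example diagonally dominant) systems.

```python
def interleave(vectors):
    """Store element i of system s at flat index i * batch + s."""
    batch, n = len(vectors), len(vectors[0])
    return [vectors[s][i] for i in range(n) for s in range(batch)]


def thomas_batch_interleaved(a, b, c, d, n, batch):
    """Solve `batch` tridiagonal systems of size n stored interleaved.

    a: sub-diagonal (entries for row 0 unused), b: main diagonal,
    c: super-diagonal (entries for row n-1 unused), d: right-hand side.
    Returns the solutions in the same interleaved layout.
    """
    cp = [0.0] * (n * batch)  # modified super-diagonal
    x = list(d)               # holds modified rhs, then the solution
    for s in range(batch):    # on a GPU, each s would be one thread
        cp[s] = c[s] / b[s]
        x[s] = x[s] / b[s]
        # Forward elimination: row i only needs row i-1.
        for i in range(1, n):
            k = i * batch + s
            p = k - batch
            denom = b[k] - a[k] * cp[p]
            cp[k] = c[k] / denom
            x[k] = (x[k] - a[k] * x[p]) / denom
        # Back substitution from the last row upward.
        for i in range(n - 2, -1, -1):
            k = i * batch + s
            x[k] -= cp[k] * x[k + batch]
    return x
```

With this layout, at a fixed row `i` the systems `s` and `s + 1` read neighboring addresses, which is the access pattern the abstract calls coalescence.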


Improving matrix-based dynamic programming on massively parallel accelerators


INFORMATION SYSTEMS, 2017, Vol. 64 (Mar. issue), pp. 175-193

Authors: Bednarek, David; Brabec, Michal; Krulis, Martin. Affiliation: Charles Univ Prague Fac Math & Phys Parallel Architectures Algorithms Applicat Res Gr Malostranske Nam 25 Prague Czech Republic

Dynamic programming techniques are well established and employed by various practical algorithms, including the edit-distance algorithm and the dynamic time warping algorithm. These algorithms usually operate in an iteration-based manner where new values are computed from values of the previous iteration. The data dependencies enforce synchronization, which limits possibilities for internal parallel processing. In this paper, we investigate parallel approaches to processing matrix-based dynamic programming algorithms on modern multicore CPUs, Intel Xeon Phi accelerators, and general purpose GPUs. We address both the problem of computing a single distance on large inputs and the problem of computing a number of distances of smaller inputs simultaneously (e.g., when a similarity query is being resolved). Our proposed solutions yielded significant improvements in performance and achieved a speedup of two orders of magnitude when compared to the serial baseline. (C) 2016 Elsevier Ltd. All rights reserved.

Keywords: Parallel; Multicore; GPU; Intel Xeon Phi; Dynamic programming; Edit distance; Dynamic time warping
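
The data dependencies mentioned in this abstract can be illustrated with a small sketch of Levenshtein edit distance evaluated along anti-diagonals. This is a generic textbook wavefront ordering, not necessarily the scheme used in the paper, and the function name is an assumption. Every cell on anti-diagonal `k = i + j` depends only on diagonals `k - 1` and `k - 2`, so the inner loop is the part a parallel implementation could spread across threads.

```python
def edit_distance_wavefront(s, t):
    """Levenshtein distance computed one anti-diagonal at a time."""
    m, n = len(s), len(t)
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i  # delete i characters
    for j in range(n + 1):
        dist[0][j] = j  # insert j characters
    for k in range(2, m + n + 1):
        lo = max(1, k - n)
        hi = min(m, k - 1)
        # Cells (i, k - i) on this diagonal are mutually independent.
        for i in range(lo, hi + 1):
            j = k - i
            cost = 0 if s[i - 1] == t[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # deletion
                dist[i][j - 1] + 1,         # insertion
                dist[i - 1][j - 1] + cost,  # match or substitution
            )
    return dist[m][n]
```

The synchronization cost the abstract refers to shows up here as the outer loop: diagonal `k` cannot start until diagonal `k - 1` is finished.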


PMORSy: parallel sparse matrix ordering software for fill-in minimization


OPTIMIZATION METHODS & SOFTWARE, 2017, Vol. 32, No. 2, pp. 274-289

Authors: Pirova, Anna; Meyerov, Iosif; Kozinov, Evgeniy; Lebedev, Sergey. Affiliation: Lobachevsky State Univ Nizhni Novgorod Inst Informat Technol Math & Mech Dept Software & Supercomp Technol Nizhnii Novgorod Russia

In this paper we present PMORSy, a new parallel software package for symmetric sparse matrix ordering on shared memory systems. The NP-complete fill-in minimization problem is solved by means of a multilevel nested dissection algorithm with modifications for vertex separators. Parallel processing is done in a task-based fashion with granularity tuning. We employ threading techniques on shared memory using OpenMP 3.0 technology, as opposed to the Message Passing Interface-based approach widely used for parallel sparse matrix ordering. Experimental results on symmetric matrices from the University of Florida Sparse Matrix Collection and matrices from finite-element analysis of three-dimensional strength problems show that our implementation is competitive with the ParMETIS and PT-Scotch libraries both in ordering quality and performance. The PMORSy library is publicly available from the Lobachevsky State University Supercomputing Center website.

Keywords: fill-in minimization; multilevel nested dissection; sparse matrix ordering; Cholesky factorization; parallel computing; task-based parallel processing


A Java code protection scheme via dynamic recovering runtime instructions


18th International Conference on Algorithms and Architectures for Parallel Processing, ICA3PP 2018

Authors: Jiajia, Sun; Jinbao, Gao; Yu-an, Tan; Yu, Zhang; Xiao, Yu. Affiliations: School of Computer Science and Technology Beijing Institute of Technology Beijing 100081 China; School of Electrical and Information Engineering Beijing Key Laboratory of Intelligent Processing for Building Big Data Beijing University of Civil Engineering and Architecture Beijing 100044 China; China University of Mining and Technology Beijing 100083 China; Department of Computer Science and Technology Shandong University of Technology Zibo, Shandong 255022 China

ISBN: (print) 9783030050627

As the Android operating system and the applications on the device play important roles, the security requirements of Android applications have increased as well. With the upgrade of the Android system, Android runtime mode (ART mode) has gradually become the mainstream architecture of the Android operating system. ART introduces several improvements in Android, but it also introduces new ways to enhance malicious activities. This paper proposed a confidential, finer-granularity protection scheme for application programs under ART mode on rooted Android devices. Taking the Java method as the protection granularity, the protection scheme increased the accuracy of protecting targets. In addition, the protection scheme provided a more thorough protection for applications by combining dynamic loading technology and encryption technology in ART mode, and improved the security of Android applications. Experiments showed that the proposed protection scheme is effective. © Springer Nature Switzerland AG 2018.

Keywords: Android (operating system)

