Search Results - Inner Mongolia University Library

Evaluating Large-Scale Biomedical Ontology Matching Over Parallel Platforms

IETE TECHNICAL REVIEW, 2016, Vol. 33, No. 4, pp. 415-427

Authors: Amin, Muhammad Bilal; Khan, Wajahat Ali; Hussain, Shujaat; Bui, Dinh-Mao; Banos, Oresti; Kang, Byeong Ho; Lee, Sungyoung. Affiliations: Kyung Hee Univ Dept Comp Engn Ubiquitous Comp Lab Yongin South Korea; Univ Tasmania Sch Comp & Informat Syst Hobart Tas Australia

Biomedical systems have been using ontology matching as a primary technique for heterogeneity resolution. However, the natural intricacy and vastness of biomedical data have compelled biomedical ontologies to become large-scale and complex; consequently, biomedical ontology matching has become a computationally intensive task. Our parallel heterogeneity resolution system, SPHeRe, is built to cater to the performance needs of ontology matching by exploiting the parallelism-enabled multicore nature of today's desktop PCs and cloud infrastructure. In this paper, we present the execution and evaluation results of SPHeRe over large-scale biomedical ontologies. We evaluate our system by integrating it with the interoperability engine of a clinical decision support system (CDSS), which generates matching requests for the large-scale NCI, FMA, and SNOMED-CT biomedical ontologies. Results demonstrate that our methodology provides an impressive performance speedup of 4.8 and 9.5 times over a quad-core desktop PC and a four virtual machine (VM) cloud platform, respectively.

Keywords: Biomedical informatics; Multithreading; Biomedical ontologies; Ontology matching; Parallel processing; Parallel programming; Semantic web

OpenCL: A Parallel Programming Standard for Heterogeneous Computing Systems

COMPUTING IN SCIENCE & ENGINEERING, 2010, Vol. 12, No. 3, pp. 66-72

Authors: Stone, John E.; Gohara, David; Shi, Guochun. Affiliations: Univ Illinois Beckman Inst Adv Sci & Technol Theoret & Computat Biophys Grp Urbana IL USA; Univ Illinois CUDA Ctr Excellence Urbana IL USA; Washington Univ Sch Med Dept Biochem & Biophys St Louis MO 63130 USA; Washington Univ Sch Med Ctr Computat Biol St Louis MO 63130 USA; Penn State Univ University Pk PA 16802 USA; Harvard Univ Sch Med Cambridge MA 02138 USA; Univ Illinois Natl Ctr Supercomp Applicat Urbana IL USA

The OpenCL standard offers a common API for program execution on systems composed of different types of computational devices such as multicore CPUs, GPUs, or other accelerators.

Keywords: API; GPU; OpenCL standard; application program interfaces; computational devices; computer graphic equipment; coprocessors; heterogeneous computing systems; multicore CPU; parallel programming; parallel programming standard; program execution

Tweakable parallel OFB mode of operation with delayed thread synchronization

SECURITY AND COMMUNICATION NETWORKS, 2016, Vol. 9, No. 10, pp. 1119-1131

Authors: Damjanovic, Boris; Simic, Dejan. Affiliations: Univ Belgrade Fac Org Sci Dept Informat Syst Jove Ilica 154 Belgrade Serbia; Univ Belgrade Fac Org Sci Dept Informat Technol Jove Ilica 154 Belgrade Serbia

The introduction of various cryptographic modes of operation was motivated by known imperfections of symmetric block algorithms. The design of some cryptographic modes of operation has already been exploited as an idea for parallelizing the execution of certain algorithms. To the best of our knowledge, there is no evidence in the available literature that output feedback (OFB) mode, which is used in satellite communications, has ever been parallelized. In this paper, we consider the performance of a convenient mode of operation, which performs tweakable parallel encryption using xor-encrypt-xor (XEX) and xor-encrypt (XE) constructions in an OFB-like mode. We make use of an idea similar to XTS-AES in order to create two parallel tweakable block ciphers. The first is designed using the XEX construction, while the second is based on the XE construction. Each cipher uses two threads to produce the corresponding keystreams. The keystreams are first merged with each other and then used in a modified tweakable parallel OFB mode of operation. As a proof of concept, we have implemented a Java application in which these parallel solutions are applied to collect empirical data. The results obtained show that, under certain conditions, tweakable parallel OFB modes using the XEX and XE constructions can achieve performance accelerations of up to 10% and 20%, respectively. Copyright (c) 2015 John Wiley & Sons, Ltd.

Keywords: cryptography; parallel programming; performance analysis; AES

Acceleration of Tear Film Map Definition on Multicore Systems

Procedia Computer Science, 2016, Vol. 80, pp. 41-51

Authors: Jorge González-Domínguez; Beatriz Remeseiro; María J. Martín. Affiliations: Grupo de Arquitectura de Computadores, Universidade da Coruña, Campus de Elviña s/n, 15071 A Coruña, Spain; INESC TEC - INESC Technology and Science, Campus da FEUP, Rua Dr. Roberto Frias, 4200-465 Porto, Portugal

Dry eye syndrome is a public health problem, and one of the most common conditions seen by eye care specialists. Among the clinical tests for its diagnosis, the evaluation of the interference patterns observed in the tear film lipid layer is often employed. In this sense, tear film maps illustrate the spatial distribution of the patterns over the whole tear film and provide useful information to practitioners. However, the creation of a single map usually takes tens of minutes. Medical experts currently demand applications with lower response time in order to provide a faster diagnosis for their patients. In this work, we explore different parallel approaches to accelerate the definition of the tear film map by exploiting the power of today's ubiquitous multicore systems. They can be executed on any multicore system without special software or hardware requirements. The experimental evaluation determines the best approach (on-demand with dynamic seed distribution) and proves that it can significantly decrease the runtime. For instance, the average runtime of our experiments with 50 real-world images on a system with AMD Opteron processors is reduced from more than 20 minutes to one minute and 12 seconds.

Keywords: Parallel programming; Multithreading; Image Segmentation; Dry Eye
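The "on-demand with dynamic seed distribution" approach named in the abstract above can be pictured as workers that each pull the next unprocessed seed from a shared queue as soon as they become idle. The following is a minimal Python sketch of that general scheduling idea only, not the authors' implementation: `process_seed` is a hypothetical stand-in for the per-seed image work, and `run_on_demand` and `num_workers` are illustrative names. CPython threads are used here to show the control flow; they do not by themselves give a CPU-bound speedup.

```python
import queue
import threading


def process_seed(seed):
    # Hypothetical stand-in for the per-seed work; its cost grows with the seed value.
    return sum(i * i for i in range(seed * 1000))


def run_on_demand(seeds, num_workers=4):
    """Process all seeds, letting each worker fetch a new seed whenever it is idle."""
    tasks = queue.Queue()
    for seed in seeds:
        tasks.put(seed)

    results = {}
    lock = threading.Lock()

    def worker():
        while True:
            try:
                seed = tasks.get_nowait()  # dynamic distribution: take the next seed
            except queue.Empty:
                return  # no work left for this worker
            value = process_seed(seed)
            with lock:  # protect the shared result dictionary
                results[seed] = value

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results
```

Because seeds are handed out one at a time, a worker that draws a cheap seed simply returns for another, which tends to balance the load better than a fixed up-front split when per-seed cost varies.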

Hardware transactional memory: A high performance parallel programming model

JOURNAL OF SYSTEMS ARCHITECTURE, 2010, Vol. 56, No. 8, pp. 384-391

Authors: Fu, Chen; Wen, Dongxin; Wang, Xiaoqun; Yang, Xiaozong. Affiliation: Harbin Inst Technol Sch Comp Sci & Technol Harbin 150001 Peoples R China

Transactional memory in multicore processors has been a major research area over the past several years. Many transactional memory systems have been proposed to solve the synchronization problem of multicore processors. Hardware transactional memory is one of the critical methods for speeding up communication in a multicore environment. In this paper, we give a review of the current hardware transactional memory systems for multicore processors. We take a top-down approach to characterizing and classifying various hardware transactional design issues and present a taxonomy of hardware transactional memory systems which consists of five fundamental design issues: version management, conflict detection, contention management, virtualization, and nesting. Finally, we discuss an active research challenge: the relationship between transactional memory and Input/Output operations and system calls. Crown Copyright (C) 2010 Published by Elsevier B.V. All rights reserved.

Keywords: Multicore processor; Transactional memory; Hardware; Parallel programming; Synchronization

Eve: A Parallel Event-Driven Programming Language

20th Euro-Par International Workshops

Authors: Fonseca, Alcides; Rafael, Joao; Cabral, Bruno. Affiliation: Univ Coimbra P-3000 Coimbra Portugal

ISBN: (print) 9783319143132; 9783319143125

We propose a model for event-oriented programming under shared memory based on access permissions with explicit parallelism. In order to obtain safe parallelism, programmers need to specify the variable permissions of functions. Blocking operations are nonexistent, and callback-based APIs are used instead, which can be called in parallel for different events as long as the access permissions are guaranteed. This model scales for both IO-bound and CPU-bound programs. We have implemented this model in the Eve language, which includes a compiler that generates parallel tasks with synchronization on top of variables, and a work-stealing runtime that uses the epoll interface to manage the event loop. We have also evaluated the model with micro-benchmarks on programs that are either CPU-intensive or IO-intensive, with and without shared data. In CPU-intensive programs, it achieved results very close to multithreaded approaches. In the share-nothing IO-intensive benchmark, it outperformed all other solutions. In the shared-memory IO-intensive benchmark, it outperformed other solutions when the number of write operations was greater than or equal to the number of read operations.

Keywords: Event-oriented; Parallel programming; IO performance

Cache Aware Dynamics Data Layout for Efficient Shared Memory Parallelisation of EUROPLEXUS

Procedia Computer Science, 2016, Vol. 80, pp. 1083-1092

Authors: Marwa Sridi; Bruno Raffin; Vincent Faucher. Affiliations: CEA DEN DANS DM2S SEMT DYN, F-91191 Gif sur Yvette, France; University Grenoble Alpes, INRIA, France; CEA DEN Cadarache DTN/Dir, F-13108 St Paul lez Durance, France

Parallelizing industrial simulation codes, such as the EUROPLEXUS software dedicated to the analysis of fast transient phenomena, is challenging. In this paper we focus on efficient parallelization on a multi-core shared memory node. We propose to have each thread gather the data it needs for processing a given iteration range before actually advancing the computation by one time step on this range. This lazy cache-aware layout construction makes it possible to keep the original data structure and leads to very localised code modifications. We show that this approach can improve the execution time by up to 40% when the task size is set so that the data fit in the L2 cache.

Keywords: EUROPLEXUS; Shared Memory; Cache-aware Data Layout; Parallel programming
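The gather-then-compute pattern described in the abstract above can be sketched as follows. This is a minimal Python illustration of the control flow only, under stated assumptions, and is not EUROPLEXUS code: `node_values`, `elements`, `advance_range`, `advance_all`, and the averaging update are all hypothetical, every element is assumed to reference the same number of nodes, and Python lists do not reproduce the cache behaviour of the original layout.

```python
from concurrent.futures import ThreadPoolExecutor


def advance_range(node_values, elements, start, stop, dt):
    """Gather the node values one iteration range needs, then advance that range."""
    width = len(elements[start]) if stop > start else 0
    # Gather phase: copy the scattered entries into one small local buffer.
    local = []
    for e in range(start, stop):
        local.extend(node_values[n] for n in elements[e])
    # Compute phase: the (hypothetical) update reads only the local buffer.
    out = []
    for k in range(stop - start):
        chunk = local[k * width:(k + 1) * width]
        out.append(dt * sum(chunk) / width)
    return start, out


def advance_all(node_values, elements, dt, task_size, num_threads=4):
    """Advance every element by one step, one task per iteration range."""
    result = [0.0] * len(elements)
    ranges = [(s, min(s + task_size, len(elements)))
              for s in range(0, len(elements), task_size)]
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        futures = [pool.submit(advance_range, node_values, elements, s, e, dt)
                   for s, e in ranges]
        for future in futures:
            start, out = future.result()
            result[start:start + len(out)] = out  # write back the finished range
    return result
```

Here `task_size` plays the role of the tunable the abstract mentions: it controls how much data each gathered buffer holds, and the original data structure (`node_values`, `elements`) is left untouched.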

HPCmatlab: A Framework for Fast Prototyping of Parallel Applications in Matlab

Procedia Computer Science, 2016, Vol. 80, pp. 1461-1472

Authors: Xinchen Guo; Mukul Dave; Mohamed Sayeed. Affiliation: ASU Research Computing, Arizona State University, Tempe, Arizona, U.S.

The HPCmatlab framework has been developed for Distributed Memory programming in Matlab/Octave using the Message Passing Interface (MPI). The communication routines in the MPI library are implemented using MEX wrappers. Point-to-point, collective as well as one-sided communication is supported. Benchmarking results show better performance than the Mathworks Distributed Computing Server. HPCmatlab has been used to successfully parallelize and speed up Matlab applications developed for scientific computing. The application results show good scalability, while preserving the ease of programmability. HPCmatlab also enables shared memory programming using Pthreads and parallel I/O using the ADIOS package.

Keywords: Parallel programming; Message Passing Interface; Matlab; MEX Functions; Parallel I/O

Parallelization Using Task Parallel Library with Task-Based Programming Model

5th IEEE International Conference on Software Engineering and Service Science (ICSESS)

Authors: Hei, Xinhong; Zhang, Jinlong; Wang, Bin; Jin, Haiyan; Giacaman, Nasser. Affiliations: Xian Univ Technol Sch Engn & Comp Sci Xian Shaanxi Provinc Peoples R China; Shaanxi Key Lab Network Comp & Secur Technol Xian Shaanxi Provinc Peoples R China; Univ Auckland Dept Elect & Comp Engn Auckland 1 New Zealand

ISBN: (print) 9781479932795

In order to reduce the complexity of traditional multithreaded parallel programming, this paper explores a new task-based parallel programming approach using the Microsoft .NET Task Parallel Library (TPL). First, this paper proposes a custom data partitioning optimization method to achieve efficient data parallelism and applies it to matrix multiplication. The result of the application supports the custom data partitioning optimization method. Then we develop a task-parallel application, Image Blender, which illustrates the efficiency and pitfalls associated with task parallelism. Finally, the paper analyzes the performance of our applications. Experimental results show that TPL can dramatically alleviate programmer burden and boost the performance of programs with its task-based parallel programming mechanism.

Keywords: Parallel programming; Task-based; TPL; Data parallelism; Task parallelism
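Data partitioning for a task-based matrix multiplication, as mentioned in the abstract above, can be illustrated by splitting the rows of the left matrix into contiguous ranges and submitting one task per range. The sketch below is written in Python with `concurrent.futures` rather than the .NET TPL, and the even row split in `partition_rows` is an assumed, generic scheme, not the paper's custom partitioning method; all function names are illustrative.

```python
from concurrent.futures import ThreadPoolExecutor


def partition_rows(num_rows, num_chunks):
    """Split range(num_rows) into at most num_chunks contiguous, near-equal ranges."""
    base, extra = divmod(num_rows, num_chunks)
    ranges, start = [], 0
    for k in range(num_chunks):
        size = base + (1 if k < extra else 0)  # first `extra` chunks get one more row
        if size:
            ranges.append((start, start + size))
        start += size
    return ranges


def multiply_rows(a, b, start, stop):
    """Compute rows start..stop-1 of the product a * b."""
    cols = list(zip(*b))  # columns of b, built once per task
    return [[sum(x * y for x, y in zip(row, col)) for col in cols]
            for row in a[start:stop]]


def parallel_matmul(a, b, num_tasks=4):
    """Multiply a by b using one task per row range."""
    ranges = partition_rows(len(a), num_tasks)
    with ThreadPoolExecutor(max_workers=num_tasks) as pool:
        blocks = pool.map(lambda r: multiply_rows(a, b, *r), ranges)
        # map() preserves submission order, so blocks concatenate into the full result.
        return [row for block in blocks for row in block]
```

Coarser ranges mean fewer tasks and less scheduling overhead; finer ranges balance load better. That trade-off is what a partitioning strategy has to tune.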

Implementation of Image Enhancement Algorithms and Recursive Ray Tracing using CUDA

Procedia Computer Science, 2016, Vol. 79, pp. 516-524

Authors: Mr. Diptarup Saha; Mr. Karan Darji; Narendra Patel; Darshak Thakore. Affiliation: Birla Vishvakarama Mahavidyalaya, Vallabh Vidyanagar, Anand, Gujarat, India

This paper aims to reduce execution time by implementing various time-consuming applications on an NVIDIA Graphics Processing Unit (GPU) using the NVIDIA Compute Unified Device Architecture (CUDA) parallel programming model. NVIDIA CUDA provides a platform for developing parallel applications on NVIDIA GPUs, giving developers a way to build high-end parallel processing applications. This paper implements various image processing algorithms on both the Central Processing Unit (CPU) and the GPU. The implemented point-to-point image processing algorithms are a brightening filter, a darkening filter, a negative filter, and an RGB-to-grayscale filter. Various convolution algorithms, which consider the values of neighboring pixels, are also implemented: a Sobel filter for edge detection, a low-pass filter, and a high-pass filter. Performance analysis of the implemented image processing algorithms is done on both CPU and GPU, using images of resolution 3000 x 3000. Color images are used for the point-to-point pixel processing algorithms, and grayscale images are used for all convolution algorithms. For the point-to-point processing algorithms, performance is analyzed while varying the number of threads per block. Recursive ray tracing is also implemented on the GPU and shows a performance gain compared to the serial algorithm run on the CPU.

Keywords: CUDA; Image Processing; NVIDIA GPU; Parallel programming
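The two filter families in the abstract above differ in their data dependencies: a point-to-point filter computes each output pixel from the matching input pixel alone, while a convolution such as Sobel also reads the eight neighbours. The following is a serial Python sketch of both, written for illustration and not taken from the paper. The integer grayscale weights (Rec. 601 luma), the brightening offset, and the clamping of the Sobel magnitude to 255 are assumptions. Every output pixel is computed independently of the others, which is the property that allows a one-thread-per-pixel mapping on a GPU.

```python
def negative(pixel):
    """Point-to-point: invert each 8-bit channel."""
    r, g, b = pixel
    return (255 - r, 255 - g, 255 - b)


def brighten(pixel, delta=40):
    """Point-to-point: add an offset to each channel, clamped to 255."""
    return tuple(min(255, c + delta) for c in pixel)


def to_grayscale(pixel):
    """Point-to-point: weighted sum of channels (assumed Rec. 601 weights)."""
    r, g, b = pixel
    return (299 * r + 587 * g + 114 * b) // 1000


def apply_pointwise(image, op):
    """Apply a per-pixel operation to every pixel of a row-major image."""
    return [[op(p) for p in row] for row in image]


SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel(gray):
    """Convolution: edge magnitude |gx| + |gy| from the 3x3 neighbourhood."""
    h, w = len(gray), len(gray[0])
    out = [[0] * w for _ in range(h)]  # border pixels are left at 0
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            gx = gy = 0
            for dy in range(3):
                for dx in range(3):
                    v = gray[y + dy - 1][x + dx - 1]
                    gx += SOBEL_X[dy][dx] * v
                    gy += SOBEL_Y[dy][dx] * v
            out[y][x] = min(255, abs(gx) + abs(gy))
    return out
```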
