Search Results - Inner Mongolia University Library

16th International Symposium on Problems of Redundancy in Information and Control Systems (REDUNDANCY)

Authors: Rumenko, Nikita; Kostyuck, Alexander (Moscow Tech Univ Commun & Informat, Moscow, Russia)

ISBN: (print) 9781728119441

In this article we consider using random mappings to solve sparse binary subset sums via collision search. A mapping is constructed that suits our purpose and two parallel algorithms are proposed based on known collision-finding techniques. Following the applicability of binary subset sums, results of this paper are relevant to learning parities with noise, decoding random codes and related problems.

Keywords: parallel algorithms

Parallel Algorithms for Enabling Fast and Scalable Analysis of High-Throughput Sequencing Datasets

Author: Nagakishore Jammula (Georgia Institute of Technology)

Degree level: Doctoral

The objective of this research is to develop parallel algorithms for enabling fast and scalable analysis of large-scale high-throughput sequencing datasets. The genome of an organism consists of one or more long DNA sequences called chromosomes, each a sequence of bases. Depending on the organism, the length of the genome can vary from several thousand bases to several billion bases. Genome sequencing, which involves deciphering the sequence of bases of the genome, is an important tool in genomics research. Sequencing instruments widely deployed today can only read short DNA sequences. However, these instruments can read up to several billion such sequences at a time, and are used to sequence a large number of randomly generated short fragments from the genome. These fragments are a few hundred bases long and are commonly referred to as "reads". This work specifically tackles three problems associated with high-throughput sequencing short read datasets: (1) parallel read error correction for large-scale genomics datasets, (2) partitioning of large-scale high-throughput sequencing datasets, and (3) parallel compression of large-scale genomics datasets.

Keywords: High-throughput sequencing; parallel algorithms

A Family of Fast Parallel Greedy Algorithms for the Steiner Forest Problem

36th IEEE International Parallel and Distributed Processing Symposium (IEEE IPDPS)

Authors: Ghalami, Laleh; Grosu, Daniel (Wayne State Univ, Dept Comp Sci, Detroit, MI 48202, USA)

ISBN: (print) 9781665497473

We introduce a family of fast parallel greedy algorithms for the Steiner Forest Problem, a fundamental combinatorial optimization problem in graphs. Given an undirected graph with non-negative weights for edges and a set of pairs of vertices called terminals, the Steiner Forest Problem is to find the minimum cost subgraph that connects each of the terminal pairs together. We design a family of parallel algorithms based on a sequential heuristic greedy algorithm called Paired Greedy which iteratively connects the terminal pairs that have the minimum distance. The family of parallel algorithms consists of a set of algorithms exhibiting various degrees of parallelism determined by the number of pairs that are connected in parallel in each iteration of the algorithms. We implement and run the algorithms on a multi-core system and perform an extensive experimental analysis. The results show that our proposed parallel algorithms achieve significant speedup with respect to the sequential Paired Greedy algorithm and provide solutions with costs that are very close to those of the solutions obtained by the sequential Paired Greedy algorithm.

Keywords: Steiner forest; parallel algorithms; multi-core

Non-Blocking Technique for Parallel Algorithms with Global Barrier Synchronization

International Conference on Computational Science and Computational Intelligence (CSCI)

Authors: Arturo Garza; Claudio A. Parra; Isaac D. Scherson (Department of Computer Science, University of California, Irvine, Irvine, CA, USA)

ISBN: (print) 9781665458429

Sharing data among asynchronous processes is considered to be a hard systems problem in multithreaded modern shared-memory multicore systems. Throughout the literature, multiple solutions have been proposed, like the so-called barrier synchronization. A Barrier is a synchronization primitive that provides guarantees that any thread will not continue execution from a given point until all threads have reached that point. This primitive is widely used in different parallel programming models, but it can easily become a hot-spot for performance critical applications due to its global nature as one preempted thread will stop execution of all other threads waiting at the barrier. This paper suggests a technique to change the global nature of barrier synchronization into a non-blocking synchronization model with lock-free thread progression guarantees. The main idea is to exploit algorithm-based memory access patterns to implement self-synchronizable threads to protect concurrent reads and writes in a shared data structure without explicit use of a barrier primitive. To the best of our knowledge, this is the first attempt to provide a different synchronization mechanism based on the algorithm intrinsic characteristics rather than an explicit use of a global barrier in shared-memory architectures. Our experimental results show factors of performance improvement against its global barrier-based algorithm counterpart.

Keywords: Scientific computing; parallel programming; multicore processing; instruction sets; data structures; synchronization; parallel algorithms
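To make the barrier guarantee in the abstract above concrete, here is a minimal Python sketch using the standard library's `threading.Barrier`. It only illustrates the conventional global barrier that the paper aims to replace, not the authors' non-blocking technique; the function name `run_two_phase` and the two-phase workload are assumptions made for illustration.

```python
import threading


def run_two_phase(num_threads=4):
    """Hypothetical two-phase workload: write own slot, then read a neighbour's slot."""
    slots = [None] * num_threads
    results = [None] * num_threads
    barrier = threading.Barrier(num_threads)

    def worker(i):
        # Phase 1: each thread writes only its own slot.
        slots[i] = i * i
        # Global barrier: no thread continues until every thread has written.
        barrier.wait()
        # Phase 2: reading the neighbour's slot is safe because of the barrier.
        results[i] = slots[(i + 1) % num_threads]

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    print(run_two_phase(4))
```

If any one thread is delayed before `barrier.wait()`, every other thread waits for it, which is the global-stall behaviour the abstract identifies as a performance hot-spot.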

Multilevel Algebraic Approach for Performance Analysis of Parallel Algorithms

Computing and Informatics, 2019, vol. 38, no. 4, pp. 817-850

Authors: D'Amore, Luisa; Mele, Valeria; Romano, Diego; Laccetti, Giuliano. Affiliations: Univ Naples Federico II, Naples, Italy; CNR, Inst High Performance Comp & Networking (ICAR), Naples, Italy

In order to solve a problem in parallel we need to undertake the fundamental step of splitting the computational tasks into parts, i.e. decomposing the problem. An arbitrary decomposition does not necessarily lead to a parallel algorithm with the highest performance. This topic is even more important when complex parallel algorithms must be developed for hybrid or heterogeneous architectures. We present an innovative approach which starts from a decomposition of the problem into parts (sub-problems). These parts are regarded as elements of an algebraic structure and are related to each other according to a suitably defined dependency relationship. The main outcome of this framework is a set of block matrices (dependency, decomposition, memory accesses and execution) which highlight fundamental characteristics of the corresponding algorithm, such as inherent parallelism and sources of overhead. We provide a mathematical formulation of this approach, and we perform a feasibility analysis for the performance of a parallel algorithm in terms of its time complexity and scalability. We compare our results with standard expressions of speed up, efficiency, overhead, and so on. Finally, we show how the multilevel structure of this framework eases the choice of the abstraction level (both for the problem decomposition and for the algorithm description) in order to determine the granularity of the tasks within the performance analysis. This feature helps in understanding the mapping of parallel algorithms onto novel hybrid and heterogeneous architectures.

Keywords: Complexity and performance of numerical algorithms; performance metrics; data decomposition; concurrency; parallel algorithms

Parallel Connectivity Algorithms

3rd International Informatics and Software Engineering Conference, IISEC 2022

Authors: Erciyes, Kayhan; Saribatir, Behcet Melih (Yaşar University, Computer Engineering Dept., İzmir, Turkey)

ISBN: (digital) 9781665459952

ISBN: (print) 9781665459952

We propose and implement two parallel algorithms to test the connectivity and find the connected components of a network in parallel. In both cases, the connectivity matrix of the graph is partitioned among p processors. In the first parallel algorithm (Alg. 2), processors test connectivity in their partitions and then cooperate to decide. The second parallel algorithm (Alg. 4) forms a labelled connectivity matrix and then partitions this matrix among processors to find the components of a disconnected graph. We show that both algorithms achieve significant speedups even with only a few processors. © 2022 IEEE.

Keywords: parallel algorithms
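As a point of reference for the problem this entry addresses, the following is a minimal sequential sketch in Python that labels connected components directly from a connectivity (adjacency) matrix. It is not the authors' Alg. 2 or Alg. 4, which partition the matrix among processors; the function names `connected_components` and `is_connected` are assumptions made for illustration.

```python
def connected_components(adj):
    """Return a component label for every vertex of an undirected graph.

    adj is a square 0/1 connectivity matrix given as a list of lists.
    """
    n = len(adj)
    labels = [-1] * n          # -1 means "not visited yet"
    next_label = 0
    for start in range(n):
        if labels[start] != -1:
            continue
        # Depth-first traversal from an unlabelled vertex.
        labels[start] = next_label
        stack = [start]
        while stack:
            u = stack.pop()
            for v in range(n):
                if adj[u][v] and labels[v] == -1:
                    labels[v] = next_label
                    stack.append(v)
        next_label += 1
    return labels


def is_connected(adj):
    """A graph is connected when all vertices share a single label."""
    return len(set(connected_components(adj))) <= 1


if __name__ == "__main__":
    # Two components: {0, 1} and {2, 3}.
    matrix = [
        [0, 1, 0, 0],
        [1, 0, 0, 0],
        [0, 0, 0, 1],
        [0, 0, 1, 0],
    ]
    print(connected_components(matrix), is_connected(matrix))
```

A parallel version along the lines of the abstract would give each processor a block of rows and then merge labels across blocks; that merging step is not shown here.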

Fast Parallel Algorithms for Counting and Listing Triangles in Big Graphs

ACM Transactions on Knowledge Discovery from Data, 2020, vol. 14, no. 1, pp. 5-5

Authors: Arifuzzaman, Shaikh; Khan, Maleq; Marathe, Madhav. Affiliations: Univ New Orleans, Comp Sci Dept, 2000 Lakeshore Dr, Math 349, New Orleans, LA 70122, USA; Texas A&M Univ Kingsville, Dept Elect Engn & Comp Sci, 700 Univ Blvd, Kingsville, TX 78363, USA; Univ Virginia, Dept Comp Sci, 85 Engineers Way, Charlottesville, VA 22904, USA

Big graphs (networks) arising in numerous application areas pose significant challenges for graph analysts as these graphs grow to billions of nodes and edges and are prohibitively large to fit in the main memory. Finding the number of triangles in a graph is an important problem in the mining and analysis of graphs. In this article, we present two efficient MPI-based distributed memory parallel algorithms for counting triangles in big graphs. The first algorithm employs overlapping partitioning and efficient load balancing schemes to provide a very fast parallel algorithm. The algorithm scales well to networks with billions of nodes and can compute the exact number of triangles in a network with 10 billion edges in 16 minutes. The second algorithm divides the network into non-overlapping partitions leading to a space-efficient algorithm. Our results on both artificial and real-world networks demonstrate a significant space saving with this algorithm. We also present a novel approach that reduces communication cost drastically leading the algorithm to both a space- and runtime-efficient algorithm. Further, we demonstrate how our algorithms can be used to list all triangles in a graph and compute clustering coefficients of nodes. Our algorithm can also be adapted to a parallel approximation algorithm using an edge sparsification method.

Keywords: Triangle-counting; clustering-coefficient; massive networks; parallel algorithms; social networks; graph mining
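The quantities named in the abstract above (exact triangle count and per-node clustering coefficients) can be illustrated with a small sequential Python sketch. This is not the authors' MPI-based distributed algorithm; it uses a common degree-ordering trick so that each triangle is counted once, and the helper names `build_adjacency`, `count_triangles`, and `clustering_coefficients` are assumptions made for illustration.

```python
def build_adjacency(edges):
    """Build an undirected adjacency map from (u, v) pairs, ignoring self-loops."""
    adj = {}
    for u, v in edges:
        if u == v:
            continue
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def count_triangles(edges):
    """Count each triangle exactly once via a (degree, id) ordering of nodes."""
    adj = build_adjacency(edges)
    rank = {u: (len(nbrs), u) for u, nbrs in adj.items()}
    # Keep only neighbours that come later in the ordering.
    higher = {u: {v for v in nbrs if rank[v] > rank[u]} for u, nbrs in adj.items()}
    total = 0
    for u in adj:
        for v in higher[u]:
            # Common later neighbours of u and v close a triangle.
            total += len(higher[u] & higher[v])
    return total


def clustering_coefficients(edges):
    """Local clustering coefficient: 2 * links among neighbours / (d * (d - 1))."""
    adj = build_adjacency(edges)
    coeffs = {}
    for u, nbrs in adj.items():
        d = len(nbrs)
        if d < 2:
            coeffs[u] = 0.0
            continue
        # Each neighbour-neighbour edge is seen from both endpoints, so halve.
        links = sum(len(adj[v] & nbrs) for v in nbrs) // 2
        coeffs[u] = 2.0 * links / (d * (d - 1))
    return coeffs


if __name__ == "__main__":
    sample = [(0, 1), (1, 2), (0, 2), (2, 3)]
    print(count_triangles(sample), clustering_coefficients(sample))
```

In a distributed setting such as the one the abstract describes, the per-node loop would be split across partitions, with the partitioning scheme determining how much neighbour data each process must hold or exchange.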

Efficient CGM-based parallel algorithms for the longest common subsequence problem with multiple substring-exclusion constraints

Parallel Computing, 2020, vol. 91, pp. 102598-102598

Authors: Tchendji, Vianney Kengne; Ngomade, Armel Nkonjoh; Zeutouo, Jerry Lacmou; Myoupo, Jean Frederic. Affiliations: Univ Dschang, Dept Math & Comp Sci, Dschang, Cameroon; Univ Picardie Jules Verne, Comp Sci Lab MIS, Amiens, France

A variant of the Longest Common Subsequence (LCS) problem is the LCS problem with multiple substring-exclusion constraints (M-STR-EC-LCS), which is of great importance in many fields, especially bioinformatics. The problem consists of computing the LCS of two strings X and Y, of lengths n and m respectively, that excludes a set of d constraints P = {P_1, P_2, ..., P_d} of total length r. Recently, Wang et al. proposed a sequential solution based on dynamic programming that requires O(nmr) execution time and space. To the best of our knowledge, there are no parallel solutions for this problem. This paper describes new efficient parallel algorithms on the Coarse Grained Multicomputer (CGM) model to solve this problem. First, we propose a multi-level Direct Acyclic Graph (DAG) that determines the correct evaluation order of sub-problems in order to avoid redundancy due to overlap. Second, we propose two CGM parallel algorithms based on our DAG. The first algorithm is based on a regular partitioning of the DAG and requires O(nmr/p) execution time with O(p) communication rounds, where p is the number of processors used. Its main drawback is high processor idle time: because of the dependencies between the nodes in the DAG, many processors become idle over time. The second algorithm uses an irregular partitioning of the DAG that minimizes this idle time by allowing the processors to stay active as long as possible. It requires O(nmr/p) execution time with O(kp) communication rounds, where k is a constant integer used to set up the irregular partitioning. Both algorithms require O(r|Σ|/p) preprocessing time, where |Σ| is the size of the alphabet. The experimental results show good agreement with the theoretical predictions. (C) 2019 Elsevier B.V. All rights reserved.

Keywords: parallel algorithms; Coarse grained multicomputer; Dynamic programming; Multiple-constrained LCS; Direct acyclic graph

Parallel Algorithms for Modelling Two-Dimensional Non-Equilibrium Salt Transfer Processes on the Base of Fractional Derivative Model

Fractional Calculus and Applied Analysis, 2018, vol. 21, no. 3, pp. 654-671

Author: Bohaienko, Vsevolod (Natl Acad Sci Ukraine, VM Glushkov Inst Cybernet, Acad Glushkov Ave 40, UA-03187 Kiev, Ukraine)

Modelling of salt transfer processes in fractal structured media is considered on the basis of fractional derivative equations with Caputo-Gerasimov derivatives with respect to space variables. The initial-boundary problem is solved using a locally one-dimensional finite difference scheme. A procedure for approximating the fractional derivative is proposed to lower the computational complexity of the solution process. Parallel algorithms for distributed memory systems and GPUs are considered. An analysis of one-dimensional and red-black data partitioning schemes is presented, and a new parametric scheme, which has better characteristics under the determined conditions, is proposed.

Keywords: parallel algorithms; GPU; fractional order equations; approximated solutions; finite difference schemes; salt transfer

Parallel algorithms for reducing derivation time of distinguishing experiments for nondeterministic finite state machines

International Journal of Parallel, Emergent and Distributed Systems, 2018, vol. 33, no. 2, pp. 197-210

Authors: El-Fakih, Khaled; Barlas, Gerassimos; Ali, Mustafa; Yevtushenko, Nina. Affiliations: Amer Univ Sharjah, Dept Comp Sci & Engn, Sharjah, U Arab Emirates; Tomsk State Univ, Dept Informat Technol, Tomsk, Russia

Many approaches have been proposed for deriving tests from finite state machine (FSM) specifications with respect to some established coverage criteria. A fundamental core problem in FSM-based testing relates to the derivation of input sequences that can distinguish states of an FSM specification, known as distinguishing sequences. A major effort in the construction of these sequences is based on the derivation of a successors search-tree labeled by sets of pairs of states of the given machine. We aim at reducing the time associated with such constructions through the use of state-of-the-art parallel technologies. Namely, we propose a parallel algorithm that we implement and evaluate on multicore CPUs and on many-core GPUs. We evaluate two alternative GPU implementations that use the CUDA and Thrust software platforms, and a solution based on a network of workstations. The latter uses a workload partitioning based on Divisible Load Theory. A rigorous set of experiments highlights the differences of the proposed implementations in terms of execution time and speedup.

Keywords: Conformance testing; distinguishing experiments; nondeterministic finite state machines; divisible load theory; parallel algorithms; CUDA; Thrust; network of workstations
