Search Results - Inner Mongolia University Library

41st ACM SIGMOD-SIGACT-SIGAI Symposium on Principles of Database Systems (PODS)

Authors: Gottlob, Georg; Lanzinger, Matthias; Okulmus, Cem; Pichler, Reinhard (Univ Oxford, Oxford, England; TU Wien, Vienna, Austria)

ISBN: (print) 9781450392600

Various classic reasoning problems with natural hypergraph representations are known to be tractable when a hypertree decomposition (HD) of low width exists. The resulting algorithms are attractive for practical use in fields like databases and constraint satisfaction. However, algorithmic use of HDs relies on the difficult task of first computing a decomposition of the hypergraph underlying a given problem instance, which is then used to guide the algorithm for this particular instance. The performance of purely sequential methods for computing HDs is inherently limited, yet the problem is, theoretically, amenable to parallelisation. In this paper we propose the first algorithm for computing hypertree decompositions that is well-suited for parallelisation. The newly proposed algorithm log-k-decomp requires only a logarithmic number of recursion levels and additionally allows for highly parallelised pruning of the search space by restriction to so-called balanced separators. We provide a detailed experimental evaluation over the HyperBench benchmark and demonstrate that log-k-decomp outperforms the current state-of-the-art significantly.

Keywords: hypergraph decomposition hypertree width parallel algorithms

PARALLEL COMPOSITION OF WEIGHTED FINITE-STATE TRANSDUCERS

47th IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP)

Authors: Sengupta, Shubho; Pratap, Vineel; Hannun, Awni (Facebook AI Res, Menlo Pk, CA 94025, USA; Zoom AI, San Jose, CA, USA)

ISBN: (print) 9781665405409

Finite-state transducers (FSTs) are frequently used in speech recognition. Transducer composition is an essential operation for combining different sources of information at different granularities. However, composition is also one of the more computationally expensive operations. Due to the heterogeneous structure of FSTs, parallel algorithms for composition are suboptimal in efficiency, generality, or both. We propose an algorithm for parallel composition and implement it on graphics processing units. We benchmark our parallel algorithm on the composition of random graphs and the composition of graphs commonly used in speech recognition. The parallel composition scales better with the size of the input graphs and for large graphs can be as much as 10 to 30 times faster than a sequential CPU algorithm.

Keywords: finite-state transducers parallel algorithms GPUs
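
As a point of reference for the composition operation this abstract describes, the following is a minimal sequential sketch, not the authors' GPU algorithm. It assumes epsilon-free transducers whose weights add along a path (tropical semiring); the dictionary layout and the name `compose` are illustrative assumptions.

```python
from collections import deque


def compose(a, b):
    """Compose two epsilon-free weighted FSTs (weights add along a path).

    Each FST is a dict with keys (an assumed, illustrative layout):
      'start':  start state
      'finals': set of final states
      'arcs':   {state: [(ilabel, olabel, weight, next_state), ...]}
    The result uses the same layout; its states are pairs (state_a, state_b).
    """
    start = (a["start"], b["start"])
    arcs, finals = {}, set()
    queue, seen = deque([start]), {start}
    while queue:
        qa, qb = queue.popleft()
        if qa in a["finals"] and qb in b["finals"]:
            finals.add((qa, qb))
        out = arcs.setdefault((qa, qb), [])
        # Pair every arc leaving qa with every arc leaving qb and keep
        # the pairs where A's output label matches B's input label.
        for ia, oa, wa, na in a["arcs"].get(qa, []):
            for ib, ob, wb, nb in b["arcs"].get(qb, []):
                if oa != ib:
                    continue
                nxt = (na, nb)
                out.append((ia, ob, wa + wb, nxt))
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append(nxt)
    return {"start": start, "finals": finals, "arcs": arcs}


# Toy input: A maps "a" to "x"; B maps "x" to "p" and "y" to "q".
A = {"start": 0, "finals": {1}, "arcs": {0: [("a", "x", 1.0, 1)]}}
B = {"start": 0, "finals": {1},
     "arcs": {0: [("x", "p", 0.5, 1), ("y", "q", 2.0, 1)]}}
C = compose(A, B)
```

The nested loop over arc pairs is where the cost grows with the product of the out-degrees, so it is the natural candidate for the kind of parallel execution the abstract discusses.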

MPI-Based Methods for Network Reliability Calculation

LOBACHEVSKII JOURNAL OF MATHEMATICS, 2024, Vol. 45, No. 7, pp. 3130-3137

Author: Migov, D. A. (Novosibirsk State Tech Univ, Novosibirsk 630073, Russia)

The paper considers the problem of exact network reliability calculation. We assume that a network has unreliable communication links and perfectly reliable nodes. The reliability of such a network is defined as the probability that every pair of nodes of the network is connected by an operational path. The problem of computing this characteristic is known to be NP-hard. For supercomputers with distributed memory, we study ways of parallelizing the well-known recursive factoring method. The best parallel algorithm among the approaches considered is based on a Master-Slave scheme that uses a threshold for the minimal size of a graph to be sent to a new process without recursion backtracking. This algorithm has a linear or even superlinear speedup up to 768 cores. The numerical results show that the scalability depends on the chosen threshold for the minimal size of a graph to be sent to a new process, which, in turn, depends on the density of the graph.

Keywords: network reliability parallel algorithms random graph connectivity distributed memory MPI factoring
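
The recursive factoring method mentioned in this abstract rests on conditioning on the state of one link: R(G) = p_e R(G/e) + (1 - p_e) R(G - e), where G/e contracts link e and G - e deletes it. A minimal sequential sketch under that standard formulation follows; the function names and the edge-list representation are assumptions for illustration, and it is not the paper's MPI implementation.

```python
def is_connected(nodes, edges):
    """Return True if the (multi)graph on `nodes` is connected."""
    nodes = set(nodes)
    if len(nodes) <= 1:
        return True
    adj = {n: [] for n in nodes}
    for u, v, _ in edges:
        adj[u].append(v)
        adj[v].append(u)
    start = next(iter(nodes))
    seen, stack = {start}, [start]
    while stack:
        for w in adj[stack.pop()]:
            if w not in seen:
                seen.add(w)
                stack.append(w)
    return len(seen) == len(nodes)


def reliability(nodes, edges):
    """All-terminal reliability by factoring on the first edge.

    `edges` is a list of (u, v, p) with p the probability that the link
    works. Assumes the input has no self-loops and that links fail
    independently.
    """
    nodes = frozenset(nodes)
    if len(nodes) <= 1:
        return 1.0
    if not is_connected(nodes, edges):
        return 0.0
    (u, v, p), rest = edges[0], edges[1:]
    # Link works: contract v into u, dropping any self-loops created.
    contracted = []
    for a, b, q in rest:
        a = u if a == v else a
        b = u if b == v else b
        if a != b:
            contracted.append((a, b, q))
    works = reliability(nodes - {v}, contracted)
    # Link fails: delete it and keep both endpoints.
    fails = reliability(nodes, rest)
    return p * works + (1 - p) * fails


triangle = [(0, 1, 0.9), (1, 2, 0.9), (0, 2, 0.9)]
r = reliability({0, 1, 2}, triangle)  # 3 * 0.9**2 * 0.1 + 0.9**3 = 0.972
```

The two recursive calls are independent of each other, which is the property a master-worker scheme can exploit by handing sufficiently large subgraphs to other processes.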

An Efficient Parallel Implementation of a Perfect Hashing Method for Hypergraphs

36th IEEE International Parallel and Distributed Processing Symposium (IEEE IPDPS)

Authors: Singh, Somesh; Ucar, Bora (Univ Lyon, ENS Lyon, INRIA, Lyon, France; Univ Lyon, ENS Lyon, INRIA, LIP, CNRS, Lyon, France; Univ Lyon, ENS Lyon, INRIA, CNRS, Lyon, France)

ISBN: (print) 9781665497473

Querying the existence of an edge in a given graph or hypergraph is a building block in several algorithms. Hashing-based methods can be used for this purpose, where the given edges are stored in a hash table in a preprocessing step, and then the queries are answered using the lookup operations. While the general hashing methods have fast lookup times in the average case, the worst case run time is much higher. Perfect hashing methods take advantage of the fact that the items to be stored are all available and construct a collision free hash function for the given input, resulting in an optimal lookup time even in the worst case. We investigate an efficient shared-memory parallel implementation of a recently proposed perfect hashing method for hypergraphs. We experimentally compare the resulting parallel algorithms with the state-of-the-art and demonstrate better run time and scalability on a set of hypergraphs corresponding to real-life sparse tensors.

Keywords: Hash functions Distributed processing Tensors Scalability Conferences Sparse matrices parallel algorithms

Brief Announcement: Distributed Unconstrained Local Search for Multilevel Graph Partitioning

36th ACM Symposium on Parallelism in Algorithms and Architectures (SPAA)

Authors: Sanders, Peter; Seemaier, Daniel (Karlsruhe Inst Technol, Karlsruhe, Germany)

ISBN: (print) 9798400704161

Partitioning a graph into blocks of roughly equal weight while cutting only few edges is a fundamental problem in computer science with numerous practical applications. While shared-memory parallel partitioners have recently matured to achieve the same quality as widely used sequential partitioners, there is still a pronounced quality gap between distributed partitioners and their sequential counterparts. In this work, we shrink this gap considerably by describing the engineering of an unconstrained local search algorithm suitable for distributed partitioners. We integrate the proposed algorithm in a distributed multilevel partitioner. Our extensive experiments show that the resulting algorithm scales to thousands of PEs while computing cuts that are, on average, only 3.5% larger than those of a state-of-the-art high-quality shared-memory partitioner. Compared to previous distributed partitioners, we obtain on average 6.8% smaller cuts than the best-performing competitor while being more than 9 times faster.

Keywords: graph partitioning distributed algorithms parallel algorithms

Efficient parallel algorithms for the maximum subarray problem

12th Australasian Symposium on Parallel and Distributed Computing, AusPDC 2014

Author: Takaoka, Tadao (Department of Computer Science, University of Canterbury, Christchurch, New Zealand)

ISBN: (print) 9781921770340

Parallel algorithm design is generally hard. Parallel program verification is even harder. We take an example from the maximum subarray problem and show those two problems of design and verification. The best known number of communication steps on a mesh architecture for the maximum subarray problem is 2n - 1. We give a formal proof for the parallel algorithm on the mesh architecture based on Hoare logic. The main part of the proof is to establish several space/time invariants with three indices (i, j, k). The (i, j) pair specifies the invariant at the (i, j) grid point of the mesh and k specifies the k-th step in the computation. Then, ignoring additive constants, the communication steps are improved to (3/2)n steps and finally n steps, which is optimal in terms of communication steps. Also, the first algorithm is implemented on a Blue Gene parallel computer, and the performance measurements conducted are shown. © 2014, Australian Computer Society, Inc.

Keywords: parallel algorithms
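
For orientation, a sequential baseline for the problem in this abstract can be sketched as follows, assuming the two-dimensional formulation (the maximum-sum rectangular block of a matrix). It applies Kadane's algorithm to accumulated column sums, a well-known sequential technique and not the mesh algorithm the paper verifies; the function names are illustrative.

```python
def max_subarray_1d(values):
    """Kadane's algorithm: maximum sum of a non-empty contiguous run."""
    best = current = values[0]
    for x in values[1:]:
        # Either extend the current run or start a new one at x.
        current = max(x, current + x)
        best = max(best, current)
    return best


def max_subarray_2d(matrix):
    """Maximum sum over all non-empty rectangular blocks.

    For every pair (top, bottom) of rows, the columns are collapsed into
    per-column sums and the 1D routine finds the best column range, giving
    O(rows^2 * cols) time for a non-empty rectangular matrix.
    """
    rows, cols = len(matrix), len(matrix[0])
    best = matrix[0][0]
    for top in range(rows):
        col_sums = [0] * cols
        for bottom in range(top, rows):
            for c in range(cols):
                col_sums[c] += matrix[bottom][c]
            best = max(best, max_subarray_1d(col_sums))
    return best


result_1d = max_subarray_1d([-2, 1, -3, 4, -1, 2, 1, -5, 4])  # run 4, -1, 2, 1
result_2d = max_subarray_2d([[1, -2], [-3, 4]])               # single cell 4
```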

Mini-batching with Fused Training and Testing for Data Streams Processing on the Edge

21st ACM International Conference on Computing Frontiers (CF)

Authors: Luna, Reginaldo; Cassales, Guilherme; Pfahringer, Bernhard; Bifet, Albert; Gomes, Heitor Murilo; Senger, Hermes (Univ Fed Sao Carlos, Sao Carlos, Brazil; Univ Waikato, Hamilton, New Zealand; Victoria Univ Wellington, Wellington, New Zealand)

ISBN: (print) 9798400705977

Edge Computing (EC) has emerged as a solution to reduce energy demand and greenhouse gas emissions from digital technologies. EC supports low latency, mobility, and location awareness for delay-sensitive applications by bridging the gap between cloud computing services and end-users. Machine learning (ML) methods have been applied in EC for data classification and information processing. Ensemble learners have often proven to yield high predictive performance on data stream classification problems. Mini-batching is a technique proposed for improving cache reuse in multi-core architectures of bagging ensembles for the classification of online data streams, which benefits application speedup and reduces energy consumption. However, the original mini-batching presents limited benefits in terms of cache reuse and it hinders the accuracy of the ensembles (i.e., their capacity to detect behavior changes in data streams). In this paper, we improve mini-batching by fusing continuous training and test loops for the classification of data streams. We evaluated the new strategy by comparing its performance and energy efficiency with the original mini-batching for data stream classification using six ensemble algorithms and four benchmark datasets. We also compare mini-batching strategies with two hardware-based strategies supported by commodity multi-core processors commonly used in EC. Results show that mini-batching strategies can significantly reduce energy consumption in 95% of the experiments. Mini-batching improved energy efficiency by 96% on average and 169% in the best case. Likewise, our new mini-batching strategy improved energy efficiency by 136% on average and 456% in the best case. These strategies also support better control of the balance between performance, energy efficiency, and accuracy.

Keywords: Computing methodologies parallel computing methodologies parallel algorithms Shared memory algorithms

Probe Machine Based Computing Model for Maximum Clique Problem

Chinese Journal of Electronics, 2022, Vol. 31, No. 2, pp. 304-312

Authors: CUI Jianzhong; YIN Zhixiang; TANG Zhen; YANG Jing (Department of Computer, Huainan Union University; School of Electronic and Information Engineering, Anhui University of Science & Technology; School of Mathematics, Physics and Statistics, Shanghai University of Engineering Science; School of Mathematics and Big Data, Anhui University of Science & Technology)

Probe machine (PM) is a recently reported mathematical model with massive parallelism. Herein, we present a search for the maximum clique of an undirected graph with six vertices. We constructed a data library containing n sublibraries, with each sublibrary corresponding to a vertex in the given graph. Then, a probe library was designed according to the induced subgraph in order to search and generate all maximal cliques. Subsequently, we performed the probe operation, and all maximal cliques were generated in parallel. The advantages of the proposed model lie in two aspects. On one hand, the solution to an NP-complete problem is generated in just one step of probe operation rather than found in vast solution *** the other hand, the proposed model is highly *** work demonstrates that PM is superior to TM in terms of searching capacity when tackling NP-complete problems.

Keywords: maximal cliques maximum clique problem searching capacity data library NP-complete problem computational complexity model lie induced subgraph probe operation undirected graph mathematic model graph theory parallel algorithms optimisation parallelism computing model vertex probe library probe machine
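
The task this abstract addresses, generating all maximal cliques and reading off a maximum one, can be illustrated with a conventional sequential sketch. It uses the Bron-Kerbosch recursion (without pivoting) rather than the probe machine model, and the six-vertex graph below is a hypothetical example, not the graph from the paper.

```python
def maximal_cliques(adj):
    """Enumerate all maximal cliques with Bron-Kerbosch (no pivoting).

    `adj` maps each vertex to the set of its neighbours.
    """
    cliques = []

    def expand(r, p, x):
        # r: current clique, p: candidates that can extend it,
        # x: vertices already tried (used to detect non-maximal cliques).
        if not p and not x:
            cliques.append(r)
            return
        for v in list(p):
            expand(r | {v}, p & adj[v], x & adj[v])
            p = p - {v}
            x = x | {v}

    expand(frozenset(), frozenset(adj), frozenset())
    return cliques


# Hypothetical six-vertex graph: two triangles joined by the edge 3-4.
graph = {
    1: {2, 3}, 2: {1, 3}, 3: {1, 2, 4},
    4: {3, 5, 6}, 5: {4, 6}, 6: {4, 5},
}
cliques = maximal_cliques(graph)
largest = max(cliques, key=len)
```

Here the maximal cliques are {1, 2, 3}, {3, 4}, and {4, 5, 6}, so a maximum clique has three vertices.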

Efficient 3D Hilbert Curve Encoding and Decoding Algorithms

Chinese Journal of Electronics, 2022, Vol. 31, No. 2, pp. 277-284

Authors: JIA Lianyin; LIANG Binbin; LI Mengjuan; LIU Yong; CHEN Yinong; DING Jiaman (Faculty of Information Engineering and Automation, Kunming University of Science and Technology; Yunnan Key Laboratory of Artificial Intelligence, Kunming University of Science and Technology; Library, Yunnan Normal University; School of Information Science and Engineering, Guangxi University for Nationalities; School of Computing, Informatics and Decision Systems, Arizona State University)

Hilbert curve describes a one-to-one mapping between multidimensional space and 1D *** traditional 3D Hilbert encoding and decoding algorithms work in an order-wise manner, are not aware of the differences between input data, and spend equivalent computing costs on them, thus resulting in low efficiency. To solve this problem, in this paper we design efficient 3D state views for fast encoding and decoding. Based on the state views designed, a new encoding algorithm (JFK-3HE) and a new decoding algorithm (JFK-3HD) are proposed. JFK-3HE and JFK-3HD can avoid iteratively encoding or decoding each order by skipping the first 0s in input data, thus decreasing the complexity and improving the efficiency. Experimental results show that JFK-3HE and JFK-3HD outperform the state-of-the-art algorithms for both uniform and skew-distributed data.

Keywords: order-wise manner multidimensional space encoding state-of-the-arts algorithms encoding algorithm parallel algorithms JFK-3HE statistical distributions decoding solid modelling design efficient 3D JFK-3HD traditional 3D Hilbert encoding equivalent computing costs different input data decoding algorithm Hilbert curve state views iterative methods

Parallel integer multiplication

30th Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP)

Author: Samuel, Vivien (PSL Res Univ, CNRS, Ecole Normale Super, Dept Informat ENS, 45 Rue Ulm, Paris, France; Univ Lorraine, CNRS, INRIA, LORIA, Nancy, France)

ISBN: (print) 9781665469586

Multiplication is a fundamental step in many algorithms. If the multiplication of two integers of n words has a complexity of M(n), divisions and squares can be computed in O(M(n)) as well, and the greatest common divisor can be computed in O(M(n) log n). Thus, having a small value for M(n) is extremely important. To this day, the best known algorithm for reachable values is the Schönhage-Strassen algorithm, which is implemented by a few arithmetic libraries. Asymptotically faster algorithms exist; however, no computer is able to hold numbers big enough for those algorithms to outrun Schönhage-Strassen. The GNU Multiple Precision (GMP) library has a sequential-only implementation of Schönhage-Strassen. However, some algorithms contain a step that is a single big multiplication. Thus, when trying to parallelize such an algorithm, one requires a parallel algorithm for multiplication. An example of such an algorithm is the batch factorization for the Number Field Sieve. Thus, people trying to implement a parallel version of such algorithms need to find an arithmetic library that implements a parallel integer multiplication. An example of such a library is the Flint (Fast Library for Number Theory) library, which contains a parallel implementation of Schönhage-Strassen. In this article we present an implementation of Schönhage-Strassen that reaches a speedup of 20 for the multiplication of two integers of 10^7 words of 64 bits using a Xeon Gold with 32 cores.

Keywords: Gold Codes Libraries Complexity theory parallel algorithms Arithmetic
