ISBN (print): 9781728119441
In this article we consider using random mappings to solve sparse binary subset sums via collision search. A mapping is constructed that suits our purpose, and two parallel algorithms are proposed based on known collision-finding techniques. Given the broad applicability of binary subset sums, the results of this paper are relevant to learning parities with noise, decoding random codes, and related problems.
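As a rough illustration of the collision-search primitive the abstract above builds on (a generic textbook sketch, not the authors' mapping or their parallel algorithms; the function name and the table-based random mapping are illustrative assumptions), the Python snippet below finds a collision of a mapping f on {0, ..., N-1} using Floyd's cycle-finding method.

import random

def find_collision(f, x0):
    # Floyd cycle finding on the iteration x -> f(x), started at x0.
    # Returns a pair (x, y) with x != y and f(x) == f(y), or None if x0
    # already lies on the cycle (retry with a different start in that case).
    tortoise, hare = f(x0), f(f(x0))
    while tortoise != hare:                      # phase 1: meet inside the cycle
        tortoise, hare = f(tortoise), f(f(hare))
    tortoise = x0
    while f(tortoise) != f(hare):                # phase 2: stop one step before
        tortoise, hare = f(tortoise), f(hare)    # the walkers reach the cycle entry
    return (tortoise, hare) if tortoise != hare else None

if __name__ == "__main__":
    N = 1 << 16
    table = [random.randrange(N) for _ in range(N)]   # a concrete random mapping
    print(find_collision(lambda x: table[x], random.randrange(N)))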
ISBN (print): 0818691948
In a complete directed weighted graph there are jobs located at the nodes of the graph. Job i has an associated processing or handling time h(i), and the job must start within a prespecified time window [r(i), d(i)]. A vehicle moves on the arcs of the graph at unit speed and has to execute the jobs within their respective time windows. We consider three different problems on the CREW PRAM. (1) Find the minimum cost routes between all pairs of nodes in the network. We give an O(log^3 n) time algorithm with n^4/log^2 n processors. (2) Service all locations in minimum time. The general problem is NP-complete, but O(n^2) time algorithms are known for a special case; for this case we obtain an O(log^3 n) time parallel algorithm using n^4/log^2 n processors, and also a linear time optimal parallel algorithm. (3) Minimize the sum of waiting times at all locations. The general problem is NP-complete, but O(n^2) time algorithms are known for a special case; for this case, we obtain an O(log^2 n) time algorithm with n^3/log n processors, and also a linear time optimal parallel algorithm.
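Problem (1) is the classical all-pairs shortest path computation, which parallelizes naturally via repeated squaring of the distance matrix under the (min, +) product: shortest routes use at most n - 1 arcs, so a logarithmic number of squarings suffices, and every entry of a product is an independent min-reduction. The sequential Python sketch below only illustrates that structure; the processor and time bounds quoted above come from the paper's CREW PRAM construction, not from this code.

import math

def min_plus_square(D):
    # One (min, +) product D * D: entry (i, j) becomes the cheapest i -> j
    # cost using at most twice as many arcs as before.
    n = len(D)
    return [[min(D[i][k] + D[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def all_pairs_routes(weights):
    # weights[i][j] is the arc cost from i to j (math.inf if absent).
    # ceil(log2(n - 1)) squarings of the distance matrix suffice.
    n = len(weights)
    D = [[0 if i == j else weights[i][j] for j in range(n)] for i in range(n)]
    for _ in range(max(1, math.ceil(math.log2(max(n - 1, 1))))):
        D = min_plus_square(D)
    return D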
We present a new parallel algorithm for computing a maximum cardinality matching in a bipartite graph suitable for distributed memory computers. The presented algorithm is based on the push-relabel algorithm, which is known to be one of the fastest algorithms for the bipartite matching problem. Previous attempts at developing parallel implementations of it have focused on shared memory computers using only a limited number of processors. We first present a straightforward adaptation of these shared memory algorithms to distributed memory computers. However, this is not a viable approach as it requires too much communication. We then develop our new algorithm by modifying the previous approach through a sequence of steps, with the main goal being to reduce the amount of communication and to increase load balance. The first goal is achieved by changing the algorithm so that many push and relabel operations can be performed locally between communication rounds, and also by selecting augmenting paths that cross processor boundaries infrequently. To achieve good load balance, we limit the speed at which global relabelings traverse the graph. In several experiments on a large number of instances, we study weak and strong scalability of our algorithm using up to 128 processors. The algorithm can also be used to find epsilon-approximate matchings quickly.
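To pin down the problem being solved, here is the simple sequential augmenting-path (Kuhn) algorithm for maximum cardinality bipartite matching; it is deliberately not the push-relabel method of the paper, only a compact, self-contained reference for the same problem (the function name and adjacency-list encoding are our own choices).

def max_bipartite_matching(adj, n_left, n_right):
    # Maximum cardinality bipartite matching via repeated augmenting-path
    # search (Kuhn's algorithm).  adj[u] lists the right-side neighbours of
    # left vertex u.  Returns (size, match_left), where match_left[u] is the
    # right vertex matched to u, or -1 if u stays unmatched.
    match_left = [-1] * n_left
    match_right = [-1] * n_right

    def try_augment(u, seen):
        for v in adj[u]:
            if not seen[v]:
                seen[v] = True
                # take v if it is free, or if its partner can be re-matched
                if match_right[v] == -1 or try_augment(match_right[v], seen):
                    match_left[u], match_right[v] = v, u
                    return True
        return False

    size = sum(try_augment(u, [False] * n_right) for u in range(n_left))
    return size, match_left

# Tiny example: three left vertices, three right vertices.
print(max_bipartite_matching([[0, 1], [0], [1, 2]], 3, 3))   # -> (3, [1, 0, 2])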
ISBN (print): 0818672552
In this paper we design and analyse parallel algorithms with the goal of obtaining exact bounds on their speed-ups on real machines. For this purpose we employ the BSP model of [3], which is an extension of Valiant's BSP model [13] and rewards blockwise communication. Further, we use Valiant's notion of c-optimality. Intuitively, a c-optimal parallel algorithm for p processors achieves a speed-up close to p/c, with communication time asymptotically smaller than computation time. We consider a basic problem in image processing, connected component labeling for two- and three-dimensional images. Our algorithms are randomized and 2-optimal with high probability for a wide range of BSP parameters, where the range becomes larger with growing input sizes. Our algorithms improve on previous results as they either need an asymptotically smaller amount of data to be communicated or fewer communication rounds. We further report on implementation work and experiments.
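As a point of reference for what these parallel algorithms compute, the following sequential union-find sketch labels the connected components of a binary 2-D image under 4-connectivity; it is only an illustration of the problem, not the paper's BSP algorithms, which distribute this work across processors.

def label_components(image):
    # Sequential union-find labeling of a binary image under 4-connectivity.
    # image is a list of rows of 0/1 values; returns a same-shaped array of
    # component ids, with 0 for background pixels.
    rows, cols = len(image), len(image[0])
    parent = list(range(rows * cols))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]        # path halving
            x = parent[x]
        return x

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            parent[rb] = ra

    for i in range(rows):
        for j in range(cols):
            if image[i][j]:
                if i > 0 and image[i - 1][j]:
                    union(i * cols + j, (i - 1) * cols + j)
                if j > 0 and image[i][j - 1]:
                    union(i * cols + j, i * cols + j - 1)

    ids, labels = {}, [[0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            if image[i][j]:
                labels[i][j] = ids.setdefault(find(i * cols + j), len(ids) + 1)
    return labels

print(label_components([[1, 0, 1],
                        [1, 0, 0],
                        [0, 0, 1]]))   # three components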
ISBN (print): 9781611974782
The Lovász Local Lemma (LLL) is a cornerstone principle in the probabilistic method of combinatorics, and a seminal algorithm of Moser & Tardos (2010) provides an efficient randomized algorithm to implement it. This algorithm can be parallelized to give an algorithm that uses polynomially many processors and runs in O(log^3 n) time, stemming from O(log n) adaptive computations of a maximal independent set (MIS). Chung et al. (2014) developed faster local and parallel algorithms, potentially running in time O(log^2 n), but these algorithms work under significantly more stringent conditions than the LLL. We give a new parallel algorithm that works under essentially the same conditions as the original algorithm of Moser & Tardos but uses only a single MIS computation, thus running in O(log^2 n) time. This conceptually new algorithm also gives a clean combinatorial description of a satisfying assignment which might be of independent interest. Our techniques extend to the deterministic LLL algorithm given by Chandrasekaran et al. (2013), leading to an NC algorithm running in time O(log^2 n) as well. We also provide improved bounds on the runtimes of the sequential and parallel resampling-based algorithms originally developed by Moser & Tardos. Our bounds extend to any problem instance in which the tighter Shearer LLL criterion is satisfied. We also improve on the analysis of Kolipaka & Szegedy (2011) to give tighter concentration results.
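The resampling primitive that both the sequential and parallel algorithms repeat can be stated in a few lines. Below is the plain sequential Moser & Tardos loop specialized to k-SAT (the parallel variants instead resample a maximal independent set of violated clauses per round); the clause encoding and helper names are our own illustrative choices.

import random

def moser_tardos_ksat(clauses, n_vars, rng=random.Random(0)):
    # Sequential Moser-Tardos resampling for k-SAT.  Clauses are lists of
    # non-zero integers in DIMACS style: +v is variable v, -v its negation.
    # The loop keeps resampling the variables of some violated clause until
    # no clause is violated; under the LLL condition this terminates fast.
    assign = {v: rng.random() < 0.5 for v in range(1, n_vars + 1)}

    def violated(clause):
        # a clause is violated when every one of its literals is false
        return all((lit < 0) == assign[abs(lit)] for lit in clause)

    while True:
        bad = [c for c in clauses if violated(c)]
        if not bad:
            return assign
        for lit in rng.choice(bad):              # resample one violated clause
            assign[abs(lit)] = rng.random() < 0.5

# (x1 or x2) and (not x1 or x3)
print(moser_tardos_ksat([[1, 2], [-1, 3]], 3))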
ISBN (print): 9783642041273
Mean-payoff games (MPGs) have many applications, especially in the synthesis, analysis and verification of computer systems. Because of the size of these systems, there is a need to solve very large MPGs. Existing algorithms for solving MPGs are sequential, hence limited by the power of a single computer. In this paper, we propose several parallel algorithms based on the sequential ones. We also evaluate and compare the parallel algorithms experimentally.
ISBN (print): 9783540549451
Let k be a positive integer. A subset Q of the vertex set of a graph G is k-dependent in G if each vertex of Q has no more than k neighbours in Q. We present a parallel algorithm which computes a maximal k-dependent set in a graph on n nodes in time O(log^4 n) on an EREW PRAM with O(n^2) processors. In this way, we establish the membership of the problem of constructing a maximal k-dependent set in the class NC. Our algorithm can be easily adapted to compute a maximal k-dependent set in a graph of bounded valence in time O(log* n) using only O(n) EREW PRAM processors. Let f be a positive integer function defined on the set V of vertices of a graph G. A subset F of the edge set of G is said to be an f-matching if every vertex v in V is adjacent to at most f(v) edges in F. We present the first NC algorithm for constructing a maximal f-matching. For a graph on n nodes and m edges the algorithm runs in time O(log^4 n) and uses O(n + m) EREW PRAM processors. For graphs of constantly bounded valence, we can construct a maximal f-matching in O(log* n) time on an EREW PRAM with O(n) processors.
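To make the object concrete, here is a trivial sequential greedy construction of a maximal k-dependent set; it is obviously not the EREW PRAM algorithm of the paper (the sequential scan is exactly what an NC algorithm must avoid), just a small reference for the definition.

def maximal_k_dependent_set(adj, k):
    # Greedy sequential construction of a maximal k-dependent set.
    # adj maps each vertex to the set of its neighbours; in_q_degree[v]
    # counts the neighbours of v that are currently in Q.
    Q = set()
    in_q_degree = {v: 0 for v in adj}
    for v in adj:
        # v may join if it has at most k neighbours in Q and no neighbour
        # already in Q would end up with more than k neighbours inside Q.
        if in_q_degree[v] <= k and all(in_q_degree[u] < k
                                       for u in adj[v] if u in Q):
            Q.add(v)
            for u in adj[v]:
                in_q_degree[u] += 1
    return Q

# Path on five vertices, k = 1: the greedy scan returns {0, 1, 3, 4}.
path = {0: {1}, 1: {0, 2}, 2: {1, 3}, 3: {2, 4}, 4: {3}}
print(maximal_k_dependent_set(path, 1))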
ISBN (print): 0818680679
Vehicle routing problems involve the navigation of one or more vehicles through a network of locations. Locations have associated handling times as well as time windows during which they are active. The arcs connecting locations have time costs associated with them. In this paper, we consider two different problems in single vehicle routing. The first is to find least time cost routes between all pairs of nodes in a network for navigating vehicles; we call this the all pairs routing problem. We show that there is an O(log^2 n) time parallel algorithm using a polynomial number of processors for this problem on a CREW PRAM. We next consider the problem in which a vehicle services all locations in a network. Here, locations can be passed through at any time but only serviced during their time window. The general problem is NP-complete under even fairly stringent restrictions, but polynomial algorithms have been developed for some special cases. In particular, when the network is a line, there is no time cost in servicing a location, and all time windows are unbounded at either their lower or upper end, O(n^2) algorithms have been developed. We show that under the same conditions, we can reduce this problem to the all pairs routing problem and therefore obtain an O(log^2 n) time parallel algorithm on a CREW PRAM.
The objective of this research is to develop parallel algorithms for enabling fast and scalable analysis of large-scale high-throughput sequencing datasets. The genome of an organism consists of one or more long DNA sequences called chromosomes, each a sequence of bases. Depending on the organism, the length of the genome can vary from several thousand bases to several billion bases. Genome sequencing, which involves deciphering the sequence of bases of the genome, is an important tool in genomics research. Sequencing instruments widely deployed today can only read short DNA sequences. However, these instruments can read up to several billion such sequences at a time, and are used to sequence a large number of randomly generated short fragments from the genome. These fragments are a few hundred bases long and are commonly referred to as "reads". This work specifically tackles three problems associated with high-throughput sequencing short read datasets: (1) parallel read error correction for large-scale genomics datasets, (2) partitioning of large-scale high-throughput sequencing datasets, and (3) parallel compression of large-scale genomics datasets.
Multi-scale modeling, the simulation of coupled physical processes that occur on different temporal or spatial scales, is becoming an increasingly important area of research in computational science. Such problems can be computationally intensive; however, because of the increasing availability of large computational resources, their solution is becoming feasible. An important multi-scale application is the numerical simulation of combustion. In combustion, three different physical processes govern the dynamics of the problem: fluid flow (which can be turbulent or laminar), chemical reactions, and heat transfer, with radiative heat transfer being a dominant mode. The objective of this thesis is to create efficient sequential and parallel algorithms and software that improve the accuracy and performance of combustion simulations. A further purpose of this work is to create clean, easy-to-use software interfaces that can be readily used from both C/C++ and FORTRAN applications without significant changes to the original code. In this thesis, we introduce two new software systems that enable modeling of multi-scale phenomena in combustion applications on single processor and distributed memory multiprocessor systems and improve their accuracy and performance. The first system is called the Database On-Line for Efficient Function Approximation (DOLFA), for speeding up chemistry calculations in combustion applications. A second system, called Photon Monte Carlo (PMC), is used for solving the Radiative Transfer Equation (RTE) by calculating the radiative heat fluxes for the volume elements of a computational domain. The PMC software system is capable of handling computational domains with complex enclosures and various radiation configurations.
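As a toy illustration of the photon Monte Carlo idea only (not the PMC system's implementation; the 1-D slab geometry, the gray absorption coefficient kappa, and the function name are assumptions made purely for this sketch), the snippet below traces photon bundles through a purely absorbing slab and tallies where their energy is deposited.

import math
import random

def photon_mc_1d_slab(n_cells, kappa, bundles_per_cell, cell_emission,
                      length=1.0, rng=random.Random(0)):
    # Toy photon Monte Carlo for a gray, purely absorbing 1-D slab of the
    # given length, divided into n_cells cells.  Each cell emits
    # bundles_per_cell photon bundles that share its emissive power equally;
    # a bundle travels a distance sampled from the exponential free-path
    # distribution (absorption coefficient kappa) and deposits its energy in
    # the cell where it is absorbed, or escapes through a slab boundary.
    dx = length / n_cells
    absorbed = [0.0] * n_cells
    for i in range(n_cells):
        energy = cell_emission[i] / bundles_per_cell
        for _ in range(bundles_per_cell):
            x = (i + rng.random()) * dx                     # emission point
            direction = 1.0 if rng.random() < 0.5 else -1.0
            path = -math.log(1.0 - rng.random()) / kappa    # sampled free path
            x_abs = x + direction * path
            if 0.0 <= x_abs < length:
                absorbed[int(x_abs / dx)] += energy
    return absorbed

print(photon_mc_1d_slab(n_cells=10, kappa=5.0, bundles_per_cell=10000,
                        cell_emission=[1.0] * 10))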