Traditional topology partitioning for parallel network simulation usually relies on graph partitioning: the network simulation task is abstracted as a weighted graph, and a tool such as METIS or Chaco then partitions the simulation tasks. These tools usually perform very well when the simulation environment is homogeneous. However, in a complex physical computing environment, for example one with a large number of computing nodes that differ in computing power and with wide gaps in communication capability, traditional topology partitioning tools struggle to achieve optimal efficiency, which wastes a great deal of computing power. We therefore introduce a Cluster Instructed Partitioning Algorithm (CIPA), suited to partitioning simulation tasks on heterogeneous platforms. In a complex network environment, the method organizes the computing nodes by clustering and distributes the simulated topology across them using an improved topology graph partitioning. It produces a balanced partitioning result and greatly improves the efficiency of the entire system.
ISBN: 9781479975815 (print)
A Multiprocessor System-on-Chip (MPSoC) based on a Network-on-Chip (NoC) integrates a large number of Processor Elements (PEs) to meet the performance requirements of several applications. These applications are composed of a set of intercommunicating tasks, which are dynamically mapped onto the PEs of the target architecture. Efficient task mapping, however, requires some preliminary steps, among them partitioning, which groups tasks according to their interaction before a mapping process is applied. This paper introduces Partition Reduce (PR), a task partitioning approach based on the MapReduce algorithm that targets homogeneous NoC-based MPSoCs. We analyze the efficiency of PR for energy consumption (EC) minimization and load balance (LB). The results of a set of experiments with a large number of tasks show that PR is more effective in processing time and result quality than classic Simulated Annealing (SA). In addition, PR produces partitions with low energy consumption and strict load balance.
Driven by our work on a large-scale distributed microscopic road traffic simulator, we present ENHANCE, a novel re-partitioning approach that allows incorporating fine-grained simulator-specific cost models into the partitioning process to account for the actual performance characteristics of the *** use of explicit cost models enables partitioning for heterogeneous resources, which are a common occurrence in cloud deployments. Importantly, ENHANCE can be used in conjunction with other partitioning approaches by further enhancing their partitions according to the provided cost models. We demonstrate the benefits of our approach in an experimental evaluation showing performance improvements of up to 29% over METIS under heterogeneous conditions. Viewed differently, the partitioning produced by ENHANCE can deliver performance similar to that of METIS while using up to 20% fewer resources.
Purpose: This research aims to advance the Isogeometric Scaled Boundary Finite Element Method (IG-SBFEM) by introducing a partitioning approach for solving elastic and viscoelastic problems with cyclic symmetry. The study seeks to mitigate the computational burden associated with eigenvalue problems by proving the block-circulant nature of the system matrices. Through partitioning, the solution scale is reduced, and the study further explores the integration of the Lagrange multiplier scheme and the temporally piecewise adaptive algorithm (TPAA) to handle complex displacement constraints and viscoelastic properties, ensuring efficient computation even in cyclically symmetric ***/methodology/approach: The methodology centers on the development of a partitioning algorithm integrated into the IG-SBFEM. By leveraging the block-circulant nature of matrices under cyclic symmetry, the study reduces the solution scale of both eigenvalue and system equations. Displacement constraints are addressed through a Lagrange multiplier scheme. The approach further applies the TPAA to convert viscoelastic problems into elastic problems, allowing efficient numerical analysis and computation for cyclically symmetric *** study finds that the partitioning IG-SBFEM efficiently addresses elastic and viscoelastic problems with cyclic symmetry, reducing both the solution scale and computational cost. The block-circulant property of the matrices enables the decomposition of complex equations into smaller sub-problems, improving performance. Additionally, the Lagrange multiplier scheme successfully handles displacement constraints. The TPAA further enhances efficiency by transforming viscoelastic problems into elastic equivalents. Numerical results confirm that this approach achieves accurate solutions with reduced computational ***
Authors: Wang, Chongshuai; Peng, Ruifei; He, Yiqian; Yang, Haitian; Han, Xu
Affiliations:
Hebei Univ Technol, Sch Elect Engn, State Key Lab Reliabil & Intelligence Elect Equipm, Tianjin 300401, Peoples R China
Chinese Acad Sci, Shenyang Inst Automat, Inst Robot & Intelligent Mfg, Key Lab Networked Control Syst, Shenyang, Peoples R China
Dalian Univ Technol, Int Res Ctr Computat Mech, Dept Engn Mech, State Key Lab Struct Anal Ind Equipment, Dalian 116024, Peoples R China
An efficient reduced-order algorithm is proposed for the elastic SBFE analysis of 2-D cyclically symmetric structures with or without a common node. The general stiffness matrices of the scaled boundary finite element (SBFE) method are proved to be block-circulant and can be constructed from the basic region instead of the whole computing domain. The cost of the eigenvalue analysis required to generate the stiffness matrix can thus be significantly reduced, and the solution scale can be reduced by partitioning the system equation into a series of small independent subproblems. Furthermore, the presented algorithm is combined with the Woodbury formula to reduce the computational cost of analyzing incomplete cyclically symmetric structures: the original system equation is transformed into equations with block-circulant coefficient matrices, which are solved efficiently by the partitioning algorithm. Four numerical examples are provided to demonstrate the effectiveness and advantages of the proposed approach.
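The decoupling that a circulant structure allows can be illustrated in its simplest form. The sketch below is not the paper's SBFE algorithm: it assumes a scalar circulant matrix (block size 1) and uses a naive discrete Fourier transform, so each Fourier mode becomes an independent 1x1 subproblem. With m x m blocks, each mode would instead be an independent m x m system. The function names `dft` and `solve_circulant` are hypothetical.

```python
import cmath


def dft(values, inverse=False):
    """Naive O(n^2) discrete Fourier transform, adequate for a small demo."""
    n = len(values)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        total = sum(
            values[j] * cmath.exp(sign * 2j * cmath.pi * j * k / n)
            for j in range(n)
        )
        out.append(total / n if inverse else total)
    return out


def solve_circulant(first_column, rhs):
    """Solve C x = rhs, where C[i][j] = first_column[(i - j) % n].

    Assumes C is nonsingular, i.e. no eigenvalue is zero.
    """
    eigenvalues = dft(first_column)  # eigenvalues of a circulant matrix
    rhs_hat = dft(rhs)
    # Each Fourier mode is an independent 1x1 subproblem.
    x_hat = [b / lam for b, lam in zip(rhs_hat, eigenvalues)]
    return [v.real for v in dft(x_hat, inverse=True)]
```

For real input data the imaginary parts of the result vanish up to rounding, which is why only the real parts are returned.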
Raft, a mainstream consensus mechanism, has received widespread attention. Partitioned consensus can reduce the number of nodes involved in a single consensus round and improve consensus efficiency. However, existing algorithms suffer from unreasonable partitioning and cannot tolerate Byzantine nodes. To address these problems, this paper proposes CB-Raft, a novel Raft consensus algorithm that combines comprehensive-evaluation partitioning with Byzantine fault tolerance. First, nodes are comprehensively evaluated in terms of consensus behavior and location, and are divided evenly according to the parity of their comprehensive ranking. Second, the leader is selected from the top-ranked nodes in the comprehensive evaluation, and the nodes communicate with each other using BLS signatures. Finally, a fast response mechanism based on cross-partition leader-follower communication is proposed to stop continued malicious behavior by the leader, and a pipeline mechanism based on changeable signature thresholds is proposed to resolve consensus blocking. The experimental results show that, compared with existing partitioning methods, the proposed partitioning scheme has significant advantages in consensus latency, throughput, and the probability of partition success. Compared with similar Raft algorithms, CB-Raft offers high consensus performance and good resistance to Byzantine nodes.
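The parity-based division described above can be sketched in a few lines. This is only an illustration under stated assumptions: the abstract does not give the scoring formula, so the scores are taken as given inputs, exactly two partitions are assumed, and ties are broken by node identifier. The function name `split_by_rank_parity` is hypothetical.

```python
def split_by_rank_parity(scores):
    """Split nodes into two partitions by the parity of their rank.

    scores: dict mapping node id -> comprehensive score (higher is better).
    Returns (odd_rank_nodes, even_rank_nodes), each ordered best first.
    """
    # Rank nodes from best to worst; ties are broken by node id (assumption).
    ranked = sorted(scores, key=lambda node: (-scores[node], node))
    odd_ranks = ranked[0::2]   # ranks 1, 3, 5, ...
    even_ranks = ranked[1::2]  # ranks 2, 4, 6, ...
    return odd_ranks, even_ranks
```

Because adjacent ranks alternate between the two groups, both partitions receive a similar mix of strong and weak nodes, and the first element of each list is that partition's best-ranked node.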
For embedded applications, scratchpad memories (SPMs) offer a good compromise among performance, energy consumption, and die area. The main challenge in SPM design is to optimally map memory locations to scratchpad locations. This paper describes an algorithm that solves this mapping problem by means of dynamic programming applied to a synthesizable hardware architecture. The algorithm maps segments of external memory to physically partitioned banks of an on-chip SPM; this architecture provides significant energy savings. The algorithm does not require any user-set bound on the number of partitions and takes partitioning overhead into account. Improving on previous solutions, its execution time is polynomial in the number of memory locations, even under the most general solving policy. This has the major practical advantage of allowing an arbitrary number of scratchpad segments, which was impossible with previous methods, whose running time is exponential in this number. Strategies to optimize the memory requirements and speed of the algorithm are exploited. Additionally, we integrate this algorithm into a complete and automated design, simulation, and synthesis flow.
Exhaustive testing of today's digital circuits is not possible, owing to the vast test sequences that would have to be applied. Breaking the circuit down into manageable subcircuits (partitioning) makes exhaustive testing practicable. Partitioning has previously been done by the circuit designer in a rather ad hoc manner. The paper describes an algorithm that can be used to find the partitioning points in a circuit. The algorithm is illustrated for circuits containing reconvergent and nonreconvergent fan-outs.
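A small worked count shows why partitioning makes exhaustive testing practicable. The sketch assumes purely combinational blocks, for which exhaustively testing a block with n inputs takes 2^n patterns. The 32-input circuit and its split into two 16-input subcircuits are hypothetical sizes chosen for illustration, not figures from the paper.

```python
def exhaustive_patterns(input_counts):
    """Total patterns needed to exhaustively test each combinational block.

    input_counts: iterable with the number of inputs of each block.
    """
    return sum(2 ** n for n in input_counts)


whole_circuit = exhaustive_patterns([32])    # one 32-input block
partitioned = exhaustive_patterns([16, 16])  # two 16-input subcircuits
print(whole_circuit, partitioned)            # 4294967296 131072
```

Under these assumptions the pattern count drops from about 4.3 billion to 131,072, because the cost grows exponentially with block width but only linearly with the number of blocks.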
A novel efficient algorithm is proposed for the problem of equally partitioning a set whose elements have predefined weights. The algorithm is based on calculating a linear group that preserves an invariant: the set of zeros of a cubic form. Algorithms for related problems are also discussed, including the search for a second solution when a first solution is known.
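To make the problem statement concrete, the sketch below solves equal partitioning with the classic pseudo-polynomial subset-sum dynamic program. This is a swapped-in textbook technique, not the group-theoretic algorithm the abstract proposes, and it assumes non-negative integer weights. The function name `equal_partition` is hypothetical.

```python
def equal_partition(weights):
    """Return indices of one half of an equal-weight split, or None.

    Assumes non-negative integer weights. Classic subset-sum dynamic
    programming over reachable sums, not the paper's algebraic method.
    """
    total = sum(weights)
    if total % 2:
        return None  # an odd total cannot be split into two equal halves
    target = total // 2
    # reachable[s] holds one list of indices whose weights sum to s.
    reachable = {0: []}
    for i, w in enumerate(weights):
        # Iterate over a snapshot so each element is used at most once.
        for s, indices in list(reachable.items()):
            t = s + w
            if t <= target and t not in reachable:
                reachable[t] = indices + [i]
    return reachable.get(target)
```

The elements not listed in the returned index set form the other half, which by construction has the same total weight.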
Integrating more functionality in a smaller form factor with higher performance and lower power consumption is pushing semiconductor technology scaling to its limits. Three-dimensional (3-D) chip stacking is touted as the silver bullet technology that can keep Moore's momentum and fuel the next wave of consumer electronics products. This letter introduces a TSV-aware partitioning algorithm that enables higher performance when implementing applications on 3-D field-programmable gate arrays (FPGAs). Unlike other algorithms that minimize the number of connections among layers, our solution makes more efficient use of the available (fabricated) interlayer connectivity. Experimental results show average reductions in delay and power consumption of about 28% and 26%, respectively, compared to similar 3-D computer-aided design (CAD) tools.