The authors present cost-optimal parallel algorithms for depth-order (e.g., pre-, in-, and post-order) and level-order (e.g., breadth-first and breadth-depth) traversals of general trees with n nodes. Each of the algorithms requires O(n/p+log n) time using p >
The article presents a cost-optimal parallel algorithm for the parentheses matching problem on the EREW PRAM model. For n parentheses, the algorithm requires O(n/p+log n) time and O(n+p log p) space, employing p processors. Thus, for p >
This paper presents the empirical performance of parallel algorithms for computing a spanning tree (SPT) and a minimum spanning tree (MST) of connected graphs on Transputer and Unix systems, where the processors are configured as a one-dimensional array. The parallel MST algorithm uses a weight matrix data structure, and three implementations of the SPT algorithm are presented, using an unordered edge list, a linked adjacency list, and an adjacency matrix as data structures. The experiments are conducted on a wide range of random graphs, generated with various edge densities (d) for a given number (n) of vertices. The edge density is varied between 0.1 and 0.9, and the maximum numbers of vertices (or edges) considered are 300 (or 40000) for the Transputer system and 500 (or 110000) for the Unix system. A maximum speed-up of 2.98 is achieved on the Transputer network of eight processors, and a maximum speed-up of 3.0 is achieved on the Unix system with four processors.
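As a rough reading of the speed-up figures above, parallel efficiency (speed-up divided by the number of processors) can be computed as follows. This is only an illustrative sketch; the function name `parallel_efficiency` is hypothetical, and the efficiency values are derived here rather than reported in the abstract.

```python
def parallel_efficiency(speedup: float, processors: int) -> float:
    """Return speed-up divided by processor count (1.0 would be ideal)."""
    if processors <= 0:
        raise ValueError("processors must be positive")
    return speedup / processors


# Speed-ups quoted in the abstract; the efficiencies are derived, not reported.
transputer = parallel_efficiency(2.98, 8)  # eight Transputer processors
unix = parallel_efficiency(3.0, 4)         # four Unix processors

print(f"Transputer efficiency: {transputer:.4f}")
print(f"Unix efficiency:       {unix:.4f}")
```

By this measure the four-processor Unix configuration uses its processors about twice as efficiently as the eight-processor Transputer network, even though the raw speed-ups are nearly equal.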
The parentheses matching problem is to determine the mate of each parenthesis in a balanced string of n parentheses. In this paper, we present three novel and elegant parallel algorithms for this problem on parallel r...
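To make the problem statement concrete, here is a minimal sequential sketch that finds the mate of each parenthesis with a stack. It is not one of the parallel algorithms the abstract refers to; the function name `match_parentheses` and the choice of `(` and `)` as the only symbols are assumptions made for illustration.

```python
def match_parentheses(s: str) -> list[int]:
    """Return a list where mate[i] is the index of the parenthesis matching s[i].

    Sequential stack-based baseline, O(n) time; assumes s contains only '(' and ')'.
    """
    mate = [-1] * len(s)
    stack: list[int] = []  # indices of currently unmatched '('
    for i, ch in enumerate(s):
        if ch == "(":
            stack.append(i)
        elif ch == ")":
            if not stack:
                raise ValueError(f"unmatched ')' at index {i}")
            j = stack.pop()
            mate[i] = j
            mate[j] = i
        else:
            raise ValueError(f"unexpected character {ch!r} at index {i}")
    if stack:
        raise ValueError(f"unmatched '(' at index {stack[-1]}")
    return mate


print(match_parentheses("(()())"))  # [5, 2, 1, 4, 3, 0]
```

A parallel algorithm has to produce the same `mate` array without the inherently sequential stack, which is what makes the problem interesting on parallel models.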
Simulated annealing based standard cell placement for VLSI designs has long been acknowledged as a compute-intensive process. All previous work in parallel simulated annealing based placement has minimized area, but with deep submicron design, minimizing wirelength delay is also needed. The algorithm discussed in this paper is the first parallel algorithm for timing-driven placement. We have used a very accurate Elmore delay model, which is more compute-intensive, and hence the need for parallel placement is more apparent. Parallel placement is also needed for very large circuits that may not fit in the memory of a single processor. Therefore, our algorithm is circuit partitioned and can handle arbitrarily large circuits on distributed memory multiprocessors. The algorithm, called mpi PLACE, has been tested on several large benchmarks on a variety of parallel architectures.
The goal of knowledge graph completion (KGC) is to predict missing facts among entities. Previous methods for KGC re-ranking are mostly built on non-generative language models to obtain the probability of each candida...
Finitely inductive (FI) sequences are a class of sequences, finite or infinite, which are amenable to a certain mathematical representation which has direct significance to pattern recognition and string matching. Pat...
In this paper, we propose three different parallel algorithms based on a state-of-the-art global router called TimberWolfSC. The parallel algorithms have been implemented using the Message Passing Interface (MPI) and have been evaluated on a wide range of parallel platforms such as the Sun Sparccenter 1000 and the Intel Paragon. Our experimental results show good speedups and solution quality for two of these parallel algorithms. We have been able to reduce the runtimes of some circuits from half an hour to 5 minutes and have obtained speedups of about 4.0 to 5.0 on 8 processors, with less than 2-3% degradation in the quality of the solutions.
Defect detection aims to locate the accurate position of defects in images, which is of great significance to quality inspection in industrial product manufacturing. Currently, many defect detection methods rely on deep neural networks to extract features. Although the accuracy of these methods is relatively high, they are computationally intensive, which makes them difficult to deploy on resource-limited edge devices. To solve these problems, a lightweight defect detection model for the industrial edge environment is proposed, termed the efficient defect detection network (EDDNet). EfficientNet-B0 is used as the feature extraction backbone, extracting feature maps from feature layers at different depths of the network and fusing multilevel features by multilevel feature fusion (MFF). To obtain more information, we redesign the attention mechanism in MBConv blocks, taking the encoding space (ES) attention mechanism as a new module, which solves the problem that the spatial information of the defective image is ignored. The experimental results on the NEU-DET, DAGM2007, and PCB defect datasets demonstrate the effectiveness of the proposed EDDNet and its potential for application on industrial edge devices.