In this paper, a parallel multi-path variant of the well-known TSAB algorithm for the job shop scheduling problem is proposed. A coarse-grained parallelization method is employed, which allows the algorithm to scale well in accordance with Gustafson's law. The resulting P-TSAB algorithm is tested on 162 well-known literature benchmarks. Results indicate that P-TSAB, with a running time of one minute on a modern PC, provides solutions comparable to those of the newest literature approaches to the job shop scheduling problem. Moreover, on average P-TSAB achieves a percentage relative deviation from the best known solutions that is two times smaller than that of the standard variant of TSAB. The use of parallelization also relieves the user from having to fine-tune the algorithm. The P-TSAB algorithm can thus be used as a module in real-life production planning systems or as a local search procedure in other algorithms. It can also provide an upper bound on the minimal cycle time for certain problems of cyclic scheduling.
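The scalability claim above rests on Gustafson's law, under which the scaled speedup on p processors is S(p) = p - alpha * (p - 1) for a serial fraction alpha. A minimal sketch follows; the alpha value is an illustrative assumption, not a measurement from the paper.

```python
# Gustafson's law: scaled speedup S(p) = p - alpha * (p - 1),
# where alpha is the serial fraction of the workload and p is the
# number of processors. alpha = 0.05 here is illustrative only.

def gustafson_speedup(p, alpha):
    """Scaled speedup on p processors with serial fraction alpha."""
    return p - alpha * (p - 1)

for p in (1, 4, 16, 64):
    print(p, gustafson_speedup(p, alpha=0.05))
```

Unlike Amdahl's law, the speedup here keeps growing with p because the problem size is assumed to grow with the processor count.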
With advantages such as non-destructiveness, high sensitivity and high accuracy, optical techniques have been successfully applied to the measurement of various important physical quantities in experimental mechanics (EM) and optical measurement (OM). However, in the pursuit of higher image resolutions for higher accuracy, the computational burden of optical techniques has become much heavier. Therefore, in recent years, heterogeneous platforms composed of hardware such as CPUs and GPUs have been widely employed to accelerate these techniques due to their cost-effectiveness, short development cycle, easy portability, and high scalability. In this paper, we analyze various works by first illustrating their different architectures, followed by introducing their various parallel patterns for high-speed computation. Next, we review the effects of CPU and GPU parallel computing specifically in EM & OM applications in a broad scope, including digital image/volume correlation, fringe pattern analysis, tomography, hyperspectral imaging, computer-generated holograms, and integral imaging. In our survey, we have found that high parallelism can always be exploited in such applications for the development of high-performance systems. (C) 2017 Elsevier Ltd. All rights reserved.
Rapid advancement in sensing, communication, and mobile technologies brings a new wave of the Industrial Internet of Things (IIoT). IIoT integrates a large number of sensors for smart and connected monitoring of machine conditions. Sensor observations contain rich information on the operational signatures of machines, thereby providing a great opportunity for machine condition monitoring and control. However, realizing the full potential of IIoT depends to a great extent on the development of new methodologies using big data analytics. This paper presents a new methodology for large-scale IIoT machine information processing, network modeling, condition monitoring, and fault diagnosis. First, we introduce a dynamic warping algorithm to characterize the dissimilarity of machine signatures (e.g., power profiles during operations). Second, we develop a stochastic network embedding algorithm to construct a large-scale network of IIoT machines, in which the dissimilarity between machine signatures is preserved in the network node-to-node distance. When the machine condition varies, the location of the corresponding network node changes accordingly. As such, node locations will reveal diagnostic information about machine conditions. However, the network embedding algorithm is computationally expensive in the presence of large numbers of IIoT-enabled machines. Therefore, we further develop a parallel computing scheme that harnesses the power of multiple processors for efficient network modeling of large-scale IIoT-enabled machines. Experimental results show that the developed algorithm efficiently and effectively characterizes the variations of signatures at both cycle-to-cycle and machine-to-machine scales. This new approach shows strong potential for optimal machine scheduling and maintenance in the context of large-scale IIoT. (C) 2018 The Society of Manufacturing Engineers. Published by Elsevier Ltd. All rights reserved.
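The dissimilarity step can be illustrated with classic dynamic time warping (DTW). This is a generic DTW sketch on made-up power profiles; the paper's dynamic warping algorithm may differ in its details.

```python
# Generic dynamic time warping (DTW) sketch for comparing two machine
# power profiles. The signals below are made-up examples.

def dtw_distance(a, b):
    """DTW dissimilarity between two 1-D signatures a and b."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j]: minimal cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # a[i-1] repeats
                                 cost[i][j - 1],      # b[j-1] repeats
                                 cost[i - 1][j - 1])  # step both
    return cost[n][m]

profile_a = [0.0, 1.0, 2.0, 1.0, 0.0]
profile_b = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0]  # same shape, time-shifted
print(dtw_distance(profile_a, profile_b))  # → 0.0
```

Because DTW aligns samples elastically, two profiles with the same shape but different timing get a distance of zero, which is the property that makes it suitable for cycle-to-cycle signature comparison.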
Concept-cognitive learning, as an interdisciplinary study of concept lattices and cognitive learning, has become a hot research direction among the communities of rough sets, formal concept analysis and granular computing in recent years. The main objective of concept-cognitive learning is to learn concepts from a given clue with the help of cognitive learning methods. Note that this kind of study can provide concept-lattice insights into cognitive learning. In order to deal with more complex data and improve learning efficiency, this paper investigates parallel computing techniques for concept-cognitive learning on large data and multi-source data, based on granular computing and information fusion. Specifically, for large data, a parallel computing framework is designed to extract global granular concepts by combining local granular concepts. For multi-source data, an effective information fusion strategy is adopted to obtain final concepts by integrating the concepts from all single-source data. Finally, we conduct numerical experiments to evaluate the effectiveness of the proposed parallel computing algorithms.
ISBN: (print) 9783030053666; 9783030053659
Convolutional neural network (CNN) is a deep feed-forward artificial neural network widely used in image recognition. However, this model suffers from overly long training times and insufficient memory. Traditional acceleration methods are mainly limited to optimizing a single algorithm. In this paper, we propose a method, namely CNN-S, based on Storm, that improves training efficiency and cost and is applicable to any algorithm. This model divides the data into several subsets and flexibly processes them on several machines in parallel. The experimental results show that, to achieve a recognition accuracy of 95%, the training time of the single serial model is around 913 s, while the CNN-S model needs only 248 s. The acceleration ratio reaches 3.681. This shows that the CNN-S parallel model outperforms the single serial model in training efficiency and system-resource cost.
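The split-and-aggregate pattern described above can be sketched as follows. A thread pool stands in for the cluster of Storm machines, and the "training" step is a stand-in computation rather than an actual CNN; all names here are illustrative assumptions.

```python
# Data-parallel sketch in the spirit of CNN-S: the training set is
# split into subsets processed by parallel workers, and the partial
# results are aggregated. A thread pool stands in for the cluster of
# machines; summing stands in for training on a subset.

from concurrent.futures import ThreadPoolExecutor

def train_on_subset(subset):
    """Stand-in for training on one data subset; returns a partial result."""
    return sum(subset)

def parallel_train(data, n_workers=4):
    # Round-robin split of the data into n_workers subsets.
    subsets = [data[i::n_workers] for i in range(n_workers)]
    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        partials = list(pool.map(train_on_subset, subsets))
    # Aggregate the partial results (standing in for parameter merging).
    return sum(partials)

print(parallel_train(list(range(100))))  # → 4950, same as the serial sum
```

The acceleration ratio reported in the abstract is simply serial time over parallel time: 913 / 248 ≈ 3.681.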
ISBN: (print) 9780996007894
Electromagnetic-thermal co-simulation of a large antenna array on a platform is performed using an in-house enhanced finite element solver with massively parallel computing capability. The solver integrates the domain decomposition method (DDM) so as to solve very large complex matrices efficiently during the finite element procedure. Its massively parallel computing capability is examined on a supercomputer, where 16x16 antenna arrays operating at 2.5 GHz, together with their platform, are simulated, and its accuracy is validated against numerical results obtained by commercial software, albeit only at small scale. The in-house solver reaches a speedup of 7.149 and a strong-scalability efficiency of 64.2% on 512 CPU cores across 32 computing nodes. In particular, not only the radiation characteristics but also the surface temperature distribution of large antenna arrays are addressed for the first time in the high-power operating state, which directly supports their reliability design.
ISBN: (print) 9781728143286
In online dynamic graph drawing, constraints over nodes and node pairs are added to help preserve a coherent mental map across a sequence of graphs. However, defining constraints is challenging due to the requirements of both preserving the mental map and satisfying the visual aesthetics of a graph layout. These requirements underline the importance of properly evaluating node stability based on graph changes. Most existing algorithms depend only on local changes without evaluating their global propagation, as computing the global propagation recursively is expensive. To evaluate changes globally while keeping the time cost low, we introduce a heuristic model on which parallel computing can be implemented to reduce the time cost. This model uses inverse analysis of a Markov process to simulate node movement and thus gives a global analysis of the layout's change, according to which different constraints can be set. Experiments demonstrate that our method preserves both structure and position similarity to help users track graph changes visually, while incurring a time cost similar to that of local evaluation algorithms.
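The abstract does not specify the model, but the general idea of turning local changes into a global evaluation can be sketched as spreading per-node change scores along graph edges, Markov-chain style. The graph, damping factor, and iteration count below are illustrative assumptions, not the paper's exact heuristic.

```python
# Sketch of evaluating a local layout change globally: per-node
# change scores are spread along edges with a damping factor, so
# nodes near a changed node receive higher global-change scores.
# All parameters here are illustrative assumptions.

def propagate_change(adj, local_change, damping=0.5, iters=20):
    """Spread per-node change scores over adjacency list adj."""
    n = len(adj)
    score = list(local_change)
    for _ in range(iters):
        pushed = [0.0] * n
        for u in range(n):
            share = score[u] / (len(adj[u]) or 1)
            for v in adj[u]:
                pushed[v] += share  # u spreads its score to its neighbors
        # Blend the propagated mass with the original local change.
        score = [(1 - damping) * local_change[u] + damping * pushed[u]
                 for u in range(n)]
    return score

# Path graph 0-1-2-3 in which only node 0 changed locally.
adj = [[1], [0, 2], [1, 3], [2]]
print(propagate_change(adj, [1.0, 0.0, 0.0, 0.0]))
```

Because each node's update depends only on the previous iteration's scores, the inner loop over nodes is embarrassingly parallel, which matches the parallel-computing motivation above.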
ISBN: (print) 9781728101057
Amazon chess is very complex, which makes it well suited to the study of machine games. For an Amazon chess human-machine game system, we incorporate transposition tables and history heuristics into the traditional PVS algorithm, optimizing it and improving its pruning efficiency. We then parallelize the resulting game algorithm using the OpenMP standard. Experiments show that parallel computing greatly increases CPU utilization and effectively improves the efficiency of the PVS algorithm, enabling it to search deeper layers in the same time and thus yielding a substantial increase in playing strength.
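As a rough illustration of the serial baseline, here is a minimal PVS (principal variation search) with a toy transposition table, run on a hand-built game tree. This is a generic sketch only: the Amazon chess engine, the history heuristic, and the OpenMP parallelization are not shown, and a production transposition table would also store bound types (exact/lower/upper), which this toy omits.

```python
# Minimal negamax PVS with a toy transposition table on a tiny tree.
# The tree, keys, and depths are illustrative; a real engine hashes
# board positions and records whether a stored score is exact or a bound.

TT = {}  # transposition table: (node id, side to move) -> (depth, score)

def pvs(node, depth, alpha, beta, color):
    """node is (id, children) internally, (id, static value) at leaves."""
    node_id, payload = node
    if not isinstance(payload, list):  # leaf: static evaluation
        return color * payload
    key = (node_id, color)
    if key in TT and TT[key][0] >= depth:  # cached deep-enough result
        return TT[key][1]
    best = float("-inf")
    for i, child in enumerate(payload):
        if i == 0:
            # Full window for the first (principal) move.
            score = -pvs(child, depth - 1, -beta, -alpha, -color)
        else:
            # Null-window probe; re-search on a fail-high inside the window.
            score = -pvs(child, depth - 1, -alpha - 1, -alpha, -color)
            if alpha < score < beta:
                score = -pvs(child, depth - 1, -beta, -score, -color)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:
            break  # beta cutoff
    TT[key] = (depth, best)
    return best

tree = ("root", [("a", [("a1", 3), ("a2", 5)]),
                 ("b", [("b1", 6), ("b2", 2)])])
print(pvs(tree, 2, float("-inf"), float("inf"), 1))  # → 3
```

The null-window probes are what PVS adds over plain alpha-beta: siblings of the principal move are first tested cheaply and only re-searched when they might actually improve the score.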
ISBN: (print) 9781450367004
In this work, we introduce a hybrid WiNoC, which judiciously uses wired and wireless interconnects for broadcasting/multicasting of packets. A code division multiple access (CDMA) method is used to support multiple broadcast operations originating from multiple applications executed on the multiprocessor platform. The CDMA-based WiNoC is compared in terms of network latency and power consumption with a wired-broadcast/multicast NoC.
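The CDMA idea behind supporting multiple simultaneous broadcasts can be sketched with orthogonal Walsh codes: each sender spreads its bit with its own code, the signals superpose on the shared medium, and each stream is recovered by correlation. The codes and bits below are illustrative and unrelated to the paper's WiNoC design.

```python
# Toy CDMA sketch: two senders share one medium via orthogonal
# Walsh codes of length 4; decoding correlates the superposed
# signal with each sender's code.

CODE_A = [1, 1, 1, 1]
CODE_B = [1, -1, 1, -1]  # orthogonal to CODE_A

def spread(bit, code):
    """Map bit {0, 1} to {-1, +1} and spread it over the code chips."""
    s = 1 if bit else -1
    return [s * c for c in code]

def despread(signal, code):
    """Recover a sender's bit by correlating with its code."""
    corr = sum(x * c for x, c in zip(signal, code))
    return corr > 0

# Sender A transmits 1, sender B transmits 0; chips add on the medium.
channel = [a + b for a, b in zip(spread(1, CODE_A), spread(0, CODE_B))]
print(despread(channel, CODE_A), despread(channel, CODE_B))  # → True False
```

Orthogonality is what lets both broadcasts occupy the medium at once: correlating with one code cancels the other sender's contribution exactly.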
ISBN: (print) 9789897583599
The aim of Elliptic Curve Cryptosystems (ECC) is to achieve the same security level as RSA but with a shorter key size. The basic operation in ECC is scalar multiplication, which is expensive. In this paper, we focus on optimizing ECC scalar multiplication based on the Non-Adjacent Form (NAF). A new algorithm is introduced that combines an add-subtract scalar multiplication algorithm with the NAF representation to accelerate ECC calculation. Parallelizing the new algorithm yields an efficient method to calculate ECC. The proposed method speeds up the calculation by up to 60% compared with the standard method.
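The NAF recoding at the heart of add-subtract scalar multiplication can be sketched as follows. This is the standard textbook recoding, not necessarily the paper's exact combined algorithm: digits lie in {-1, 0, 1} with no two adjacent non-zero digits, so a double-and-add-or-subtract loop performs fewer point additions on average.

```python
# Non-adjacent form (NAF) recoding of a scalar. Digits are in
# {-1, 0, 1} and no two adjacent digits are non-zero, which reduces
# the number of point additions/subtractions in ECC scalar
# multiplication. Standard textbook recoding, shown on plain integers.

def naf(k):
    """Return the NAF digits of k >= 0, least significant first."""
    digits = []
    while k > 0:
        if k % 2:
            d = 2 - (k % 4)  # d in {-1, 1}, chosen so (k - d) % 4 == 0
            k -= d
        else:
            d = 0
        digits.append(d)
        k //= 2
    return digits

print(naf(7))  # → [-1, 0, 0, 1], i.e. 7 = -1 + 8
```

On elliptic curve points the -1 digits become point subtractions, which cost the same as additions, so replacing runs of 1-bits with a single -1 is a net win.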