Universal combinatorics coding is a new kind of coding method and whole ordinal plays an important role in universal combinatorics coding. Whole ordinal GPU parallel algorithm is proposed in this paper. Whole ordinal ...
详细信息
ISBN:
(纸本)9781479913909
Universal combinatorics coding is a new kind of coding method and whole ordinal plays an important role in universal combinatorics coding. Whole ordinal GPU parallel algorithm is proposed in this paper. Whole ordinal computing is implemented by combining the advantage of GPU in aspect of complicated and vast parallelcomputing with CPU technology, and employing CPU+GPU heterogeneous program development model. It makes whole ordinal parallelcomputing implement by calling kernel function of large number multiplication calculation in original serial program of computing whole ordinal. In the process of research, a series of GPU parallel problem are solved by optimizing program code many times. Experiments show that GPU parallel method improves the speed of calculating whole ordinal greatly, and then improves the computation efficiency of relevant experiments. Whole ordinal parallel implementation also has important meaning to estimating relevant experimental data and pushes the pace of universal combinatorics coding into practical at the same time.
The parallelcomputing method of max ordinal number in universal combinatorics coding based on GPU is advanced in this paper. The core of universal combinatorics coding computing is ordinal number computing while the ...
详细信息
ISBN:
(纸本)9781479913909
The parallelcomputing method of max ordinal number in universal combinatorics coding based on GPU is advanced in this paper. The core of universal combinatorics coding computing is ordinal number computing while the calculation of ordinal number depends on the value of max ordinal number. The calculation of max ordinal number is divided into two parts, multiplication and division of large numbers. This paper focus on the GPU parallelcomputing of the multiplication in max ordinal number calculation, the calculation speed of the multiplication part is increased substantially, and thus improves the efficiency of max ordinal number calculation.
The Graph Isomorphism (GI) problem has been extensively studied due to its significant applications. The most effective class of GI algorithms, i.e., canonical labeling algorithms, are suitable for either graphs with ...
详细信息
ISBN:
(纸本)9780769550886
The Graph Isomorphism (GI) problem has been extensively studied due to its significant applications. The most effective class of GI algorithms, i.e., canonical labeling algorithms, are suitable for either graphs with high randomness or symmetry, or graphs for which both of them are not strongly held. Also, the core operations of canonical labeling algorithms, i.e., individuation-refinement (IR) and certificate comparison, usually occupy more than 70% of the total running time. How to weaken the limitations of structures and improve the efficiency of these operations are challenges. In this paper, we present an efficient GI algorithm called PEACE, which is particularly suitable for graphs with high randomness or symmetry. We present a parallel implementation of our algorithm on GPUs. We design some new techniques and also use some existing methods to speed up calculations under CUDA. More importantly, these techniques can be applied to all IR-based GI algorithms. We evaluate the proposed algorithm on various graphs to make comprehensive comparison with currently the most efficient canonical labeling algorithms on CPUs. Experimental results show that PEACE is superior to other algorithms on graphs with high symmetry or many automorphisms, and up to 50% performance increase can be achieved in the best case. We also apply our parallel techniques on these algorithms, and compare the performance on CPU and multiple GPU devices. The results show that the techniques make all algorithms gain 15 similar to 55 speedup.
In multi-sink Wireless Sensor Networks (WSNs), different sinks may issue same queries and therefore they can share some common query results. As a classical data-sharing method, data caching is employed widely in trad...
详细信息
In multi-sink Wireless Sensor Networks (WSNs), different sinks may issue same queries and therefore they can share some common query results. As a classical data-sharing method, data caching is employed widely in traditional databases and it can also bring benefits in WSNs thanks to the emergence of sensors with Flash Memory. Therefore, a query-processing method based on common subtree (CS) caching in multi-sink WSNs is proposed. We first construct CSs and then cache the results of processed queries at the roots of CSs. Subsequently, later common queries can obtain their expected results from the common roots. Furthermore, in order to achieve a higher degree of data sharing, we propose a loop-removing algorithm. Extensive simulations indicate that the proposed method can reduce the energy consumption and the query response time significantly for WSNs.
Approximate string matching using the.. mismatch technique has been widely applied to many fields such as virus detection and computational biology. The traditional parallel algorithms are all based on multiple proces...
详细信息
ISBN:
(纸本)9780769546766
Approximate string matching using the.. mismatch technique has been widely applied to many fields such as virus detection and computational biology. The traditional parallel algorithms are all based on multiple processors, which have high costs of computing and communication. GPU has high parallel processing capability, low cost of computing, and less time of communication. To the best of our knowledge, there is no any parallel algorithm for approximate string matching with.. mismatches on GPU. With a new parallel programming model based on CUDA, we present three parallel algorithms and their implementations on GPU, namely, the thread parallel algorithm, the block-thread parallel algorithm, and the OPT-block-thread parallel algorithm. The OPT-block-thread parallel algorithm can take full advantage of the powerful parallel capability of GPU. Furthermore, it balances the load among the threads and optimizes the execution time with the memory model of GPU. Experimental results show that compared with the traditional sequential algorithm on CPU, our best parallel algorithm on GPU in this paper achieves speedup of 40-80.
In this paper, we study the problem of join query in MANET and propose a new type of join query called Single-join Query Based on Caching (SQBC). This problem is proved to be a NP-hard problem and we present two polyn...
详细信息
ISBN:
(纸本)9783642318696
In this paper, we study the problem of join query in MANET and propose a new type of join query called Single-join Query Based on Caching (SQBC). This problem is proved to be a NP-hard problem and we present two polynomial approximate greedy algorithms (SOCA and MOCA) to solve the single-join query and multi-join query respectively. Both of these algorithms can ensure the low energy consumption. Moreover, MOCA can reduce the response time of query processing and minimize the energy consumption by considering the overhead of join query. Theoretical analysis and experimental results show that the algorithms proposed in this paper can effectively reduce the energy consumption, prolong the survival period of the network and enhance query efficiency.
To handle triple hidden terminal problems, this paper proposes RCO, an asynchronous multi-channel MAC protocol with random cooperation for sensor networks. By adopting a probability-based random cooperation, RCO effec...
详细信息
ISBN:
(纸本)9783642163548
To handle triple hidden terminal problems, this paper proposes RCO, an asynchronous multi-channel MAC protocol with random cooperation for sensor networks. By adopting a probability-based random cooperation, RCO effectively alleviates, if not eliminates, triple hidden terminal problems. More importantly, RCO is fully distributed with no requirements of time synchronization or multi-radio. Therefore, it is very easy to implement RCO in resource-constrained wireless sensor nodes. The simulation and real testbed experimental results show that RCO achieves significant improvement in energy efficiency with increasing benefit when the number of channels and traffic loads increase, while maintaining higher throughput.
This paper proposes ARM, an receiver-initiated MAC protocol with duty cycling to tackle control channel saturation, triple hidden terminal and low broadcast reliability problems in asynchronous multi-channel WSNs. By ...
详细信息
ISBN:
(纸本)9781424493289
This paper proposes ARM, an receiver-initiated MAC protocol with duty cycling to tackle control channel saturation, triple hidden terminal and low broadcast reliability problems in asynchronous multi-channel WSNs. By adopting a receiver-initiated transmission scheme and probability-based random channel selection, ARM effectively solves control channel saturation and triple hidden terminal problems. Further, ARM employs a receiver-adjusted broadcast scheme to guarantee broadcast reliability for broadcast-intensive applications. Via the theoretical analysis, two factors that assist ARM to handle these problems are derived. The simulation and real testbed experimental results show that via solving these three problems ARM achieves significant improvement in energy efficiency and throughput. Moreover, ARM exhibits a prominent ability to enhance its broadcast reliability.
暂无评论