ISBN: 9781509041527 (print)
This paper presents a study of exact string matching algorithms and their performance when executed on the dynamic-parallelism-enabled Kepler graphics processing unit (GPU) by Nvidia. The algorithms considered are Quick Search (QS), Horspool (HP), and Brute Force (BF) string matching. Their efficient implementation on Kepler gives a remarkable improvement over their respective multi-core CPU and generic GPU implementations. In addition, the proposed work further optimizes the algorithms by exploiting fundamental architectural aspects of the GPU, such as the memory hierarchy and lightweight threads, and by avoiding strided access. The optimization associated with the memory is called Binning, and the one using memory alignment is named Chunking. Because of the significant boost obtained by extracting the most out of these features, the new methods are named SWIFT. SWIFT algorithms give a performance benefit of about 1.7X on average and up to 5X in some cases, which are discussed in the paper. The paper also proposes a hybrid algorithm that employs both the regular GPU implementation and SWIFT based on a predefined condition and gives a benefit for all pattern sizes. Experimental results for different pattern sizes using benchmark datasets are presented in the paper.
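For readers unfamiliar with the three algorithms named in this abstract, the following is a minimal sequential sketch in Python. It is not the paper's GPU implementation and does not model Binning, Chunking, or SWIFT; the function names are illustrative assumptions. It only shows how the Horspool and Quick Search shift tables differ from a brute-force scan.

```python
def brute_force(text: str, pattern: str) -> list[int]:
    """Check every alignment of pattern against text."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    return [i for i in range(n - m + 1) if text[i:i + m] == pattern]


def horspool_search(text: str, pattern: str) -> list[int]:
    """Horspool: shift by the text character under the pattern's last position."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Distance from the last occurrence of each character (final position
    # excluded) to the end of the pattern; later indices overwrite earlier ones.
    shift = {ch: m - 1 - i for i, ch in enumerate(pattern[:-1])}
    matches, pos = [], 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        pos += shift.get(text[pos + m - 1], m)
    return matches


def quick_search(text: str, pattern: str) -> list[int]:
    """Quick Search: shift by the text character just past the current window."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    # Distance from the last occurrence of each character to one past the end.
    shift = {ch: m - i for i, ch in enumerate(pattern)}
    matches, pos = [], 0
    while pos <= n - m:
        if text[pos:pos + m] == pattern:
            matches.append(pos)
        if pos + m >= n:  # no character past the window, so stop
            break
        pos += shift.get(text[pos + m], m + 1)
    return matches


if __name__ == "__main__":
    sample = "GCATCGCAGAGAGTATACAGTACG"
    for search in (brute_force, horspool_search, quick_search):
        print(search.__name__, search(sample, "GCAGAGAG"))
```

All three functions return the same match positions; the heuristics only change how many alignments are skipped, which is why longer patterns tend to favor Horspool and Quick Search over brute force.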
In distributed storage networks (DSN), server failures occur frequently. Although network coding can provide high reliability of storage in DSN, to maintain reliability, a new server, named the new comer, should be fi...
Dangling pointer errors are pervasive in C/C++ programs and are very hard to detect. This paper introduces an efficient detector for dangling pointer errors in C/C++ programs. By selectively leave some memory acc...
ISBN: 9781467391177 (print)
The proceedings contain 40 papers. The topics discussed include: efficient heuristic for placing monitors on flow networks; efficient implementation of genetic algorithms on GP-GPU with scheduled persistent CUDA threads; exploiting pure superword level parallelism for array indirections; modeling binary oriented software buffer overflow vulnerability in process algebra; OpenISMA: an approach of achieving a scalable OpenFlow network by identifiers separating and mapping; an efficient tolerant-anisotropic localization for large-scale wireless sensor network; distributed processing of approximate range queries in wireless sensor networks; vector localization algorithm based on signal strength in wireless sensor network; and parallel and improvement of pre-computation technique for approximation shortest distance query.
ISBN: 9781509061082 (print)
Given the large amount of data from different sources that has become available to researchers in multiple fields, Data Science has emerged as a new paradigm for exploring and getting value from that data. In that context, new parallel processing environments with abstract programming interfaces, like Spark, were proposed to simplify the development of distributed programs. Although such solutions have become widely used, achieving the best performance with them is still not always straightforward, despite the multiple run-time strategies they use. In this work we analyze some of the causes of performance degradation in such systems and, based on that analysis, we propose a tool to improve performance by dynamically adjusting data partitioning and parallelism degree in recurrent applications based on previous executions. Our results applying that methodology show consistent reductions in execution time for the applications considered, with gains of up to 50%.
ISBN: 9781509051465 (print)
This paper presents a new design of a negative resistance oscillator with low phase noise using microwave microstrip technology. The microstrip oscillator is designed at a frequency of 2.4 GHz for applications in mobile communication, wireless networks, wireless fidelity, and Bluetooth. The final circuit is designed using a microstrip distributed resonator, which can be modeled as a parallel RLC circuit at the resonant frequency. The high Q factor of the resonator provides high carrier output power and low phase noise. The output power obtained is around 7.902 dBm at 2.4 GHz with a 2 V DC supply and 12 mA current consumption. In the proposed oscillator, each step has been analyzed using Advanced Design System (ADS), following a theoretical study that allows the performance of the whole circuit to be optimized.
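The figures quoted in this abstract can be put into perspective with a short calculation. The sketch below, in Python, converts the reported 7.902 dBm to milliwatts, compares it with the DC input implied by 2 V and 12 mA, and evaluates the textbook parallel-RLC relations f0 = 1/(2π√(LC)) and Q = R√(C/L). The DC-to-RF efficiency is derived here and is not a figure stated in the abstract; the inductance and resistance values are hypothetical placeholders, not component values from the paper.

```python
import math


def dbm_to_mw(p_dbm: float) -> float:
    """Convert power in dBm to milliwatts."""
    return 10 ** (p_dbm / 10)


def resonant_frequency_hz(l_h: float, c_f: float) -> float:
    """Resonant frequency of an ideal parallel RLC circuit."""
    return 1 / (2 * math.pi * math.sqrt(l_h * c_f))


def capacitance_for(f0_hz: float, l_h: float) -> float:
    """Capacitance that resonates with inductance l_h at f0_hz."""
    return 1 / ((2 * math.pi * f0_hz) ** 2 * l_h)


def parallel_rlc_q(r_ohm: float, l_h: float, c_f: float) -> float:
    """Quality factor of an ideal parallel RLC circuit."""
    return r_ohm * math.sqrt(c_f / l_h)


# Values reported in the abstract.
P_OUT_DBM = 7.902
V_DC = 2.0         # volts
I_DC = 12e-3       # amperes
F0_HZ = 2.4e9

P_OUT_MW = dbm_to_mw(P_OUT_DBM)       # roughly 6.2 mW
P_DC_MW = V_DC * I_DC * 1e3           # 24 mW
EFFICIENCY = P_OUT_MW / P_DC_MW       # derived here, not stated in the abstract

# Hypothetical resonator model values, chosen only for illustration.
L_H = 1e-9         # assumed 1 nH
R_OHM = 500.0      # assumed parallel loss resistance
C_F = capacitance_for(F0_HZ, L_H)

if __name__ == "__main__":
    print(f"output power: {P_OUT_MW:.2f} mW, DC input: {P_DC_MW:.1f} mW")
    print(f"DC-to-RF efficiency: {EFFICIENCY:.1%}")
    print(f"C for resonance with assumed L: {C_F * 1e12:.2f} pF")
    print(f"Q with assumed R: {parallel_rlc_q(R_OHM, L_H, C_F):.1f}")
```

Under these reported numbers the oscillator delivers about a quarter of its DC input as RF output power.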
ISBN: 9781509051465 (print)
Access to a massive collection of geographic spatial data is one of the serious bottlenecks in large-scale data-centric applications of the big data era, such as data assimilation and urban data analytic systems. In this paper, we consider the implementation of distributed spatial indices, specifically quad trees, on a distributed computing system using the shared-nothing memory approach. We discuss static and dynamic partitioning and allocation strategies for data and queries across distributed nodes. Using scaled-down parallel data load and search experiments on a small distributed processor system as a proof of concept, we show that the proposed approach, with a collection of small indices in distributed shared-nothing memory, is more efficient than the conventional approach of a single processor with a large external index. We also observed that the proposed tree-based partitioning and assignment strategy using sampling reduces query time compared with other conventional partitioning strategies used in databases. We also discuss how to allocate a collection of small tree indices among distributed processors. These results suggest that parallelized access to databases with spatial indexing functions can enhance the throughput of large-scale data-centric applications.
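The idea of replacing one large index with a collection of small quad trees can be sketched as follows in Python. This is a single-process illustration under stated assumptions, not the paper's system: the class names, the fixed four-way static partitioning in `ShardedIndex`, the leaf capacity, and the depth limit are all hypothetical choices, and each shard stands in for an independent shared-nothing node.

```python
from dataclasses import dataclass, field

MAX_DEPTH = 16  # assumed limit so duplicate points cannot split forever


@dataclass
class QuadTree:
    x0: float
    y0: float
    x1: float
    y1: float  # bounds are half-open: [x0, x1) x [y0, y1)
    capacity: int = 4
    depth: int = 0
    points: list = field(default_factory=list)
    children: list = field(default_factory=list)

    def insert(self, x: float, y: float) -> bool:
        if not (self.x0 <= x < self.x1 and self.y0 <= y < self.y1):
            return False
        if not self.children:
            if len(self.points) < self.capacity or self.depth >= MAX_DEPTH:
                self.points.append((x, y))
                return True
            self._split()
        return any(child.insert(x, y) for child in self.children)

    def _split(self) -> None:
        mx, my = (self.x0 + self.x1) / 2, (self.y0 + self.y1) / 2
        d, c = self.depth + 1, self.capacity
        self.children = [
            QuadTree(self.x0, self.y0, mx, my, c, d),
            QuadTree(mx, self.y0, self.x1, my, c, d),
            QuadTree(self.x0, my, mx, self.y1, c, d),
            QuadTree(mx, my, self.x1, self.y1, c, d),
        ]
        for px, py in self.points:  # push stored points down to the new leaves
            any(child.insert(px, py) for child in self.children)
        self.points = []

    def query(self, qx0: float, qy0: float, qx1: float, qy1: float) -> list:
        """Return points inside the half-open query rectangle."""
        if qx1 <= self.x0 or qx0 >= self.x1 or qy1 <= self.y0 or qy0 >= self.y1:
            return []  # no overlap with this node
        found = [(x, y) for x, y in self.points
                 if qx0 <= x < qx1 and qy0 <= y < qy1]
        for child in self.children:
            found.extend(child.query(qx0, qy0, qx1, qy1))
        return found


class ShardedIndex:
    """Static partitioning: one small, independent quad tree per quadrant."""

    def __init__(self, x0: float, y0: float, x1: float, y1: float):
        mx, my = (x0 + x1) / 2, (y0 + y1) / 2
        self.shards = [QuadTree(x0, y0, mx, my), QuadTree(mx, y0, x1, my),
                       QuadTree(x0, my, mx, y1), QuadTree(mx, my, x1, y1)]

    def insert(self, x: float, y: float) -> bool:
        return any(shard.insert(x, y) for shard in self.shards)

    def query(self, qx0: float, qy0: float, qx1: float, qy1: float) -> list:
        # Each shard skips itself when the rectangle does not overlap it.
        return [p for shard in self.shards
                for p in shard.query(qx0, qy0, qx1, qy1)]
```

In a real deployment each shard would live on its own node and queries would be routed only to overlapping shards; the sampling-based partitioning mentioned in the abstract would replace the fixed quadrants used here.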
ISBN: 9781509025695 (print)
The Pipe-and-Filter style represents a well-known family of component-based architectures. By executing each filter on a dedicated processing unit, it is also possible to leverage contemporary distributed systems and multi-core systems for high throughput. However, this simple parallelization approach is not very effective when (1) the workload is unevenly distributed over the filters and when (2) the number of available processing units exceeds the number of filters. In the first case, parallelizing all filters can waste resources, since the slowest filter alone determines the overall throughput. In the second case, some processing units remain unused. In this paper, we present an automatic parallelization approach that provides high throughput and utilizes the available processing units. Our main idea is to provide a composite filter that is wrapped around an existing filter to increase its throughput. We call this composite filter the Task Farm Filter, since it implements the Task Farm parallelization pattern. It creates and executes multiple instances of the underlying filter in parallel. Moreover, we present a modular, self-adaptive mechanism that automatically adapts the number of instances at runtime to achieve the highest possible throughput. Finally, we present an extensive experimental evaluation of our self-adaptive task farm filter using a CPU-intensive, an I/O-intensive, and a hybrid scenario. The evaluation shows that our task farm automatically parallelizes the underlying filter and thus increases the overall throughput. Furthermore, the evaluation shows that our task farm scales well with the workload of the executed Pipe-and-Filter architecture.
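The wrapping idea described in this abstract can be illustrated with a minimal Python sketch. It is not the authors' framework: `task_farm` and `pipeline` are hypothetical names, the worker count is fixed rather than self-adaptive, and a thread pool from the standard library stands in for the multiple filter instances.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Iterable, Iterator, TypeVar

T = TypeVar("T")
U = TypeVar("U")


def task_farm(filter_fn: Callable[[T], U],
              workers: int = 4) -> Callable[[Iterable[T]], Iterator[U]]:
    """Wrap a per-item filter so several instances process items in parallel."""
    def farmed(items: Iterable[T]) -> Iterator[U]:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            # map() hands items to the workers and yields results in input order.
            yield from pool.map(filter_fn, items)
    return farmed


def pipeline(source: Iterable, *stages: Callable[[Iterable], Iterable]) -> Iterable:
    """Connect stream-to-stream stages in sequence."""
    stream = source
    for stage in stages:
        stream = stage(stream)
    return stream


def parse(items: Iterable[str]) -> Iterator[int]:
    """Cheap filter: left sequential."""
    return (int(s) for s in items)


def slow_square(x: int) -> int:
    """Slow filter (simulated I/O wait): the bottleneck worth farming out."""
    time.sleep(0.01)
    return x * x


if __name__ == "__main__":
    lines = [str(i) for i in range(20)]
    result = list(pipeline(lines, parse, task_farm(slow_square, workers=5)))
    print(result)
```

Only the slow stage is wrapped, matching the observation that the slowest filter determines the overall throughput. Threads suit the I/O-bound case simulated here; in CPython a CPU-intensive filter would need a process pool instead, and the self-adaptive adjustment of the instance count is left out of this sketch.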
ISBN: 9781509026227 (print)
The proceedings contain 68 papers. The topics discussed include: clustering geo-tagged tweets for advanced big data analytics; optimizing Hadoop framework for solid state drives; a case study of optimizing big data analytical stacks using structured data shuffling; Geelytics: enabling on-demand edge analytics over scoped data sources; continual and cost-effective partitioning of dynamic graphs for optimizing big graph processing systems; distributed incremental graph analysis; model transformation and data migration from relational database to MongoDB; enhanced state history tree (eSHT): a stateful data structure for analysis of highly parallel system traces; analytics toolkit for business big data; a NoSQL data model for scalable big data workflow execution; evaluation and analysis of in-memory key-value systems; predictive modeling in a big data distributed setting: a scalable bias correction approach; improving the visualization of WordNet large lexical database through semantic tag clouds; and boosting vertex-cut partitioning for streaming graphs.