Given a pattern string of length m. for the string-matching problem, we design an algorithm that computes deterministic samples of a sufficiently long substring of the pattern in constant time. This problem used to be...
详细信息
Given a pattern string of length m. for the string-matching problem, we design an algorithm that computes deterministic samples of a sufficiently long substring of the pattern in constant time. This problem used to be the bottleneck in the pattern preprocessing for one- and two-dimensional pattern matching. The best previous time bound was O(log(2) m/log log m). We use this algorithm to obtain the following results (all algorithms below are optimal parallel algorithms on a CRCW PRAM): 1. a deterministic string-matching algorithm which takes O(log log m) time for preprocessing and constant time for text search, which are the best possible in both preprocessing and text search;2. a constant-time deterministic string-matching algorithm in the case where the text length n satisfies n = Omega(m(1+epsilon)) for a constant epsilon > 0;3. a simple string-matching algorithm that has constant time with high probability for random input;4. the main result: a constant-expected-time Las Vegas algorithm for computing the period of the pattern and all witnesses and thus for stringmatching itself;in both cases, an Omega(log log m) lower bound is known for deterministic algorithms.
Resource Description Framework (RDF) is a commonly used format for semantic web processing. It basically contains strings representing items and their relationships which can be queried or inferred. In this paper, we ...
详细信息
Resource Description Framework (RDF) is a commonly used format for semantic web processing. It basically contains strings representing items and their relationships which can be queried or inferred. In this paper, we propose a framework for processing large RDF data sets. It is based on Brute-force stringmatching on GPUs (BFG). Graphics Processing Units (GPUs) are used as a parallel platform that allows thousands of threads to find RDF data. Our search algorithm is customized to suit the nature of RDF processing and GPU memory architecture. Then, the algorithm is integrated into the proposed framework for computing queries and chaining rules for RDF data. Experiments show that utilizing these algorithms can achieve the speedup of 7 times for querying and for forward chaining compared to using the sequential version. The proposed framework can achieve a string comparison rate of 67,000 comparisons per second using 2 GPUs.
Resource Description Framework (RDF) is commonly used for the semantic web query. During this decade, due to big data processing, the large numbers of RDF triples are crawled. The triples usually stored distributed on...
详细信息
ISBN:
(纸本)9781479908066;9781479908059
Resource Description Framework (RDF) is commonly used for the semantic web query. During this decade, due to big data processing, the large numbers of RDF triples are crawled. The triples usually stored distributed on the clouds storage or the large clusters. To search for the query answer, it is usually difficult to handle the search across platforms. Also, the search takes a long executed time. Thus, the data representation and platform are important to speedup the search and handle the heterogeneousness. In this paper, we present the experimental framework which can be used to handle the search of RDF data in GPU clusters. Our framework uses the Java platform to manipulate the semantic query while using JCuda(1) to perform the GPU processing. Apache Cassandra storage, known as CumulusRDF, is used to store key-values for searching. In the experiments, DBpedia and Freebase dataset are extracted and manipulated. The triple structures are transformed and loaded into Apache Cassandra storage as CumulusRDF's flat layout. The subject-predicate-object keys are kept in CQL caching. There are about 3-hundred-million tags that can be handled with in one machine, which can reduce time, with an inexpensive cost. We shape the data grid from row-major-ordering of Java, to GPU thread grid of CUDA, retrieved keys to join for finding the correspondence of the RDF graph.
暂无评论