It is challenging to optimize GPU kernels because this process requires deep technical knowledge of the underlying hardware. Modern GPU architectures are becoming more and more diversified, which further exacerbates ...
Based on an analysis of logic optimization and task-parallel allocation algorithms, and taking the features of logic optimization into account, we propose a new parallel processing algorithm for logic optimization scheduling and allocation. Considering the correlation between the minimizers of each logic function, the algorithm first assigns the logic functions with high correlation and then assigns the rest according to the size of the matrix. A worked example shows that the algorithm completes the logic scheduling well.
ISBN: 9782951740877 (print)
The paper presents the methodology and the outcome of the compilation and processing of the Bulgarian X-language Parallel Corpus (Bul-X-Cor), which was integrated as part of the Bulgarian National Corpus (BulNC). We focus on building representative parallel corpora which include a diversity of domains and genres, reflect the relations between Bulgarian and other languages, and are consistent in terms of compilation methodology, text representation, metadata description and annotation conventions. The approaches implemented in the construction of Bul-X-Cor include using readily available text collections on the web, manual compilation (by means of Internet browsing) and, preferably, automatic compilation (by means of general and focused web crawling). Certain levels of annotation applied to Bul-X-Cor are taken as obligatory (sentence segmentation and sentence alignment), while others depend on the availability of tools for a particular language (morpho-syntactic tagging, lemmatisation, syntactic parsing, named entity recognition, word sense disambiguation, etc.) or for a particular task (word and clause alignment). To achieve uniformity of the annotation, we have either annotated raw data from scratch or transformed the already existing annotation to follow the conventions accepted for BulNC. Finally, actual uses of the corpora are presented and conclusions are drawn with respect to future work.
ISBN: 9782951740877 (print)
ATLIS (short for "ATLIS Tags Locations in Strings") is a tool that uses a maximum-entropy machine learning model to automatically identify spatial and locational information in natural language text. It is being developed in parallel with the ISO-Space standard for annotation of spatial information (Pustejovsky, Moszkowicz & Verhagen 2011). The goal of ATLIS is to take in a document as raw text and mark it up with ISO-Space annotation data, so that another program could use the information in a standardized format to reason about the semantics of the spatial information in the document. The tool (as well as ISO-Space itself) is still in the early stages of development. At present it implements a subset of the proposed ISO-Space annotation standard: it identifies expressions that refer to specific places, as well as prepositional constructions that indicate a spatial relationship between two objects. In this paper, the structure of the ATLIS tool is presented, along with preliminary evaluations of its performance.
Modern GPUs (Graphics Processing Units) can be used for general-purpose parallel computation. Users can develop parallel programs running on GPUs using a programming architecture called CUDA (Compute Unified Device Arch...
Fast convergence speed is a desired property for training topic models such as latent Dirichlet allocation (LDA), especially in online and parallel topic modeling algorithms for big data sets. In this paper, we develo...
Statistical machine translation (SMT) systems depend on the availability of domain-specific bilingual parallel text. However, parallel corpora are a limited resource and they are often not available for some domains or...
ISBN: 9783642314995; 9783642315008 (print)
Cloud computing opens new possibilities for computational biologists. Given the pay-as-you-go model and the commodity hardware base, new tools for extensive parallelism are needed to make experimentation in the cloud an attractive option. In this paper, we present Easy Prot, a parallel message-passing architecture designed for developing experimental workflows in computational biology while harnessing the power of cloud resources. The system exploits parallelism in two ways: by multithreading modular components on virtual machines while respecting data dependencies, and by allowing expansion across multiple virtual machines. Components of the system, called elements, are easily configured for efficient modification and testing of workflows during ever-changing experimentation. Though Easy Prot, as an abstract cloud programming model, can be extended beyond computational biology, current development brings cloud computing to experimenters in this important discipline who are facing unprecedented data-processing challenges, with a type system designed for proteomics, interactomics and comparative genomics data, and a suite of elements that perform useful analysis tasks on biological data using cloud resources. Availability: Easy Prot is available as a public abstract machine image (AMI) on Amazon EC2 cloud service, with an open source license, registered with manifest easyprot-ami/***.
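The idea of multithreaded elements that respect data dependencies can be illustrated with a minimal Python sketch. This is not Easy Prot's API: the names `element`, `run_pipeline` and `_STOP` are hypothetical, and the sketch assumes a simple linear workflow in which each element runs in its own thread and passes messages to the next element through a queue.

```python
import queue
import threading

_STOP = object()  # hypothetical end-of-stream marker passed between elements


def element(func, inbox, outbox):
    """Start one workflow element in its own thread (illustrative only)."""
    def run():
        while True:
            msg = inbox.get()
            if msg is _STOP:
                outbox.put(_STOP)  # propagate shutdown downstream
                return
            outbox.put(func(msg))  # a result is sent only after its input arrived

    thread = threading.Thread(target=run)
    thread.start()
    return thread


def run_pipeline(funcs, items):
    """Chain elements with queues; each stage starts work as soon as data arrives."""
    queues = [queue.Queue() for _ in range(len(funcs) + 1)]
    threads = [element(f, queues[i], queues[i + 1]) for i, f in enumerate(funcs)]
    for item in items:
        queues[0].put(item)
    queues[0].put(_STOP)

    results = []
    while True:
        msg = queues[-1].get()
        if msg is _STOP:
            break
        results.append(msg)
    for thread in threads:
        thread.join()
    return results


if __name__ == "__main__":
    # Toy workflow: normalize peptide strings, then measure their length.
    print(run_pipeline([str.upper, len], ["mkv", "acde"]))
```

Because each stage has a single worker thread and queues are first-in, first-out, results come out in input order; a real workflow engine would also need branching, error handling and distribution across machines.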
ISBN: 9783642281440 (print)
Polynomial resultants are of fundamental importance in symbolic computations, especially in the field of quantifier elimination. In this paper we show how to compute the resultant res_y(f, g) of two bivariate polynomials f, g ∈ Z[x, y] on a CUDA-capable graphics processing unit (GPU). We achieve parallelization by mapping the bivariate integer resultant onto a sufficiently large number of univariate resultants over finite fields, which are then lifted back to the original domain. We point out that the commonly proposed special treatment for so-called unlucky homomorphisms is unnecessary, and show how this simplifies the parallel resultant algorithm. All steps of the algorithm are executed entirely on the GPU. Data transfer is only used for the input polynomials and the resultant. Experimental results show the considerable speedup of our implementation compared to host-based algorithms.
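The modular idea (compute many independent resultants over finite fields, then lift) can be sketched on the CPU in Python. The sketch covers only univariate integer polynomials and therefore omits the evaluation and interpolation in x, as well as any GPU code. It is not the paper's implementation: the function names are hypothetical, polynomials are assumed to be coefficient lists with the highest degree first, and primes dividing a leading coefficient are simply skipped, which is a simplification chosen here. Python 3.8+ is assumed for `pow(a, -1, p)`.

```python
def trim(f, p):
    """Reduce coefficients mod p and drop leading zeros (highest degree first)."""
    f = [c % p for c in f]
    while f and f[0] == 0:
        f.pop(0)
    return f


def poly_rem(f, g, p):
    """Remainder of f divided by g over F_p; g must have a nonzero leading coeff."""
    f = f[:]
    inv = pow(g[0], -1, p)  # modular inverse of the leading coefficient
    while len(f) >= len(g):
        q = f[0] * inv % p
        for i in range(len(g)):
            f[i] = (f[i] - q * g[i]) % p
        f.pop(0)  # the leading coefficient is now zero
    while f and f[0] == 0:
        f.pop(0)
    return f


def resultant_mod_p(f, g, p):
    """Resultant of f and g over F_p via the Euclidean algorithm."""
    f, g = trim(f, p), trim(g, p)
    if not f or not g:
        return 0
    res = 1
    while len(g) > 1:
        m, n = len(f) - 1, len(g) - 1
        r = poly_rem(f, g, p)
        if not r:
            return 0  # common factor of positive degree
        sign = -1 if (m * n) % 2 else 1
        # res(f, g) = (-1)^(mn) * lc(g)^(m - deg r) * res(g, r)
        res = res * sign * pow(g[0], m - (len(r) - 1), p) % p
        f, g = g, r
    return res * pow(g[0], len(f) - 1, p) % p  # res(f, c) = c^(deg f)


def resultant_int(f, g, primes):
    """Combine modular images with the Chinese remainder theorem.

    The product of the primes actually used must exceed twice the absolute
    value of the true resultant, otherwise the lifted value is wrong.
    """
    r, m = 0, 1
    for p in primes:
        if f[0] % p == 0 or g[0] % p == 0:
            continue  # simplification: skip primes that lower a degree
        rp = resultant_mod_p(f, g, p)  # each image is independent of the others
        t = (rp - r) * pow(m, -1, p) % p
        r, m = r + m * t, m * p
    return r - m if r > m // 2 else r  # map to the symmetric range


if __name__ == "__main__":
    # f = 2y^2 + 3y + 1, g = 3y - 2
    print(resultant_int([2, 3, 1], [3, -2], [101, 103]))
```

Each call to `resultant_mod_p` is independent of the others, which is the property that makes this style of algorithm attractive for parallel hardware.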
ISBN: 9780769549569; 9781467362184 (print)
A low-latency and low-diameter interconnection network will be an important component of future exascale architectures. The dragonfly network topology, a two-level directly connected network, is a candidate for exascale architectures because of its low diameter and reduced latency. To date, small-scale simulations with a few thousand nodes have been carried out to examine the dragonfly topology. However, future exascale machines will have millions of cores and up to 1 million nodes. In this paper, we focus on the modeling and simulation of large-scale dragonfly networks using the Rensselaer Optimistic Simulation System (ROSS). We validate the results of our model against the cycle-accurate simulator "booksim". We also compare the performance of booksim and ROSS for the dragonfly network model at modest scales. We demonstrate the performance of ROSS on both the Blue Gene/P and Blue Gene/Q systems on a dragonfly model with up to 50 million nodes, showing a peak event rate of 1.33 billion events/second and a total of 872 billion committed events. The dragonfly network model for million-node configurations strongly scales when going from 1,024 to 65,536 MPI tasks on IBM Blue Gene/P and IBM Blue Gene/Q systems. We also explore a variety of ROSS tuning parameters to get optimal results with the dragonfly network model.
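The low-diameter property of the two-level dragonfly layout can be checked on a small instance with a Python sketch. It is unrelated to the ROSS or booksim models. It assumes the usual dragonfly parameters (a routers per group, h global links per router, a*h + 1 groups), a fully connected local network inside each group, and one particular global-link arrangement chosen here for illustration; terminals attached to routers are left out, and the function names are hypothetical.

```python
from collections import deque
from itertools import combinations


def build_dragonfly(a, h):
    """Router-level dragonfly graph with a*h + 1 groups (illustrative layout)."""
    groups = a * h + 1
    adj = {(grp, r): set() for grp in range(groups) for r in range(a)}
    for grp in range(groups):
        # Level 1: routers inside a group are fully connected.
        for r1, r2 in combinations(range(a), 2):
            adj[(grp, r1)].add((grp, r2))
            adj[(grp, r2)].add((grp, r1))
        # Level 2: global port k of this group reaches group grp + k + 1,
        # arriving at that group's port a*h - 1 - k (an assumed arrangement).
        for k in range(a * h):
            peer = (grp + k + 1) % groups
            peer_port = a * h - 1 - k
            u, v = (grp, k // h), (peer, peer_port // h)
            adj[u].add(v)
            adj[v].add(u)
    return adj


def diameter(adj):
    """Longest shortest path in hops, found by breadth-first search from every router."""
    longest = 0
    for src in adj:
        dist = {src: 0}
        pending = deque([src])
        while pending:
            u = pending.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    pending.append(v)
        longest = max(longest, max(dist.values()))
    return longest


if __name__ == "__main__":
    net = build_dragonfly(a=4, h=2)  # 9 groups of 4 routers
    print(len(net), diameter(net))
```

In this layout every pair of groups shares exactly one global link, so a minimal route needs at most one local hop, one global hop and one more local hop, regardless of how many groups there are.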