ISBN: (print) 9783642252051
In this paper, we present a novel approach to microaneurysm candidate extraction. To strengthen the accuracy of individual algorithms, we propose an ensemble of state-of-the-art candidate extractors. We apply a simulated annealing based method to select an optimal combination of such algorithms for a particular dataset. We also present a novel classification technique, which is based on a parallel ensemble of kernel density estimators. The experimental results show improvement in the positive likelihood rate compared to the individual candidate extractors.
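The selection step described in this abstract can be illustrated with a small sketch. It is not the authors' method: the toy detection sets in `EXTRACTORS`, the objective `toy_score` (true hits minus half the false hits of the union) and the geometric cooling schedule are all assumptions made for illustration.

```python
import math
import random

# Hypothetical toy data: each extractor reports (true hits, false hits) as id sets.
EXTRACTORS = [
    ({1, 2, 3}, {101}),
    ({3, 4}, {102, 103, 104, 105}),
    ({5, 6}, set()),
    ({1}, {106, 107, 108}),
]


def toy_score(mask):
    """Assumed objective: unique true hits minus half the unique false hits."""
    tp, fp = set(), set()
    for use, (hits, false_hits) in zip(mask, EXTRACTORS):
        if use:
            tp |= hits
            fp |= false_hits
    return len(tp) - 0.5 * len(fp)


def anneal_subset(n_items, score, steps=2000, t_start=1.0, t_end=0.01, seed=0):
    """Search for a high-scoring subset of n_items members by simulated annealing."""
    rng = random.Random(seed)
    current = [rng.random() < 0.5 for _ in range(n_items)]
    current_score = score(current)
    best, best_score = current[:], current_score
    for step in range(steps):
        # Geometric cooling from t_start down to t_end.
        temp = t_start * (t_end / t_start) ** (step / max(steps - 1, 1))
        # Neighbour state: flip the membership of one randomly chosen extractor.
        candidate = current[:]
        i = rng.randrange(n_items)
        candidate[i] = not candidate[i]
        delta = score(candidate) - current_score
        # Accept non-worsening moves always, worsening moves with Boltzmann probability.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, current_score = candidate, current_score + delta
            if current_score > best_score:
                best, best_score = current[:], current_score
    return best, best_score


best_mask, best_value = anneal_subset(len(EXTRACTORS), toy_score)
```

In a real setting the objective would be measured on annotated images of the target dataset, so the chosen combination depends on that dataset.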
Pairwise sequence alignment has received a new motivation due to the advent of next-generation sequencing technologies, particularly so for the application of re-sequencing, the assembly of a genome directed by a refer...
ISBN: (print) 9783642314995; 9783642315008
Modern multicore processor technology can fairly easily deliver special accelerator processors dedicated to fast, optimised execution of critical computational functions. Multi-CMP (Chip Multi-Processor) systems can be composed as a set of dedicated and general-purpose computational modules interconnected by a global data exchange network. The paper proposes special program scheduling algorithms for such systems. The dedicated CMP modules assumed in the paper are based on a new data communication model called communication on the fly. It enables a strong reduction of inter-process and inter-core communication overheads for intensively shared data.
ISBN: (print) 9782951740877
MultiUN is a multilingual parallel corpus extracted from the official documents of the United Nations. It is available in the six official languages of the UN, and a small portion of it is also available in German. This paper presents a major update on the first public version of the corpus released in 2010. Version 2 consists of over 513,091 documents, including around 9% of new documents retrieved from the United Nations official document system. Compared to the first release, we applied several modifications to the corpus preparation method. In this paper, we describe the methods we used for processing the UN documents and aligning the sentences. The most significant improvement compared to the previous release is the newly added multilingual sentence alignment information. The alignment information is encoded together with the text in XML instead of in additional files. Our representation of the sentence alignment allows quick construction of aligned texts parallel in an arbitrary number of languages, which is essential for building machine translation systems.
ISBN: (print) 9782951740877
In this paper, we present a trilingual parallel corpus for German, Italian and Romansh, a Swiss minority language spoken in the canton of Grisons. The corpus, called ALLEGRA, contains press releases automatically gathered from the website of the cantonal administration of Grisons. Texts have been preprocessed and aligned with a current state-of-the-art sentence aligner. The corpus is one of the first of its kind and can be of great interest, particularly for the creation of natural language processing resources and tools for Romansh. We illustrate the use of such a trilingual resource for automatic induction of bilingual lexicons, which is a real challenge for under-represented languages. We induce a bilingual lexicon for German-Romansh by phrase alignment and evaluate the resulting entries with the help of a reference lexicon. We then show that the use of the third language of the corpus, Italian, as a pivot language can improve the precision of the induced lexicon, without loss in terms of quality of the extracted pairs.
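One common way to use a pivot language, sketched here under stated assumptions, is to keep a German-Romansh candidate pair only when some Italian word is linked to both sides. The abstract does not spell out the authors' exact procedure, so the function `pivot_filter` and the tiny word lists below are hypothetical illustrations, not ALLEGRA data.

```python
def pivot_filter(de_rm_pairs, de_it, it_rm):
    """Keep German-Romansh pairs confirmed by at least one Italian pivot word.

    de_rm_pairs: iterable of (german, romansh) candidate pairs
    de_it:       dict mapping a German word to a set of Italian translations
    it_rm:       dict mapping an Italian word to a set of Romansh translations
    """
    kept = set()
    for de, rm in de_rm_pairs:
        pivots = de_it.get(de, set())
        if any(rm in it_rm.get(it, set()) for it in pivots):
            kept.add((de, rm))
    return kept


# Toy candidate lexicon containing one noisy pair, ("Haus", "via").
candidates = {("Haus", "chasa"), ("Haus", "via"), ("Strasse", "via")}
german_italian = {"Haus": {"casa"}, "Strasse": {"strada"}}
italian_romansh = {"casa": {"chasa"}, "strada": {"via"}}

confirmed = pivot_filter(candidates, german_italian, italian_romansh)
```

The filter trades recall for precision: a correct pair is dropped whenever the pivot lexicons happen to lack a connecting Italian entry.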
ISBN: (print) 9782951740877
In the implementation of a surface realisation engine, many of the computational techniques seen in other AI fields have been widely applied. Among these, the use of statistical methods has been particularly successful, as in the so-called 'generate-and-select', or two-stage, architectures. Systems of this kind produce output strings from possibly underspecified input data by over-generating a large number of alternative realisations (often including ungrammatical candidate sentences). These are subsequently ranked with the aid of a statistical language model, and the most likely candidate is selected as the output string. Statistical approaches may, however, face a number of difficulties. Among these is the issue of data sparseness, a problem that is particularly evident in cases such as our target language, Brazilian Portuguese, which is not only morphologically rich but also relatively poor in NLP resources such as large, publicly available corpora. In this work we describe a first implementation of a shallow surface realisation system for this language that deals with the issue of data sparseness by making use of factored language models built from a (relatively) large corpus of Brazilian newspaper articles.
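The select stage of a generate-and-select architecture can be sketched as follows. As a loudly stated simplification, the sketch ranks candidates with a plain word bigram model with add-one smoothing rather than the factored language models the abstract refers to, and the three-sentence training corpus is a made-up toy.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their left contexts over whitespace-tokenised sentences."""
    contexts, bigrams, vocab = Counter(), Counter(), {"</s>"}
    for sentence in sentences:
        tokens = ["<s>"] + sentence.split() + ["</s>"]
        vocab.update(tokens[1:])
        contexts.update(tokens[:-1])
        bigrams.update(zip(tokens, tokens[1:]))
    return contexts, bigrams, len(vocab)


def log_prob(sentence, model):
    """Add-one smoothed bigram log-probability of a sentence."""
    contexts, bigrams, vocab_size = model
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    return sum(
        math.log((bigrams[(a, b)] + 1) / (contexts[a] + vocab_size))
        for a, b in zip(tokens, tokens[1:])
    )


def select(candidates, model):
    """Return the candidate realisation the language model finds most likely."""
    return max(candidates, key=lambda c: log_prob(c, model))


# Toy corpus and over-generated candidates (all of equal length).
corpus = ["o menino comeu o bolo", "a menina comeu o bolo", "o menino viu a menina"]
model = train_bigram(corpus)
realisations = ["menino o comeu bolo o", "o menino comeu a bolo", "o menino comeu o bolo"]
chosen = select(realisations, model)
```

A factored model would additionally back off to factors such as lemma or part of speech when a word bigram is unseen, which is where it helps with sparse data.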
The coming generation of supercomputing architectures will require fundamental changes in programming models to effectively make use of the expected million to billion way concurrency and thousand-fold reduction in pe...
In this paper, we propose an implementation of a parallel two-dimensional fast Fourier transform (FFT) using Intel Advanced Vector Extensions (AVX) instructions on multi-core processors. The combination of vectorizati...
Parallel processing is essential to mining frequent closed sequences from massive volumes of data in a timely manner. On the other hand, MapReduce is an ideal software framework to support distributed computing on larg...
ISBN: (digital) 9783642314643
ISBN: (print) 9783642314636; 9783642314643
The development of reliable and efficient scientific software in distributed computing environments requires the identification and analysis of issues related to the design and deployment of algorithms for high-performance computing architectures and their integration in distributed contexts. In these environments, indeed, resource efficiency and availability can change unexpectedly because of overloading or failure of both the computing nodes and the interconnection network. The scenario described above requires the design of mechanisms that enable the software to "survive" such unexpected events while ensuring, at the same time, an effective use of the computing resources. Although many researchers have been working on these problems for years, fault tolerance remains an open matter for some classes of applications. Here we focus on the design and deployment of a checkpointing/migration system to enable fault tolerance in parallel applications running in distributed environments. In particular, we describe details of HADAB, a new hybrid checkpointing strategy, and its deployment in a meaningful case study: the PETSc Conjugate Gradient algorithm implementation. The related testing phase has been performed on the University of Naples distributed infrastructure (***.P.E. infrastructure).
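The general idea of checkpointing an iterative solver can be shown with a minimal single-process sketch. It is neither HADAB nor PETSc code: the pure-Python conjugate gradient, the pickle-based checkpoint file and the `stop_after` parameter that simulates a failure are all assumptions introduced for illustration.

```python
import os
import pickle
import tempfile


def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))


def matvec(a, x):
    return [dot(row, x) for row in a]


def save_checkpoint(path, state):
    # Write to a temporary file first, then rename it, so a crash during the
    # write never leaves a truncated checkpoint behind.
    tmp_path = path + ".tmp"
    with open(tmp_path, "wb") as fh:
        pickle.dump(state, fh)
    os.replace(tmp_path, path)


def cg_with_checkpoints(a, b, path, max_iter=100, tol=1e-12, stop_after=None):
    """Solve a x = b (a symmetric positive definite) by conjugate gradient.

    Resumes from the checkpoint at `path` when one exists. `stop_after` is a
    hypothetical knob that simulates a failure after that many iterations.
    """
    if os.path.exists(path):
        with open(path, "rb") as fh:
            k, x, r, p = pickle.load(fh)
    else:
        k, x, r, p = 0, [0.0] * len(b), list(b), list(b)
    rs = dot(r, r)
    while k < max_iter and rs > tol:
        if stop_after is not None and k >= stop_after:
            break
        ap = matvec(a, p)
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
        k += 1
        save_checkpoint(path, (k, x, r, p))
    return x, k


# Toy 2x2 system whose exact solution is x = [1/11, 7/11].
A = [[4.0, 1.0], [1.0, 3.0]]
B = [1.0, 2.0]
with tempfile.TemporaryDirectory() as workdir:
    ckpt = os.path.join(workdir, "cg.ckpt")
    _, done = cg_with_checkpoints(A, B, ckpt, stop_after=1)  # interrupted run
    solution, total = cg_with_checkpoints(A, B, ckpt)        # resumed run
```

Saving the iterate, residual and search direction together is what lets the resumed run continue the same Krylov sequence instead of restarting from the initial guess.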