The increasing popularity of advanced schedule-based techniques designed to solve Grid scheduling problems requires the use of efficient data structures to represent the constructed job schedules. Based on our previou...
The Collaborative Computing Project for NMR (CCPN) has built a software framework consisting of the CCPN data model (with APIs) for NMR-related data, the CcpNmr Analysis program and additional tools like CcpNmr Format...
In this paper we review the progress made over the last five years in the design of low-complexity digital correction structures and algorithms for time-interleaved ADCs. We devise a discrete-time model, state the design problem, and finally derive the algorithms and structures. In particular, we discuss efficient algorithms to design time-varying correction filters as well as iterative structures utilizing polynomial-based filters. Finally, we give an outlook on future research questions.
Web-scale digital assets comprise millions or billions of documents. At this scale, sequential algorithms cannot cope with the data, and parallel and distributed computing becomes the solution of choice. MapReduce is a programming model proposed by Google for scalable data processing and is mainly applicable to data-intensive algorithms. In contrast, the Message Passing Interface (MPI) is suitable for high-performance algorithms. This paper proposes an adapted structure of the MapReduce programming model using MPI for multimedia indexing. Experimental results on a large number of text (XML) excerpts related to images from the ImageNet corpus indicate that our implementation achieved good speedup compared to the sequential version and to earlier versions of MapReduce using MPI. Extensions to index large-scale multimedia collections are discussed.
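As a rough illustration of the map, shuffle and reduce phases behind this kind of indexing, the following minimal sketch builds an inverted index over a few text excerpts. It is not the paper's implementation: it runs in a single process without MPI, and the function names and sample documents are assumptions made for the example. In an MPI setting, each map task would typically run on a separate rank and the shuffle step would become a message exchange between ranks.

```python
from collections import defaultdict


def map_phase(doc_id, text):
    """Emit one (term, doc_id) pair per distinct term in a document."""
    return [(term, doc_id) for term in set(text.lower().split())]


def shuffle(pairs):
    """Group the emitted pairs by term, as the shuffle step would."""
    groups = defaultdict(list)
    for term, doc_id in pairs:
        groups[term].append(doc_id)
    return groups


def reduce_phase(term, doc_ids):
    """Collapse the document ids for one term into a sorted posting list."""
    return term, sorted(set(doc_ids))


def build_index(docs):
    """Run map, shuffle and reduce sequentially over all documents."""
    pairs = []
    for doc_id, text in docs.items():
        pairs.extend(map_phase(doc_id, text))
    return dict(reduce_phase(t, ids) for t, ids in shuffle(pairs).items())


# Hypothetical text excerpts attached to two images.
docs = {"img1": "red apple on table", "img2": "green apple tree"}
index = build_index(docs)
```

Because each call to `map_phase` depends only on its own document, the map step can be distributed freely; only the shuffle requires communication.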
The Document Analysis and Exploitation (DAE) platform is a sophisticated technical environment that consists of a repository containing document images, implementations of document analysis algorithms, and the results of these algorithms when applied to data in the repository. The use of a web services model makes it possible to set up document analysis pipelines that form the basis for reproducible protocols. Since the platform keeps track of all intermediate results, it becomes an information resource for the analysis of experimental data. This paper provides a tutorial on how to get started using the platform. It covers the technical details needed to overcome the initial hurdles and have a productive experience with DAE.
The number of citations an author receives is steadily becoming more important than the length of their publication list. The automatic extraction of bibliographic references from scientific articles is still a difficult problem in Document Engineering, even if the document is originally in digital form. This paper presents a strategy for extracting references from scientific documents in PDF format. The proposed scheme was validated on the Live Memory platform, developed to generate digital libraries of proceedings of technical events.
Multi-core CPUs are very efficient at executing multiple threads at the same time without significant performance penalty; this capability, however, results in increasing demand for the memory and the caches, which not only have to serve multiple parallel requests but also have to endure the consequences of parallel programming. The performance of parallel applications is limited not only by the CPU and the internal level of parallelism that the algorithms and data structures allow, but also by their memory characteristics. Altering the data structure, for example to work with the machine word-sized pointers required by atomic operations, incurs additional cache misses, while a lock protecting a critical section not only consumes memory but can also be responsible for increased memory traffic due to cache-line invalidations when it is acquired or released. We investigate these effects and analyze the behavior of different parallelization mechanisms, both blocking and lock-free solutions, through the example of a basic data structure: a hash table.
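To make the blocking variant concrete, here is a minimal sketch of a hash table protected by striped locks, where each lock guards a subset of the buckets. It is an illustration only, not the implementation studied in the paper: the class name, bucket count and lock count are assumptions, and Python cannot express the lock-free variant, which relies on hardware atomic operations such as compare-and-swap.

```python
import threading


class StripedHashTable:
    """Chained hash table in which each lock guards a stripe of buckets."""

    def __init__(self, num_buckets=64, num_locks=8):
        self._buckets = [[] for _ in range(num_buckets)]
        self._locks = [threading.Lock() for _ in range(num_locks)]

    def _locate(self, key):
        # Map the key to its bucket and to the lock covering that bucket.
        index = hash(key) % len(self._buckets)
        return self._buckets[index], self._locks[index % len(self._locks)]

    def put(self, key, value):
        bucket, lock = self._locate(key)
        with lock:  # critical section: only this stripe is blocked
            for i, (k, _) in enumerate(bucket):
                if k == key:
                    bucket[i] = (key, value)
                    return
            bucket.append((key, value))

    def get(self, key, default=None):
        bucket, lock = self._locate(key)
        with lock:
            for k, v in bucket:
                if k == key:
                    return v
        return default


def worker(table, start, count):
    """Insert a contiguous range of keys; each thread gets its own range."""
    for i in range(start, start + count):
        table.put(i, i * i)


table = StripedHashTable()
threads = [
    threading.Thread(target=worker, args=(table, n * 1000, 1000))
    for n in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

Using more locks reduces contention between threads but increases the memory footprint, which is one side of the trade-off the abstract describes.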
The techniques and performance of text recognition systems and software have improved greatly in recent years. OCR systems can now read any machine-printed document with good accuracy. However, these advances apply primarily to Latin scripts, and even for such scripts performance is limited on handwritten documents. Little work has been done on cursive scripts such as Arabic, and there is still room for improvement in terms of both accuracy and techniques. This paper presents an algorithm to recognize handwritten Arabic text using an ensemble of biased classifiers in a hierarchical setting. We address the fundamental shortcomings of traditional machine learning paradigms when applied to Arabic scripts. Experiments have been conducted on the AMA Arabic dataset to show the efficacy of our method.
In biological sequences, sequential pattern mining reveals implicit motifs/patterns that are of functional significance and have specific structures. Small alphabets and long sequences, such as DNA and protei...
Graph theory provides a set of powerful tools (both theorems and algorithms) for problem modeling and solving in numerous domains. Though there are several libraries implementing graph algorithms and targeting different platforms and users, few of those offer parallel implementations. To the best of our knowledge, there is a particular need for a library that is easier to use and extend, specifically designed to exploit the multicore architecture trend for high-performance parallelism. In this paper we describe Magical, a new OpenMP-based C++ multicore graph library. Our focus is to provide an implementation of graph algorithms which is designed for multicore architectures, by means of an easy-to-use application programming interface. We describe the library design and evaluate its performance by means of a case study concerning a shortest-paths problem.