ISBN (digital): 9798331532833
ISBN (print): 9798331532840
There is currently widespread interest in using phased execution models in the design of real-time systems, stimulated by their ability to reduce contention when accessing shared resources. The main beneficiary of such approaches is the wide category of Commercial Off-the-Shelf (COTS) multi-core systems, where complexity is an underlying issue. However, the advantage of predictability comes at a cost, which often resides in the additional decisions that are usually made when using phased execution models. One such decision is employing non-preemptive scheduling; as is well known, this leads to NP-hardness for most instances of the scheduling problem, whereas preemption can avoid this level of complexity in many cases. It is no surprise, then, that recent efforts have explored the possibilities offered by preemptive scheduling in combination with phased execution models. Unlike existing work, the current paper focuses on scratchpad memory consumption in preemptive phased execution models. The starting point, a model introduced by the authors in a previous article, was meant for the general analysis of the efficiency of preemptive scheduling in phased execution models. Notably, two methods for handling intermediate data are introduced, namely the Waiting Time Minimizing Preemption (WMPM) and the Overhead Minimizing Preemption (OMPM). The current paper analyses the requirements of each approach in terms of scratchpad memory. It also introduces a hybrid method to achieve a finer balance between memory consumption and schedulability constraints.
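To make the memory/schedulability trade-off concrete, here is a minimal Python sketch (not the paper's analysis) contrasting the scratchpad demand of two preemption-handling strategies of the kind such models must choose between: keeping the preempted task's intermediate data resident versus flushing it to main memory. The task parameters, the two-task scenario, and the save/restore cost model are illustrative assumptions, and no correspondence to the paper's WMPM/OMPM definitions is asserted.

```python
# Illustrative sketch only: scratchpad demand at a preemption point under
# two assumed strategies:
#  - keep the preempted task's intermediate data resident in scratchpad,
#    so both footprints coexist (more memory, no transfer overhead);
#  - flush intermediate data to main memory and reload it on resumption
#    (less memory, extra transfer delay that lengthens waiting time).

from dataclasses import dataclass

@dataclass
class Task:
    name: str
    footprint_kb: int      # scratchpad needed while executing
    intermediate_kb: int   # live intermediate data at the preemption point

def keep_resident_demand(preempted: Task, preempting: Task) -> int:
    # Intermediate data stays in scratchpad: footprints add up.
    return preempted.intermediate_kb + preempting.footprint_kb

def flush_demand(preempted: Task, preempting: Task,
                 transfer_kb_per_ms: float) -> tuple[int, float]:
    # Intermediate data is saved and later restored, so only the preempting
    # task occupies the scratchpad, at the cost of a save+restore delay.
    delay_ms = 2 * preempted.intermediate_kb / transfer_kb_per_ms
    return preempting.footprint_kb, delay_ms

low = Task("low_prio", footprint_kb=96, intermediate_kb=48)
high = Task("high_prio", footprint_kb=80, intermediate_kb=0)

print("keep-resident demand (KB):", keep_resident_demand(low, high))
mem, delay = flush_demand(low, high, transfer_kb_per_ms=32.0)
print(f"flush demand (KB): {mem}, extra delay: {delay:.1f} ms")
```

A hybrid method in the spirit of the abstract would choose, per preemption point, whichever strategy fits the available scratchpad while bounding the added delay.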
Performance and scaling of biomolecular simulation frameworks depend largely not only on the workload characteristics of the simulations but also on the design of the underlying processor architecture and interconnection networks. Because construction of Teraflops- and Petaflops-scale prototype systems for evaluation alone is impractical and cost-prohibitive, architects use analytical models of workloads and architecture simulators to guide their design decisions and tradeoffs. To address the problem of providing scalable yet precise input for network simulators, we have developed a technique to model symbolically the communication patterns of production-level scientific applications, to study workload growth rates, and to carry out sensitivity analysis. We apply our symbolic modeling scheme to the particle mesh Ewald (PME) implementation in the sander package of the AMBER framework and demonstrate how the increase in computation, memory, and communication requirements impacts the performance and scaling of the PME method on next-generation massively parallel systems.
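As a flavor of what symbolic workload modeling looks like, the sketch below uses sympy to express per-process computation and communication terms for a generic PME-style workload and to examine their growth rates. The scaling expressions are textbook-style approximations assumed purely for illustration; they are not the paper's validated model of the sander package.

```python
# Symbolic workload-model sketch (illustrative assumptions, not the paper's
# model): per-process 3D-FFT work ~ (N/P) * log2(N) and an all-to-all
# transpose whose per-process volume shrinks as ~ N/P.

import sympy as sp

N, P = sp.symbols("N P", positive=True)  # atoms, processes

work_per_proc = (N / P) * sp.log(N, 2)   # assumed computation term
bytes_per_proc = 8 * N / P               # assumed communication volume
msgs_per_proc = P - 1                    # naive all-to-all message count

# Workload growth rate with respect to problem size.
print("d(work)/dN =", sp.simplify(sp.diff(work_per_proc, N)))

# Simple sensitivity analysis: evaluate the terms at concrete scales.
for procs in (64, 512, 4096):
    vol = float(bytes_per_proc.subs({N: 10**6, P: procs}))
    print(f"P={procs}: ~{vol:.0f} bytes/process, "
          f"{msgs_per_proc.subs(P, procs)} messages/process")
```

Feeding closed-form expressions like these to a network simulator lets it scale inputs to machine sizes that would be impractical to trace directly, which is the motivation the abstract describes.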
ISBN (digital): 9798331532833
ISBN (print): 9798331532840
In this paper, we focus on the monomial prediction problem in two settings: (1) decide whether a particular monomial $m$ is present in a composite function $f := f_r \circ f_{r-1} \circ \cdots \circ f_0$, where the $f_i$ are quadratic Boolean functions; (2) decide whether a particular monomial $m$ is present in a composite function $f := f_r \circ f_{r-1} \circ \cdots \circ f_0$, where the polynomials $f_i$ are efficiently computable by probabilistic generating circuits over the rationals. Probabilistic generating circuits (PGCs) are economical representations of multivariate probability generating polynomials (PGPs), which capture many tractable probabilistic models in machine learning. The first problem has a strong connection with the security of symmetric-key primitives. Dinur and Shamir proposed the cube attack for distinguishing a cryptographic primitive from a random function, which can be thought of as an efficient monomial prediction. In a general setting, over any large finite field or over the integers, monomial prediction is known to be NP-hard. Here, we show that in the quadratic setting, the problem is $\oplus\mathbf{P}$-complete. $\oplus\mathbf{P}$ is an interesting complexity class that is not known to contain NP; however, it is believed to contain computationally hard problems. On the other hand, we also present several new zero-sum distinguishers for 5-round Ascon, which is one of the ten finalists of the NIST Lightweight Cryptography standardization competition. We show that the second problem is $\#\mathbf{P}$-complete. It is known that PGCs admit efficient inference, i.e., given a monomial, one can efficiently output its coefficient (which signifies a probability) in the polynomial computed by the circuit. However, a composition of such functions makes inference hard. Composition of probabilistic models and their efficient inference play a crucial role in the semantic contextualization and in frameworks of uncertainty theories in graphical modelling.
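For intuition on setting (1), the brute-force sketch below decides monomial presence in a composition of quadratic Boolean functions using the standard Möbius relation: the ANF coefficient of $x^u$ equals the XOR of $f(x)$ over all inputs $x$ covered by $u$. It runs in exponential time and uses invented toy round functions, so it only illustrates the problem statement, not the paper's techniques.

```python
# Brute-force illustration (exponential, tiny n only) of monomial
# prediction for a composition of quadratic Boolean functions.
# Moebius relation: coeff of x^u in the ANF = XOR of f(x) over all x <= u.
# The concrete round functions are made-up toy examples.

from itertools import product

def compose(fs, x):
    # Apply f_0 first, then f_1, ..., matching f = f_r o ... o f_0.
    for f in fs:
        x = f(x)
    return x

def monomial_present(fs, out_bit, u, n):
    """True iff the monomial with support u appears in bit `out_bit`."""
    coeff = 0
    for x in product((0, 1), repeat=n):
        if all(xi <= ui for xi, ui in zip(x, u)):  # x covered by u
            coeff ^= compose(fs, x)[out_bit]
    return coeff == 1

# Toy quadratic round functions on 3 bits (assumed, for illustration).
f0 = lambda x: ((x[0] & x[1]) ^ x[2], x[0] ^ x[1], x[1] & x[2])
f1 = lambda x: (x[0] ^ (x[1] & x[2]), x[1] ^ x[2], x[0] & x[1])

# Is the monomial x0*x1 present in output bit 0 of f1 o f0?
print(monomial_present([f0, f1], out_bit=0, u=(1, 1, 0), n=3))
```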
Multi-word Relevant Expressions (REs) can be defined as sequences of words (n-grams) with strong semantic meaning, such as "ice melting" and "Ministère des Affaires Étrangères", useful in Information Retrieval, Document Clustering, and Classification and Indexing of Documents. The need to extract REs in several languages has led research toward statistical approaches rather than symbolic methods, since the former allow language independence. Based on the assumption that REs have strong cohesion between their consecutive n-grams, the LocalMaxs algorithm is a language-independent approach that extracts REs. Despite its good precision, this extractor is time-consuming, being inoperable for Big Data if implemented in a sequential manner. This paper presents the first parallel and distributed version of this algorithm, achieving almost linear speedup and sizeup when processing corpora of up to 1 billion words, using up to 54 virtual machines in a public cloud. This parallel version of the algorithm exploits statistical knowledge of the n-grams in the corpus to promote locality of reference.
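For readers unfamiliar with LocalMaxs, the sequential toy sketch below illustrates its selection criterion with an SCP-style glue (cohesion) measure: an n-gram is kept when its glue is a local maximum relative to the (n-1)-grams it contains and the (n+1)-grams that contain it. The corpus and the exact glue formula are illustrative assumptions; the paper's parallel and distributed design is not reproduced here.

```python
# Sequential toy sketch of the LocalMaxs selection criterion.
# Glue here is an SCP-like cohesion measure over raw n-gram frequencies
# (an illustrative assumption, not necessarily the paper's exact choice).

from collections import Counter

def ngrams(tokens, n):
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def build_counts(tokens, max_n):
    return {n: Counter(ngrams(tokens, n)) for n in range(1, max_n + 1)}

def glue(counts, w):
    # SCP-style: freq(w)^2 averaged over all binary splits of w.
    f = counts[len(w)][w]
    splits = [(w[:i], w[i:]) for i in range(1, len(w))]
    denom = sum(counts[len(a)][a] * counts[len(b)][b] for a, b in splits)
    return (len(splits) * f * f) / denom if denom else 0.0

def local_maxs(tokens, max_n=4):
    counts = build_counts(tokens, max_n + 1)
    res = []
    for n in range(2, max_n + 1):
        for w in set(ngrams(tokens, n)):
            g = glue(counts, w)
            # Best glue among contained (n-1)-grams ...
            sub = max(glue(counts, s) for s in (w[:-1], w[1:]))
            # ... and among containing (n+1)-grams.
            sup = max((glue(counts, s) for s in counts[n + 1]
                       if s[:-1] == w or s[1:] == w), default=0.0)
            if (n == 2 or g >= sub) and g > sup:
                res.append(w)
    return res

corpus = ("ice melting accelerates as ice melting continues "
          "near the ice shelf").split()
print(local_maxs(corpus))
```

The expensive part is exactly the repeated n-gram counting and neighbor lookups above, which is what motivates the distributed, locality-aware version the paper presents.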