SR is a language for programming distributed systems ranging from operating systems to application programs. On the basis of our experience with the initial version, the language has evolved considerably. In this paper we describe the current version of SR and give an overview of its implementation. The main language constructs are still resources and operations. Resources encapsulate processes and the variables that they share; operations provide the primary mechanism for process interaction. One way in which SR has changed is that both resources and processes are now created dynamically. Another change is that inheritance is supported. A third change is that the mechanisms for operation invocation (call and send) and operation implementation (proc and in) have been extended and integrated. Consequently, local and remote procedure call, rendezvous, dynamic process creation, asynchronous message passing, multicast, and semaphores are all supported. We have found this flexibility to be very useful for distributed programming. Moreover, by basing SR on a small number of well-integrated concepts, the language has proved easy to learn and use, and it has a reasonably efficient implementation.
Apache Spark is a Big Data framework for working on large distributed datasets. Although widely used in industry, its use remains rather limited in the academic community and is often restricted to software engineers. The goal of this paper is to show with practical use cases that the technology is mature enough to be used without excessive programming skills by astronomers or cosmologists in order to perform standard analyses over large datasets, such as those originating from future galaxy surveys. To demonstrate this, we start from a realistic simulation corresponding to 10 years of LSST data taking (6 billion galaxies). Then, we design, optimize and benchmark a set of Spark Python algorithms in order to perform standard operations such as adding photometric redshift errors, measuring the selection function or computing power spectra over tomographic bins. Most of the commands execute on the full 110 GB dataset within tens of seconds and can therefore be performed interactively in order to design full-scale cosmological analyses. A Jupyter notebook summarizing the analysis is available at https://***/astrolabsoftware/1807.03078. (C) 2019 Elsevier B.V. All rights reserved.
ISBN: (print) 9780897911542
This paper presents an overview of GCP (Guarded Communicating Processes), a language for programming distributed applications. GCP has been defined by deriving its control mechanisms from Hoare's CSP (with new communication primitives and a new distributed termination convention) and embedding them in a fully defined concurrent programming language. Besides an easily retargetable compiler, the GCP environment consists of a configurator/distributor, which distributes and activates the processes constituting an application, and run-time support.