ISBN: (Print) 9781450382175
We introduce Xtract, an automated and scalable system for bulk metadata extraction from large, distributed research data repositories. Xtract orchestrates the application of metadata extractors to groups of files, determining which extractors to apply to each file and, for each extractor and file, where to execute. A hybrid computing model, built on the funcX federated FaaS platform, enables Xtract to balance tradeoffs between extraction time and data transfer costs by dispatching each extraction task to the most appropriate location. Experiments on a range of clouds and supercomputers show that Xtract can efficiently process multi-million-file repositories by orchestrating the concurrent execution of container-based extractors on thousands of nodes. We highlight the flexibility of Xtract by applying it to a large, semi-curated scientific data repository and to an uncurated scientific Google Drive repository. We show that by remotely orchestrating metadata extraction across decentralized storage and compute nodes, Xtract can process large repositories in 50% of the time it takes just to transfer the same data to a machine within the same computing facility. We also show that when transferring data is necessary (e.g., no local compute is available), Xtract can scale to process files as fast as they are received, even over a multi-GB/s network.
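To make the placement trade-off concrete, the sketch below estimates, for a single file and extractor, whether it is cheaper to run the extractor where the data lives or to ship the data to a faster remote endpoint. The Endpoint fields and cost model are illustrative assumptions, not Xtract's actual interface.

```python
# Hypothetical sketch of the dispatch decision described above: for each
# (file, extractor) task, estimate transfer cost vs. remote compute speed and
# send the task wherever the total time is lowest. Names are illustrative.
from dataclasses import dataclass

@dataclass
class Endpoint:
    name: str
    bandwidth_mb_s: float   # effective transfer rate from the data's home site, in MB/s
    relative_speed: float   # extraction throughput of this endpoint vs. a baseline
    data_is_local: bool     # True if the file already resides at this endpoint

def estimate_total_s(file_size_mb: float, base_extract_s: float, ep: Endpoint) -> float:
    """Estimated end-to-end time: optional data transfer plus extraction at the endpoint."""
    transfer_s = 0.0 if ep.data_is_local else file_size_mb / ep.bandwidth_mb_s
    return transfer_s + base_extract_s / ep.relative_speed

def choose_endpoint(file_size_mb: float, base_extract_s: float, endpoints: list) -> Endpoint:
    """Dispatch the task to the endpoint with the lowest estimated completion time."""
    return min(endpoints, key=lambda ep: estimate_total_s(file_size_mb, base_extract_s, ep))

endpoints = [
    Endpoint("storage-site", bandwidth_mb_s=1.0,   relative_speed=1.0, data_is_local=True),
    Endpoint("remote-hpc",   bandwidth_mb_s=500.0, relative_speed=8.0, data_is_local=False),
]
best = choose_endpoint(file_size_mb=2048.0, base_extract_s=40.0, endpoints=endpoints)
print("dispatch to:", best.name)   # here the faster remote endpoint wins despite the transfer
```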
The MYRTUS Horizon Europe project embraces the principles of the EU CloudEdgeIoT Initiative, integrating edge, fog, and cloud in a continuum of computing resources. MYRTUS intends to deliver abstractions, cognitive or...
ISBN: (Print) 9781450385626
Deep neural networks (DNNs) have become the leading technology for realizing Artificial Intelligence (AI). As DNN models grow larger and more complex, so do the datasets they are trained on, and the ability to train DNNs efficiently in parallel has become a crucial need. Data Parallelism (DP) is the most widely used approach to accelerating DNN training today, but it can be inefficient for DNNs with large parameter sets. Hybrid Parallelism (HP), which applies different parallel strategies to different parts of a DNN, is more efficient but requires advanced configuration. Not all AI researchers are experts in parallel computing, so automating the configuration of HP strategies is highly desirable for AI frameworks. We propose a parallel semantics analysis method that analyzes the trade-offs among different kinds of parallelism and systematically chooses HP strategies with good training-time performance. Experimentally, we demonstrate a 260% speedup when applying our method compared to a conventional DP approach. With our proposal, AI researchers can focus on AI algorithm research without being burdened by parallel analysis and engineering concerns.
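As a rough illustration of the kind of per-layer trade-off such an analysis weighs, the sketch below compares an estimate of data-parallel communication (gradient all-reduce, proportional to parameter count) against model-parallel communication (activation exchange, proportional to activations times batch size). The layer table and cost model are assumptions for illustration, not the paper's actual method.

```python
# Toy per-layer strategy chooser: pick DP where gradients are cheap to
# all-reduce, MP where activations are cheaper to exchange than gradients.
from dataclasses import dataclass

@dataclass
class Layer:
    name: str
    params: int        # number of trainable parameters
    activations: int   # activation elements produced per sample

def choose_strategy(layer: Layer, batch_size: int, world_size: int) -> str:
    """Return 'DP' or 'MP' based on a simple per-layer communication estimate."""
    # Ring all-reduce moves ~2*(w-1)/w of the gradient volume per step.
    dp_comm = 2 * (world_size - 1) / world_size * layer.params
    # Model parallelism exchanges partial activations for every sample in the batch.
    mp_comm = layer.activations * batch_size
    return "DP" if dp_comm <= mp_comm else "MP"

layers = [
    Layer("conv1", params=9_408,     activations=802_816),
    Layer("fc",    params=2_048_000, activations=1_000),
]
plan = {l.name: choose_strategy(l, batch_size=64, world_size=8) for l in layers}
print(plan)   # {'conv1': 'DP', 'fc': 'MP'}
```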
ISBN: (Print) 9781450393225
Traditional computing systems based on von Neumann architectures are fundamentally bottlenecked by the transfer speeds between memory and processor. With the growing computational needs of today's application space, dominated by Machine Learning (ML) workloads, there is a need to design special-purpose computing systems that operate on the principle of co-located memory and processing units. Such an approach, commonly known as 'in-memory computing', can potentially eliminate expensive data-movement costs by computing inside the memory array itself. To that end, crossbars based on resistive-switching Non-Volatile Memory (NVM) devices have shown immense promise as the building blocks of in-memory computing systems, as their high storage density can overcome the scaling challenges that plague CMOS technology today. In addition, the ability of resistive crossbars to accelerate the main computational kernel of ML workloads by performing massively parallel, in-situ matrix-vector multiplication (MVM) operations makes them a promising candidate for building area- and energy-efficient systems. However, the analog nature of computation in resistive crossbars introduces approximations in MVM results due to device- and circuit-level non-idealities. Further, analog systems impose costly peripheral-circuit requirements for conversions between the analog and digital domains. Thus, there is a need to understand the entire system design stack, from device characteristics to architectures, and to perform effective hardware-software co-design to truly realize the potential of resistive crossbars as future computing systems. In this talk, we will present a comprehensive overview of NVM crossbars for accelerating ML workloads. We describe, in detail, the design principles of the basic building blocks, such as the devices and associated circuits, that constitute the crossbars. We explore non-idealities arising from device characteristics and circuit behavior and study their impact on
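The approximate nature of crossbar MVM can be illustrated with a small NumPy sketch: weights are quantized to a finite number of conductance levels and the analog output is perturbed by read noise, so the result only approximates the ideal product. The device parameters (number of levels, noise magnitude) are assumed values for illustration, not measurements from any particular technology.

```python
# Toy model of in-situ MVM on a resistive crossbar: quantized conductances
# plus Gaussian read noise yield an approximate matrix-vector product.
import numpy as np

rng = np.random.default_rng(0)

def crossbar_mvm(W, x, levels=16, noise_std=0.02):
    """Approximate y = W @ x with conductance quantization and read noise."""
    w_max = np.abs(W).max()
    step = 2 * w_max / (levels - 1)
    W_q = np.round(W / step) * step                              # map weights to discrete conductance levels
    y = W_q @ x                                                  # analog dot product along each column
    return y + rng.normal(0.0, noise_std * np.abs(y).max(), y.shape)  # read noise at the output

W = rng.standard_normal((64, 128))
x = rng.standard_normal(128)
err = np.linalg.norm(crossbar_mvm(W, x) - W @ x) / np.linalg.norm(W @ x)
print(f"relative MVM error from non-idealities: {err:.3f}")
```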
ISBN: (Print) 9781665448956
The proceedings contain 95 papers. The topics discussed include: towards a more efficient approach for the satisfiability of two-variable logic; demonic lattices and semilattices in relational semigroups with ordinary composition; on logics and homomorphism closure; universal Skolem sets; evidenced frames: a unifying framework broadening realizability models; behavioral preorders via graded monads; the undecidability of system F typability and type checking for reductionists; complexity lower bounds from algorithm design; on the logical structure of choice and bar induction principles; compositional relational reasoning via operational game semantics; continuous one-counter automata; and in search of lost time: axiomatizing parallel composition in process algebras.
ISBN: (Print) 9781450386388
Function-as-a-Service (FaaS) has become an increasingly popular way for users to deploy their applications without the burden of managing the underlying infrastructure. However, existing FaaS platforms rely on remote storage to maintain state, limiting the set of applications that can be run efficiently. Recent caching work for FaaS platforms has tried to address this problem, but has fallen short: it disregards the widely different characteristics of FaaS applications, does not scale the cache based on data access patterns, or requires changes to applications. To address these limitations, we present Faa$T, a transparent auto-scaling distributed cache for serverless applications. Each application gets its own cache. After a function executes and the application becomes inactive, the cache is unloaded from memory along with the application. Upon reloading for the next invocation, Faa$T pre-warms the cache with the objects most likely to be accessed. In addition to traditional compute-based scaling, Faa$T scales based on working-set and object sizes to manage cache space and I/O bandwidth. We motivate our design with a comprehensive study of data access patterns on Azure Functions. We implement Faa$T for Azure Functions and show that Faa$T can improve performance by up to 92% (57% on average) for challenging applications, and reduce cost for most users compared to state-of-the-art caching systems, namely the cost of having to stand up additional serverful resources.
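A minimal sketch of the cache lifecycle described above, assuming hypothetical class and method names rather than Faa$T's actual API: the cache is dropped when the application goes idle, access statistics are retained, and on reload the hottest objects that fit the current capacity are pre-warmed.

```python
# Illustrative per-application cache with unload/pre-warm behavior.
from collections import Counter

class AppCache:
    def __init__(self, capacity_bytes):
        self.capacity = capacity_bytes
        self.data = {}                  # key -> cached object bytes
        self.access_counts = Counter()  # key -> access frequency, kept across unloads

    def get(self, key, fetch_remote):
        """Serve from the local cache if present, otherwise fetch from remote storage."""
        self.access_counts[key] += 1
        if key not in self.data:
            self.data[key] = fetch_remote(key)
        return self.data[key]

    def unload(self):
        """The application became idle: drop cached bytes, keep only access statistics."""
        self.data.clear()
        return self.access_counts

    def prewarm(self, stats, sizes, fetch_remote):
        """On reload, fetch the hottest objects that still fit in the current capacity."""
        used = 0
        for key, _ in stats.most_common():
            size = sizes.get(key, 0)
            if used + size > self.capacity:
                continue
            self.data[key] = fetch_remote(key)
            used += size
```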
ISBN: (Print) 9781450375825
A popular programming technique that contributes to designing provably correct distributed applications is to use shared objects for interprocess communication, instead of lower-level techniques. Although shared objects are a convenient abstraction, they are not generally provided in large-scale distributed systems; instead, the processes keep individual copies of the data and communicate by sending messages to keep the copies consistent. Traditional distributed computing considers a static system, with known bounds on the number of fixed computing nodes and the number of possible failures. Dynamic distributed systems allow nodes to enter and leave the system at will, whether due to failures and recoveries, movement in the physical world, or changes to the system's composition. Motivating applications include those in peer-to-peer, sensor, mobile, and social networks, as well as server farms.
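As a concrete (if simplified) example of emulating a shared object over per-process copies, the sketch below implements a read/write register in the style of quorum-based constructions such as ABD. In-process replica objects stand in for message exchanges, and the code assumes a static set of replicas, which is exactly the assumption that dynamic systems relax.

```python
# Simplified quorum-based read/write register over per-replica copies.
class Replica:
    def __init__(self):
        self.timestamp = 0
        self.value = None

    def read(self):
        return self.timestamp, self.value

    def write(self, timestamp, value):
        if timestamp > self.timestamp:          # keep only the newest write
            self.timestamp, self.value = timestamp, value

class QuorumRegister:
    def __init__(self, replicas):
        self.replicas = replicas
        self.quorum = len(replicas) // 2 + 1    # any two majorities intersect

    def write(self, value):
        # Phase 1: learn the highest timestamp from a majority of replicas.
        ts = max(r.read()[0] for r in self.replicas[: self.quorum])
        # Phase 2: store the new value with a larger timestamp at a majority.
        for r in self.replicas[: self.quorum]:
            r.write(ts + 1, value)

    def read(self):
        ts, val = max((r.read() for r in self.replicas[: self.quorum]),
                      key=lambda tv: tv[0])
        # Write-back so later readers cannot observe an older value.
        for r in self.replicas[: self.quorum]:
            r.write(ts, val)
        return val

reg = QuorumRegister([Replica() for _ in range(5)])
reg.write("x")
print(reg.read())   # -> x
```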
ISBN: (Print) 9781450342278
The proceedings contain 10 papers. The topics discussed include: computation offloading from mobile devices: can edge devices perform better than the cloud?; performance of approximate causal consistency for partially-replicated systems; modeling the scalability of real-time online interactive applications on clouds; the impact on the performance of co-running virtual machines in a virtualized environment; a gossip-based dynamic virtual machine consolidation strategy for large-scale cloud data centers; data management of sensor signals for high bandwidth data streaming to the cloud; cloud elasticity: going beyond demand as user load; and cloud live streaming system based on auto-adaptive overlay for cyber physical infrastructure.
ISBN: (Print) 9781450384605
Today's Internet is heavily used for multimedia streaming from cloud backends, while the Internet of Things (IoT) challenges the traditional data flow, with high data volumes produced at the network edge. Information Centric Networking (ICN) moves away from a host-centric communication model by using content identifiers that decouple content from its location, and is therefore promising for distributed edge computing environments. However, the resulting coupling of data to content identifiers in ICN introduces new challenges regarding the dissemination of large data volumes and services, and their synchronization across multiple consumers. We present Tangle Centric Networking (TCN), a decentralized data structure for coordinated distribution of data and services in ICN deployments. TCN simplifies the management of data and service changes and updates them accordingly across network nodes, using the principles of Tangles. In simulations, a first implementation of TCN shows improved data discovery and lower synchronization overhead for large data volumes compared to a state-of-the-art ICN system.
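To illustrate the Tangle principle behind TCN, the sketch below keeps a small DAG in which each new record approves the current tips, so consumers can discover and synchronize updates by walking the graph. Record fields and function names are assumptions for illustration, not TCN's actual format.

```python
# Minimal Tangle-style DAG: every appended record approves the current tips.
import hashlib
from dataclasses import dataclass

@dataclass
class Record:
    content_id: str            # ICN-style content identifier
    payload: bytes
    parents: tuple             # digests of the records this one approves

    @property
    def digest(self) -> str:
        h = hashlib.sha256()
        h.update(self.content_id.encode())
        h.update(self.payload)
        for p in self.parents:
            h.update(p.encode())
        return h.hexdigest()

class Tangle:
    def __init__(self):
        self.records = {}          # digest -> Record
        self.tips = set()          # records not yet approved by any other record

    def append(self, content_id, payload):
        parents = tuple(sorted(self.tips))
        rec = Record(content_id, payload, parents)
        self.records[rec.digest] = rec
        self.tips -= set(parents)  # the parents are now approved
        self.tips.add(rec.digest)
        return rec.digest

tangle = Tangle()
tangle.append("/sensor/42/v1", b"first reading")
tangle.append("/sensor/42/v2", b"updated reading")
print(len(tangle.records), "records,", len(tangle.tips), "tip(s)")  # 2 records, 1 tip(s)
```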