ISBN (print): 9781538678800
In general, one of the complexities of large simulations stems from the heterogeneous computational resources needed to execute them. The definition of workflows, usually tied to concrete orchestration solutions, has removed much of that complexity. However, these solutions are either oriented to High Performance Computing (HPC) or only deal with remotely managed services. This paper presents a novel solution for running simulations on a hybrid HPC and Cloud infrastructure, exploiting the performance and power of HPC systems while benefiting from the fast and flexible provisioning of Cloud resources. We present our vision of typical simulation workflows and the kind of computational resources that fit best in each phase. In line with that vision, we describe the research done to enable the definition of such workflows by extending the TOSCA standard (originally focused on Cloud solutions), which is used by our orchestrator and other solutions. We propose several extensions (types, relationships, compute properties and job properties), compatible with the standard definition, so that Cloud and HPC tasks can be processed as expected. The paper also presents a use case implemented with the proposed approach, highlighting some of the benefits found so far.
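As a rough illustration of what such an extension could look like, the sketch below writes a hypothetical extended node template as a Python dictionary mirroring the TOSCA YAML structure; all type and property names (hpc.nodes.Job, job_options, slurm_cluster) are assumptions for illustration, not the names defined in the paper.

    # Hypothetical extended TOSCA template mixing an HPC batch job and a Cloud VM,
    # written as a Python dict that mirrors the YAML structure.
    # All extension names below are illustrative assumptions.
    hybrid_template = {
        "node_templates": {
            "preprocessing_job": {
                "type": "hpc.nodes.Job",            # assumed HPC job type extension
                "properties": {
                    "job_options": {                 # assumed "job properties" extension
                        "partition": "batch",
                        "nodes": 4,
                        "tasks_per_node": 24,
                        "max_time": "02:00:00",
                    },
                },
                "requirements": [
                    # assumed relationship binding the job to an HPC cluster resource
                    {"host": "slurm_cluster"},
                ],
            },
            "postprocessing_vm": {
                "type": "tosca.nodes.Compute",       # standard TOSCA Cloud compute node
                "capabilities": {
                    "host": {"properties": {"num_cpus": 8, "mem_size": "16 GB"}},
                },
            },
        },
    }

The point of the sketch is only that HPC-specific job properties can sit alongside standard Cloud compute nodes in one template, which is what lets a single orchestrator drive both kinds of tasks.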
It is common practice to use large computational resources to train neural networks, as is known from many examples such as reinforcement learning applications. However, while massively parallel computing is often used for...
This paper aims at high and portable performance for tensor computations across spatial (e.g., FPGAs) and vector architectures (e.g., GPUs). The state-of-the-art usually addresses performance portability across vector a...
We argue that the current heterogeneous computing environment mimics a complex nonlinear system, which needs to borrow the concepts of time-scale separation and the delayed difference approach from statistical mechanics and nonlinear dynamics. We show that by replacing the usual difference equations approach with a delayed difference equations approach, the sequential fraction of many scientific computing algorithms can be substantially reduced. We also provide a comprehensive theoretical analysis to establish that the error and stability of our scheme are of the same order as those of existing schemes for a large, well-characterized class of problems.
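As a toy illustration of the general idea (not the paper's actual scheme), the Python sketch below contrasts a standard explicit update, whose steps form a strictly sequential chain, with a delayed variant in which the expensive right-hand-side evaluations depend only on states several steps old; the model problem and the delay of 4 are arbitrary assumptions.

    def f(x):
        # example right-hand side of a simple relaxation problem dx/dt = -x
        return -x

    def standard_steps(x0, dt, n_steps):
        # usual explicit update: each step depends on the previous one,
        # so the whole loop is inherently sequential
        x = x0
        for _ in range(n_steps):
            x = x + dt * f(x)
        return x

    def delayed_steps(x0, dt, n_steps, delay=4):
        # toy delayed-difference update: f is evaluated at the state from
        # `delay` steps ago, so a block of `delay` expensive f-evaluations
        # reads only already-known values and could run concurrently,
        # leaving just the cheap accumulation sequential
        hist = [x0] * (delay + 1)
        for _ in range(n_steps):
            hist.append(hist[-1] + dt * f(hist[-1 - delay]))
        return hist[-1]

    print(standard_steps(1.0, 0.01, 1000))
    print(delayed_steps(1.0, 0.01, 1000))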
We show that for the lattice Boltzmann model, the existing paradigm in computer science for the choice of the data structure is suboptimal. In this paper we use the requirements of physical symmetry necessary for recovering hydrodynamics in the lattice Boltzmann description to propose a hybrid data layout for the method. This hybrid data structure, which we call a structure of an array of structures, is shown to be optimal for the lattice Boltzmann model. Finally, the possible advantages of establishing a connection between group-theoretic symmetry requirements and the construction of the data structure are discussed in the broader context of grid-based methods.
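The Python/NumPy sketch below contrasts the two conventional layouts with one possible hybrid arrangement of the kind the abstract describes; the D2Q9 lattice, grid size, and block size of 8 are illustrative assumptions, not the parameters analyzed in the paper.

    import numpy as np

    # Toy illustration of three memory layouts for D2Q9 lattice Boltzmann
    # populations on an nx x ny grid (9 velocities per cell).
    nx, ny, q = 256, 256, 9
    block = 8  # cells grouped per "structure" along x (assumed value)

    # Array of Structures (AoS): all 9 populations of a cell are contiguous.
    aos = np.zeros((nx, ny, q))

    # Structure of Arrays (SoA): each population stored as its own full array.
    soa = np.zeros((q, nx, ny))

    # Hybrid layout: populations are contiguous over a small block of cells,
    # and those blocks are then laid out per velocity, balancing spatial
    # locality (AoS-like) with streaming access (SoA-like).
    hybrid = np.zeros((nx // block, ny, q, block))

    def hybrid_value(ix, iy, k):
        # map cell (ix, iy) and velocity k into the hybrid layout
        return hybrid[ix // block, iy, k, ix % block]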
Multimedia data, especially image and video data, have recently become one of the most prevalent data types on the Internet. Considering the user experience and real application requirements, multimedia data always demand real-time processing speed. As a result, the huge amount of such data makes retrieving useful information from them not only data-intensive but also computation-intensive, which poses significant challenges to current system and architecture designs. Unfortunately, most prior studies focus only on text-based retrieval systems or traditional multimedia processing applications. As far as we know, there is no systematic study analyzing the characteristics of multimedia retrieval applications and how they might impact system and architecture designs. In this paper, we make the first attempt to construct a multimedia retrieval benchmark suite (called MMR Bench) to evaluate the corresponding system and architecture designs. To represent diverse multimedia retrieval applications, we collect eight state-of-the-art multimedia retrieval algorithms that cover all retrieval stages, including feature extraction, feature matching, and spatial verification. To satisfy diverse evaluation purposes, we implement multiple versions of each algorithm: a sequential version, a pthread version for multi-core evaluation, and a data-parallel (i.e., MapReduce) version for data-center evaluation. Moreover, MMR Bench provides flexible interfaces across retrieval stages, as well as a tool to adjust parameters and regenerate reasonable inputs at different scales. With such a flexible design, the algorithms in MMR Bench are suitable not only for individual kernel-level evaluation but can also be integrated into a complete infrastructure for system-level evaluation. Based on MMR Bench, we further analyze inherent architectural characteristics, such as input size sensitivity and workload balance, which provides some insights into system and architecture designs.
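A minimal sketch of the kind of staged, data-parallel retrieval pipeline the suite targets (feature extraction followed by feature matching), written here in Python with multiprocessing as a stand-in for the map-reduce version; all function names and the toy feature representation are assumptions, not MMR Bench's actual interfaces.

    from functools import reduce
    from multiprocessing import Pool

    def extract_features(image):
        # stand-in for a feature-extraction kernel; real kernels would produce
        # many local descriptors (e.g. SIFT-like features) per image
        return {"id": image["id"], "features": [sum(image["data"].encode()) % 1000]}

    def match_features(query_features, item):
        # stand-in for a feature-matching kernel: count shared descriptors
        return (item["id"], len(set(query_features) & set(item["features"])))

    def retrieve(query, corpus, workers=4):
        # "map" phase: extract features from every corpus image in parallel
        with Pool(workers) as pool:
            indexed = pool.map(extract_features, corpus)
        # "reduce" phase: score every candidate and keep the best match
        q = extract_features(query)["features"]
        scored = [match_features(q, item) for item in indexed]
        return reduce(lambda best, cur: cur if cur[1] > best[1] else best, scored)

    if __name__ == "__main__":
        corpus = [{"id": i, "data": "img%d" % i} for i in range(8)]
        print(retrieve({"id": -1, "data": "img3"}, corpus))  # best match: image 3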
Even the most sophisticated artificial neural networks are built by aggregating substantially identical units called neurons. A neuron receives multiple signals, internally combines them, and applies a non-linear func...
ISBN (print): 9781939133175
Data flow analysis (e.g., dynamic taint analysis) has proven to be useful for guiding fuzzers to explore hard-to-reach code and find vulnerabilities. However, traditional taint analysis is labor-intensive, inaccurate, and slow, affecting fuzzing efficiency. Apart from taint, few data flow features are utilized. In this paper, we propose a data flow sensitive fuzzing solution, GREYONE. We first utilize the classic feature, taint, to guide fuzzing. A lightweight and sound fuzzing-driven taint inference (FTI) is adopted to infer the taint of variables by monitoring their value changes while mutating input bytes during fuzzing. With the taint, we propose a novel input prioritization model to determine which branch to explore, which bytes to mutate, and how to mutate them. Further, we use another data flow feature, constraint conformance, i.e., the distance of tainted variables to the values expected in untouched branches, to tune the evolution direction of fuzzing. We implemented a prototype of GREYONE and evaluated it on the LAVA data set and 19 real-world programs. The results show that it outperforms various state-of-the-art fuzzers in terms of both code coverage and vulnerability discovery. On the LAVA data set, GREYONE found all listed bugs and 336 more unlisted ones. In real-world programs, GREYONE on average found 2.12X more unique program paths and 3.09X more unique bugs than state-of-the-art evolutionary fuzzers, including AFL, VUzzer, CollAFL, Angora, and Honggfuzz. Moreover, GREYONE on average found 1.2X more unique program paths and 1.52X more unique bugs than QSYM, a state-of-the-art symbolic execution assisted fuzzer. In total, it found 105 new security bugs, of which 41 are confirmed by CVE.
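A minimal Python sketch of the fuzzing-driven taint inference rule described above: flip one input byte at a time, re-run the target, and infer that a variable is tainted by that byte if its observed value changes. The toy_target function and the dictionary of observed values are assumptions for illustration, not GREYONE's actual instrumentation.

    def toy_target(data: bytes):
        # toy "instrumented target": returns the values of variables the fuzzer
        # would observe at branch conditions during execution
        return {"magic": data[0] ^ data[1], "length_field": data[2]}

    def infer_taint(target, seed: bytes):
        # a variable is inferred as tainted by an input byte if its observed
        # value changes when that single byte is mutated
        baseline = target(seed)
        taint = {var: set() for var in baseline}      # var -> tainting byte offsets
        for offset in range(len(seed)):
            mutated = bytearray(seed)
            mutated[offset] ^= 0xFF                   # flip one input byte
            for var, value in target(bytes(mutated)).items():
                if value != baseline[var]:
                    taint[var].add(offset)
        return taint

    print(infer_taint(toy_target, b"\x10\x20\x03\x00"))
    # -> {'magic': {0, 1}, 'length_field': {2}}

The inferred byte offsets are then what an input prioritization model, like the one the abstract describes, can use to decide which bytes to mutate for a given untouched branch.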
Federated learning (FL), as a safe distributed training mode, provides strong support for the edge intelligence of the Internet of Vehicles (IoV) to realize efficient collaborative control and safe data sharing. Howev...