ISBN (Print): 9783319936598; 9783319936581
With the growing interest in Big Data technologies, companies and organizations are devoting much effort to designing Big Data Analytics (BDA) applications that may increase their competitiveness or foster innovation. However, BDA design requires expertise and economic resources that may not always be available. To overcome this limit, the TOREADOR project has proposed a model-based BDAaaS (MBDAaaS) approach that automates the design of BDA applications, allowing users to focus on business cases without having to deal with the technical aspects of data storage and management. Although many platforms providing BDA services are available, most of them exploit ontologies only for data representation and not for describing the BDA computation itself. This paper describes how Semantic technologies meet MBDAaaS in the TOREADOR project.
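The core idea of describing the BDA computation itself, not just the data, can be sketched as a machine-readable model of a pipeline. This is a minimal illustration in the spirit of the approach; the step names, predicates, and vocabulary below are invented for the example and are not TOREADOR's actual ontology.

```python
# Sketch: a BDA pipeline described declaratively as (subject, predicate, object)
# triples, so the computation itself becomes queryable metadata.
# All names here are illustrative, not TOREADOR's real vocabulary.

def describe_pipeline(steps):
    """Turn an ordered list of (step, operation) pairs into triples."""
    triples = []
    for i, (step, op) in enumerate(steps):
        triples.append((step, "performs", op))
        if i > 0:
            triples.append((step, "follows", steps[i - 1][0]))
    return triples

def query(triples, predicate):
    """Return all (subject, object) pairs matching a predicate."""
    return [(s, o) for s, p, o in triples if p == predicate]

pipeline = [("ingest", "read_csv"), ("clean", "drop_nulls"), ("train", "kmeans")]
model = describe_pipeline(pipeline)
```

A tool consuming such a model could, for instance, derive the execution order from the `follows` edges without inspecting any implementation code.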
ISBN (Print): 9783319974903; 9783319974897
In this paper, producers and distributors in a supply chain need to agree on order quantities and schedules. Instead of the classical integrated production-distribution planning model, we propose a reformulation in the form of a centralized auction oriented toward maximizing individual benefits. Production costs and inventory costs are handled as parts of competitive offers/bids. We reformulate the inventory costs so that they can be interpreted as costs of reducing or increasing ordered quantities. Ordered distribution quantities are balanced in each period against offered production. We point to the similarity of the proposed model to a centralized double energy market for demands with shifting capabilities. This allows us to use a pricing model based on uplift minimization, originally developed for pool-based auctions. As a result, we obtain prices complemented with minimum uplifts that support competitive producers and distributors in following schedules that are efficient for the entire supply chain. This novel coordination mechanism is formulated as a set of two clearing and pricing mixed integer linear programming (MILP) problems. A simple example illustrates the approach.
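The clearing side of a centralized double auction can be illustrated with a toy merit-order matching routine. This is only a sketch of the general mechanism, not the paper's MILP formulation: it ignores periods, inventory costs, and uplifts, and the midpoint pricing rule is an assumption chosen for the example.

```python
# Toy double-auction clearing: cheapest supply offers are matched against the
# highest-priced demand bids until prices cross. Not the paper's MILP model.

def clear_double_auction(offers, bids):
    """offers/bids: non-empty lists of (quantity, price).
    Returns total matched quantity and a midpoint price of the last match."""
    offers = sorted(offers, key=lambda x: x[1])               # cheapest first
    bids = sorted(bids, key=lambda x: x[1], reverse=True)     # highest first
    matched, price = 0.0, None
    i = j = 0
    oq, op = offers[0]
    bq, bp = bids[0]
    while i < len(offers) and j < len(bids) and bp >= op:
        q = min(oq, bq)                 # trade as much as both sides allow
        matched += q
        price = (op + bp) / 2           # midpoint pricing (assumed rule)
        oq -= q
        bq -= q
        if oq == 0:
            i += 1
            if i < len(offers):
                oq, op = offers[i]
        if bq == 0:
            j += 1
            if j < len(bids):
                bq, bp = bids[j]
    return matched, price

matched, price = clear_double_auction([(10, 5), (10, 8)], [(8, 10), (8, 6)])
```

In this example the expensive second offer (price 8) is left unmatched once the remaining bid price drops below it, which is the basic crossing condition any clearing formulation must encode.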
ISBN (Print): 9781728108582
Important insights into many data science problems that are traditionally analyzed via statistical models can be obtained by re-formulating and evaluating them within a large-scale optimization framework. However, the theoretical underpinnings of the statistical model may shift the goal of the decision-space traversal from a traditional search for a single optimal solution to a traversal intended to yield a set of high-quality, independent solutions. We examine statistical frameworks with astronomically large decision spaces that translate to optimization problems but are challenging for standard optimization methodologies. We address these new challenges by designing a hybrid metaheuristic with specialized intensification and diversification protocols in the base search algorithm. Our algorithm is extended to the high-performance computing realm using the Stampede2 supercomputer, where we experimentally demonstrate its effectiveness in utilizing multiple processors to collaboratively hill climb, broadcast messages to one another about landscape characteristics, diversify across the solution landscape, and request aid in climbing particularly difficult peaks.
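The goal of collecting a set of distinct high-quality solutions, rather than one optimum, can be sketched with simple restart-based hill climbing on a multimodal toy landscape. This is a serial illustration of the search goal only; the paper's hybrid metaheuristic, its protocols, and its multi-processor collaboration are not reproduced here.

```python
# Sketch: hill climbing from diversified starting points on a multimodal
# landscape, keeping the set of distinct local optima found.

def hill_climb(f, x, neighbors, max_steps=1000):
    """Greedy ascent: move to the best improving neighbor until stuck."""
    for _ in range(max_steps):
        best = max(neighbors(x), key=f)
        if f(best) <= f(x):
            return x                     # local optimum reached
        x = best
    return x

def diverse_optima(f, starts, neighbors):
    """Restart from diversified start points; return distinct optima found."""
    return sorted({hill_climb(f, s, neighbors) for s in starts})

# Toy landscape on integers 0..99 with peaks at multiples of 25.
f = lambda x: -min(x % 25, 25 - x % 25)
neighbors = lambda x: [max(0, x - 1), min(99, x + 1)]
optima = diverse_optima(f, starts=[3, 30, 60, 90], neighbors=neighbors)
```

Note that the run from 90 gets capped at the boundary (99), a reminder that independent climbers can end on qualitatively different local optima, which is exactly the behavior a diversification protocol exploits.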
ISBN (Digital): 9781450384421
ISBN (Print): 9781665483902
In the last three years, the largest dense deep learning models have grown over 1000x to reach hundreds of billions of parameters, while GPU memory has only grown by 5x (16 GB to 80 GB). Therefore, the growth in model scale has been supported primarily through system innovations that allow large models to fit in the aggregate GPU memory of multiple GPUs. However, we are getting close to the GPU memory wall. It requires 800 NVIDIA V100 GPUs just to fit a trillion-parameter model for training, and such clusters are simply out of reach for most data scientists. In addition, training models at that scale requires complex combinations of parallelism techniques that put a big burden on data scientists to refactor their model. In this paper we present ZeRO-Infinity, a novel heterogeneous system technology that leverages GPU, CPU, and NVMe memory to allow for unprecedented model scale on limited resources without requiring model code refactoring. At the same time it achieves excellent training throughput and scalability, unencumbered by the limited CPU or NVMe bandwidth. ZeRO-Infinity can fit models with tens and even hundreds of trillions of parameters for training on current-generation GPU clusters. It can be used to fine-tune trillion-parameter models on a single NVIDIA DGX-2 node, making large models more accessible. In terms of training throughput and scalability, it sustains over 25 petaflops on 512 NVIDIA V100 GPUs (40% of peak), while also demonstrating superlinear scalability. An open-source implementation of ZeRO-Infinity is available through DeepSpeed (https://***/), a deep learning optimization library designed to make distributed training easy, efficient, and effective; DeepSpeed has been extensively adopted by the DL community.
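The central idea of a heterogeneous memory hierarchy can be sketched as a placement problem: put each parameter shard in the fastest tier that still has room. This is a deliberately simplified illustration of the concept, not DeepSpeed's actual placement logic or API; all capacities and shard sizes below are made up.

```python
# Sketch: greedy placement of model shards across a GPU/CPU/NVMe hierarchy,
# fastest tier first. Illustrative only; not DeepSpeed's implementation.

def place_shards(shard_sizes, tiers):
    """shard_sizes: list of sizes in GB; tiers: (name, capacity_gb) pairs
    ordered fastest to slowest. Returns {shard_index: tier_name}."""
    free = {name: cap for name, cap in tiers}
    placement = {}
    for i, size in enumerate(shard_sizes):
        for name, _ in tiers:
            if free[name] >= size:       # first (fastest) tier that fits
                free[name] -= size
                placement[i] = name
                break
        else:
            raise MemoryError(f"shard {i} ({size} GB) does not fit anywhere")
    return placement

tiers = [("gpu", 80), ("cpu", 512), ("nvme", 4096)]
placement = place_shards([40, 40, 200, 400, 3000], tiers)
```

The point of the sketch is the capacity asymmetry: once the 80 GB GPU tier is full, everything else must spill to CPU and then NVMe, which is why bandwidth to the slower tiers, not just capacity, determines achievable throughput.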
ISBN (Print): 9783319999968; 9783319999951
Medical research is not only expensive but also time-consuming, which can be seen in the queues and then in the waiting time for the analysis of the results obtained from tests. In the case of computed tomography examinations, the end result is a series of described images of the examined object's shape. The description is based on careful observation of the results. In this work, we propose a solution that selects suspicious images. This type of technique reduces the amount of data that needs to be analyzed and thus reduces the waiting time for the patient. The idea is based on three-stage data processing: in the first stage, key points are located as features of found elements; in the second, images containing the found areas are constructed; and in the third, a classifier assesses whether the image should be analyzed for disease. The method is described and tested on a large CT dataset, and the results are discussed in detail.
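The three-stage flow can be sketched end to end on a tiny synthetic image. Everything here is a stand-in: the brightness-threshold "key points", the fixed-size patches, and especially the mean-intensity "classifier" are placeholder heuristics for illustration, not the paper's method.

```python
# Sketch of the three-stage idea on a toy 4x4 "image" (lists of pixel values):
# 1) locate key points, 2) cut patches around them, 3) classify each patch.

def find_keypoints(image, threshold=200):
    """Stage 1 (stand-in): pixels brighter than a threshold are key points."""
    return [(r, c) for r, row in enumerate(image)
            for c, v in enumerate(row) if v > threshold]

def extract_patch(image, point, size=1):
    """Stage 2: build a (2*size+1)-wide patch around a key point."""
    r, c = point
    return [row[max(0, c - size):c + size + 1]
            for row in image[max(0, r - size):r + size + 1]]

def is_suspicious(patch, mean_cutoff=100):
    """Stage 3 (stand-in classifier): flag patches with high mean intensity."""
    values = [v for row in patch for v in row]
    return sum(values) / len(values) > mean_cutoff

image = [[10, 10, 10, 10],
         [10, 250, 240, 10],
         [10, 245, 230, 10],
         [10, 10, 10, 10]]
flagged = [p for p in find_keypoints(image)
           if is_suspicious(extract_patch(image, p))]
```

Only images with at least one flagged patch would be forwarded to a radiologist, which is how the pipeline reduces the volume of data requiring expert analysis.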
ISBN (Digital): 9781728165820
ISBN (Print): 9781728165837
The representation of words by means of vectors, also called Word Embeddings (WE), has been receiving great attention from the Natural Language Processing (NLP) field. WE models are able to express syntactic and semantic similarities, as well as relationships and contexts of words within a given corpus. Although the most popular implementations of WE algorithms present low scalability, there are new approaches that apply High-Performance Computing (HPC) techniques. This is an opportunity for an analysis of the main differences among the existing implementations, based on performance and scalability metrics. In this paper, we present a study that addresses resource utilization and performance aspects of known WE algorithms found in the literature. To improve scalability and usability, we propose a wrapper library for local and remote execution environments that contains a set of optimized implementations, including pWord2vec, pWord2vec MPI, Wang2vec, and the original Word2vec algorithm. Using these implementations, it is possible to achieve an average performance gain of 15x on multicore systems and 105x on multinode systems compared to the original version. There is also a significant reduction in the memory footprint compared to the most popular Python versions.
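The wrapper-library idea of one uniform entry point over several backends can be sketched with a dispatch table. The backend names mirror the abstract, but the callables below are stubs invented for illustration; the real library's interface and parameters are not reproduced here.

```python
# Sketch: a single train_embeddings() entry point dispatching to whichever
# backend implementation is requested. The backends are illustrative stubs.

def word2vec_stub(corpus, **kw):
    return {"backend": "word2vec", "vocab": sorted(set(corpus))}

def pword2vec_stub(corpus, workers=4, **kw):
    return {"backend": "pWord2vec", "workers": workers,
            "vocab": sorted(set(corpus))}

BACKENDS = {"word2vec": word2vec_stub, "pWord2vec": pword2vec_stub}

def train_embeddings(corpus, backend="word2vec", **kw):
    """Uniform interface over local or remote embedding implementations."""
    if backend not in BACKENDS:
        raise ValueError(f"unknown backend: {backend}")
    return BACKENDS[backend](corpus, **kw)

model = train_embeddings(["big", "data", "big"], backend="pWord2vec", workers=8)
```

The value of such a wrapper is that a user can switch between the serial and HPC-optimized implementations by changing one argument, without touching the rest of their pipeline.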
ISBN (Print): 9781450371964
As high-performance computing (HPC) resources continue to grow in size and complexity, so too do the volume and velocity of the operational data associated with them. At such scales, new mechanisms and technologies are required to continuously gather, store, and analyze this data in near-real time from heterogeneous and distributed sources, without impacting the underlying data center operations or HPC resource utilization. In this paper, we describe our experiences in designing and implementing an infrastructure for extreme-scale operational data collection, known as the Operations Monitoring and Notification Infrastructure (OMNI), at the National Energy Research Scientific Computing Center (NERSC) at Lawrence Berkeley National Laboratory. OMNI currently holds over 522 billion records of online operational data (totaling over 125 TB) and can ingest new data points at an average rate of 25,000 data points per second. Using OMNI as a central repository, facilities and environmental data can be seamlessly integrated and correlated with machine metrics, job scheduler information, network errors, and more, providing a holistic view of data center operations. To demonstrate the value of real-time operational data collection, we present a number of real-world case studies for which having OMNI data readily available led to key operational insights at NERSC. The case results include a reduction in the downtime of an HPC system during a facility transition, as well as a $2.5 million electrical substation savings for the next-generation Perlmutter HPC system.
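The correlation use case, joining records from heterogeneous sources over a shared timeline, can be sketched with a tiny in-memory store. This is a conceptual illustration only; OMNI's actual architecture, storage backend, and query interface are not described here, and all source names and values are invented.

```python
# Sketch: a toy central repository where records from different sources are
# kept in timestamp order, so events can be correlated over a time window.
import bisect

class OperationsStore:
    def __init__(self):
        self.records = {}  # source -> sorted list of (timestamp, value)

    def ingest(self, source, timestamp, value):
        bisect.insort(self.records.setdefault(source, []), (timestamp, value))

    def window(self, source, start, end):
        """All records from one source inside [start, end]."""
        rows = self.records.get(source, [])
        lo = bisect.bisect_left(rows, (start, float("-inf")))
        hi = bisect.bisect_right(rows, (end, float("inf")))
        return rows[lo:hi]

store = OperationsStore()
for t, watts in [(0, 500), (5, 900), (11, 520)]:
    store.ingest("power", t, watts)
store.ingest("jobs", 5, "job-42 started")
spike = store.window("power", 4, 10)
```

Querying the same window across sources is what makes the correlation possible: the power spike at t=5 lines up with the job-start event from the scheduler feed.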
CAPE, which stands for Checkpointing-Aided parallel Execution, is a framework that automatically translates and provides runtime functions to execute OpenMP programs on distributed-memory architectures based on checkp...
This paper presents the latest developments of the VIALACTEA Science Gateway in the context of the FP7 VIALACTEA project. The science gateway operates as a central workbench for the VIALACTEA community, allowing astronomers to process the new-generation surveys (from infrared to radio) of the Galactic Plane in order to build and deliver a quantitative 3D model of our Milky Way Galaxy. The final model will be used as a template for external galaxies to study star formation across cosmic time. The adopted agile software development process made it possible to fulfill the community's needs in terms of required workflows and underlying resource monitoring. Scientific requirements that arose during the process highlighted the need for easy parameter setting, fully embarrassingly parallel computations, and large-scale input dataset processing. The science gateway, based on the WS-PGRADE/gUSE framework, has been able to fulfill these requirements mainly by exploiting the parameter sweep paradigm and the parallel job execution of the workflow management system. Moving from the development to the production environment, an efficient resource monitoring system has been implemented to easily analyze and debug sources of potential failures occurring during workflow computations. The results of the resource monitoring system are useful not only to IT experts, administrators, and workflow developers but also to the end users of the gateway. The affiliation to the STARnet Gateway Federation ensures the sustainability of the presented products after the end of the project, allowing the usage of the VIALACTEA Science Gateway by all stakeholders, not only community members. (C) 2017 Elsevier B.V. All rights reserved.
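The parameter sweep paradigm mentioned above can be sketched as a grid of independent, embarrassingly parallel jobs. This is a generic illustration of the pattern, not WS-PGRADE/gUSE code; the `process_tile` function and its "source count" output are placeholders for a real workflow step.

```python
# Sketch: every point in a parameter grid becomes an independent job that can
# run in parallel; results are collected when all jobs finish.
from concurrent.futures import ThreadPoolExecutor
from itertools import product

def process_tile(params):
    """Stand-in for one workflow job on one survey tile."""
    tile, threshold = params
    return tile, threshold, tile * threshold   # pretend "source count"

def sweep(tiles, thresholds, workers=4):
    """Run one job per (tile, threshold) combination, in parallel."""
    grid = list(product(tiles, thresholds))
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(process_tile, grid))

results = sweep(tiles=[1, 2], thresholds=[3.0, 5.0])
```

Because the jobs share no state, a workflow manager can distribute them freely across available resources, which is what makes this pattern a good fit for large-scale survey processing.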
In cloud computing, profitability among others, is the driving force that encourages full utilization of computing resources. Hence, scenarios where one or more users are co-located on the same CPU but on different vi...