Details
ISBN (Digital): 9781728142104
ISBN (Print): 9781728142111
Anomalies are unexpected instances that significantly deviate from the normal patterns formed by the majority of a dataset. The more an observation deviates from the normal pattern, the more likely it is an anomaly. The continuous increase in the number of car models and configuration possibilities has steadily increased the complexity of logistics supply chains and production. Consequently, it has become difficult to manage the whole IT landscape, and a small anomaly or failure somewhere in the system can lead to a huge loss of money. Therefore, identifying and ultimately resolving a problem in such a system quickly is highly important. This paper addresses the challenge of identifying anomalies in a scalable way. The newly collected data suffers from a lack of labels for training. The developed solution addresses this challenge by using multiple unsupervised algorithms and reporting as anomalies those observations that are commonly reported as anomalies by all the algorithms. The developed solution also tackles the problems of data heterogeneity and large data volume by using Spark underneath for scalable data processing. Scalability test results demonstrate an 80% reduction in the training time for 100 transactions when using 10 cores instead of 1 core. The results of the study also point out that increasing the number of cores does not necessarily mean a reduction in the overall execution time; other factors, such as communication between the cores and non-Spark-related processing tasks, can also influence the execution time.
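The consensus idea described above — report only observations that every unsupervised detector flags — can be sketched as follows. The abstract does not name the algorithms used, so the three simple detectors below (z-score, median absolute deviation, and quantile thresholds) are illustrative stand-ins, not the authors' method:

```python
import numpy as np

def zscore_detector(x, thresh=3.0):
    # Flag values far from the mean, in units of standard deviation.
    z = np.abs((x - x.mean()) / x.std())
    return z > thresh

def mad_detector(x, thresh=3.5):
    # Median absolute deviation is robust to the anomalies themselves.
    med = np.median(x)
    mad = np.median(np.abs(x - med))
    return np.abs(x - med) / (mad + 1e-12) > thresh

def quantile_detector(x, q=0.99):
    # Flag the extreme tails of the empirical distribution.
    lo, hi = np.quantile(x, 1 - q), np.quantile(x, q)
    return (x < lo) | (x > hi)

rng = np.random.default_rng(42)
x = np.concatenate([rng.normal(0, 1, 500), [15.0, -12.0]])  # two injected anomalies

# Consensus vote: only observations flagged by ALL detectors are reported.
flags = zscore_detector(x) & mad_detector(x) & quantile_detector(x)
anomalies = np.flatnonzero(flags)
```

Requiring unanimity trades recall for precision: a point missed by any single detector is never reported, which suits settings where false alarms are expensive.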
Details
ISBN (Digital): 9798350395662
ISBN (Print): 9798350395679
Federated Learning (FL) is an emerging machine learning paradigm that enables the collaborative training of a shared global model across distributed clients while keeping the data decentralized. Recent works on designing systems for efficient FL have shown that utilizing serverless computing technologies, particularly Function-as-a-Service (FaaS) for FL, can enhance resource efficiency, reduce training costs, and alleviate the complex infrastructure management burden on data holders. However, current serverless FL systems still suffer from the presence of stragglers, i.e., slow clients that impede the collaborative training process. While strategies aimed at mitigating stragglers in these systems have been proposed, they overlook the diverse hardware resource configurations among FL clients. To this end, we present Apodotiko, a novel asynchronous training strategy designed for serverless FL. Our strategy incorporates a scoring mechanism that evaluates each client’s hardware capacity and dataset size to intelligently prioritize and select clients for each training round, thereby minimizing the effects of stragglers on system performance. We comprehensively evaluate Apodotiko across diverse datasets, considering a mix of CPU and GPU clients, and compare its performance against five other FL training strategies. Results from our experiments demonstrate that Apodotiko outperforms other FL training strategies, achieving an average speedup of 2.75x and a maximum speedup of 7.03x. Furthermore, our strategy significantly reduces cold starts by a factor of four on average, demonstrating suitability in serverless environments.
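The abstract does not give Apodotiko's exact scoring formula, so the sketch below illustrates the general idea with a hypothetical score that weights relative hardware speed against relative dataset size and selects the top-k clients for a round; all names and weights are assumptions:

```python
def score(client, w_hw=0.5, w_data=0.5):
    # Hypothetical score: fast hardware and more data both raise priority.
    return w_hw * client["relative_speed"] + w_data * client["relative_data"]

clients = [
    {"id": "gpu-1", "relative_speed": 1.0, "relative_data": 0.4},
    {"id": "cpu-1", "relative_speed": 0.2, "relative_data": 1.0},
    {"id": "cpu-2", "relative_speed": 0.1, "relative_data": 0.2},
]

# Select the top-k clients for this round; low-scoring stragglers are deprioritized.
k = 2
selected = sorted(clients, key=score, reverse=True)[:k]
```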
Details
ISBN (Digital): 9798331528690
ISBN (Print): 9798331528706
Companies traditionally dependent on on-premise HPC clusters for simulations are increasingly migrating workloads to the cloud. Cloud computing offers greater flexibility in selecting processors, memory, and network bandwidth, along with enhanced resource availability and scalability. Automotive companies rely on computationally intensive numerical simulation tools for CAE (Computer-Aided Engineering), particularly with the growing demand for generative design, which uses algorithms to automatically explore a large solution space. This work addresses the gap between the growing runtime demands of these simulations and the limitations of static HPC infrastructure by representing iterative workflows as Directed Acyclic Graphs (DAGs) and optimizing their scheduling. We propose a unified hybrid infrastructure that leverages the elasticity of cloud resources along with existing HPC clusters to maximize computational efficiency, ensure timely completion of simulations, and optimize resource utilization and costs.
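Representing an iterative workflow as a DAG makes the available parallelism explicit: tasks with no unmet dependencies can run concurrently. A minimal sketch (Kahn-style level extraction on a toy CAE workflow; the task names are hypothetical, not from the paper):

```python
# Toy CAE workflow: mesh -> (solve_a, solve_b) -> postprocess
dag = {
    "mesh": ["solve_a", "solve_b"],
    "solve_a": ["postprocess"],
    "solve_b": ["postprocess"],
    "postprocess": [],
}

def topological_levels(dag):
    """Group tasks into levels; tasks within a level can run in parallel."""
    indegree = {n: 0 for n in dag}
    for deps in dag.values():
        for d in deps:
            indegree[d] += 1
    level = [n for n in dag if indegree[n] == 0]
    levels = []
    while level:
        levels.append(sorted(level))
        nxt = []
        for n in level:
            for d in dag[n]:
                indegree[d] -= 1
                if indegree[d] == 0:
                    nxt.append(d)
        level = nxt
    return levels
```

A scheduler can then map each level onto whatever mix of on-premise and cloud nodes is currently cheapest or fastest.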
Details
ISBN (Print): 9781665480468
Federated Learning (FL) is a machine learning paradigm that enables the training of a shared global model across distributed clients while keeping the training data local. While most prior work on designing systems for FL has focused on using stateful, always-running components, recent work has shown that components in an FL system can greatly benefit from the use of serverless computing and Function-as-a-Service technologies. To this end, distributed training of models with serverless FL systems can be more resource-efficient and cheaper than conventional FL systems. However, serverless FL systems still suffer from the presence of stragglers, i.e., clients that are slow due to their resource and statistical heterogeneity. While several strategies have been proposed for mitigating stragglers in FL, most methodologies do not account for the particular characteristics of serverless environments, i.e., cold starts, performance variations, and the ephemeral stateless nature of the function instances. Towards this, we propose FedLesScan, a novel clustering-based semi-asynchronous training strategy specifically tailored for serverless FL. FedLesScan dynamically adapts to the behavior of clients and minimizes the effect of stragglers on the overall system. We implement our strategy by extending an open-source serverless FL system called FedLess. Moreover, we comprehensively evaluate our strategy using 2nd-generation Google Cloud Functions with four datasets and varying percentages of stragglers. Results from our experiments show that, compared to other approaches, FedLesScan reduces training time and cost by an average of 8% and 20% respectively while utilizing clients better, with an average increase in the effective update ratio of 17.75%.
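The abstract does not specify the clustering method FedLesScan uses, so the following is a deliberately simple stand-in: a greedy one-dimensional grouping of clients by observed round duration, which is enough to separate stragglers from fast clients:

```python
def cluster_by_duration(durations, gap=2.0):
    """Greedy 1-D clustering: start a new cluster when the next duration
    exceeds `gap` times the current cluster's mean. A stand-in for the
    paper's clustering step, whose exact method the abstract omits."""
    clusters = []
    for client, t in sorted(durations.items(), key=lambda kv: kv[1]):
        if clusters and t <= gap * (sum(d for _, d in clusters[-1]) / len(clusters[-1])):
            clusters[-1].append((client, t))
        else:
            clusters.append([(client, t)])
    return clusters

# Observed round times in seconds; "d" is a straggler.
times = {"a": 1.0, "b": 1.2, "c": 1.1, "d": 9.0}
clusters = cluster_by_duration(times)
```

A semi-asynchronous scheduler can then wait for the fast cluster each round while letting the straggler cluster contribute updates on its own cadence.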
Details
ISBN (Digital): 9798350365610
ISBN (Print): 9798350365627
The advent of generative design in the automotive sector, characterised by the automatic and iterative exploration of expansive solution spaces to discover optimal design configurations, has significantly increased the demand for computational resources to run intensive computer-aided engineering (CAE) simulations within constrained time frames. The inherent limitations of static high-performance computing (HPC) clusters have necessitated the adoption of cloud resources due to their flexible and elastic nature, thereby enhancing the capacity to accommodate the computational demands of these iterative workflows. These workflows, represented as Directed Acyclic Graphs (DAGs), involve the serial and parallel execution of tasks, which can dynamically share resources with other workflows during idle periods. In this paper, we propose an economy-based approach to exploit the gaps generated by these idle periods through a bidding system, thereby enabling more efficient resource utilisation and reducing the average wait time, makespan, cost and deadline misses by more than 40%, 6%, 13% and 45% respectively against certain infrastructures and baselines. Furthermore, we explore the potential for generating revenue by renting out idle resources in a hybrid cloud setup. This approach not only aims to optimise the use of computational resources but also seeks to provide cost-effective solutions to meet the escalating demands of generative design in the automotive sector.
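The bidding idea can be illustrated with a toy allocator: idle gaps in the schedule are offered to pending jobs, and the bid with the highest price per hour that fits each gap wins it. This is a simplified sketch of the economy-based mechanism, not the paper's actual auction; all names and numbers are hypothetical:

```python
gaps = [(0, 4), (10, 15)]  # idle windows as (start, end) in hours
bids = [
    {"job": "j1", "hours": 3, "price": 30.0},
    {"job": "j2", "hours": 2, "price": 25.0},
    {"job": "j3", "hours": 5, "price": 60.0},
]

def allocate(gaps, bids):
    """Award each idle gap to the highest price-per-hour bid that fits."""
    awarded = {}
    remaining = sorted(bids, key=lambda b: b["price"] / b["hours"], reverse=True)
    for start, end in gaps:
        length = end - start
        for bid in remaining:
            if bid["hours"] <= length:
                awarded[(start, end)] = bid["job"]
                remaining.remove(bid)
                break
    return awarded

result = allocate(gaps, bids)
```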
Details
ISBN (Digital): 9781728165820
ISBN (Print): 9781728165837
Computational fluid dynamics (CFD) can serve as a complementary approach to conventional wind tunnel testing to assess the wind flow around tall buildings. Being a clear High Performance Computing (HPC) task, CFD simulations conventionally run on supercomputers and compute clusters using specialized software such as OpenFOAM. The limited availability and high maintenance costs of supercomputers and clusters force small and medium companies to search for cost-efficient infrastructure that can run their simulations with appropriate performance. The on-demand compute capacity offered by cloud service providers is well suited to this task. However, engineers and researchers require extensive expertise and experience with cloud computing in order to benefit from running CFD simulations on a cloud. The contribution of the paper to the outlined problem is two-fold: 1) a unique Automated parallel Processing Application (APPA) tool that hides the cloud management details from the wind engineer and provides an intuitive user interface; 2) the estimation of the optimal number of cores (vCPUs) for virtual machine instances provided by AWS and Google Cloud, based on average run time and total cost metrics for a given number of cells of a CFD simulation. The n1-highcpu-96 Google Cloud VM met both goals: low cost and low runtime per timestep. For numbers of vCPUs below 16, the c4.8xlarge AWS VM type has the lowest runtime per timestep in all cases. Google Cloud instances with high vCPU counts are recommended for running the simulations if budget is a big concern.
A memory leak in an application deployed on the cloud can affect the availability and reliability of the application. Therefore, to identify and ultimately resolve it quickly is highly important. However, in the produ...
Details
Details
ISBN (Digital): 9798331541378
ISBN (Print): 9798331541385
In neutral atom quantum computers, readout and preparation of the atomic qubits are usually based on fluorescence imaging and subsequent analysis of the acquired image. For each atom site, the brightness or some comparable metric is estimated and used to predict the presence or absence of an atom. Across different setups, we can see a vast number of different approaches used to analyze these images. Often, the choice of detection algorithm is either not mentioned at all or it is not justified. We investigate several different algorithms and compare their performance in terms of both precision and execution run time. To do so, we rely on a set of synthetic images across different simulated exposure times with known occupancy states, which we generated using a previously validated imaging simulation. Since the use of simulation provides us with the ground truth of atom site occupancy, we can easily state precise error rates and variances of the reconstructed property. However, knowing the relative performance of these algorithms is not sufficient to justify their use, since better ones can exist that were not compared. To investigate this possibility, we calculated the Cramér-Rao bound in order to establish an upper limit that even a perfect estimator cannot outperform. As the metric of choice, we used the number of photoelectrons that can be attributed to a specific atom site. Every estimator that reconstructs a different property can simply be scaled accordingly. Since the bound depends on the occupancy of neighboring sites, we provide the best and worst cases, as well as a half-filled one, which should represent an averaged bound best. Our comparison shows that of our tested algorithms, a global nonlinear least-squares solver that uses the optical system's point spread function (PSF) to return a global bias and each site's number of photoelectrons performed the best, on average crossing the worst-case bound for longer exposure times. Its main drawback is its huge...
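To make the estimation setting concrete: when the PSF is known and fixed, the per-site photoelectron counts and a global bias enter the image model linearly, so the noise-free case can be solved with ordinary least squares. The paper's solver is nonlinear and handles real noise; the sketch below is only a simplified illustration with made-up site positions and counts:

```python
import numpy as np

def gaussian_psf(shape, center, sigma=1.5):
    # Normalized Gaussian spot standing in for the optical system's PSF.
    y, x = np.mgrid[: shape[0], : shape[1]]
    g = np.exp(-((x - center[1]) ** 2 + (y - center[0]) ** 2) / (2 * sigma**2))
    return g / g.sum()

shape = (16, 16)
sites = [(4, 4), (4, 11), (11, 4), (11, 11)]
psfs = [gaussian_psf(shape, c) for c in sites]

# Synthetic image: sites 0 and 3 occupied with 500 photoelectrons, flat bias of 10.
true_counts = np.array([500.0, 0.0, 0.0, 500.0])
image = sum(n * p for n, p in zip(true_counts, psfs)) + 10.0

# One column per site PSF plus a constant column for the global bias.
A = np.column_stack([p.ravel() for p in psfs] + [np.ones(image.size)])
params, *_ = np.linalg.lstsq(A, image.ravel(), rcond=None)
counts, bias = params[:-1], params[-1]
```

With shot noise and pixel-level camera effects added, the problem becomes the harder nonlinear estimation task that the Cramér-Rao bound benchmarks.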
Neutral atom quantum computers require accurate single atom detection for the preparation and readout of their qubits. This is usually done using fluorescence imaging. The occupancy of an atom site in these images is often somewhat ambiguous due to the stochastic nature of the imaging process. Further, the lack of ground truth makes it difficult to rate the accuracy of reconstruction algorithms. We introduce a bottom-up simulator that is capable of generating sample images of neutral atom experiments from a description of the actual state in the simulated system. Possible use cases include the creation of exemplary images for demonstration purposes, fast training iterations for deconvolution algorithms, and generation of labeled data for machine-learning-based atom detection approaches. The implementation is available through our GitHub as a C library or wrapped Python package. We show the modeled effects and implementation of the simulations at different stages of the imaging process. Not all real-world phenomena can be reproduced perfectly. The main discrepancies are that the simulator allows for only one characterization of optical aberrations across the whole image, supports only discrete atom locations, and does not model all effects of complementary metal-oxide-semiconductor (CMOS) cameras perfectly. Nevertheless, our experiments show that the generated images closely match real-world pictures to the point that they are practically indistinguishable and can be used as labeled data for training the next generation of detection algorithms.
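A bottom-up simulator of this kind boils down to composing an expected-photon image from the occupancy state and then sampling shot noise. The toy version below (Gaussian spots, flat background, Poisson noise; all parameters hypothetical) captures that pipeline, without the optical-aberration and CMOS camera effects the full simulator models:

```python
import numpy as np

rng = np.random.default_rng(7)

def simulate_image(occupancy, shape=(16, 16), photons=200, background=2.0, sigma=1.5):
    """Place a Gaussian spot for each occupied site, add a flat background,
    then apply Poisson shot noise to the expected photon counts."""
    y, x = np.mgrid[: shape[0], : shape[1]]
    expected = np.full(shape, background, dtype=float)
    for (cy, cx), occupied in occupancy.items():
        if occupied:
            spot = np.exp(-((x - cx) ** 2 + (y - cy) ** 2) / (2 * sigma**2))
            expected += photons * spot / spot.sum()
    return rng.poisson(expected)  # photon shot noise is Poissonian

# Known ground-truth occupancy: two filled sites, two empty ones.
occupancy = {(4, 4): True, (4, 11): False, (11, 4): True, (11, 11): False}
img = simulate_image(occupancy)
```

Because the occupancy dictionary is the ground truth, images generated this way come pre-labeled, which is exactly what training machine-learning detectors requires.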