Federated Learning (FL) is a collaborative model training approach that protects data privacy while allowing for model updates and optimization. However, FL is vulnerable to poisoning attacks due to its distributed na...
ISBN: 9798350304817 (print)
The carbon emissions associated with computational energy consumption in data centers depend significantly on the schedule of the workloads. Because the availability of renewable energy varies over time and grid regions draw on different power sources, the carbon intensity of data centers changes over time and location. Thus, placing and scheduling flexible workloads based on the carbon intensity of the power sources in data centers can substantially reduce carbon emissions. In this paper, we address the problem of placement and scheduling of workloads over geographically distributed data centers. We propose two algorithms that take into account both the variability of the carbon intensity of the data centers' power sources and their computational resource availability when deciding on the placement and scheduling of the workloads. The first is a randomized rounding approximation algorithm that provides solutions that are guaranteed to be within a given distance from the optimal solution. The second is a sample-based algorithm that improves the solutions obtained by the randomized rounding approximation algorithm. The experimental results show that the proposed algorithms can solve the problem efficiently.
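The abstract does not give the formulation, so the following is only a minimal sketch of the general randomized rounding idea, under stated assumptions: a fractional placement (for example from an LP relaxation, not computed here) is already available, each "slot" stands for one data center at one time window, and all names (`round_assignment`, `fractional`, `carbon`, `capacity`, `demand`) are hypothetical. Repeating the draw and keeping the lowest-emission feasible sample is a generic device and is not claimed to be the paper's sample-based algorithm.

```python
import random


def round_assignment(fractional, carbon, capacity, demand, rng, max_tries=100):
    """Round a fractional placement to one slot per job (illustrative only).

    fractional[j][s]: fraction of job j placed in slot s (each row sums to 1).
    carbon[s]:        assumed carbon intensity of slot s per unit of work.
    capacity[s]:      units of work slot s can host.
    demand[j]:        units of work job j needs.
    Returns (assignment, emissions) for the best feasible draw, or None.
    """
    slots = list(range(len(carbon)))
    best = None
    for _ in range(max_tries):
        # Pick one slot per job, using the fractional values as probabilities.
        assignment = [rng.choices(slots, weights=row)[0] for row in fractional]
        load = [0.0] * len(slots)
        for j, s in enumerate(assignment):
            load[s] += demand[j]
        if any(load[s] > capacity[s] for s in slots):
            continue  # this draw violates a capacity limit; try again
        emissions = sum(demand[j] * carbon[s] for j, s in enumerate(assignment))
        if best is None or emissions < best[1]:
            best = (assignment, emissions)
    return best


# Example: two jobs, two slots, made-up numbers.
rng = random.Random(0)
print(round_assignment([[0.7, 0.3], [0.2, 0.8]], [100.0, 50.0],
                       [5.0, 5.0], [2.0, 3.0], rng))
```

A real approximation algorithm would also bound how far the rounded solution can drift from the fractional optimum; this sketch only checks feasibility after the fact.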
ISBN: 9798350364613; 9798350364606 (print)
While parallel programming, particularly on graphics processing units (GPUs), and numerical optimization hold immense potential to tackle real-world computational challenges across disciplines, their inherent complexity and technical demands often act as daunting barriers to entry. This, unfortunately, limits accessibility and diversity within these crucial areas of computer science. To combat this challenge and ignite excitement among undergraduate learners, we developed an application-driven course, harnessing robotics as a lens to demystify the intricacies of these topics, making them tangible and engaging. Our course's prerequisites are limited to the required undergraduate introductory core curriculum, opening doors for a wider range of students. Our course also features a large final-project component to connect theoretical learning to applied practice. In our first offering of the course, we attracted 27 students without prior experience in these topics and found that an overwhelming majority of the students felt that they learned both technical and soft skills such that they felt prepared for future study in these fields.
ISBN: 9781665494236 (print)
We present the first GPU-based parallel algorithm to efficiently update vertex coloring on large dynamic networks. For a single GPU, we introduce the concept of loosely maintained vertex color update, which reduces computation and memory requirements. For multiple GPUs in distributed environments, we propose priority-based ordering of vertices to reduce the communication time. We prove the correctness of our algorithms and experimentally demonstrate that for graphs of over 16 million vertices and over 134 million edges on a single GPU, our dynamic algorithm is as much as 20x faster than the state-of-the-art algorithm on static graphs. For larger graphs with over 130 million vertices and over 260 million edges, our distributed implementation with 8 GPUs produces updated color assignments within 160 milliseconds. In all cases, the proposed parallel algorithms produce comparable or fewer colors than state-of-the-art algorithms.
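To make the notion of updating a coloring (rather than recoloring from scratch) concrete, here is a minimal sequential sketch. It is not the GPU algorithm from the abstract and does not model the "loosely maintained" scheme, which the abstract does not define. The function names and the tie-break rule (recolor the endpoint with the larger id) are assumptions made for illustration.

```python
def smallest_free_color(adj, color, v):
    """Return the smallest non-negative color unused by v's neighbors."""
    used = {color[n] for n in adj[v]}
    c = 0
    while c in used:
        c += 1
    return c


def insert_edge(adj, color, u, v):
    """Add edge (u, v) and repair the coloring only if it now conflicts."""
    adj[u].add(v)
    adj[v].add(u)
    if color[u] == color[v]:
        target = max(u, v)  # arbitrary tie-break, assumed for this sketch
        color[target] = smallest_free_color(adj, color, target)


def delete_edge(adj, color, u, v):
    """Remove edge (u, v); a proper coloring stays proper after a deletion."""
    adj[u].discard(v)
    adj[v].discard(u)


def is_proper(adj, color):
    """Check that no edge joins two vertices of the same color."""
    return all(color[u] != color[v] for u in adj for v in adj[u])


# Example: vertices 0 and 2 share color 0 until an edge joins them.
adj = {0: {1}, 1: {0}, 2: set()}
color = {0: 0, 1: 1, 2: 0}
insert_edge(adj, color, 0, 2)
print(color, is_proper(adj, color))
```

Only the endpoints of a changed edge are inspected, which is why an update can be much cheaper than a full recoloring when few edges change.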
ISBN: 9798350364613; 9798350364606 (print)
Quantum computing is a new computing paradigm that exploits the laws of quantum mechanics to achieve an exponential speedup compared to classical logic. However, noise strongly limits current quantum hardware, reducing achievable performance. Quantum Error Correction (QEC) techniques are a valuable approach to reduce the effects of noise. Nevertheless, the high computational complexity of QEC algorithms is incompatible with the tight time constraints of quantum devices. Thus, hardware acceleration is paramount to achieving real-time QEC. This work represents the first step in the FPGA acceleration of the Sparse Blossom Algorithm (SBA), a state-of-the-art decoding algorithm for QEC. We provide performance profiling and a design methodology for the hardware development of the SBA. We evaluate the execution time and energy efficiency of our solution, attaining up to 2.75x speedup and 9.59x improvement in energy efficiency compared to the software baseline.
ISBN: 9781450397964 (print)
Graph queries on large networks leverage the stored graph properties to provide faster results. Since real-world graphs are mostly dynamic, i.e., the graph topology changes over time, the corresponding graph attributes also change over time. In certain situations, recompiling or updating earlier properties is necessary to maintain the accuracy of a response to a graph query. Here, we first propose a generic framework for developing parallel algorithms to update graph properties on large dynamic networks. We use our framework to develop algorithms for updating Single Source Shortest Path (SSSP) and Vertex Color. Then we propose applications of the developed algorithms in Unmanned Aerial Vehicle (UAV) based delivery systems under time-varying dynamics. Finally, we implement our SSSP and vertex color update algorithms for Nvidia GPU architecture and show empirically that the developed algorithms can update properties in large dynamic networks faster than the state-of-the-art techniques.
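As an illustration of what updating SSSP means for a single change, the sketch below handles one edge insertion in a directed graph with non-negative weights and re-relaxes only the affected vertices. It is a sequential textbook-style update, not the parallel framework described above; `sssp_insert_edge` and the dictionary-based graph layout are assumptions for this example.

```python
import heapq


def sssp_insert_edge(graph, dist, u, v, w):
    """Insert edge u -> v with weight w and repair shortest distances.

    graph: dict mapping each vertex to a list of (neighbor, weight) pairs.
    dist:  dict of current shortest distances from a fixed source
           (float('inf') for unreachable vertices); updated in place.
    Assumes all weights are non-negative.
    """
    graph[u].append((v, w))
    if dist[u] + w >= dist[v]:
        return  # the new edge does not shorten any path
    dist[v] = dist[u] + w
    heap = [(dist[v], v)]
    while heap:
        d, x = heapq.heappop(heap)
        if d > dist[x]:
            continue  # stale heap entry
        for y, wy in graph[x]:
            if d + wy < dist[y]:
                dist[y] = d + wy
                heapq.heappush(heap, (dist[y], y))


# Example: a shortcut from the source (vertex 0) to vertex 1.
graph = {0: [(1, 10)], 1: [(2, 1)], 2: []}
dist = {0: 0, 1: 10, 2: 11}
sssp_insert_edge(graph, dist, 0, 1, 2)
print(dist)
```

Edge deletions are harder, because distances can increase and the affected subtree must be invalidated first; that case is left out of this sketch.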
ISBN: 9798400701559 (print)
Terrain parameters such as slope, aspect, and hillshading are essential in various applications, including agriculture, forestry, and hydrology. However, generating high-resolution terrain parameters is computationally intensive, making it challenging to provide these value-added products to communities in need. We present a scalable workflow called GEOtiled that leverages data partitioning to accelerate the computation of terrain parameters from digital elevation models, while preserving accuracy. We assess our workflow in terms of its accuracy and wall time by comparing it to SAGA, which is highly accurate but slow to generate results, and to GDAL, which supports memory optimizations but not data parallelism. We obtain a coefficient of determination (R²) between GEOtiled and SAGA of 0.794, ensuring accuracy in our terrain parameters. We achieve a 6x speedup compared to GDAL when generating the terrain parameters at high resolution (10 m) for the Contiguous United States (CONUS).
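The abstract does not describe how GEOtiled partitions the data, so the following is only a generic sketch of why tiling can preserve accuracy: if each tile carries a one-row halo of neighboring elevations, a 3x3-neighborhood operator such as slope gives the same values per tile as on the whole grid. The central-difference slope formula, the zero value assigned to border cells, and the names `slope` and `tiled_slope` are assumptions for this example, not GEOtiled's or GDAL's implementation.

```python
import math


def slope(dem, cell):
    """Slope in degrees from central differences; border cells are set to 0.0."""
    n_rows, n_cols = len(dem), len(dem[0])
    out = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(1, n_rows - 1):
        for c in range(1, n_cols - 1):
            dz_dx = (dem[r][c + 1] - dem[r][c - 1]) / (2 * cell)
            dz_dy = (dem[r + 1][c] - dem[r - 1][c]) / (2 * cell)
            out[r][c] = math.degrees(math.atan(math.hypot(dz_dx, dz_dy)))
    return out


def tiled_slope(dem, cell, tile_rows):
    """Compute slope band by band, giving each band a one-row halo."""
    n_rows = len(dem)
    result = []
    for start in range(0, n_rows, tile_rows):
        stop = min(start + tile_rows, n_rows)
        lo, hi = max(start - 1, 0), min(stop + 1, n_rows)  # add halo rows
        tile = slope(dem[lo:hi], cell)
        # Drop the halo rows so that only rows [start, stop) are kept.
        result.extend(tile[start - lo:start - lo + (stop - start)])
    return result


# Example: the tiled result matches the whole-grid result.
dem = [[float(r * r + c) for c in range(6)] for r in range(7)]
print(tiled_slope(dem, 10.0, 3) == slope(dem, 10.0))
```

The bands are independent of one another, so they could be handed to separate workers; this sketch processes them sequentially to stay short.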
Although traditional 3D terrain algorithms can improve the rendering efficiency of the terrain, they often ignore the performance of the terrain itself. The use of four textures is not sufficient to deal with complex ...
Modern supercomputers are becoming increasingly dense with accelerators. Industry leaders offer multi-GPU architectures with high interconnection bandwidth between the devices to match the requirements of modern workl...
In order to effectively suppress the negative impact of the randomness and indirectness of photovoltaic output on the power grid, and improve the friendliness of the grid-connected photovoltaic-energy storage system, ...