With the increasing demand for computility, managing distributed computility resources is crucial for improving service quality and performance of the cross-region computility infrastructure. In order to ensure timely...
The VisionMaster algorithm development platform is a powerful machine vision software package. The platform aims to provide users with efficient and convenient algorithm tools to quickly build visual applications and solve comple...
Modern advancements in large-scale machine learning would be impossible without the paradigm of data-parallel distributed computing. Since distributed computing with large-scale models imparts excessive pressure on communication channels, significant recent research has been directed toward co-designing communication compression strategies and training algorithms with the goal of reducing communication costs. While pure data parallelism allows better data scaling, it suffers from poor model scaling properties. Indeed, compute nodes are severely limited by memory constraints, preventing further increases in model size. For this reason, the latest achievements in training giant neural network models also rely on some form of model parallelism. In this work, we take a closer theoretical look at Independent Subnetwork Training (IST), which is a recently proposed and highly effective technique for solving the aforementioned problems. We identify fundamental differences between IST and alternative approaches, such as distributed methods with compressed communication, and provide a precise analysis of its optimization performance on a quadratic model. Copyright 2024 by the author(s)
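To make the idea of training independent subnetworks on a quadratic model concrete, here is a minimal sketch. It is not the analysis or the exact method from the paper. It assumes a fixed, disjoint partition of coordinates across workers, a quadratic objective f(x) = 0.5 xᵀAx − bᵀx, and plain gradient steps. The names `grad`, `ist_round`, and `run_ist`, the step size, and the example matrix are all hypothetical and chosen only for illustration.

```python
def grad(A, b, x):
    """Gradient of f(x) = 0.5 * x^T A x - b^T x, which is A x - b."""
    n = len(x)
    return [sum(A[i][j] * x[j] for j in range(n)) - b[i] for i in range(n)]


def ist_round(A, b, x, blocks, lr, local_steps):
    """One communication round: each worker updates only its own block."""
    new_x = list(x)
    for block in blocks:
        # Submodel: keep this worker's coordinates, zero out all others.
        sub = [x[i] if i in block else 0.0 for i in range(len(x))]
        for _ in range(local_steps):
            g = grad(A, b, sub)
            for i in block:
                sub[i] -= lr * g[i]
        # Aggregation: the blocks are disjoint, so each block is copied back.
        for i in block:
            new_x[i] = sub[i]
    return new_x


def run_ist(A, b, blocks, lr=0.1, local_steps=5, rounds=200):
    """Run several rounds starting from the zero vector."""
    x = [0.0] * len(b)
    for _ in range(rounds):
        x = ist_round(A, b, x, blocks, lr, local_steps)
    return x


# Hypothetical example: A is block-diagonal and aligned with the partition.
A = [[2.0, 0.5, 0.0, 0.0],
     [0.5, 1.0, 0.0, 0.0],
     [0.0, 0.0, 3.0, 1.0],
     [0.0, 0.0, 1.0, 2.0]]
b = [1.0, 1.0, 1.0, 1.0]
blocks = [{0, 1}, {2, 3}]

x = run_ist(A, b, blocks)
residual = max(abs(g) for g in grad(A, b, x))
print(x, residual)
```

In this sketch the workers never exchange gradients, only their trained blocks. Because the example matrix has no coupling between the two blocks, the assembled iterate approaches the minimizer. If off-diagonal coupling were added while the partition stayed fixed, each worker in this sketch would ignore the other block's contribution, so the assembled point would generally differ from the true minimizer.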
Dynamic Adaptive Streaming over HTTP (DASH) is a widely adopted video streaming protocol. An Adaptive Bitrate Streaming (ABR) algorithm is used to switch dynamically between different bitrates. However, traditional A...
Emerging applications in healthcare, autonomous vehicles, and wearable assistance require interactive and low-latency data analysis services. Unfortunately, cloud-centric architectures cannot fulfill the low-latency d...
The increasing demand for real-time data analysis in Internet of Things (IoT) ecosystems has created several challenges, particularly in environments where resources are limited, and minimizing data processing latency...
Federated self-supervised learning (FedSSL) is an emerging method in the domain of machine learning. It collaboratively learns a powerful feature extractor among multiple participants by utilizing distributed unlabele...
ISBN: (print) 9798350364613; 9798350364606
Understanding the performance behavior of parallel applications is important in many ways, but doing so is not easy. Most open source analysis tools are written for the command line. We are building on these proven tools to provide an interactive performance analysis experience within Jupyter Notebooks when developing parallel code with MPI, OpenMP, or both. Our solution makes it possible to measure the execution time, perform profiling and tracing, and visualize the results within the notebooks. For ease of use, it provides both a graphical JupyterLab extension and a C++ API. The JupyterLab extension shows a dialog where the user can select the type of analysis and its parameters. Internally, this tool uses Score-P, Scalasca, and Cube to generate profiling and tracing data. This tight integration gives students easy access to profiling tools and helps them better understand concepts such as benchmarking, scalability, and performance bottlenecks. In addition to the technical development, the article presents hands-on exercises from our well-established parallel programming course. We conclude with a qualitative and quantitative evaluation with 19 students, which shows a positive effect of the tools on the students' perceived competence.
ISBN: (print) 9783031368042; 9783031368059
Parallel computing and distributed computing are two common settings in which scheduling is studied. As technology advances, systems have become much more compact and faster, and parallelization plays a major role in this compaction. Wireless computing is also a common concept associated with each new development. Scheduling of tasks has always been a challenging area and is an NP-complete problem. Moreover, in wireless distributed computing, reliable scheduling plays an important role in completing a task in a wireless distributed system. This work proposes an algorithm to dynamically schedule tasks on heterogeneous processors within a wireless distributed computing system. Many heuristics, meta-heuristics, and genetic algorithms have been used in earlier scheduling strategies. However, most of them do not take reliability into account before scheduling. Here, a heuristic that deals with reliable scheduling is considered. The scheduler also works within an environment that has dynamically changing resources and adapts itself to changing system resources. Testing was carried out with up to 200 tasks scheduled in a real-time wireless distributed environment. Experiments have shown that the algorithm outperforms the other strategies and achieves better reliability with no increase in makespan, despite the use of wireless nodes.
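As a rough illustration of what reliability-aware scheduling on heterogeneous processors can look like, here is a minimal sketch of a greedy list scheduler. It is not the algorithm proposed in the work above. It assumes independent tasks, a per-processor speed and reliability score, and a hypothetical `min_reliability` threshold that filters out unreliable processors before the earliest-finish-time choice. All names and numbers are illustrative.

```python
def schedule(tasks, processors, min_reliability=0.9):
    """Greedy reliability-aware list scheduling (illustrative only).

    tasks:      list of (name, work) pairs
    processors: dict mapping name -> (speed, reliability)
    Returns (plan, makespan), where plan holds (task, processor, start, end).
    """
    # Assumption: processors below the threshold are never used.
    eligible = [p for p, (_, rel) in processors.items() if rel >= min_reliability]
    if not eligible:
        raise ValueError("no processor meets the reliability threshold")

    ready_at = {p: 0.0 for p in eligible}
    plan = []
    # Place the largest tasks first, each on the processor that finishes it earliest.
    for name, work in sorted(tasks, key=lambda t: -t[1]):
        best = min(
            eligible,
            key=lambda p: (ready_at[p] + work / processors[p][0], p),
        )
        start = ready_at[best]
        ready_at[best] = start + work / processors[best][0]
        plan.append((name, best, start, ready_at[best]))
    return plan, max(ready_at.values())


# Hypothetical example: p3 is the fastest node but too unreliable to be chosen.
processors = {"p1": (2.0, 0.99), "p2": (1.0, 0.95), "p3": (4.0, 0.60)}
tasks = [("a", 4.0), ("b", 2.0), ("c", 2.0)]

plan, makespan = schedule(tasks, processors)
for entry in plan:
    print(entry)
print("makespan:", makespan)
```

A hard threshold is only one way to account for reliability. A scheduler that adapts to changing resources, as described above, would also need to re-evaluate the eligible set as nodes join, leave, or degrade, which this sketch does not attempt.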
ISBN: (print) 9783031506833; 9783031506840
A framework to support optimised application placement across the cloud-edge continuum is described, making use of the Optimized-Greedy Nominator Heuristic (EO-GNH). The framework can be employed across a range of different Internet of Things (IoT) applications, such as smart agriculture and healthcare. The framework uses asynchronous MapReduce and parallel meta-heuristics to support the management of IoT applications, focusing on metrics such as execution performance, resource utilization and system resilience. We evaluate EO-GNH using service quality achieved through real-time resource management, across multiple application domains. Performance analysis and optimisation of EO-GNH have also been carried out to demonstrate how it can be configured for use across different IoT usage contexts.