In the era of Big Data, the computational demands of machine learning (ML) algorithms have grown exponentially, necessitating the development of efficient parallel computing techniques. This research paper delves into...
A new hierarchical communication parallel computing algorithm for transient structural analysis is introduced based on the architecture characteristics of heterogeneous multi-core processors in order to increase the p...
The rapid adoption of Internet of Things (IoT) technologies and digital twins (DTs) is revolutionizing smart cities by facilitating improved real-time monitoring, simulation, and optimization of urban infrastructures....
ISBN: (print) 9783031488023; 9783031488030
High-density EEG is a non-invasive measurement method with millisecond temporal resolution that allows us to monitor how the human brain operates under different conditions. The large amount of data combined with complex algorithms results in unmanageable execution times. Large-scale GPU parallelism provides the means to drastically reduce the execution time of EEG analysis and bring the execution of large cohort studies (over a thousand subjects) within reach. This paper describes our effort to implement various EEG algorithms for multi-GPU pre-exascale supercomputers. Several challenges arise during this work, such as the high cost of data movement and synchronisation compared to computation. A performance-oriented end-to-end design approach is chosen to develop highly scalable, GPU-only implementations of full processing pipelines and modules. Work related to the parallel design of the family of Empirical Mode Decomposition algorithms is described in detail, with preliminary performance results of single-GPU implementations. The research will continue with multi-GPU algorithm design and implementation, aiming to achieve scalability up to thousands of GPU cards.
ISBN: (print) 9781538674628
Serverless computing has emerged as a new execution model that has gained a lot of attention in cloud computing thanks to the latest advances in containerization technologies. Recently, serverless has been adopted at the edge, where it can help overcome the heterogeneity, constrained nature, and dynamicity of edge devices. Due to the distributed nature of edge devices, however, the scaling of serverless functions presents a major challenge. We address this challenge by studying the optimality of serverless function scaling. To this end, we propose a Semi-Markov Decision Process (SMDP)-based theoretical model, which yields optimal solutions by treating the serverless function scaling problem as a decision-making problem. We compare the SMDP solution with practical, monitoring-based heuristics. We show that SMDP can be effectively used in edge computing networks and, in combination with monitoring-based approaches, also in real-world implementations.
Improving the performance of quantum adders is an important technical challenge with major impact on the implementation of efficient, large-scale quantum computing. Continuing along this research direction, we propose a novel parallel-prefix quantum adder based on Ling expansion. We systematically explored classical structures for parallel-prefix adders, assessing their suitability to be realized in the quantum domain. Furthermore, the Ling adder entails logical OR and large fan-out, which require innovative solutions. We addressed these challenges to realize the quantum Ling adder, which results in a T-depth of only O(log n/2). This represents a substantial improvement over the previous quantum adders based on parallel-prefix structures, which require O(log n) T-depth. Based on the proposed adder, an efficient quantum modular adder is also demonstrated in this paper, further extending the applicability of our approach. We present extensive theoretical and simulation-based studies to establish our claims.
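For readers unfamiliar with the classical parallel-prefix structures the abstract refers to, the following is a minimal sketch of one such structure, a Kogge-Stone adder, written as bitwise operations on Python integers. It is a classical illustration only: it is neither the Ling expansion nor the quantum circuit proposed in the paper, and the function name `kogge_stone_add` and the `width` parameter are assumptions made for this example.

```python
def kogge_stone_add(a: int, b: int, width: int = 8) -> int:
    """Add a and b modulo 2**width with a classical Kogge-Stone prefix network.

    Illustrative sketch only; not the Ling-based quantum adder from the paper.
    """
    mask = (1 << width) - 1
    a &= mask
    b &= mask

    generate = a & b    # bit i creates a carry on its own
    propagate = a ^ b   # bit i passes an incoming carry along
    half_sum = propagate

    # Each round doubles the span of bits summarized by every (G, P) pair,
    # so about log2(width) rounds are enough to resolve all carries.
    shift = 1
    while shift < width:
        generate = (generate | (propagate & (generate << shift))) & mask
        propagate = (propagate & (propagate << shift)) & mask
        shift <<= 1

    # After the prefix rounds, bit i of `generate` is the carry out of bit i,
    # which is the carry into bit i + 1.
    carries = (generate << 1) & mask
    return half_sum ^ carries
```

The loop runs a logarithmic number of rounds in the operand width, which is the property that makes parallel-prefix structures attractive as a starting point for low-depth adders.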
In the pursuit of precision agriculture, the integration of artificial intelligence (AI) for real-time plant disease detection holds a significant promise. This study investigates the application of edge computing on ...
The evolution of the distributed computing paradigm has given rise to new computing models such as grid and cloud computing. Furthermore, in these environments it is common to run complex parallel applications thus maki...
ISBN: (print) 9783031488023; 9783031488030
OpenCUBE aims to develop an open-source full software stack for a Cloud computing blueprint deployed on EPI hardware, adaptable to emerging workloads across the computing continuum. OpenCUBE prioritizes energy awareness and utilizes open APIs, Open Source components, advanced SiPearl Rhea processors, and RISC-V accelerator. The project leverages representative workloads, such as cloud-native workloads and workflows of weather forecast data management, molecular docking, and space weather, for evaluation and validation.
ISBN: (print) 9798350364613; 9798350364606
Deploying Deep Learning (DL) models on edge devices presents several challenges: processing and memory resources are limited and bandwidth is constrained, yet performance and energy requirements must still be met. In-memory computing (IMC) represents an efficient way to accelerate the inference of data-intensive DL tasks on the edge. Recently, several analog, digital, and mixed digital-analog memory technologies have emerged as promising solutions for IMC. Among them, digital SRAM IMC exhibits deterministic behavior and compatibility with advanced technology scaling rules, making it a viable path for integration with hardware accelerators. This work focuses on discussing the potentially powerful aspects of digital IMC (DIMC) on edge System-on-Chip (SoC) devices. The limitations and open challenges of DIMC are also discussed.