ISBN: 9798350391282 (digital)
ISBN: 9798350391299 (print)
OpenMP provides a versatile framework for parallel computing, allowing developers to efficiently transform sequential programs into parallel applications for shared-memory architectures. One of the central challenges in this transformation lies in accurately identifying appropriate parallel constructs and clauses, which are critical for maximizing performance and ensuring the correctness of the resulting parallel code. A particularly intricate aspect of this process is the classification of variables according to their data-sharing semantics, including the firstprivate, private, lastprivate, shared, and reduction clauses. Manual classification is labor-intensive and increasingly error-prone as the program's scale and complexity grow. Although various tools have been developed to assist with variable classification, they often rely on extensive data-dependence analyses and rigid classification schemes, limiting their effectiveness when applied to large-scale programs with complex scoping requirements. This paper presents a novel, cost-effective approach to automate and enhance the accuracy of variable classification in OpenMP parallelization. By reducing the manual effort required and improving the precision of parallel construct insertion, this approach aims to significantly optimize the performance of parallel applications, thereby advancing the utility and accessibility of OpenMP for a wide range of computational tasks.
Data analysis usually suffers from the Missing Not At Random (MNAR) problem, where the cause of the value missing is not fully observed. Compared to the naive Missing Completely At Random (MCAR) problem, it is more in...
ISBN: 9798350362268 (digital)
ISBN: 9798350362275 (print)
Microservice architecture has become a widely accepted solution to the challenges of monolithic architecture, particularly scalability, deployment, and flexibility. A vital attribute of the microservices architecture is its capability to handle load balancing on a large scale. The load balancer collaborates with a scaler to distribute the workload efficiently across multiple instances. In the literature, different studies employ load-balancing algorithms for efficient microservice load balancing. These works overlook cloud-based microservice applications or focus solely on virtual machines, neglecting containers. This paper addresses these limitations by comparatively assessing selected load-balancing algorithms. The three most used algorithms, random, round-robin, and least connection, are studied on a microservice application. Extensive experiments are conducted using the Elastic Container Service (ECS) of Amazon Web Services (AWS) for a containerized cloud setup in which each service resides in a cluster and traffic is generated through Locust. Experimental results show throughput in the range 6.2-288.7 and response time in the range 312.2-3375.8 ms.
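For readers unfamiliar with the three algorithms named in this abstract, the following is a minimal in-process sketch of how each one selects a target instance. It is not the paper's experimental setup (which uses AWS ECS and Locust); the class names, the `pick`/`release` methods, and the instance labels are all hypothetical and chosen only for illustration.

```python
import random
from itertools import cycle


class RandomBalancer:
    """Random: pick any instance uniformly at random."""

    def __init__(self, instances, seed=None):
        self.instances = list(instances)
        self.rng = random.Random(seed)  # seed is optional, for repeatable runs

    def pick(self):
        return self.rng.choice(self.instances)


class RoundRobinBalancer:
    """Round-robin: hand out instances in a fixed rotating order."""

    def __init__(self, instances):
        self._cycle = cycle(list(instances))

    def pick(self):
        return next(self._cycle)


class LeastConnectionBalancer:
    """Least connection: pick the instance with the fewest active requests."""

    def __init__(self, instances):
        # Hypothetical bookkeeping: active request count per instance.
        self.active = {name: 0 for name in instances}

    def pick(self):
        # min() returns the first minimum, so ties go to the earliest-listed instance.
        name = min(self.active, key=self.active.get)
        self.active[name] += 1
        return name

    def release(self, name):
        # Call when a request finishes so the instance becomes attractive again.
        self.active[name] -= 1


if __name__ == "__main__":
    rr = RoundRobinBalancer(["svc-a", "svc-b", "svc-c"])
    print([rr.pick() for _ in range(4)])

    lc = LeastConnectionBalancer(["svc-a", "svc-b"])
    first, second = lc.pick(), lc.pick()
    lc.release(second)
    print(first, second, lc.pick())
```

Random and round-robin ignore instance state, whereas least connection needs feedback when requests complete, which is why only that class exposes a `release` method in this sketch.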
ISBN: 9798350391282 (digital)
ISBN: 9798350391299 (print)
Optimizing tensor product matrix computations is critical for enhancing computational efficiency in high-performance applications. Traditional algorithms, like the Split algorithm, often struggle due to the unique properties of each matrix involved. This paper presents a novel heuristic method that optimizes the selection of cutting points and matrix arrangement, significantly reducing redundant calculations and minimizing memory usage. The proposed approach adapts to the varying characteristics of tensor products, improving performance across different computational scenarios. By enhancing floating-point operation efficiency and CPU utilization, it delivers substantial speed and efficiency gains, particularly in large-scale tensor product matrix operations, offering a robust solution for complex computational tasks.
Simultaneous Localization and Mapping (SLAM) technology has been widely applied in various robotic scenarios, from rescue operations to autonomous driving. However, the generalization of SLAM algorithms remains a sign...
In document-level neural machine translation (DocNMT), multi-encoder approaches are common in encoding context and source sentences. Recent studies (Li et al., 2020) have shown that the context encoder generates noise...
Convolutional Neural Networks (CNNs) have made remarkable strides; however, they remain susceptible to vulnerabilities, particularly to image perturbations that humans can easily recognize. This weakness, often termed ...
In real-world data stream mining, the composition of classes undergoes unpredictable changes, giving rise to the challenge of class evolution, encompassing class emergence, disappearance, and reoccurrence. However, most existing approaches require the storage of past data to adapt their model. While some studies have focused on online learning approaches, they are built on an underlying assumption that the number of instances in any single class is consistently less than the sum of other classes. This assumption becomes invalid when a class emerges with a dominant amount, e.g., news about a pandemic outbreak, harming the performance of existing methods. In this paper, we thoroughly investigate this scenario and propose a novel online ensemble of ensemble one-versus-all framework (EEOF) to handle class evolution adaptively. The novel ensemble of ensemble architecture boosts diversity in each one-versus-all classifier. A novel adaptive model adaptation method is also designed to balance the error feedback between the emerging class and the other classes. A confidence-triggered fallback mode is integrated to prevent performance drop due to a wrong decision regarding class disappearance. Experimental studies are conducted on both synthetic and real-world data streams to show that our method achieves higher accuracy in diverse class evolution scenarios compared with the state-of-the-art method, particularly when classes emerge with dominant amounts.
With the growing number of sensitive data transmitted in IT infrastructures, healthcare organizations and companies that generate users' wearable data have become a target for attackers. To protect electronic he...
ISBN: 9798331314385 (print)
Designing generalizable agents capable of adapting to diverse embodiments has attracted significant attention in Reinforcement Learning (RL) and is critical for deploying RL agents in various real-world applications. Previous Cross-Embodiment RL approaches have focused on transferring knowledge across embodiments within specific tasks. These methods often result in knowledge tightly coupled with those tasks and fail to adequately capture the distinct characteristics of different embodiments. To address this limitation, we introduce the notion of Cross-Embodiment Unsupervised RL (CEURL), which leverages unsupervised learning to enable agents to acquire embodiment-aware and task-agnostic knowledge through online interactions within reward-free environments. We formulate CEURL as a novel Controlled Embodiment Markov Decision Process (CE-MDP) and systematically analyze CEURL's pre-training objectives under CE-MDP. Based on these analyses, we develop a novel algorithm, Pre-trained Embodiment-Aware Control (PEAC), for handling CEURL, incorporating an intrinsic reward function specifically designed for cross-embodiment pre-training. PEAC not only provides an intuitive optimization strategy for cross-embodiment pre-training but can also integrate flexibly with existing unsupervised RL methods, facilitating cross-embodiment exploration and skill discovery. Extensive experiments in both simulated (e.g., DMC and Robosuite) and real-world environments (e.g., legged locomotion) show that PEAC significantly improves adaptation performance and cross-embodiment generalization, demonstrating its effectiveness in overcoming the unique challenges of CEURL. The project page and code are available at https://***/ceurl.