In recent years, multiagent reinforcement learning (MARL) has demonstrated considerable potential across diverse applications. However, in reinforcement learning environments characterized by sparse rewards, the scarcity of reward signals may give rise to reward conflicts among agents. In these scenarios, each agent tends to compete for the limited rewards, deviating from collaborative efforts aimed at achieving collective team objectives. This not only amplifies the learning challenge but also constrains the overall learning performance of agents, ultimately compromising the attainment of team goals. To mitigate the conflicting competition for rewards among agents in MARL, we introduce the bidirectional influence and interaction (BDII) MARL framework. This approach draws inspiration from the collaborative ethos observed in human social cooperation, specifically the concept of "sharing joys and sorrows." The fundamental concept behind BDII is to empower agents to share their individual rewards with collaborators, fostering a cooperative rather than competitive behavioral paradigm. This strategic shift aims to resolve the pervasive issue of reward conflicts among agents operating in sparse-reward environments. BDII incorporates two key factors: the Gaussian kernel distance between agents (physical distance) and the policy diversity among agents (logical distance). The two factors jointly drive the dynamic adjustment of reward allocation coefficients, which in turn form the reward distribution weights. These weights facilitate the equitable sharing of agents’ contributions to rewards, promoting a cooperative learning environment. Through extensive experimental evaluations, we substantiate the efficacy of BDII in addressing the challenge of reward conflicts in MARL. Our research findings affirm that BDII significantly mitigates reward conflicts, ensuring that agents consistently align with the origi
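The reward-sharing idea described above can be illustrated with a small sketch. The abstract does not give the BDII formulas, so everything here is an assumption made for illustration: the function names, the use of total-variation distance as the "logical distance", the product rule for combining the two factors, and the choice that similar policies receive larger weights.

```python
import math

def gaussian_kernel(pos_a, pos_b, sigma=1.0):
    """Physical closeness in (0, 1]; nearby agents score close to 1."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(pos_a, pos_b))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))

def policy_similarity(pi_a, pi_b):
    """Assumed logical closeness: 1 minus the total-variation distance
    between two action distributions over the same action set."""
    return 1.0 - 0.5 * sum(abs(a - b) for a, b in zip(pi_a, pi_b))

def sharing_weights(positions, policies, sigma=1.0):
    """Row i holds the fractions of agent i's reward given to each agent.
    Each row sums to 1; the product rule is a hypothetical combination."""
    n = len(positions)
    weights = []
    for i in range(n):
        raw = [
            gaussian_kernel(positions[i], positions[j], sigma)
            * policy_similarity(policies[i], policies[j])
            for j in range(n)
        ]
        total = sum(raw)  # at least 1.0, because raw[i] == 1.0
        weights.append([r / total for r in raw])
    return weights

def shared_rewards(rewards, weights):
    """Agent j receives the weighted share of every agent's raw reward."""
    n = len(rewards)
    return [sum(weights[i][j] * rewards[i] for i in range(n)) for j in range(n)]
```

Because every row of the weight matrix sums to 1, the total team reward is preserved; only its distribution among agents changes.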
Adaptivity is the capacity of software to adjust itself to changes in its environment. A common approach to achieving adaptivity is to introduce dedicated code during the software development stage. However, since those code fragments are designed a priori, self-adaptive software cannot adequately handle situations in which contextual changes go beyond those originally anticipated. In this case, the original built-in adaptivity should be tuned. For example, new code should be added to provide the capacity to sense the unexpected environment or to replace outdated adaptation decision logic. The technical challenges in this process, especially that of tuning software adaptivity at runtime, should not be underestimated. In this paper, we propose an architecture-centric application framework for self-adaptive software named Auxo. Similar to existing work, our framework supports the development and running of self-adaptive software. Furthermore, our framework supports the tuning of software adaptivity without requiring the running self-adaptive software to be terminated. In short, the architecture style that we introduce can encapsulate not only general functional logic but also the concerns in the self-adaptation loop (such as sensing, decision, and execution) as architecture elements. As a result, a third party, potentially the operator or an augmented software entity equipped with explicit domain knowledge, is able to dynamically and flexibly adjust the self-adaptation concerns by modifying the runtime software architecture. To exercise, validate, and evaluate our approach, we describe a self-adaptive application that was deployed on the framework and report several experiments involving self-adaptation and the online tuning of software adaptivity.
ISBN (print): 3540296395
If-conversion and predicated execution are widely adopted to eliminate the branch misprediction penalty. Previous predicated execution depends on the compiler to generate explicit predicated instructions. In this paper, a trace-based predication mechanism named RIMP (Runtime IMplicit Predication) is discussed. The candidates for if-conversion are identified during dynamic execution. A conventional trace cache is modified to store RIMP traces, which include instructions from both the fall-through block and the target block following the conditional branch. A hardware extension adds predication to RIMP traces automatically. With the help of RIMP, legacy applications can benefit from the predication mechanism without recompiling source code. Simulation of a RIMP implementation under diverse microarchitecture configurations is presented in the paper. The results show promising performance improvement. In general, RIMP with 64kB of trace storage delivers an average 10.3% IPC improvement while speeding up execution by over 7%.
Jamming attacks can severely affect the performance of wireless sensor networks (WSNs) due to the broadcast nature of the wireless medium. In order to localize the source of the attack, we propose in this paper a jammer localization algorithm named Minimum-circle-covering based localization (MCCL). Compared with existing solutions that rely on wireless propagation parameters, MCCL depends only on the location information of sensor nodes at the border of the jammed region. MCCL uses plane geometry, especially the minimum circle covering technique, to form an approximate jammed region, and the center of that region is treated as the estimated position of the jammer. Simulation results show that MCCL achieves higher accuracy than other existing solutions in terms of the jammer's transmission range and sensitivity to node density.
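The geometric core of this idea can be sketched as follows. This is a generic randomized incremental minimum-enclosing-circle routine, not the authors' implementation; how MCCL selects the border nodes is not described in the abstract, so the input is simply assumed to be their (x, y) coordinates, and the function names are hypothetical.

```python
import math
import random

def _circle_from_two(a, b):
    """Circle whose diameter is the segment a-b, as (cx, cy, r)."""
    return ((a[0] + b[0]) / 2.0, (a[1] + b[1]) / 2.0, math.dist(a, b) / 2.0)

def _circle_from_three(a, b, c):
    """Circumcircle of a, b, c, or None if the points are collinear."""
    ax, ay = a
    bx, by = b
    cx, cy = c
    d = 2.0 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    if abs(d) < 1e-12:
        return None
    a2, b2, c2 = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (a2 * (by - cy) + b2 * (cy - ay) + c2 * (ay - by)) / d
    uy = (a2 * (cx - bx) + b2 * (ax - cx) + c2 * (bx - ax)) / d
    return (ux, uy, math.dist((ux, uy), a))

def _inside(circle, p, eps=1e-9):
    return math.dist((circle[0], circle[1]), p) <= circle[2] + eps

def min_enclosing_circle(points):
    """Smallest circle covering all points, returned as (cx, cy, r).
    The center serves as the jammer position estimate."""
    pts = list(points)
    random.Random(0).shuffle(pts)  # fixed seed keeps runs reproducible
    circle = None
    for i, p in enumerate(pts):
        if circle is not None and _inside(circle, p):
            continue
        circle = (p[0], p[1], 0.0)           # p must lie on the boundary
        for j in range(i):
            q = pts[j]
            if _inside(circle, q):
                continue
            circle = _circle_from_two(p, q)  # p and q on the boundary
            for k in range(j):
                r = pts[k]
                if _inside(circle, r):
                    continue
                three = _circle_from_three(p, q, r)
                if three is not None:
                    circle = three
    return circle
```

For example, `min_enclosing_circle(border_nodes)[:2]` would give the estimated jammer coordinates for a list `border_nodes` of boundary sensor positions.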
The force-directed approach is one of the most widely used methods in graph drawing research. There are two main problems with traditional force-directed algorithms. First, there is no mature theory to ensure the convergence of the iteration sequence used in the algorithm, and it is hard to estimate the rate of convergence even when convergence holds. Second, the running time grows intolerably when drawing large-scale graphs, which limits the advantages of the force-directed approach in practice. This paper focuses on these problems and presents a sufficient condition for ensuring the convergence of the iterations. We then develop a practical heuristic algorithm that speeds up the iteration in the force-directed approach using a successive over-relaxation (SOR) strategy. The results of computational tests on several benchmark graph datasets widely used in graph drawing research show that our algorithm can dramatically improve the performance of the force-directed approach by decreasing both the number of iterations and the running time, and that it is on average 1.5 times faster than the original force-directed approach.
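To make the SOR idea concrete, here is a minimal sketch on a simplified model. It assumes a barycentric (Tutte-style) layout, where each free node is pulled toward the average of its neighbors and some nodes are pinned; the paper's actual force model and heuristic are not given in the abstract, so the function name and parameters are illustrative only.

```python
def sor_barycentric_layout(adj, fixed, omega=1.5, tol=1e-8, max_iter=10_000):
    """adj: dict mapping node -> list of neighbors (undirected graph).
    fixed: dict mapping pinned node -> (x, y).
    omega: relaxation factor; 1.0 is plain Gauss-Seidel, values in (1, 2)
    over-relax each update. Returns (positions, sweeps_used)."""
    pos = {v: fixed.get(v, (0.0, 0.0)) for v in adj}
    free = [v for v in adj if v not in fixed]
    for sweep in range(1, max_iter + 1):
        max_move = 0.0
        for v in free:
            nbrs = adj[v]
            # Equilibrium target for v given the current neighbor positions.
            bx = sum(pos[u][0] for u in nbrs) / len(nbrs)
            by = sum(pos[u][1] for u in nbrs) / len(nbrs)
            x, y = pos[v]
            # SOR step: move past the target by a factor of omega.
            nx = x + omega * (bx - x)
            ny = y + omega * (by - y)
            max_move = max(max_move, abs(nx - x), abs(ny - y))
            pos[v] = (nx, ny)  # updated in place, so later nodes see it
        if max_move < tol:
            return pos, sweep
    return pos, max_iter
```

How much over-relaxation helps depends on the graph and on the chosen factor, so in practice `omega` would be tuned rather than fixed at 1.5.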
Modular datacenters (MDCs) use shipping containers, encapsulating thousands of servers, as large pluggable building blocks for mega *** MDC’s "service-free" model poses stricter demand on fault-tolerance of the modular datacenter network (MDCN). Based on the "scale-out" principle, in this paper we propose SCautz, a novel hybrid intra-container network for *** comprises a base Kautz topology, created by interconnecting servers, and a small number of COTS (commercial off-the-shelf) ***, each switch connects a specific number of servers forming "clusters", which, as logical nodes, form multiple higher-level logical Kautz ***’s hybrid structure has several ***, it supports multiple running modes for the MDC, while its full structure increases network capacity ***, it retains the throughput for processing one-to-x traffic in the presence of ***, it achieves much more graceful network performance degradation than computation and storage capacity *** from theoretical analysis and simulations show that SCautz is more viable for intra-container networks.
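For readers unfamiliar with the base topology, the following sketch builds a plain Kautz digraph using the standard string-based definition. It covers only the base Kautz structure; the switch-based clusters and higher-level logical networks of SCautz are not specified in enough detail in the abstract to sketch, and the function name is hypothetical.

```python
from itertools import product

def kautz_graph(degree, length):
    """Kautz digraph as a dict mapping node -> list of successor nodes.
    Nodes are tuples of `length` symbols drawn from `degree + 1` symbols,
    with no two adjacent symbols equal. A node links to every node formed
    by dropping its first symbol and appending a symbol that differs
    from its last one, so each node has exactly `degree` successors."""
    alphabet = range(degree + 1)
    nodes = [
        s for s in product(alphabet, repeat=length)
        if all(a != b for a, b in zip(s, s[1:]))
    ]
    return {
        s: [s[1:] + (x,) for x in alphabet if x != s[-1]]
        for s in nodes
    }
```

With `degree` d and `length` k the graph has (d + 1) * d**(k - 1) nodes, so a small per-server port count already connects a large number of servers.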
We consider the maximal vector problem on uncertain data, which has recently been posed by the study of processing skyline queries over a probabilistic data stream in the database context. Let $D_n$ be a set of $n$ points in a $d$-dimensional space and $q$ ($0 < q \le 1$) be a probability threshold; each point in $D_n$ has a probability to occur. Our problem is concerned with how to estimate the expected size of the probabilistic skyline, which consists of all the points that are not dominated by any other point in $D_n$ with a probability not less than $q$. We prove that the upper bound of the expected size is $O(\min\{n, (-\ln q)(\ln n)^{d-1}\})$ under the assumptions that the value distribution on each dimension is independent and the values of the points along each dimension are distinct. The main idea of our proof is to find a recurrence about the expected size and solve it. Our results reveal the relationship between the probability threshold $q$ and the expected size of the probabilistic skyline, and show that the upper bound is poly-logarithmic when $q$ is not extremely small.
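The object being bounded can be made concrete with a brute-force sketch. It assumes the common definition in which a point's skyline probability is its own occurrence probability multiplied by the probability that none of its dominators occurs, with points occurring independently and smaller values preferred in every dimension; the abstract does not spell out these conventions, so they are assumptions, as are the function names.

```python
def dominates(a, b):
    """a dominates b if a is no worse in every dimension and strictly
    better in at least one (smaller is assumed to be better)."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def skyline_probability(i, points, probs):
    """Probability that point i occurs and no point dominating it occurs,
    assuming points occur independently."""
    p = probs[i]
    for j, other in enumerate(points):
        if j != i and dominates(other, points[i]):
            p *= 1.0 - probs[j]
    return p

def probabilistic_skyline(points, probs, q):
    """Indices of points whose skyline probability is at least q."""
    return [
        i for i in range(len(points))
        if skyline_probability(i, points, probs) >= q
    ]
```

Lowering the threshold `q` can only enlarge the returned set, which is the dependence on $q$ that the stated bound quantifies.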
Cloud computing has been widely adopted by enterprises because of its on-demand and elastic resource usage paradigm. Currently, most cloud applications run on a single cloud. However, more and more applications need to run across several clouds to satisfy requirements such as best cost efficiency, avoidance of vendor lock-in, and geolocation-sensitive service. JointCloud computing is a new research initiative launched by Chinese institutes to address the computing issues associated with multiple clouds. In JointCloud, users' diverse and dynamic requirements on cloud resources are satisfied by providing users with a virtual cloud (VC) for special purposes. A virtual cloud for special purposes is in essence a user-specific cloud working environment with customized software stacks, configurations, and computing resources readily available. This paper first introduces what JointCloud computing is and then describes the design rationales, motivating examples, mechanisms, and enabling technologies of VC in JointCloud.
The Sparse Matrix-Vector product (SpMV) is a key operation in engineering and scientific computing. Methods for efficiently implementing it in parallel are critical to the performance of many applications. Modern Grap...
Combining virtual machine technology and network computing technology will be able to effectively aggregate the widely distributed heterogeneous and autonomous resources in the Internet. This paper proposes a virtual ...