Dynamic programming is a class of algorithms used to compute optimal control policies for Markov decision processes. Dynamic programming is ubiquitous in control theory, and is also the foundation of reinforcement learning. In this paper, we show that value improvement, one of the main steps of dynamic programming, can be naturally seen as composition in a category of optics, and intuitively, the optimal value function is the limit of a chain of optic compositions. We illustrate this with three classic examples: the gridworld, the inverted pendulum and the savings problem. This is a first step towards a complete account of reinforcement learning in terms of parametrised optics.
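To make the value improvement step concrete, the following is a minimal sketch of ordinary value iteration on a small gridworld. It is not the paper's optic-based construction. The grid size, the reward of -1 per move, the discount factor, and the function name `value_iteration` are all assumptions chosen for illustration. Each pass of the loop applies one improvement step, and the returned table approximates the limit of repeating that step.

```python
def value_iteration(n=4, gamma=0.9, tol=1e-8):
    """Value iteration on an n x n gridworld (illustrative assumptions only).

    Assumed setup: the goal is the bottom-right cell, every move costs -1,
    moves that would leave the grid keep the agent in place, and the goal
    is absorbing with value 0.
    """
    goal = (n - 1, n - 1)
    moves = [(-1, 0), (1, 0), (0, -1), (0, 1)]  # up, down, left, right
    values = {(r, c): 0.0 for r in range(n) for c in range(n)}

    def step(state, move):
        # Deterministic transition; bumping into a wall leaves the state unchanged.
        r, c = state[0] + move[0], state[1] + move[1]
        return (r, c) if 0 <= r < n and 0 <= c < n else state

    while True:
        # One value improvement step: back up the best one-move lookahead.
        new_values = {}
        for s in values:
            if s == goal:
                new_values[s] = 0.0
            else:
                new_values[s] = max(-1.0 + gamma * values[step(s, m)] for m in moves)
        delta = max(abs(new_values[s] - values[s]) for s in values)
        values = new_values
        if delta < tol:
            return values


if __name__ == "__main__":
    table = value_iteration()
    for r in range(4):
        print(" ".join(f"{table[(r, c)]:6.2f}" for c in range(4)))
```

Because the transitions here are deterministic, the iteration settles after a number of passes roughly equal to the longest shortest path to the goal. A stochastic model would replace the single successor with an expectation over successors.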
Programming education has gradually become an integral part of the mandatory curriculum in schools. Previous research has provided various learning tools and techniques with the aim of promoting and popularizing progr...
Mobile devices have been increasingly deployed in large-scale cyber-physical systems (CPS) to traverse the field and retrieve various data measurements from designated physical entities with stringent performance requirements. This work studies the Availability-constrained real-time Fresh Data Retrieval problem in CPS with a Speed Adjustable mobile device (AFDR-SA). The goal is to maintain the temporal validity of the real-time data with different priorities to be retrieved in the system while meeting the data availability constraints imposed by the communication range between the mobile device and the physical entities. The general case of the AFDR-SA problem is proved to be NP-hard. A dynamic programming (DP)-based optimal algorithm is proposed for a special scenario where the retrieval times of individual data items with the same priority are of the same length. For the general case where data items can have arbitrary retrieval times and different priorities, a different DP-based scheme is proposed, which is proved to be optimal given the retrieval order. A fast, low-complexity heuristic is also proposed for the general problem to improve computational efficiency. The experimental results show that the proposed schemes for the general case outperform the state-of-the-art methods and perform close to the optimal solution while incurring much less computational overhead.
The Balitsky-Kovchegov (BK) evolution equation is an equation derived from perturbative Quantum Chromodynamics that allows one to evolve with collision energy the scattering amplitude of a pair of quark and antiquark off a hadron target, called the dipole amplitude. The initial condition, being a non-perturbative object, usually has to be modeled separately. Typically, the model contains several tunable parameters that are determined by fitting to experimental data. In this contribution, we propose an implementation of the BK solver using differentiable programming. Automatic differentiation offers the possibility that the first and second derivatives of the amplitude with respect to the initial condition parameters are automatically calculated at all stages of the simulation. This fact should considerably facilitate and speed up the fitting step. Moreover, in the context of Transverse Momentum Distributions (TMD), we demonstrate that automatic differentiation can be used to obtain the first and second derivatives of the amplitude with respect to the quark-antiquark separation. These derivatives can be used to relate various TMD functions to the dipole amplitude. Our C++ code for the solver, which is available in a public repository [1], includes the Balitsky one-loop running coupling prescription and the kinematic constraint. This version of the BK equation is widely used in the small-x evolution framework.
We present novel empirical assessments of prominent finite state machine (FSM) conformance test derivation methods against their coverage of code faults. We consider a number of realistic extended FSM examples with their related Java implementations and derive for these examples complete test suites using the W method and its HSI and H derivatives, considering the case when the implementation under test (IUT) has the same number of states as the specification FSM. We also consider W++, HSI++, and H++ test suites derived for the case when the IUT can have one extra state. For each pair of considered test suites, we determine if there is a difference between the pair in covering the implementation faults. If the difference is significant, we determine which test suite outperforms the other. We run two other assessments which show that the obtained results are not due to the size or length of the test suites. In addition, we conduct assessments to determine whether each of the methods has better coverage of certain classes of faults than others and whether the W method outperforms the HSI and H methods over only certain classes of faults. The results and outcomes of the conducted experiments are summarized. Major artifacts used in the assessments are provided as benchmarks for further studies.
We consider the problem of aircraft motion control in uncertain conditions caused by incomplete and inaccurate knowledge of the aircraft characteristics, as well as by abnormal situations in flight, which affect the properties of the aircraft as the control object. One of the effective tools for solving problems of this kind, providing the adjustment of aircraft control algorithms taking into account its changed dynamics, is reinforcement learning (RL) in the approximate dynamic programming (ADP) variant in combination with artificial neural networks. In the past decade, a family of methods known as adaptive critic design (ACD) has been actively developed within the ADP approach to control the behavior of complex dynamic systems. This paper discusses the application of one variant of the ACD approach, namely, single network adaptive critic (SNAC) and its development through combined use with the dynamic inversion (DI) method. This approach makes it possible to form an optimal adaptive control law for the motion of an aircraft. Its effectiveness is demonstrated on the example of longitudinal motion control for a supersonic transport (SST) airplane.
ISBN: (print) 9798350394023; 9798350394030
In undergraduate courses, there is a wide diversity of approaches to teaching various subjects, ranging from the more traditional methods to the utilization of digital media. Students' learning styles and motivational triggers for learning vary from one group to another. It is essential to understand the types of students in order to formulate a strategy that motivates and inspires them to learn effectively. Computer science students should develop problem-solving skills. To acquire this competency, students must develop algorithmic thinking and not merely understand the syntax of a programming language. The most direct and effective way to obtain this competency is through practice. Students should be motivated by a learning strategy that takes advantage of their potential and enables them to engage in learning through appropriate activities. Gamification is a learning strategy that employs games within the educational environment to enhance student learning, reinforce knowledge, improve skills, or reward specific actions, among other objectives. The gamification strategy aims to motivate students, develop a better commitment in them, and foster the spirit of self-improvement. In this context, competitive programming enables students to engage in problem-solving using advanced programming algorithms, enhancing their skills in a motivational environment. During competitions, they face real problems and see applications of what they have learned, and they can deepen their knowledge because they have to solve each problem in the most efficient way. The specific objective of this study is to use competitive programming as a gamification strategy in computer science groups to enhance problem-solving skills. The study demonstrates a substantial increase in students' motivation and learning. It illustrates that this learning strategy encourages students to learn more effectively in computer science courses.
In recent years, genetic programming-based evolutionary feature construction has shown great potential in various applications. However, a critical challenge in applying this technique is the need to select an appropr...
ISBN: (print) 9798350349184; 9798350349191
Real-time collaborative programming allows a team of programmers to concurrently edit a shared set of source code. To support semantic conflict prevention in real-time collaboration, prior work proposed a dependency-based automatic locking (DAL) approach, which grants locks on selected source code regions based on a set of prefixed rules. To further improve the flexibility of the DAL scheme by utilizing programmers' knowledge of semantic conflict risks and collaboration requirements, we propose a novel Request-Invitation-Approval (RIA) scheme, which allows any programmer to manually request the editing permission on a locked code region, or invite another programmer to share locks on a region. To support the proposed scheme, we further propose two modes for the permission transfer process and contribute detailed techniques for four request patterns. A prototype system implementation has validated the feasibility of the approach and techniques, and user evaluations have demonstrated the satisfactory usability of the system.
Programming is becoming more prevalent in primary schools. However, we still do not know how to effectively train teachers in programming. On the other hand, prior research on pair programming has been shown to suppor...