Model-free continuous control for robot navigation tasks using Deep Reinforcement Learning (DRL) that relies on noisy policies for exploration is sensitive to the density of rewards. In practice, robots are usually deployed in cluttered environments, containing many obstacles and narrow passageways. Designing dense effective rewards is challenging, resulting in exploration issues during training. Such a problem becomes even more serious when tasks are described using temporal logic specifications. This work presents a deep policy gradient algorithm for controlling a robot with unknown dynamics operating in a cluttered environment when the task is specified as a Linear Temporal Logic (LTL) formula. To overcome the environmental challenge of exploration during training, we propose a novel path planning-guided reward scheme by integrating sampling-based methods to effectively complete goal-reaching missions. To facilitate LTL satisfaction, our approach decomposes the LTL mission into sub-goal-reaching tasks that are solved in a distributed manner. Our framework is shown to significantly improve performance (effectiveness, efficiency) and exploration of robots tasked with complex missions in large-scale cluttered environments.
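The abstract above does not spell out its reward scheme, so the following is only a minimal sketch of one way a planned path could densify a goal-reaching reward. It assumes a 2D collision-free waypoint path is already available from some sampling-based planner (for example RRT*), and it rewards the decrease in remaining along-path distance to the goal. The function names, `goal_radius`, and `goal_bonus` are hypothetical and do not come from the paper.

```python
import math

def cumulative_to_goal(path):
    """Remaining along-path distance to the goal from each waypoint."""
    remaining = [0.0] * len(path)
    for i in range(len(path) - 2, -1, -1):
        remaining[i] = remaining[i + 1] + math.dist(path[i], path[i + 1])
    return remaining

def path_distance(state, path, remaining):
    """Along-path distance to the goal from the projection of `state`
    onto the nearest path segment (path must have at least 2 waypoints)."""
    best = None
    for i in range(len(path) - 1):
        (ax, ay), (bx, by) = path[i], path[i + 1]
        dx, dy = bx - ax, by - ay
        seg_len_sq = dx * dx + dy * dy
        t = 0.0 if seg_len_sq == 0 else ((state[0] - ax) * dx + (state[1] - ay) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))          # clamp the projection to the segment
        proj = (ax + t * dx, ay + t * dy)
        off_path = math.dist(state, proj)
        to_goal = math.dist(proj, path[i + 1]) + remaining[i + 1]
        if best is None or off_path < best[0]:
            best = (off_path, to_goal)
    return best[1]

def guided_reward(prev_state, state, path, remaining, goal_radius=0.5, goal_bonus=10.0):
    """Hypothetical dense reward: progress along the planned path,
    plus a bonus once the robot is within goal_radius of the last waypoint."""
    progress = path_distance(prev_state, path, remaining) - path_distance(state, path, remaining)
    reached = math.dist(state, path[-1]) <= goal_radius
    return progress + (goal_bonus if reached else 0.0)
```

Because the reward follows the planned path rather than the straight-line distance to the goal, moving around an obstacle along the path is rewarded instead of penalized. This is the intuition behind guiding exploration with a planner.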
This paper studies optimal motion planning subject to motion and environment uncertainties. By modeling the system as a probabilistic labeled Markov decision process (PL-MDP), the control objective is to synthesize a finite-memory policy, under which the agent satisfies complex high-level tasks expressed as linear temporal logic (LTL) with desired satisfaction probability. In particular, the cost optimization of the trajectory that satisfies infinite horizon tasks is considered, and the trade-off between reducing the expected mean cost and maximizing the probability of task satisfaction is analyzed. The LTL formulas are converted to limit-deterministic Büchi automata (LDBA) with a reachability acceptance condition and a compact graph structure. The novelty of this work lies in considering the cases where LTL specifications can be potentially infeasible and developing a relaxed product MDP between PL-MDP and LDBA. The relaxed product MDP allows the agent to revise its motion plan whenever the task is not fully feasible and quantify the revised plan's violation measurement. A multi-objective optimization problem is then formulated to jointly consider the probability of task satisfaction, the violation with respect to original task constraints, and the implementation cost of the policy execution. The formulated problem can be solved via coupled linear programs. This work is the first to bridge the gap between probabilistic planning revision of potentially infeasible LTL specifications and optimal control synthesis of both plan prefix and plan suffix of the trajectory over the infinite horizon. Experimental results are provided to demonstrate the effectiveness of the proposed framework.
This paper addresses the problem of task assignment and trajectory generation for installing bird diverters using a fleet of multi-rotors. The proposed solution extends our previous motion planner to compute feasible and constrained trajectories, considering payload capacity limitations and recharging constraints. Signal Temporal Logic (STL) specifications are employed to encode the mission objectives and temporal requirements. Additionally, an event-based replanning strategy is introduced to handle unforeseen failures. An energy minimization term is also employed to implicitly save multi-rotor flight time during installation operations. The effectiveness and validity of the approach are demonstrated through simulations in MATLAB and Gazebo, as well as field experiments carried out in a mock-up scenario.
Hyperproperties are increasingly popular in verifying security policies and synthesis of control for dynamic systems. Hyperproperties generalize trace properties to enable reasoning about multiple computation traces that traditional trace properties cannot. Recent works show the effectiveness and prospect of hyperproperties, specifically hyperproperties for Linear Temporal Logic (HyperLTL), in optimality-, robustness-, and privacy-aware robotic motion planning. However, despite its rich expressiveness, HyperLTL cannot express tasks with time constraints. This letter presents HyperTWTL, which extends the compact semantics of Time Window Temporal Logic (TWTL) with explicit and concurrent quantification over multiple execution traces. We demonstrate that HyperTWTL can be used to formalize complex robotic planning objectives. Given HyperTWTL specifications, we also propose a symbolic approach for synthesizing optimality-, robustness-, and privacy-aware strategies by reducing the planning problem to a first-order logic satisfiability problem. The planning problem is then solved using two industrial-strength SMT solvers. The feasibility of HyperTWTL and the efficiency and scalability of the proposed strategy synthesis approach are demonstrated by formalizing important motion planning objectives of a surveillance mission case study and synthesizing the respective strategies using the Z3 and CVC4 SMT solvers.
The problem of passive learning of linear temporal logic formulae consists in finding the best explanation for how two sets of execution traces differ, in the form of the shortest formula that separates the two sets. We approach the problem by implementing an exhaustive search algorithm optimized for execution speed. We apply it to the use-case of a robot moving in an unstructured environment as its battery discharges, both in simulation and in the real world. The results of our experiments confirm that our approach can learn temporal formulas explaining task failures in a case of practical interest.
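To make the idea of an exhaustive search for the shortest separating formula concrete, here is a minimal, unoptimized sketch. It is not the authors' implementation. It assumes finite-trace semantics with a strong next operator, represents a trace as a non-empty list of sets of atomic propositions, and encodes formulas as nested tuples. All names (`holds`, `formulas`, `shortest_separator`) are hypothetical.

```python
def holds(f, trace, i=0):
    """Evaluate formula f at position i of a finite, non-empty trace."""
    op = f[0]
    if op == "ap":
        return f[1] in trace[i]
    if op == "not":
        return not holds(f[1], trace, i)
    if op == "and":
        return holds(f[1], trace, i) and holds(f[2], trace, i)
    if op == "or":
        return holds(f[1], trace, i) or holds(f[2], trace, i)
    if op == "X":  # strong next: false at the last position
        return i + 1 < len(trace) and holds(f[1], trace, i + 1)
    if op == "F":
        return any(holds(f[1], trace, j) for j in range(i, len(trace)))
    if op == "G":
        return all(holds(f[1], trace, j) for j in range(i, len(trace)))
    if op == "U":
        return any(
            holds(f[2], trace, j) and all(holds(f[1], trace, k) for k in range(i, j))
            for j in range(i, len(trace))
        )
    raise ValueError(f"unknown operator {op!r}")

def formulas(size, atoms):
    """Yield every formula whose syntax tree has exactly `size` nodes."""
    if size == 1:
        for a in atoms:
            yield ("ap", a)
        return
    for op in ("not", "X", "F", "G"):
        for sub in formulas(size - 1, atoms):
            yield (op, sub)
    for op in ("and", "or", "U"):
        for left_size in range(1, size - 1):
            for left in formulas(left_size, atoms):
                for right in formulas(size - 1 - left_size, atoms):
                    yield (op, left, right)

def shortest_separator(positive, negative, atoms, max_size=5):
    """Return a smallest formula true on all positive and false on all
    negative traces, or None if none exists up to max_size."""
    for size in range(1, max_size + 1):
        for f in formulas(size, atoms):
            if all(holds(f, t) for t in positive) and not any(holds(f, t) for t in negative):
                return f
    return None
```

Enumerating by increasing size guarantees that the first formula found is a shortest one, at the cost of a search space that grows exponentially with formula size. A speed-optimized implementation would typically prune equivalent formulas and cache evaluations.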
This paper investigates control synthesis for motion planning under conditions of uncertainty, specifically in robot motion and environmental properties, which are modeled using a probabilistic labeled Markov decision process (PL-MDP). To address this, a model-free reinforcement learning (RL) approach is designed to produce a finite-memory control policy that meets complex tasks specified by linear temporal logic (LTL) *** the presence of uncertainties and potentially conflicting objectives, this study centers on addressing infeasible LTL specifications. A relaxed LTL constraint enables the agent to adapt its motion plan, allowing for partial satisfaction by accounting for necessary task ***, a new automaton structure is introduced to increase the density of accepting rewards, facilitating deterministic policy *** proposed RL framework is rigorously analyzed and prioritizes two key objectives: (1) satisfying the acceptance condition of the relaxed product MDP, and (2) minimizing long-term violation *** and experimental results are presented to demonstrate the framework’s effectiveness and robustness.
ISBN: 9798400704208 (print)
Industrial robots are widely used in industrial production as mechanical devices. It is essential to guarantee that their control software operates safely and properly, as any functional or security-related defects may lead to serious incidents. However, industrial robots are programmed mostly in proprietary languages varying from vendor to vendor, making it challenging to formally analyze their correctness in a unified way. One of the most representative robot programming languages is the RAPID language proposed by ABB Robotics. In this paper, we present K-RAPID, a formal executable semantics of RAPID in the K-Framework (K). K-RAPID is developed according to the official ABB documentation and defined in a generic extensible manner. It can be used either for validating the correctness of compiler implementations or for analyzing the control programs written in RAPID. We evaluate the correctness of K-RAPID by executing 563 test programs collected from multiple sources and comparing the results against the official robot simulation environment RobotStudio. The results suggest that K-RAPID covers the core features of RAPID correctly. Moreover, we show how we could apply K-RAPID to verify RAPID programs using LTL model checking and to provide a formal specification of RAPID to uncover inappropriate behaviors in the programs.
We focus on decomposing large multi-agent path planning problems with global temporal logic goals (common to all agents) into smaller sub-problems that can be solved and executed independently. Crucially, the sub-problems' solutions must jointly satisfy the common global mission specification. The agents' missions are given as Capability Temporal Logic (CaTL) formulas, a fragment of Signal Temporal Logic (STL) that can express properties over tasks involving multiple agent capabilities (i.e., different combinations of sensors, effectors, and dynamics) under strict timing constraints. We jointly decompose both the temporal logic specification and the team of agents, using a satisfiability modulo theories (SMT) approach and heuristics for handling temporal operators. The output of the SMT is then distributed to subteams and leads to a significant speed up in planning time compared to planning for the entire team and specification. We include computational results to evaluate the efficiency of our solution, as well as the trade-offs introduced by the conservative nature of the SMT encoding and heuristics.
Temporal logic is becoming increasingly popular for its application in the analysis and control of dynamic systems. Time window temporal logic (TWTL) is a rich expressive language for specifying time-bounded serial tasks in a compact manner that is common in many control applications, such as robotics. Typically, TWTL specifications are verified using an automata-based model checking algorithm. However, verifying important properties of a given system by model checking at design time does not guarantee that the system will behave as expected during runtime operation, which falls under the scope of runtime verification. In this letter, we present a rewriting-based algorithm for runtime monitoring of safety requirements expressed in TWTL. The feasibility and efficiency of our proposed approach are demonstrated using two case studies related to unmanned aerial vehicle (UAV) surveillance and industrial robotics for manufacturing.
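The general idea of rewriting-based runtime monitoring can be sketched as follows: after each observation, the formula is rewritten into a residual formula that the rest of the execution must satisfy, and a verdict is issued as soon as the residual collapses to true or false. The sketch below is not the algorithm from the letter and does not use TWTL syntax. It assumes a small hypothetical time-bounded fragment with operators `("always", k, f)` and `("eventually", k, f)`, where `k` counts the remaining future steps, and with negation restricted to atomic propositions.

```python
def simplify_and(a, b):
    """Conjunction with constant folding."""
    if a is False or b is False:
        return False
    if a is True:
        return b
    if b is True:
        return a
    return ("and", a, b)

def simplify_or(a, b):
    """Disjunction with constant folding."""
    if a is True or b is True:
        return True
    if a is False:
        return b
    if b is False:
        return a
    return ("or", a, b)

def progress(f, obs):
    """Rewrite f after one observation (a set of propositions).
    Returns True, False, or the residual formula for the next step."""
    if isinstance(f, bool):
        return f
    op = f[0]
    if op == "ap":
        return f[1] in obs
    if op == "not":  # negation of an atomic proposition name only
        return f[1] not in obs
    if op == "and":
        return simplify_and(progress(f[1], obs), progress(f[2], obs))
    if op == "or":
        return simplify_or(progress(f[1], obs), progress(f[2], obs))
    if op == "always":  # f[2] must hold now and at each of the next f[1] steps
        k, g = f[1], f[2]
        rest = True if k == 0 else ("always", k - 1, g)
        return simplify_and(progress(g, obs), rest)
    if op == "eventually":  # f[2] must hold now or within the next f[1] steps
        k, g = f[1], f[2]
        rest = False if k == 0 else ("eventually", k - 1, g)
        return simplify_or(progress(g, obs), rest)
    raise ValueError(f"unknown operator {op!r}")

def monitor(formula, trace):
    """Consume observations one at a time and stop at the first verdict.
    Returns (verdict, observations consumed); verdict is None if undecided."""
    for step, obs in enumerate(trace):
        formula = progress(formula, obs)
        if isinstance(formula, bool):
            return formula, step + 1
    return None, len(trace)
```

For a safety requirement such as `("always", 2, ("not", "collision"))`, the monitor reports a violation at the first observation containing `collision`, without waiting for the time window to end. This early verdict is the main practical benefit of monitoring by rewriting.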
Stochastic filters for on-line state estimation are a core technology for autonomous systems. The performance of such filters is one of the key limiting factors to a system's capability. Both asymptotic behavior (e.g., for regular operation) and transient response (e.g., for fast initialization and reset) of such filters are of crucial importance in guaranteeing robust operation of autonomous systems. This letter introduces a new generic formulation for a gyroscope aided attitude estimator using N direction measurements including both body-frame and reference-frame direction type measurements. The approach is based on an integrated state formulation that incorporates navigation, extrinsic calibration for all direction sensors, and gyroscope bias states in a single equivariant geometric structure. This newly proposed symmetry allows modular addition of different direction measurements and their extrinsic calibration while maintaining the ability to include bias states in the same symmetry. The subsequently proposed filter-based estimator using this symmetry noticeably improves the transient response, and the asymptotic bias and extrinsic calibration estimation compared to state-of-the-art approaches. The estimator is verified in statistically representative simulations and is tested in real-world experiments.