A primary challenge for visual-based Reinforcement Learning (RL) is to generalize effectively across unseen environments. Although previous studies have explored different auxiliary tasks to enhance generalization, fe...
This paper introduces an automatic Web service composition method based on logical inference of Horn clauses and Petri nets. The Web service composition problem is transformed into the logical inference problem of Hor...
One of the most promising advantages of Web service technology is the possibility of creating value-added services by combining existing ones. A major challenge is how to discover and select concrete services according to user requirements. This paper addresses service discovery for composite Web services. The main feature is that we take the process model as well as the service profile into account. First, the process models of Web services are translated into Petri nets. Based on this, we propose a service matchmaking algorithm that compares functionality compatibility and process consistency, leading to more accurate matchmaking.
Illumination, occlusion, pose and expression variations are the most common challenging problems for face recognition in many real-world applications. However, existing face recognition methods are proposed to handle ...
Designing and deriving effective model-based reinforcement learning (MBRL) algorithms with a performance improvement guarantee is challenging, mainly because of the tight coupling between model learning and policy optimization. Many prior methods that rely on return discrepancy to guide model learning ignore the impact of model shift, which can lead to performance deterioration due to excessive model updates. Other methods use a performance difference bound to explicitly account for model shift. However, these methods rely on a fixed threshold to constrain model shift, resulting in a heavy dependence on the threshold and a lack of adaptability during training. In this paper, we theoretically derive an optimization objective that unifies model shift and model bias, and then formulate a fine-tuning process. This process adaptively adjusts the model updates to obtain a performance improvement guarantee while avoiding model overfitting. Based on these results, we develop a straightforward algorithm, USB-PO (Unified model Shift and model Bias Policy Optimization). Empirical results show that USB-PO achieves state-of-the-art performance on several challenging benchmark tasks. Code: https://***/betray12138/***
To improve data availability in the cloud and avoid the risk of vendor lock-in, multi-cloud storage is attracting increasing attention. However, accessing data from the cloud usually has some disadvantages such as...
ISBN: 9780889868595 (print)
Central pattern generators (CPGs) play an important role in the rhythmic activities of animals, and this mechanism is an important source of inspiration for the motion control of legged robots. In this paper, using CPGs and a function mapping mechanism, a high-efficiency distributed CPG control network is constructed to realize locomotion control of the biped NAO robot. To achieve stable and coordinated locomotion, the parameters of the CPG network are evolved by a multi-objective genetic algorithm (MOGA). Simulations with Webots validate the feasibility and efficiency of the presented CPG-based control method.
To reduce traffic accidents and road congestion in many cities, vehicle speed estimation is critical for enforcing speed limits and monitoring traffic conditions. In this paper, we present a spee...
In this work, we aim at building a bridge from poor behavioral data to an effective, quick-response, and robust behavior model for online identity theft detection. We concentrate on this issue in online social network...