In the development of linear quadratic regulator (LQR) algorithms, the Riccati equation approach offers two important characteristics: it is recursive and readily meets the existence condition. However, these attributes apply only to transformed singular systems, and the efficiency of the regulator may be undermined if constraints are violated in nonsingular versions. To address this gap, we introduce a direct approach to the LQR problem for linear singular systems that avoids any transformations and requires no regularity assumptions. We begin by formulating a quadratic cost function to derive the LQR algorithm through a penalized and weighted regression framework, and then connect it to a constrained minimization problem using Bellman's criterion. Next, we employ a backward dynamic programming strategy over a finite horizon to develop an LQR algorithm for the original system. We then carry out the stability and convergence analysis under reachability and observability assumptions on a hypothetical system constructed from the pencil of augmented matrices and connected using the Hamiltonian diagonalization technique.
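For orientation, the recursive property mentioned in this abstract can be seen in the classical finite-horizon Riccati recursion for an ordinary (nonsingular) system. The sketch below is a scalar illustration of that textbook recursion only; it is not the direct singular-system algorithm the abstract proposes. The function names `riccati_backward` and `simulate` and all numeric values are assumptions chosen for the example.

```python
def riccati_backward(a, b, q, r, q_final, horizon):
    """Backward Riccati recursion for the scalar system x[k+1] = a*x[k] + b*u[k].

    Cost: sum over k of (q*x[k]^2 + r*u[k]^2), plus q_final*x[N]^2.
    Returns the feedback gains (u[k] = -gains[k]*x[k]) and the cost-to-go p[0..N].
    """
    p = [0.0] * (horizon + 1)
    gains = [0.0] * horizon
    p[horizon] = q_final  # terminal condition of the dynamic program
    for k in range(horizon - 1, -1, -1):  # sweep backward in time
        gains[k] = (b * p[k + 1] * a) / (r + b * p[k + 1] * b)
        p[k] = q + a * p[k + 1] * a - a * p[k + 1] * b * gains[k]
    return gains, p


def simulate(a, b, gains, x0):
    """Apply the time-varying feedback u[k] = -gains[k]*x[k] from state x0."""
    xs = [x0]
    for gain in gains:
        u = -gain * xs[-1]
        xs.append(a * xs[-1] + b * u)
    return xs


# Hypothetical unstable plant (a > 1) that the regulator must stabilize.
gains, p = riccati_backward(a=1.2, b=1.0, q=1.0, r=1.0, q_final=1.0, horizon=30)
trajectory = simulate(1.2, 1.0, gains, x0=1.0)
```

Each gain depends only on the cost-to-go one step ahead, which is what makes the method recursive; far from the terminal time the gains settle to a constant value.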
Binary neural networks have become a promising research topic due to their advantages of fast inference speed and low energy consumption. However, most existing studies focus on binary convolutional neural networks, while less attention has been paid to binary graph neural networks. A common drawback of existing studies on binary graph neural networks is that they still include many inefficient full-precision operations when multiplying three matrices and are therefore not efficient enough. In this paper, we propose a novel method, called re-quantization-based binary graph neural networks (RQBGN), for binarizing graph neural networks. Specifically, re-quantization, a necessary procedure for further reducing superfluous full-precision operations, quantizes the result of the multiplication between any two matrices during the process of multiplying three matrices. To address the challenges introduced by re-quantization, in RQBGN we first study the impact of different computation orders to find an effective one and then introduce a mixture of experts to increase the model capacity. Experiments on five benchmark datasets show that the computation order in which re-quantization is performed significantly impacts the performance of binary graph neural network models, and that RQBGN outperforms other baselines to achieve state-of-the-art performance.
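A toy sketch may help show why the computation order matters once the intermediate product is re-quantized. The code below assumes sign binarization to {-1, +1} and reads the three matrices as an adjacency-like matrix `A`, features `X`, and weights `W`; these choices, the function names, and the numbers are illustrative assumptions, not RQBGN's actual quantizer or architecture.

```python
def sign(x):
    # Binarize to {-1, +1}; mapping zero to +1 is an assumed convention.
    return 1 if x >= 0 else -1


def binarize(m):
    return [[sign(v) for v in row] for row in m]


def matmul(x, y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)] for row in x]


def requantized_product(a, x, w, order):
    """Multiply three binarized matrices, re-quantizing the intermediate product."""
    a_b, x_b, w_b = binarize(a), binarize(x), binarize(w)
    if order == "(AX)W":
        inner = binarize(matmul(a_b, x_b))  # re-quantize A*X before applying W
        return matmul(inner, w_b)
    if order == "A(XW)":
        inner = binarize(matmul(x_b, w_b))  # re-quantize X*W before applying A
        return matmul(a_b, inner)
    raise ValueError(f"unknown order: {order}")


# Hypothetical full-precision inputs.
A = [[0.5, 0.2], [0.9, -0.3]]
X = [[0.7, -1.1], [0.4, 0.6]]
W = [[0.3, 0.8], [-0.5, 0.1]]

left_first = requantized_product(A, X, W, "(AX)W")
right_first = requantized_product(A, X, W, "A(XW)")
```

Without re-quantization the two orders would agree by associativity; with it, `left_first` and `right_first` differ on this input, which is the effect the abstract attributes to computation order.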
Stochastic gradient descent (SGD) and its variants have been the dominating optimization methods in machine learning. Compared with small-batch SGD, large-batch SGD can better utilize the computational power of current multi-core systems such as graphics processing units (GPUs) and can reduce the number of communication rounds in distributed training settings. Thus, SGD with large-batch training has attracted considerable attention. However, existing empirical results show that large-batch training typically leads to a drop in generalization accuracy. Hence, guaranteeing generalization ability in large-batch training is a challenging task. In this paper, we propose a simple yet effective method, called stochastic normalized gradient descent with momentum (SNGM), for large-batch training. We prove that, with the same number of gradient computations, SNGM can adopt a larger batch size than momentum SGD (MSGD), one of the most widely used variants of SGD, to converge to an ε-stationary point. Empirical results on deep learning verify that, when adopting the same large batch size, SNGM can achieve better test accuracy than MSGD and other state-of-the-art large-batch training methods.
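The abstract does not spell out the update rule, so the following is a minimal sketch under an assumed form: the momentum buffer accumulates the gradient divided by its Euclidean norm, u ← βu + g/‖g‖, followed by w ← w − ηu. The function name `sngm_step`, the quadratic test objective, and the hyperparameters are all assumptions for illustration, and exact gradients stand in for stochastic ones.

```python
import math


def sngm_step(w, u, grad, lr, beta):
    """One normalized-gradient-with-momentum step (assumed form, see lead-in)."""
    norm = math.sqrt(sum(g * g for g in grad))
    # Use a zero direction when the gradient vanishes to avoid dividing by zero.
    direction = [g / norm for g in grad] if norm > 0.0 else [0.0] * len(grad)
    u_new = [beta * ui + di for ui, di in zip(u, direction)]
    w_new = [wi - lr * ui for wi, ui in zip(w, u_new)]
    return w_new, u_new


def loss(w, curv):
    return 0.5 * sum(c * wi * wi for c, wi in zip(curv, w))


def gradient(w, curv):
    return [c * wi for c, wi in zip(curv, w)]


# Hypothetical ill-conditioned quadratic objective.
curv = [1.0, 4.0]
w = [3.0, -2.0]
u = [0.0, 0.0]
lr, beta = 0.01, 0.9

initial_loss = loss(w, curv)
max_step = 0.0
for _ in range(500):
    w_next, u = sngm_step(w, u, gradient(w, curv), lr, beta)
    max_step = max(max_step, math.dist(w, w_next))
    w = w_next
final_loss = loss(w, curv)
```

Because every normalized gradient has unit length, the momentum buffer is bounded by 1/(1 − β), so each parameter update moves at most lr/(1 − β) regardless of how large the raw gradient is.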
Exploration strategy design is a challenging problem in reinforcement learning (RL), especially when the environment contains a large state space or sparse *** exploration, the agent tries to discover unexplored (novel) areas or high reward (quality) *** existing methods perform exploration by only utilizing the novelty of *** novelty and quality in the neighboring area of the current state have not been well utilized to simultaneously guide the agent's *** address this problem, this paper proposes a novel RL framework, called clustered reinforcement learning (CRL), for efficient exploration in *** adopts clustering to divide the collected states into several clusters, based on which a bonus reward reflecting both novelty and quality in the neighboring area (cluster) of the current state is given to the *** leverages these bonus rewards to guide the agent to perform efficient ***, CRL can be combined with existing exploration strategies to improve their performance, as the bonus rewards employed by these existing exploration strategies solely capture the novelty of *** on four continuous control tasks and six hard-exploration Atari-2600 games show that our method can outperform other state-of-the-art methods to achieve the best performance.
As a pivotal enabler of intelligent transportation systems (ITS), the Internet of vehicles (IoV) has attracted extensive attention from academia and industry. The exponential growth of computation-intensive, latency-sensitive, and privacy-aware vehicular applications in IoV has driven the shift from cloud computing to edge computing, which enables tasks to be offloaded to edge nodes (ENs) closer to vehicles for efficient execution. In the ITS environment, however, computation offloading requests are dynamic and stochastic, so it is challenging to efficiently orchestrate offloading decisions that meet application requirements. Accomplishing complex computation offloading for vehicles while ensuring data privacy also remains challenging. In this paper, we propose an intelligent computation offloading scheme with privacy protection, named COPP. In particular, an Advanced Encryption Standard-based encryption method is used to implement privacy protection. Furthermore, an online offloading scheme is proposed to find optimal offloading policies. Finally, experimental results demonstrate that COPP significantly outperforms benchmark schemes in both delay and energy consumption.
Highway safety researchers focus on crash injury severity, utilizing deep learning, specifically deep neural networks (DNN), deep convolutional neural networks (D-CNN), and deep recurrent neural networks (D-RNN), as the preferred method for modeling accident *** learning's strength lies in handling intricate relationships within extensive datasets, making it popular for accident severity level (ASL) prediction and *** prior success, there is a need for an efficient system recognizing ASL in diverse road *** address this, we present an innovative Accident Severity Level Prediction Deep Learning (ASLP-DL) framework, incorporating DNN, D-CNN, and D-RNN models fine-tuned through iterative hyperparameter selection with Stochastic Gradient *** framework optimizes hidden layers and integrates data augmentation, Gaussian noise, and dropout regularization for improved *** and factor contribution analyses identify influential *** on three diverse crash record databases (NCDB 2018–2019, UK 2015–2020, and US 2016–2021), the D-RNN model excels with an ACC score of 89.0281%, a ROC area of 0.751, an F-estimate of 0.941, and a Kappa score of 0.0629 over the NCDB *** proposed framework consistently outperforms traditional methods, existing machine learning, and deep learning techniques.
Dexterous robot manipulation has shone in complex industrial scenarios, where multiple manipulators, or fingers, cooperate to grasp and manipulate objects. In such scenarios, model predictive control (MPC) has demonstrated exceptional performance in complex multi-robot manipulation tasks involving multi-objective optimization with system constraints. However, the substantial computational load required to solve the optimal control problem (OCP) at each triggering instant can lead to significant delays between state sampling and control application, hindering real-time performance. To address these challenges, this paper introduces a novel robust tube-based smooth MPC approach for two fundamental manipulation tasks: reaching a given target and tracking a reference trajectory. By predicting the successor state and using it as the initial condition for the next OCP, we can solve the forthcoming OCP ahead of time, alleviating delay effects. Additionally, we establish an upper bound for linearizing the original nonlinear system, reducing OCP complexity and enhancing response speed. Grounded in tube-based MPC theory, recursive feasibility and closed-loop stability under constraints and disturbances are ensured. Empirical validation is provided through two numerical simulations and two real-world dexterous robot manipulation tasks, which show that the seamless control input produced by our method effectively enhances solving efficiency and control performance compared with conventional time-triggered MPC strategies.
GPT is widely recognized as one of the most versatile and powerful large language models, excelling across diverse domains. However, its significant computational demands often render it economically unfeasible for in...
Dear Editor, This letter presents a new transfer learning framework for deep multi-agent reinforcement learning (DMARL) to reduce the convergence difficulty and training time when applying DMARL to a new scenario [1], [2].
Mobile applications (apps for short) often need to display ***, inefficient image displaying (IID) issues are pervasive in mobile apps, and can severely impact app performance and user *** paper first establishes a descriptive framework for the image displaying procedures of IID *** on the descriptive framework, we conduct an empirical study of 216 real-world IID issues collected from 243 popular open-source Android apps to validate the presence and severity of IID issues, and then shed light on these issues' characteristics to support research on effective issue *** the findings of this study, we propose a static IID issue detection tool TAPIR and evaluate it with 243 real-world Android ***, 49 and 64 previously-unknown IID issues in two different versions of 16 apps reported by TAPIR are manually confirmed as true positives, respectively, and 16 previously-unknown IID issues reported by TAPIR have been confirmed by developers and 13 have been ***, we further evaluate the performance impact of these detected IID issues and the performance improvement if they are *** results demonstrate that the IID issues detected by TAPIR indeed cause significant performance degradation, which further shows the effectiveness and efficiency of TAPIR.