Gradient-based algorithms are commonly used to train radial basis function neural networks (RBFNNs). However, one challenge in the training process is avoiding the vanishing gradient. To solve this problem, this paper designs an accelerated gradient algorithm (AGA) to improve the learning performance of RBFNNs. First, an indirect detection mechanism, based on the instantaneous gradient decay rate (IGDR) and the instantaneous convergence rate (ICR), is developed to identify the vanishing gradient in the learning process. Second, an amplification gradient strategy (AGS), which increases the gradient values of the learning parameters, is designed to accelerate the learning speed of the RBFNN. Third, an analysis of the AGA-based RBFNN (AGA-RBFNN) is given to guarantee its successful application. Finally, results on benchmark and real-world problems demonstrate the effectiveness of AGA-RBFNN. (c) 2021 Elsevier B.V. All rights reserved.
In this paper, we propose an optimal trade-off model for portfolio selection that accounts for systematic risk diversification, measured by the maximum marginal systematic risk over all risk contributors. First, the classical portfolio selection model with constraints on the allocation of systematic risk is shown to be equivalent to our trade-off model under certain conditions. Then, we equivalently transform the trade-off model into a special non-convex, non-smooth composite problem, so that a modified accelerated gradient (AG) algorithm can be introduced to solve it. The efficiency of the algorithm is demonstrated by theoretical results on both the convergence rate and the iteration complexity bound. Finally, empirical analysis demonstrates that the proposed model is a preferred tool for active portfolio risk management when compared with existing models. We also carry out a series of numerical experiments to compare the performance of the modified AG algorithm with three other first-order algorithms.
For smooth convex optimization problems, the optimal convergence rate of first-order algorithms is O(1/k^2) in theory. This paper proposes three improved accelerated gradient algorithms that use the gradient information at the latest point. For the step size, new adaptive line search strategies are adopted to avoid using the global Lipschitz constant and to make the algorithms converge faster. By constructing a descent Lyapunov function, we prove that the proposed algorithms preserve the O(1/k^2) convergence rate. Numerical experiments demonstrate that our algorithms perform better than some existing algorithms that have the optimal convergence rate.
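As a rough illustration of the ingredients named in this abstract (momentum-based acceleration plus a line search that avoids the global Lipschitz constant), the following is a minimal sketch of the standard Nesterov/FISTA-style scheme with backtracking. It is not one of the three algorithms proposed in the paper, whose details the abstract does not give. The function name `accelerated_gradient`, the parameters `step0` and `shrink`, and the quadratic test problem are all assumptions made for this example.

```python
def accelerated_gradient(f, grad, x0, iters=200, step0=1.0, shrink=0.5):
    """Nesterov-style accelerated gradient with a backtracking step size.

    Hypothetical helper for illustration; f maps a list of floats to a float
    and grad returns the gradient as a list of floats.
    """
    x = list(x0)   # current iterate
    y = list(x0)   # extrapolated point where the gradient is evaluated
    t = 1.0        # momentum parameter
    step = step0
    for _ in range(iters):
        g = grad(y)
        fy = f(y)
        gnorm2 = sum(gi * gi for gi in g)
        # Backtracking: shrink the step until a sufficient-decrease test
        # holds, so no global Lipschitz constant has to be supplied.
        while True:
            x_new = [yi - step * gi for yi, gi in zip(y, g)]
            if f(x_new) <= fy - 0.5 * step * gnorm2:
                break
            step *= shrink
        # Standard momentum update.
        t_new = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0
        beta = (t - 1.0) / t_new
        y = [xn + beta * (xn - xi) for xn, xi in zip(x_new, x)]
        x, t = x_new, t_new
    return x


# Assumed test problem: a smooth convex quadratic minimized at (1, -2).
def f(x):
    return 0.5 * ((x[0] - 1.0) ** 2 + 10.0 * (x[1] + 2.0) ** 2)


def grad(x):
    return [x[0] - 1.0, 10.0 * (x[1] + 2.0)]


x_star = accelerated_gradient(f, grad, [0.0, 0.0])
```

The step size here only ever decreases, which is the simplest backtracking rule; the adaptive strategies mentioned in the abstract may differ.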
Accelerated gradient algorithms are currently attracting widespread interest in optimization theory, in both discrete-time and continuous-time (CT) frameworks. In light of recent developments, in the first part of our work, we design a CT accelerated gradient algorithm for strongly connected directed graphs. We show that the convergence is exponential and that the convergence rate is proportional to the gradient gain, which can be chosen arbitrarily. To facilitate implementation of the algorithm over communication networks, in the second part of our work, we design an event-based broadcasting protocol that intermittently checks for events by evaluating an event-triggering condition and decides accordingly whether to broadcast. The distributed system, with CT dynamics and discrete-time (event-based) broadcasts, is reformulated as a hybrid dynamical system that is free of Zeno solutions. Finally, we provide a numerical example to demonstrate our results.
Non-synchronous measurements are particularly effective at expanding the operating frequency range of a single microphone array. The completed cross-spectral matrix of the synthetic array, when combined with high-resolution imaging algorithms, can yield accurate sound source identification results. However, when matrix completion algorithms are applied to large synthetic arrays, their computational complexity increases significantly. Furthermore, existing integrations with high-resolution imaging algorithms often encounter difficulties in the presence of coherent sound sources. To address these problems, this paper combines the accelerated gradient descent algorithm with covariance matrix fitting by orthogonal least squares. The matrix completion model for non-synchronous measurements is first simplified through matrix decomposition, enabling rapid completion of the cross-spectral matrix with the accelerated gradient descent algorithm. Covariance matrix fitting by orthogonal least squares is then applied to the completed cross-spectral matrix to identify coherent sound sources quickly and precisely. The performance of the algorithm is evaluated using numerical simulations and validated through loudspeaker experiments in an anechoic chamber. The results of these simulations and experiments show that the proposed algorithm not only improves matrix completion performance on large synthetic arrays but also accurately identifies the locations and correlation coefficients of coherent sound sources.
Non-convex regularization has been recognized in recent studies as an especially important approach to promoting sparsity. In this paper, we study the non-convex piecewise quadratic approximation (PQA) regularization for sparse solutions of the linear inverse problem. It is shown that exact recovery of sparse signals and stable recovery of compressible signals are possible through a local optimum of this regularization. After developing a thresholding representation theory for PQA regularization, we propose an iterative PQA thresholding algorithm (PQA algorithm) to solve this problem. The PQA algorithm converges to a local minimizer of the regularization, with an eventually linear convergence rate. Furthermore, we adopt the idea of the accelerated gradient method to design the accelerated iterative PQA thresholding algorithm (APQA algorithm), which is also linearly convergent, but with a faster convergence rate. Finally, we carry out a series of numerical experiments to assess the performance of both algorithms for PQA regularization. The results show that PQA regularization outperforms L_1 and L_{1/2} regularizations in terms of accuracy and sparsity, while the APQA algorithm is demonstrated to be significantly better than the PQA algorithm.
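The general shape of an accelerated iterative thresholding method can be sketched as follows. The abstract does not state the PQA thresholding operator, so this sketch swaps in the well-known soft-thresholding operator of L_1 regularization as a stand-in; it therefore illustrates the accelerated thresholding pattern, not the APQA algorithm itself. The names `soft_threshold` and `accelerated_thresholding`, and the small diagonal test problem, are assumptions made for this example.

```python
def soft_threshold(v, tau):
    """Soft-thresholding (the L_1 stand-in): shrink v toward zero by tau."""
    if v > tau:
        return v - tau
    if v < -tau:
        return v + tau
    return 0.0


def accelerated_thresholding(A, b, lam, step, iters=300):
    """Minimize 0.5*||A x - b||^2 + lam*||x||_1 with momentum (FISTA-style).

    Hypothetical helper; A is a list of rows, and step should not exceed
    1 / (largest eigenvalue of A^T A).
    """
    m, n = len(A), len(A[0])
    x = [0.0] * n
    y = [0.0] * n
    t = 1.0
    for _ in range(iters):
        # Residual r = A y - b and gradient g = A^T r of the smooth term.
        r = [sum(A[i][j] * y[j] for j in range(n)) - b[i] for i in range(m)]
        g = [sum(A[i][j] * r[i] for i in range(m)) for j in range(n)]
        # Gradient step followed by component-wise thresholding.
        x_new = [soft_threshold(y[j] - step * g[j], step * lam) for j in range(n)]
        # Momentum extrapolation borrowed from accelerated gradient methods.
        t_new = (1.0 + (1.0 + 4.0 * t * t) ** 0.5) / 2.0
        beta = (t - 1.0) / t_new
        y = [x_new[j] + beta * (x_new[j] - x[j]) for j in range(n)]
        x, t = x_new, t_new
    return x


# Assumed toy problem: the middle component of b is too small to survive
# thresholding, so the recovered vector is sparse.
A = [[1.0, 0.0, 0.0], [0.0, 2.0, 0.0], [0.0, 0.0, 1.0]]
b = [3.0, 0.02, -2.0]
x_hat = accelerated_thresholding(A, b, lam=0.1, step=0.25)
```

Replacing `soft_threshold` with a different thresholding operator changes the regularizer while leaving the momentum structure intact, which is the design idea the abstract describes.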
A distributed model predictive control (DMPC) approach based on distributed optimization is applied to the power reference tracking problem of a hydro power valley (HPV) system. The applied optimization algorithm is based on accelerated gradient methods and achieves a convergence rate of O(1/k^2), where k is the iteration number. Major challenges in the control of the HPV include a nonlinear and large-scale model, nonsmoothness in the power-production functions, and a globally coupled cost function that prevents distributed schemes from being applied directly. We propose a linearization and approximation approach that accommodates the proposed DMPC framework and provides performance very similar to that of a centralized solution in simulations. The provided numerical studies also suggest that, for the sparsely interconnected system at hand, the distributed algorithm we propose is faster than a centralized state-of-the-art solver such as CPLEX. (C) 2013 Elsevier Ltd. All rights reserved.
In recent years, there has been growing interest in learning to rank. The introduction of feature selection into different learning problems has been proven effective. These facts motivate us to investigate the problem of feature selection for learning to rank. We propose a joint convex optimization formulation that minimizes ranking errors while simultaneously conducting feature selection. This optimization formulation provides a flexible framework in which we can easily incorporate various importance measures and similarity measures of the features. To solve this optimization problem, we use Nesterov's approach to derive an accelerated gradient algorithm with a fast convergence rate of O(1/T^2). We further develop a generalization bound for the proposed optimization problem using Rademacher complexities. Extensive experimental evaluations are conducted on the public LETOR benchmark datasets. The results demonstrate that the proposed method shows: 1) significant ranking performance gain compared to several feature selection baselines for ranking, and 2) very competitive performance compared to several state-of-the-art learning-to-rank algorithms.
Clustering techniques offer a systematic approach to organizing the diverse and fast-growing body of Web services by assigning relevant services to homogeneous service communities. However, the ever-increasing number of Web services poses key challenges for building large-scale service communities. In this paper, we tackle the scalability issue in service clustering, aiming to accurately and efficiently discover service communities over very large collections of services. A key observation is that service descriptions are usually represented by long but very sparse term vectors, as each service is described by only a limited number of terms. This inspires us to seek a new service representation that is economical to store, efficient to process, and intuitive to interpret. This new representation enables service clustering to scale to a massive number of services. More specifically, a set of anchor services is identified that allows each service to be represented as a linear combination of a small number of anchor services. In this way, the large number of services is encoded in a much more compact anchor service space. Although service clustering can be performed much more efficiently in the compact anchor service space, discovering anchor services from large-scale service descriptions may incur a high computational cost. We develop principled optimization strategies for efficient anchor service discovery. Extensive experiments are conducted on real-world service data to assess both the effectiveness and the efficiency of the proposed approach. Results on a dataset with over 3,700 Web services clearly demonstrate the good scalability of the sparse functional representation and the efficiency of the optimization algorithms for anchor service discovery.
Recently there has been renewed interest in single-hidden-layer neural networks (SHLNNs). This is due to their powerful modeling ability as well as the existence of some efficient learning algorithms. A prominent example of such algorithms is the extreme learning machine (ELM), which assigns random values to the lower-layer weights. While ELM can be trained efficiently, it requires many more hidden units than conventional neural networks typically need to achieve matched classification accuracy. The use of a large number of hidden units translates to significantly increased test time, which matters more than training time in practice. In this paper, we propose a series of new efficient learning algorithms for SHLNNs. Our algorithms exploit both the structure of SHLNNs and the gradient information over all training epochs, and update the weights in the direction along which the overall square error is reduced the most. Experiments on the MNIST handwritten digit recognition task and the MAGIC gamma telescope dataset show that the algorithms proposed in this paper obtain significantly better classification accuracy than ELM when the same number of hidden units is used. To obtain the same classification accuracy, our best algorithm requires only 1/16 of the model size, and thus approximately 1/16 of the test time, compared with ELM. This huge advantage is gained at the expense of 5 times or less the training cost incurred by ELM training. (C) 2011 Elsevier B.V. All rights reserved.