A deep neural network with invertible hidden layers has the nice property of preserving all the information in the feature learning stage. In this paper, we analyse the hidden layers of residual rectifier neural networks and investigate the conditions under which these layers are invertible. A new fixed-point algorithm is developed to invert the hidden layers of residual networks. The proposed inversion algorithms can invert some residual networks that existing inversion algorithms cannot. Furthermore, a special residual rectifier network is designed and trained on MNIST so that it achieves performance comparable to the state of the art while its hidden layers remain invertible.
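The abstract does not spell out the iteration, but a standard fixed-point scheme for inverting a residual block y = x + f(x) iterates x <- y - f(x), which converges whenever the residual branch f is contractive. A minimal NumPy sketch of this idea; the ReLU branch and the Lipschitz scaling below are illustrative assumptions, not the paper's construction:

import numpy as np

rng = np.random.default_rng(0)
W = rng.standard_normal((8, 8))
W *= 0.5 / np.linalg.norm(W, 2)   # scale the branch to be a contraction (Lipschitz < 1)

def f(x):
    # residual branch of a rectifier block: f(x) = W @ relu(x)
    return W @ np.maximum(x, 0.0)

def invert(y, n_iter=50):
    # recover x from y = x + f(x) via the fixed-point iteration x <- y - f(x)
    x = y.copy()
    for _ in range(n_iter):
        x = y - f(x)
    return x

x = rng.standard_normal(8)
y = x + f(x)                           # forward pass of the residual layer
print(np.max(np.abs(invert(y) - x)))   # ~1e-15: the hidden layer is inverted

Since ReLU is 1-Lipschitz, rescaling W to spectral norm 0.5 makes f a 0.5-contraction, so the inversion error shrinks by at least half per step.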
We develop fixed-point algorithms for the approximation of structured matrices with rank penalties. In particular, we use these fixed-point algorithms for making approximations by sums of exponentials, i.e., frequency estimation. For the basic formulation of the fixed-point algorithm we show that it converges to the solution of a related minimization problem, namely the one obtained by replacing the original objective function with its convex envelope and keeping the structured matrix constraint unchanged. It often happens that this solution agrees with the solution to the original minimization problem, and we provide a simple criterion for when this is true. We also provide more general fixed-point algorithms that can be used to treat the problems of making weighted approximations by sums of exponentials given equally or unequally spaced sampling. We apply the method to the case of missing data, although the above-mentioned convergence results do not hold in this case. However, it turns out that the method often gives perfect reconstruction (up to machine precision) in such cases. We also discuss multidimensional extensions, and illustrate how the proposed algorithms can be used to recover sums of exponentials in several variables when samples are available only along a curve. (C) 2017 Elsevier Inc. All rights reserved.
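The paper's fixed-point map involves the convex envelope of the rank penalty; a closely related and widely used iteration (Cadzow-style, with hard rank truncation standing in for the proximal step) alternates a low-rank projection with projection onto the Hankel structure by anti-diagonal averaging. A sketch under those substitutions, with an illustrative two-exponential signal:

import numpy as np
from scipy.linalg import hankel, svd

def hankel_project(X):
    # project onto Hankel matrices: average each anti-diagonal
    m, n = X.shape
    s = np.zeros(m + n - 1)
    c = np.zeros(m + n - 1)
    for i in range(m):
        for j in range(n):
            s[i + j] += X[i, j]
            c[i + j] += 1
    s /= c
    return hankel(s[:m], s[m - 1:])

def rank_project(X, r):
    # best rank-r approximation (stand-in for the rank-penalty proximal step)
    U, sv, Vt = svd(X, full_matrices=False)
    return (U[:, :r] * sv[:r]) @ Vt[:r]

t = np.arange(41)
signal = np.exp(-0.05 * t) + 0.7 * np.exp(-0.2 * t)   # sum of two exponentials
noisy = signal + 0.01 * np.random.default_rng(1).standard_normal(t.size)

X = hankel(noisy[:21], noisy[20:])
for _ in range(200):                    # alternate projections until (near) fixed point
    X = hankel_project(rank_project(X, 2))
print(svd(X, compute_uv=False)[:4])     # two dominant singular values survive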
We analyze a fixed-point algorithm for reinforcement learning (RL) of optimal portfolio mean-variance preferences in the setting of multivariate generalized autoregressive conditional heteroskedasticity (MGARCH) with a small penalty on trading. A numerical solution is obtained using a neural network (NN) architecture within a recursive RL loop. A fixed-point theorem proves that the NN approximation error has a big-O bound that we can reduce by increasing the number of NN parameters. The functional form of the trading penalty has a parameter epsilon > 0 that controls the magnitude of transaction costs. When epsilon is small, we can implement an NN algorithm based on the expansion of the solution in powers of epsilon. This expansion has a base term equal to a myopic solution with an explicit form, and a first-order correction term that we compute in the RL loop. Our expansion-based algorithm is stable, allows for fast computation, and outputs a solution that shows positive testing performance.
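The MGARCH operator and the NN architecture are specific to the paper; the following is only a schematic of the generic recursive loop such schemes share, i.e., repeatedly refitting a function approximator to a fixed-point operator applied to its own previous fit, here with a toy contraction T and a polynomial least-squares fit standing in for the NN:

import numpy as np

x = np.linspace(-1, 1, 200)
features = np.vander(x, 6)     # toy approximator: degree-5 polynomial (stand-in for the NN)

def T(v):
    # toy contraction standing in for the paper's fixed-point operator
    return 0.5 * np.cos(np.pi * x) + 0.5 * v

v = np.zeros_like(x)
for _ in range(60):
    target = T(v)                                      # apply the operator to the current fit
    coef, *_ = np.linalg.lstsq(features, target, rcond=None)
    v_new = features @ coef                            # refit the approximator to the target
    if np.max(np.abs(v_new - v)) < 1e-10:
        break                                          # parameters have reached a fixed point
    v = v_new

print(np.max(np.abs(v - T(v))))    # residual limited only by the approximator's expressiveness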
In this paper, we propose an approach for computing invariant sets of discrete-time nonlinear systems by lifting the nonlinear dynamics into a higher-dimensional linear model. In particular, we focus on the maximal admissible invariant set contained in some given constraint set. For special types of nonlinear systems, which can be exactly immersed into higher-dimensional linear systems with state transformations, invariant sets of the original nonlinear system can be characterized using the higher-dimensional linear representation. For general nonlinear systems without the immersibility property, approximate immersions are defined in a local region within some tolerance and linear approximations are computed by leveraging the fixed-point iteration technique for invariant sets. Given the bound on the mismatch between the linear approximation and the original system, we provide an invariant inner approximation of the maximal admissible invariant set by a tightening procedure. (c) 2022 Elsevier Ltd. All rights reserved.
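Once the dynamics are (approximately) linear, say x+ = A x with constraints H x <= h, the maximal admissible invariant set can be computed by the classical fixed-point iteration that keeps adding the constraints H A^k x <= h until no new row cuts the current set, checking termination with one linear program per candidate row. A minimal sketch for an illustrative stable 2-D system:

import numpy as np
from scipy.optimize import linprog

A = np.array([[0.9, 0.2], [-0.1, 0.8]])   # stable (lifted) linear dynamics, illustrative
H = np.vstack([np.eye(2), -np.eye(2)])    # constraint set: |x_i| <= 1
h = np.ones(4)

G, g = H.copy(), h.copy()                 # current polyhedral iterate O_k
Ak = A.copy()
for k in range(100):
    done = True
    for j in range(H.shape[0]):           # would any new row H A^{k+1} x <= h cut O_k?
        res = linprog(-(H[j] @ Ak), A_ub=G, b_ub=g, bounds=[(None, None)] * 2)
        if -res.fun > h[j] + 1e-9:
            done = False
            break
    if done:
        break                             # fixed point reached: O_{k+1} = O_k
    G = np.vstack([G, H @ Ak])            # otherwise add the constraints for step k+1
    g = np.concatenate([g, h])
    Ak = Ak @ A

print(f"converged after {k} iterations with {G.shape[0]} constraint rows")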
It is well known that the Newton method may not converge when the initial guess does not belong to a specific quadratic convergence region. We propose a family of new variants of the Newton method with the potential advantage of having a larger convergence region as well as more desirable properties near a solution. We prove quadratic convergence of the new family, and provide specific bounds for the asymptotic error constant. We illustrate the advantages of the new methods by means of test problems, including two and six variable polynomial systems, as well as a challenging signal processing example. We present a numerical experimental methodology that uses a large number of randomized initial guesses for several methods from the new family, providing guidance as to which method is preferable in a particular search domain.
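The new family itself is defined in the paper; as the baseline it improves on, the classical damped (backtracking) Newton method already illustrates how a variant can enlarge the convergence region while keeping quadratic convergence near a root, since full steps resume once they decrease the residual. A sketch on an illustrative two-variable polynomial system:

import numpy as np

def F(x):
    # illustrative polynomial system with a root at (1, 1)
    return np.array([x[0]**2 + x[1]**2 - 2.0, x[0] * x[1] - 1.0])

def J(x):
    return np.array([[2 * x[0], 2 * x[1]], [x[1], x[0]]])

def damped_newton(x, tol=1e-12, max_iter=100):
    for _ in range(max_iter):
        r = F(x)
        if np.linalg.norm(r) < tol:
            break
        step = np.linalg.solve(J(x), -r)
        t = 1.0
        # backtrack until the residual decreases; the Newton direction is always
        # a descent direction for ||F|| when the Jacobian is nonsingular
        while np.linalg.norm(F(x + t * step)) >= np.linalg.norm(r) and t > 1e-8:
            t /= 2
        x = x + t * step
    return x

print(damped_newton(np.array([4.0, -0.5])))   # converges to a root of F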
Imposing equilibrium restrictions provides substantial gains in the estimation of dynamic discrete games. Estimation algorithms imposing these restrictions have different merits and limitations. Algorithms that guarantee local convergence typically require the approximation of high-dimensional Jacobians. Alternatively, the Nested Pseudo-Likelihood (NPL) algorithm is a fixed-point iterative procedure which avoids the computation of these matrices but, in games, may fail to converge to the consistent NPL estimator. In order to better capture the effect of iterating the NPL algorithm in finite samples, we study the asymptotic properties of this algorithm for data generating processes that are in a neighborhood of the NPL fixed-point stability threshold. We find that there are always samples for which the algorithm fails to converge, and this introduces a selection bias. We also propose a spectral algorithm to compute the NPL estimator. This algorithm satisfies local convergence and avoids the approximation of Jacobian matrices. We present simulation evidence and an empirical application illustrating our theoretical results and the good properties of the spectral algorithm.
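The paper's spectral algorithm is tailored to the NPL map; a generic member of the same family is the spectral residual (Barzilai-Borwein) iteration for a fixed-point equation P = T(P), which is Jacobian-free and recovers a steplength from successive residuals. A sketch with an illustrative smooth contraction standing in for the NPL map P <- Psi(theta_hat(P), P):

import numpy as np

def spectral_fixed_point(T, x0, tol=1e-10, max_iter=500):
    # spectral residual iteration for x = T(x): no Jacobians are formed;
    # the steplength is estimated from successive residual differences
    x = x0
    g = x - T(x)                          # fixed-point residual
    alpha = 1.0
    for _ in range(max_iter):
        if np.linalg.norm(g) < tol:
            break
        x_new = x - alpha * g
        g_new = x_new - T(x_new)
        s, y = x_new - x, g_new - g
        alpha = (s @ s) / (s @ y) if abs(s @ y) > 1e-16 else 1.0
        x, g = x_new, g_new
    return x

T = lambda p: 0.3 + 0.2 * np.tanh(p[::-1])     # illustrative contraction, not the NPL map
p_star = spectral_fixed_point(T, np.full(3, 0.5))
print(np.max(np.abs(p_star - T(p_star))))      # ~0: fixed point found without Jacobians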
The adjoint mode of Algorithmic Differentiation (AD) is particularly attractive for computing gradients. However, this mode needs to use the intermediate values of the original simulation in reverse order, at a cost that increases with the length of the simulation. AD research looks for strategies to reduce this cost, for instance by taking advantage of the structure of the given program. In this work, we consider on the one hand the frequent case of fixed-point loops, for which several authors have proposed adapted adjoint strategies. Among these strategies, we select the one introduced by B. Christianson. We specify the selected method further and describe the way we implemented it inside the AD tool Tapenade. Experiments on a medium-size application show a major reduction of the memory needed to store trajectories. On the other hand, we study checkpointing in the case of MPI parallel programs with point-to-point communications. We propose techniques to apply checkpointing to these programs, provide proofs of correctness of our techniques, and experiment with them on representative CFD codes. This work was sponsored by the European project "AboutFlow".
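In Christianson's strategy, only the converged state is stored; the adjoint is then obtained by a second fixed-point loop on the transposed Jacobian, instead of reversing every forward iteration. A NumPy sketch for a linear fixed point x = Phi(x, p) = M x + p with objective J(x) = c^T x, where M, p, and c are illustrative:

import numpy as np

rng = np.random.default_rng(0)
M = rng.standard_normal((5, 5))
M *= 0.6 / np.linalg.norm(M, 2)       # contraction, so both loops converge
p = rng.standard_normal(5)
c = rng.standard_normal(5)            # objective J(x) = c @ x

# phase 1: forward fixed-point loop for x = M x + p (only the converged x is kept)
x = np.zeros(5)
for _ in range(200):
    x = M @ x + p

# phase 2: adjoint fixed-point loop lam = M^T lam + dJ/dx, run at the converged
# state only -- no trajectory storage is needed
lam = np.zeros(5)
for _ in range(200):
    lam = M.T @ lam + c

grad = lam                            # dJ/dp, since dPhi/dp = I here
print(np.max(np.abs(grad - np.linalg.solve(np.eye(5) - M.T, c))))   # ~1e-16

The reference value is the implicit-function-theorem gradient (I - M)^{-T} c, which the adjoint loop reproduces without storing any intermediate iterate.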
ISBN (print): 9780992862657
We study the problem of decomposing a measured signal as a sum of decaying exponentials. There is a direct connection between sums of this type and positive semi-definite (PSD) Hankel matrices, where the rank of these matrices equals the number of exponentials. We propose to solve the identification problem by forming an optimization problem with a misfit function combined with a rank penalty function that also ensures the PSD constraint. This problem is non-convex, but we show that it is possible to compute the minimum of an explicit, closely related convexified problem. Moreover, this minimum can be shown to often coincide with the minimum of the original non-convex problem, and we provide a simple criterion that makes it possible to verify if this is the case.
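The stated connection is easy to check numerically: the Hankel matrix built from uniform samples of a sum of r decaying exponentials with positive weights is PSD with rank r, since its (i, j) entry sum_k c_k mu_k^(i+j) factors as V diag(c) V^T for the Vandermonde matrix V. A small check with illustrative parameters:

import numpy as np
from scipy.linalg import hankel, eigvalsh

mu = np.array([0.95, 0.8, 0.6])       # decay factors, 0 < mu_k < 1 (illustrative)
c = np.array([1.0, 0.5, 2.0])         # positive weights
t = np.arange(41)
f = (c * mu ** t[:, None]).sum(axis=1)    # f(t) = sum_k c_k mu_k^t

Hm = hankel(f[:21], f[20:])           # 21 x 21 Hankel matrix of the samples
ev = eigvalsh(Hm)
print(np.sum(ev > 1e-10 * ev.max()))  # 3: rank equals the number of exponentials
print(bool(ev.min() > -1e-10 * ev.max()))   # True: the matrix is PSD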
ISBN (print): 9788494284472
Adjoint algorithms, and in particular those obtained through the adjoint mode of Automatic Differentiation (AD), are probably the most efficient way to obtain the gradient of a numerical simulation. This however needs to use the flow of data of the original simulation in reverse order, at a cost that increases with the length of the simulation. AD research looks for strategies to reduce this cost, taking advantage of the structure of the given program. One such frequent structure is fixed-point iteration, which occurs, for example, in steady-state simulations, but not only there. It is common wisdom that the first iterations of a fixed-point search operate on a meaningless state vector, and that reversing the corresponding data flow may be suboptimal. An adapted adjoint strategy for this iterative process should consider only the last or the few last iterations. At least two authors, B. Christianson and A. Griewank, have studied fixed-point iterations mathematically with the goal of defining an efficient adjoint. In this paper, we describe and contrast these two strategies with the objective of implementing the best-suited one in the AD tool that we are developing. We select a representative application to test the chosen strategy, propose a set of user directives to trigger it, and discuss the implementation implications in our tool.
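The two strategies differ mainly in when the adjoint recurrence runs: Christianson's variant runs a separate adjoint loop after the state has converged, whereas Griewank's "piggyback" approach advances state and adjoint together in a single loop. A sketch of the piggyback pattern for a linear fixed point x = M x + p with J(x) = c^T x (illustrative data; for a linear map the Jacobian is constant, which keeps the sketch short):

import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((4, 4))
M *= 0.5 / np.linalg.norm(M, 2)    # contraction: both recurrences converge
p = rng.standard_normal(4)
c = rng.standard_normal(4)         # objective J(x) = c @ x

# piggyback: state and adjoint advance in the same loop, so no separate
# adjoint phase (and no stored trajectory) is needed
x, lam = np.zeros(4), np.zeros(4)
for _ in range(200):
    x, lam = M @ x + p, M.T @ lam + c

print(np.max(np.abs(lam - np.linalg.solve(np.eye(4) - M.T, c))))   # ~0: lam = dJ/dp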
In this paper, we analyze the convergence of two general classes of optimization algorithms for regularized kernel methods with a convex loss function and quadratic norm regularization. The first methodology is a new class of algorithms based on fixed-point iterations that are well-suited for a parallel implementation and can be used with any convex loss function. The second methodology is based on coordinate descent, and generalizes some techniques previously proposed for linear support vector machines. It exploits the structure of additively separable loss functions to compute solutions of line searches in closed form. The two methodologies are both very easy to implement. In this paper, we also show how to remove non-differentiability of the objective functional by exactly reformulating a convex regularization problem as an unconstrained differentiable stabilization problem.
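For the squared loss, the fixed-point class has a particularly transparent instance: the optimality condition of kernel ridge regression can be written as c = (y - K c) / (lambda n), and iterating this map with a relaxation factor converges to the regularized solution without any matrix factorization. A sketch with illustrative data, kernel width, and relaxation parameter:

import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(-3, 3, size=(50, 1))             # toy 1-D regression data
y = np.sin(X[:, 0]) + 0.1 * rng.standard_normal(50)

K = np.exp(-0.5 * (X - X.T) ** 2)                # Gaussian kernel matrix
lam, n, eta = 0.1, 50, 0.05                      # regularization and relaxation, illustrative

c = np.zeros(n)
for _ in range(2000):                            # fixed-point iteration, no factorization
    c = (1 - eta) * c + eta * (y - K @ c) / (lam * n)

c_direct = np.linalg.solve(K + lam * n * np.eye(n), y)   # closed-form reference
print(np.max(np.abs(c - c_direct)))              # ~0: same expansion coefficients

Each pass costs one kernel matrix-vector product, which is what makes this class of iterations easy to parallelize.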