In this work, we propose AirNN, a novel framework that enables dynamic approximation of an already-trained convolutional neural network (CNN) in hardware during inference. AirNN enables input-dependent approximation of the CNN to save energy without much degradation in its classification accuracy at runtime. For each input, AirNN conducts the inference using only a fraction of the CNN's weights, chosen based on that input, with the rest remaining 0. Consequently, energy is saved through fewer fetches from off-chip memory as well as fewer multiplications for the majority of the inputs. To achieve per-input approximation, we propose a clustering algorithm that groups similar weights in the CNN based on their importance, and design an iterative framework that decides dynamically how many clusters of weights should be fetched from off-chip memory for each individual input. We also propose new hardware structures to implement our framework on top of a recently proposed FPGA-based CNN accelerator. In our experiments with popular CNNs, we show an average energy saving of 49% with less than 3% degradation in classification accuracy, obtained by doing inference with only a fraction of the weights for the majority of the inputs. We also propose a greedy interleaving scheme, implemented in hardware, to improve the performance of the iterative procedure and compensate for its latency overhead.
This article studies the constrained switching (linear) system, which is a discrete-time switched linear system whose switching sequences are constrained by a deterministic finite automaton. The stability of a constrained switching system is characterized by its constrained joint spectral radius, which is known to be difficult to compute or approximate. Using the semitensor product of matrices, the matrix-form expression of a constrained switching system is shown to be equivalent to that of a lifted arbitrary switching system. Then, the constrained joint/generalized spectral radius of a constrained switching system is proven to be equal to the joint/generalized spectral radius of its lifted arbitrary switching system, which can be approximated by off-the-shelf algorithms.
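To make the quantity in the abstract above concrete, the following is a minimal sketch of a brute-force upper bound on the constrained joint spectral radius. It is not the semitensor-product lifting described in the article: it simply enumerates every word of a given length that the automaton admits and takes the largest induced infinity norm of the corresponding matrix products, raised to the power 1/length. The function names, the dictionary encoding of the automaton, and the choice of norm are assumptions made for illustration.

```python
def matmul(a, b):
    """Plain nested-list matrix product a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def inf_norm(a):
    """Induced infinity norm (maximum absolute row sum); submultiplicative."""
    return max(sum(abs(x) for x in row) for row in a)


def cjsr_upper_bound(matrices, transitions, length):
    """Upper bound on the constrained joint spectral radius.

    matrices:    dict symbol -> square matrix (nested lists); hypothetical encoding
    transitions: dict (state, symbol) -> next state of the automaton
    length:      word length to enumerate (cost grows exponentially with it)
    """
    n = len(next(iter(matrices.values())))
    identity = [[float(i == j) for j in range(n)] for i in range(n)]
    states = {s for s, _ in transitions} | set(transitions.values())
    # Each frontier entry is (current automaton state, product of the word so far).
    frontier = [(s, identity) for s in states]
    for _ in range(length):
        frontier = [
            (transitions[(s, sym)], matmul(matrices[sym], prod))
            for s, prod in frontier
            for sym in matrices
            if (s, sym) in transitions
        ]
    # Every admissible long word splits into admissible words of this length,
    # so the largest norm to the power 1/length bounds the radius from above.
    return max(inf_norm(p) for _, p in frontier) ** (1.0 / length)
```

The bound tightens as `length` grows, but the enumeration is exponential, which is one way to see why approximating the lifted arbitrary switching system with existing algorithms is attractive.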
The nonlinear programming (NLP) problem to solve distribution-level optimal power flow (D-OPF) poses convergence issues and does not scale well for unbalanced distribution systems. The existing scalable D-OPF algorithms either use approximations that are not valid for an unbalanced power distribution system, or apply relaxation techniques to the nonlinear power flow equations that do not guarantee a feasible power flow solution. In this paper, we propose scalable D-OPF algorithms that simultaneously achieve optimal and feasible solutions by solving multiple iterations of approximate, or relaxed, D-OPF subproblems of low complexity. The first algorithm is based on a successive linear approximation of the nonlinear power flow equations around the current operating point, where the D-OPF solution is obtained by solving multiple iterations of a linear programming (LP) problem. The second algorithm is based on the relaxation of the nonlinear power flow equations as conic constraints together with directional constraints, which achieves optimal and feasible solutions over multiple iterations of a second-order cone programming (SOCP) problem. It is demonstrated that the proposed algorithms are able to reach an optimal and feasible solution while significantly reducing the computation time as compared to an equivalent NLP D-OPF model for the same distribution system.
Subset selection plays an important role in the field of evolutionary multiobjective optimization (EMO). In particular, in an EMO algorithm with an unbounded external archive (UEA), subset selection is an essential post-processing procedure to select a prespecified number of solutions as the final result. In this article, we discuss the efficiency of greedy subset selection for the hypervolume, inverted generational distance (IGD), and IGD plus (IGD+) indicators. Greedy algorithms usually handle subset selection efficiently. However, when a large number of solutions are given (e.g., subset selection from tens of thousands of solutions in a UEA), they often become time-consuming. Our idea is to use the submodular property, which is known for the hypervolume indicator, to improve their efficiency. First, we prove that the IGD and IGD+ indicators are also submodular. Next, based on the submodular property, we propose an efficient greedy inclusion algorithm for each indicator. We demonstrate through computational experiments that the proposed algorithms are much faster than the standard greedy subset selection algorithms. The proposed algorithms also help the research on performance indicators.
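The way submodularity can speed up greedy inclusion may be sketched as follows for the IGD indicator. This is a generic lazy-evaluation greedy, not necessarily the algorithm proposed in the article: because the IGD reduction obtained by adding a solution can only shrink as the subset grows, a stale gain is a valid upper bound, so only the candidate at the top of a priority queue needs to be re-evaluated. The function name, the Euclidean distance, and the finite placeholder distance `big` used for the empty subset are assumptions.

```python
import heapq
import math


def igd_greedy_lazy(candidates, reference, k):
    """Pick k candidate indices by greedy inclusion on IGD with lazy updates."""
    # dist[i][j]: distance from reference point j to candidate i.
    dist = [[math.dist(c, r) for r in reference] for c in candidates]
    # Assumed placeholder: a finite distance larger than any real one, so that
    # gains are well defined while the selected subset is still empty.
    big = 2.0 * max(max(row) for row in dist) + 1.0
    nearest = [big] * len(reference)  # distance from each reference point to the subset

    def gain(i):
        # Total reduction in nearest distances if candidate i were added
        # (proportional to the reduction in IGD).
        return sum(max(0.0, n - d) for n, d in zip(nearest, dist[i]))

    # Max-heap of (negated, possibly stale) gains.
    heap = [(-gain(i), i) for i in range(len(candidates))]
    heapq.heapify(heap)
    selected = []
    while heap and len(selected) < k:
        _, i = heapq.heappop(heap)
        fresh = gain(i)  # refresh the stale upper bound
        if not heap or fresh >= -heap[0][0]:
            # No other candidate can beat the refreshed gain: include it.
            selected.append(i)
            nearest = [min(n, d) for n, d in zip(nearest, dist[i])]
        else:
            heapq.heappush(heap, (-fresh, i))
    return selected
```

In the favorable case only a few gains are recomputed per inclusion step, instead of one per remaining candidate as in the standard greedy loop.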
Adversarial attacks have been extensively studied in recent years since they can identify the vulnerability of deep learning models before deployment. In this paper, we consider the black-box adversarial setting, where the adversary needs to craft adversarial examples without access to the gradients of a target model. Previous methods attempted to approximate the true gradient either by using the transfer gradient of a surrogate white-box model or based on the feedback of model queries. However, the existing methods inevitably suffer from low attack success rates or poor query efficiency since it is difficult to estimate the gradient in a high-dimensional input space with limited information. To address these problems and improve black-box attacks, we propose two prior-guided random gradient-free (PRGF) algorithms based on biased sampling and gradient averaging, respectively. Our methods can take advantage of a transfer-based prior given by the gradient of a surrogate model and the query information simultaneously. Through theoretical analyses, the transfer-based prior is appropriately integrated with model queries by an optimal coefficient in each method. Extensive experiments demonstrate that, in comparison with state-of-the-art alternatives, both of our methods require far fewer queries to attack black-box models with higher success rates.
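The biased-sampling idea above can be illustrated with a small random gradient-free estimator whose query directions are tilted toward a prior direction. This is only a sketch under stated assumptions: the mixing coefficient `lam` is fixed by hand rather than derived as the optimal coefficient mentioned in the abstract, and the function name, the defaults for `q` and `sigma`, and the exact form of the sampling distribution are illustrative choices, not the authors' implementation.

```python
import math
import random


def _normalize(v):
    norm = math.sqrt(sum(a * a for a in v))
    return [a / norm for a in v]


def prior_guided_rgf(f, x, prior, lam=0.5, q=20, sigma=1e-4, rng=None):
    """Estimate the gradient of a black-box f at x from q finite-difference queries.

    prior: surrogate (transfer) gradient; only its direction is used.
    lam:   assumed fixed weight in [0, 1] placed on the prior direction.
    Requires len(x) >= 2 so the component orthogonal to the prior is nonzero.
    """
    rng = rng or random.Random(0)
    v = _normalize(prior)
    fx = f(x)
    grad = [0.0] * len(x)
    for _ in range(q):
        xi = [rng.gauss(0.0, 1.0) for _ in x]
        # Remove the component along the prior, then renormalize.
        along = sum(a * b for a, b in zip(xi, v))
        orth = _normalize([a - along * b for a, b in zip(xi, v)])
        # Unit-norm query direction biased toward the prior.
        u = [math.sqrt(lam) * b + math.sqrt(1.0 - lam) * o for b, o in zip(v, orth)]
        # One model query: finite-difference slope along u.
        slope = (f([a + sigma * b for a, b in zip(x, u)]) - fx) / sigma
        grad = [g + slope * b / q for g, b in zip(grad, u)]
    return grad
```

Setting `lam=0` samples only directions orthogonal to the prior, while `lam=1` queries only along the prior itself. Choosing the weight from how well the prior matches the true gradient is the part the article addresses theoretically.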
We study approximation algorithms for the problem of minimizing the makespan on a set of machines with uncertainty on the processing times of jobs. In the model we consider, which goes back to Bertsimas et al. (Math. Program. 98(1-3), 49-71, 2003), once the schedule is defined, an adversary can pick a scenario where deviation is added to some of the jobs' processing times. Given only the maximal cardinality of these jobs, and the magnitude of potential deviation for each job, the goal is to optimize the worst-case scenario. We consider both the cases of identical and unrelated machines. Our main result is an EPTAS for the case of identical machines. We also provide a 3-approximation algorithm and an inapproximability ratio of 2 - epsilon for the case of unrelated machines.
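The objective in this model can be evaluated directly for a fixed schedule on identical machines. The sketch below assumes that at most `gamma` jobs may deviate and that a deviating job's processing time grows by its full deviation. Since the makespan is determined by a single machine, the adversary's best response is to spend its whole budget on the largest deviations of one machine. The function and parameter names are hypothetical.

```python
def worst_case_makespan(assignment, nominal, deviation, gamma):
    """Worst-case makespan of a fixed schedule under budgeted uncertainty.

    assignment[j]: machine of job j
    nominal[j]:    nominal processing time of job j
    deviation[j]:  extra time added if the adversary picks job j
    gamma:         maximum number of jobs the adversary may pick
    """
    jobs_on = {}
    for job, machine in enumerate(assignment):
        jobs_on.setdefault(machine, []).append(job)

    worst = 0.0
    for jobs in jobs_on.values():
        base = sum(nominal[j] for j in jobs)
        # The adversary adds the gamma largest deviations on this machine.
        extra = sum(sorted((deviation[j] for j in jobs), reverse=True)[:gamma])
        worst = max(worst, base + extra)
    return worst
```

An approximation algorithm for the robust problem searches over assignments to minimize this value. The function only scores one candidate schedule.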
We study a class of generic charging path optimization problems arising from emerging networking applications, where mobile chargers are dispatched to deliver energy to mobile agents (e.g., robots, drones, vehicles), which have specified tasks and mobility patterns. We instantiate our work by focusing on finding the charging path maximizing the number of nodes charged within a fixed time horizon. We show that this problem is APX-hard. By recursively decomposing the problem into sub-problems of searching sub-paths, we design quasi-polynomial-time algorithms achieving logarithmic approximation to the optimum charging path. Our approximation algorithms can be further adapted and extended to solve a variety of charging path optimization and scheduling problems with realistic constraints, such as limited time and energy budget.
This paper compares the vector fitting and rational Krylov fitting techniques for the determination of rational models in terms of fitting accuracy, computational performance, and model order. First, the mathematics behind the second technique are presented. It should be noted that rational Krylov fitting has never been used in transmission line modeling. A new procedure is proposed to use rational Krylov fitting instead of vector fitting in the universal line model (ULM). Furthermore, it is demonstrated that this procedure has several advantages over the traditional one. Two illustrative examples involving a transmission system are presented for validation of the new procedure.
We study the visual complexity of animated transitions between point sets. Although there exist many metrics for point set similarity, these metrics are not adequate for this purpose, as they typically treat each poin...
We study the maximum set coverage problem in the massively parallel model. In this setting, m sets that are subsets of a universe of n elements are distributed among m machines. In each round, these machines can commu...