In this work, we focus on studying the differentiable relaxations of several linear regression problems, whose original formulations are usually nonsmooth and contain a nonconvex term. Unfortunately, in most cases, the standard alternating direction method of multipliers (ADMM) cannot guarantee global convergence when addressing these kinds of problems. To address this issue, by smoothing the convex term and applying a linearization technique before designing the iteration procedures, we employ nonconvex ADMM to optimize these challenging nonconvex-convex composite problems. In our theoretical analysis, we prove the boundedness of the generated variable sequence and then guarantee that it converges to a stationary point. Meanwhile, a potential function is derived from the augmented Lagrangian function, and we further verify that the objective function is monotonically nonincreasing. Under the Kurdyka-Lojasiewicz (KL) property, the global convergence is analyzed step by step. Finally, experiments on face reconstruction, image classification, and subspace clustering tasks are conducted to show the superiority of our algorithms over several state-of-the-art ones.
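To make the splitting idea concrete, the sketch below applies ADMM with variable splitting to a toy nonconvex-convex composite: a smooth least-squares term plus a capped-L1 penalty handled through its proximal map. The choice of penalty, the fixed parameter rho, and the stopping test are illustrative assumptions, not the paper's exact formulation.

```python
# Sketch: ADMM with variable splitting for a nonconvex-convex composite problem
#   minimize_x  0.5*||A x - b||^2  +  lam * sum_i min(|x_i|, theta)
# The capped-L1 penalty, the parameter rho, and the stopping test are
# illustrative assumptions, not the exact formulation of the paper.
import numpy as np

def prox_capped_l1(v, t, theta):
    """Proximal map of t * min(|z|, theta), evaluated elementwise."""
    z1 = np.clip(np.sign(v) * np.maximum(np.abs(v) - t, 0.0), -theta, theta)
    obj = lambda z: t * np.minimum(np.abs(z), theta) + 0.5 * (z - v) ** 2
    return np.where(obj(z1) <= obj(v), z1, v)   # second candidate: flat region

def admm_capped_l1(A, b, lam=0.1, theta=1.0, rho=1.0, n_iter=200, tol=1e-6):
    m, n = A.shape
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    AtA, Atb = A.T @ A, A.T @ b
    L = np.linalg.cholesky(AtA + rho * np.eye(n))   # factor reused every iteration
    for _ in range(n_iter):
        # x-update: smooth least-squares block (linear system)
        rhs = Atb + rho * (z - u)
        x = np.linalg.solve(L.T, np.linalg.solve(L, rhs))
        # z-update: proximal step on the nonconvex penalty
        z_old = z
        z = prox_capped_l1(x + u, lam / rho, theta)
        # dual update
        u += x - z
        if np.linalg.norm(x - z) < tol and np.linalg.norm(z - z_old) < tol:
            break
    return z

rng = np.random.default_rng(0)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20); x_true[:3] = [2.0, -1.5, 1.0]
b = A @ x_true + 0.05 * rng.standard_normal(50)
print(np.round(admm_capped_l1(A, b), 2))
```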
A worst-case lower bound (WCLB) result obtained by Nemirovskii suggests that a potentially significant enhancement in estimation accuracy may be achieved provided the true parameter vector is known to belong to a ball. In this paper we discuss the many facets and implications of Nemirovskii's result, using linear regression as a vehicle for illustration. In particular, we briefly address issues such as biased versus unbiased estimation, minimax-optimal estimation, tightness of the WCLB, and comparison of the WCLB with the performance of the least-squares estimator constrained to the ball and that of the linear minimax estimator.
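As a concrete point of comparison, the sketch below computes the least-squares estimator constrained to a Euclidean ball of radius r (centered at the origin) alongside ordinary least squares, finding the active-constraint solution by bisection on the Lagrange multiplier; the ball center, the radius, and the toy data are illustrative assumptions.

```python
# Sketch: least squares constrained to the ball ||beta|| <= r, versus plain OLS.
# The ball center (origin), radius r, and toy data are illustrative assumptions.
import numpy as np

def ball_constrained_ls(X, y, r, mu_hi=1e6):
    """Least squares subject to ||beta||_2 <= r."""
    beta_ols, *_ = np.linalg.lstsq(X, y, rcond=None)
    if np.linalg.norm(beta_ols) <= r:
        return beta_ols                                  # constraint inactive
    XtX, Xty, I = X.T @ X, X.T @ y, np.eye(X.shape[1])
    lo, hi = 0.0, mu_hi                                  # bisect on the multiplier
    for _ in range(100):
        mu = 0.5 * (lo + hi)
        if np.linalg.norm(np.linalg.solve(XtX + mu * I, Xty)) > r:
            lo = mu                                      # need more shrinkage
        else:
            hi = mu
    return np.linalg.solve(XtX + hi * I, Xty)

rng = np.random.default_rng(1)
X = rng.standard_normal((30, 5))
beta_true = np.array([0.3, -0.2, 0.1, 0.0, 0.4])         # small true parameter
y = X @ beta_true + 0.5 * rng.standard_normal(30)
print("OLS norm:", np.linalg.norm(np.linalg.lstsq(X, y, rcond=None)[0]))
print("constrained:", np.round(ball_constrained_ls(X, y, r=0.4), 3))
```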
Linear regression is a fundamental modeling tool in statistics and related fields. In this paper, we study an important variant of linear regression in which the predictor-response pairs are partially mismatched. We use an optimization formulation to simultaneously learn the underlying regression coefficients and the permutation corresponding to the mismatches. The combinatorial structure of the problem leads to computational challenges. We propose and study a simple greedy local search algorithm for this optimization problem that enjoys strong theoretical guarantees and appealing computational performance. We prove that, under a suitable scaling of the number of mismatched pairs relative to the number of samples and features, and certain assumptions on the problem data, our local search algorithm converges to a nearly optimal solution at a linear rate. In particular, in the noiseless case, our algorithm converges to the globally optimal solution with a linear convergence rate. Based on this result, we prove an upper bound on the estimation error of the parameter. We also propose an approximate local search step that allows us to scale our approach to much larger instances. We conduct numerical experiments to gather further insights into our theoretical results and show promising performance gains compared to existing approaches.
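The sketch below illustrates one plausible form of such a greedy local search on a toy instance: alternately refit the coefficients by least squares and apply the single response swap that most reduces the residual sum of squares, stopping when no swap improves. The swap-based move set and the stopping rule are illustrative assumptions, not necessarily the paper's exact procedure.

```python
# Sketch: greedy local search for linear regression with partially mismatched
# responses. Moves are pairwise swaps of response entries; this move set and
# the stopping rule are illustrative assumptions.
import numpy as np
from itertools import combinations

def greedy_local_search(X, y, max_rounds=100):
    """Alternate least-squares refits with the best single swap of two responses."""
    y = y.copy()
    for _ in range(max_rounds):
        beta, *_ = np.linalg.lstsq(X, y, rcond=None)   # refit coefficients
        pred = X @ beta
        best_gain, best_pair = 1e-10, None
        for i, j in combinations(range(len(y)), 2):
            # decrease in RSS if y_i and y_j are exchanged (beta held fixed)
            gain = 2.0 * (y[i] - y[j]) * (pred[j] - pred[i])
            if gain > best_gain:
                best_gain, best_pair = gain, (i, j)
        if best_pair is None:                          # no improving swap left
            break
        i, j = best_pair
        y[i], y[j] = y[j], y[i]
    return beta, y

rng = np.random.default_rng(2)
X = rng.standard_normal((80, 4))
y = X @ np.array([1.0, -2.0, 0.5, 3.0])                # noiseless responses
idx = np.arange(8)                                     # mismatch the first 8 pairs
y[idx] = y[rng.permutation(idx)]
beta_hat, y_fixed = greedy_local_search(X, y)
print(np.round(beta_hat, 3))                           # estimated coefficients
```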
Linear regression serves information systems researchers and practitioners in different ways: experimentally verifying a priori models and theories, constructing data-based models and theories, and enabling data-based and model-based decision making. Drawing on these differences, regression applications from the last decade are examined. Evidence gathered by this study suggests that these differences are often not acknowledged, leading to results that are frequently suboptimal and at times erroneous.
Routing protocols for vehicular ad hoc networks (VANETs) have attracted a lot of attention recently. Most research emphasizes minimizing the end-to-end delay without paying attention to reducing radio usage. This paper focuses on delay-bounded routing, whose goal is to deliver messages to the destination within a user-defined delay while minimizing radio usage, because radio spectrum is a limited resource. Messages can be delivered to the destination by a hybrid of data muling (carried by the vehicle) and forwarding (transmitted over radio). In the existing protocol, a vehicle may only switch the delivery strategy (muling or forwarding) at an intersection, according to the available time of the next road segment, which lies between the current intersection and the next intersection. To improve on previous work, our protocol uses linear regression to predict the available time and the traveling distance, so that the vehicle can switch to the proper delivery strategy at the proper moment and reduce the number of relays by radio. Our protocol contains two schemes: a greedy scheme and a centralized scheme. The greedy scheme uses only the current sampling data to predict the available time and decide when to switch the delivery strategy, whereas the centralized scheme uses global statistical information to choose a minimum-cost path. Simulation results justify the efficiency of the proposed protocol.
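The sketch below illustrates the flavor of the greedy scheme: fit a least-squares line to recent samples of the time needed to traverse the next road segment by muling, extrapolate it, and switch to forwarding when the prediction no longer fits the remaining delay budget. The sample semantics, the threshold rule, and all names are illustrative assumptions rather than the protocol's exact quantities.

```python
# Sketch: predicting when to switch between muling and forwarding from recent
# samples via ordinary least squares. Sample semantics, threshold rule, and
# names are illustrative assumptions, not the protocol's exact quantities.
import numpy as np

def fit_line(t, v):
    """Least-squares slope and intercept of v against t."""
    slope, intercept = np.polyfit(t, v, deg=1)
    return slope, intercept

def choose_strategy(sample_times, sampled_travel_time, horizon, delay_budget):
    """Return 'mule' if the predicted muling time at `horizon` still fits
    within the remaining delay budget, otherwise 'forward' by radio."""
    slope, intercept = fit_line(np.asarray(sample_times),
                                np.asarray(sampled_travel_time))
    predicted = slope * horizon + intercept
    return "mule" if predicted <= delay_budget else "forward"

# toy samples of the time needed to traverse the next road segment by muling
times = [0, 10, 20, 30, 40]
needed = [55, 58, 63, 66, 70]          # seconds, trending upward with congestion
print(choose_strategy(times, needed, horizon=50, delay_budget=80))   # 'mule'
print(choose_strategy(times, needed, horizon=50, delay_budget=60))   # 'forward'
```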
In graph embedding based dimensionality reduction methods, the number of K-nearest neighbors usually has to be chosen manually in the high-dimensional space. Graphs constructed with different numbers of K-nearest neighbors change dramatically, which seriously affects the performance of graph embedding based dimensionality reduction, so automatic graph construction is very important. In this paper, first, a discriminative L2-graph is investigated. It computes the edge weights using class-specific samples and weighted ridge regression, avoiding the manual choice of K-nearest neighbors in traditional graph construction. Second, a discriminative L2-graph based dimensionality reduction method is proposed, named Linear Regression based Projections (LRP). LRP minimizes the ratio between the local compactness information and the total separability information to seek the optimal projection matrix. LRP is much faster than its counterparts, Sparsity Preserving Projections (SPP) and Collaborative Representation based Projections (CRP), since LRP is supervised and computes edge weights using class-specific samples, while SPP and CRP are unsupervised and compute edge weights using all samples. Experimental results on benchmark face image databases show that the proposed LRP outperforms many existing representative linear dimensionality reduction methods.
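The sketch below shows how class-specific edge weights of such an L2-graph could be computed, coding each sample over the remaining samples of its own class via ridge regression; the plain (unweighted) ridge variant and the regularization value are illustrative assumptions.

```python
# Sketch: class-specific L2-graph edge weights via ridge regression, coding each
# sample over the other samples of its own class. The plain ridge variant and
# the value of the regularizer are illustrative assumptions.
import numpy as np

def l2_graph_weights(X, labels, lam=0.1):
    """X: (d, n) with columns as samples. Returns an (n, n) matrix W where
    W[j, i] is the coefficient of sample j in the reconstruction of sample i."""
    d, n = X.shape
    W = np.zeros((n, n))
    for i in range(n):
        same = np.where((labels == labels[i]) & (np.arange(n) != i))[0]
        D = X[:, same]                                   # class-specific dictionary
        w = np.linalg.solve(D.T @ D + lam * np.eye(len(same)), D.T @ X[:, i])
        W[same, i] = w
    return W

rng = np.random.default_rng(3)
X = rng.standard_normal((20, 12))
labels = np.repeat([0, 1, 2], 4)
W = l2_graph_weights(X, labels)
print(np.round(W[:4, :4], 2))          # block structure: weights stay in-class
```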
Minimum-variance unbiased estimates for linear regression models can be obtained by solving least-squares problems. The conjugate gradient method can be successfully used to solve the symmetric positive definite normal equations arising from these least-squares problems. Building on the results of Golub and Meurant (1997, 2009) [10,11], Hestenes and Stiefel (1952) [17], and Strakos and Tichy (2002) [16], which make it possible to approximate the energy norm of the error during the conjugate gradient iteration, we adapt the stopping criterion introduced by Arioli (2005) [18] to the normal equations, taking into account the statistical properties of the underlying linear regression problem. Moreover, we show how the energy norm of the error is linked to the chi-squared distribution and to the Fisher-Snedecor distribution. Finally, we present the results of several numerical tests that experimentally validate the effectiveness of our stopping criteria.
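The sketch below runs conjugate gradient on the normal equations (in CGLS form, without forming A^T A explicitly) and stops using a windowed Hestenes-Stiefel estimate of the energy norm of the error; the window length and the plain tolerance stand in for the statistically calibrated criterion of the paper and are illustrative assumptions.

```python
# Sketch: CG on the normal equations (CGLS form) with a stopping test based on
# the Hestenes-Stiefel estimate of the energy norm of the error, summed over a
# window of d delayed steps. The window length d and the plain tolerance are
# illustrative assumptions, not the statistically calibrated criterion.
import numpy as np

def cgls_energy_stop(A, b, d=5, tol=1e-4, max_iter=500):
    m, n = A.shape
    x = np.zeros(n)
    r = b.copy()                 # residual of the least-squares system
    s = A.T @ r                  # residual of the normal equations
    p = s.copy()
    gamma = s @ s
    terms = []                   # alpha_j * ||s_j||^2, one per CG step
    for j in range(max_iter):
        q = A @ p
        alpha = gamma / (q @ q)
        terms.append(alpha * gamma)
        x += alpha * p
        r -= alpha * q
        s = A.T @ r
        gamma_new = s @ s
        p = s + (gamma_new / gamma) * p
        gamma = gamma_new
        # delayed lower bound on the energy norm of the error, d steps back
        if j + 1 >= d and sum(terms[-d:]) < tol ** 2:
            break
    return x

rng = np.random.default_rng(4)
A = rng.standard_normal((200, 30))
x_true = rng.standard_normal(30)
b = A @ x_true + 0.01 * rng.standard_normal(200)
x_hat = cgls_energy_stop(A, b)
print("relative error:", np.linalg.norm(x_hat - x_true) / np.linalg.norm(x_true))
```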
The purpose of statistical modeling is to select targets for some action, such as a medical treatment or a marketing campaign. Unfortunately, classical machine learning algorithms are not well suited to this task since they predict the results after the action, not its causal impact. The answer to this problem is uplift modeling, which, in addition to the usual training set containing objects on which the action was taken, uses an additional control group of objects not subjected to it. The predicted true effect of the action on a given individual is modeled as the difference between the responses in the two groups. This paper analyzes two uplift modeling approaches to linear regression, one based on the use of two separate models and the other based on target variable transformation. Adapting the second estimator to the problem of regression is one of the contributions of the paper. We identify the situations when each model performs best and, contrary to several claims in the literature, show that the double model approach has favorable theoretical properties and often performs well in practice. Finally, based on our analysis we propose a third model which combines the benefits of both approaches and seems to be the model of choice for uplift linear regression. Experimental analysis confirms our theoretical results on both simulated and real data, clearly demonstrating good performance of the double model and the advantages of the proposed approach.
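A minimal sketch of the double-model approach described above: fit one ordinary least-squares model on the treated group and one on the control group, and score the uplift of a new individual as the difference of the two predictions; the simulated data and variable names are illustrative.

```python
# Sketch: the double-model approach to uplift linear regression, fitting one
# OLS model on the treatment group and one on the control group, and scoring
# uplift as the difference of predictions. The simulated data are illustrative.
import numpy as np

def fit_ols(X, y):
    Xb = np.column_stack([np.ones(len(X)), X])           # add intercept
    coef, *_ = np.linalg.lstsq(Xb, y, rcond=None)
    return coef

def predict_uplift(X_new, coef_treat, coef_ctrl):
    Xb = np.column_stack([np.ones(len(X_new)), X_new])
    return Xb @ coef_treat - Xb @ coef_ctrl               # predicted causal effect

rng = np.random.default_rng(5)
n, p = 2000, 3
X = rng.standard_normal((n, p))
treated = rng.integers(0, 2, size=n).astype(bool)
base = X @ np.array([1.0, -0.5, 0.2])
uplift_true = 0.8 * X[:, 0]                               # effect depends on x0
y = base + treated * uplift_true + 0.3 * rng.standard_normal(n)

coef_t = fit_ols(X[treated], y[treated])
coef_c = fit_ols(X[~treated], y[~treated])
X_new = rng.standard_normal((5, p))
print(np.round(predict_uplift(X_new, coef_t, coef_c), 2))
print(np.round(0.8 * X_new[:, 0], 2))                     # ground-truth uplift
```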
In this paper, we present a novel approach to face identification by formulating the pattern recognition problem in terms of linear regression. Using the fundamental concept that patterns from a single object class lie on a linear subspace, we develop a linear model representing a probe image as a linear combination of class-specific galleries. The inverse problem is solved using the least-squares method, and the decision is ruled in favor of the class with the minimum reconstruction error. The proposed Linear Regression Classification (LRC) algorithm falls in the category of nearest-subspace classification. The algorithm is extensively evaluated on several standard databases under a number of exemplary evaluation protocols reported in the face recognition literature. A comparative study with state-of-the-art algorithms clearly reflects the efficacy of the proposed approach. For the problem of contiguous occlusion, we propose a Modular LRC approach, introducing a novel Distance-based Evidence Fusion (DEF) algorithm. The proposed methodology achieves the best results ever reported for the challenging problem of scarf occlusion.
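Because the decision rule is stated explicitly above, it maps directly to a few lines of code: regress the probe on each class-specific gallery by least squares and choose the class with the smallest reconstruction error. The toy data below are illustrative.

```python
# Sketch of the LRC decision rule: regress the probe vector on each
# class-specific gallery matrix and choose the class whose least-squares
# reconstruction error is smallest. The toy data are illustrative.
import numpy as np

def lrc_predict(probe, galleries):
    """galleries: dict class -> (d, n_c) matrix whose columns are vectorized
    training images of that class. Returns (best_class, errors)."""
    errors = {}
    for cls, Xc in galleries.items():
        beta, *_ = np.linalg.lstsq(Xc, probe, rcond=None)   # least-squares fit
        errors[cls] = np.linalg.norm(probe - Xc @ beta)     # reconstruction error
    return min(errors, key=errors.get), errors

rng = np.random.default_rng(6)
d, n_per_class = 100, 8
centers = {c: rng.standard_normal(d) for c in ("A", "B", "C")}
galleries = {c: centers[c][:, None] + 0.2 * rng.standard_normal((d, n_per_class))
             for c in centers}
probe = centers["B"] + 0.2 * rng.standard_normal(d)
label, errs = lrc_predict(probe, galleries)
print(label, {c: round(e, 2) for c, e in errs.items()})
```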
Studies of statistical data on wind velocities at various altitudes indicate that the orthogonal components of wind at the same altitude are practically independent of each other statistically, and that each component is approximately normally distributed. On the other hand, these data also indicate that there is a linear regression of each orthogonal component at one altitude on that at a different altitude. This property of linear regression can be used to predict wind velocities at all altitudes, given the velocity at one altitude, the regression coefficients, and the average wind profiles. The uncertainties and the confidence level of the prediction are discussed, and a possible application is indicated. Mathematical representations of the wind profiles and the regression coefficients are mentioned briefly.
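A minimal numerical sketch of the prediction scheme described above: given the mean profiles, standard deviations, and the correlation between a wind component at a reference altitude and the same component at the target altitude, the regression predictor and its error spread follow directly; all numbers here are illustrative, not measured wind statistics.

```python
# Sketch: predicting a wind component at a target altitude from the component
# measured at a reference altitude, using the linear-regression relation
#   u2_hat = mean_u2 + b21 * (u1 - mean_u1),  with b21 = rho * sigma2 / sigma1.
# All numbers below are illustrative, not measured wind statistics.

def predict_component(u1, mean_u1, mean_u2, rho, sigma1, sigma2):
    b21 = rho * sigma2 / sigma1              # regression coefficient
    return mean_u2 + b21 * (u1 - mean_u1)

def prediction_std(rho, sigma2):
    """Standard deviation of the prediction error, sigma2 * sqrt(1 - rho^2)."""
    return sigma2 * (1.0 - rho ** 2) ** 0.5

# eastward component: reference altitude 3 km, target altitude 10 km (illustrative)
u1 = 18.0                                    # measured at 3 km, m/s
u2_hat = predict_component(u1, mean_u1=12.0, mean_u2=25.0,
                           rho=0.7, sigma1=6.0, sigma2=14.0)
print(round(u2_hat, 1), "+/-", round(prediction_std(0.7, 14.0), 1), "m/s")
```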