Author affiliations: Google, Mountain View, CA 94043, USA; University of Southern California, Department of Computer Science, Los Angeles, CA 90089, USA; University of Southern California, Los Angeles, CA 90089, USA
Publication: JOURNAL OF MACHINE LEARNING RESEARCH
Year/Volume/Issue: 2018, Vol. 19, No. 1
Pages: 74-107
Subject classification: 08 [Engineering]; 0811 [Engineering - Control Science and Engineering]; 0812 [Engineering - Computer Science and Technology (degrees awarded in engineering or science)]
Funding: NSF Directorate for Computer & Information Science & Engineering, Division of Computer and Network Systems. Funding source: National Science Foundation
Subjects: approximation theory; greedy algorithms; submodular functions; problem solving; regression analysis
Abstract: We introduce the submodularity ratio as a measure of how close to submodular a set function f is. We show that when f has submodularity ratio γ, the greedy algorithm for maximizing f provides a (1 - e^{-γ}) approximation. Furthermore, when γ is bounded away from 0, the greedy algorithm for minimum submodular cover also provides essentially an O(log n) approximation for a universe of n elements. As a main application of this framework, we study the problem of selecting a subset of k random variables from a large set in order to obtain the best linear prediction of another variable of interest. We analyze the performance of widely used greedy heuristics; in particular, by showing that the submodularity ratio is lower-bounded by the smallest 2k-sparse eigenvalue of the covariance matrix, we obtain the strongest known approximation guarantees for the Forward Regression and Orthogonal Matching Pursuit algorithms. As a second application, we analyze greedy algorithms for the dictionary selection problem and significantly improve the previously known guarantees. Our theoretical analysis is complemented by experiments on real-world and synthetic data sets; in particular, we focus on an analysis of how tight various spectral parameters and the submodularity ratio are in terms of predicting the performance of the greedy algorithms.
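The following is a minimal sketch, not taken from the paper, of the greedy Forward Regression heuristic the abstract refers to: at each step it adds the variable whose inclusion most increases the squared multiple correlation R² with the target. All function names and the synthetic data are illustrative assumptions; the paper's guarantee (a (1 - e^{-γ}) fraction of the optimal R², with γ the submodularity ratio lower-bounded by the smallest 2k-sparse eigenvalue of the covariance matrix) is noted only in a comment.

```python
# Illustrative sketch of greedy Forward Regression for subset selection
# (not the authors' code). Under Das & Kempe's analysis, this greedy rule
# attains at least a (1 - e^{-gamma}) fraction of the optimal R^2, where
# gamma is the submodularity ratio of the R^2 objective.
import numpy as np

def r_squared(X_sub: np.ndarray, z: np.ndarray) -> float:
    """R^2 of the least-squares fit of z on the columns of X_sub."""
    if X_sub.shape[1] == 0:
        return 0.0
    coef, *_ = np.linalg.lstsq(X_sub, z, rcond=None)
    resid = z - X_sub @ coef
    return 1.0 - (resid @ resid) / (z @ z)

def forward_regression(X: np.ndarray, z: np.ndarray, k: int) -> list:
    """Greedily pick k column indices of X that best linearly predict z."""
    selected = []
    for _ in range(k):
        current = r_squared(X[:, selected], z)
        best_gain, best_j = -np.inf, None
        for j in range(X.shape[1]):
            if j in selected:
                continue
            gain = r_squared(X[:, selected + [j]], z) - current
            if gain > best_gain:
                best_gain, best_j = gain, j
        selected.append(best_j)
    return selected

# Hypothetical usage on synthetic data: z depends on 3 of 20 variables.
rng = np.random.default_rng(0)
X = rng.standard_normal((200, 20))
z = X[:, [2, 7, 11]] @ np.array([1.5, -2.0, 0.8]) + 0.1 * rng.standard_normal(200)
print(forward_regression(X, z, k=3))  # typically recovers columns 2, 7, 11
```

Orthogonal Matching Pursuit, also analyzed in the paper, differs only in its selection rule (it picks the variable most correlated with the current residual rather than re-fitting every candidate), which makes each step cheaper at some cost in greedily measured gain.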