版权所有:内蒙古大学图书馆 技术提供:维普资讯• 智图
内蒙古自治区呼和浩特市赛罕区大学西街235号 邮编: 010021
作者单位:PennState University Libraries
学位级别:Doctor of Philosophy
授予年度:2013年
主 题:algorithms big data sublinear algorithms graph theory sparsification summarization
摘 要:Increasingly large amounts of structured data are being collected by personal computers,mobile devices, personal gadgets, sensors, etc., and stored in data centers operated by thegovernment and private companies. Processing of such data to extract key information isone of the main challenges faced by computer scientists. Developing methods for constructingcompact representations of large data is a natural way to approach this *** thesis is focused on rigorous mathematical and algorithmic solutions for sparsi-cation and summarization of large amounts of information using discrete combinatorialmethods. Areas of mathematics most closely related to it are graph theory, informationtheory and analysis of real-valued functions over discrete domains. These areas, somewhatsurprisingly, turn out to be related when viewed through the computational lens. In thisthesis we illustrate the power and limitations of methods for constructing small representationsof large data sets, such as graphs and databases, using a variety methods drawnfrom these *** primary goal of sparsication, summarization and sketching methods discussedhere is to remove redundancy in distributed systems, reduce storage space and save otherresources by compressing the representation, while preserving the key structural propertiesof the original data or system. For example, the size of the network can be reduced byremoving redundant nodes and links, while (approximately) preserving the distances orconnectivities between the *** thesis serves as an overview of results in [31, 30, 152, 38, 171, 33]. It also containsan extension of results in [152] obtained jointly with David Woodruff.