ISBN: (digital) 9781728183169
ISBN: (print) 9781728183176
Predictive modeling of networked data has many real-world applications, such as fraud detection in social networks, drug discovery in biomedical networks, and paper topic classification in citation networks. Although advanced machine learning approaches can help build reasonably accurate predictive models, their applicability is severely hindered by data labeling, which is onerous, time-consuming, and error-prone. In this paper, we propose a novel active learning paradigm for networked data, named topology-and-content-aware (TACA) active learning, which aims to minimize the number of labels needed to reach a desired level of model accuracy. TACA advances existing work in two respects: (1) TACA makes no assumption about network properties, whereas most existing methods perform effectively only on locally consistent networks, in which linked nodes are expected to share the same labels; and (2) TACA generates queries without relying on model performance, and thus yields robust predictive results even when the queried labels are noisy. Both theoretical and empirical evidence is presented, substantiating the effectiveness of, and optimism about, our approach.
Based on structural hole theory, this study explores the differences in the ability of Chinese social media users with different characteristics to provide accurate information. Using over 20 million interactive data ...
ISBN: (digital) 9781728181561
ISBN: (print) 9781728181578
As a widely used data structure, graphs are well suited to characterizing data with internal associations, such as social and biological data. Tree-structured data are a special case and appear in many real-world applications, such as organizational structure analysis and genealogical knowledge graph reasoning. For example, in kinship knowledge graph analysis, when a genealogical tree is particularly large (more than 25 levels and 45,000 nodes), it is a great challenge to partition the tree into a specified number of subtrees with succinct logic and a balanced number of nodes. In this paper, we therefore propose the tree partitioning algorithm (TPA) to achieve a balanced and logically succinct partition of large-scale tree-structured data. TPA first extracts all related nodes from a massive graph database and then constructs the convergent subgraph into a complete tree with a specified root node. Specifically, several virtual nodes are added for nodes connected across skipped generations, so that node numbering and partitioning remain correct. Finally, a graph partitioning algorithm is executed on the complete tree to obtain a specified number of subtrees with succinct logic and balanced node counts. Experiments conducted on four real-world datasets verify the effectiveness of TPA.
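The final step described above, splitting a rooted tree into a specified number of connected subtrees of similar size, can be illustrated with a minimal sketch. This is not the authors' TPA: it is a simple greedy heuristic that repeatedly bisects the largest part, and it omits the extraction and virtual-node steps. The names `partition_tree`, `subtree_sizes` and `collect` are hypothetical.

```python
from collections import defaultdict


def subtree_sizes(children, root, members):
    """Size of each node's subtree, counting only nodes inside `members`."""
    order, stack = [], [root]
    while stack:
        node = stack.pop()
        order.append(node)  # a parent is always recorded before its children
        stack.extend(c for c in children[node] if c in members)
    sizes = {}
    for node in reversed(order):  # children are therefore sized first
        sizes[node] = 1 + sum(sizes[c] for c in children[node] if c in members)
    return sizes


def collect(children, root, members):
    """All nodes of `members` reachable downward from `root`."""
    found, stack = set(), [root]
    while stack:
        node = stack.pop()
        found.add(node)
        stack.extend(c for c in children[node] if c in members)
    return found


def partition_tree(edges, root, k):
    """Greedy heuristic (an assumption, not TPA): bisect the largest part until k parts exist."""
    children = defaultdict(list)
    nodes = {root}
    for parent, child in edges:
        children[parent].append(child)
        nodes.update((parent, child))
    parts = [(root, nodes)]  # each part is (root of the part, set of its nodes)
    while len(parts) < k:
        idx = max(range(len(parts)), key=lambda i: len(parts[i][1]))
        part_root, members = parts[idx]
        if len(members) < 2:
            break  # nothing left to split
        sizes = subtree_sizes(children, part_root, members)
        half = len(members) / 2
        # Cut above the node whose subtree is closest to half of the part.
        cut = min((n for n in members if n != part_root),
                  key=lambda n: (abs(sizes[n] - half), str(n)))
        detached = collect(children, cut, members)
        parts[idx] = (part_root, members - detached)
        parts.append((cut, detached))
    return parts


edges = [(1, 2), (1, 3), (2, 4), (2, 5), (3, 6), (3, 7)]
for part_root, members in partition_tree(edges, root=1, k=3):
    print(part_root, sorted(members))
```

Each part stays connected because a cut always detaches a whole subtree of the part it is taken from. Balance is only approximate; a production method would need a stronger balancing criterion.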
We know that compressive sensing can establish stable sparse recovery results from highly undersampled data under a restricted isometry property condition. In reality, however, numerous problems are coherent, and vast...
Click-Through Rate (CTR) prediction is a core task in today's commercial recommender systems. Feature crossing, as the mainline of research on CTR prediction, has shown a promising way to enhance predictive performan...
ISBN: (print) 9781665423991
Deep neural network (DNN) compression can effectively reduce the memory footprint of deep networks, so that deep models can be deployed on portable devices. However, most existing model compression methods, e.g., vector quantization or pruning, are time-consuming, which makes them ill-suited to real-world applications that need fast online computation. In this paper, we therefore explore how to accelerate the model compression process by reducing its computation cost. We propose a new deep model compression method, termed Dictionary Pair-based Data-Free Fast DNN Compression, which aims to reduce the memory consumption of DNNs without extra training and can greatly improve compression efficiency. Specifically, our method performs tensor decomposition on the DNN model with a fast dictionary pair learning-based reconstruction approach, which can be applied to different layers (e.g., convolutional and fully connected layers). Given a pre-trained DNN model, we first divide the parameters (i.e., weights) of each layer into a series of partitions for dictionary pair-based fast reconstruction, which can potentially discover more fine-grained information and makes parallel model compression possible. Then, dictionaries that occupy less memory are learned to reconstruct the weights. Extensive experiments on popular DNNs (i.e., VGG-16, ResNet-18 and ResNet-50) show that our weight compression method can significantly reduce the memory footprint and speed up the compression process, with less performance loss.
Linearly constrained convex composite programming problems whose objective function contains two blocks, each of the form nonsmooth+smooth, arise frequently in multiple fields of application. If b...
ISBN: (digital) 9781728145334
ISBN: (print) 9781728145341
In this paper, we study transformers for text-based games. As a promising replacement for recurrent modules in Natural Language Processing (NLP) tasks, the transformer architecture can be treated as a powerful state representation generator for reinforcement learning. However, the vanilla transformer is neither effective nor efficient to train because of its huge number of weight parameters. Unlike existing research that encodes states using LSTMs or GRUs, we develop a novel lightweight transformer-based representation generator featuring reordered layer normalization, weight sharing and block-wise aggregation. The experimental results show that our proposed model not only solves single games with far fewer interactions, but also generalizes better to a set of unseen games. Furthermore, our model outperforms state-of-the-art agents in a variety of man-made games.
The cloud cover ratio is an important factor affecting the quality of a satellite image, so cloud detection is a necessary step in assessing image quality. This study develops a method for cloud detection from the visible band of a satellite image. First, we consider the differences between cloud and ground, including high grey level, good continuity of grey level, the area of the cloud region, and the variance of local fractal dimension (VLFD) of the cloud region, and propose a method for detecting a single cloud region. Second, by introducing a reference satellite image and comparing the variance in the dimensions of the reference and tested images, we describe a method that detects multiple cloud regions and determines whether cloud exists in an image. The performance of the proposed method is demonstrated on several IKONOS images.
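Three of the cues listed above (high grey level, continuity, and region area) can be illustrated with a minimal sketch that keeps bright, 4-connected regions above a minimum area. It is a simplification under stated assumptions, not the paper's method: the VLFD criterion and the reference-image comparison are omitted, and the names `bright_regions`, `grey_threshold` and `min_area` are hypothetical.

```python
from collections import deque


def bright_regions(image, grey_threshold, min_area):
    """Return 4-connected regions of pixels >= grey_threshold with at least min_area pixels."""
    rows, cols = len(image), len(image[0])
    seen = [[False] * cols for _ in range(rows)]
    regions = []
    for r in range(rows):
        for c in range(cols):
            if seen[r][c] or image[r][c] < grey_threshold:
                continue
            # Breadth-first flood fill from an unvisited bright pixel.
            seen[r][c] = True
            queue, region = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                region.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and not seen[ny][nx]
                            and image[ny][nx] >= grey_threshold):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            if len(region) >= min_area:  # small bright specks are discarded
                regions.append(region)
    return regions


# Toy grey-level image: one bright 2x2 block and one isolated bright pixel.
image = [
    [10, 10, 200, 210],
    [10, 10, 220, 205],
    [250, 10, 10, 10],
]
print(bright_regions(image, grey_threshold=180, min_area=3))
```

The threshold and minimum area are illustrative values; in practice they would need to be tuned per sensor and scene.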