Different approaches have been developed for evaluating Sobol' indices for global sensitivity analysis (GSA). Among them, sample-based approaches are extremely attractive because they can be purely data-driven and can estimate various Sobol' indices (e.g., first-order, higher-order, total-effects) for any individual random variable or group of random variables using only one set of samples. However, such approaches usually rely on accurate density estimation for the groups of random variables of interest, which can be challenging for high-dimensional groups. For example, the commonly used kernel density estimation (KDE) suffers from the curse of dimensionality. In this regard, this paper proposes a novel knowledge-enhanced machine learning approach for data-driven GSA for groups of random variables, combining the sample-based approach with an emerging generative machine learning model, normalizing flows (NFs), for high-dimensional density estimation. To facilitate reliable and robust NF training, a knowledge distillation-based two-stage training strategy is developed. Two customized loss functions are introduced, inspired by domain knowledge in the context of the sample-based approach for GSA. Two examples are considered to illustrate and verify the efficacy of the proposed approach. Results show that introducing NFs can significantly alleviate the curse of dimensionality in the traditional sample-based approach for GSA and improve the accuracy of both the density estimation and the estimated Sobol' indices.
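To make the "one set of samples" idea concrete, the sketch below estimates a first-order Sobol' index, S_i = Var(E[Y | X_i]) / Var(Y), from a single set of input/output samples. It is not the method of the abstract above: it swaps density estimation (KDE or normalizing flows) for a much simpler equal-probability binning of the conditional mean, and it handles only a single scalar variable. The function name `first_order_sobol_binned` and the parameter `n_bins` are illustrative assumptions.

```python
import numpy as np


def first_order_sobol_binned(x, y, n_bins=20):
    """Estimate S_i = Var(E[Y | X_i]) / Var(Y) from one set of samples.

    Assumption: E[Y | X_i] is approximated by the mean of y inside
    equal-probability bins of x. This is a simple stand-in for the
    density-based estimators discussed in the abstract.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)

    # Bin edges at empirical quantiles, so bins hold roughly equal counts.
    edges = np.quantile(x, np.linspace(0.0, 1.0, n_bins + 1))

    # Assign each sample to a bin using the interior edges only.
    idx = np.clip(np.searchsorted(edges[1:-1], x, side="right"), 0, n_bins - 1)

    counts = np.bincount(idx, minlength=n_bins)
    sums = np.bincount(idx, weights=y, minlength=n_bins)
    nonempty = counts > 0

    # Per-bin conditional means and the probability mass of each bin.
    cond_means = sums[nonempty] / counts[nonempty]
    weights = counts[nonempty] / counts.sum()

    # Variance of the conditional mean, divided by the total variance.
    var_cond_mean = np.sum(weights * (cond_means - y.mean()) ** 2)
    return var_cond_mean / y.var()
```

As a sanity check, for Y = X1 + 0.5 X2 with independent inputs uniform on [0, 1], the analytical first-order indices are 0.8 and 0.2, and the binned estimates should land close to those values for a few thousand samples. The binning approach degrades quickly for groups of several variables, which is the curse of dimensionality the abstract refers to.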
Extreme weather events exacerbated by global climate change have heightened urban waterlogging risks, particularly in rapidly urbanizing coastal areas such as Shenzhen, China. Traditional predictive models struggle to address these challenges effectively because of incomplete data and the complex, multi-scale spatio-temporal dynamics of urban waterlogging. This study proposes a knowledge-enhanced predictive framework that combines Informed Similarity Transfer (IST) with a Hybrid Spatio-Temporal Model (HSTM) to address these issues. The IST method constructs a similarity index by integrating spatial proximity, land cover characteristics, and altitude data, which enables precise data imputation across meteorological monitoring stations and overcomes limitations of conventional data completion techniques that often fail in diverse urban settings. HSTM is a dual-stage model that leverages multi-source data and combines multi-class classification with regression to provide fine-grained, high-precision predictions of waterlogging risk levels and water depths. By achieving reliable and scalable predictions, the framework not only enhances urban waterlogging risk management but also offers a transferable solution for other cities with similar waterlogging vulnerabilities. This study contributes a robust, regionally adaptive, large-scale approach to disaster risk reduction, advancing predictive urban water management amid growing climate-related uncertainties.
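The abstract does not give the formula for the IST similarity index, so the following is only a hypothetical sketch of the general idea: combine per-factor similarities (spatial proximity, land cover, altitude) into one station-to-station index, then fill a station's missing reading from its most similar stations. The function names, the equal default weights, the max-scaling of distances, and the top-`k` weighted average are all assumptions made for illustration.

```python
import numpy as np


def _scaled_similarity(dist):
    """Map a pairwise distance matrix to similarities in [0, 1]."""
    m = dist.max()
    return np.ones_like(dist) if m == 0 else 1.0 - dist / m


def similarity_index(coords, land_cover, altitude, weights=(1.0, 1.0, 1.0)):
    """Hypothetical station similarity: weighted mix of three factors.

    coords:     (n, 2) station coordinates
    land_cover: (n, k) land-cover fractions per station
    altitude:   (n,)   station altitudes
    """
    coords = np.asarray(coords, dtype=float)
    land_cover = np.asarray(land_cover, dtype=float)
    altitude = np.asarray(altitude, dtype=float)

    # Pairwise distances for each factor.
    d_space = np.linalg.norm(coords[:, None, :] - coords[None, :, :], axis=-1)
    d_cover = np.abs(land_cover[:, None, :] - land_cover[None, :, :]).sum(axis=-1)
    d_alt = np.abs(altitude[:, None] - altitude[None, :])

    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # normalise so the index stays in [0, 1]
    return (w[0] * _scaled_similarity(d_space)
            + w[1] * _scaled_similarity(d_cover)
            + w[2] * _scaled_similarity(d_alt))


def impute_from_similar(values, sim, k=3):
    """Fill NaN readings with a similarity-weighted mean of the k most
    similar stations that do have a reading (illustrative rule only)."""
    values = np.asarray(values, dtype=float)
    filled = values.copy()
    observed = np.flatnonzero(~np.isnan(values))
    if observed.size == 0:
        return filled
    for i in np.flatnonzero(np.isnan(values)):
        # Observed stations ordered from most to least similar to station i.
        order = observed[np.argsort(sim[i, observed])[::-1][:k]]
        w = sim[i, order]
        if w.sum() == 0:
            filled[i] = values[order].mean()
        else:
            filled[i] = np.average(values[order], weights=w)
    return filled
```

With `k=1` the imputation simply copies the reading of the single most similar station; larger `k` smooths over several stations. How the actual IST method weights the three factors and selects donor stations is not stated in the abstract.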
ISBN: 9781450392365 (print)
Despite the recent success of Graph Neural Networks (GNNs), their learning pipeline is guided only by the input graph and the desired output of certain tasks, and it fails to capture useful patterns when not enough data are available. Existing attempts incorporate auxiliary knowledge to mitigate this issue, but most of that knowledge either lacks a unified structure or is hard to obtain. Noticing that nodes in graphs usually form implicit hierarchical structures, we propose to integrate category taxonomies into the learning process of GNNs. A category taxonomy is a form of domain knowledge with a hierarchical tree structure, which is widely adopted in real-world scenarios. In this paper, we introduce Taxonomy-enhanced Graph Neural Networks (Taxo-GNN). Specifically, we jointly optimize the taxonomy representation and node representation tasks, where categories in the taxonomy are mapped to Gaussian distributions and nodes are embedded with the GNN framework. To characterize the bidirectional interaction between the taxonomy and the graph, the model comprises two modules: information distillation for the taxonomy and knowledge fusion to the graph. Information is first distilled from the graph and aligned with the hierarchical structure of the taxonomy in a bottom-to-top mechanism. The knowledge brought by the taxonomy is then fused into the graph convolution process in the form of taxonomy-aware aggregation weights and taxonomy-augmented contexts. Extensive experiments on real-world datasets in multiple downstream tasks verify the effectiveness of our model.
ISBN: 9781450349512 (print)
Machine learning has been a big success story during the AI resurgence. One particular standout success relates to learning from a massive amount of data. In spite of early assertions of the unreasonable effectiveness of data, there is increasing recognition of the value of utilizing knowledge whenever it is available or can be created purposefully. In this paper, we discuss the indispensable role of knowledge for deeper understanding of content where (i) large amounts of training data are unavailable, (ii) the objects to be recognized are complex (e.g., implicit entities and highly subjective content), and (iii) applications need to use complementary or related data in multiple modalities/media. What brings us to the cusp of rapid progress is our ability to (a) create relevant and reliable knowledge and (b) carefully exploit knowledge to enhance ML/NLP techniques. Using diverse examples, we seek to foretell unprecedented progress in our ability for deeper understanding and exploitation of multimodal data and continued incorporation of knowledge in learning techniques.