Bayesian modelling helps applied researchers to articulate assumptions about their data and develop models tailored for specific applications. Thanks to good methods for approximate posterior inference, researchers can now easily build, use, and revise complicated Bayesian models for large and rich data. These capabilities, however, bring into focus the problem of model criticism. Researchers need tools to diagnose the fitness of their models, to understand where they fall short, and to guide their revision. In this paper, we develop a new method for Bayesian model criticism, the holdout predictive check (HPC). HPCs build on posterior predictive checks (PPCs), a seminal method that checks a model by assessing the posterior predictive distribution on the observed data. However, PPCs use the data twice, both to calculate the posterior predictive and to evaluate it, which can lead to uncalibrated p-values. HPCs, in contrast, compare the posterior predictive distribution to a draw from the population distribution, a heldout dataset. This method blends Bayesian modelling with frequentist assessment. We prove that the HPC, unlike the PPC, is properly calibrated. Empirically, we study HPCs on classical regression, a hierarchical model of text data, and factor analysis.
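The contrast between a PPC and an HPC described in this abstract can be illustrated with a small sketch. It is not the authors' procedure: the conjugate normal model with known noise variance, the 50/50 split, the function name `holdout_predictive_pvalue`, and all parameter defaults are assumptions made for illustration. The posterior is fitted on the training half only, and the discrepancy of the heldout half is compared with discrepancies of datasets replicated from the posterior predictive.

```python
import random
import statistics


def holdout_predictive_pvalue(data, discrepancy, n_rep=2000,
                              prior_var=100.0, noise_var=1.0, seed=0):
    """Sketch of a holdout predictive check for a normal-mean model.

    Assumed model (hypothetical, for illustration only):
    y ~ N(mu, noise_var) with prior mu ~ N(0, prior_var).
    """
    rng = random.Random(seed)
    shuffled = list(data)
    rng.shuffle(shuffled)
    half = len(shuffled) // 2
    train, holdout = shuffled[:half], shuffled[half:]

    # Conjugate posterior for mu, computed from the training half only,
    # so the heldout half is never used twice.
    post_var = 1.0 / (1.0 / prior_var + len(train) / noise_var)
    post_mean = post_var * sum(train) / noise_var

    observed = discrepancy(holdout)
    exceed = 0
    for _ in range(n_rep):
        # Draw a parameter from the posterior, then a replicated dataset
        # of the same size as the heldout set.
        mu = rng.gauss(post_mean, post_var ** 0.5)
        replicated = [rng.gauss(mu, noise_var ** 0.5) for _ in holdout]
        if discrepancy(replicated) >= observed:
            exceed += 1
    return exceed / n_rep


# Usage: data with far more spread than the assumed noise variance of 1.
gen = random.Random(1)
overdispersed = [gen.gauss(0.0, 5.0) for _ in range(200)]
p_value = holdout_predictive_pvalue(overdispersed, statistics.variance)
```

With the sample variance as the discrepancy, a model that fixes the noise variance at 1 should receive a very small p-value on data generated with standard deviation 5, which flags the misfit.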
The Berry-Esseen bound provides an upper bound on the Kolmogorov distance between a random variable and the normal [...] this paper, we establish Berry-Esseen bounds with optimal rates for self-normalized sums of locally dependent random variables, assuming only a second-moment [...] proof leverages Stein's method and introduces a novel randomized concentration inequality, which may also be of independent interest for other [...] main results are applied to self-normalized sums of m-dependent random variables and graph dependency models.
The detection of community structures in complex networks has garnered significant attention in recent years. Given its NP-hardness, numerous evolutionary optimization-based approaches have been proposed. However, the...
In recent decades, credit card use has become a pivotal part of the financial lifeline. Credit cards and online payment gateways are vital elements of the World Wide Web. Given the fact that credit car...
Accurate monitoring of urban waterlogging contributes to the city’s normal operation and the safety of residents’ daily [...], due to feedback delays or high costs, existing methods cannot support large-scale, fine-grained waterlogging monitoring. A common method is to forecast the city’s global waterlogging status using its partial waterlogging [...] method has two challenges: first, existing predictive algorithms are driven by either knowledge or data alone; and second, the partial waterlogging data is not collected selectively, resulting in poor [...] overcome the aforementioned challenges, this paper proposes a framework for large-scale and fine-grained spatiotemporal waterlogging monitoring based on the opportunistic sensing of limited bus [...] framework follows the Sparse Crowdsensing paradigm and mainly comprises a pair of iterative predictor and [...] predictor uses the collected waterlogging status and the predicted status of the uncollected area to train the graph convolutional neural [...] combines both knowledge-driven and data-driven approaches and can be used to forecast waterlogging status in all regions for the upcoming [...] selector consists of a two-stage selection procedure that can select valuable bus routes while satisfying budget [...] experimental results on real waterlogging and bus routes in Shenzhen show that the proposed framework can perform urban waterlogging monitoring with low cost, high accuracy, wide coverage, and fine granularity.
Materials datasets usually contain many redundant (highly similar) materials due to the tinkering approach historically used in material [...] redundancy skews the performance evaluation of machine learning (ML) models when using random splitting, leading to overestimated predictive performance and poor performance on out-of-distribution [...] issue is well-known in bioinformatics for protein function prediction, where tools like CD-HIT are used to reduce redundancy by ensuring that no pair of samples has sequence similarity greater than a given [...] this paper, we survey the overestimated ML performance in materials science for material property prediction and propose MD-HIT, a redundancy reduction algorithm for material [...] MD-HIT to composition- and structure-based formation energy and band gap prediction problems, we demonstrate that with redundancy control, ML models tend to show relatively lower performance on test sets than models evaluated on highly redundant data, but that this performance better reflects the models’ true prediction capability.
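The redundancy-reduction idea in this abstract can be sketched as a greedy filter in the style of CD-HIT: a sample is kept only if it is sufficiently far from every sample already kept. This is a minimal illustration and not MD-HIT itself; the Euclidean distance on plain feature vectors, the function name `greedy_redundancy_filter`, and the `min_distance` parameter are assumptions, since the abstract does not state which descriptors or distances the algorithm uses.

```python
import math


def greedy_redundancy_filter(features, min_distance):
    """Return indices of a non-redundant subset of `features`.

    A sample is kept only if its Euclidean distance to every
    previously kept sample is at least `min_distance`
    (hypothetical criterion chosen for illustration).
    """
    kept = []
    for idx, vec in enumerate(features):
        if all(math.dist(vec, features[k]) >= min_distance for k in kept):
            kept.append(idx)
    return kept


# Usage: two tight clusters of near-duplicates plus one distinct sample.
samples = [(0.0, 0.0), (0.1, 0.0), (1.0, 0.0), (1.05, 0.0), (3.0, 3.0)]
representatives = greedy_redundancy_filter(samples, min_distance=0.5)
```

Because the filter is greedy, the result depends on the order of the samples; the first member of each cluster becomes its representative.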
Positive data are very common in many scientific fields and applications; for these data, it is known that estimation and inference based on the relative error criterion are superior to those based on the absolute error [...] prediction problems, conformal prediction provides a useful framework to construct flexible prediction intervals based on hypothesis testing, which has been actively studied in the past [...] view of the advantages of the relative error criterion for regression problems with positive responses, in this paper, we combine the relative error criterion (REC) with conformal prediction to develop a novel REC-based predictive inference method to construct prediction intervals for the positive [...] proposed method satisfies the finite sample global coverage guarantee and to some extent achieves the local [...] conduct extensive simulation studies and two real data analyses to demonstrate the competitiveness of the newly proposed method.
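One way to combine a relative error score with conformal prediction is the standard split conformal recipe, sketched below. This is an assumption-laden illustration rather than the method of the paper: the score |y - ŷ| / ŷ, the function names, and the clipping of the lower bound at zero are all choices made here, and the abstract does not specify the score the authors use. Predictions are assumed to be strictly positive.

```python
import math


def relative_conformal_quantile(cal_preds, cal_targets, alpha):
    """Split conformal quantile of relative-error scores.

    Score (assumed for illustration): |y - pred| / pred, pred > 0.
    Returns the ceil((n + 1) * (1 - alpha))-th smallest score,
    or infinity when the calibration set is too small.
    """
    scores = sorted(abs(y - p) / p for p, y in zip(cal_preds, cal_targets))
    n = len(scores)
    k = math.ceil((n + 1) * (1 - alpha))
    return math.inf if k > n else scores[k - 1]


def relative_interval(pred, q):
    """Interval {y : |y - pred| / pred <= q}, clipped to be nonnegative."""
    lower = max(pred * (1 - q), 0.0)
    upper = pred * (1 + q)
    return lower, upper


# Usage with a toy calibration set in which every prediction equals 10.
cal_preds = [10.0] * 9
cal_targets = [10.0, 11.0, 9.0, 12.0, 8.0, 10.5, 9.5, 13.0, 7.0]
q = relative_conformal_quantile(cal_preds, cal_targets, alpha=0.2)
low, high = relative_interval(20.0, q)
```

Because the score is relative, the interval width scales with the prediction, which suits positive responses whose variability grows with their magnitude.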
Conformal prediction is a powerful tool for uncertainty quantification, but its application to time-series data is constrained by the violation of the exchangeability assumption. Current solutions for time-series pred...
Purpose: This study focuses on understanding the collaboration relationships among mathematicians, particularly those esteemed as elites, to reveal the structures of their communities and evaluate their impact on the field of [...] Design/methodology/approach: Two community detection algorithms, namely Greedy Modularity Maximization and Infomap, are utilized to examine collaboration patterns among [...] conduct a comparative analysis of mathematicians’ centrality, emphasizing the influence of award-winning individuals in connecting network roles such as Betweenness, Closeness, and Harmonic [...], we investigate the distribution of elite mathematicians across communities and their relationships within different mathematical [...] Findings: The study identifies the substantial influence exerted by award-winning mathematicians in connecting network [...] elite distribution across the network is uneven, with a concentration within specific communities rather than being evenly [...], the research identifies a positive correlation between distinct mathematical sub-fields and the communities, indicating collaborative tendencies among scientists engaged in related [...], the study suggests that reduced research diversity within a community might lead to a higher concentration of elite scientists within that specific [...] Research limitations: The study’s limitations include its narrow focus on mathematicians, which may limit the applicability of the findings to broader scientific [...] with manually collected data affect the reliability of conclusions about collaborative [...] Practical implications: This study offers valuable insights into how elite mathematicians collaborate and how knowledge is disseminated within mathematical [...] these collaborative behaviors could aid in fostering better collaboration strategies among mathematicians and institutions, potentially enhancing scientific progre...
This paper studies the joint tail behavior of two randomly weighted sums $\sum_{i=1}^{m}\Theta_i X_i$ and $\sum_{j=1}^{n}\theta_j Y_j$ for some $m,n\in\mathbb{N}\cup\{\infty\}$, in which the primary random variables $\{X_i; i\in\mathbb{N}\}$ and $\{Y_i; i\in\mathbb{N}\}$, respectively, are real-valued, dependent and heavy-tailed, while the random weights $\{\Theta_i,\theta_i; i\in\mathbb{N}\}$ are nonnegative and arbitrarily dependent, but the three sequences $\{X_i; i\in\mathbb{N}\}$, $\{Y_i; i\in\mathbb{N}\}$ and $\{\Theta_i,\theta_i; i\in\mathbb{N}\}$ are mutually [...] two types of weak dependence assumptions on the heavy-tailed primary random variables and some mild moment conditions on the random weights, we establish some (uniformly) asymptotic formulas for the joint tail probability of the two randomly weighted sums, expressing the insensitivity with respect to the underlying weak dependence [...] applications, we consider both discrete-time and continuous-time insurance risk models, and obtain some asymptotic results for ruin probabilities.