Chromy (1979) proposed an unequal probability sampling algorithm, which is the default sequential method used in the SURVEYSELECT procedure of the SAS software. In this article, we demonstrate that Chromy sampling is equivalent to pivotal sampling. This makes it possible to estimate the variance unbiasedly for the randomized version of the method programmed in the SURVEYSELECT procedure.
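To make the pivotal side of this equivalence concrete, here is a minimal Python sketch of ordered (sequential) pivotal sampling. It is an illustration only, not the SURVEYSELECT implementation and not code from the article. The function name `ordered_pivotal_sample`, the tolerance `eps`, and the assumption that the inclusion probabilities sum to an integer are all choices made for this sketch.

```python
import random


def ordered_pivotal_sample(pi, rng=None, eps=1e-12):
    """Ordered (sequential) pivotal sampling, as an illustrative sketch.

    pi  : inclusion probabilities in [0, 1], ASSUMED to sum to an integer.
    rng : optional random.Random instance (hypothetical parameter).
    Returns the indices of the selected units in increasing order.
    """
    rng = rng or random.Random()
    p = list(pi)
    i = 0  # index of the unit currently "in play"
    for j in range(1, len(p)):
        a, b = p[i], p[j]
        if a <= eps or a >= 1 - eps:
            # Unit i is already decided (0 or 1); unit j comes into play.
            i = j
            continue
        if b <= eps or b >= 1 - eps:
            # Unit j is already decided; unit i stays in play.
            continue
        s = a + b
        if s < 1:
            # One unit takes the combined probability, the other is rejected.
            if rng.random() < a / s:
                p[i], p[j] = s, 0.0
            else:
                p[i], p[j] = 0.0, s
                i = j
        else:
            # One unit is selected, the other keeps the remainder s - 1.
            if rng.random() < (1 - b) / (2 - s):
                p[i], p[j] = 1.0, s - 1
                i = j
            else:
                p[i], p[j] = s - 1, 1.0
    return [k for k, v in enumerate(p) if v >= 1 - eps]
```

Each duel preserves the expected value of both probabilities, so every unit keeps its prescribed inclusion probability, and the sample size is fixed whenever the probabilities sum to an integer.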
Among the factors affecting the physical and mental health of vocational college students, the sense of inferiority plays a very important role in cultivating physically and mentally healthy students. The inverse random under-sampling algorithm is improved on the basis of ensemble learning, which can improve the performance of the classifier, and SIRUS, an algorithm combining stacking ensemble learning with inverse random under-sampling, is proposed. Among the individual subjective factors studied in this paper, self-attribution is important; among the social objective factors, social support is important; and only the demographic variables show a significant difference.
Before computer scientists became interested in unequal probability sampling methods, they were widely studied by survey statisticians. We show that sometimes the same sampling methods have been proposed again without reference to existing methods. We also show that methods that are not correct and that were widely discussed in the 1950s are being proposed again. We review the most common errors and misunderstandings about these methods. © 2022 The Author(s). Published by Elsevier Inc. This is an open access article under the CC BY license (http://***/licenses/by/4.0/).
In this paper we address the problem of calculating indices of fuzzy measures. On the basis of fuzzy sets, Murofushi and Soneda proposed an interaction index to deal with the relations between two individuals. This index was later extended in a common framework by Grabisch. Both indices are fundamental in the literature of fuzzy measures. Nevertheless, the corresponding calculation remains a highly complex problem for which no approximate solution has been proposed yet. Using a representation of the Shapley value based on orders, we suggest an alternative calculation of the interaction index, both for the simple case of pairs of individuals and for the more complex situation in which any set may be considered. This alternative representation facilitates the handling of these indices. Moreover, we draw on this representation to define two polynomial methods based on sampling to estimate the interaction index, as well as a method to approximate its generalized version. We provide some computational results to test the goodness of the proposed algorithms. © 2022 The Author(s). Published by Elsevier B.V. This is an open access article under the CC BY license (http://***/licenses/by/4.0/).
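As a rough illustration of order-based sampling for the pairwise interaction index, the sketch below draws random orders in which the pair {i, j} acts as a single merged player and averages the second-order difference of the measure over the predecessors of that player. This is a generic Monte Carlo estimator written for this listing, not necessarily either of the methods defined in the paper. The names `interaction_index_mc`, `mu`, and `samples` are hypothetical.

```python
import random


def interaction_index_mc(mu, n, i, j, samples=1000, rng=None):
    """Monte Carlo sketch of the pairwise (Murofushi-Soneda) interaction index.

    mu      : set function taking a frozenset of players from range(n).
    i, j    : the two distinct players whose interaction is estimated.
    samples : number of random orders to draw (hypothetical parameter).
    """
    rng = rng or random.Random()
    others = [k for k in range(n) if k != i and k != j]
    total = 0.0
    for _ in range(samples):
        # Random order of n - 1 players; None stands for the merged pair {i, j}.
        order = others + [None]
        rng.shuffle(order)
        # S = players that precede the merged pair in this order.
        S = frozenset(order[:order.index(None)])
        # Second-order difference of mu at S with respect to i and j.
        total += mu(S | {i, j}) - mu(S | {i}) - mu(S | {j}) + mu(S)
    return total / samples
```

Under a uniformly random order, a given set S precedes the merged player with probability |S|!(n-2-|S|)!/(n-1)!, which matches the weights in the usual definition of the index, so the average is an unbiased estimate. Each sample costs four evaluations of `mu`.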
Over the last decades, various "non-linear" MCMC methods have arisen. While appealing for their convergence speed and efficiency, their practical implementation and theoretical study remain challenging. In this paper, we introduce a non-linear generalization of the Metropolis-Hastings algorithm to a proposal that depends not only on the current state, but also on its law. We propose to simulate this dynamics as the mean field limit of a system of interacting particles, that can in turn itself be understood as a generalization of the Metropolis-Hastings algorithm to a population of particles. Under the double limit in number of iterations and number of particles we prove that this algorithm converges. Then, we propose an efficient GPU implementation and illustrate its performance on various examples. The method is particularly stable on multimodal examples and converges faster than the classical methods.
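For reference, the classical algorithm that the paper generalizes can be sketched as follows. This is the standard single-chain random-walk Metropolis-Hastings sampler, in which the proposal depends only on the current state. It is not the non-linear, particle-based method of the paper. The function name, the Gaussian proposal, and the parameter `step` are assumptions made for this illustration.

```python
import math
import random


def metropolis_hastings(log_target, x0, n_steps, step=1.0, rng=None):
    """Classical random-walk Metropolis-Hastings on the real line (sketch).

    log_target : log of the (unnormalized) target density.
    step       : standard deviation of the Gaussian proposal (assumed).
    Returns the list of visited states, one per iteration.
    """
    rng = rng or random.Random()
    x, lp = x0, log_target(x0)
    chain = []
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)  # proposal depends only on the current state
        lq = log_target(y)
        # Symmetric proposal: accept with probability min(1, target(y) / target(x)).
        # 1 - random() lies in (0, 1], so the logarithm is always defined.
        if math.log(1.0 - rng.random()) < lq - lp:
            x, lp = y, lq
        chain.append(x)
    return chain
```

For example, `metropolis_hastings(lambda x: -0.5 * x * x, 0.0, 20000)` produces a chain whose empirical distribution approximates a standard normal.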
ISBN: 9781450390965 (print)
In network analysis, the betweenness centrality of a node informally captures the fraction of shortest paths visiting that node. The computation of the betweenness centrality measure is a fundamental task in the analysis of modern networks, enabling the identification of the most central nodes in such networks. In addition to being massive, modern networks also contain information about the time at which their events occur. Such networks are often called temporal networks. The temporal information makes the study of the betweenness centrality in temporal networks (i.e., temporal betweenness centrality) much more challenging than in static networks (i.e., networks without temporal information). Moreover, the exact computation of the temporal betweenness centrality is often impractical even on moderately sized networks, given its extremely high computational cost. A natural approach to reducing this cost is to obtain high-quality estimates of the exact values of the temporal betweenness centrality. In this work we present ONBRA, the first sampling-based approximation algorithm for estimating the temporal betweenness centrality values of the nodes in a temporal network, providing rigorous probabilistic guarantees on the quality of its output. ONBRA is able to compute the estimates of the temporal betweenness centrality values under two different optimality criteria for the shortest paths of the temporal network. In addition, ONBRA outputs high-quality estimates with sharp theoretical guarantees by leveraging the empirical Bernstein bound, an advanced concentration inequality. Finally, our experimental evaluation shows that ONBRA significantly reduces the computational resources required by the exact computation of the temporal betweenness centrality on several real-world networks, while reporting high-quality estimates with rigorous guarantees.
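To give a sense of how an empirical Bernstein bound turns a sample into a confidence radius, the sketch below computes the one-sided bound in the form given by Maurer and Pontil for i.i.d. values in [0, 1]. It is an illustration of the inequality only. The exact variant and constants used by ONBRA are not stated in this abstract and may differ, and the function name `empirical_bernstein_radius` is hypothetical.

```python
import math


def empirical_bernstein_radius(samples, delta):
    """One-sided empirical Bernstein radius (Maurer-Pontil form, sketch).

    samples : i.i.d. observations ASSUMED to lie in [0, 1].
    delta   : allowed failure probability, 0 < delta < 1.
    With probability at least 1 - delta, the true mean exceeds the sample
    mean by at most the returned value.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(samples) / n
    # Unbiased sample variance.
    var = sum((x - mean) ** 2 for x in samples) / (n - 1)
    log_term = math.log(2.0 / delta)
    return math.sqrt(2.0 * var * log_term / n) + 7.0 * log_term / (3.0 * (n - 1))
```

Because the first term scales with the sample variance rather than with the range, the radius shrinks at a rate close to 1/n when the sampled values have low variance, which is what makes this kind of bound attractive for sampling-based estimators.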
Energy conservation techniques are crucial to achieving high reliability in Internet of Things (IoT) services, especially in the Massive IoT (MIoT), which stringently requires cost-effective, low-energy operation for battery-powered devices. Most of the proposed techniques assume that data acquisition and processing consume significantly less energy than communication. Unfortunately, this assumption is incorrect in the MIoT scenario, which mostly involves low-power wide-area networks (LPWAN) and complex data sensing operations (e.g., biological and seismic sensing) using "power-hungry" sensors (e.g., gas sensors, seismometers). Thus, sensing actions may consume even more energy than transmission. In addition, none of these techniques lets end-users control the trade-off between energy conservation and data precision. To deal with these issues, we propose an adaptive sampling algorithm that estimates the optimal sampling frequencies in real time for IoT devices based on the changes in the collected data. Given a user's desired level of savings, our algorithm can minimize the device's energy consumption while ensuring the precision of the collected information. Practical experiments over IoT datasets have shown that our algorithm can reduce the number of acquired samples by up to a factor of 20 compared with a traditional fixed-rate approach, at an extremely low Normal Mean Error of around 3.45%.
ISBN: 9781450384469 (print)
Counting the number of occurrences of small connected subgraphs, called temporal motifs, has become a fundamental primitive for the analysis of temporal networks, whose edges are annotated with the time of the event they represent. One of the main complications in studying temporal motifs is the large number of motifs that can be built even with a limited number of vertices or edges. As a consequence, since in many applications motifs are employed for exploratory analyses, the user needs to iteratively select and analyze several motifs that represent different aspects of the network, resulting in an inefficient, time-consuming process. This problem is exacerbated in large networks, where the analysis of even a single motif is computationally demanding. As a solution, in this work we propose and study the problem of simultaneously counting the number of occurrences of multiple temporal motifs, all corresponding to the same (static) topology (e.g., a triangle). Given that computing the exact counts is infeasible for large temporal networks, we propose odeN, a sampling-based algorithm that provides an accurate approximation of all the counts of the motifs. We provide analytical bounds on the number of samples required by odeN to compute rigorous, probabilistic, relative approximations. Our extensive experimental evaluation shows that odeN enables the approximation of the counts of motifs in temporal networks in a fraction of the time needed by state-of-the-art methods, and that it also reports more accurate approximations than such methods.
The use of mathematical models for design space characterization has become commonplace in pharmaceutical quality-by-design, providing a systematic risk-based approach to assurance of quality. This paper presents a methodology to complement sampling algorithms by computing the largest box inscribed within a given probabilistic design space at a desired reliability level. Such an encoding of the samples yields an operational envelope that can be conveniently communicated to process operators as independent ranges in process parameters. The first step involves training a feed-forward multi-layer perceptron as a surrogate of the sampled probability map. This surrogate is then embedded into a design centering problem, formulated as a semi-infinite program and solved using a cutting-plane algorithm. Effectiveness and computational tractability are demonstrated on the case study of a batch reactor with two critical process parameters. Copyright (C) 2021 The Authors.
This paper investigates the motion planning problem of planar m-link (m >= 4) closed chains among point obstacles, with an extension to arbitrary convex 2-D obstacles. The configuration space (C-space) of closed chains is embedded into two copies of (m-3)-dimensional tori. Two structural sets, the C-boundaries and the C-obstacles, are analyzed based upon the C-spaces of recursively constructed lower-dimensional closed chains. They contain essential structural information about the connectivity of the collision-free portion (C-free) of the C-space. By approximating each workspace obstacle by a set of points on the boundary after dilation, its corresponding C-obstacle is guaranteed to be covered by the C-obstacle of the convex hull of the point set. This permits a resolution-complete roadmap algorithm that biases sampling toward the structural sets. Several benchmark examples are presented that compare the performance of our algorithm with that of traditional algorithms. Animation videos and source code are also provided, which demonstrate the effectiveness of our method for closed chains of up to 20 links.