Generalized linear models are a popular analytics tool with interpretable results and broad applicability, but they require iterative estimation procedures whose data transfer and computational costs can be problematic under some infrastructure constraints. We propose a doubly-sketched approximation of the iteratively re-weighted least squares algorithm to estimate generalized linear model parameters using a sequence of surrogate datasets. The procedure sketches once to reduce data transfer costs, and sketches again to reduce computation costs, yielding wall-clock time savings. Regression coefficients and standard errors are produced and compared against methods from the literature. Asymptotic properties of the proposed procedure are shown, with empirical results from simulated and real-world datasets. The efficacy of the proposed method is investigated across a variety of commodity computational infrastructure configurations accessible to practitioners. A highlight of the present work is the estimation of a Poisson-log generalized linear model across almost 1.7 billion observations on a personal computer in 25 min.
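For orientation, the baseline that the abstract above approximates can be sketched as follows. This is plain, unsketched iteratively re-weighted least squares for a Poisson-log model with one covariate, not the doubly-sketched procedure of the paper. The function name `irls_poisson`, the starting values, and the stopping rule are assumptions made for illustration.

```python
import math

def irls_poisson(x, y, n_iter=25, tol=1e-10):
    """Fit log E[y] = b0 + b1 * x by iteratively re-weighted least squares.

    Illustrative only: one covariate, no sketching, no safeguards.
    Returns (b0, b1, se_b0, se_b1).
    """
    # Assumed starting point: intercept at the log of the mean response.
    b0, b1 = math.log(sum(y) / len(y)), 0.0
    for _ in range(n_iter):
        # Accumulate the weighted normal equations X'WX beta = X'Wz.
        s_w = s_wx = s_wxx = s_wz = s_wxz = 0.0
        for xi, yi in zip(x, y):
            eta = b0 + b1 * xi
            mu = math.exp(eta)
            w = mu                      # working weight for the Poisson-log model
            z = eta + (yi - mu) / mu    # working response
            s_w += w
            s_wx += w * xi
            s_wxx += w * xi * xi
            s_wz += w * z
            s_wxz += w * xi * z
        # Solve the 2x2 system in closed form.
        det = s_w * s_wxx - s_wx * s_wx
        new_b0 = (s_wxx * s_wz - s_wx * s_wxz) / det
        new_b1 = (s_w * s_wxz - s_wx * s_wz) / det
        converged = abs(new_b0 - b0) + abs(new_b1 - b1) < tol
        b0, b1 = new_b0, new_b1
        if converged:
            break
    # Standard errors from the diagonal of (X'WX)^{-1}, using the weights
    # from the last pass through the data (evaluated at the previous iterate).
    se_b0 = math.sqrt(s_wxx / det)
    se_b1 = math.sqrt(s_w / det)
    return b0, b1, se_b0, se_b1
```

Every iteration makes a full pass over the data to rebuild the weighted normal equations, which is the per-iteration cost that a sketching approach aims to reduce.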
Generalized linear models provide a useful tool for analyzing data from quality-improvement experiments. We discuss why analysis must be done for all the data, not just for summarizing quantities, and show by examples how residuals can be used for model checking. A restricted-maximum-likelihood-type adjustment for the dispersion analysis is developed.
ISBN: (print) 9783319212067; 9783319212050
We investigate the performance of a hybrid classifier on a classic image-processing problem: detecting skin regions in a picture. Our approach first partitions the input dataset by clustering. Then, for each cluster, we apply the well-known generalized linear models to distinguish skin from non-skin points. We evaluate the approach using several well-known metrics and compare its performance with that of feed-forward neural networks. The results show that the proposed approach is a good alternative for solving the skin-identification problem.
ISBN: (print) 9781424479290
With a globally aging population, the burden of care of cognitively impaired older adults is becoming increasingly concerning. Instances of Alzheimer's disease and other forms of dementia are becoming ever more frequent. Earlier detection of cognitive impairment offers significant benefits, but remains difficult to do in practice. In this paper, we develop statistical models of the behavior of older adults within their homes using sensor data in order to detect the early onset of cognitive decline. Specifically, we use inhomogeneous Poisson processes to model the presence of subjects within different rooms of the home throughout the day using unobtrusive sensing technologies. We compare the distributions learned from cognitively intact and impaired subjects using information-theoretic tools and observe statistical differences between the two populations, which we believe can be used to help detect the onset of cognitive decline.
Gambusia affinis (G. affinis) is an invasive fish species found in the Sundays River Valley of the Eastern Cape, South Africa. The relative abundance and population dynamics of G. affinis were quantified in five interconnected impoundments within the Sundays River Valley. This study utilised a G. affinis data set to demonstrate various classical ANOVA models. Generalized linear models were used to standardize catch per unit effort (CPUE) estimates and to determine environmental variables which influenced the CPUE. Based on the generalized linear model results, dam age, mean temperature, Oreochromis mossambicus abundance and Glossogobius callidus abundance had a significant effect on the G. affinis CPUE. The Albany Angling Association collected data during fishing tag and release events. These data were utilized to demonstrate repeated measures designs. Mixed-effects models provide a powerful and flexible tool for analyzing clustered data such as repeated measures data and nested data; hence they have become tremendously popular as a framework for the analysis of bio-behavioral experiments. The results show that the mixed-effects methods proposed in this study are more efficient than those based on generalized linear models. These data were better modeled with mixed-effects models due to their flexibility in handling missing data.
In generalized linear models with fixed design, under the assumption $\underline{\lambda}_n \to \infty$ and other regularity conditions, the asymptotic normality of the maximum quasi-likelihood estimator $\hat{\beta}_n$, which is the root of the quasi-likelihood equation with natural link function $\sum_{i=1}^{n} X_i (y_i - \mu(X_i'\beta)) = 0$, is obtained, where $\underline{\lambda}_n$ denotes the minimum eigenvalue of $\sum_{i=1}^{n} X_i X_i'$, the $X_i$ are bounded $p \times q$ regressors, and the $y_i$ are $q \times 1$ responses.
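To make the quasi-likelihood equation above concrete, consider the special case of a scalar response ($q = 1$, so each regressor is a vector $x_i \in \mathbb{R}^p$) with the Poisson mean function $\mu(t) = e^t$. This specialization and the Newton iteration shown are illustrative assumptions, not taken from the abstract.

```latex
% Scalar-response Poisson instance of the quasi-likelihood equation
\[
  \sum_{i=1}^{n} x_i \bigl( y_i - e^{x_i'\beta} \bigr) = 0 .
\]
% One Newton step toward its root, starting from \beta^{(k)}
\[
  \beta^{(k+1)} = \beta^{(k)}
  + \Bigl( \sum_{i=1}^{n} e^{x_i'\beta^{(k)}} \, x_i x_i' \Bigr)^{-1}
    \sum_{i=1}^{n} x_i \bigl( y_i - e^{x_i'\beta^{(k)}} \bigr) .
\]
```

The matrix inverted in the Newton step is a weighted version of the sum of outer products of the regressors, which suggests why a growth condition on the minimum eigenvalue of that sum appears among the assumptions.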
This paper discusses the asymptotic properties of the SCAD (smoothly clipped absolute deviation) penalized quasi-likelihood estimator for generalized linear models with adaptive designs, which extend the related results for independent observations to dependent observations. Under certain conditions, the authors prove that the SCAD penalized method correctly selects covariates with nonzero coefficients with probability converging to one, and that the penalized quasi-likelihood estimators of nonzero coefficients have the same asymptotic distribution they would have if the zero coefficients were known in advance. That is, the SCAD estimator has consistency and oracle properties. Finally, the results are illustrated by some simulations.
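For readers unfamiliar with the penalty named above, the SCAD function in its standard form from the penalized-likelihood literature can be sketched as below. The function name `scad_penalty` and the default `a=3.7` (the conventional choice) are assumptions for illustration; the abstract does not state which tuning values the authors use.

```python
def scad_penalty(beta, lam, a=3.7):
    """SCAD penalty evaluated at a single coefficient.

    beta : coefficient value
    lam  : tuning parameter lambda > 0
    a    : shape parameter, must exceed 2 (3.7 is the conventional default)
    """
    t = abs(beta)
    if t <= lam:
        # Behaves like the lasso penalty near zero.
        return lam * t
    if t <= a * lam:
        # Quadratic transition that gradually relaxes the penalty.
        return (2 * a * lam * t - t * t - lam * lam) / (2 * (a - 1))
    # Constant for large coefficients, so they are not shrunk further.
    return lam * lam * (a + 1) / 2
```

The flat region for large coefficients is what distinguishes SCAD from the lasso: large effects are left essentially unpenalized, which is the mechanism usually associated with oracle-type results of the kind described in the abstract.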
Semiconductor yield modeling is essential to identify processing issues, improve quality, and meet customer demand. However, the massive amounts of data collected during the fabrication process and the number of historical models available make yield modeling a complex and challenging task. This paper presents a methodology to guide the practitioner in determining what data should be collected, integrated, and aggregated, followed by a modeling strategy to forecast yield using generalized linear models based on defect metrology data. This technique yields results at both the die and the wafer levels, significantly outperforms existing models found in the literature based on prediction errors, and identifies significant factors that can drive process improvement. This method also allows the nested structure of the process to be considered in the model, improving predictive capabilities and violating fewer assumptions. An example is presented to discuss this approach and to demonstrate the advantages of these models over the models of the past.
Penalized empirical likelihood for generalized linear models with longitudinal data is considered. It is shown that the penalized empirical likelihood estimators have the oracle property. We also show that the asymptotic distribution of the penalized empirical likelihood ratio test statistic is a chi-square distribution. The finite-sample performance of the proposed method is evaluated by some simulations and a real data example.
In this paper, for generalized linear models (GLMs) with a diverging number of covariates, the asymptotic properties of maximum quasi-likelihood estimators (MQLEs) are developed under some regularity conditions. The existence, weak convergence, rate of convergence, and asymptotic normality of linear combinations of MQLEs, together with the asymptotic distribution of single linear hypothesis test statistics, are presented. The results are illustrated by Monte Carlo simulations.