Analyzing incomplete data to infer the structure of gene regulatory networks (GRNs) is a challenging task in bioinformatics, and Bayesian networks can be used successfully in this field. k-nearest neighbor, singular value decomposition (SVD)-based, and multiple imputation by chained equations are three fundamental imputation methods for dealing with missing values. The path consistency (pc) algorithm based on conditional mutual information (pcA-CMI) is a well-known algorithm for inferring GRNs; it requires the data set to be complete. However, pcA-CMI is not a stable algorithm: when applied to permuted gene orders, it produces different networks. We propose an order-independent algorithm, pcA-CMI-OI, for inferring GRNs. After imputation of missing data, the performances of pcA-CMI and pcA-CMI-OI are compared. Results show that networks constructed from data imputed by the SVD-based method with the pcA-CMI-OI algorithm outperform the other imputation methods and pcA-CMI. pc-based algorithms yield an undirected or partially directed network. The mutual information test (MIT) score, which handles discrete data, is a well-known method for directing the edges of the resulting networks. We also propose a new score, ConMIT, which is appropriate for analyzing continuous data. Results show that applying the ConMIT score improves the precision of directing the edges of the skeleton.
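As a rough illustration of the SVD-based imputation this abstract refers to, a common scheme is iterative low-rank completion: fill missing entries with column means, then repeatedly replace them with values from a truncated-SVD reconstruction. This is a generic sketch, not the paper's exact procedure; the rank and iteration count are illustrative assumptions.

```python
import numpy as np

def svd_impute(X, rank=2, n_iter=50, tol=1e-6):
    """Iterative low-rank SVD imputation: seed missing entries with
    column means, then repeatedly overwrite them with the rank-`rank`
    SVD reconstruction until the imputed values stop changing."""
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()
    col_means = np.nanmean(X, axis=0)
    filled[missing] = col_means[np.where(missing)[1]]
    for _ in range(n_iter):
        U, s, Vt = np.linalg.svd(filled, full_matrices=False)
        s[rank:] = 0.0                       # truncate to low rank
        approx = (U * s) @ Vt
        delta = np.max(np.abs(filled[missing] - approx[missing]))
        filled[missing] = approx[missing]    # update only the missing cells
        if delta < tol:
            break
    return filled
```

Observed entries are never modified, so the method only interpolates the holes; on data that is genuinely close to low rank, the imputed values converge near the true ones.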
In the conventional factor-augmented vector autoregression (FAVAR), the extracted factors cannot be used in structural analysis because they do not retain a clear economic interpretation. This paper proposes a new method to identify macroeconomic factors that carries a clearer economic interpretation. Using an empirical-based search algorithm, we select variables that are individually caused by a single factor. These variables are then used to impose restrictions on the factor loading matrix, giving each factor an economic interpretation. We apply our method to US time-series data and further conduct a monetary policy analysis. Our method yields stronger responses of price variables and more muted responses of output variables than the literature has found.
This paper considers the use of two machine learning algorithms to identify the causal relationships among retail prices, manufacturer prices, and number of packages sold. The two algorithms are pc and Linear Non-Gaussian Acyclic Models (LiNGAM). The dataset studied comprises scanner data collected from the retail sales of carbonated soft drinks in the Chicago area. The pc algorithm is not able to assign direction among retail price, manufacturer price, and quantity sold, whereas the LiNGAM algorithm is able to decide in every case: retail price leads manufacturer price and quantity sold.
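The key idea LiNGAM exploits is that with non-Gaussian noise, a linear regression in the causally correct direction leaves a residual that is fully independent of the regressor, while in the wrong direction the residual is merely uncorrelated. A minimal pairwise sketch (not the full LiNGAM estimator; the squared-variable dependence proxy is an illustrative assumption):

```python
import numpy as np

def dep_score(a, b):
    """Crude dependence proxy beyond plain correlation: correlations
    involving squares pick up the higher-order dependence that OLS
    residuals retain in the causally wrong direction (non-Gaussian noise)."""
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    return (abs(np.corrcoef(a**2, b**2)[0, 1])
            + abs(np.corrcoef(a**2, b)[0, 1])
            + abs(np.corrcoef(a, b**2)[0, 1]))

def pairwise_lingam(x, y):
    """Decide between x->y and y->x: regress each way and prefer the
    direction whose residual looks more independent of the regressor."""
    b_yx = np.cov(x, y)[0, 1] / np.var(x)
    r_y = y - b_yx * x                 # residual of y given x
    b_xy = np.cov(x, y)[0, 1] / np.var(y)
    r_x = x - b_xy * y                 # residual of x given y
    return "x->y" if dep_score(x, r_y) < dep_score(y, r_x) else "y->x"
```

With Gaussian noise both residuals are independent of their regressors and the comparison is uninformative, which is exactly why pc cannot orient these edges while LiNGAM can.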
This paper studies the effect of monetary policy in Thailand based on a structural vector autoregression (SVAR) model. Unlike existing studies, this paper (i) properly controls for external factors, (ii) uses identifying restrictions that are specified and justified from empirical evidence, and (iii) studies the immediate as well as the short-term effect of monetary policy. I find that several important stylized facts on the transmission mechanism of monetary policy need to be revised. (C) 2015 Elsevier Inc. All rights reserved.
Monthly observations on prices from 10 weight/gender classifications of Nebraska beef cattle are studied in an error correction model (ECM) framework. This study attempts a replication of the 2003 paper on Texas prices by Bessler and Davis, where they find medium heifers (600-700 lb) at the center of price discovery. Using the ECM results, Nebraska light steers are found to be weakly exogenous, with the innovation-accounting results showing marked differences. Industry structure, production choices, and animal type and breeding herd differences between Texas and Nebraska are proposed as plausible reasons for partial (or incomplete) success at replication.
This article reevaluates the impulse response functions (IRFs) to a monetary policy shock of the structural vector autoregression (SVAR). Identifying restrictions are specified and justified based on empirical evidence, i.e., conditional independence relations of variables, which is an important dimension that a good model must be able to mimic. The empirical-based approach is able to significantly narrow down the set of admissible causal orders to identify the IRFs to a monetary policy shock (from 2,482 to 8). I find that most of the qualitative "stylized" features reported in the literature remain intact. However, the quantitative predictions are much less certain than what is commonly perceived.
ISBN:
(Print) 9783319114330; 9783319114323
A fundamental step in the pc causal discovery algorithm consists of testing for (conditional) independence. When the number of data records is very small, a classical statistical independence test is typically unable to reject the (null) independence hypothesis. In this paper, we compare two conflicting pieces of advice in the literature that, in the case of too few data records, recommend (1) assuming dependence and (2) assuming independence. Our results show that assuming independence is a safer strategy for minimizing the structural distance between the causal structure that generated the data and the discovered structure. We also propose a simple improvement on the pc algorithm that we call blacklisting. We demonstrate that blacklisting can lead to orders-of-magnitude savings in computation by avoiding unnecessary independence tests.
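The conditional-independence test at the heart of pc is, for Gaussian data, a test of zero partial correlation via Fisher's z transform. A sketch incorporating the "assume independence when there are too few records" rule discussed above (the significance level and the exact small-sample cutoff are illustrative assumptions):

```python
import math
import numpy as np

def fisher_z_ci_test(data, i, j, cond=()):
    """Gaussian (Fisher z) conditional-independence test as used inside pc:
    tests X_i independent of X_j given X_cond at alpha = 0.05. When the
    effective sample size is non-positive, follow the 'assume
    independence' advice and return True."""
    n = data.shape[0]
    df = n - len(cond) - 3
    if df <= 0:
        return True                        # too few records: assume independence
    idx = [i, j] + list(cond)
    prec = np.linalg.inv(np.corrcoef(data[:, idx], rowvar=False))
    r = -prec[0, 1] / math.sqrt(prec[0, 0] * prec[1, 1])  # partial correlation
    r = max(min(r, 0.999999), -0.999999)   # guard against |r| = 1
    z = 0.5 * math.log((1 + r) / (1 - r))  # Fisher z transform
    stat = math.sqrt(df) * abs(z)
    return stat <= 1.96                    # cannot reject -> treat as independent
```

Blacklisting, as proposed in the abstract, would then record edges already ruled out so that later iterations of pc skip redundant calls to this test.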
In real world applications, graphical statistical models are not only a tool for operations such as classification or prediction, but usually the network structures of the models themselves are also of great interest (e.g., in modeling brain connectivity). The false discovery rate (FDR), the expected ratio of falsely claimed connections to all those claimed, is often a reasonable error-rate criterion in these applications. However, current learning algorithms for graphical models have not been adequately adapted to the concerns of the FDR. The traditional practice of controlling the type I error rate and the type II error rate under a conventional level does not necessarily keep the FDR low, especially in the case of sparse networks. In this paper, we propose embedding an FDR-control procedure into the pc algorithm to curb the FDR of the skeleton of the learned graph. We prove that the proposed method can control the FDR under a user-specified level at the limit of large sample sizes. In the cases of moderate sample size (about several hundred), empirical experiments show that the method is still able to control the FDR under the user-specified level, and a heuristic modification of the method is able to control the FDR more accurately around the user-specified level. The proposed method is applicable to any models for which statistical tests of conditional independence are available, such as discrete models and Gaussian models.
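The generic FDR machinery such a method builds on is the Benjamini-Hochberg step-up procedure applied to per-edge p-values. A minimal sketch of that ingredient only (the paper's actual coupling of FDR control with the pc skeleton search is more involved):

```python
def bh_select_edges(pvalues, q=0.05):
    """Benjamini-Hochberg step-up procedure over per-edge p-values:
    sort p-values, find the largest rank k with p_(k) <= q*k/m, and
    keep the k edges with the smallest p-values (FDR level q)."""
    items = sorted(pvalues.items(), key=lambda kv: kv[1])
    m = len(items)
    k = 0
    for rank, (_, p) in enumerate(items, start=1):
        if p <= q * rank / m:
            k = rank                  # largest rank passing its threshold
    return {edge for edge, _ in items[:k]}
```

For example, with edge p-values {0.001, 0.008, 0.039, 0.041, 0.9} and q = 0.05, the step-up thresholds are 0.01, 0.02, 0.03, 0.04, 0.05, so only the first two edges are kept.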
Conditional independence graphs are now widely applied in science and industry to display interactions between large numbers of variables. However, the computational load of structure identification grows with the number of nodes in the network and the sample size. A tailored version of the pc algorithm is proposed which is based on mutual information tests with a specified testing order, combined with false negative reduction and false positive control. It is found to be competitive with current structure identification methodologies for both estimation accuracy and computational speed and outperforms these in large scale scenarios. The methodology is also shown to approximate dense networks. The comparisons are made on standard benchmarking data sets and an anonymized large scale real life example.
ISBN:
(Print) 9781457718588
Insight into brain development and organization can be gained by computing correlations between structural and functional measures in parcellated cortex. Partial correlations can often reduce ambiguity in correlation data by identifying those pairs of regions whose similarity cannot be explained by the influence of other regions with which they may both interact. Consequently, a graph with edges indicating non-zero partial correlations may reveal important subnetworks obscured in the correlation data. Here we describe and investigate pc*, a graph pruning algorithm for identification of the partial correlation network, in comparison to direct calculation of partial correlations from the inverse of the sample correlation matrix. We show that pc* is far more robust and illustrate its use in the study of covariation in cortical thickness in ROIs defined on a parcellated cortex.
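The baseline this abstract compares against, partial correlations computed directly from the inverse of the sample correlation matrix, can be sketched in a few lines (a generic illustration, not the pc* algorithm itself):

```python
import numpy as np

def partial_correlations(data):
    """Partial correlation of every variable pair given all remaining
    variables, from the precision matrix P = R^{-1} of the sample
    correlation matrix R: rho_ij|rest = -P_ij / sqrt(P_ii * P_jj)."""
    R = np.corrcoef(data, rowvar=False)
    P = np.linalg.inv(R)                 # fragile when R is near-singular
    d = np.sqrt(np.diag(P))
    pcor = -P / np.outer(d, d)
    np.fill_diagonal(pcor, 1.0)
    return pcor
```

The matrix inverse is exactly the fragile step: when regions outnumber subjects or the correlation matrix is near-singular, this direct calculation becomes unstable, which motivates the pruning approach of pc*.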