However, as the variogram estimates are correlated among themselves, it is the variance-covariance matrix of the estimates that is needed. In practice, the experimental variogram may be calculated for different directions and, for irregularly located data, an angle tolerance must also be specified. The variogram is calculated for lags that are integral multiples of h: h, 2h, and so on. In fact, it will be shown [see Eq. Additionally, in order to obtain the result given in Equation (7), Cressie assumes that data occur in a transect and are equally spaced, which, in the geosciences, is not generally the case.
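The classical estimator referred to above can be sketched as follows for a one-dimensional transect; the function name, the brute-force pair search, and the lag-tolerance interface are our own illustrative choices, not the paper's:

```python
import numpy as np

def experimental_variogram(z, x, lags, lag_tol):
    """Classical experimental variogram:
    gamma(h) = (1 / (2 N(h))) * sum of (z_i - z_j)^2 over all pairs
    whose separation distance falls within lag_tol of the lag h."""
    gamma = np.full(len(lags), np.nan)
    for k, h in enumerate(lags):
        sq_diffs = []
        for i in range(len(z)):
            for j in range(i + 1, len(z)):
                if abs(abs(x[i] - x[j]) - h) <= lag_tol:
                    sq_diffs.append((z[i] - z[j]) ** 2)
        if sq_diffs:  # leave NaN for lags with no pairs
            gamma[k] = 0.5 * np.mean(sq_diffs)
    return gamma
```

For irregularly located data in two or more dimensions, the separation test would also need the angle tolerance mentioned above.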
Our intention in this paper is to provide a formula for the variance-covariance matrix of the experimental variogram using minimal assumptions.
We invoke Gaussianity solely for the purpose of evaluating fourth-order moments. Additional assumptions made by Genton in his approach are data independence and, for the multidimensional case, isotropy.
The first of these assumptions is unrealistic in the geosciences and the second is not always the case in practice. A nonparametric means of obtaining approximate confidence limits for a sample variogram is the jack-knife method. Resampling methods such as the jack-knife were developed originally to deal with independent and identically distributed random variables, which is not usually the case in geostatistical applications, where the data are spatially correlated.
In such cases the variance given by Equation (8) is only an approximation. There are several other alternative approaches to variogram modeling, including parametric approaches such as maximum likelihood (Kitanidis) and Bayesian inference (Ecker and Gelfand). Equation (16) gives the variance-covariance matrix of the experimental variogram. A similar but not equivalent expression, obtained under several asymptotic assumptions, may be found in Cressie. As a consequence of those assumptions, the results of Cressie cannot be applied to some variogram models, such as the exponential or the Gaussian (Cressie).
Equation (17) is general and takes into account the multiple use of the data. This idea can be clarified by means of a simple example. The experimental variogram is calculated for a given distance h, with two possible situations shown in Figure 1. Comparing Equation (20) with Equation (19), it is easy to see that the effect is as if the number of experimental data pairs is 1.

[Figure 1. A, The reuse of information in experimental variogram calculation. B, The same number of experimental pairs but without reusing information.]

This example becomes more complicated as the number of experimental data increases, but it is easily evaluated using Equations (16) and (17). With the variance-covariance matrix of the sampling distribution and the asymptotic results for the distribution of the sample variogram of Davis and Borgman, it is possible to construct confidence intervals to indicate the uncertainty of the estimates.
This is the well-known Bonferroni method for simultaneous confidence intervals (Rice). The variance-covariance matrix may also be used to fit a theoretical model to the experimental variogram by weighted or by generalized least squares. A different approach to fitting the variogram by generalized least squares has been proposed by Genton. Genton uses an explicit formula for the covariance structure derived under the assumption that the experimental data are independent, while in this paper the variance-covariance matrix has been derived taking into account the possibility that the data may be correlated.
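The Bonferroni construction can be sketched as follows: with K lags and overall level alpha, each lag receives a two-sided interval at level alpha/K. This minimal illustration assumes a Gaussian approximation to the sampling distribution; the function name and interface are our own:

```python
import numpy as np
from statistics import NormalDist

def bonferroni_intervals(gamma_hat, cov, alpha=0.05):
    """Simultaneous (1 - alpha) confidence intervals for K variogram
    estimates: each lag gets a two-sided interval at level alpha / K,
    with standard errors taken from the variance-covariance matrix."""
    K = len(gamma_hat)
    se = np.sqrt(np.diag(cov))                      # per-lag standard errors
    z = NormalDist().inv_cdf(1 - alpha / (2 * K))   # Bonferroni-adjusted quantile
    return gamma_hat - z * se, gamma_hat + z * se
```

The off-diagonal terms of the variance-covariance matrix do not enter the interval widths, only the correction for making K statements simultaneously.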
In practice, in geostatistical operations such as kriging or conditional simulation, the experimental variogram is not used directly but rather the theoretical model fitted to it (Journel and Huijbregts). A problem that needs to be considered is how to incorporate the uncertainty of the experimental variogram into uncertainty in the variogram model parameters. Although a complete study of this topic is beyond the scope of this paper, a straightforward approximate solution is suggested below.
The solution is related to another application of the variance-covariance matrix of the variogram estimates: its use for fitting a theoretical variogram model by non-linear, generalized least squares.
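A minimal sketch of such a fit, minimizing the generalized least squares criterion (g - m)' C^{-1} (g - m); the exponential model and the brute-force grid search are our own illustrative choices, not the paper's method:

```python
import numpy as np

def exp_variogram(h, sill, rng):
    """Exponential variogram model."""
    return sill * (1.0 - np.exp(-h / rng))

def gls_fit(h, gamma_hat, cov, sills, ranges):
    """Generalized least squares fit of (sill, range) by exhaustive
    search over a parameter grid, weighting residuals by the inverse
    of the variance-covariance matrix of the variogram estimates."""
    cinv = np.linalg.inv(cov)
    best, best_obj = None, np.inf
    for s in sills:
        for r in ranges:
            resid = gamma_hat - exp_variogram(h, s, r)
            obj = resid @ cinv @ resid
            if obj < best_obj:
                best, best_obj = (s, r), obj
    return best
```

In practice a gradient-based nonlinear optimizer would replace the grid, but the objective function is the same.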
K is the number of experimental variogram lag values and n is the number of variogram model parameters. The application to the evaluation of the uncertainty is illustrated in the following case study. Figure 2 shows the locations of the experimental data, with an arbitrary origin at (0, 0) and the coordinates in the X and Y directions given in kilometers.
The basic univariate statistics of the experimental data are given in Table 1. For the calculation of these statistics, each experimental data value was weighted equally, although it is clear from Figure 2 that there are small clusters of experimental data, and better point statistics could be obtained using some declustering method.

[Table 1. Spatial locations of the experimental data.]

For illustrative purposes, an isotropic variogram is assumed.
The omnidirectional experimental variogram is shown in Figure 3. It has been calculated using Equation (3), for eight distance lags with an elementary lag of km and a lag tolerance of 50 km.

We can either use constrained optimization, or employ a parameterization that enforces this condition. We describe here five different parameterizations for variance-covariance matrices that ensure positive definiteness, while leaving the estimation problem unconstrained.
We compare the parameterizations based on their computational efficiency and statistical interpretability. The results described here are particularly useful in maximum likelihood and restricted maximum likelihood estimation in mixed effects models, but are also applicable to other areas of statistics.
In addition, the statistical properties of constrained estimates, such as asymptotic properties, can be difficult to characterize. Verifying that a given symmetric matrix is positive semi-definite is essentially as difficult as employing one of the unconstrained parameterizations we will describe later.

The estimation of variance-covariance matrices through optimization must ensure that the resulting estimate is positive semi-definite. This kind of estimation problem occurs, for example, in the analysis of mixed effects models. In these models, because the random effects are unobserved quantities, no sample variance-covariance matrix type of estimator, which would automatically be positive semi-definite, is available. Indirect estimation methods must be used. The estimation of a variance-covariance matrix through optimization of a log-likelihood function may occur even when a sample variance-covariance estimator is available, if, for example, one is interested in using maximum likelihood.

An unconstrained estimation approach for variance-covariance matrices in a Bayesian context using matrix logarithms can be found in Leonard and Hsu. Lindstrom and Bates describe the use of Cholesky factors for implementing unconstrained estimation of random effects variance-covariance matrices in linear and nonlinear mixed effects models using likelihood and restricted likelihood.
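The Cholesky-factor idea can be illustrated with a toy mapping from an unconstrained parameter vector to a covariance matrix; the function name and the 2 x 2 example are our own:

```python
import numpy as np

def chol_to_cov(theta, d):
    """Map an unconstrained vector theta to a d x d matrix via
    Sigma = L L', with L lower triangular and filled from theta
    row by row; the result is positive semi-definite by construction."""
    L = np.zeros((d, d))
    L[np.tril_indices(d)] = theta
    return L @ L.T

# Any real-valued theta yields a valid covariance matrix:
sigma = chol_to_cov(np.array([1.0, -0.5, 2.0]), 2)
```

An optimizer can therefore search freely over theta with no positive-definiteness constraint to enforce.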
In such problems the estimation can be reparameterized in such a way that the resulting estimate must be positive semi-definite. The simplest cases are those where there are simple inequality constraints on the parameters. In general, we can use numerically or analytically determined second derivatives of the objective function (Dennis, Jr.) to obtain confidence intervals for the variances and covariances, that is, for the elements of the variance-covariance matrix expressed in the original parameterization.

Our conclusions and suggestions for further research are presented in Section 4. We do not assume any further structure for the variance-covariance matrix. The rationale behind all the parameterizations is that estimation should be computationally simple and stable. The first three parameterizations presented below use the Cholesky factorization; if the diagonal elements of the factor are required to be positive, then the factorization is unique.
We call this parameterization the log-Cholesky parameterization. As in the Cholesky parameterization, the parameters lack direct interpretation in terms of the original variances and covariances. In some of the parameterizations there are particular components of interest that end up being uniquely defined. These can include the eigenvalues of the matrix, which are important in considering when the matrix is ill-conditioned, the individual variances or standard deviations, and the correlations.
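A sketch of the log-Cholesky map: storing the diagonal of the Cholesky factor on the log scale forces it to be positive after exponentiation, which makes the factor, and hence the parameterization, unique (toy code, our own naming):

```python
import numpy as np

def log_chol_to_cov(theta, d):
    """log-Cholesky map: the diagonal entries of L are stored as
    logarithms, so exp() makes them strictly positive and Sigma = L L'
    is uniquely determined by the unconstrained vector theta."""
    L = np.zeros((d, d))
    L[np.tril_indices(d)] = theta
    L[np.diag_indices(d)] = np.exp(L[np.diag_indices(d)])
    return L @ L.T
```

Compared with the plain Cholesky parameterization, this removes the sign ambiguity of the diagonal at the cost of parameters that are one step further from the original variances and covariances.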
This has implications for parameter identifiability and for confidence intervals in the parameter space. To ensure uniqueness of the spherical parameterization, further constraints must be imposed on the parameters. The spectral decomposition provides another parameterization.
Uniqueness can be attained, within an unconstrained estimation framework, by using a parameterization suggested by Jupp. Similarly to the Cholesky and log-Cholesky parameterizations, the parameters cannot be interpreted directly. The eigenvector matrix in the decomposition can be handled through a Householder parameterization.

This lesson explains how to use matrix methods to generate a variance-covariance matrix from a matrix of raw data.
Variance is a measure of the variability or spread in a set of data. Mathematically, it is the average squared deviation from the mean score.
We use the following formula to compute population variance:

Var(X) = Σ ( X_i − X̄ )² / N = Σ x_i² / N

where:
N is the number of scores in a set of scores;
X̄ is the mean of the N scores;
X_i is the i-th raw score in the set of scores;
x_i is the i-th deviation score in the set of scores, x_i = X_i − X̄;
Var(X) is the variance of all the scores in the set.

Covariance is a measure of the extent to which corresponding elements from two sets of ordered data move in the same direction. We use the following formula to compute population covariance:

Cov(X, Y) = Σ ( X_i − X̄ )( Y_i − Ȳ ) / N = Σ x_i y_i / N
where:
N is the number of scores in each set of data;
X̄ is the mean of the N scores in the first data set;
X_i is the i-th raw score in the first set of scores;
x_i is the i-th deviation score in the first set of scores;
Ȳ is the mean of the N scores in the second data set;
Y_i is the i-th raw score in the second set of scores;
y_i is the i-th deviation score in the second set of scores;
Cov(X, Y) is the covariance of corresponding scores in the two sets of data.
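The two formulas combine naturally in matrix form: with x the N × k matrix of deviation scores, the population variance-covariance matrix is V = x′x / N. A small sketch with made-up scores:

```python
import numpy as np

# Raw data: N = 3 observations (rows) on k = 2 variables (columns).
X = np.array([[1.0, 2.0],
              [2.0, 4.0],
              [3.0, 6.0]])

N = X.shape[0]
x = X - X.mean(axis=0)   # deviation scores: x_i = X_i - column mean
V = x.T @ x / N          # population variance-covariance matrix
```

The diagonal of V holds the variances of the two variables and the off-diagonal entries hold their covariance.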