September 17, 2011. Posted by karmadsen under blog, Geostatistics, Modeling Software, Statistics.
In attempting to find a rule of thumb for the minimum number of observation points needed for kriging, I found that it's not particularly straightforward. Kriging is a linear least squares estimation algorithm. Even in simple least squares regression, there is a lot of guesswork involved in determining the minimum sample size needed prior to conducting a study. Various estimation methods exist to help researchers design regression studies based on the number of variables and the desired power level. But in kriging, there are added levels of complexity. The x,y distribution of points matters (e.g., they shouldn't all be clumped together). The volatility of the z-value matters too (i.e., you need more points to capture busy spatial trends).
Various methods exist to design spatial sampling for interpolation. The simplest and most obvious is to sample on a grid, with the resolution as dense as is feasible given the project budget. If a specific portion of the sample area is especially sensitive, the sampling resolution could be higher in that region and lower elsewhere. Once kriging has been conducted on a set of points, kriging mathematics (for example, the kriging variance at unsampled locations) can be used to project where the "next" sampling point or points should go to reduce uncertainty in the model.
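
As a rough illustration of the grid design described above, here is a minimal Python sketch that lays a coarse grid over the whole study area and a finer grid over one sensitive sub-region. The function names (`grid`, `sampling_plan`), the rectangular extents, and the spacings are all assumptions made up for this example. A real design would follow the site geometry and the budget.

```python
def grid(xmin, xmax, ymin, ymax, spacing):
    """Regular grid of (x, y) sample locations covering a rectangle."""
    nx = int(round((xmax - xmin) / spacing)) + 1
    ny = int(round((ymax - ymin) / spacing)) + 1
    # Round coordinates so that points shared by two grids compare equal.
    return [
        (round(float(xmin) + i * spacing, 9), round(float(ymin) + j * spacing, 9))
        for i in range(nx)
        for j in range(ny)
    ]


def sampling_plan(area, coarse, sensitive, fine):
    """Coarse grid over `area` plus a finer grid inside `sensitive`.

    `area` and `sensitive` are (xmin, xmax, ymin, ymax) tuples
    (a hypothetical layout chosen for this sketch).
    """
    points = set(grid(*area, coarse))        # baseline coverage
    points.update(grid(*sensitive, fine))    # denser sampling where it matters
    return sorted(points)


# Hypothetical 100 x 100 site with a sensitive 25 x 25 patch.
plan = sampling_plan((0, 100, 0, 100), 25, (50, 75, 50, 75), 12.5)
print(len(plan), "sampling locations")
```

Halving the spacing in the sensitive patch only adds the points that are not already on the coarse grid, so the extra sampling cost stays local to that region.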
Rather than establishing a rule of thumb for the minimum number of sampling points, a more important question is how to get the most out of the points you have. Each prediction within the calculated surface is based on the nearest observation points, and the number of neighbors to use is set by the user. According to the Harvard School of Public Health, a general rule of thumb is to use a sizable fraction of the total data set: "For example, for 100 data points, I would try to use at least 25 neighbors, and more if possible. For 1000, I would use at least 25-50 and ideally a few hundred, but the computations may be too slow with this many."
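
To make the role of the neighbor count concrete, here is a minimal sketch of ordinary kriging at a single location using only the k nearest observations. It is an illustration under stated assumptions, not how any particular kriging package works: the exponential semivariogram, its sill and range values, and the helper names (`exp_variogram`, `solve`, `krige_local`) are all invented for this example, and in practice the variogram would be fitted to the data.

```python
import math


def exp_variogram(h, sill=1.0, rng=10.0):
    """Exponential semivariogram with no nugget (assumed model, not fitted)."""
    return sill * (1.0 - math.exp(-h / rng))


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def krige_local(points, values, target, k):
    """Ordinary kriging estimate at `target` from the k nearest observations."""
    k = min(k, len(points))
    nearest = sorted(range(len(points)),
                     key=lambda i: math.dist(points[i], target))[:k]
    pts = [points[i] for i in nearest]
    # Semivariogram block, bordered by the unbiasedness constraint
    # (weights must sum to 1).
    a = [[exp_variogram(math.dist(p, q)) for q in pts] + [1.0] for p in pts]
    a.append([1.0] * k + [0.0])
    b = [exp_variogram(math.dist(p, target)) for p in pts] + [1.0]
    weights = solve(a, b)[:k]  # last entry is the Lagrange multiplier
    return sum(w * values[i] for w, i in zip(weights, nearest))


# Hypothetical observations: (x, y) locations and z-values.
pts = [(0.0, 0.0), (10.0, 0.0), (0.0, 10.0), (10.0, 10.0), (5.0, 5.0)]
vals = [1.0, 2.0, 3.0, 4.0, 2.5]
print(krige_local(pts, vals, (4.0, 6.0), k=3))
```

The linear system solved for each prediction has k + 1 rows, so the neighbor count directly controls the cost per prediction. That is the trade-off the quoted rule of thumb points at.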
In general, more is better when it comes to observation points. However, it's also possible to have too many observation points for kriging. During the kriging calculation, an N x N matrix (where N is the number of sample points) must be inverted. This can become computationally intensive, since the cost of a direct inversion grows roughly as N cubed. Also, the N x N matrix becomes ill-conditioned when sample points are located close together. Thus, kriging works best for sparse sample sets [2].
1. Ciol, M.A. (2008). Presentation: Sample Size and Power Calculations. University of Washington.
2. Swiler, L.P., Slepoy, R., Giunta, A.A. Evaluation of Sampling Methods in Constructing Response Surface Approximations. Albuquerque, NM: Sandia National Laboratories. American Institute of Aeronautics and Astronautics.