Raster models

<< Click to Display Table of Contents >>

Navigation:  Surface and Field Analysis > Modeling Surfaces >

Raster models

A raster, grid or image file may be thought of as a rectangular array of identical square cells, each of which has an associated value (which may be zero or a code for “missing value”). The array therefore comprises a number of rows and columns. In some instances (as in mathematics) rows run from 1 to n from the top and from 1 to m from the left, whilst in others (including standard geographic referencing) the rows may be in the reverse order, i.e. the first row is actually the bottom of the array, or most southerly position, with subsequent rows representing successive steps to the north (northings). Columns in this model correspond to east-west location, and may be referred to as eastings. In Cartesian coordinates eastings correspond to x-values and northings to y-values. The contents of cells are then considered as field values or surface heights, and conventionally referred to by the letter z. A raster grid is thus an array of n rows and m columns {x,y,z}. Where multiple bands or values exist for each (x,y) location a series of triples {x,y,z1}, {x,y,z2}, {x,y,z3} … are recorded, and typically treated as a set of separate (associated) surfaces. Collections of grid datasets that are adjacent are often called tiles, and separate tiles may be processed independently or may first require combining into a single larger grid before analysis takes place. This process is known as creating a mosaic in some GIS packages, or a combination operation in others. When such datasets are combined there may be user-selectable rules on how any small overlaps of edges should be treated — typically these default to a First model, i.e. the values in the first grid to be processed take precedence, but other options might include averaging, taking the largest or smallest values etc. The advantage of combining grids as a mosaic is that analysis is conducted on the entire dataset (region) of interest, taking into account all of the recorded values, and minimizing edge effects; the main disadvantage is that such datasets can become extremely large and thus very cumbersome and slow to process.

We can describe the 2-step (Moore von Neumann, or MvN) neighborhood of a single cell of a grid structure in (at least) two separate ways, depending on the referencing system applied (Figure 6‑4).

Figure 6‑4 Raster file neighborhoods


Here we have illustrated referencing for cells using compass-type notation and using numerical offset notation. Both may be extended to cells further away from z*, but the numerical system is simpler as it remains in the same format (2 values) as more distant cells are included.

Using either of the notational forms we can define operations on raster grids. For example, the rate of change in the x-direction and y-direction about z* (the first-order partial derivatives) can be estimated as:


where the ∆x and ∆y values are determined by the grid resolution. These are the formulas used by the Surfer and Idrisi packages, amongst others. Basic second order differentials, required for curvature calculations, follow a similar format:

An alternative approximation, first introduced by Horn (1981) uses all 8 immediate neighbors of z*. ArcGIS is an example of a GIS package that uses this method for some of its calculations: for the x-direction the directly east and west positions are weighted 2 whilst the remaining 4 directions (NE, NW, SE, SW) are weighted 1; for the y-direction the directly north and south points are weighted 2 and the remainder weighted 1. The result is a slightly more complicated pair of formulas for the first-order differentials that have been found to be better than many alternatives, especially for rough surfaces. Using these approximations both slope and aspect can be estimated for each point of a grid (see further, Section 6.2.1, Gradient, slope and aspect). The formulas are based on standard finite difference expressions:

Surface curvature measures (including plan, profile and tangential curvature) require second order derivatives, and again these may use relatively simple formulas or more complex variants. ArcGIS uses a method of estimation based on fitting a partial fourth-order polynomial to the 3x3 window in which z* sits, whilst Surfer applies simpler differential approximations. Both approaches are described in more detail in Section 6.2, Surface Geometry. Landserf uses a different method from those described above for all of its main surface analysis facilities. The method is based on a quadratic surface approximation at each grid cell, using the immediate 8-cell neighborhood (i.e. a 3x3 window) as standard. It also provides the option to use a larger window size and a distance decay function for weighting cells that are more distant (such functions are described in Section 4.4.5, Distance decay models).The quadratic fitted surface is of the form:


where a, b, c, d, e and f are constants to be estimated. The fitted quadratic can be analytically differentiated, the components equated to 0, and the slope, aspect and curvature components derived directly from this process. The result is a set of very simple expressions in terms of the constants for the main surface characteristics: aspect is simply computed as:

A=tan-1(e/d), and slope as:

It should be noted that the question of determining the coefficients remains, and since there are only 6 of these and potentially 9 cell values to utilize, a level of redundancy exists permitting a range of choices or procedures for coefficient selection, with least squares fitting being one such choice.

An alternative, devised by Zevenbergen and Thorne (1987), is to fit a 9 parameter function that is of quartic (4th) order but only quadratic in x and y:


This function has the advantage that all its parameters are uniquely determined from the values of the cells in the 3x3 window, and the function passes exactly through each data point, including the central point. The coefficients a, b, c and i are not used in computations of slope, aspect or curvature, so may be ignored for these measures. The coefficient i=z*, the value at the central point of the 3x3 window (z0,0). The coefficients g and h turn out to be identical to the simple expressions used by Surfer and others as estimates of first order derivatives, and these determine both slope and aspect computations so the method reduces to simple finite differences for these measures. The expressions for plan and profile curvature are more complex, and use all of the coefficients dh. This quartic model is the one used by ArcGIS as described above, SAGA (as its default option) and in the RiverTools facility Extract|Finite difference grid. The terms d‑h are also determined in exactly the same manner as for the second-order derivatives used by Surfer, as defined above. The derivation and use of these expressions is discussed in more detail in Moore et al. (1991, 1993), Wood (1996) and Zevenbergen and Thorne (1987). Useful analyses of alternative algorithms and the associated issues of errors and uncertainty analysis are provided by Jones (1998), Zhang and Goodchild (2002, Chapter 5) and Zhou and Liu (2004).

Surfaces represented by grids suffer a number of disadvantages. These include:

very large requirements for data storage, processing and display — the test grid TQ81NE that will be used for illustrative purposes here is 1.4Mbytes as an NTF file, 1.7Mbytes as an ASCII or Surfer grid, and less than 150kbytes as an ArcGIS TIN file and only 33kbytes as a 500-vertex Landserf TIN file (albeit with some degree of information loss). Image files are often far larger — a single USGS High Resolution Orthoimage covering approximately 4.5kmx4.5km in color at 0.3m resolution requires 660Mbytes of memory to load (40Mbytes when compressed with no information loss)

a fixed grid size and orientation that does not reflect variations in underlying surface complexity

lack of clarity regarding linear and point-like features

use of 3x3 windows (8-cell neighborhoods) can result in strong directional bias in some surface computations. Larger window sizes and more sophisticated local models can alleviate some of these difficulties

On the other hand grids are extremely convenient for data manipulation, combination and display, which explains their widespread popularity.