Cost distance

<< Click to Display Table of Contents >>

Navigation:  Building Blocks of Spatial Analysis > Distance Operations >

Cost distance

The term cost distance is widely used within GIS and in this context has a dual meaning: (i) the notion of an alternative family of distance metrics; and (ii) a procedure for determining least cost paths across continuous surfaces, typically using grid representations

Within GIS the term is distinct from the computation of varying costs or times of travel across a network. The term cost in this connection is also a generic idea, not necessarily implying financial cost, but some composite measure that varies (i.e. is not constant) across the study region, and which needs to be taken into account when computing paths and distances.

An example serves to clarify this notion. Suppose that we wish to construct a section of road between two points, A(0,2) and B(4,2), where costs are constant over the region of interest. In this case the distance is simply 4 units and the path involved is a straight line in the plane (Figure 4‑57, bold line). Construction costs will simply be 4k units, where k is some constant (the cost per unit length of path construction). Now suppose that construction costs also depend on the cost of land acquisition, and that land costs vary in our study region in a simple manner — greater to the “north” of our region than to the “south”. In this situation it is cheaper to construct our road with a curve in it, taking advantage of the fact that land costs are lower to the south (nearer the x-axis in our model).

Figure 4‑57 Cost distance model

costdistance1

If costs increase in a precisely linear manner, the optimal (least cost) path will have a smooth curve form of the kind shown by the thinner line (a shape like a chain hanging between two posts). This case turns out to be solvable mathematically, as are cases where costs vary in radially symmetric manner (e.g. around a city center), whereas almost every other example of varying costs cannot be solved precisely and we need to resort to computational procedures to find the least cost path.

Accumulated cost surfaces and least cost paths

Within GIS packages solutions to such problems are found using a family of algorithms known as Accumulated Cost Surface (ACS) methods — for a fuller discussion of ACS methods and representational accuracy see Douglas (1994) and Eastman (1989). The ACS procedure is applied to raster datasets, in which the primary input surface is a complete grid of generalized costs, i.e. every cell is assigned an absolute or relative cost measure, where costs must be (positive) ratio-scaled variables. In Figure 4‑58 we show a hypothetical cost grid for the same example as shown in Figure 4‑57. Values increase linearly away from the x-axis, from 0.5 to 2.5. This essentially a much simplified representation of the function f(x,y)=y, so f(x,y)=0 when y=0 for all x and increases steadily so at y=2 f(x,y)=2. In practice cost functions should rarely if ever have zero-valued sections, so this example should be considered illustrative!

Providing a gridded representation of the cost surface highlights the importance of using a sufficiently fine representation of the surface, since the solution path must be computed by adding up the costs along alternative paths. If the grid resolution is quite coarse, as in Figure 4‑58, the solution path will not have the curved form of the optimal path, but will simply be a very blocky stepped line of cells approximating the curve. Amending the grid resolution (halving the cell size to 0.25 units) as shown in Figure 4‑59A improves both the cost surface representation and the possible solution paths that are considered.

Figure 4‑58 Cost surface as grid

costdistance2

Figure 4‑59 Grid resolution and cost distance

costdistance3

If we look at a path in Figure 4‑58 that starts at A(0,2) and goes diagonally down towards C(2,0) and then back up to B(4,2) we have cell costs of 2+1.5+1+.5+.5+1+1.5+2=10 units (times cell size of 0.5 gives a cost distance of 5 units). But this path is longer than the sum of the steps — all the diagonal steps are √2 (1.414… units) longer than horizontal and vertical steps, so the total cost must be increased for such steps by this factor, giving a final cost distance of 6.85 units. This is less than the cost of 8 units that would be incurred for a straight line path from A to B. But can we be sure this is the best route possible for this problem and is the calculation “correct”?

ACS provides a systematic procedure for answering this question. With this algorithm we start at A and sum up the costs in all directions, step by step, choosing the least cost increment at each step. Eventually, when we arrive at B, we will have the overall least cost of getting to B, and if we have kept track of our route (the steps we have taken) we will have the least cost path from A to B. Thus ACS is a form of spread algorithm, with values increasing in all directions depending on the grid resolution, cost surface and algorithmic implementation. Almost all GIS implementations of ACS use immediate neighbors in calculating values (i.e. queen’s moves, to the 8 adjacent cells) but some facilitate calculation to neighbors further away. GRASS, for example, offers the option in the r.cost function, of using knight’s move neighbors (one step up and one diagonal) in its calculations. Distance transform methods utilize a much wider range of neighbors, and hence are not restricted to 8-directional cost surfaces and path alignments, and thus do not exhibit the common directional bias and computational errors of ACS distance models.

As we start to consider the ACS idea more closely we see that a number of additional issues need to be addressed. The first is that in a GIS-enabled grid structure cells (or more specifically cell centers) are frequently used as references rather than grid lines. Thus our starting point A cannot be (0,2) exactly but must be a cell center and our end points will also all be cell centers. For example, see Figure 4‑59B where A(0,2) is approximated on the grid by the point A(0.125, 1.875). For the best results we should estimate the value of the cost function at cell centers if possible, rather than at cell edges or grid lines. In the example we are using we have ignored this latter factor, which will often be the case since finer detail for within cell estimation may not be available (although could be obtained by interpolation).

A second issue is that if we take grid cell centers as our reference point, the start point will be half way across the first cell, and the next point will be half way across an adjacent cell, so it can be argued that the cell values (costs) should be allocated 50:50 over this one step, and likewise for steps 2, 3 etc. This can be seen in Figure 4‑59B where the starting point for A is shown as a selected cell center and the step to A′ in the diagonally adjacent cell lies 50% in a cell with cost 2 and 50% in a cell with cost 1.75. This 50:50 division of costs is the normal approach used in GIS packages. The averaged cost from A to A is then (2+1.75)/2=3.75/2 and thus the incremental cost distance is √2*0.25*3.75/2 units (adjusting for the diagonal path using √2 and for the cell size of 0.25 units in this example).

We now have all the basic elements we need in order to compute accumulated costs and identify the least cost path from A to B. Using our example cost grid, with a cell size of 0.25 units, we need to carry out the following operations using a typical GIS:

Create a source grid that contains a 0 in the cell in which our starting point A lies, and some other value (e.g. ‑1, 1, 9999) for all other grid cells. This can be done with a text editor and converted to a GIS compatible grid format if necessary

Create a cost grid in the same manner, exactly the same size and the same resolution as the source grid, but containing the costs associated with each cell position (i.e. similar to the cells in Figure 4‑59a)

Run the cost distance facility within the GIS on these two input grids, requesting the output as a cost distance (ACS) grid, and also requesting (if necessary) that a direction or tracking dataset is generated that can be subsequently used to identify least cost paths

Finally, to create the least cost paths, a target grid is required, which contains the target or destination points (again as 0s or a similar cell identifier for the target points). The least cost path facility in the GIS is then run using the target grid, ACS grid and tracking grid

A sample result for our test problem is shown in Figure 4‑60. Each grid cell has been separately colored and a series of additional destination points (cells) have been included in the least cost path finding stage. The least cost path from A to B, corresponding to our original problem, hugs the bottom of the diagram as it seeks to take advantage of the low cost first row (cells with a value of only 0.25 units). Routes to other cells may share common paths for parts of their route and may not be unique, as indicated by the two routes to the cell diagonally below point A to the right. If the cost surface had been flatter, for example f(x,y)=y/4+1, the ACS model with the cell resolution illustrated would fail to select the correct path, but would show a horizontal path from A to B. To obtain a more accurate estimate of the correct path and cost distance a much finer grid would be required. For this latter example the true path is a gentle curve with a minimum value at around y=1.65 and total cost of 5.88 units.

Figure 4‑60 Accumulated cost surface and least cost paths

clip0114

This simple example provides a flavor of cost distance computation, but this procedure (or family of procedures) has far more to offer. The first and perhaps most important point to note is that the source and target features do not need to be single points. They can be multiple distinct points (cells), lines (linear groups of cells), or regions (clusters of adjacent cells). In all such cases the ACS procedure computes the lowest total distance to a target cell from its nearest source cell. If requested the GIS may also produce an allocation raster, which identifies which source each cell in the grid is closest to (typically by giving each cell a unique integer identifier). Obstacles (e.g. lakes, cliffs etc.) may be included in these models by setting cells in the cost raster to very high values or by the GIS explicitly supporting unavailable (masked) cells (e.g. in Idrisi, coding barrier cells as -1).

The input cost raster may be, and often is, a composite grid created from a number of other (normalized) input grids by simple map algebraic operations. The final cost grid may also be normalized to provide an indication for each cell of the relative cost, difficulty or friction of crossing that cell. Figure 4‑61 illustrates the result of such a process, which has been applied to determining possible alternative routes for a main road in part of South-East England. The map shows the existing road (the A259T) from Hastings in the south-west to Brenzett in the north-east. Three alternative routes, identified here by arrows, were generated by ACS methods. These routes were generated from cost surfaces that included different weighted values for a series of inputs: land-use (grade and type of land); SSSIs (Sites of special scientific interest); areas of designated Ancient Woodland; the slope of the land across the study area (derived from a DEM); existing (buffered) built-up areas (many of which are conservation areas) and ancient monuments; National Trust land; and existing (buffered) road and rail routes. The final route alternatives were then computed from the composite ACS surface and smoothed to reduce the effects of the gridding of all the datasets and output paths. This information was then provided as input to a public consultation exercise.

This is a relatively simple example of a general class of problems categorized as corridor planning ― selection of alternative routes for roads, rail lines, power lines or other purposes frequently requires the involvement of many parties, and formal multi-criteria decision support systems, such as AHP and ANP, have been extensively used in this field ― see, for example, Azis (1990), Bailey (2003) and Piantanakulchai (2005).

Figure 4‑61 Alternative route selection by ACS

clip0115.zoom50

 

The above examples are based on the procedures utilized in ArcGIS, Spatial Analyst. Many other packages offer similar facilities. For example, Idrisi offers two alternative algorithms: one, called the pushbroom algorithm, is similar to distance transform procedures, and is fast but not suitable for use with more complex problems (for example truly implementing barriers); a second Idrisi procedure, costgrow, is much slower but does facilitate complex friction patterns and fixed barriers. Idrisi also allows for anisotropy in the cost modeling process, with some directions (e.g. downhill) being preferable to others (e.g. uphill). SAGA, GRASS and PCRaster provide similar functions, with inputs optionally including a friction surface (with relative friction values and absolute barriers) and a physical surface, with movements restricted to uphill, downhill and either stopping at or crossing flat areas. Such procedures are clearly targeted at hydrological and related studies, so whilst being well suited to these environments are not readily generalized. Many such programs, particularly those with an environmental sciences and remote sensing background, perform the path finding stage using a form of steepest descent procedure (often described as a drainage procedure) applied to the ACS. This approach is not guaranteed to provide the correct shortest path — tracked paths should be used. This is illustrated in Figure 4‑62, where local Euclidean grid distances and 50:50 splitting of costs between the cells shaded in gray has been used to generate the ACS grid, Figure 4‑62B. The steepest path from the target cell back to the source cell (Figure 4‑62C) is not the least cost path, as its total accumulated cost is 13.726. The correct least cost path is shown in Figure 4‑62D. ACS procedures are subject to a number of other problems, for example: becoming trapped in localized plateaus or pits; inadequate handling of multiple equivalent paths; or failing when input and/or output datasets are specified to operate in integer rather than float mode.

Figure 4‑62 Steepest path vs tracked path

A. Cost surface

 

 

 

 

B. Accumulated cost surface (ACS)

 

1

5

5

5

1

 

4.414

7.414

12.414

8.828

5.828

1

5

5

5

1

 

3.414

6.656

11.656

8.07

4.828

1

5

5

5

1

 

2.414

5.656

6.656

5.414

3.828

1

5

5

1

1

 

1.414

3

4.242

2.414

3.414

1

1

1

1

1

 

1

0

1

2

3

 

 

 

 

 

 

 

 

 

 

 

C. ACS Steepest descent path

 

D. ACS Least cost path

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

 

A number of GIS packages incorporate extensions to the standard cost distance/ACS model. For example, ArcGIS provides a variant of the ACS procedure which it describes as Path Distance. This is essentially the same as ACS but includes options such as inclusion of surface distance (using a surface raster or DEM), and vertical and horizontal movement rasters that are utilized in adjusting the optimum path computation. TNTMips also includes a path analysis procedure that takes into account the physical surface form, but the authors are cautious about its optimality. Idrisi provides a similar facility, Disperse, designed to provide a simple force-based model of anisotropic object dispersal (SAGA provides comparable facilities). In this case the inputs are a set of four grid files: (i) the location of the target and source features (non-zero entries are targets, zeros are sources); (ii) and (iii) two grids, one containing the magnitude and the other containing the direction (azimuth) of the anisotropic force field, i.e. a vector field modeled using a pair of grids; and (iv) an output grid. The notion of the magnitude of the force field in this implementation is as the inverse of a friction field, F, force=1/F where F in this case is taken to be greater than 1. The anisotropic dispersal model is either user specified or utilizes an inbuilt cosine function that modifies the stated friction field F'=F*f, where

α is an angular difference between the angle being considered and the stated friction angle, and k is a parameter (k=2 is the default). The process may be modeled as a series of steps (time periods) by pre-specifying a maximum dispersal distance at each stage.