Distance Operations

Fundamental to all spatial analysis is the use of distance calculations. If our model of the world (the mapped region under investigation) is regarded as a flat plane (2D Euclidean space) or a perfect sphere, then we can use the simple distance formulas dE and dS provided in Section 4.2, Geometric and Related Operations, to compute distances between point pairs. These provide the basic building blocks for many forms of spatial analysis within most GIS packages. However, these two formulas are too limiting for certain forms of analysis and additional distance measures are required. In some cases these alternative measures (or metrics) can be computed directly from the coordinates of input point pairs, but in others computation is an incremental process. In fact, incremental methods, which compute length along a designated path (between a series of closely spaced point pairs) by summing the lengths of path segments, are the most general way of determining distance; that is, the notions of distance and path are inextricably linked.
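As a minimal sketch of these ideas, the Python below computes a planar distance, a spherical distance, and an incremental path length obtained by summing segment lengths. The function names are illustrative only, the haversine form is used for the spherical case (Section 4.2 may state dS in a different but equivalent form), and the mean Earth radius of 6371 km is an assumption.

```python
import math

def euclidean_distance(p, q):
    """Straight-line distance between two (x, y) points in the plane."""
    return math.hypot(q[0] - p[0], q[1] - p[1])

def great_circle_distance(p, q, radius=6371.0):
    """Great-circle distance between two (lat, lon) points given in degrees,
    on a perfect sphere (radius in km is an assumed mean Earth radius)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(h))

def path_length(points, segment_distance=euclidean_distance):
    """Incremental distance: sum of segment lengths along an ordered path."""
    return sum(segment_distance(a, b) for a, b in zip(points, points[1:]))
```

Passing `great_circle_distance` as `segment_distance` gives the length of a path of closely spaced latitude/longitude points on the sphere, which illustrates why the incremental approach is the more general one.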

Figure 4‑53 provides a simple cross-sectional illustration of the kind of issues that arise. We wish to determine the distance separating points A and B. If A and B are not too far apart (e.g. less than 10 km) we could use a high-precision laser rangefinder to establish the slope distance between A and B, assuming there is no atmospheric distortion. In practice there will be some distortion, and the laser wave path will need to be adjusted in order to provide an estimated slope path distance. This in turn will require further adjustment if it is to be referenced to a common datum or a level terrain surface. In each case the distance recorded between A and B will be different.

From the diagram (Figure 4‑53) it is clear that none of these distances corresponds to the actual distance across the terrain surface along a fixed transect, nor to a distance adjusted or computed to reflect the particular model of the Earth or region of the globe we are using. In some cases these differences will be small, whilst in others they may be highly significant.

Figure 4‑53 Alternative measures of terrain distance


In general, simple Euclidean 2D (incremental) computations are satisfactory for many problems involving relatively smooth terrains extending over an area of perhaps 20km by 20km, even using suitably defined longitude and latitude coordinates (assuming these latter coordinates are not inside the Arctic or Antarctic regions). Closer to the poles the distortion of lines of latitude results in increasing errors in this formula and a variant known as the Polar Coordinate Flat-Earth formula may be used as an improved approximation:
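The form commonly given for this approximation is sketched below. The shorthand a and b for the colatitudes of the two points is introduced here for compactness and is an assumption of notation; subscripts 1 and 2 denote the two points.

```latex
d = R\sqrt{a^{2} + b^{2} - 2ab\cos(\lambda_{2} - \lambda_{1})},
\qquad a = \frac{\pi}{2} - \phi_{1}, \quad b = \frac{\pi}{2} - \phi_{2}
```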

In this formula we are using polar coordinates (φ,λ) to represent latitude and longitude expressed in radians, and R is the radius of the terrestrial sphere. With this formula the error is at most 20 meters over 20 km, even at 88°N or S. For example, the formula evaluates the distance between A(70°N, 30°E) and B(71°N, 31°E) as 117.6 km (using the WGS84 ellipsoid), 112.8 km 10° further north, and 111.4 km at 87°N/88°N. One degree of separation in latitude (ignoring longitude differences) would be 111.3 km. However, alternative procedures are required for: (a) larger regions, or problems where surface distance is important; (b) problems involving networks; and (c) problems involving variable costs. Within mainstream GIS packages a range of tools exists to support such analysis, including support for Euclidean and spherical distance (but not always ellipsoidal distance), network distance and cost distance (where cost is a generalized concept, as we discuss in the subsections that follow).
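The figures quoted above can be checked with a short sketch of the Polar Coordinate Flat-Earth approximation in its commonly stated form. The function name is illustrative, and the value R = 6378.137 km (the WGS84 semi-major axis) is an assumption that appears consistent with the quoted distances.

```python
import math

# Assumed value of R: the WGS84 semi-major (equatorial) axis in km.
WGS84_EQUATORIAL_RADIUS_KM = 6378.137

def polar_flat_earth_distance(lat1, lon1, lat2, lon2,
                              radius=WGS84_EQUATORIAL_RADIUS_KM):
    """Polar Coordinate Flat-Earth approximation; inputs in decimal degrees."""
    a = math.pi / 2 - math.radians(lat1)   # colatitude of the first point
    b = math.pi / 2 - math.radians(lat2)   # colatitude of the second point
    dlon = math.radians(lon2 - lon1)       # longitude difference in radians
    return radius * math.sqrt(a * a + b * b - 2 * a * b * math.cos(dlon))

for lat in (70, 80, 87):
    # One degree of separation in both latitude and longitude.
    print(lat, round(polar_flat_earth_distance(lat, 30, lat + 1, 31), 1))
```

The expression is the planar law of cosines applied to the two colatitudes, so it treats the polar region as a flat disc centred on the pole, which is why it behaves well at high latitudes.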

Most software packages offer at least three categories of distance measure: (i) coordinate-based distance measures in the plane or on a sphere (which may be adjusted to a closer approximation of the Earth’s shape); (ii) network-based measures, determined by summing the stored length or related measure of a single link or a set of links along a specified route (e.g. shortest, quickest); and (iii) polyline-based measures, determined by summing the length of stored straight line segments (sequences of closely spaced plane coordinates) representing a given feature, such as a boundary. The first and third of these distance measures utilize the simple and familiar formulas for distance in uniform 2D and 3D Euclidean space and on the surface of a perfect sphere or spheroid. It is important to check whether the distance computations provided represent true distances (e.g. in miles or km), since Euclidean computations on raw latitude/longitude values produce incorrect results.
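The warning about raw latitude/longitude values can be illustrated with a small sketch. Both functions are hypothetical helpers rather than any package's API: the first applies the Euclidean formula directly to degrees, and the second is a simple local flat-plane approximation that scales longitude differences by the cosine of the mean latitude, with an assumed spherical radius of 6371 km.

```python
import math

def naive_degree_distance(lat1, lon1, lat2, lon2):
    """Euclidean formula applied to raw degrees: the result is in 'degrees'
    and is not a true distance."""
    return math.hypot(lat2 - lat1, lon2 - lon1)

def local_plane_distance_km(lat1, lon1, lat2, lon2, radius_km=6371.0):
    """Local flat-plane approximation in km: longitude differences are
    scaled by cos(mean latitude) before applying the Euclidean formula."""
    mean_lat = math.radians((lat1 + lat2) / 2)
    dx = math.radians(lon2 - lon1) * math.cos(mean_lat)
    dy = math.radians(lat2 - lat1)
    return radius_km * math.hypot(dx, dy)

# One degree of longitude at the equator and at 60 degrees north.
print(naive_degree_distance(0, 0, 0, 1), naive_degree_distance(60, 0, 60, 1))
print(local_plane_distance_km(0, 0, 0, 1), local_plane_distance_km(60, 0, 60, 1))
```

The naive calculation reports the same value for both pairs, whereas on the ground the separation at 60°N is roughly half that at the equator.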

These formulas make two fundamental assumptions: (i) the distance to be measured is to be calculated along the shortest physical path in the selected space, which is defined to be the shortest straight line between the selected points (Euclidean straight lines or great circle arcs on the sphere); and (ii) the space is completely uniform, with no variations in terms of direction or location. These assumptions facilitate the use of expressions that only require knowledge of the coordinates of the initial and final locations, and thereby avoid the difficult question of how you actually get from A to B.

In most cases measured terrestrial distances are reduced to a common base, such as the WGS84 reference ellipsoid. Note that the reference ellipsoid may be above or below the geoidal surface, and neither distance corresponds to distance across the landscape. Laser-based measurement is rarely used for distances of 50 km or more, and increasingly DGPS measurements of location, followed by direct coordinate-based calculations, avoid the requirement for the wave path adjustments discussed above.

Whilst coordinate-based formulation is very convenient, it generally means that a GIS will not provide true distances between locations across the physical surface. It is possible for such systems to provide slope distance, and potentially surface distance along a selected transect or profile, but such measures are not provided in most packages. A number of packages do provide the option to specify transects, as lines or polylines, and then to provide a range of information relating to these transects. Examples include the Grid|Slice command procedures in Surfer, the Profiles and Elevations facilities in Manifold (Surface Tools) and the Profile facility in ArcGIS 3D Analyst. In addition to path profiles (elevations, lengths and related transect statistics), such paths may be linked with volumetric cut-and-fill analyses (see further Sections 6.2.3, Directional derivatives, and 6.2.6, Pit filling).
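To make the difference between planar and surface distance along a transect concrete, the following sketch sums both measures over an ordered set of profile samples. It is an illustration under simple assumptions (straight segments between samples, all coordinates in the same linear unit) and does not represent how Surfer, Manifold or ArcGIS compute their profile statistics; the function name is hypothetical.

```python
import math

def transect_lengths(profile):
    """Return (planar_length, surface_length) for an ordered list of
    (x, y, z) samples along a transect, all in the same linear unit."""
    planar = 0.0
    surface = 0.0
    for (x1, y1, z1), (x2, y2, z2) in zip(profile, profile[1:]):
        run = math.hypot(x2 - x1, y2 - y1)   # horizontal length of the segment
        planar += run
        surface += math.hypot(run, z2 - z1)  # slope length of the segment
    return planar, surface

# A transect that climbs 4 units over 3 units of run, then descends again.
print(transect_lengths([(0, 0, 0), (3, 0, 4), (6, 0, 0)]))  # (6.0, 10.0)
```

The surface length can never be less than the planar length, and the gap between them grows with terrain roughness and with the density of the samples used.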

For problems that are not constrained to lie on networks it is possible to derive distances from the surface representation within the GIS. For example, this can be achieved using a regular or irregular lattice representation along whose edges a set of links may be accumulated to form a (shortest) path between the origin and destination points. For the subset of problems concerned with existing networks, such questions could be restricted to locations that lie on these networks (or at network nodes) where path link lengths have been previously obtained. Both of these approaches immediately raise a series of additional questions. How accurate is the representation being used, and does it distort the resulting path? Which path through the network or surface representation should be used (e.g. shortest length, least effort, most scenic)? How are intermediate start and end points to be handled? Are selected paths symmetric (in terms of physical distance, time, cost, effort etc.)? What constraints on the path are being applied (e.g. in terms of gradient, curvature, restriction to the existing physical surface etc.)? Is the measure returned understandable and meaningful in terms of the requirements? How are multi-point/multi-path problems to be handled? Answers to some of these questions are discussed in Section 4.4.2, Cost distance, but we first discuss the notion of a metric.
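The lattice approach can be sketched as a shortest-path search over a regular grid of elevations. The example below uses Dijkstra's algorithm on an 8-connected lattice, taking each link length as the 3D distance between adjacent cell centres. These are illustrative assumptions (connectivity, link length and function name included), not a description of any particular GIS implementation.

```python
import heapq
import math

def lattice_shortest_path_length(elev, start, goal, cell=1.0):
    """Shortest accumulated link length between two (row, col) cells of a
    regular elevation lattice, using Dijkstra's algorithm with 8 neighbours."""
    rows, cols = len(elev), len(elev[0])
    best = {start: 0.0}
    queue = [(0.0, start)]
    while queue:
        dist, (r, c) = heapq.heappop(queue)
        if (r, c) == goal:
            return dist
        if dist > best.get((r, c), math.inf):
            continue  # stale queue entry, a shorter route was already found
        for dr in (-1, 0, 1):
            for dc in (-1, 0, 1):
                nr, nc = r + dr, c + dc
                if (dr, dc) == (0, 0) or not (0 <= nr < rows and 0 <= nc < cols):
                    continue
                run = cell * math.hypot(dr, dc)                  # planar link length
                step = math.hypot(run, elev[nr][nc] - elev[r][c])  # 3D link length
                new_dist = dist + step
                if new_dist < best.get((nr, nc), math.inf):
                    best[(nr, nc)] = new_dist
                    heapq.heappush(queue, (new_dist, (nr, nc)))
    return math.inf  # goal not reachable

flat = [[0, 0, 0], [0, 0, 0]]
ridge = [[0, 10, 0], [0, 0, 0]]
print(lattice_shortest_path_length(flat, (0, 0), (1, 2)))   # 1 + sqrt(2)
print(lattice_shortest_path_length(ridge, (0, 0), (0, 2)))  # skirts the ridge
```

On the flat lattice the path from (0, 0) to (1, 2) has length 1 + √2 ≈ 2.414, whereas the straight-line distance is √5 ≈ 2.236. This is a small instance of the representation distorting the resulting path, since movement is restricted to eight directions.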