Spatial and Temporal Coverage

In addition to representing biased subsets of the population, data are usually partial representations of phenomena across space and time. Activity of some kind is usually required to generate data, and that activity is bounded by geography and time. Even when machines are used to collect data automatically, their geographic placement may not be uniform and may be influenced by the demographic biases described above. Most commonly, the responsibility for recording data is handed to the people the data pertain to. Their behavior will therefore heavily influence the data they generate, and that behavior may in turn be influenced by unobserved external factors. Failure to account for these geotemporal biases may lead to the misrepresentation of real-world processes.

Some new forms of data can be generated anywhere and at any time through new technologies such as hand-held devices. However, despite the presumed freedom of these data collection activities, they are still subject to spatial and temporal restrictions. People cannot be assumed to contribute data at random or at regular intervals, unless, of course, the data are machine-generated. Taking georeferenced social media data as an example, uploading a georeferenced post may be more common in tourist or social settings than in workplaces. In addition, factors such as the time of day or night and poor data communications will inhibit the ability to upload content. The distribution of georeferenced social media posts therefore does not represent the distribution of people (and their activities) across time and space, despite strong associations between the two in some settings.

At the international level, Taylor and Overton’s first law of geographical information still broadly holds today: poorer nations are also poorer in good-quality data (Taylor and Overton, 1991). There is a digital divide across a range of different types of geographical data. For example, the quality and precision of OpenStreetMap (OSM) data are far better in developed parts of the world, where there are more contributors and connectivity is generally better. In addition, social media usage is generally lower in developing nations. Although Internet connectivity is improving globally, developing nations and those with lower information equality still lag behind. This is unfortunate, as the developing parts of the world are often in the greatest need of good-quality geographic information.