Whilst concerns about the use of Big Data have been raised by many parties, ethics are frequently overlooked by practitioners who have been enticed by new and innovative forms of insight (Puschmann and Burgess, 2014). The majority of Big Data production and analysis currently occurs within the private sector. Given that data and algorithms are commercial assets, the methodologies behind them are often trade secrets and thus not subject to independent scrutiny and critique. Data on people are extremely sensitive, both to the people they relate to and to the organizations that own the data. In 1995, Ground Truth was published to bring attention to the social implications of geographic information systems (GIS) and geographic databases of people (Pickles, 1995). All of the chapters in this seminal collection express concern, to differing degrees, over the impacts of GIS, and some focus on the negative consequences that geographic research can have for society. These ethical concerns have been exacerbated in recent years by the proliferation of data that are now analyzed at the individual level.
The collection of personal data is integral to the personalization of services (as demonstrated by the Target loyalty programme described above). However, the scope and use of Big Data can result in "dataveillance", a process through which individuals’ details are routinely collected and stored, creating detailed databases about people and their behavior (Dodge and Kitchin, 2007). It is therefore often possible to identify individuals and their behaviors by pooling different sources of Big Data. In addition, some data present very detailed footprints of user locations, making it possible for analysts to infer personal characteristics of individuals (including their place of residence; Valli and Hannay, 2010). Therefore, when data outputs are publicly disseminated, efforts are made to ensure that they do not disclose information about individuals. Usually, this entails aggregating the data into predetermined spatial units. However, this means that in many cases researchers cannot take full advantage of the fine granularity of Big Data and must contend with the ecological fallacy, whereby patterns observed for areas are wrongly attributed to the individuals within them. It is therefore challenging to secure privacy in some datasets whilst retaining their quality and precision.
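The aggregation step described above can be illustrated with a minimal Python sketch. It assumes point records given as projected (x, y) coordinates and uses a regular square grid as a stand-in for the predetermined spatial units; in practice these would more likely be census or administrative zones. The function name, the cell size, and the minimum-count threshold for suppressing sparsely populated cells are all illustrative assumptions rather than anything prescribed by the sources cited here.

```python
import math
from collections import Counter


def aggregate_to_grid(points, cell_size, min_count=10):
    """Count point records per square grid cell.

    points    : iterable of (x, y) coordinates in a projected system
    cell_size : side length of each grid cell, in the same units
    min_count : hypothetical disclosure threshold; cells holding fewer
                records than this are dropped from the output
    """
    # Assign each point to the grid cell that contains it.
    counts = Counter(
        (math.floor(x / cell_size), math.floor(y / cell_size))
        for x, y in points
    )
    # Keep only the cells that meet the minimum count.
    return {cell: n for cell, n in counts.items() if n >= min_count}


# Toy data: three records fall in cell (0, 0), one in cell (1, 0).
points = [(10.0, 10.0), (20.0, 30.0), (90.0, 95.0), (150.0, 20.0)]
result = aggregate_to_grid(points, cell_size=100.0, min_count=2)
print(result)
```

Only cell counts are returned, so the exact locations are no longer recoverable from the output. This is also where fine granularity is lost: any relationship measured on the cells describes areas, not the individuals inside them.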
Data protection is becoming an increasingly contentious issue, and this is reflected in new legislation around the world, such as the General Data Protection Regulation (GDPR), which is scheduled to come into effect in the EU on 25 May 2018. Once access to data has been granted, researchers need to ensure that they comply with data protection procedures and have a clear ethical steer. A useful resource is the British Academy and Royal Society’s 2017 report entitled “Data management and use: Governance in the 21st Century” (Royal Society, 2017), which summarizes the ethical issues that arise from data in today’s society and how they might be mitigated.