How Do We Head Off the Collateral Damage of Big Data?
In 2015 we wrote two articles addressing some of the problems associated with social data practices in the digital information age. One discussed the need to re-conceive how people as social entities are included, represented, and theorized in big data environments. The second addressed the long developmental history of the “big data” paradigm and how big data has been used to document and analyse the socially “manufactured” attributes of groups and populations. A key conclusion of this initial work was what we saw as the considerable potential of such data to have coercive effects on social groups identified and “managed” through such systems and technologies. This piece focuses more directly on the idea of data itself being coercive and how such coercion plays out in our emerging big data environment.
Epistemic machineries and ontological effects
Ian Hacking has explored the pervasive and iterative effects of category constructions in social environments. A central issue in knowledge creation today is the pervasiveness of quantification. To paraphrase Rob Kitchin, we are now in the midst of a data revolution, one which relies on the capture, analysis, and visual representation of enlarged quantitative data, in increasingly digital formats, each of which is amenable to multiple analytical and visualisation techniques. In this context the ability to quantify what was previously considered inaccessible has become so embedded in the discourse of “innovative knowledge production” as to constitute an epistemic virtue. As a result, big data quantification has become not only a mechanism for extracting information but also an idea with social and political power in its own right.
Over the past few years a vigorous debate has emerged about the ways in which digital technologies rely on modelling, algorithms, and related “artificial intelligence agents” for their epistemic authority and influence. In this context, the algorithm is shorthand for more complex programming and analysis problems. One issue to arise is that analysis can produce or reinforce social prejudices that enter the digital domain via algorithm design and application. Since many algorithms seek to combine and quantify complex social phenomena in a reductive manner, there may be limited scope for critical reflexivity in their production. At the same time there is an inherent risk in the application of such tools in the construction of specific “social problems” in the same way that Charles Booth created both maps of poverty and maps of “criminality” in London. An ethics of big data is also beginning to emerge in response to these types of concerns.
The application of many quite conventional mathematical techniques can acquire a social and political force through their integration within information technology systems, software, and model development. This, we propose, is due in part to a lack of critical appraisal of concepts drawn from social science and social policy and their normalisation for broader political agendas.
Data as coercive concept and practice
This is how data becomes a coercive instrument in our present time. Information about people as individuals or groups has always had the potential for coercion, as the history of the modern census and some of its consequences make very clear. The digital quantification of individuals and groups who are constructed under the modern nation state as an actual or potential problem provides new ways for them to be acted on, in, and through information. A classic output of the development of modern social science and its policy application has been the emergence of the category of “wicked problems.” This is a principle of modern policy design, but it generally minimizes the potentially coercive nature of both data and information-driven social policy.
Quantification, data collection, and data analysis suit the observers of such problems, and those who frame them as intractable problems in the first instance. Links between, for example, poverty and ill health, or racism and crime, or lack of education and unemployment, persist in generating information about the same “wicked problems” in many places and across time. In such contexts, we propose, coercive data documents the broader socio-political system’s effects rather than the characteristics of the social group being observed. The counter-argument to the principle of quantification as an epistemic virtue is no less coercive: the unquantified does not matter and can be shown not to matter because there is no record of its effect. In summary, that which is uncounted is not a real (political) problem.
Coercion at a distance
As mentioned, in the Victorian era the map was used to spatially “capture” and inquire into social patterns including poverty and the location of the various classes, especially the impoverished and “semi-criminal.” This process has been enthusiastically adapted and extended in the digital age. Mapping crime “hotspots” is a growing industry in the applied spatial sciences. Mapping these phenomena, often in real time, can have the effect not only of potentially preventing some categories of crime, depending on how the data and mapping are used, but also of criminalizing groups and even entire communities because they exist in these quasi-militarized spaces – counted, quantified, visually tracked (CCTV), mapped and analysed. This process is facilitated by the increasing sophistication of the spatial data technologies we are seeing emerge, up to and including multidimensional datascapes and urban dashboards.
Technology also makes coercion both scalable and extensible. One of the rising domains of technological innovation lies in various forms of spatial science. These include technologies such as GPS (global positioning system), GIS (geographic information systems), LIDAR (light imaging, detection, and ranging), remote sensing, high-resolution imaging systems and their applications. One of the most obvious and high-profile areas of application lies in the UAV or drone sector: unmanned aerial vehicles used by military, police, and, increasingly, civilian organisations, on which a growing scholarship now exists. Drones can bring together several technologies (cameras, weapons, etc.) in one tidy, coercive, digital package. They can be integrated into larger information systems so that they are sensory extensions of some centralized, or distributed, information processing environment, including the weaponized map. Civil uses aside, the results of these converging technologies have serious implications for the communities they are used to monitor.
A problem we see here is the continuing confusion of a particular social outcome with the processes that produce it. Louk Hulsman argued that crime has no ontological reality because the concept of crime is the product rather than the object of criminal policy. Yet stigmatised and criminalized groups experience direct and coercive ontological effects (patrols, monitoring, searches, pull-overs) which serve to situate the problem in the individuals or communities rather than in the wider society which defines, constitutes, and manages the “problem,” its punishment and its consequences. As intersectional theory and statistics attest, this coercive ontology is magnified by what Patricia Hill Collins calls the “reciprocally constructing phenomena” of gender, ethnicity, poverty, age, and so on. In our rapidly developing big data environments, the algorithm itself runs the risk of becoming yet another weapon against the already stigmatized social group.
Conclusion
Digital information systems have undergone much development over the last two centuries, greatly enhanced in the post-World War Two period by the rise of digitization. The application of these systems to social phenomena builds on a deeply held belief in the veracity of mathematically and statistically produced information. The lack of a widespread and public critique of quantitative methods and their application reinforces the existing and potentially coercive power of digital information systems and their attendant methods, and enhances the potential for “collateral damage” associated with such applications.
More importantly, without a critical approach to big data concepts, methods and applications, we run the risk that big data becomes an increasingly sophisticated paradigm for coercion. These methods are already ubiquitous and they will only continue to proliferate. If we lack mediating critiques and practices, the threat is that by “managing” the social domain through them, we are using sophisticated 21st-century technologies to support entrenched 19th-century ideologies, values and practices.