Abstract
In conceiving of autonomous agents able to employ adaptive cooperative behaviours, we identify the need to assess the equivalence of agent behaviour under conditions of external change. Reinforcement-learning algorithms rely on input from the environment as the sole means of informing, and so reifying, internal state. This paper investigates the assumption that isomorphic representations of an environment will lead to equivalent behaviour. To test this equivalence assumption we analyse the variance between behavioural profiles in a set of agents using fourteen foundational reinforcement-learning algorithms across four isomorphic representations of the classical Prisoner’s Dilemma gameform. A behavioural profile is the aggregated episode-mean distribution of the game outcomes CC, CD, DC, and DD generated by the symmetric self-play repeated stage game across a two-axis sweep of input parameters, the principal learning rate \(\alpha\) and the discount factor \(\gamma\), yielding 100 observations of the frequency of the four game outcomes per algorithm, per gameform representation. Equivalence is indicated by low variance between any two behavioural profiles generated by a single algorithm. Despite the representations being theoretically equivalent, analysis reveals significant variance in the behavioural profiles of the tested algorithms at both the aggregate and the individual-outcome scale. Given this result, we infer that the isomorphic representations tested in this study are not necessarily equivalent with respect to the reachable space they induce for any particular algorithm, which in turn can lead to unexpected agent behaviour. We therefore conclude that structure-preserving operations applied to environmental reward signals may introduce a vector for algorithmic bias.
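To make the experimental pipeline concrete, the following is a minimal sketch, not the authors' implementation (see the repository linked below for that). It assumes a single stateless ε-greedy Q-learner per player in symmetric self-play, 200-step episodes, canonical Prisoner's Dilemma payoffs (T, R, P, S) = (5, 3, 1, 0), and a positive affine transform standing in for one of the paper's four isomorphic gameform representations; the grid, episode length, and the names behavioural_profile and profile_variance are illustrative, not taken from the study's code.

    import itertools
    import random

    # Canonical PD payoffs to the row player, indexed by (row, col)
    # actions; action 0 = Cooperate, 1 = Defect.  T > R > P > S.
    BASE = {(0, 0): 3, (0, 1): 0, (1, 0): 5, (1, 1): 1}

    def affine(payoffs, a, b):
        """Positive affine transform a*x + b (a > 0): order-preserving,
        so the transformed game is isomorphic to the original."""
        return {k: a * v + b for k, v in payoffs.items()}

    def run_episode(payoffs, alpha, gamma, steps=200, eps=0.1):
        """Symmetric self-play: two independent, stateless Q-learners.
        Returns the episode frequencies of CC, CD, DC, and DD."""
        rng = random.Random(0)
        q = [[0.0, 0.0], [0.0, 0.0]]          # q[player][action]
        counts = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 0}
        for _ in range(steps):
            acts = []
            for p in range(2):
                if rng.random() < eps:
                    acts.append(rng.randrange(2))
                else:
                    acts.append(0 if q[p][0] >= q[p][1] else 1)
            a0, a1 = acts
            counts[(a0, a1)] += 1
            rewards = (payoffs[(a0, a1)], payoffs[(a1, a0)])
            for p, a in enumerate(acts):
                target = rewards[p] + gamma * max(q[p])
                q[p][a] += alpha * (target - q[p][a])
        return {k: v / steps for k, v in counts.items()}

    def behavioural_profile(payoffs, grid=10):
        """Sweep alpha and gamma over a grid x grid lattice
        (100 points), mirroring the paper's two-axis sweep."""
        profile = {}
        for i, j in itertools.product(range(grid), repeat=2):
            alpha, gamma = (i + 1) / grid, j / grid
            profile[(alpha, gamma)] = run_episode(payoffs, alpha, gamma)
        return profile

    def profile_variance(p1, p2):
        """Mean squared difference in outcome frequencies across the
        sweep; a low value indicates behavioural equivalence."""
        diffs = [(p1[k][o] - p2[k][o]) ** 2 for k in p1 for o in p1[k]]
        return sum(diffs) / len(diffs)

    if __name__ == "__main__":
        original = behavioural_profile(BASE)
        shifted = behavioural_profile(affine(BASE, a=2, b=10))
        print("variance between profiles:",
              profile_variance(original, shifted))

Because a positive affine transform preserves the preference ordering over outcomes, game theory treats the two payoff matrices as the same gameform; the paper's finding is that the induced learning dynamics, and hence the measured behavioural profiles, need not coincide.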
Code Availability
A repository of code used in this study, and further supplementary material, is available at https://github.com/simoncstanton/equivalence_study.
Acknowledgements
We would like to acknowledge the use of the high-performance computing facilities provided by the Tasmanian Partnership for Advanced Computing (TPAC), funded and hosted by the University of Tasmania. This research is supported by an Australian Government Research Training Program (RTP) Scholarship.
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Stanton, S.C., Dermoudy, J., Ollington, R. (2022). Representation-Induced Algorithmic Bias. In: Long, G., Yu, X., Wang, S. (eds.) AI 2021: Advances in Artificial Intelligence. AI 2022. Lecture Notes in Computer Science, vol. 13151. Springer, Cham. https://doi.org/10.1007/978-3-030-97546-3_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-97545-6
Online ISBN: 978-3-030-97546-3
eBook Packages: Computer Science, Computer Science (R0)