Abstract
The task of detecting moral values in text has significant implications for various fields, including natural language processing, the social sciences, and ethical decision-making. Previously proposed supervised models often suffer from overfitting, leading to hyper-specialized moral classifiers that struggle to perform well on data from other domains. To address this issue, we introduce novel systems that leverage the abstract concepts and common-sense knowledge acquired by Large Language Models (LLMs) and Natural Language Inference (NLI) models during pre-training on multiple data sources. By doing so, we aim to develop versatile and robust methods for detecting moral values in real-world scenarios. Our approach uses the GPT-based Davinci model as an off-the-shelf zero-shot unsupervised multi-label classifier for moral value detection, eliminating the need for explicit training on labeled data. To assess the performance and versatility of this method, we compare it with a smaller NLI-based zero-shot model. The results show that the NLI approach achieves performance competitive with the Davinci model. Furthermore, we conduct an in-depth investigation of supervised systems in the context of cross-domain multi-label moral value detection. This involves training supervised models on different domains to explore their effectiveness in handling data from different sources and comparing their performance with the unsupervised methods. Our contributions encompass a thorough analysis of both supervised and unsupervised methodologies for cross-domain value detection. We introduce the Davinci model as a state-of-the-art zero-shot unsupervised moral value classifier, pushing the boundaries of moral value detection without the need for explicit training on labeled data. Additionally, we perform a comparative evaluation of our approach against the supervised models, shedding light on their respective strengths and weaknesses.
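The NLI-based zero-shot scheme described above can be sketched as follows. The idea is to treat the input text as an NLI premise and turn each moral value into a hypothesis, scoring every label independently so the classifier is multi-label. This is a minimal illustration, not the authors' implementation: the label names, the hypothesis template, the threshold, and the `entailment_score` stub (a keyword placeholder standing in for a real NLI model's entailment probability) are all assumptions made for the sake of a runnable example.

```python
# Sketch of NLI-based zero-shot multi-label moral value detection.
# A real system would replace `entailment_score` with the entailment
# probability from an NLI model (e.g., one fine-tuned on MultiNLI);
# the keyword stub below only makes the control flow runnable.

MORAL_FOUNDATIONS = ["care", "fairness", "loyalty", "authority", "purity"]

def entailment_score(premise: str, hypothesis: str) -> float:
    # Placeholder for P(entailment | premise, hypothesis).
    label = hypothesis.split()[-1].rstrip(".")
    cues = {
        "care": ["help", "protect", "harm"],
        "fairness": ["fair", "equal", "cheat"],
        "loyalty": ["loyal", "betray", "team"],
        "authority": ["obey", "law", "tradition"],
        "purity": ["pure", "sacred", "disgust"],
    }
    hits = sum(word in premise.lower() for word in cues.get(label, []))
    return min(1.0, 0.3 * hits)

def detect_moral_values(text: str, threshold: float = 0.5) -> list[str]:
    # Multi-label: each value is scored independently against the text,
    # so zero, one, or several labels may fire for the same input.
    detected = []
    for label in MORAL_FOUNDATIONS:
        hypothesis = f"This text expresses the moral value of {label}."
        if entailment_score(text, hypothesis) >= threshold:
            detected.append(label)
    return detected

print(detect_moral_values(
    "We must protect and help the vulnerable; it is only fair and equal."
))
```

Because each hypothesis is scored on its own, no retraining is needed to add or remove a moral value: changing the label set only changes the hypotheses that get tested.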
Acknowledgments
We acknowledge financial support from the H2020 projects TAILOR: Foundations of Trustworthy AI - Integrating Reasoning, Learning and Optimization – EC Grant Agreement number 952215 – and the Italian PNRR MUR project PE0000013–FAIR: Future Artificial Intelligence Research.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Bulla, L., Gangemi, A., Mongiovì, M. (2024). Do Language Models Understand Morality? Towards a Robust Detection of Moral Content. In: Osman, N., Steels, L. (eds) Value Engineering in Artificial Intelligence. VALE 2023. Lecture Notes in Computer Science(), vol 14520. Springer, Cham. https://doi.org/10.1007/978-3-031-58202-8_7