Abstract
The task of detecting moral values in text has significant implications for various fields, including natural language processing, the social sciences, and ethical decision-making. Previously proposed supervised models often suffer from overfitting, leading to hyper-specialized moral classifiers that struggle to perform well on data from other domains. To address this issue, we introduce novel systems that leverage the abstract concepts and common-sense knowledge acquired by Large Language Models (LLMs) and Natural Language Inference (NLI) models during pre-training on multiple data sources. By doing so, we aim to develop versatile and robust methods for detecting moral values in real-world scenarios. Our approach uses the GPT-based Davinci model as an off-the-shelf zero-shot unsupervised multi-label classifier for moral value detection, eliminating the need for explicit training on labeled data. To assess the performance and versatility of this method, we compare it with a smaller NLI-based zero-shot model. The results show that the NLI approach achieves performance competitive with the Davinci model. Furthermore, we conduct an in-depth investigation of supervised systems in the context of cross-domain multi-label moral value detection. This involves training supervised models on different domains to explore their effectiveness in handling data from different sources and comparing their performance with the unsupervised methods. Our contributions encompass a thorough analysis of both supervised and unsupervised methodologies for cross-domain value detection. We introduce the Davinci model as a state-of-the-art zero-shot unsupervised moral value classifier, pushing the boundaries of moral value detection without the need for explicit training on labeled data. Additionally, we perform a comparative evaluation of our approach against the supervised models, shedding light on their respective strengths and weaknesses.
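The NLI-based zero-shot scheme described above can be sketched as follows. The idea is to treat the input text as an NLI premise and turn each moral value into a hypothesis, scoring every label independently so the classifier is multi-label. This is a minimal illustration, not the authors' implementation: the label names, the hypothesis template, the threshold, and the `entailment_score` stub (a keyword placeholder standing in for a real NLI model's entailment probability) are all assumptions made for the sake of a runnable example.

```python
# Sketch of NLI-based zero-shot multi-label moral value detection.
# A real system would replace `entailment_score` with the entailment
# probability from an NLI model (e.g., one fine-tuned on MultiNLI);
# the keyword stub below only makes the control flow runnable.

MORAL_FOUNDATIONS = ["care", "fairness", "loyalty", "authority", "purity"]

def entailment_score(premise: str, hypothesis: str) -> float:
    # Placeholder for P(entailment | premise, hypothesis).
    label = hypothesis.split()[-1].rstrip(".")
    cues = {
        "care": ["help", "protect", "harm"],
        "fairness": ["fair", "equal", "cheat"],
        "loyalty": ["loyal", "betray", "team"],
        "authority": ["obey", "law", "tradition"],
        "purity": ["pure", "sacred", "disgust"],
    }
    hits = sum(word in premise.lower() for word in cues.get(label, []))
    return min(1.0, 0.3 * hits)

def detect_moral_values(text: str, threshold: float = 0.5) -> list[str]:
    # Multi-label: each value is scored independently against the text,
    # so zero, one, or several labels may fire for the same input.
    detected = []
    for label in MORAL_FOUNDATIONS:
        hypothesis = f"This text expresses the moral value of {label}."
        if entailment_score(text, hypothesis) >= threshold:
            detected.append(label)
    return detected

print(detect_moral_values(
    "We must protect and help the vulnerable; it is only fair and equal."
))
```

Because each hypothesis is scored on its own, no retraining is needed to add or remove a moral value: changing the label set only changes the hypotheses that get tested.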
Acknowledgments
We acknowledge financial support from the H2020 projects TAILOR: Foundations of Trustworthy AI - Integrating Reasoning, Learning and Optimization – EC Grant Agreement number 952215 – and the Italian PNRR MUR project PE0000013–FAIR: Future Artificial Intelligence Research.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
Cite this paper
Bulla, L., Gangemi, A., Mongiovì, M. (2024). Do Language Models Understand Morality? Towards a Robust Detection of Moral Content. In: Osman, N., Steels, L. (eds) Value Engineering in Artificial Intelligence. VALE 2023. Lecture Notes in Computer Science(), vol 14520. Springer, Cham. https://doi.org/10.1007/978-3-031-58202-8_7