Do Language Models Understand Morality? Towards a Robust Detection of Moral Content

  • Conference paper
  • In: Value Engineering in Artificial Intelligence (VALE 2023)

Abstract

The task of detecting moral values in text has significant implications in various fields, including natural language processing, the social sciences, and ethical decision-making. Previously proposed supervised models often suffer from overfitting, leading to hyper-specialized moral classifiers that struggle to perform well on data from different domains. To address this issue, we introduce novel systems that leverage abstract concepts and common-sense knowledge acquired by Large Language Models (LLMs) and Natural Language Inference (NLI) models during their earlier training on multiple data sources. In doing so, we aim to develop versatile and robust methods for detecting moral values in real-world scenarios. Our approach uses the GPT-based Davinci model as a ready-made zero-shot unsupervised multi-label classifier for moral value detection, eliminating the need for explicit training on labeled data. To assess the performance and versatility of this method, we compare it with a smaller NLI-based zero-shot model. The results show that the NLI approach achieves performance competitive with the Davinci model. Furthermore, we conduct an in-depth investigation of supervised systems in the context of cross-domain multi-label moral value detection: we train supervised models on different domains to explore their effectiveness in handling data from different sources and compare their performance with that of the unsupervised methods. Our contributions encompass a thorough analysis of both supervised and unsupervised methodologies for cross-domain value detection. We introduce the Davinci model as a state-of-the-art zero-shot unsupervised moral value classifier, pushing the boundaries of moral value detection without the need for explicit training on labeled data. Additionally, we perform a comparative evaluation of our approach against the supervised models, shedding light on their respective strengths and weaknesses.
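
As a concrete illustration of the NLI-based zero-shot approach described above, the sketch below repurposes the roberta-large-mnli checkpoint (see Notes) as a multi-label moral value classifier. It is a minimal sketch, not the authors' exact pipeline: the Moral Foundations label set, the example sentence, and the 0.5 decision threshold are all assumptions.

```python
# Minimal sketch of NLI-based zero-shot multi-label moral value detection,
# assuming the roberta-large-mnli checkpoint cited in the paper's notes and a
# Moral Foundations Theory label set (both assumptions, not the authors' exact setup).
from transformers import pipeline

# An NLI model repurposed as a zero-shot classifier: each candidate label is
# turned into a hypothesis, and the model's entailment probability is read off
# as that label's score.
classifier = pipeline("zero-shot-classification", model="roberta-large-mnli")

# Hypothetical label set; the actual taxonomy depends on the annotated corpora.
moral_foundations = ["care", "fairness", "loyalty", "authority", "purity"]

text = "Volunteers spent the weekend caring for families displaced by the flood."
result = classifier(text, candidate_labels=moral_foundations, multi_label=True)

# multi_label=True scores each foundation independently; 0.5 is an assumed threshold.
predicted = [l for l, s in zip(result["labels"], result["scores"]) if s > 0.5]
print(predicted)
```

Because multi_label=True scores each candidate label independently via entailment, a sentence can trigger several foundations at once, which matches the multi-label framing of the task.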


Notes

  1. All materials and code are accessible at https://github.com/LuanaBulla/Detection-of-Morality-in-Text/tree/main.

  2. https://huggingface.co/roberta-large-mnli.

  3. https://huggingface.co/roberta-large.

  4. https://pytorch.org/docs/stable/generated/torch.nn.MultiLabelSoftMarginLoss.html.
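
Notes 3 and 4 point at the building blocks of the supervised baselines: a roberta-large encoder trained with PyTorch's MultiLabelSoftMarginLoss. The sketch below shows how these pieces could fit together for one training step; the five-label output size, hyperparameters, and toy batch are illustrative assumptions, not details taken from the paper.

```python
# Minimal sketch, under assumptions, of supervised multi-label fine-tuning with
# the components named in the notes: roberta-large + MultiLabelSoftMarginLoss.
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

NUM_FOUNDATIONS = 5  # assumed label count (one per moral foundation)

tokenizer = AutoTokenizer.from_pretrained("roberta-large")
model = AutoModelForSequenceClassification.from_pretrained(
    "roberta-large",
    num_labels=NUM_FOUNDATIONS,
    problem_type="multi_label_classification",
)

# MultiLabelSoftMarginLoss applies a sigmoid per label and averages a
# binary cross-entropy-style loss, so each foundation is predicted independently.
loss_fn = torch.nn.MultiLabelSoftMarginLoss()
optimizer = torch.optim.AdamW(model.parameters(), lr=2e-5)  # assumed learning rate

# One illustrative training step on a toy batch.
texts = ["Everyone deserves an equal share.", "Respect your elders."]
labels = torch.tensor([[0., 1., 0., 0., 0.],   # fairness
                       [0., 0., 0., 1., 0.]])  # authority
batch = tokenizer(texts, padding=True, truncation=True, return_tensors="pt")

model.train()
optimizer.zero_grad()
logits = model(**batch).logits        # shape: (batch_size, NUM_FOUNDATIONS)
loss = loss_fn(logits, labels)
loss.backward()
optimizer.step()
```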


Acknowledgments

We acknowledge financial support from the H2020 project TAILOR (Foundations of Trustworthy AI - Integrating Reasoning, Learning and Optimization, EC Grant Agreement number 952215) and the Italian PNRR MUR project PE0000013 - FAIR: Future Artificial Intelligence Research.

Author information

Correspondence to Luana Bulla.


Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Bulla, L., Gangemi, A., Mongiovì, M. (2024). Do Language Models Understand Morality? Towards a Robust Detection of Moral Content. In: Osman, N., Steels, L. (eds.) Value Engineering in Artificial Intelligence. VALE 2023. Lecture Notes in Computer Science, vol. 14520. Springer, Cham. https://doi.org/10.1007/978-3-031-58202-8_7

  • DOI: https://doi.org/10.1007/978-3-031-58202-8_7

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-58204-2

  • Online ISBN: 978-3-031-58202-8

  • eBook Packages: Computer Science (R0)
