Abstract
Deep neural network (DNN)-based approaches have rapidly become a leading method for automating bug assignment, a task that is both time-intensive and critical for effective bug triage in software development. However, DNNs have been shown to be vulnerable: subtle perturbations to their inputs can lead these models to generate unpredictable and erroneous outputs. To mitigate this issue, contrastive learning (CL), which learns discriminative representations by contrasting similar and dissimilar data points, has been increasingly adopted. Although CL has demonstrated its effectiveness in domains such as computer vision (CV) and natural language processing (NLP), its application to automated bug assignment remains unexplored. In this paper, we propose a fixer-level supervised contrastive learning method specifically tailored for automated bug assignment, aiming to enhance the robustness and effectiveness of DNN-based bug assignment approaches. Our approach calculates a similarity score between two bug fixers by assessing the semantic similarity of their historically resolved bug reports. By contrasting bug reports resolved by similar fixers (positive examples) against those resolved by different fixers (negative examples) within the same batch, the approach aims to learn robust bug report representations for bug assignment. We conducted an empirical study to evaluate the effectiveness of our approach against two widely used strategies for improving the robustness of neural networks, as well as a baseline scenario in which no such strategy is applied, i.e., the original network.
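To make the idea of contrasting reports by fixer similarity concrete, the following is a minimal sketch rather than the implementation evaluated in this paper. It assumes that fixer similarity is the cosine similarity of the mean embeddings of each fixer's past reports, that two fixers count as "similar" above a hypothetical `threshold`, and that the loss follows the general form of supervised contrastive learning with a hypothetical `temperature`. All function names and parameter values are illustrative.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length, non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def mean_vector(vectors):
    """Element-wise mean of a non-empty list of equal-length vectors."""
    return [sum(column) / len(vectors) for column in zip(*vectors)]


def fixer_similarity(history_a, history_b):
    """ASSUMPTION: score two fixers by the cosine similarity of the mean
    embeddings of the bug reports each of them resolved in the past."""
    return cosine(mean_vector(history_a), mean_vector(history_b))


def fixer_level_contrastive_loss(embeddings, fixers, histories,
                                 threshold=0.9, temperature=0.1):
    """SupCon-style loss over one batch of bug report embeddings.

    For each anchor report, the positives are the other reports in the batch
    whose fixer is similar to the anchor's fixer (score >= threshold, which
    includes the same fixer); all remaining reports act as negatives.
    `threshold` and `temperature` are illustrative values, not the paper's.
    """
    n = len(embeddings)
    total, anchors = 0.0, 0
    for i in range(n):
        # Temperature-scaled similarity of the anchor to every other report.
        logits = {j: cosine(embeddings[i], embeddings[j]) / temperature
                  for j in range(n) if j != i}
        positives = [j for j in logits
                     if fixer_similarity(histories[fixers[i]],
                                         histories[fixers[j]]) >= threshold]
        if not positives:
            continue  # this anchor has no positive example in the batch
        log_denominator = math.log(sum(math.exp(z) for z in logits.values()))
        total += -sum(logits[j] - log_denominator
                      for j in positives) / len(positives)
        anchors += 1
    return total / anchors if anchors else 0.0


# Toy data: two fixers whose past reports point in different directions.
histories = {
    "alice": [[1.0, 0.0], [0.9, 0.1]],
    "bob": [[0.0, 1.0], [0.1, 0.9]],
}
batch = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]

# Embeddings that agree with the fixer labels should give a lower loss
# than the same embeddings paired with mismatched labels.
aligned = fixer_level_contrastive_loss(
    batch, ["alice", "alice", "bob", "bob"], histories)
shuffled = fixer_level_contrastive_loss(
    batch, ["alice", "bob", "alice", "bob"], histories)
print(aligned < shuffled)
```

In this toy batch the loss is small when reports from the same fixer already sit close together and large when they do not, which is the signal such a loss would use to pull together the representations of reports resolved by similar fixers.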
For this comparative analysis, we used three neural network architectures, namely Bidirectional Long Short-Term Memory (Bi-LSTM) with Embeddings from Language Models (ELMo), Bi-LSTM with attention and ELMo, and Bidirectional Encoder Representations from Transformers (BERT), together with three widely used datasets and a combined dataset built from those three. We also investigated the impact of varying important hyperparameters on the considered approaches. Our experimental results show that the proposed approach improves on the original networks and on those equipped with the baseline strategies to varying degrees in terms of top-k (k=1, 5, 10) accuracy and Mean Reciprocal Rank (MRR), with improvements ranging from 0.83% to 11.14%. Moreover, all three studied neural networks with the proposed approach degrade less than the original networks and those equipped with the baseline strategies when facing adversarial examples generated by the Projected Gradient Descent (PGD) algorithm. This indicates that our approach enhances the robustness of all the studied networks more than the baselines do. Furthermore, the proposed approach achieves better results in small-data learning settings (where 5%, 10%, and 15% of the labelled bug reports of each dataset were used for model training).
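For readers unfamiliar with the two evaluation metrics, the sketch below shows how top-k accuracy and MRR are commonly computed for a ranked list of recommended fixers. The function names and the toy data are illustrative assumptions and are not taken from the study's code.

```python
def top_k_accuracy(ranked_lists, true_fixers, k):
    """Fraction of bug reports whose true fixer is among the first k
    recommended fixers."""
    hits = sum(1 for ranked, truth in zip(ranked_lists, true_fixers)
               if truth in ranked[:k])
    return hits / len(true_fixers)


def mean_reciprocal_rank(ranked_lists, true_fixers):
    """Mean of 1 / rank of the true fixer over all bug reports; a report
    whose true fixer is missing from the list contributes 0."""
    total = 0.0
    for ranked, truth in zip(ranked_lists, true_fixers):
        if truth in ranked:
            total += 1.0 / (ranked.index(truth) + 1)
    return total / len(true_fixers)


# Toy example: three bug reports, each with a ranked list of three fixers.
ranked = [
    ["alice", "bob", "carol"],   # true fixer "alice" is at rank 1
    ["bob", "carol", "alice"],   # true fixer "alice" is at rank 3
    ["carol", "alice", "bob"],   # true fixer "bob" is at rank 3
]
truth = ["alice", "alice", "bob"]

top1 = top_k_accuracy(ranked, truth, 1)
top3 = top_k_accuracy(ranked, truth, 3)
mrr = mean_reciprocal_rank(ranked, truth)
print(top1, top3, mrr)
```

Top-k accuracy only records whether the correct fixer appears in the first k positions, whereas MRR also rewards placing the correct fixer nearer the top of the list.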















Data Availability Statements
The results and source code related to this study are available at https://github.com/AI4BA/dl4ba.
Acknowledgements
The authors would like to thank the anonymous reviewers for their valuable comments and helpful suggestions. This work is partially supported by the National Natural Science Foundation of China under Grant No. 61673384.
Ethics declarations
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by: Christoph Treude.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Wang, R., Ji, X., Tian, Y. et al. Fixer-level supervised contrastive learning for bug assignment. Empir Software Eng 30, 76 (2025). https://doi.org/10.1007/s10664-025-10634-0
DOI: https://doi.org/10.1007/s10664-025-10634-0