Abstract
Software code review, i.e., the practice of having other team members critique changes to a software system, is a well-established best practice in both open source and proprietary software domains. Prior work has shown that formal code inspections tend to improve the quality of delivered software. However, the formal code inspection process mandates strict review criteria (e.g., in-person meetings and reviewer checklists) to ensure a base level of review quality, while the modern, lightweight code reviewing process does not. Although recent work explores the modern code review process, little is known about the relationship between modern code review practices and long-term software quality. Hence, in this paper, we study the relationship between post-release defects (a popular proxy for long-term software quality) and: (1) code review coverage, i.e., the proportion of changes that have been code reviewed, (2) code review participation, i.e., the degree of reviewer involvement in the code review process, and (3) code reviewer expertise, i.e., the level of domain-specific expertise of the code reviewers. Through a case study of the Qt, VTK, and ITK projects, we find that code review coverage, participation, and expertise share a significant link with software quality. Hence, our results empirically confirm the intuition that poorly reviewed code has a negative impact on software quality in large systems using modern reviewing tools.
References
Bacchelli A, Bird C (2013) Expectations, Outcomes, and Challenges of Modern Code Review. In: Proceedings of the 35th Int’l Conference on Software Engineering (ICSE), pp 712–721
Baysal O, Kononenko O, Holmes R, Godfrey MW (2013) The Influence of Non-technical Factors on Code Review. In: Proceedings of the 20th Working Conference on Reverse Engineering (WCRE), pp 122–131
Beller M, Bacchelli A, Zaidman A, Juergens E (2014) Modern Code Reviews in Open-Source Projects: Which Problems Do They Fix? In: Proceedings of the 11th Working Conference on Mining Software Repositories (MSR), pp 202–211
Bettenburg N, Hassan AE, Adams B, German DM (2014) Management of community contributions: A case study on the Android and Linux software ecosystems. Empirical Software Engineering, To appear
Bird C, Nagappan N, Murphy B, Gall H, Devanbu P (2011) Don’t Touch My Code! Examining the Effects of Ownership on Software Quality. In: Proceedings of the 8th joint meeting of the European Software Engineering Conference and the Symposium on the Foundations of Software Engineering (ESEC/FSE), pp 4–14
Chambers JM, Hastie TJ (eds) (1992) Statistical Models in S, Wadsworth and Brooks/Cole, chap 4
Efron B (1986) How Biased is the Apparent Error Rate of a Prediction Rule? J Am Stat Assoc 81(394):461–470
Fagan ME (1976) Design and Code Inspections to Reduce Errors in Program Development. IBM Syst J 15(3):182–211
Graves TL, Karr AF, Marron JS, Siy H (2000) Predicting Fault Incidence using Software Change History. Trans Softw Eng (TSE) 26(7):653–661
Hamasaki K, Kula RG, Yoshida N, Cruz AEC, Fujiwara K, Iida H (2013) Who Does What during a Code Review? Datasets of OSS Peer Review Repositories
Harrell FE Jr (2002) Regression Modeling Strategies, 1st edn. Springer
Harrell FE Jr (2014) rms: Regression Modeling Strategies. http://biostat.mc.vanderbilt.edu/rms, R package version 4.2-1
Harrell FE Jr, Lee KL, Califf RM, Pryor DB, Rosati RA (1984) Regression modelling strategies for improved prognostic prediction. Stat Med 3(2):143–152
Harrell FE Jr, Lee KL, Matchar DB, Reichert TA (1985) Regression models for prognostic prediction: advantages, problems, and suggested solutions. Cancer Treatment Reports 69(10):1071–1077
Hassan AE (2008) Automated Classification of Change Messages in Open Source Projects. In: Proceedings of the 23rd Int’l Symposium on Applied Computing (SAC), pp 837–841
Hassan AE (2009) Predicting Faults Using the Complexity of Code Changes. In: Proceedings of the 31st Int’l Conference on Software Engineering (ICSE), pp 78–88
Hastie T, Tibshirani R, Friedman J (2009) Elements of Statistical Learning. 2nd edn. Springer
Herraiz I, German DM, Gonzalez-Barahona JM, Robles G (2008) Towards a Simplification of the Bug Report form in Eclipse. In: Proceedings of the 5th Working Conference on Mining Software Repositories (MSR), pp 145–148
Jiang Y, Adams B, German DM (2013) Will My Patch Make It? And How Fast?: Case Study on the Linux Kernel. In: Proceedings of the 10th Working Conference on Mining Software Repositories (MSR), pp 101–110
Kamei Y, Matsumoto S, Monden A, Matsumoto K, Adams B, Hassan AE (2010) Revisiting Common Bug Prediction Findings Using Effort-Aware Models. In: Proceedings of the 26th Int’l Conference on Software Maintenance (ICSM), pp 1–10
Kamei Y, Shihab E, Adams B, Hassan AE, Mockus A, Sinha A, Ubayashi N (2013) A Large-Scale Empirical Study of Just-in-Time Quality Assurance. Trans Softw Eng (TSE) 39(6):757–773
Kemerer CF, Paulk MC (2009) The Impact of Design and Code Reviews on Software Quality: An Empirical Study Based on PSP Data. Trans Softw Eng (TSE) 35(4):534–550
Kim S, Whitehead EJ Jr, Zhang Y (2008) Classifying software changes: Clean or buggy. Trans Softw Eng (TSE) 34(2):181–196
Koru AG, Zhang D, Emam KE, Liu H (2009) An Investigation into the Functional Form of the Size-Defect Relationship for Software Modules. Trans Softw Eng (TSE) 35(2):293–304
Mäntylä MV, Lassenius C (2009) What Types of Defects Are Really Discovered in Code Reviews? Trans Softw Eng (TSE) 35(3):430–448
Matsumoto S, Kamei Y, Monden A, Matsumoto K, Nakamura M (2010) An analysis of developer metrics for fault prediction. In: Proceedings of the 6th Int’l Conference on Predictive Models in Software Engineering (PROMISE), pp 18:1–18:9
McCabe TJ (1976) A complexity measure. In: Proceedings of the 2nd Int’l Conference on Software Engineering (ICSE), p 407
McIntosh S, Kamei Y, Adams B, Hassan AE (2014) The Impact of Code Review Coverage and Code Review Participation on Software Quality: A Case Study of the Qt, VTK, and ITK Projects. In: Proceedings of the 11th Working Conference on Mining Software Repositories (MSR), pp 192–201
Menzies T, Stefano JSD, Chapman M, McGill K (2002) Metrics That Matter. In: Proc of the 27th Annual NASA Goddard/IEEE Software Engineering Workshop, pp 51–57
Mockus A, Votta LG (2000) Identifying Reasons for Software Changes Using Historic Databases. In: Proceedings of the 16th Int’l Conference on Software Maintenance (ICSM), pp 120–130
Mockus A, Weiss DM (2000) Predicting Risk of Software Changes. Bell Labs Tech J 5(2):169–180
Mockus A, Fielding RT, Herbsleb JD (2002) Two Case Studies of Open Source Software Development: Apache and Mozilla. Trans Softw Eng Methodol (TOSEM) 11(3):309–346
Mukadam M, Bird C, Rigby PC (2013) Gerrit Software Code Review Data from Android. In: Proceedings of the 10th Working Conference on Mining Software Repositories (MSR), pp 45–48
Nagappan N, Ball T (2005) Use of relative code churn measures to predict system defect density. In: Proceedings of the 27th Int’l Conference on Software Engineering (ICSE), pp 284–292
Nagappan N, Ball T (2007) Using Software Dependencies and Churn Metrics to Predict Field Failures: An Empirical Case Study. In: Proceedings of the 1st Int’l Symposium on Empirical Software Engineering and Measurement (ESEM), pp 364–373
Nagappan N, Ball T, Zeller A (2006) Mining metrics to predict component failures. In: Proceedings of the 28th Int’l Conference on Software Engineering (ICSE), pp 452–461
Porter A, Siy H, Mockus A, Votta L (1998) Understanding the Sources of Variation in Software Inspections. Trans Softw Eng Methodol (TOSEM) 7(1):41–79
R Core Team (2013) R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria. http://www.R-project.org/
Rahman F, Devanbu P (2011) Ownership, Experience and Defects: A Fine-Grained Study of Authorship. In: Proceedings of the 33rd Int’l Conference on Software Engineering (ICSE), pp 491–500
Rahman F, Devanbu P (2013) How, and why, process metrics are better. In: Proceedings of the 35th Int’l Conference on Software Engineering (ICSE), pp 432–441
Rigby PC, Bird C (2013) Convergent Contemporary Software Peer Review Practices. In: Proceedings of the 9th joint meeting of the European Software Engineering Conference and the Symposium on the Foundations of Software Engineering (ESEC/FSE), pp 202–212
Rigby PC, Storey MA (2011) Understanding Broadcast Based Peer Review on Open Source Software Projects. In: Proceedings of the 33rd Int’l Conference on Software Engineering (ICSE), pp 541–550
Rigby PC, German DM, Storey MA (2008) Open Source Software Peer Review Practices: A Case Study of the Apache Server. In: Proceedings of the 30th Int’l Conference on Software Engineering (ICSE), pp 541–550
Rigby PC, German DM, Cohen L, Storey MA (2014) Peer Review on Open Source Software Projects: Parameters, Statistical Models, and Theory. To appear
Sarle WS (1990) The VARCLUS Procedure. In: SAS/STAT User’s Guide, 4th edn, SAS Institute, Inc.
Shannon CE (1948) A Mathematical Theory of Communication. The Bell System Technical Journal 27:379–423, 623–656
Shihab E, Mockus A, Kamei Y, Adams B, Hassan AE (2011) High-Impact Defects: A Study of Breakage and Surprise Defects. In: Proceedings of the 8th joint meeting of the European Software Engineering Conference and the Symposium on the Foundations of Software Engineering (ESEC/FSE), pp 300–310
Tanaka T, Sakamoto K, Kusumoto S, Matsumoto K, Kikuno T (1995) Improvement of Software Process by Process Description and Benefit Estimation. In: Proceedings of the 17th Int’l Conference on Software Engineering (ICSE), pp 123–132
Acknowledgments
The authors would like to acknowledge Frank Harrell Jr. for the insightful discussions and assistance with the configuration and debugging of the rms R package. The authors would also like to thank the anonymous reviewers for their fruitful comments on the earlier version of this work (McIntosh et al. 2014).
This research was partially supported by the Natural Sciences and Engineering Research Council of Canada (NSERC) and JSPS KAKENHI Grant Numbers 24680003 and 25540026.
Additional information
Communicated by: Sung Kim and Martin Pinzger
Appendix A: Example Scripts
Cite this article
McIntosh, S., Kamei, Y., Adams, B. et al. An empirical study of the impact of modern code review practices on software quality. Empir Software Eng 21, 2146–2189 (2016). https://doi.org/10.1007/s10664-015-9381-9
DOI: https://doi.org/10.1007/s10664-015-9381-9