Abstract
Context
Because software systems depend on one another, vulnerabilities in software supply chains (SSCs) may pose more serious security threats than vulnerabilities in independent software systems. This raises new challenges for ensuring software security, including the spread of risks and increased maintenance costs.
Objective
To address these challenges, a deep understanding is needed of how vulnerabilities behave in SSCs in terms of vulnerability source, propagation, localization, and repair. However, no studies have been conducted specifically for this purpose.
Method
To fill this gap, we conduct an empirical study of real-world vulnerability characteristics in the context of SSCs. Specifically, we first examine the vulnerability source and then study fine-grained vulnerability propagation, localization, and repair in libraries and their corresponding client programs.
Results
The key findings are summarized as follows: a) 99% of vulnerabilities in client programs are caused by their dependencies, and 81.26% of SSC vulnerabilities detected by package-level analysis are false positives; b) for vulnerability localization, the vulnerability database does not contain enough information to support direct localization, but the vulnerability descriptions in the open-source vulnerability database provide much important information for indirect localization; c) client developers deal with vulnerable dependencies in many ways, including upgrading dependencies, modifying client code, and deleting the relevant code or the vulnerable dependencies.
Conclusions
Based on these observations, we make suggestions for future research in this direction: a) when testing important client programs, vulnerability detection tools should pay attention to both the client code and the dependent libraries; b) localizing vulnerabilities based on vulnerability descriptions is not straightforward, so a proper combination of program analysis and description analysis is expected to improve localization accuracy; c) there are various strategies for dealing with vulnerable libraries, and automating the application of those strategies is a promising direction.
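Finding a) contrasts package-level analysis with finer-grained analysis. The following is a minimal sketch of that distinction, not the authors' tooling: a package-level check flags a client as soon as it declares a vulnerable dependency, while a function-level check flags it only if a vulnerable function is reachable from the client's entry points. The package name `libfoo`, the function names, and the hand-written call graph are all hypothetical and chosen only for illustration.

```python
from collections import deque

# Hypothetical advisory data: package name -> set of vulnerable functions.
VULNERABLE_FUNCTIONS = {"libfoo": {"libfoo.parse_header"}}

# Hypothetical call graph: caller -> list of callees.
CALL_GRAPH = {
    "client.main": ["libfoo.format_date"],
    "libfoo.format_date": [],
    "libfoo.parse_header": ["libfoo.read_bytes"],
}


def package_level_alert(dependencies, vulnerable_functions):
    """Flag every declared dependency that has any known vulnerability."""
    return sorted(set(dependencies) & set(vulnerable_functions))


def reachable(call_graph, entry_points):
    """Return all functions reachable from the entry points (breadth-first)."""
    seen = set(entry_points)
    queue = deque(entry_points)
    while queue:
        fn = queue.popleft()
        for callee in call_graph.get(fn, ()):
            if callee not in seen:
                seen.add(callee)
                queue.append(callee)
    return seen


def function_level_alert(dependencies, vulnerable_functions, call_graph, entry_points):
    """Flag a dependency only if one of its vulnerable functions is reachable."""
    live = reachable(call_graph, entry_points)
    return sorted(
        pkg for pkg in dependencies
        if vulnerable_functions.get(pkg, set()) & live
    )


deps = ["libfoo"]
coarse = package_level_alert(deps, VULNERABLE_FUNCTIONS)
fine = function_level_alert(deps, VULNERABLE_FUNCTIONS, CALL_GRAPH, ["client.main"])
print(coarse, fine)
```

In this toy setup the client calls only `libfoo.format_date`, so the package-level check reports `libfoo` while the function-level check reports nothing; an alert of the first kind that the second does not confirm is the sort of case a package-level false positive refers to. A real analysis would need a call graph extracted from the client and its libraries, which this sketch simply assumes is given.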











Data Availability
The source and generating process of datasets are presented in Section 3. Data supporting the findings of this study are publicly available in the GitHub repository, https://github.com/YijunShen/supply_chain_vul
Notes
https://libraries.io is one of the largest open-source data platforms; it monitors open-source packages across 32 different package managers.
Acknowledgements
This work was supported partly by the National Natural Science Foundation of China under Grant Nos. 62141209 and 62202026, and partly by the Guangxi Collaborative Innovation Center of Multi-source Information Integration and Intelligent.
Ethics declarations
Conflicts of Interest
The authors declare that they have no conflict of interest.
Additional information
Communicated by: Slinger Jansen.
About this article
Cite this article
Shen, Y., Gao, X., Sun, H. et al. Understanding vulnerabilities in software supply chains. Empir Software Eng 30, 20 (2025). https://doi.org/10.1007/s10664-024-10581-2
DOI: https://doi.org/10.1007/s10664-024-10581-2