Abstract
Container-based solutions, such as Docker, have become increasingly relevant in the software industry to facilitate deploying and maintaining software systems. Little is known, however, about how outdated such containers are at the moment of their release or when used in production. This article addresses this question by measuring and comparing five different dimensions of technical lag that Docker container images can face: package lag, time lag, version lag, vulnerability lag, and bug lag. We instantiate the formal technical lag framework from previous work to operationalise these different dimensions of lag on Docker Hub images based on the Debian Linux distribution. We carry out a large-scale empirical study of such technical lag, over a three-year period, in 140,498 Debian images. We compare official and community images, as well as images based on different Debian distributions: OldStable, Stable, or Testing. The analysis shows that the different dimensions of technical lag are complementary, providing multiple insights. Official Debian images consistently have a lower lag than community images for all considered lag dimensions. The amount of lag incurred depends on the type of Debian distribution and the considered lag dimension. Our research offers empirical evidence that developers and deployers of Docker images can benefit from identifying the extent to which their containers are outdated according to the considered dimensions, and from mitigating the risks related to such outdatedness.
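To make the five dimensions named above more concrete, the following Python sketch shows one plausible way they could be computed for a single image. It is an illustration under stated assumptions, not the authors' framework: the abstract names the dimensions but does not define them, so the definitions used here (lag measured against the latest available release, and summed over packages) are assumptions, as are the names `Release`, `package_lags`, `image_lags`, and the packages `libfoo` and `libbar`.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Release:
    """One released version of a package (all fields are illustrative)."""
    version: str
    released: date
    vulnerabilities: int  # known vulnerabilities affecting this version
    bugs: int             # known bugs affecting this version


def package_lags(history: list[Release], installed: str) -> dict[str, int]:
    """Lag of one installed package; `history` is ordered oldest to newest.

    Assumption: lag is measured against the newest release in `history`.
    """
    index = [r.version for r in history].index(installed)
    used, latest = history[index], history[-1]
    return {
        "package_lag": int(index < len(history) - 1),  # 1 if outdated
        "time_lag_days": (latest.released - used.released).days,
        "version_lag": len(history) - 1 - index,
        "vulnerability_lag": used.vulnerabilities - latest.vulnerabilities,
        "bug_lag": used.bugs - latest.bugs,
    }


def image_lags(installed: dict[str, str],
               histories: dict[str, list[Release]]) -> dict[str, int]:
    """Sum each lag dimension over all packages installed in an image.

    Assumption: summation is the aggregation; other choices are possible.
    """
    totals: dict[str, int] = {}
    for name, version in installed.items():
        for dimension, value in package_lags(histories[name], version).items():
            totals[dimension] = totals.get(dimension, 0) + value
    return totals


# Hypothetical release histories, not taken from Debian.
HISTORIES = {
    "libfoo": [
        Release("1.0", date(2019, 1, 10), vulnerabilities=3, bugs=5),
        Release("1.1", date(2019, 3, 10), vulnerabilities=1, bugs=4),
        Release("1.2", date(2019, 6, 10), vulnerabilities=0, bugs=2),
    ],
    "libbar": [Release("2.0", date(2019, 2, 1), vulnerabilities=0, bugs=1)],
}

# Hypothetical image: libfoo is two releases behind, libbar is up to date.
IMAGE = {"libfoo": "1.0", "libbar": "2.0"}
print(image_lags(IMAGE, HISTORIES))
```

With this made-up data, only `libfoo` is outdated, so the image has a package lag of 1, a version lag of 2, and a time lag of 151 days. The example also shows why the dimensions can be complementary: a package may be only a few versions behind yet carry a large vulnerability lag, or the reverse.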
Notes
An example of rule violation is forgetting the -y flag when using apt-get install.
Certified images are built with best practices, tested and validated against the Docker Enterprise Edition and pass security requirements.
Verified images are high-quality images from verified publishers. These products are published and maintained directly by a commercial entity.
Downloading all available images would have taken at least 6 extra months, and would have required considerably more storage capacity.
If \(n\) different tests are carried out over the same dataset, for each individual test one can only reject \(H_0\) if \(p < \frac{0.01}{n}\). In our case \(n = 28\), i.e., \(p < 0.00036\).
Extra analysis and results, distinguishing the evolution trends both for official and community images, can be found in our reproduction package.
An example of this was provided with the Dockerfile for the community image shogun-dev:latest presented in Section 2.2.
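The corrected significance threshold given in the note on multiple tests (dividing the significance level by the number of tests, commonly known as the Bonferroni correction) can be checked with a short Python sketch. The function names `corrected_threshold` and `reject_null` are illustrative and do not come from the article or its reproduction package.

```python
def corrected_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold when n_tests share one dataset."""
    if n_tests < 1:
        raise ValueError("n_tests must be at least 1")
    return alpha / n_tests


def reject_null(p_value: float, alpha: float = 0.01, n_tests: int = 28) -> bool:
    """Reject H0 only if the p-value is below the corrected threshold."""
    return p_value < corrected_threshold(alpha, n_tests)


# Values from the note: alpha = 0.01 and n = 28 tests.
threshold = corrected_threshold(0.01, 28)
print(f"{threshold:.5f}")  # 0.00036
```

Under this correction, a test with p = 0.001 would be significant at the uncorrected 0.01 level but would not lead to rejecting H0, since 0.001 exceeds 0.01/28.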
Acknowledgements
This research is carried out in the context of the Excellence of Science project 30446992 SECO-Assist financed by FWO-Vlaanderen and F.R.S.-FNRS. We acknowledge the support of the Government of Spain through project “BugBirth” (RTI2018-101963-B-100).
Additional information
Communicated by: Emad Shihab and David Lo
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
This article belongs to the Topical Collection: Software Analysis, Evolution and Reengineering (SANER)
Cite this article
Zerouali, A., Mens, T., Decan, A. et al. A multi-dimensional analysis of technical lag in Debian-based Docker images. Empir Software Eng 26, 19 (2021). https://doi.org/10.1007/s10664-020-09908-6
DOI: https://doi.org/10.1007/s10664-020-09908-6