Abstract
Load imbalances are a major cause of efficiency loss in highly parallel applications, so identifying them is highly relevant to performance analysis and tuning. We present a low-overhead approach that automatically identifies load-imbalanced regions and filters out irrelevant ones, based on new selection heuristics in PIRA, our tool for automatic instrumentation refinement for the Score-P measurement system. For the LULESH mini-app and the Ice-sheet and Sea-level System Model simulation package, the approach correctly identifies existing load imbalances while keeping the runtime overhead below \(10\%\) for all but one input. Moreover, the generated traces are suitable for Scalasca’s automatic trace analysis.
Acknowledgments
This work was funded by the Hessian LOEWE initiative within the Software-Factory 4.0 project and the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) – Project-ID 265191195 – SFB 1194. Calculations for this research were conducted on the Lichtenberg high-performance computer of the Technical University of Darmstadt.
Contributions
Conception: Jan-Patrick Lehr; Funding: Christian Bischof; Investigation: Peter Arzt and Yannic Fischler; Methodology: Peter Arzt; Software: Peter Arzt; Supervision: Jan-Patrick Lehr; Validation: Yannic Fischler; Writing: Peter Arzt, Jan-Patrick Lehr, Yannic Fischler and Christian Bischof.
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Arzt, P., Fischler, Y., Lehr, J.P., Bischof, C. (2021). Automatic Low-Overhead Load-Imbalance Detection in MPI Applications. In: Sousa, L., Roma, N., Tomás, P. (eds) Euro-Par 2021: Parallel Processing. Euro-Par 2021. Lecture Notes in Computer Science, vol 12820. Springer, Cham. https://doi.org/10.1007/978-3-030-85665-6_2
DOI: https://doi.org/10.1007/978-3-030-85665-6_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-85664-9
Online ISBN: 978-3-030-85665-6
eBook Packages: Computer Science, Computer Science (R0)