Abstract
This paper explores the application of Human-in-the-Loop (HITL) strategies to the training of machine learning models in the medical domain. Specifically, a doctor-in-the-loop approach is proposed to leverage human expertise when dealing with large and complex data. The paper addresses the use of Whole Slide Imaging (WSI) for the analysis and prediction of the genomic subtype of breast cancer. Three tasks were developed: segmentation of histopathological images, classification of these images according to the genomic subtype of the cancer, and interpretation of the machine learning results. The involvement of a pathologist helped us develop a better segmentation model, one that groups areas so that its output is more useful for further diagnosis. Because the classification models underperformed, owing to the complexity of the problem and insufficient data for certain cancer types, we focused our efforts on using the pathologist's feedback to enhance model interpretability through a HITL hyperparameter optimization process.
Data availability
The dataset analyzed during the current study is available in the TCGA repository [23], URL: https://portal.gdc.cancer.gov/projects/TCGA-BRCA. The images analyzed during the current study are available in the TCIA repository [25], URL: https://www.cancerimagingarchive.net/collection/tcga-brca/
References
Siegel RL, Giaquinto AN, Jemal A (2024) Cancer statistics, 2024. CA Cancer J Clin 74(1):12–49. https://doi.org/10.3322/caac.21820. https://acsjournals.onlinelibrary.wiley.com/doi/pdf/10.3322/caac.21820
Dizon DS, Kamal AH (2024) Cancer statistics 2024: all hands on deck. CA Cancer J Clin 74(1):8–9. https://doi.org/10.3322/caac.21824. https://acsjournals.onlinelibrary.wiley.com/doi/pdf/10.3322/caac.21824
Giaquinto AN, Sung H, Miller KD, Kramer JL, Newman LA, Minihan A, Jemal A, Siegel RL (2022) Breast cancer statistics, 2022. CA Cancer J Clin 72(6):524–541. https://doi.org/10.3322/caac.21754. https://acsjournals.onlinelibrary.wiley.com/doi/pdf/10.3322/caac.21754
Parker JS, Mullins M, Cheang MCU, Leung S, Voduc D, Vickery T, Davies S, Fauron C, He X, Hu Z, Quackenbush JF, Stijleman IJ, Palazzo J, Marron JS, Nobel AB, Mardis E, Nielsen TO, Ellis MJ, Perou CM, Bernard PS (2009) Supervised risk predictor of breast cancer based on intrinsic subtypes. J Clin Oncol 27(8):1160–1167. https://doi.org/10.1200/JCO.2008.18.1370. (PMID: 19204204)
Pascual T, Martin M, Fernández-Martínez A, Paré L, Alba E, Rodríguez-Lescure A, Perrone G, Cortés J, Morales S, Lluch A, Urruticoechea A, González-Farré B, Galván P, Jares P, Rodriguez A, Chic N, Righi D, Cejalvo JM, Tonini G, Adamo B, Vidal M, Villagrasa P, Muñoz M, Prat A (2019) A pathology-based combined model to identify PAM50 non-luminal intrinsic disease in hormone receptor-positive HER2-negative breast cancer. Front Oncol. https://doi.org/10.3389/fonc.2019.00303
Gurcan MN, Boucheron LE, Can A, Madabhushi A, Rajpoot NM, Yener B (2009) Histopathological image analysis: a review. IEEE Rev Biomed Eng 2:147–171. https://doi.org/10.1109/RBME.2009.2034865
Kaur A, Kaushal C, Sandhu JK, Damaševičius R, Thakur N (2024) Histopathological image diagnosis for breast cancer diagnosis based on deep mutual learning. Diagnostics. https://doi.org/10.3390/diagnostics14010095
Krishnakumar B, Kousalya K (2023) Optimal trained deep learning model for breast cancer segmentation and classification. Inform Technol Control 52(4):915–934. https://doi.org/10.5755/j01.itc.52.4.34232
Carriero A, Groenhoff L, Vologina E, Basile P, Albera M (2024) Deep learning in breast cancer imaging: state of the art and recent advancements. Diagnostics 14(8):848. https://doi.org/10.3390/diagnostics14080848
Mosqueira-Rey E, Hernández-Pereira E, Alonso-Ríos D, Bobes-Bascarán J, Fernández-Leal A (2023) Human-in-the-loop machine learning: a state of the art. Artif Intell Rev. https://doi.org/10.1007/s10462-022-10246-w
Boecking B, Neiswanger W, Xing E, Dubrawski A (2021) Interactive weak supervision: learning useful heuristics for data labeling. https://arxiv.org/abs/2012.06046
Lison P, Hubin A, Barnes J, Touileb S (2020) Named entity recognition without labelled data: a weak supervision approach. https://arxiv.org/abs/2004.14723
Mosqueira-Rey E, Hernández-Pereira E, Bobes-Bascarán J, Alonso-Ríos D, Pérez-Sánchez A, Fernández-Leal A, Moret-Bonillo V, Vidal-Ínsua Y, Vázquez-Rivera F (2024) Addressing the data bottleneck in medical deep learning models using a human-in-the-loop machine learning approach. Neural Comput Appl 36(5):2597–2616. https://doi.org/10.1007/s00521-023-09197-2
Voorst R (2024) Challenges and limitations of human oversight in ethical AI implementation in healthcare: balancing digital literacy and professional strain. Mayo Clin Proc Digit Health. https://doi.org/10.1016/j.mcpdig.2024.08.004
Kosaraju S, Park J, Lee H, Yang JW, Kang M (2022) Deep learning-based framework for slide-based histopathological image analysis. Sci Rep 12(1):19075. https://doi.org/10.1038/s41598-022-23166-0
Bera K, Schalper KA, Rimm DL, Velcheti V, Madabhushi A (2019) Artificial intelligence in digital pathology – new tools for diagnosis and precision oncology. Nat Rev Clin Oncol 16(11):703–715. https://doi.org/10.1038/s41571-019-0252-y
Su A, Lee H, Tan X, Suarez CJ, Andor N, Nguyen Q, Ji HP (2022) A deep learning model for molecular label transfer that enables cancer cell identification from histopathology images. NPJ Precis Oncol 6(1):14. https://doi.org/10.1038/s41698-022-00252-0
Rosai J (2007) Why microscopy will remain a cornerstone of surgical pathology. Lab Invest 87(5):403–408. https://doi.org/10.1038/labinvest.3700551
Laak J, Litjens G, Ciompi F (2021) Deep learning in histopathology: the path to the clinic. Nat Med 27(5):775–784. https://doi.org/10.1038/s41591-021-01343-4
Bera K, Schalper KA, Rimm DL, Velcheti V, Madabhushi A (2019) Artificial intelligence in digital pathology - new tools for diagnosis and precision oncology. Nat Rev Clin Oncol 16(11):703–715. https://doi.org/10.1038/s41571-019-0252-y
Schneider L, Laiouar-Pedari S, Kuntz S, Krieghoff-Henning E, Hekler A, Kather JN, Gaiser T, Fröhling S, Brinker TJ (2022) Integration of deep learning-based image analysis and genomic data in cancer pathology: a systematic review. Eur J Cancer 160:80–91. https://doi.org/10.1016/j.ejca.2021.10.007
Schettini F, Brasó-Maristany F, Kuderer NM, Prat A (2022) A perspective on the development and lack of interchangeability of the breast cancer intrinsic subtypes. NPJ Breast Cancer 8(1):85. https://doi.org/10.1038/s41523-022-00451-9
Weinstein JN, Collisson EA, Mills GB, Shaw KR, Ozenberger BA, Ellrott K, Shmulevich I, Sander C, Stuart JM (2013) The cancer genome atlas pan-cancer analysis project. Nat Genet 45(10):1113–1120. https://doi.org/10.1038/ng.2764
Clark K, Vendt B, Smith K, Freymann J, Kirby J, Koppel P, Moore S, Phillips S, Maffitt D, Pringle M, Tarbox L, Prior F (2013) The cancer imaging archive (TCIA): maintaining and operating a public information repository. J Digit Imag 26(6):1045–1057. https://doi.org/10.1007/s10278-013-9622-7
Lingle W, Erickson BJ, Zuley ML, Jarosz R, Bonaccio E, Filippini J, Net JM, Levi L, Morris EA, Figler GG, Elnajjar P, Kirk S, Lee Y, Giger M, Gruszauskas N (2016) The Cancer Genome Atlas Breast Invasive Carcinoma Collection (TCGA-BRCA) (Version 3) [Data set]. The Cancer Imaging Archive. https://doi.org/10.7937/K9/TCIA.2016.AB2NAZRP
Chollet F et al (2015) Keras. https://keras.io
Anderson MR, Antenucci D, Cafarella MJ (2016) Runtime support for human-in-the-loop feature engineering system. IEEE Data Eng Bull 39(4):62–84
Gkorou D, Larranaga M, Ypma A, Hasibi F, Wijk RJ (2020) Get a human-in-the-loop: feature engineering via interactive visualizations. In: Proceedings of the workshop on interactive adaptive learning co-located with European conference on machine learning and principles and practice of knowledge discovery in databases (ECML PKDD 2020), vol. 2660. CEUR Workshop Proceedings. https://ceur-ws.org/Vol-2660/ialatecml_shortpaper4.pdf
Bengio Y, Louradour J, Collobert R, Weston J (2009) Curriculum learning. In: Proceedings of the 26th Annual international conference on machine learning. ICML ’09, pp. 41–48. Association for Computing Machinery, New York, NY, USA. https://doi.org/10.1145/1553374.1553380. https://dl.acm.org/doi/10.1145/1553374.1553380
Holmberg L, Davidsson P, Linde P (2020) A feature space focus in machine teaching. In: 2020 IEEE International Conference on Pervasive Computing and Communications Workshops (PerCom Workshops), pp. 1–2. https://doi.org/10.1109/PerComWorkshops48775.2020.9156175. http://mau.diva-portal.org/smash/get/diva2:1428195/FULLTEXT01.pdf
Settles B (2009) Active learning literature survey. Technical report, University of Wisconsin-Madison. Department of Computer Sciences. https://minds.wisconsin.edu/handle/1793/60660
Amershi S, Cakmak M, Knox WB, Kulesza T (2014) Power to the people: the role of humans in interactive machine learning. AI Mag 35(4):105–120. https://doi.org/10.1609/aimag.v35i4.2513
Kaufmann T, Weng P, Bengs V, Hüllermeier E (2023) A survey of reinforcement learning from human feedback. https://arxiv.org/abs/2312.14925
Simard PY, Amershi S, Chickering DM, Pelton AE, Ghorashi S, Meek C, Ramos G, Suh J, Verwey J, Wang M, Wernsing J (2017) Machine teaching: a new paradigm for building machine learning systems. http://arxiv.org/abs/1707.06742
Ramos G, Meek C, Simard P, Suh J, Ghorashi S (2020) Interactive machine teaching: a human-centered approach to building machine-learned models. Human-Comput Interact 35(5–6):413–451. https://doi.org/10.1080/07370024.2020.1734931
Mosqueira-Rey E, Fernández-Castaño S, Alonso-Ríos D, Vázquez-Cano E, López-Meneses E (2023) Gamifying machine teaching: human-in-the-loop approach for diphthong and hiatus identification in Spanish language. Procedia Comput Sci 225:3086–3093. https://doi.org/10.1016/j.procs.2023.10.302
Gunning D (2017) Explainable artificial intelligence (xAI). Technical report, Defense Advanced Research Projects Agency (DARPA). https://www.darpa.mil/program/explainable-artificial-intelligence
Abdul A, Vermeulen J, Wang D, Lim BY, Kankanhalli M (2018) Trends and trajectories for explainable, accountable and intelligible systems: An hci research agenda. In: Proceedings of the 2018 CHI conference on human factors in computing systems. CHI ’18. Association for Computing Machinery, New York, NY, USA, pp. 1–18. https://doi.org/10.1145/3173574.3174156
Guillot Suarez C (2022) Human-in-the-loop hyperparameter tuning of deep nets to improve explainability of classifications. Master’s thesis, Aalto University. School of Electrical Engineering. http://urn.fi/URN:NBN:fi:aalto-202205223354
Xu W (2019) Toward human-centered AI: a perspective from human-computer interaction. Interactions 26(4):42–46. https://doi.org/10.1145/3328485
Choung H, David P, Ross A (2023) Trust and ethics in AI. AI & Society 38(2):733–745. https://doi.org/10.1007/s00146-022-01473-4
Barredo Arrieta A, Díaz-Rodríguez N, Del Ser J, Bennetot A, Tabik S, Barbado A, Garcia S, Gil-Lopez S, Molina D, Benjamins R, Chatila R, Herrera F (2020) Explainable artificial intelligence (XAI): concepts, taxonomies, opportunities and challenges toward responsible AI. Inform Fus 58:82–115. https://doi.org/10.1016/j.inffus.2019.12.012
Freitas AA (2014) Comprehensible classification models: a position paper. SIGKDD Explor. Newsl. 15(1):1–10. https://doi.org/10.1145/2594473.2594475
Ribeiro MT, Singh S, Guestrin C (2016) Model-agnostic interpretability of machine learning. arXiv:1606.05386
Slack D, Hilgard A, Singh S, Lakkaraju H (2021) Reliable post hoc explanations: modeling uncertainty in explainability. In: Ranzato M, Beygelzimer A, Dauphin Y, Liang PS, Vaughan JW (eds.) Advances in neural information processing systems, vol. 34, pp. 9391–9404. Curran Associates, Inc. https://proceedings.neurips.cc/paper_files/paper/2021/file/4e246a381baf2ce038b3b0f82c7d6fb4-Paper.pdf
Ho DJ, Yarlagadda DVK, D’Alfonso TM, Hanna MG, Grabenstetter A, Ntiamoah P, Brogi E, Tan LK, Fuchs TJ (2021) Deep multi-magnification networks for multi-class breast cancer image segmentation. Computeriz Med Imag Graph 88:101866. https://doi.org/10.1016/j.compmedimag.2021.101866
Yilmaz V (2019) Elastic deformation on images. https://towardsdatascience.com/elastic-deformation-on-images-b00c21327372
Hou L, Samaras D, Kurc TM, Gao Y, Davis JE, Saltz JH (2016) Patch-based convolutional neural network for whole slide tissue image classification. In: Proceedings of the IEEE conference on computer vision and pattern recognition (CVPR). https://openaccess.thecvf.com/content_cvpr_2016/html/Hou_Patch-Based_Convolutional_Neural_CVPR_2016_paper.html
Mehta S, Mercan E, Bartlett J, Weaver D, Elmore J, Shapiro L (2018) Learning to segment breast biopsy whole slide images. In: 2018 IEEE Winter conference on applications of computer vision (WACV), pp. 663–672. https://doi.org/10.1109/WACV.2018.00078
Agarwalla A, Shaban M, Rajpoot NM (2017) Representation-aggregation networks for segmentation of multi-gigapixel histology images. https://arxiv.org/abs/1707.08814
Chollet F (2017) Xception: deep learning with depthwise separable convolutions. arXiv:1610.02357
He K, Zhang X, Ren S, Sun J (2015) Deep residual learning for image recognition. arXiv:1512.03385
Liu T, Huang J, Liao T, Pu R, Liu S, Peng Y (2022) A hybrid deep learning model for predicting molecular subtypes of human breast cancer using multimodal data. IRBM 43(1):62–74. https://doi.org/10.1016/j.irbm.2020.12.002
Villareal RJT, Abu PAR (2021) Patch-based convolutional neural networks for TCGA-BRCA breast cancer classification. In: Bebis G, Athitsos V, Yan T, Lau M, Li F, Shi C, Yuan X, Mousas C, Bruder G (Eds) Advances in visual computing, pp. 29–40. Springer, Cham. https://doi.org/10.1007/978-3-030-90436-4_3
Choi JM, Chae H (2023) moBRCA-net: a breast cancer subtype classification framework based on multi-omics attention neural networks. BMC Bioinform 24(1):169. https://doi.org/10.1186/s12859-023-05273-5
Ribeiro MT, Singh S, Guestrin C (2016) “Why should I trust you?”: Explaining the predictions of any classifier. In: Proceedings of the 22nd ACM SIGKDD international conference on knowledge discovery and data mining. KDD ’16, pp. 1135–1144. Association for computing machinery, New York, NY, USA. https://doi.org/10.1145/2939672.2939778
Guidotti R, Monreale A, Ruggieri S, Turini F, Giannotti F, Pedreschi D (2018) A survey of methods for explaining black box models. ACM Comput Sur. https://doi.org/10.1145/3236009
Zhang Y, Song K, Sun Y, Tan S, Udell M (2019) Why Should You Trust My Explanation? Understanding uncertainty in LIME explanations. https://arxiv.org/abs/1904.12991
Lundberg SM, Lee S-I (2017) A unified approach to interpreting model predictions. In: Guyon I, Luxburg UV, Bengio S, Wallach H, Fergus R, Vishwanathan S, Garnett R (eds.) Advances in neural information processing systems. Proceedings of the 31st Int. Conf. on neural information processing systems. NIPS’17, vol. 30, pp. 4768–4777. Curran Associates Inc., Red Hook, NY, USA. https://proceedings.neurips.cc/paper_files/paper/2017/file/8a20a8621978632d76c43dfd28b67767-Paper.pdf
Watson DS, O’Hara J, Tax N, Mudd R, Guy I (2023) Explaining predictive uncertainty with information theoretic shapley values. arXiv:2306.05724
Alvarez-Melis D, Jaakkola TS (2018) On the robustness of interpretability methods. arXiv:1806.08049
Selvaraju RR, Cogswell M, Das A, Vedantam R, Parikh D, Batra D (2017) Grad-cam: Visual explanations from deep networks via gradient-based localization. In: 2017 IEEE International conference on computer vision (ICCV), pp. 618–626. https://doi.org/10.1109/ICCV.2017.74
Zhou B, Khosla A, Lapedriza A, Oliva A, Torralba A (2016) Learning deep features for discriminative localization. In: 2016 IEEE conference on computer vision and pattern recognition (CVPR), pp. 2921–2929. IEEE Computer Society, Los Alamitos, CA, USA. https://doi.org/10.1109/CVPR.2016.319. https://doi.ieeecomputersociety.org/10.1109/CVPR.2016.319
Lin M, Chen Q, Yan S (2014) Network in network. https://arxiv.org/pdf/1312.4400v3.pdf
Chattopadhay A, Sarkar A, Howlader P, Balasubramanian VN (2018) Grad-cam++: Generalized gradient-based visual explanations for deep convolutional networks. In: 2018 IEEE Winter conference on applications of computer vision (WACV), pp. 839–847. https://doi.org/10.1109/WACV.2018.00097
Li S, Li T, Sun C, Yan R, Chen X (2023) Multilayer grad-cam: an effective tool towards explainable deep neural networks for intelligent fault diagnosis. J Manuf Syst 69:20–30. https://doi.org/10.1016/j.jmsy.2023.05.027
Bengio Y (2012) Practical recommendations for gradient-based training of deep architectures. arXiv:1206.5533
Feurer M, Hutter F (2019) Hyperparameter optimization. In: Hutter, F., Kotthoff, L., Vanschoren, J. (eds.) Automated machine learning: methods, systems, challenges, pp. 3–33. Springer. https://doi.org/10.1007/978-3-030-05318-5_1
Bergstra J, Bardenet R, Bengio Y, Kégl B (2011) Algorithms for hyper-parameter optimization. In: Proceedings of the 24th International conference on neural information processing systems. NIPS’11, pp. 2546–2554. Curran Associates Inc., Red Hook, NY, USA. https://dl.acm.org/doi/10.5555/2986459.2986743
Bishop CM, Nasrabadi NM (2006) Pattern recognition and machine learning, vol 4. Springer, New York, NY, USA
Chen Z, Mak S, Wu CFJ (2023) A hierarchical expected improvement method for Bayesian optimization. arXiv:1911.07285
Wu J, Chen X-Y, Zhang H, Xiong L-D, Lei H, Deng S-H (2019) Hyperparameter optimization for machine learning models based on Bayesian optimization. J Electr Sci Technol 17(1):26–40. https://doi.org/10.11989/JEST.1674-862X.80904120
Nogueira F (2014) Bayesian optimization: open source constrained global optimization tool for Python. https://github.com/bayesian-optimization/BayesianOptimization
Shahriari B, Swersky K, Wang Z, Adams RP, Freitas N (2016) Taking the human out of the loop: a review of Bayesian optimization. Proc IEEE 104(1):148–175. https://doi.org/10.1109/JPROC.2015.2494218
Brochu E, Brochu T, Freitas N (2010) A Bayesian interactive optimization approach to procedural animation design. In: Proceedings of the 2010 ACM SIGGRAPH/Eurographics symposium on computer animation. SCA ’10, pp. 103–112. Eurographics Association, Goslar, DEU. https://dl.acm.org/doi/abs/10.5555/1921427.1921443
Kim M, Ding Y, Malcolm P, Speeckaert J, Siviy CJ, Walsh CJ, Kuindersma S (2017) Human-in-the-loop Bayesian optimization of wearable device parameters. PLoS One. https://doi.org/10.1371/journal.pone.0184054
Acknowledgements
This work has been supported by the State Research Agency of the Spanish Government (Grants PID2019-107194GB-I00/AEI/10.13039/501100011033 and Project PID2023-147422OB-I00) and by the Xunta de Galicia (Grant ED431C 2022/44), supported by the EU European Regional Development Fund (ERDF). We wish to acknowledge support received from the Centro de Investigación de Galicia CITIC, funded by the Xunta de Galicia and ERDF (Grant ED431C 2022/44). Funding for open access charge: Universidade da Coruña/CISUG. The results published here are in whole or part based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.
Funding
The funding received has been stated in the acknowledgments section.
Author information
Contributions
David Vázquez-Lema: main programmer and developer; also writer, reviewer and editor. Eduardo Mosqueira-Rey: main writer of the manuscript; also reviewer and editor. Elena Hernández-Pereira: assistant writer for the interpretation tasks; also reviewer and editor. Carlos Fernández-Lozano: ML specialist in cancer diseases, acting as advisor on all the tasks of the paper; also reviewer and editor. Fernando Seara-Romero: software engineer in charge of implementing the interpretation task. Jorge Pombo-Otero: pathologist in charge of all the tasks involving HITL machine learning.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Vázquez-Lema, D., Mosqueira-Rey, E., Hernández-Pereira, E. et al. Segmentation, classification and interpretation of breast cancer medical images using human-in-the-loop machine learning. Neural Comput & Applic 37, 3023–3045 (2025). https://doi.org/10.1007/s00521-024-10799-7
DOI: https://doi.org/10.1007/s00521-024-10799-7