Abstract
The CoAtNet deep neural model has been shown to achieve state-of-the-art performance by stacking convolutional and self-attention layers. In particular, the initial layers of CoAtNet apply efficient convolutions to extract local features from the input image and the initial fine-resolution feature maps. In turn, the final layers apply more computationally demanding Transformers to extract global features from the coarse-resolution feature maps. The model’s output depends directly on those final global features. This paper proposes an extension of the original CoAtNet model that introduces a dual stream of convolution and self-attention blocks at the final layers of CoAtNet. In this way, those final layers automatically aggregate both local and global features extracted from the initial feature maps. Two dual-stream topologies are proposed and evaluated. The resulting Dual-Stream CoAtNet model exhibits a significant improvement in the segmentation accuracy of breast ultrasound images, thus contributing to the development of more robust tumor detection methods.
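To make the dual-stream idea concrete, the following is a minimal sketch, not the authors' implementation. It assumes a single-channel coarse feature map, uses a 3x3 mean filter as a stand-in for a convolution block, applies single-head self-attention with identity projections (no learned weights), and fuses the two streams by element-wise addition. The function names `conv_stream`, `attention_stream` and `dual_stream_block`, as well as the fusion rule, are hypothetical; the abstract does not specify how the two proposed topologies combine the streams.

```python
import math

# Toy dual-stream block on a 2-D list of floats (one channel).
# ASSUMPTIONS: mean filter instead of learned convolution, identity
# query/key/value projections, fusion by addition. None of these choices
# is taken from the paper.


def conv_stream(fmap):
    """Local stream: 3x3 mean filter with zero padding."""
    h, w = len(fmap), len(fmap[0])
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in (-1, 0, 1):
                for dj in (-1, 0, 1):
                    y, x = i + di, j + dj
                    if 0 <= y < h and 0 <= x < w:
                        total += fmap[y][x]
            out[i][j] = total / 9.0  # out-of-bounds neighbours count as zero
    return out


def attention_stream(fmap):
    """Global stream: every position attends to every other position."""
    h, w = len(fmap), len(fmap[0])
    tokens = [v for row in fmap for v in row]  # flatten to a token sequence
    out = []
    for q in tokens:
        scores = [q * k for k in tokens]  # dot product of 1-d query and key
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        z = sum(weights)
        out.append(sum(wt * v for wt, v in zip(weights, tokens)) / z)
    return [out[r * w:(r + 1) * w] for r in range(h)]  # back to h x w


def dual_stream_block(fmap):
    """Run both streams on the same input and add their outputs."""
    local = conv_stream(fmap)
    global_ = attention_stream(fmap)
    return [[a + b for a, b in zip(lr, gr)] for lr, gr in zip(local, global_)]


if __name__ == "__main__":
    coarse = [[0.0, 1.0, 0.0], [1.0, 2.0, 1.0], [0.0, 1.0, 0.0]]
    for row in dual_stream_block(coarse):
        print(row)
```

In this sketch each output position mixes a value computed from its 3x3 neighbourhood with a value computed from the whole map, which is the local-plus-global aggregation the abstract describes. A real model would use learned multi-channel convolutions and multi-head attention, and could fuse the streams differently, for example by concatenation.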
Data availability
The public dataset used in this study is openly accessible through the cited source. To access the private dataset used in this research, interested researchers can contact the UDIAT Diagnostic Centre directly, as indicated in the corresponding citation.
References
Feng J, Polychronidis G, Heger U, Frongia G, Mehrabi A, Hoffmann K (2019) Incidence trends and survival prediction of hepatoblastoma in children: a population-based study. Cancer Commun 39:1–9
Huang Q, Huang Y, Luo Y, Yuan F, Li X (2020) Segmentation of breast ultrasound image with semantic classification of superpixels. Med Image Anal 61:101657
Guo Z, Xie J, Wan Y, Zhang M, Qiao L, Yu J, Chen S, Li B, Yao Y (2022) A review of the current state of the computer-aided diagnosis (cad) systems for breast cancer diagnosis. Open Life Sci 17:1600–1611
Xian M, Zhang Y, Cheng H-D, Xu F, Huang K, Zhang B, Ding J, Ning C, Wang Y (2018) A benchmark for breast ultrasound image segmentation (BUSIS), Infinite Study
Dai Z, Liu H, Le QV, Tan M (2021) Coatnet: Marrying convolution and attention for all data sizes, CoRR abs/2106.04803. arXiv:2106.04803
Göçeri E (2017) Intensity normalization in brain mr images using spatially varying distribution matching, in: International conference on computer graphics, visualization, computer vision and image processing, pp. 300–304
Göçeri E (2018) Fully automated and adaptive intensity normalization using statistical features for brain mr images. Celal Bayar University Journal of Science 14:125–134
Hardaha S, Edla DR, Parne SR (2023) A survey on convolutional neural networks for mri analysis. Wireless Pers Commun 128:1065–1085
Göçeri E (2020) Convolutional neural network based desktop applications to classify dermatological diseases, in: 2020 IEEE 4th international conference on image processing, applications and systems (IPAS), IEEE, pp. 138–143
Idlahcen F, Idri A, Göçeri E (2024) Exploring data mining and machine learning in gynecologic oncology. Artif Intell Rev 57:20
Ronneberger O, Fischer P, Brox T (2015) U-net: Convolutional networks for biomedical image segmentation, in: Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015: 18th International Conference, Munich, Germany, October 5-9, 2015, Proceedings, Part III 18, Springer, pp. 234–241
Zhou Z, Rahman Siddiquee MM, Tajbakhsh N, Liang J (2018) Unet++: A nested u-net architecture for medical image segmentation, in: Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support: 4th International Workshop, DLMIA 2018, and 8th International Workshop, ML-CDS 2018, Held in Conjunction with MICCAI 2018, Granada, Spain, September 20, 2018, Proceedings 4, Springer, pp. 3–11
Diakogiannis FI, Waldner F, Caccetta P, Wu C (2019) Resunet-a: a deep learning framework for semantic segmentation of remotely sensed data, CoRR abs/1904.00592. arXiv:1904.00592
Huang K, Zhang Y, Cheng H-D, Xing P, Zhang B (2019) Fuzzy semantic segmentation of breast ultrasound image with breast anatomy constraints, arXiv preprint arXiv:1909.06645
Nair AA, Washington KN, Tran TD, Reiter A, Bell MAL (2020) Deep learning to obtain simultaneous image and segmentation outputs from a single input of raw ultrasound channel data. IEEE Trans Ultrason Ferroelectr Freq Control 67:2493–2509
Zhuang Z, Li N, Joseph Raj AN, Mahesh VG, Qiu S (2019) An rdau-net model for lesion segmentation in breast ultrasound images. PLoS ONE 14:e0221535
Zaidkilani N, Abdel-Nasser M, Garcia MA, Puig D (2022) Breast ultrasound cad system based on efficient tumour segmentation network and transfer-learned features, in: 2022 5th International conference on multimedia, signal processing and communication technologies (IMPACT), IEEE, pp. 1–5
Shareef B, Xian M, Vakanski A (2020) Stan: small tumor-aware network for breast ultrasound image segmentation, in: 2020 IEEE 17th International symposium on biomedical imaging (ISBI), IEEE, pp. 1–5
Vakanski A, Xian M, Freer PE (2020) Attention-enriched deep learning model for breast tumor segmentation in ultrasound images. Ultrasound Med Biol 46:2819–2833
Deng E, Qin Z, Chen D, Qin Z, Ding Y, Geng J, Zhang N (2022) Engan: Enhancement generative adversarial network in medical image segmentation
Byra M, Jarosik P, Szubert A, Galperin M, Ojeda-Fournier H, Olson L, O’Boyle M, Comstock C, Andre M (2020) Breast mass segmentation in ultrasound with selective kernel u-net convolutional neural network. Biomed Signal Process Control 61:102027
Zhou Q, Wang Q, Bao Y, Kong L, Jin X, Ou W (2022) Laednet: a lightweight attention encoder-decoder network for ultrasound medical image segmentation. Comput Electr Eng 99:107777
Xu M, Huang K, Qi X (2023) A regional-attentive multi-task learning framework for breast ultrasound image segmentation and classification. IEEE Access 11:5377–5392
Zhang S, Liao M, Wang J, Zhu Y, Zhang Y, Zhang J, Zheng R, Lv L, Zhu D, Chen H et al (2023) Fully automatic tumor segmentation of breast ultrasound images with deep learning. J Appl Clin Med Phys 24:e13863
Tang F, Ding J, Wang L, Xian M, Ning C (2023) Multi-level global context cross consistency model for semi-supervised ultrasound image segmentation with diffusion model, arXiv preprint arXiv:2305.09447
Ahmed S, Hasan MK (2023) Coma-net: towards generalized medical image segmentation using complementary attention guided bipolar refinement modules. Biomed Signal Process Control 86:105198
Ta N, Chen H, Liu X, Jin N (2023) Let-net: locally enhanced transformer network for medical image segmentation. Multimedia Syst 29:3847–3861
Heidari M, Kazerouni A, Soltany M, Azad R, Aghdam EK, Cohen-Adad J, Merhof D (2023) Hiformer: Hierarchical multi-scale representations using transformers for medical image segmentation, in: Proceedings of the IEEE/CVF winter conference on applications of computer vision, pp. 6202–6212
Yuan F, Zhang Z, Fang Z (2023) An effective cnn and transformer complementary network for medical image segmentation. Pattern Recogn 136:109228
Dar MF, Ganivada A (2023) Efficientu-net: a novel deep learning method for breast tumor segmentation and classification in ultrasound images. Neural Process Lett 55:10439–10462
Yang L, Fan C, Lin H, Qiu Y (2023) Rema-net: an efficient multi-attention convolutional neural network for rapid skin lesion segmentation. Comput Biol Med 159:106952
Ahmed MR, Ashrafi AF, Ahmed RU, Shatabda S, Islam AM, Islam S (2023) Doubleu-netplus: a novel attention and context-guided dual u-net with multi-scale residual feature fusion network for semantic segmentation of medical images. Neural Comput Appl 35:14379–14401
Hekal AA, Elnakib A, Moustafa HE-D, Amer HM (2024) Breast cancer segmentation from ultrasound images using deep dual-decoder technology with attention network. IEEE Access
Zhang H, Lian J, Yi Z, Wu R, Lu X, Ma P, Ma Y (2024) Hau-net: hybrid cnn-transformer for breast ultrasound image segmentation. Biomed Signal Process Control 87:105427
Üzen H (2024) Convmixer-based encoder and classification-based decoder architecture for breast lesion segmentation in ultrasound images. Biomed Signal Process Control 89:105707
Chicco D, Jurman G (2020) The advantages of the matthews correlation coefficient (mcc) over f1 score and accuracy in binary classification evaluation. BMC Genomics 21:1–13
Berman M, Triki AR, Blaschko MB (2018) The lovász-softmax loss: A tractable surrogate for the optimization of the intersection-over-union measure in neural networks, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 4413–4421
Al-Dhabyani W, Gomaa M, Khaled H, Fahmy A (2020) Dataset of breast ultrasound images. Data Brief 28:104
Göçeri E (2023) Medical image data augmentation: techniques, comparisons and interpretations. Artif Intell Rev 56:12561–12605
Göçeri E (2023) Comparison of the impacts of dermoscopy image augmentation methods on skin cancer classification and a new augmentation method with wavelet packets. Int J Imaging Syst Technol 33:1727–1744
Göçeri E (2020) Image augmentation for deep learning based lesion classification from skin images, in: 2020 IEEE 4th International conference on image processing, applications and systems (IPAS), IEEE, pp. 144–148
Zagoruyko S, Komodakis N (2016) Wide residual networks, arXiv preprint arXiv:1605.07146
Xie S, Girshick R, Dollár P, Tu Z, He K (2017) Aggregated residual transformations for deep neural networks, in: Proceedings of the IEEE conference on computer vision and pattern recognition, pp. 1492–1500
Oktay O, Schlemper J, Folgoc LL, Lee M, Heinrich M, Misawa K, Mori K, McDonagh S, Hammerla NY, Kainz B, et al (2018) Attention u-net: learning where to look for the pancreas, arXiv preprint arXiv:1804.03999
Chen L-C, Papandreou G, Schroff F, Adam H (2017) Rethinking atrous convolution for semantic image segmentation, arXiv preprint arXiv:1706.05587
Badrinarayanan V, Kendall A, Cipolla R (2017) Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell 39:2481–2495
Sovrasov V (2019) Flops counter for convolutional networks in pytorch framework. https://github.com/sovrasov/flops-counter.pytorch/
Göçeri E (2023) Evaluation of denoising techniques to remove speckle and gaussian noise from dermoscopy images. Comput Biol Med 152:106474
Muthana R, Alshareefi AN (2020) Techniques in de-blurring image, in: Journal of physics: conference series, volume 1530, IOP Publishing, p. 012115
Awad A (2019) Denoising images corrupted with impulse, gaussian, or a mixture of impulse and gaussian noise. Eng Sci Technol Int J 22:746–753
Rajagopal A, Hamilton RB, Scalzo F (2016) Noise reduction in intracranial pressure signal using causal shape manifolds. Biomed Signal Process Control 28:19–26
Ilesanmi AE, Idowu OP, Chaumrattanakul U, Makhanov SS (2021) Multiscale hybrid algorithm for pre-processing of ultrasound images. Biomed Signal Process Control 66:102396
Hooi FM, Kripfgans O, Carson PL (2016) Acoustic attenuation imaging of tissue bulk properties with a priori information. J Acoust Soc Am 140:2113–2122
Biswas B, Sen BK, Dey KN (2018) Ultrasound medical image deblurring and denoising method using variational model on cuda. Adv Comput Syst Secur 5:95–108
Göçeri E (2024) Polyp segmentation using a hybrid vision transformer and a hybrid loss function. J Imaging Inform Med 1–13
Göçeri E (2021) An application for automated diagnosis of facial dermatological diseases. İzmir Katip Çelebi Üniversitesi Sağlık Bilimleri Fakültesi Dergisi 6:91–99
Göçeri E (2023) Classification of skin cancer using adjustable and fully convolutional capsule layers. Biomed Signal Process Control 85:104949
Göçeri E (2021) Analysis of capsule networks for image classification, in: International conference on computer graphics, visualization, computer vision and image processing
Göçeri E (2021) Capsule neural networks in classification of skin lesions, in: International conference on computer graphics, visualization, computer vision and image processing, pp. 29–36
Acknowledgements
The Spanish Government partly supported this research through Project TED2021-130081B-C21 and Project PDC2022-133383-I00.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
About this article
Cite this article
Zaidkilani, N., Garcia, M.A. & Puig, D. Dual-Stream CoAtNet models for accurate breast ultrasound image segmentation. Neural Comput & Applic 36, 16427–16443 (2024). https://doi.org/10.1007/s00521-024-09963-w