1 Introduction

Pneumothorax is a lung abnormality in which air leaks into the space between the lung and the chest wall. It can be caused by chest injury or trauma, certain medical procedures, damage from underlying lung disease, or sometimes for no obvious reason [12]. The most common imaging tool for the diagnosis of pneumothorax is chest X-ray (CXR). Large pneumothorax can be fatal as air compression may cause significant impairment to circulation and respiration. Accordingly, large pneumothorax should be classified as a critical abnormality that requires immediate treatment, as shown in the latest computerized CXR triage study [1].

Since the diagnosis and treatment of critical pneumothorax are directly related to the size of the pneumothorax area, accurate pneumothorax segmentation, rather than image-level classification, is needed for the triage of CXR. However, as shown in a previous study [5], the performance of pneumothorax diagnosis in CXR is highly dependent on the physician's experience, implying that high-quality pixel-level annotations are difficult to obtain and can be very limited. On the other hand, large numbers of image-level annotations can be relatively easy to acquire with text analysis techniques on radiological reports [9]. Motivated by this, we propose a weakly supervised learning approach that aims to ease the requirement of annotating all training data at pixel level. Specifically, we allow parts of the training data to be weakly annotated with only image-level labels, i.e., pneumothorax or not.

In the literature, to boost the localization of abnormalities, Li et al. [7] developed an end-to-end deep multi-instance network to identify image-level abnormalities with annotated bounding boxes. Yan et al. [10] introduced a weakly-supervised deep learning framework for abnormality identification and localization. Cai et al. [2] further proposed an attention mining strategy to improve identification performance. However, these studies focused on providing rough heatmaps to indicate abnormal regions; precise pixel-level segmentation results cannot be acquired through these methods.

Fig. 1. Training process of our weakly supervised segmentation method. In the testing stage, only the segmentation model is used.

To achieve better synergy between the well- and weakly-annotated data, our approach employs the spatial label smoothing regularization (SLSR) technique to leverage the network-generated attention masks from the weakly-annotated data. Specifically, we realize the proposed method in two stages, see Fig. 1. First, an image-level classification (pneumothorax or not) model is implemented to obtain the attention masks. The attention masks from this classification model may roughly suggest the pneumothorax regions in the weakly-annotated data. Second, an image segmentation model is designed to use both well- and weakly-annotated data, where the attention masks for the weakly-annotated data are incorporated. Since the attention masks do not delineate the exact pneumothorax regions and may contain errors, we employ the SLSR technique to account for the uncertainty arising from the incorrectness of the attention masks during the training of the segmentation model. The label smoothing regularization technique addresses the label corruption issue by treating labels as probability variables with slight numerical perturbation, and has been shown to be robust to noisy labels [13]. Our experimental results show that our method can improve the performance of the segmentation model with both well- and weakly-annotated data. The contribution of this paper is two-fold. First, a novel weakly supervised framework is proposed to equip deep learning segmentation with the capability of learning from well- and weakly-annotated data. Second, we demonstrate the effectiveness of the weakly supervised framework on the pneumothorax segmentation problem in CXR images. As shown in Fig. 2, pneumothorax segmentation is difficult, as several issues, such as inter-subject variations and the variety of pneumothorax degree, location, and shape, need to be addressed. Pneumothorax segmentation can help identify critical cases and expedite the treatment process.

Fig. 2. Several pneumothorax cases. Blue masks indicate the pneumothorax regions labeled by an experienced radiologist. Inter-subject variations, as well as the variety of pneumothorax degree, location, and shape, can be observed. The last case is hydropneumothorax with pleural effusion and pneumothorax. (Color figure online)

2 Methodology

As shown in Fig. 1, the proposed weakly supervised approach consists of two steps: (1) image-level classification that generates attention masks at the same time; (2) pixel-level segmentation with SLSR that leverages attention masks. The details will be elaborated as follows.

2.1 Image-Level Classification

Image-level classification for our task (pneumothorax or not) is carried out with the ResNet-101 model [11]. The attention mining method, Guided Attention Inference Network (GAIN) [6], is performed to generate the attention masks. The obtained attention masks are 1/8 of the input size, and they are further resized via bilinear interpolation to match the input size required by the pixel-level segmentation model.
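The upsampling of the attention masks can be sketched as follows. This is a minimal numpy implementation of bilinear interpolation for illustration only; the function name and the 32-to-256 sizes are assumptions (an actual pipeline would typically use a framework routine).

```python
import numpy as np

def bilinear_resize(mask, out_h, out_w):
    """Bilinearly resize a 2-D attention mask to (out_h, out_w)."""
    in_h, in_w = mask.shape
    # Source coordinates for each output pixel.
    ys = np.linspace(0.0, in_h - 1, out_h)
    xs = np.linspace(0.0, in_w - 1, out_w)
    y0 = np.floor(ys).astype(int)
    x0 = np.floor(xs).astype(int)
    y1 = np.minimum(y0 + 1, in_h - 1)
    x1 = np.minimum(x0 + 1, in_w - 1)
    wy = (ys - y0)[:, None]  # vertical interpolation weights
    wx = (xs - x0)[None, :]  # horizontal interpolation weights
    top = mask[y0][:, x0] * (1 - wx) + mask[y0][:, x1] * wx
    bottom = mask[y1][:, x0] * (1 - wx) + mask[y1][:, x1] * wx
    return top * (1 - wy) + bottom * wy

# An attention mask at 1/8 resolution (32x32) resized to the 256x256 input size.
att = np.random.rand(32, 32)
resized = bilinear_resize(att, 256, 256)
```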

2.2 Pixel-Level Segmentation

The SLSR technique is developed to leverage the attention masks of the weakly-annotated data for better performance and to mitigate the limited availability of well-annotated data. Since the attention masks only capture rough pneumothorax regions and may contain errors, we explore the uncertainty of the attention masks by numerically perturbing the one-hot label distribution into the probabilistic domain, as shown in Fig. 3. Specifically, let \(k \in \{0,1\}\) be the object class, where class 0 stands for the background and class 1 for pneumothorax.

Fig. 3. Perturbation of the one-hot label distribution into the probabilistic domain. For the left real mask, the label is either 1 or 0, whereas labels in the right attention mask are perturbed with \(\varepsilon \).

Assuming that the label prior distribution is uniform, the ground-truth label distribution with the consideration of potential label corruption at each pixel \(PIX_{i,j}\) (i and j refer to the row and column indices, respectively) is defined as:

$$\begin{aligned} \small {q_{i,j}}(k) = {\left\{ \begin{array}{ll} \frac{\varepsilon }{2}, &{} k \ne {y_{i,j}}\\ 1 - \varepsilon + \frac{\varepsilon }{2}, &{} k = {y_{i,j}} \end{array}\right. } = {\left\{ \begin{array}{ll} \frac{\varepsilon }{2}, &{} k \ne {y_{i,j}}\\ 1 - \frac{\varepsilon }{2}, &{} k = {y_{i,j}}, \end{array}\right. } \end{aligned}$$
(1)

where \(y_{i,j}\) is the ground-truth class of the pixel \(PIX_{i,j}\), and \(\varepsilon \) is the perturbing parameter. The uncertainty for an attention mask can be categorized into two types: all-class and ground-truth uncertainty. The all-class uncertainty suggests that each pixel in an attention mask has equal potential to be either pneumothorax or background with the probability of \(\varepsilon /2\). The ground-truth uncertainty specifies that the pneumothorax region in an attention mask may have errors and the corresponding probabilities of the pixels are perturbed with \(\varepsilon \).
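The perturbed label distribution of Eq. 1 can be sketched in a few lines of numpy. This is a minimal illustration; the \((H, W, 2)\) array layout and the function name are assumptions for the sketch.

```python
import numpy as np

def smoothed_label_distribution(y, eps=0.1):
    """Eq. 1: per-pixel label distribution q over the classes {0, 1}.

    y   : (H, W) integer mask, 0 = background, 1 = pneumothorax.
    eps : perturbing parameter.
    Returns a (H, W, 2) array with q[i, j, k].
    """
    # Every class first receives eps/2 (the case k != y_ij) ...
    q = np.full(y.shape + (2,), eps / 2.0)
    # ... then the ground-truth class is raised to 1 - eps/2 (the case k == y_ij).
    np.put_along_axis(q, y[..., None], 1.0 - eps / 2.0, axis=-1)
    return q

y = np.array([[0, 1], [1, 0]])
q = smoothed_label_distribution(y, eps=0.1)
```

Note that each pixel's distribution still sums to one, as a probability distribution must.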

Let \({p_{i,j}}(k)\) be the predicted probability of class k from the network, obtained from the softmax function after the final convolutional layer of the segmentation model. With Eq. 1, the cross-entropy loss of the pixel \(PIX_{i,j}\) can be further computed as:

$$\begin{aligned} \small {l_{i,j}} = - \sum \limits _{k = 0}^1 {\log ({p_{i,j}}(k)){q_{i,j}}(k)}=- (1 - \varepsilon )\log ({p_{i,j}}(y_{i,j})) - \frac{\varepsilon }{{2}}\sum \limits _{k = 0}^1 {\log ({p_{i,j}}(k))}. \end{aligned}$$
(2)
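The rewriting in Eq. 2 can be checked numerically: the cross-entropy against the smoothed distribution q from Eq. 1 equals the two-term form on the right-hand side. A small self-contained numpy check, where the probability values are arbitrary examples:

```python
import numpy as np

eps = 0.1
p = np.array([0.3, 0.7])  # softmax output p_ij(k) for one pixel, k in {0, 1}
y = 1                     # ground-truth class of the pixel

# Left-hand side of Eq. 2: cross-entropy against the smoothed distribution q.
q = np.array([eps / 2 if k != y else 1 - eps / 2 for k in (0, 1)])
lhs = -np.sum(q * np.log(p))

# Right-hand side of Eq. 2: the two-term form.
rhs = -(1 - eps) * np.log(p[y]) - (eps / 2) * np.sum(np.log(p))

assert np.isclose(lhs, rhs)  # the two forms agree
```

The identity follows from splitting \(1 - \varepsilon /2\) into \((1 - \varepsilon ) + \varepsilon /2\) and absorbing the latter into the sum over both classes.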

Accordingly, the loss function for a weakly-annotated image with the attention mask is the summation of \(l_{i,j}\) over all pixels (H and W are the height and width of the input image), which is defined as:

$$\begin{aligned} \begin{aligned} \small l = \sum \limits _{i = 1}^H {\sum \limits _{j = 1}^W {l_{i,j}} } = - (1 - \varepsilon )\sum \limits _{i = 1}^H {\sum \limits _{j = 1}^W {\log ({p_{i,j}}(y_{i,j}))} } - \frac{\varepsilon }{{2}}\sum \limits _{k = 0}^1 {\sum \limits _{i = 1}^H {\sum \limits _{j = 1}^W {\log ({p_{i,j}}(k))} } } . \end{aligned} \end{aligned}$$
(3)

Since the training data comprise well- and weakly-annotated data, we further introduce an indicator variable, z, to specify whether a sample is well or weakly annotated. Meanwhile, referring to Eq. 3, the first term is only effective for the ground-truth class, whereas the second term considers both the foreground and background classes. Since in most cases the number of background pixels is significantly larger than the number of foreground pixels, we implement a weighting factor to mitigate the pixel sample imbalance issue. Specifically, the loss function is further defined as:

$$\begin{aligned} \begin{aligned} {l_{SLSR}} =&- (1 - z \cdot \varepsilon )\sum \limits _{i = 1}^H {\sum \limits _{j = 1}^W {\log ({p_{i,j}}(y_{i,j}))}} \\&- z\sum \limits _{k = 0}^1 {\left\{ {(1 + \sum \limits _{i = 1}^H {\sum \limits _{j = 1}^W {I(k = {y_{i,j}})} } ) \cdot \frac{\varepsilon }{{2HW}} \cdot \sum \limits _{i = 1}^H {\sum \limits _{j = 1}^W {\log ({p_{i,j}}(k))} } } \right\} }, \end{aligned} \end{aligned}$$
(4)

where \({I(k={y_{i,j}})}\) is an indicator function to count the ground-truth pixels of the k-th class. For a well-annotated image sample, we set \(z = 0\), while \(z=1\) for a weakly-annotated sample. With the implementation of Eq. 4, the pixel-level segmentation model can be trained with both well- and weakly-annotated data and still attain satisfactory performance.
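Equation 4 can be prototyped directly from its terms. Below is a minimal numpy sketch of the per-image loss; the \((H, W, 2)\) probability layout and the function name are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def slsr_loss(p, y, z, eps=0.1):
    """Eq. 4: SLSR loss for one image.

    p   : (H, W, 2) softmax probabilities from the segmentation network.
    y   : (H, W) integer labels (real mask if well-annotated, attention mask if weak).
    z   : 0 for a well-annotated sample, 1 for a weakly-annotated sample.
    eps : perturbing parameter.
    """
    H, W = y.shape
    log_p = np.log(p)
    # First term: cross-entropy against the ground-truth (or attention) labels.
    log_p_y = np.take_along_axis(log_p, y[..., None], axis=-1)[..., 0]
    loss = -(1.0 - z * eps) * log_p_y.sum()
    # Second term: class-weighted smoothing, active only for weak samples (z = 1).
    for k in (0, 1):
        n_k = (y == k).sum()  # ground-truth pixel count of class k
        loss -= z * (1 + n_k) * eps / (2.0 * H * W) * log_p[..., k].sum()
    return loss

# Toy example: a 2x2 image where every pixel is predicted pneumothorax with p = 0.7.
p = np.stack([np.full((2, 2), 0.3), np.full((2, 2), 0.7)], axis=-1)
y = np.ones((2, 2), dtype=int)     # labels: all pneumothorax
loss_well = slsr_loss(p, y, z=0)   # reduces to the plain cross-entropy
loss_weak = slsr_loss(p, y, z=1)   # smoothed, class-weighted variant
```

Setting z = 0 zeroes the second term and restores the vanilla cross-entropy, matching the behavior described for well-annotated samples.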

3 Experiments

3.1 Dataset

In total, 5400 frontal-view chest X-ray images were collected from the Mianyang Central Hospital, Mianyang, Sichuan, China, with IRB approval. Specifically, 3400 images were diagnosed with pneumothorax, whereas the remaining 2000 images are normal cases. 800 pneumothorax images are well-annotated with pixel-level ground-truth masks by an experienced radiologist, and the remaining 2600 pneumothorax images only have image-level labels. The 800 well-annotated data and 2000 normal cases are randomly and evenly split into 2 groups for training and testing. Therefore, the total number of testing data is 1400, whereas the number of available data for training is 4000.

3.2 Experimental Settings

For the segmentation network, three state-of-the-art (SOTA) networks are implemented: U-Net [8], LinkNet [3], and Tiramisu [4]. For Tiramisu, the specific FCDenseNet67 variant is employed. For all three models, the Adam optimizer is employed with a momentum of 0.9. The learning rate is initialized to 0.0001 and decayed by a factor of 0.1 every 100 epochs. The size of the network inputs is \(256 \times 256\). For U-Net and LinkNet, we set the batch size to 32, whereas the batch size of the Tiramisu model is set to 5. The perturbing parameter \(\varepsilon \) is set to 0.1.
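The step decay schedule described above can be written as a one-line function. This is a sketch of the stated schedule only; in practice a framework scheduler would be used.

```python
def learning_rate(epoch, base_lr=1e-4, gamma=0.1, step=100):
    """Initial LR 0.0001, decayed by a factor of 0.1 every 100 epochs."""
    return base_lr * gamma ** (epoch // step)
```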

Table 1. Performance of the different SOTA models; Tiramisu achieves the best performance.
Table 2. Performance w.r.t. different pneumothorax degrees.

3.3 Experimental Results

Results of SOTA Networks. In this experiment, we aim to illustrate the performance limitation of the SOTA networks on the pneumothorax segmentation problem. Specifically, the training data for segmentation, i.e., 400 well-annotated and 1000 normal CXR images, are employed to train the three SOTA networks. The results of the SOTA networks on the 1400 testing data with the IoU (intersection over union) metric are shown in Table 1. As can be found, the Tiramisu model achieves the best performance with an IoU value of 0.640, which suggests that the pneumothorax segmentation problem is quite difficult. To further investigate the effect of pneumothorax severity on segmentation efficacy, the 400 pneumothorax cases are further divided into three groups of small, medium, and large, based on collapse ratios of \(\le 0.10\), 0.10–0.30, and \(\ge 0.30\), respectively. The collapse ratio of the pneumothorax region against the lung field is one of the common quantitative metrics for measuring pneumothorax severity. The small, medium, and large groups comprise 156, 142, and 102 cases, respectively. The performance of the Tiramisu model on the three groups is shown in Table 2. Since the IoU is very sensitive to object size, the IoU value for the small group is not very high. Small pneumothorax is usually less critical than large pneumothorax, and sometimes may heal on its own.
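The IoU metric and the severity grouping can be sketched as follows. The handling of the boundary values 0.10 and 0.30 is an assumption, since the stated criteria overlap at the endpoints, and the function names are illustrative.

```python
import numpy as np

def iou(pred, gt):
    """Intersection over union between two binary masks."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    union = np.logical_or(pred, gt).sum()
    if union == 0:
        return 1.0  # both masks empty: treat as a perfect match
    return np.logical_and(pred, gt).sum() / union

def severity_group(collapse_ratio):
    """Bucket a case by the collapse ratio of the pneumothorax against the lung field."""
    if collapse_ratio <= 0.10:
        return "small"
    elif collapse_ratio < 0.30:
        return "medium"
    return "large"
```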

Table 3. The results with different combinations of well- and weakly-annotated data. The column “Test” reports the IoU performance on the entire testing data, whereas the columns “Small”, “Medium” and “Large” report the IoU performances on the three severity groups, respectively.

Efficacy of the SLSR. In this experiment, the efficacy of the SLSR in the weakly supervised framework is illustrated. Referring to Table 1, we here only consider the Tiramisu model. The image-level classification model (GAIN) is first trained with the 2600 weakly-annotated pneumothorax and 1000 normal images to obtain the attention masks. We use this model to generate attention masks for all test images and take these results as segmentation predictions, yielding an IoU value of 0.128, which serves as the baseline of only using image-level annotations. Afterward, the segmentation models are trained with different ratios between well- and weakly-annotated data. Specifically, we set 4 groups of experiments that include 100, 200, 300, and 400 well-annotated pneumothorax cases. For each group, we first train a Tiramisu model using the selected well-annotated pneumothorax cases and 1000 normal images. Then, we also add the selected well-annotated cases to retrain the GAIN model, which helps to improve the quality of the generated attention masks. Finally, we consider several experiments that add 0, 200, 400, and 800 weakly-annotated pneumothorax cases to train the Tiramisu models with the SLSR loss (Eq. 4).

The experimental results are shown in Table 3. As can be found, involving more well-annotated data improves the segmentation performance. Meanwhile, for each group of experiments, it can be observed from Table 3 that the best synergy is achieved when the numbers of weakly- and well-annotated data are close. In particular, when 200 weakly-annotated data are added to the 300 well-annotated group, the best segmentation performance of 0.637 IoU can be achieved, which is close to the 0.640 IoU achieved by the Tiramisu model in the SOTA experiment. Meanwhile, the models with 200–800 weakly-annotated data in the 400 well-annotated group outperform the model with only 400 well-annotated data, and the best performance is achieved by the model with 400 weakly-annotated and 400 well-annotated data (0.669 IoU). We also conduct an experiment on the setting of \(\varepsilon \), with the results given in Table 4, where \(\varepsilon =0\) corresponds to the vanilla cross-entropy loss. We can see that the setting of \(\varepsilon =0.1\) achieves the best performance.

Table 4. Results of different settings for \(\varepsilon \) value.
Fig. 4. Visualization of results.

Visualization Results. The intermediate results of three testing data with different degrees of pneumothorax severity are shown in Fig. 4. As can be found, the segmentation results with only 200 well-annotated data (without including weakly-annotated data) are not very promising. However, with the inclusion of more well-annotated data and an equal number of weakly-annotated data, better performance can be achieved. Meanwhile, the attention maps of the classification model with image-level annotations are also shown in Fig. 4. The attention maps are very rough and contain too many errors to precisely segment and measure the pneumothorax for the application of CXR triage. More examples can be found in the supplementary materials.

4 Conclusion

A novel spatial label smoothing regularization method has been developed to explore the uncertainty of weakly-annotated data in a weakly supervised segmentation framework. As can be observed in Table 3 and Fig. 4, the proposed method can relieve the need for well-annotated data while achieving competitive performance. The proposed method has been evaluated with extensive experiments on the difficult pneumothorax segmentation problem in CXR images.