RGB-D joint modelling with scene geometric information for indoor semantic segmentation

Liu, Hong; Wu, Wenshan; Wang, Xiangdong; Qian, Yueliang

doi:10.1007/s11042-018-6056-8

Published: 21 May 2018

Volume 77, pages 22475–22488, (2018)

Multimedia Tools and Applications

Hong Liu ORCID: orcid.org/0000-0003-4524-495X¹,
Wenshan Wu¹,
Xiangdong Wang¹ &
Yueliang Qian¹

Abstract

This paper addresses RGB-D semantic segmentation of indoor scenes. We introduce a novel gravity direction detection method, based on fitting vertical lines using both 2D visual information and 3D geometric information, to improve the original HHA depth encoding. Then, to fuse the two streams of deep convolutional networks that operate on the RGB image and on the depth encoding, we propose a joint modelling method that learns a weighted summing layer to fuse the prediction results. Finally, to refine the pixel-wise score maps, we adopt a fully-connected CRF as a post-processing step and propose a pairwise potential function that incorporates a normal kernel to exploit geometric information. Experimental results show that the proposed approach achieves state-of-the-art performance for RGB-D semantic segmentation on a public dataset.
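The fusion step mentioned in the abstract can be illustrated with a minimal sketch. The preview does not give the exact form of the weighted summing layer, so the code below assumes one scalar weight per class and per stream, applied to score maps of shape (classes, height, width). The function names, the fixed weights 0.6 and 0.4, and the random scores are all hypothetical; in the paper the weights are learned, not set by hand.

```python
import numpy as np


def softmax(scores, axis=0):
    """Numerically stable softmax along the class axis."""
    shifted = scores - scores.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def fuse_scores(rgb_scores, depth_scores, w_rgb, w_depth):
    """Per-class weighted sum of two (C, H, W) score maps.

    w_rgb and w_depth have shape (C,) and are broadcast over H and W.
    This per-class parameterisation is an assumption for illustration.
    """
    return (w_rgb[:, None, None] * rgb_scores
            + w_depth[:, None, None] * depth_scores)


# Stand-in score maps; real ones would come from the two network streams.
rng = np.random.default_rng(0)
num_classes, height, width = 4, 3, 5
rgb_scores = rng.normal(size=(num_classes, height, width))
hha_scores = rng.normal(size=(num_classes, height, width))

# Hypothetical fixed weights; a learned layer would optimise these.
w_rgb = np.full(num_classes, 0.6)
w_hha = np.full(num_classes, 0.4)

fused = fuse_scores(rgb_scores, hha_scores, w_rgb, w_hha)
labels = softmax(fused).argmax(axis=0)  # (H, W) map of predicted class ids
```

Per-class weights let one stream dominate for classes it predicts well, which is one plausible motivation for learning the sum instead of averaging the two streams equally.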

Acknowledgments

This work was supported in part by the Beijing Natural Science Foundation (grant 4142051).

Author information

Authors and Affiliations

Beijing Key Laboratory of Mobile Computing and Pervasive Device, Institute of Computing Technology, Chinese Academy of Sciences, Beijing, 100190, China
Hong Liu, Wenshan Wu, Xiangdong Wang & Yueliang Qian

Corresponding author

Correspondence to Hong Liu.

About this article

Cite this article

Liu, H., Wu, W., Wang, X. et al. RGB-D joint modelling with scene geometric information for indoor semantic segmentation. Multimed Tools Appl 77, 22475–22488 (2018). https://doi.org/10.1007/s11042-018-6056-8

Received: 15 September 2017
Revised: 04 April 2018
Accepted: 24 April 2018
Published: 21 May 2018
Issue Date: September 2018
DOI: https://doi.org/10.1007/s11042-018-6056-8
