Abstract
Semantic maps play a key role in tasks such as mobile robot navigation. However, visual SLAM algorithms based on multi-view geometry do not make full use of the rich semantic information in the scene: the map points they retain are purely geometric and carry no semantics. Because convolutional neural networks have achieved breakthroughs in object detection, the instance segmentation algorithm Mask R-CNN is combined with SLAM to construct the semantic map. Mask R-CNN, however, tends to treat part of the image background as foreground, which makes the target segmentation inaccurate. The GrabCut segmentation algorithm, in turn, is time-consuming and tends to treat foreground as background, which cuts too far into object edges. To address both problems, this paper proposes a novel algorithm that combines Mask R-CNN and GrabCut segmentation. A comparison of Mask R-CNN, GrabCut, and the improved algorithm on the data set shows that the improved algorithm gives the best segmentation and significantly improves the accuracy of image target segmentation. These results demonstrate the effectiveness of the proposed algorithm.
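The abstract does not say how the two methods are combined. One plausible reading, sketched here purely as an assumption and not as the authors' method, is to let the Mask R-CNN instance mask seed GrabCut: the eroded core of the mask is marked as certain foreground, the rest of the mask as probable foreground, a thin band around it as probable background, and everything else as certain background. The function names and the `margin` parameter are hypothetical; the four label values match OpenCV's GrabCut constants.

```python
import numpy as np

# Label values used by OpenCV's GrabCut
# (cv2.GC_BGD, cv2.GC_FGD, cv2.GC_PR_BGD, cv2.GC_PR_FGD).
GC_BGD, GC_FGD, GC_PR_BGD, GC_PR_FGD = 0, 1, 2, 3


def dilate(mask: np.ndarray, steps: int) -> np.ndarray:
    """Grow a boolean mask by `steps` pixels (4-neighbourhood)."""
    out = mask.copy()
    for _ in range(steps):
        grown = out.copy()
        grown[1:, :] |= out[:-1, :]   # spread downwards
        grown[:-1, :] |= out[1:, :]   # spread upwards
        grown[:, 1:] |= out[:, :-1]   # spread to the right
        grown[:, :-1] |= out[:, 1:]   # spread to the left
        out = grown
    return out


def erode(mask: np.ndarray, steps: int) -> np.ndarray:
    """Shrink a boolean mask by dilating its complement."""
    return ~dilate(~mask, steps)


def trimap_from_mask(mask: np.ndarray, margin: int = 2) -> np.ndarray:
    """Turn a binary instance mask into a GrabCut initialisation map.

    `mask` stands in for one Mask R-CNN instance mask (hypothetical input);
    `margin` is an illustrative band width in pixels, not a value from the paper.
    """
    mask = mask.astype(bool)
    trimap = np.full(mask.shape, GC_BGD, dtype=np.uint8)
    trimap[dilate(mask, margin)] = GC_PR_BGD  # band just outside the mask
    trimap[mask] = GC_PR_FGD                  # mask itself: probably foreground
    trimap[erode(mask, margin)] = GC_FGD      # mask core: certainly foreground
    return trimap


if __name__ == "__main__":
    demo = np.zeros((9, 9), dtype=bool)
    demo[2:7, 2:7] = True  # a 5x5 "object"
    print(trimap_from_mask(demo, margin=1))
```

A map in this layout is what OpenCV's `cv2.grabCut` accepts as its mask argument when called in `cv2.GC_INIT_WITH_MASK` mode, so GrabCut only has to refine the uncertain band near the object boundary instead of segmenting the whole image from scratch.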
Acknowledgment
This work was supported by the National Key R&D Program of China (No. 2017YFB1301103), the Fundamental Research Fund for the Central Universities of China (N172604003, N172603001), the Doctoral Foundation of Liaoning Science and Technology Department (No. 20170520244), and the National Natural Science Foundation of China (Grant Nos. 61701101, U1713216, 61803077, 61603080).
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Wu, X., Wen, S., Xie, Ya. (2019). Improvement of Mask-RCNN Object Segmentation Algorithm. In: Yu, H., Liu, J., Liu, L., Ju, Z., Liu, Y., Zhou, D. (eds) Intelligent Robotics and Applications. ICIRA 2019. Lecture Notes in Computer Science, vol 11740. Springer, Cham. https://doi.org/10.1007/978-3-030-27526-6_51
DOI: https://doi.org/10.1007/978-3-030-27526-6_51
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-27525-9
Online ISBN: 978-3-030-27526-6
eBook Packages: Computer Science (R0)