Abstract
Computer vision is becoming increasingly important in agriculture, as it can provide valuable insights, support better-informed decisions, and reduce costs. However, working in the agricultural domain introduces significant challenges, such as adverse conditions, small structures, and a lack of large datasets, which hinder wide adoption in many use cases. This work presents an approach to improve the detection of challenging small objects by exploiting their spatial structure, under the hypothesis that they are located close to larger objects, which we define as anchors. This is achieved by providing feature maps from the detections of the anchor class to the network responsible for detecting the secondary class. Thus, secondary-class object detection is formulated as a residual problem on top of the anchor-class detection, benefiting from an activation bias close to the spatial locations of the anchor objects. Experiments on the grape-stem and capsicum-peduncle cases demonstrate increased performance against more computationally expensive baselines, with improved metrics at 37% of the baseline FLOPs.
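The residual formulation described above can be illustrated with a toy sketch. This is not the authors' architecture: the additive form, the function name `residual_secondary_scores`, and the parameters `anchor_weight` and `bias` are assumptions made purely for illustration. It only shows how an anchor-class activation map can bias secondary-class scores toward locations near the anchor, so that the secondary branch has to learn only a residual.

```python
import math


def sigmoid(x):
    """Logistic function mapping a logit to a score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def residual_secondary_scores(anchor_map, residual_map,
                              anchor_weight=2.0, bias=-3.0):
    """Combine an anchor activation map with a secondary-class residual.

    Hypothetical additive form (an assumption, not taken from the paper):
        score = sigmoid(bias + anchor_weight * anchor + residual)
    """
    return [
        [sigmoid(bias + anchor_weight * a + r) for a, r in zip(a_row, r_row)]
        for a_row, r_row in zip(anchor_map, residual_map)
    ]


# Toy 1x4 maps: the anchor (e.g. a grape bunch) is active in the first two cells.
anchor_map = [[1.0, 1.0, 0.0, 0.0]]
# The hypothetical secondary branch responds equally weakly at cells 1 and 3.
residual_map = [[0.0, 1.5, 0.0, 1.5]]

scores = residual_secondary_scores(anchor_map, residual_map)
# Threshold the scores to obtain per-cell secondary-class detections.
detections = [[s > 0.5 for s in row] for row in scores]
```

In this toy setting, the same weak secondary response is accepted only where the anchor map is active, which is the kind of spatial prior the abstract refers to as an activation bias.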
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zampokas, G., Mariolis, I., Giakoumis, D., Tzovaras, D. (2023). Residual Cascade CNN for Detection of Spatially Relevant Objects in Agriculture: The Grape-Stem Paradigm. In: Christensen, H.I., Corke, P., Detry, R., Weibel, JB., Vincze, M. (eds) Computer Vision Systems. ICVS 2023. Lecture Notes in Computer Science, vol 14253. Springer, Cham. https://doi.org/10.1007/978-3-031-44137-0_14
DOI: https://doi.org/10.1007/978-3-031-44137-0_14
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-44136-3
Online ISBN: 978-3-031-44137-0
eBook Packages: Computer Science, Computer Science (R0)