Abstract
The automatic image annotation is an effective computer operation that predicts the annotation of an unknown image by automatically learning potential relationships between the semantic concept space and the visual feature space in the annotation image dataset. Usually, the auto-labeling image includes the processing: learning processing and labeling processing. Existing image annotation methods that employ convolutional features of deep learning methods have a number of limitations, including complex training and high space/time expenses associated with the image annotation procedure. Accordingly, this paper proposes an innovative method in which the visual features of the image are presented by the intermediate layer features of deep learning, while semantic concepts are represented by mean vectors of positive samples. Firstly, the convolutional result is directly output in the form of low-level visual features through the mid-level of the pre-trained deep learning model, with the image being represented by sparse coding. Secondly, the positive mean vector method is used to construct visual feature vectors for each text vocabulary item, so that a visual feature vector database is created. Finally, the visual feature vector similarity between the testing image and all text vocabulary is calculated, and the vocabulary with the largest similarity used for annotation. Experiments on the datasets demonstrate the effectiveness of the proposed method; in terms of F1 score, the proposed method’s performance on the Corel5k dataset and IAPR TC-12 dataset is superior to that of MBRM, JEC-AF, JEC-DF, and 2PKNN with end-to-end deep features.





Similar content being viewed by others
References
Alex K, Ilya S, Hinton G E (2012) ImageNet classification with deep convolutional neural networks. In: proceedings of the 25th international conference on neural information processing systems, Lake Tahoe, Nevada, USA, 3-6 December 2012, pp 1106-1114
Budikova P, Batko M, Zezula P (2018) ConceptRank for search-based image annotation. Multimed Tools Appl 77(7):8847–8882
Chen YT, Wang J, Xia RL, Zhang Q, Cao ZH, Yang K (2019) The visual object tracking algorithm research based on adaptive combination kernel. J Ambient Intell Humaniz Comput 10(12):4855–4867
Chen YT, Wang J, Liu SJ, Chen X, Xiong J, Xie JB, Yang K (2019) Multiscale fast correlation filtering tracking algorithm based on a feature fusion model. Concurr Comput. https://doi.org/10.1002/cpe.5533
Chen YT, Zhang HP, Liu LW, Chen X, Zhang Q, Yang K, Xia RL, Xie JB (2020) Research on image inpainting algorithm of improved GAN based on two-discriminations networks. Appl Intell. https://doi.org/10.1007/s10489-020-01971-2
Chen YT, Xu WH, Zuo JW, Yang K (2019) The fire recognition algorithm using dynamic feature fusion and IV-SVM classifier. Clust Comput 22(10):S7665–S7675
Chen YT, Phonevilay V, Tao JJ, Chen X, Xia RL, Zhang Q, Yang K, Xiong J, Xie JB (2020) The face image super-resolution algorithm based on combined representation learning. Multimed Tools Appl. https://doi.org/10.1007/s11042-020-09969-1
Chen YT, Liu LW, Tao JJ, Xia RL, Zhang Q, Yang K, Xiong J, Chen X (2020) The improved image Inpainting algorithm via encoder and similarity constraint. Vis Comput. https://doi.org/10.1007/s00371-020-01932-3
Chen YT, Tao JJ, Liu LW, Xiong J, Xia RL, Xie JB, Zhang Q, Yang K (2020) Research of improving semantic image segmentation based on a feature fusion model. J Ambient Intell Humaniz Comput. https://doi.org/10.1007/s12652-020-02066-z
Chen Y T, Tao J J, Zhang Q, Yang K, Chen X, Xiong J, Xia R L, Xie J B (2020) Saliency Detection via improved hierarchical principle component analysis method. Wirel Commun Mob Comput, vol. 2020, Article ID 8822777
Cheng QM, Zhang Q, Fu P, Tu CH, Li S (2018) A survey and analysis on automatic image annotation. Pattern Recogn 79(7):242–259
Diwakar M, Kumar M (2018) A review on CT image noise and its denoising. Biomed Signal Process Control 42:73–88
Diwakar M, Kumar M (2018) CT image denoising using NLM and correlation-based wavelet packet thresholding. IET Image Process 12(5):708–715
Diwakar M, Singh P (2020) CT image denoising using multivariate model and its method noise thresholding in non-subsampled shearlet domain. Biomed Signal Process Control 57:101754. https://doi.org/10.1016/j.bspc.2019.101754
Gong Y C, Jia Y Q, Leung T, Toshev A, Loffe S (2014) Deep convolutional ranking for multilabel image annotation. In: proceedings of international conference on learning representation, Banff, AB, Canada, 14-16 April 2014, https://arxiv.org/abs/1312.4894v2. Accessed 14 Apr 2014
Guillaumin M, Mensink T, Verbeek J, Schmid C (2009) TagProp: discriminative metric learning in nearest neighbor models for image auto-annotation. In: proceedings of IEEE international conference on computer vision, Kyoto, Japan, 27 September-4 October, 2009, pp 309-316
He K M, Zhang X Y, Ren S Q, Sun J (2016) Deep residual learning for image recognition. In: proceedings of IEEE conference on computer vision and pattern recognition, Las Vegas, NV, USA, 27-30 June 2016, pp 770-778
Jesus R, Abrantes AJ, Correia N (2011) Methods for automatic and assisted image annotation. Multimed Tools Appl 55(1):7–26
Ji Q, Zhang LY, Shu XB, Tang JH (2019) Image annotation refinement via 2P-KNN based group sparse reconstruction. Multimed Tools Appl 78(10):13213–13225
Johnson J, Ballan L, Li F F (2015) Love thy neighbors: image annotation by exploiting image metadata. In: proceedings of IEEE international conference on computer vision, Santiago, Chile, 7-13 December 2015, pp 4624-4632
Kumar M, Diwakar (2019) A new exponentially directional weighted function based CT image denoising using total variation. Journal of King Saud University - Computer and Information Sciences, 31(1), pp. 113–124
Kumar M, Diwakar M (2018) CT image denoising using locally adaptive shrinkage rule in tetrolet domain. Journal of King Saud University - Computer and Information Sciences 30(1):41–50
Liao X, Li KD, Zhu XS, Liu KJR (2020) Robust detection of image operator chain with two-stream convolutional neural network. IEEE Journal of Selected Topics in Signal Processing 14(5):955–968
Liao X, Yu YB, Li B, Li ZP, Qin Z (2020) A new payload partition strategy in color image steganography. IEEE Transactions on Circuits and Systems for Video Technology 30(3):685–696
Liao X, Yin JJ, Chen ML, Qin Z (2020) Adaptive payload distribution in multiple images steganography based on image texture features. IEEE Transactions on Dependable and Secure Computing:1. https://doi.org/10.1109/TDSC.2020.3004708
Lu WP, Zhang X, Lu HM, Li FF (2020) Deep hierarchical encoding model for sentence semantic matching. J Vis Commun Image Represent 71:102794. https://doi.org/10.1016/j.jvcir.2020.102794
Luo YJ, Qin JH, Xiang XY, Tan Y, Liu Q, Xiang LY (2020) Coverless real-time image information hiding based on image block matching and dense convolutional network. J Real-Time Image Proc 17(1):125–135
Makadia A, Pavlovic V, Kumar S (2008) A new baseline for image annotation. In proceedings of European conference on computer vision, Marseille, France, 12-18 October 2008: pp 316-329
Murthy V N, Maji S, Manmatha R (2015) Automatic image annotation using deep learning representations. In: proceedings of ACM on international conference on multimedia retrieval, Shanghai, China, 23-26 June 2015, pp 603-606
Simonyan K, Zisserman A (2015) Very deep convolutional networks for large-scale image recognition. In: proceedings of the 3rd international conference on learning representation, San Diego, CA, USA, 7-9 may 2015, https://arxiv.org/abs/1409.1556. Accessed 10 Apr 2015
Sun L, Ma CY, Chen YJ, Zheng YH, Shim HJ, Wu ZB, Jeon B (2019) Low rank component induced spatial-spectral kernel method for hyperspectral image classification. IEEE Transactions on Circuits and Systems for Video Technology:1. https://doi.org/10.1109/TCSVT.2019.2946723
Szegedy C, Liu W, Jia Y Q, Sermanet P, Reed S E, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: proceedings of IEEE conference on computer vision and pattern recognition, Boston, MA, USA, 7-12 June 2015, pp 1-9
Verma Y, Jawahar C V (2012) Image annotation using metric learning in semantic neighborhoods. In: Proceedings of European Conference on Computer Vision, Lecture Notes in Computer Science, Springer, Berlin, Heidelberg, 2012, 7574, pp. 836–849
Xu HJ, Huang CQ, Huang XD, Huang MX (2019) Multi-modal multi-concept-based deep neural network for automatic image annotation. Multimed Tools Appl 78(21):30651–30675
Yu F, Liu L, Shen H, Zhang Z N, Huang Y Y, Cai S, Deng Z L, Wan Q Z (2020) Multistability analysis, coexisting multiple attractors and FPGA implementation of Yu-Wang four-wing chaotic system. Math. Probl. Eng, vol. 2020, Article ID 7530976
Yu F, Liu L, Shen H, Zhang Z N, Huang Y Y, Shi C Q, Cai S, Wu X M, Du S C, Wan Q Z (2020) Dynamic analysis, Circuit design and Synchronization of a novel 6D memristive four-wing hyperchaotic system with multiple coexisting attractors. Complexity, vol. 2020, Article ID 5904607
Zhang JM, Xie ZP, Sun J, Zou X, Wang J (2020) A cascaded R-CNN with multiscale attention and imbalanced samples for traffic sign detection. IEEE Access 8:29742–29754
Acknowledgments
We are grateful to all students and teachers who participated in this study and all the colleagues working to realize this project.
Funding
This work was supported in part by the National Natural Science Foundation of China under Grant 61972056, 61772454, 61402053, 61981340416, the Natural Science Foundation of Hunan Province of China under Grant 2020JJ4623, the Scientific Research Fund of Hunan Provincial Education Department under Grant 17A007, 19C0028, 19B005, the Changsha Science and Technology Planning under Grant KQ1703018, KQ1706064, KQ1703018–01, KQ1703018–04, the Junior Faculty Development Program Project of Changsha University of Science and Technology under Grant 2019QJCZ011, the “Double First-class” International Cooperation and Development Scientific Research Project of Changsha University of Science and Technology under Grant 2019IC34, the Practical Innovation and Entrepreneurship Ability Improvement Plan for Professional Degree Postgraduate of Changsha University of Science and Technology under Grant SJCX202072, the Postgraduate Training Innovation Base Construction Project of Hunan Province under Grant 2019–248-51, 2020–172-48.
Author information
Authors and Affiliations
Corresponding author
Ethics declarations
Conflict of interest
No potential conflict of interest was reported by the authors.
Additional information
Publisher’s note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
About this article
Cite this article
Chen, Y., Liu, L., Tao, J. et al. The image annotation algorithm using convolutional features from intermediate layer of deep learning. Multimed Tools Appl 80, 4237–4261 (2021). https://doi.org/10.1007/s11042-020-09887-2
Received:
Revised:
Accepted:
Published:
Issue Date:
DOI: https://doi.org/10.1007/s11042-020-09887-2