FRSFN: A semantic fusion network for practical fashion retrieval

Liu, An-An; Zhang, Ting; Song, Dan; Li, Wenhui; Zhou, Ming

doi:10.1007/s11042-020-08973-9

FRSFN: A semantic fusion network for practical fashion retrieval

Published: 10 May 2020

Volume 80, pages 17169–17181, (2021)
Cite this article

Multimedia Tools and Applications Aims and scope Submit manuscript

An-An Liu¹,
Ting Zhang¹,
Dan Song ORCID: orcid.org/0000-0002-5864-0276¹,
Wenhui Li¹ &
…
Ming Zhou²

412 Accesses
Explore all metrics

Abstract

In recent years, research related to fashion has made remarkable progress, and the use of image content for fashion retrieval has become one of the effective approaches as well as research hot spots. However, it remains a challenging task due to the various contents contained in fashion images. This work presents a practical fashion retrieval method which puts emphasis on specific items. The Semantic Fusion Network of the method firstly extracts two kinds of features, which are the global features from the original query image and the item features. The item features are from the same query image semantically parsed before. Then the network fuses two kinds of features with the combination of color information. Finally, the similarity scores are calculated among features for retrieval. The experiments show that while remaining higher statistical retrieval results, our method grasps the detailed characteristics and items of the clothing and keeps a satisfying overall similarity in shape and color.

This is a preview of subscription content, log in via an institution to check access.

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

Fast Fashion Guided Clothing Image Retrieval: Delving Deeper into What Feature Makes Fashion

DSSN: dual shallow Siamese network for fashion image retrieval

Article 12 November 2022

NecklaceFIR: A Large Volume Benchmarked Necklace Dataset for Fashion Image Retrieval

References

Andriluka M, Pishchulin L, Gehler P (2014) Schiele, B.: 2d human pose estimation: New benchmark and state of the art analysis. In: Proceedings of the IEEE Conference on computer Vision and Pattern Recognition, pp 3686–3693
Badrinarayanan V, Kendall A, Cipolla R (2017) Segnet: a deep convolutional encoder-decoder architecture for image segmentation. IEEE Trans Pattern Anal Mach Intell 39(12):2481–2495
Article Google Scholar
Chen H, Gallagher A, Girod B (2012) Describing clothing by semantic attributes. In: European Conference on Computer Vision, Springer, pp 609–623
Chen LC, Papandreou G, Kokkinos I, Murphy K, Yuille AL (2014) Semantic image segmentation with deep convolutional nets and fully connected crfs. arXiv:1412.7062
Chen LC, Yang Y, Wang J, Xu W, Yuille AL (2016) Attention to scale: Scale-aware semantic image segmentation. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 3640–3649
Cheng Z, Chang X, Zhu L, Kanjirathinkal RC, Kankanhalli M (2019) Mmalfm: Explainable recommendation by leveraging reviews and images. ACM Transactions on Information Systems (TOIS) 37(2):16
Article Google Scholar
Corbiere C, Ben-Younes H, Ramé A., Ollion C (2017) Leveraging weakly annotated data for fashion image retrieval and label prediction. In: Proceedings of the IEEE International Conference on Computer Vision, pp 2268–2274
Dalal N, Triggs B (2005) Histograms of oriented gradients for human detection. In: IEEE Computer Society Conference on Computer Vision & Pattern Recognition
Di W, Wah C, Bhardwaj A, Piramuthu R, Sundaresan N (2013) Style finder: Fine-grained clothing style detection and retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp 8–13
Fang HS, Lu G, Fang X, Xie J, Tai YW, Lu C (2018) Weakly and semi supervised human body part parsing via pose-guided knowledge transfer. In: 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, IEEE, pp 70–78
Gajic B, Baldrich R (2018) Cross-domain fashion image retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, pp. 1869–1871
Gan C, Lin M, Yang Y, De Melo G, Hauptmann AG (2016) Concepts not alone: Exploring pairwise relationships for zero-shot video activity recognition. In: Thirtieth AAAI Conference on Artificial Intelligence
Gong K, Liang X, Li Y, Chen Y, Yang M, Lin L (2018) Instance-level human parsing via part grouping network. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 770–785
Hadi Kiapour M, Han X, Lazebnik S, Berg AC, Berg TL (2015) Where to buy it: Matching street clothing photos in online shops. In: Proceedings of the IEEE International Conference on Computer Vision, pp 3343–3351
Han X, Song X, Yao Y, Xu XS, Nie L (2019) Neural compatibility modeling with probabilistic knowledge distillation. IEEE Trans Image Process 29:871–882
Article MathSciNet Google Scholar
Han Y, Zhu L, Cheng Z, Li J, Liu X Discrete optimal graph clustering. IEEE Transactions on Cybernetics, pp 1–14
He K, Zhang X, Ren S, Sun J (2016) Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 770–778
Huang J, Feris RS, Chen Q, Yan S (2015) Cross-domain image retrieval with a dual attribute-aware ranking network. In: Proceedings of the IEEE International Conference on Computer Vision, pp 1062–1070
Kalayeh MM, Basaran E, Gökmen M, Kamasak ME, Shah M (2018) Human semantic parsing for person re-identification. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1062–1071
Liang X, Gong K, Shen X, Lin L (2018) Look into person: Joint body parsing & pose estimation network and a new benchmark. IEEE Trans Pattern Anal Mach Intell 41(4):871–885
Article Google Scholar
Liang X, Lin L, Yang W, Luo P, Huang J, Yan S (2016) Clothes co-parsing via joint image segmentation and labeling with application to clothing retrieval. IEEE Trans Multimedia 18(6):1175–1186
Article Google Scholar
Liang X, Liu S, Shen X, Yang J, Liu L, Dong J, Lin L, Yan S (2015) Deep human parsing with active template regression. IEEE Trans Pattern Anal Mach Intell 37(12):2402–2414
Article Google Scholar
Liang X, Xu C, Shen X, Yang J, Liu S, Tang J, Lin L, Yan S (2015) Human parsing with contextualized convolutional neural network. In: Proceedings of the IEEE International Conference on Computer Vision, pp 1386–1394
Lin K, Yang HF, Liu KH, Hsiao JH, Chen CS (2015) Rapid clothing retrieval via deep learning of binary codes and hierarchical search. In: Proceedings of the 5th ACM on International Conference on Multimedia Retrieval, ACM, pp 499–502
Liu AA, Nie WZ, Gao Y, Su YT (2016) Multi-modal clique-graph matching for view-based 3d model retrieval. IEEE Trans Image Process 25(5):2103–2116
Article MathSciNet Google Scholar
Liu S, Liang X, Liu L, Shen X, Yang J, Xu C, Lin L, Cao X, Yan S (2015) Matching-cnn meets knn: Quasi-parametric human parsing. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1419–1427
Liu Z, Luo P, Qiu S, Wang X, Tang X (2016) Deepfashion: Powering robust clothes recognition and retrieval with rich annotations. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1096–1104
Lowe DG (2004) Distinctive image features from scale-invariant keypoints. Int J Comput Vis 60(2):91–110
Article Google Scholar
Luo Z, Yuan J, Yang J, Wen W (2019) Spatial constraint multiple granularity attention network for clothesretrieval. In: 2019 IEEE International Conference on Image Processing (ICIP), IEEE, pp 859–863
Nie W, Wang K, Wang H, Su Y (2019) The assessment of 3d model representation for retrieval with cnn-rnn networks. Multimedia Tools Appl 78 (12):16979–16994
Article Google Scholar
Nie W, Wang W, Huang X (2019) Srnet: Structured relevance feature learning network from skeleton data for human action recognition. IEEE Access 7:132161–132172
Article Google Scholar
Nie X, Feng J, Yan S (2018) Mutual learning to adapt for joint human parsing and pose estimation. In: Proceedings of the European Conference on Computer Vision (ECCV), pp 502–517
Simonyan K, Zisserman A (2014) Very deep convolutional networks for large-scale image recognition. Computer Science
Song X, Feng F, Liu J, Li Z, Nie L, Ma J (2017) Neurostylist: Neural compatibility modeling for clothing matching. In: Proceedings of the 25th ACM International Conference on Multimedia, pp 753–761
Sun X, Liu Z, Hu Y, Zhang L, Zimmermann R (2018) Perceptual multi-channel visual feature fusion for scene categorization. Inf Sci 429:37–48
Article MathSciNet Google Scholar
Szegedy C, Liu W, Jia Y, Sermanet P, Reed S, Anguelov D, Erhan D, Vanhoucke V, Rabinovich A (2015) Going deeper with convolutions. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1–9
Xia F, Zhu J, Wang P, Yuille AL (2016) Pose-guided human parsing by an and/or graph using pose-context features. In: Thirtieth AAAI Conference on Artificial Intelligence
Xie H, Fang S, Zha ZJ, Yang Y, Li Y, Zhang Y (2019) Convolutional attention networks for scene text recognition. ACM Transactions on Multimedia Computing, Communications, and Applications (TOMM) 15(1s):3
Google Scholar
Yamaguchi K, Hadi Kiapour M, Berg TL (2013) Paper doll parsing: Retrieving similar styles to parse clothing items. In: Proceedings of the IEEE International Conference on Computer Vision, pp 3519–3526
Yamaguchi K, Kiapour MH, Ortiz LE, Berg TL (2012) Parsing clothing in fashion photographs. In: 2012 IEEE Conference on Computer Vision and Pattern Recognition, IEEE, pp 3570–3577
Zhang H, Ji Y, Huang W, Liu L (2018) Sitcom-star-based clothing retrieval for video advertising: a deep learning framework. Neural computing and applications, pp 1–20
Zhang H, Li S, Cai S, Jiang H, Kuo CCJ (2018) Representative fashion feature extraction by leveraging weakly annotated online resources. In: 2018 25Th IEEE International Conference on Image Processing (ICIP), IEEE, pp 2640–2644
Zhao B, Feng J, Wu X, Yan S (2017) Memory-augmented attribute manipulation networks for interactive fashion search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp 1520–1528
Ziaeefard M, Camacaro J, Bessega C (2018) Hierarchical feature map characterization in fashion interpretation. In: 2018 15Th Conference on Computer and Robot Vision (CRV), IEEE, pp 88–94

Download references

Acknowledgements

This work was supported in part by the National Nature Science Foundation of China (61902277,61772359,61872267), the grant of 2019 Tianjin New Generation Artificial Intelligence Major Program, the grant of Tianjin New Generation Artificial Intelligence Major Program (19ZXZNGX00110,18ZXZNGX00150), the Open Project Program of the State Key Lab of CAD & CG, Zhejiang University (A2012, A2005).

Author information

Authors and Affiliations

Multimedia Institute, School of Electrical and Information Engineering, Tianjin University, Tianjin, 300072, China
An-An Liu, Ting Zhang, Dan Song & Wenhui Li
Chengdu Spaceon Measurement & Control Technology Co., Ltd, Chengdu, China
Ming Zhou

Authors

An-An Liu
View author publications
You can also search for this author inPubMed Google Scholar
Ting Zhang
View author publications
You can also search for this author inPubMed Google Scholar
Dan Song
View author publications
You can also search for this author inPubMed Google Scholar
Wenhui Li
View author publications
You can also search for this author inPubMed Google Scholar
Ming Zhou
View author publications
You can also search for this author inPubMed Google Scholar

Corresponding authors

Correspondence to Dan Song or Wenhui Li.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Reprints and permissions

About this article

Cite this article

Liu, AA., Zhang, T., Song, D. et al. FRSFN: A semantic fusion network for practical fashion retrieval. Multimed Tools Appl 80, 17169–17181 (2021). https://doi.org/10.1007/s11042-020-08973-9

Download citation

Received: 30 December 2019
Revised: 20 March 2020
Accepted: 22 April 2020
Published: 10 May 2020
Issue Date: May 2021
DOI: https://doi.org/10.1007/s11042-020-08973-9

Keywords

Access this article

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Price excludes VAT (USA)
Tax calculation will be finalised during checkout.

Instant access to the full article PDF.

Institutional subscriptions

FRSFN: A semantic fusion network for practical fashion retrieval

Abstract

Access this article

Subscribe and save

Buy Now

Similar content being viewed by others

Fast Fashion Guided Clothing Image Retrieval: Delving Deeper into What Feature Makes Fashion

DSSN: dual shallow Siamese network for fashion image retrieval

NecklaceFIR: A Large Volume Benchmarked Necklace Dataset for Fashion Image Retrieval

References

Acknowledgements

Author information

Authors and Affiliations

Corresponding authors

Additional information

Publisher’s note

Rights and permissions

About this article

Cite this article

Share this article

Keywords

Subscribe and save

Buy Now