Unsupervised Object-Level Representation Learning from Scene Images

Xie, Jiahao; Zhan, Xiaohang; Liu, Ziwei; Ong, Yew Soon; Loy, Chen Change

arXiv:2106.11952 (cs)

[Submitted on 22 Jun 2021 (v1), last revised 3 Dec 2021 (this version, v2)]

Abstract: Contrastive self-supervised learning has largely narrowed the gap to supervised pre-training on ImageNet. However, its success highly relies on the object-centric priors of ImageNet, i.e., different augmented views of the same image correspond to the same object. Such a heavily curated constraint becomes immediately infeasible when pre-trained on more complex scene images with many objects. To overcome this limitation, we introduce Object-level Representation Learning (ORL), a new self-supervised learning framework towards scene images. Our key insight is to leverage image-level self-supervised pre-training as the prior to discover object-level semantic correspondence, thus realizing object-level representation learning from scene images. Extensive experiments on COCO show that ORL significantly improves the performance of self-supervised learning on scene images, even surpassing supervised ImageNet pre-training on several downstream tasks. Furthermore, ORL improves the downstream performance when more unlabeled scene images are available, demonstrating its great potential of harnessing unlabeled data in the wild. We hope our approach can motivate future research on more general-purpose unsupervised representation learning from scene data.

Comments:	NeurIPS 2021. Project page and code are available online.
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2106.11952 [cs.CV]
	(or arXiv:2106.11952v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2106.11952

Submission history

From: Jiahao Xie
[v1] Tue, 22 Jun 2021 17:51:24 UTC (6,381 KB)
[v2] Fri, 3 Dec 2021 13:51:38 UTC (6,381 KB)