Abstract
We propose a novel mode of feedback for image search, where a user describes which properties of exemplar images should be adjusted in order to more closely match their mental model of the image sought. For example, perusing image results for a query “black shoes”, the user might state, “Show me shoe images like these, but sportier.” Offline, our approach first learns a set of ranking functions, each of which predicts the relative strength of a nameable attribute in an image (e.g., sportiness). At query time, the system presents the user with a set of exemplar images, and the user relates them to the target image with comparative statements. Using a series of such constraints in the multi-dimensional attribute space, our method iteratively updates its relevance function and re-ranks the database of images. To determine which exemplar images receive feedback from the user, we present two variants of the approach: one where the feedback is user-initiated and another where the feedback is actively system-initiated. In either case, our approach allows a user to efficiently “whittle away” irrelevant portions of the visual feature space, using semantic language to precisely communicate their preferences to the system. We demonstrate our technique for refining image search for people, products, and scenes, and we show that it outperforms traditional binary relevance feedback in terms of search speed and accuracy. In addition, the ordinal nature of relative attributes helps make our active approach efficient, both computationally for the machine when selecting the reference images, and for the user by requiring less user interaction than conventional passive and active methods.
Notes
We do, however, assume that all users would agree on the true attribute strength in a given image. See Kovashka and Grauman (2013a) for an approach to model the user-specific perception of an attribute.
Fig. 4 Sketch of WhittleSearch relevance computation. This toy example illustrates the intersection of relative constraints with \(M=2\) attributes. The images are plotted on the axes for both attributes. The region of images that satisfies each constraint is marked in a different color. The region satisfying all constraints is marked with a black dashed line. In this case, it contains only one image (outlined in black). Best viewed in color.
The exhaustive baseline was too expensive to run on all 14K Shoes images. On a 1000-image subset, it performs similarly to how it does on the other datasets.
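The constraint intersection described in the Fig. 4 caption can be illustrated with a minimal sketch. It assumes that each image is scored by the number of relative constraints it satisfies, so that images inside the intersection of all constraints rank first; this scoring rule, the function name `whittle_scores`, the constraint encoding, and the attribute values are all hypothetical choices made for illustration, not the exact implementation.

```python
import numpy as np

def whittle_scores(attr, constraints):
    """Count, for each image, how many relative constraints it satisfies.

    attr: (N, M) array of predicted attribute strengths (made-up values here).
    constraints: list of (attribute_index, reference_image_index, direction),
        where direction is "more" or "less" (hypothetical encoding).
    """
    attr = np.asarray(attr, dtype=float)
    scores = np.zeros(attr.shape[0], dtype=int)
    for m, ref, direction in constraints:
        ref_value = attr[ref, m]
        if direction == "more":
            satisfied = attr[:, m] > ref_value
        elif direction == "less":
            satisfied = attr[:, m] < ref_value
        else:
            raise ValueError(f"unknown direction: {direction!r}")
        scores += satisfied.astype(int)
    return scores

# Toy database: 5 images described by M = 2 attributes.
attr = np.array([
    [0.1, 0.9],
    [0.4, 0.5],
    [0.6, 0.3],
    [0.8, 0.6],
    [0.9, 0.1],
])

# "More of attribute 0 than image 1", "less of attribute 0 than image 4",
# "more of attribute 1 than image 1".
constraints = [(0, 1, "more"), (0, 4, "less"), (1, 1, "more")]

scores = whittle_scores(attr, constraints)
# Sort by descending score; a stable sort keeps database order among ties.
ranking = np.argsort(-scores, kind="stable")
print(scores.tolist())   # [2, 1, 2, 3, 1]
print(ranking.tolist())  # [3, 0, 2, 1, 4]
```

In this toy setting only image 3 satisfies all three constraints, mirroring the single image inside the dashed region of the figure. Images that violate one constraint follow it in the ranking instead of being discarded.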
References
Berg, T., Berg, A. & Shih, J. (2010). Automatic attribute discovery and characterization from noisy web data. In: Proceedings of the European Conference on Computer Vision (ECCV).
Biswas, A. & Parikh, D. (2013). Simultaneous active learning of classifiers and attributes via relative feedback. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Branson, S., Wah, C., Schroff, F., Babenko, B., Welinder, P., Perona, P. & Belongie, S. (2010). Visual recognition with humans in the loop. In: Proceedings of the European Conference on Computer Vision (ECCV).
Cox, I., Miller, M., Minka, T., Papathomas, T., & Yianilos, P. (2000). The Bayesian image retrieval system, PicHunter: Theory, implementation and psychophysical experiments. IEEE Transactions on Image Processing, 9(1), 20–37.
Douze, M., Ramisa, A. & Schmid, C. (2011). Combining attributes and Fisher vectors for efficient image retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Farhadi, A., Endres, I., Hoiem, D. & Forsyth, D. (2009). Describing objects by their attributes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Ferecatu, M. & Geman, D. (2007). Interactive search for image categories by mental matching. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).
Flickner, M., Sawhney, H., Niblack, W., Ashley, J., Huang, Q., Dom, B., et al. (1995). Query by image and video content: The QBIC system. IEEE Computer, 28(9), 23–32.
Geman, D. & Jedynak, B. (1998). Model-based classification trees. IEEE Transactions on Information Theory, 47(3), 1075–1082.
Iqbal, Q. & Aggarwal, J. K. (2002). CIRES: A system for content-based retrieval in digital image libraries. In: Proceedings of the International Conference on Control, Automation, Robotics and Vision.
Jayaraman, D., Sha, F. & Grauman, K. (2014). Decorrelating semantic visual attributes by resisting the urge to share. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Joachims, T. (2002). Optimizing search engines using clickthrough data. In: Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD).
Joachims, T. (2006). Training linear SVMs in linear time. In: Proceedings of the ACM SIGKDD Conference on Knowledge Discovery and Data Mining (KDD).
Kekalainen, J., & Jarvelin, K. (2002). Cumulated gain-based evaluation of IR techniques. ACM Transactions on Information Systems, 20(4), 422–446.
Kovashka, A. & Grauman, K. (2013a). Attribute adaptation for personalized image search. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).
Kovashka, A. & Grauman, K. (2013b). Attribute pivots for guiding relevance feedback in image search. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).
Kovashka, A., Vijayanarasimhan, S. & Grauman, K. (2011). Actively selecting annotations among objects and attributes. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).
Kovashka, A., Parikh, D. & Grauman, K. (2012). Whittle search: Image search with relative attribute feedback. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Kulkarni, P., Sharma, G., Zepeda, J. & Chevallier, L. (2014). Transfer learning via attributes for improved on-the-fly classification. In: Proceedings of the Winter Conference on Applications of Computer Vision (WACV).
Kumar, N., Belhumeur, P. & Nayar, S. (2008). Facetracer: A search engine for large collections of images with faces. In: Proceedings of the European Conference on Computer Vision (ECCV).
Kumar, N., Berg, A. C., Belhumeur, P. N. & Nayar, S. K. (2009). Attribute and simile classifiers for face verification. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).
Kurita, T. & Kato, T. (1993). Learning of personal visual impression for image database systems. In: Proceedings of the International Conference on Document Analysis and Recognition (ICDAR).
Lampert, C., Nickisch, H. & Harmeling, S. (2009). Learning to detect unseen object classes by between-class attribute transfer. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Li, B., Chang, E. & Li, C. S. (2001). Learning image query concepts via intelligent sampling. In: Proceedings of the International Conference on Multimedia and Expo (ICME).
Ma, W. & Manjunath, B. (1997). NeTra: A toolbox for navigating large image databases. In: Proceedings of the International Conference on Image Processing (ICIP).
MacArthur, S. D., Brodley, C. E. & Shyu, C. R. (2000). Relevance feedback decision trees in content-based image retrieval. In: Proceedings of the IEEE Workshop on Content-Based Access of Image and Video Libraries.
Maji, S. (2012). Discovering a lexicon of parts and attributes. In: Proceedings of the European Conference on Computer Vision (ECCV) Workshop on Parts and Attributes.
Mensink, T., Verbeek, J. & Csurka, G. (2011). Learning structured prediction models for interactive image labeling. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Naphade, M., Smith, J., Tesic, J., Chang, S. F., Hsu, W., Kennedy, L., et al. (2006). Large-scale concept ontology for multimedia. IEEE Transactions on Multimedia, 13(3), 86–91.
Oliva, A., & Torralba, A. (2001). Modeling the shape of the scene: A holistic representation of the spatial envelope. International Journal of Computer Vision (IJCV), 42(3), 145–175.
Parikh, D., & Grauman, K. (2011a). Interactively building a discriminative vocabulary of nameable attributes. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Parikh, D., & Grauman, K. (2011b). Relative attributes. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).
Parikh, D., & Grauman, K. (2013). Implied feedback: Learning nuances of user behavior in image search. In: Proceedings of the IEEE International Conference on Computer Vision (ICCV).
Parkash, A., & Parikh, D. (2012). Attributes for classifier feedback. In: Proceedings of the European Conference on Computer Vision (ECCV).
Patterson, G., Xu, C., Su, H., & Hays, J. (2014). The SUN attribute database: Beyond categories for deeper scene understanding. International Journal of Computer Vision (IJCV), 108(1–2), 59–81.
Platt, J. C. (1999). Probabilistic output for support vector machines and comparisons to regularized likelihood methods. Advances in Large Margin Classifiers, 10(3), 61–74.
Rasiwasia, N., Moreno, P., & Vasconcelos, N. (2007). Bridging the gap: Query by semantic example. IEEE Transactions on Multimedia, 9(5), 923–938.
Rastegari, M., Parikh, D., Diba, A. & Farhadi, A. (2013). Multi-attribute queries: To merge or not to merge? In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Rui, Y., Huang, T., Ortega, M., & Mehrotra, S. (1998). Relevance feedback: A power tool for interactive content-based image retrieval. IEEE Transactions on Circuits and Video Technology, 8(5), 644–655.
Saleh, B., Farhadi, A. & Elgammal, A. (2013). Object-centric anomaly detection by attribute-based reasoning. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Scheirer, W., Kumar, N., Belhumeur, P. & Boult, T. (2012). Multi-attribute spaces: Calibration for attribute fusion and similarity search. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Siddiquie, B., Feris, R. & Davis, L. (2011). Image ranking and retrieval based on multi-attribute queries. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Smith, J., Naphade, M. & Natsev, A. (2003). Multimedia semantic indexing using model vectors. In: Proceedings of the International Conference on Multimedia and Expo (ICME).
Sznitman, R., & Jedynak, B. (2010). Active testing for face detection and localization. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 32(10), 1914–1920.
Tieu, K. & Viola, P. (2000). Boosting image retrieval. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Tong, S. & Chang, E. (2001). Support vector machine active learning for image retrieval. In: Proceedings of the ACM International Conference on Multimedia.
Vijayanarasimhan, S. & Kapoor, A. (2010). Visual recognition and detection under bounded computational resources. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Wah, C. & Belongie, S. (2013). Attribute-based detection of unfamiliar classes with humans in the loop. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Wah, C., Van Horn, G., Branson, S., Maji, S., Perona, P. & Belongie, S. (2014). Similarity comparisons for interactive fine-grained categorization. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Wang, X., Liu, K. & Tang, X. (2011). Query-specific visual semantic spaces for web image re-ranking. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
Wang, Y. & Mori, G. (2010). A discriminative latent model of object classes and attributes. In: Proceedings of the European Conference on Computer Vision (ECCV).
Zavesky, E. & Chang, S. F. (2008). Cu-Zero: Embracing the frontier of interactive visual search for informed users. In: Proceedings of the ACM International Conference on Multimedia Information Retrieval.
Zhang, C., & Chen, T. (2002). An active learning framework for content based information retrieval. IEEE Transactions on Multimedia, 4(2), 260–268.
Zhou, X., & Huang, T. (2003). Relevance feedback in image retrieval: A comprehensive review. ACM Multimedia Systems, 8(6), 536–544.
Acknowledgments
We thank the anonymous reviewers for their helpful feedback and suggestions. This research was supported by ONR YIP Award N00014-12-1-0754 (K.G. and A.K.) and a Google Faculty Research Award (D.P.).
Additional information
Communicated by M. Hebert.
The work was done while Adriana Kovashka was at The University of Texas at Austin.
Appendix
See Table 4.
Cite this article
Kovashka, A., Parikh, D. & Grauman, K. WhittleSearch: Interactive Image Search with Relative Attribute Feedback. Int J Comput Vis 115, 185–210 (2015). https://doi.org/10.1007/s11263-015-0814-0
DOI: https://doi.org/10.1007/s11263-015-0814-0