TDSS: A New Word Sense Representation Framework for Information Retrieval

  • Conference paper
Natural Language Understanding and Intelligent Applications (ICCPOL 2016, NLPCC 2016)

Abstract

Word sense representation is important in information retrieval (IR) tasks. Existing lexical databases, e.g., WordNet, and automated word sense representation approaches often use only one view to represent a word, and may not work well in context-sensitive tasks, e.g., query rewriting. In this paper, we propose a new framework that represents a word sense simultaneously in two views, an explanation view and a context view. We further propose a novel method to automatically learn such representations from large-scale query logs. Experimental results show that our new sense representations handle word substitutions better in a query rewriting task.

Acknowledgement

We would like to thank Ben Xu, Wensong He, Shuaixiang Dai, Xiaozhao Zhao, Qiannan Lv, and the anonymous reviewers for their helpful feedback. This work is supported by the National High Technology R&D Program of China (Grant Nos. 2015AA015403, 2014AA015102) and the Natural Science Foundation of China (Grant Nos. 61202233, 61272344, 61370055). For any correspondence, please contact Liwei Chen.

Author information

Corresponding author

Correspondence to Liwei Chen.

Copyright information

© 2016 Springer International Publishing AG

About this paper

Cite this paper

Chen, L., Feng, Y., Zhao, D. (2016). TDSS: A New Word Sense Representation Framework for Information Retrieval. In: Lin, CY., Xue, N., Zhao, D., Huang, X., Feng, Y. (eds) Natural Language Understanding and Intelligent Applications. ICCPOL 2016, NLPCC 2016. Lecture Notes in Computer Science, vol 10102. Springer, Cham. https://doi.org/10.1007/978-3-319-50496-4_6

  • DOI: https://doi.org/10.1007/978-3-319-50496-4_6

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-50495-7

  • Online ISBN: 978-3-319-50496-4

  • eBook Packages: Computer Science, Computer Science (R0)
