Abstract
A long query provides more useful hints for finding relevant documents, but it is also likely to introduce noise that degrades retrieval performance. To mitigate this adverse effect, it is important to remove noisy terms and to introduce and boost additional relevant terms. This paper presents a comprehensive framework, called the Aspect Hidden Markov Model (AHMM), which integrates query reduction and expansion for retrieval with long queries. It optimizes the probability distribution of query terms by utilizing intra-query term dependencies as well as the relationships between query terms and words observed in relevance feedback documents. Empirical evaluation on three large-scale TREC collections demonstrates that our approach, which is automatic, achieves salient improvements over various strong baselines and reaches performance comparable to a state-of-the-art method based on users' interactive query term reduction and expansion.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Song, D. et al. (2013). Optimization of an Integrated Model for Automatic Reduction and Expansion of Long Queries. In: Banchs, R.E., Silvestri, F., Liu, TY., Zhang, M., Gao, S., Lang, J. (eds) Information Retrieval Technology. AIRS 2013. Lecture Notes in Computer Science, vol 8281. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-45068-6_12
DOI: https://doi.org/10.1007/978-3-642-45068-6_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-45067-9
Online ISBN: 978-3-642-45068-6
eBook Packages: Computer Science, Computer Science (R0)