Data Fusion Methods with Graded Relevance Judgment

  • Conference paper
Web Information Systems and Applications (WISA 2022)

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13579)

Abstract

Data fusion methods have been widely used in many information retrieval tasks. Their performance is affected by many factors, including the data fusion algorithm used, the component retrieval systems involved, relevance judgment, and the metrics used for evaluation. Previous data fusion research has focused mainly on the fusion methods and the component retrieval systems involved, while other factors such as relevance judgment and evaluation metrics have not been addressed. Yet relevance judgment is an important issue that affects many aspects of information retrieval and data fusion. All previous research work in data fusion has assumed binary relevance judgment. This assumption is a simplification and is unsatisfactory in many cases, whereas graded relevance judgment is more general and can deal with more complicated requirements. In this paper, we investigate how data fusion methods, especially linear combination, work with graded relevance judgment, and we give the updates necessary for using those methods in the new situation. In experiments with two TREC data sets, we find that data fusion is in general still an effective technology for performance improvement. Many of the methods are very competitive in a controlled environment, and linear combination with weights trained by multiple linear regression is the most stable in a more complicated environment.
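The idea of linear combination with regression-trained weights can be illustrated with a minimal sketch. It is not the paper's implementation: the function names (`solve_linear_system`, `train_weights`, `fuse`), the toy scores, and the choice to regress graded labels (0, 1, 2) directly on normalized component scores are all assumptions made for illustration. The abstract does not specify how scores are normalized or how grades enter the regression.

```python
from typing import Dict, List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system; component scores may be collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def train_weights(score_rows: Sequence[Sequence[float]],
                  grades: Sequence[float]) -> List[float]:
    """Least-squares fit of grade ~ intercept + sum_i w_i * score_i.

    Returns [intercept, w_1, ..., w_k], obtained from the normal equations.
    """
    rows = [[1.0] + list(r) for r in score_rows]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * g for r, g in zip(rows, grades)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def fuse(doc_scores: Dict[str, Sequence[float]],
         weights: Sequence[float]) -> List[str]:
    """Rank documents by the weighted sum of their component scores."""
    fused = {doc: sum(w * s for w, s in zip(weights, scores))
             for doc, scores in doc_scores.items()}
    return sorted(fused, key=fused.get, reverse=True)


# Toy training data (assumed): normalized scores from two component systems
# and a graded relevance label (0 = not relevant, 1 = relevant, 2 = highly relevant).
train_scores = [(1.0, 0.0), (0.0, 1.0), (0.5, 0.0), (0.0, 0.0), (0.5, 1.0)]
train_grades = [2, 1, 1, 0, 2]

weights = train_weights(train_scores, train_grades)

# Fuse unseen documents; the intercept is dropped because it does not change the ranking.
new_docs = {"d1": (0.9, 0.1), "d2": (0.2, 0.8), "d3": (0.4, 0.9)}
ranking = fuse(new_docs, weights[1:])
print(weights, ranking)
```

With binary judgments the regression target would take only the values 0 and 1. Using the graded label as the target is one plausible way to let highly relevant documents pull the weights more strongly than marginally relevant ones.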

Notes

  1. Text REtrieval Conference (TREC) is held annually by the National Institute of Standards and Technology, USA. Its web site is located at https://trec.nist.gov.

  2. http://www.clef-initiative.eu/.

  3. https://research.nii.ac.jp/ntcir/index-en.html.

Corresponding author

Correspondence to Shengli Wu.

7 Appendix

See Tables 5 and 6.

Table 5. Information of the eight selected runs in TREC 2013
Table 6. Information of the eight selected runs in TREC 2014

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Huang, Y., Xu, Q., Liu, Y., Xu, C., Wu, S. (2022). Data Fusion Methods with Graded Relevance Judgment. In: Zhao, X., Yang, S., Wang, X., Li, J. (eds) Web Information Systems and Applications. WISA 2022. Lecture Notes in Computer Science, vol 13579. Springer, Cham. https://doi.org/10.1007/978-3-031-20309-1_20

  • DOI: https://doi.org/10.1007/978-3-031-20309-1_20

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-20308-4

  • Online ISBN: 978-3-031-20309-1

  • eBook Packages: Computer Science, Computer Science (R0)
