Data Fusion Methods with Graded Relevance Judgment

Huang, Yidong; Xu, Qiuyu; Liu, Yao; Xu, Chunlin; Wu, Shengli

doi:10.1007/978-3-031-20309-1_20

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13579)

Included in the following conference series:

International Conference on Web Information Systems and Applications

Abstract

Data fusion methods have been widely used in many information retrieval tasks. Their performance is affected by many factors, including the data fusion algorithm used, the component retrieval systems involved, relevance judgment, and the metrics used for evaluation. Previous data fusion research has mainly focused on the fusion methods and the component retrieval systems, while other factors such as relevance judgment and evaluation metrics have not been addressed. Yet relevance judgment is an important issue that affects many aspects of information retrieval and data fusion. All previous research work in data fusion has assumed binary relevance judgment. However, this assumption is a simplification and is not satisfactory in many cases. Graded relevance judgment is more general and able to deal with more complicated requirements. In this paper, we investigate how data fusion methods, especially linear combination, work with graded relevance judgment, and give the updates necessary for using those methods in the new situation. In experiments with two TREC data sets, we find that data fusion is still, in general, an effective technology for performance improvement. Many of the methods are very competitive in a controlled environment, and linear combination with weights trained by multiple linear regression is the most stable in a more complicated environment.
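As a rough illustration of the idea named in the abstract, linear combination with weights trained by multiple linear regression against graded relevance labels, the following Python sketch may help. It is not the authors' implementation: the function names (`train_weights`, `fuse`), the intercept term, the assumption that component scores are already normalized to [0, 1], and the toy data are all hypothetical choices made for this example.

```python
from typing import Dict, List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix, so row operations update both sides together.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: component scores are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def train_weights(scores: Sequence[Sequence[float]],
                  grades: Sequence[float]) -> List[float]:
    """Least-squares fit of grade ~ w0 + sum_i w_i * score_i.

    Each row of `scores` holds one training document's scores from the
    component systems; `grades` holds its graded relevance label
    (for example 0, 1, 2, 3). Solved via the normal equations.
    """
    rows = [[1.0] + list(s) for s in scores]  # prepend intercept column
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * g for r, g in zip(rows, grades)) for i in range(k)]
    return solve_linear_system(xtx, xty)


def fuse(doc_scores: Dict[str, Sequence[float]],
         weights: Sequence[float]) -> List[str]:
    """Rank documents by the weighted sum of their component scores."""
    def combined(doc: str) -> float:
        return weights[0] + sum(
            w * s for w, s in zip(weights[1:], doc_scores[doc]))
    return sorted(doc_scores, key=combined, reverse=True)


# Toy training data (hypothetical): two component systems, grades 0..3.
train_scores = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
train_grades = [0.0, 2.0, 1.0, 3.0]
weights = train_weights(train_scores, train_grades)

# Fuse the scores of three unseen documents with the learned weights.
ranking = fuse({"d1": [0.2, 0.9], "d2": [0.8, 0.1], "d3": [0.5, 0.5]}, weights)
print(weights, ranking)
```

The only change from a binary-relevance setup in this sketch is the regression target: the graded label is used directly instead of a 0/1 indicator. Whether the paper uses an intercept, how it normalizes scores, and how it handles documents missing from some component results are not stated in the abstract.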

Notes

1. Text REtrieval Conference (TREC) is held annually by the National Institute of Standards and Technology, USA. Its web site is located at https://trec.nist.gov.
2. http://www.clef-initiative.eu/.
3. https://research.nii.ac.jp/ntcir/index-en.html.

Author information

Authors and Affiliations

School of Computer Science, Jiangsu University, Zhenjiang, China
Yidong Huang, Qiuyu Xu, Yao Liu & Shengli Wu
School of Computer Science, Guangdong Polytechnic Normal University, Guangzhou, China
Chunlin Xu

Corresponding author

Correspondence to Shengli Wu.

Editor information

Editors and Affiliations

National University of Defense Technology, Changsha, China
Xiang Zhao
Guangzhou University, Guangzhou, China
Shiyu Yang
Tianjin University, Tianjin, China
Xin Wang
Deakin University, Melbourne, VIC, Australia
Jianxin Li

7 Appendix

See Tables 5 and 6

Table 5. Information of the eight selected runs in TREC 2013

Table 6. Information of the eight selected runs in TREC 2014

About this paper

Cite this paper

Huang, Y., Xu, Q., Liu, Y., Xu, C., Wu, S. (2022). Data Fusion Methods with Graded Relevance Judgment. In: Zhao, X., Yang, S., Wang, X., Li, J. (eds) Web Information Systems and Applications. WISA 2022. Lecture Notes in Computer Science, vol 13579. Springer, Cham. https://doi.org/10.1007/978-3-031-20309-1_20

DOI: https://doi.org/10.1007/978-3-031-20309-1_20
Published: 08 December 2022
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-20308-4
Online ISBN: 978-3-031-20309-1
eBook Packages: Computer Science, Computer Science (R0)
