Synonyms
Deviation from randomness
Definition
Divergence-from-randomness (DFR) information retrieval models are term-document matching functions obtained as the product of two divergence functions. An example of a DFR function is the one related to Jensen’s information of two probability distributions [9, pp. 26–28]:
\[ J\left({\hat{p}}^{+}||\hat{p}\right)=\sum_i {I}_1\left({\hat{p}}_i^{+}||{\hat{p}}_i\right)\cdot {I}_2\left({\hat{p}}_i^{+}||{\hat{p}}_i\right)=\sum_i \varDelta {\hat{p}}_i\ { \log}_2\frac{{\hat{p}}_i+\varDelta {\hat{p}}_i}{{\hat{p}}_i} \]
where \( {I}_1\left({\hat{p}}_i^{+}||{\hat{p}}_i\right)={\hat{p}}_i^{+}-{\hat{p}}_i=\varDelta {\hat{p}}_i \) and \( {I}_2\left({\hat{p}}_i^{+}||{\hat{p}}_i\right)={ \log}_2\frac{{\hat{p}}_i+\varDelta {\hat{p}}_i}{{\hat{p}}_i} \).
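As a rough illustration of how the two factors combine, the following Python sketch multiplies \( I_1 \) and \( I_2 \) term by term and sums the products. The summation over the index i, the function name `jensen_information`, and the list-based inputs are assumptions made for this example only; they are not taken from the entry.

```python
import math


def jensen_information(p_hat, p_hat_plus):
    """Sum over i of I_1 * I_2 for two aligned probability vectors.

    I_1 = p_hat_plus[i] - p_hat[i]            (the difference, delta)
    I_2 = log2((p_hat[i] + delta) / p_hat[i])  (the log-ratio)

    Hypothetical helper: the name and the summation over i are
    assumptions of this sketch.
    """
    if len(p_hat) != len(p_hat_plus):
        raise ValueError("both vectors must have the same length")
    total = 0.0
    for p, p_plus in zip(p_hat, p_hat_plus):
        if p <= 0.0 or p_plus <= 0.0:
            # The log-ratio is undefined for zero or negative probabilities.
            raise ValueError("probabilities must be strictly positive")
        delta = p_plus - p                  # I_1
        log_ratio = math.log2(p_plus / p)   # I_2, since p + delta == p_plus
        total += delta * log_ratio
    return total


if __name__ == "__main__":
    prior = [0.25, 0.75]
    observed = [0.5, 0.5]
    print(jensen_information(prior, observed))
```

Each product has the same sign in both factors, so every term is non-negative, and the total is zero when the two vectors coincide.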
DFR generalizes Jensen’s information as follows:
where
- \( p \) is a prior probability density function of terms (or documents) in the collection.
- \( \hat{p} \) is the frequency of the term in a document (or in a subset of documents).
Recommended Reading
1. Amati G. Frequentist and Bayesian approach to information retrieval. In: Proceedings of the 28th European Conference on IR Research; 2006. p. 13–24.
2. Amati G, Carpineto C, Romano G. Query difficulty, robustness, and selective application of query expansion. In: Proceedings of the 26th European Conference on IR Research; 2004. p. 127–37.
3. Amati G, Van Rijsbergen CJ. Probabilistic models of information retrieval based on measuring the divergence from randomness. ACM Trans Inf Syst. 2002;20(4):357–89.
4. Gärdenfors P. Knowledge in flux. MIT; 1988.
5. Gaussier E, Clinchant S. The BNB distribution for text modeling. In: ECIR, Lecture Notes in Computer Science. Springer; 2008.
6. Good IJ. A causal calculus I. Br J Phil Sci. 1961;11(44):305–18.
7. Harter SP. A probabilistic approach to automatic keyword indexing. PhD thesis, Thesis No. T25146. Graduate Library, The University of Chicago; 1974.
8. He B, Ounis I. On setting the hyper-parameters of the term frequency normalisation for information retrieval. ACM Trans Inf Syst. 2007;25(3).
9. Kullback S. Information theory and statistics. New York: Wiley; 1959.
10. Ounis I, Amati G, Plachouras V, He B, Macdonald C, Johnson D. Terrier information retrieval platform. In: Proceedings of the 27th European Conference on IR Research; 2005. p. 517–9.
Copyright information
© 2018 Springer Science+Business Media, LLC, part of Springer Nature
About this entry
Cite this entry
Amati, G. (2018). Divergence-from-Randomness Models. In: Liu, L., Özsu, M.T. (eds) Encyclopedia of Database Systems. Springer, New York, NY. https://doi.org/10.1007/978-1-4614-8265-9_924
DOI: https://doi.org/10.1007/978-1-4614-8265-9_924
Publisher Name: Springer, New York, NY
Print ISBN: 978-1-4614-8266-6
Online ISBN: 978-1-4614-8265-9
eBook Packages: Computer Science, Reference Module Computer Science and Engineering