Abstract
We present a new model for representing probabilistic information in a semi-structured (XML) database, based on probabilistic event variables. This work is motivated by the need to keep track of both the confidence and the lineage of information stored in a semi-structured warehouse. For instance, the modules of a (Hidden Web) content warehouse may derive information about the semantics of discovered Web services that is inherently uncertain. Our model, the fuzzy tree model, supports both querying (tree pattern queries with joins) and updating (transactions containing an arbitrary set of insertions and deletions) over probabilistic tree data. We highlight its expressive power and discuss implementation issues.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Abiteboul, S., Senellart, P. (2006). Querying and Updating Probabilistic Information in XML. In: Ioannidis, Y., et al. Advances in Database Technology - EDBT 2006. EDBT 2006. Lecture Notes in Computer Science, vol 3896. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11687238_62
DOI: https://doi.org/10.1007/11687238_62
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-32960-2
Online ISBN: 978-3-540-32961-9
eBook Packages: Computer Science, Computer Science (R0)