Discovering Robust Knowledge from Databases that Change

Data Mining and Knowledge Discovery

Abstract

Many applications of knowledge discovery and data mining, such as rule discovery for semantic query optimization, database integration, and decision support, require the knowledge to be consistent with the data. However, databases usually change over time, and these changes can make machine-discovered knowledge inconsistent. Useful knowledge should be robust against database changes, so that it is unlikely to become inconsistent after database updates. This paper defines this notion of robustness in the context of relational databases and describes how the robustness of first-order Horn-clause rules can be estimated. Experimental results show that our estimation approach can accurately identify robust rules. To demonstrate the usefulness of robustness estimation, we also present a rule antecedent pruning algorithm that improves the robustness and applicability of machine-discovered rules.
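
To make the notions of consistency and robustness concrete, here is a minimal Python sketch. It is an illustration only and not the estimation method of the paper: the table, the rule, and the helper names `is_consistent` and `surviving_fraction` are all hypothetical. It treats a rule as consistent when no row satisfies the antecedent while violating the consequent, and it uses the fraction of hypothetical single-row inserts that keep the rule consistent as a crude stand-in for robustness.

```python
from typing import Callable, Dict, List

Row = Dict[str, object]
Predicate = Callable[[Row], bool]


def is_consistent(rows: List[Row], antecedent: Predicate, consequent: Predicate) -> bool:
    """True when no row satisfies the antecedent while violating the consequent."""
    return all(consequent(r) for r in rows if antecedent(r))


def surviving_fraction(
    rows: List[Row],
    candidate_inserts: List[Row],
    antecedent: Predicate,
    consequent: Predicate,
) -> float:
    """Fraction of hypothetical single-row inserts that leave the rule consistent.

    Assumption: each candidate insert is equally likely. This is a toy proxy,
    not the robustness estimator described in the paper.
    """
    if not candidate_inserts:
        raise ValueError("need at least one candidate insert")
    ok = sum(
        is_consistent(rows + [new_row], antecedent, consequent)
        for new_row in candidate_inserts
    )
    return ok / len(candidate_inserts)


# Hypothetical table and rule: "if type is tanker then length >= 200".
ships = [
    {"type": "tanker", "length": 250},
    {"type": "tug", "length": 30},
]


def is_tanker(row: Row) -> bool:
    return row["type"] == "tanker"


def is_long(row: Row) -> bool:
    return row["length"] >= 200


# Hypothetical future inserts; only the short tanker breaks the rule.
candidates = [
    {"type": "tanker", "length": 300},
    {"type": "tanker", "length": 150},
    {"type": "tug", "length": 20},
    {"type": "ferry", "length": 120},
]

currently_consistent = is_consistent(ships, is_tanker, is_long)
fraction = surviving_fraction(ships, candidates, is_tanker, is_long)
```

In this toy setting the rule holds on the current table, and one of the four candidate inserts would invalidate it. A real estimator would also need to weigh how likely each kind of update is and to account for deletions and modifications, which this sketch does not attempt.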

Cite this article

Hsu, CN., Knoblock, C.A. Discovering Robust Knowledge from Databases that Change. Data Mining and Knowledge Discovery 2, 69–95 (1998). https://doi.org/10.1023/A:1009717820785

  • DOI: https://doi.org/10.1023/A:1009717820785