Abstract
High-performance object locating is a hard problem in distributed systems. Its quality can be assessed by response time, space utilization, and hit rate, which are essential requirements for large-scale Internet applications. A Bloom filter (BF), which is built from a number of hash functions, is the critical part of the object locating algorithm. However, how many hash functions a BF should use remains an open question. This paper presents a method for estimating that number in a BF's hash function configuration, together with a theoretical analysis that derives the optimal number of hash functions. This number is crucial for constructing a better BF-based algorithm. To verify the theoretical result, we build a simulation environment with 50 million objects scattered across one hundred nodes and compare the traditional number of hash functions with ours. The experimental results show that a BF with our optimized parameter reduces object locating time by 81 to 91 percent. Furthermore, we show that the method can also be used in similar distributed systems in which content is located randomly.
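For orientation, the following Python sketch computes the classical textbook choice of hash function number for a Bloom filter, k = (m/n) ln 2, along with the standard false positive approximation (1 - e^(-kn/m))^k. This is the conventional baseline only; it is not the estimation method proposed in this paper, which the abstract does not specify. The function names, the figure of 10 bits per object, and the even split of 50 million objects over 100 nodes are illustrative assumptions.

```python
import math


def false_positive_rate(m_bits: int, n_items: int, k_hashes: int) -> float:
    """Standard approximation of the BF false positive rate: (1 - e^(-kn/m))^k."""
    return (1.0 - math.exp(-k_hashes * n_items / m_bits)) ** k_hashes


def classical_optimal_k(m_bits: int, n_items: int) -> int:
    """Textbook optimum k = (m/n) * ln 2, rounded to the nearest positive integer."""
    return max(1, round(m_bits / n_items * math.log(2)))


def best_integer_k(m_bits: int, n_items: int, k_max: int = 64) -> int:
    """Brute-force check: the integer k in [1, k_max] with the lowest approximate rate."""
    return min(range(1, k_max + 1),
               key=lambda k: false_positive_rate(m_bits, n_items, k))


if __name__ == "__main__":
    # Assumption: 50 million objects spread evenly over 100 nodes,
    # and a hypothetical budget of 10 bits per object on each node.
    n = 50_000_000 // 100
    m = 10 * n
    k = classical_optimal_k(m, n)
    print("classical k:", k)
    print("brute-force k:", best_integer_k(m, n))
    print("approx. false positive rate:", false_positive_rate(m, n, k))
```

The brute-force search is included only as a sanity check on the closed-form value; it minimizes the false positive rate alone and ignores the lookup cost of evaluating more hash functions, which matters when locating time is the quantity being optimized.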
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Wang, Z., Luo, T. (2012). Optimizing Hash Function Number for BF-Based Object Locating Algorithm. In: Tan, Y., Shi, Y., Ji, Z. (eds) Advances in Swarm Intelligence. ICSI 2012. Lecture Notes in Computer Science, vol 7332. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-31020-1_65
DOI: https://doi.org/10.1007/978-3-642-31020-1_65
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-31019-5
Online ISBN: 978-3-642-31020-1
eBook Packages: Computer Science, Computer Science (R0)