Abstract
This paper addresses the author disambiguation problem in academic social networks, which comprises two sub-problems: the synonym problem, “multiple names refer to one person”, and the polysemy problem, “one name refers to multiple persons”. A unified semi-supervised framework is proposed to deal with both. First, the framework uses a semi-supervised approach to solve the cold-start problem in author disambiguation. Second, a robust method for generating training data, based on multi-aspect similarity indicators, is used, and a support vector machine is employed to model different kinds of feature combinations. Third, a self-taught procedure is proposed to resolve ambiguity in coauthor information and thereby boost the performance of the other models. The proposed framework is evaluated on a large-scale real-world dataset and obtains promising results.
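To make the two problems concrete, the following is a minimal toy sketch, not the framework proposed in the paper. It assumes hypothetical author records, a hand-written name check, a coauthor Jaccard score, and a fixed threshold of 0.3 standing in for a trained classifier such as a support vector machine. Records judged to belong to the same person are merged with a union-find structure.

```python
from itertools import combinations

# Hypothetical records: (record id, name as printed, set of coauthor names).
records = [
    (0, "Wei Wang", {"A. Author", "B. Writer"}),
    (1, "W. Wang", {"A. Author", "C. Editor"}),
    (2, "Wei Wang", {"D. Smith", "E. Lee"}),
]

def name_compatible(a, b):
    """True if the surnames match and the first names share an initial."""
    first_a, last_a = a.replace(".", "").split()[0], a.split()[-1]
    first_b, last_b = b.replace(".", "").split()[0], b.split()[-1]
    return last_a == last_b and first_a[0] == first_b[0]

def coauthor_jaccard(ca, cb):
    """Jaccard overlap of two coauthor sets (0.0 when both are empty)."""
    union = ca | cb
    return len(ca & cb) / len(union) if union else 0.0

def same_author(r1, r2, threshold=0.3):
    # Stand-in for a trained pairwise classifier; the threshold is assumed.
    return (name_compatible(r1[1], r2[1])
            and coauthor_jaccard(r1[2], r2[2]) >= threshold)

# Union-find with path compression, used to merge matching records.
parent = {r[0]: r[0] for r in records}

def find(x):
    while parent[x] != x:
        parent[x] = parent[parent[x]]
        x = parent[x]
    return x

def union(x, y):
    parent[find(x)] = find(y)

for r1, r2 in combinations(records, 2):
    if same_author(r1, r2):
        union(r1[0], r2[0])

# Group record ids by their union-find root.
clusters = {}
for rid, _, _ in records:
    clusters.setdefault(find(rid), []).append(rid)
result = sorted(clusters.values())
print(result)
```

In this toy data, records 0 and 1 illustrate the synonym case (two name forms, one shared coauthor, merged), while record 2 illustrates the polysemy case (an identical name with no coauthor overlap, kept separate).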
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Wang, P., Zhao, J., Huang, K., Xu, B. (2014). A Unified Semi-supervised Framework for Author Disambiguation in Academic Social Network. In: Decker, H., Lhotská, L., Link, S., Spies, M., Wagner, R.R. (eds) Database and Expert Systems Applications. DEXA 2014. Lecture Notes in Computer Science, vol 8645. Springer, Cham. https://doi.org/10.1007/978-3-319-10085-2_1
DOI: https://doi.org/10.1007/978-3-319-10085-2_1
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10084-5
Online ISBN: 978-3-319-10085-2
eBook Packages: Computer Science, Computer Science (R0)