Abstract
Knowledge graphs are composed of different elements: entity nodes, relation edges, and literal nodes. Each literal node contains an entity’s attribute value (e.g. the height of an entity of type person) and thereby encodes information which in general cannot be represented by relations between entities alone. However, most of the existing embedding- or latent-feature-based methods for knowledge graph analysis only consider entity nodes and relation edges, and thus do not take the information provided by literals into account. In this paper, we extend existing latent feature methods for link prediction by a simple portable module for incorporating literals, which we name LiteralE. Unlike in concurrent methods where literals are incorporated by adding a literal-dependent term to the output of the scoring function and thus only indirectly affect the entity embeddings, LiteralE directly enriches these embeddings with information from literals via a learnable parametrized function. This function can be easily integrated into the scoring function of existing methods and learned along with the entity embeddings in an end-to-end manner. In an extensive empirical study over three datasets, we evaluate LiteralE-extended versions of various state-of-the-art latent feature methods for link prediction and demonstrate that LiteralE presents an effective way to improve their performance. For these experiments, we augmented standard datasets with their literals, which we publicly provide as testbeds for further research. Moreover, we show that LiteralE leads to an qualitative improvement of the embeddings and that it can be easily extended to handle literals from different modalities.
A. Kristiadi, M. A. Khan—Equal contribution.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
For more information about the RDF concepts see https://www.w3.org/TR/rdf11-concepts.
- 2.
A literal-extended version of YAGO3-10 is provided by Pezeshkpou et al. [12].
- 3.
Note that in practice, we normalize the literal values.
- 4.
Note, that incorporating the literal information into the embeddings also seems advantageous for entity disambiguation or clustering.
- 5.
- 6.
- 7.
Note, that LiteralE could also be extended to incorporate graph features as an additional input to g.
- 8.
We use spaCy’s pretrained GloVe embedding model. Available at https://spacy.io.
- 9.
The base model for all of these methods is DistMult.
References
Bollacker, K., Evans, C., Paritosh, P., Sturge, T., Taylor, J.: Freebase: a collaboratively created graph database for structuring human knowledge. In: Proceedings of the 2008 ACM SIGMOD International Conference on Management of Data, pp. 1247–1250. ACM (2008)
Bordes, A., Usunier, N., Garcia-Duran, A., Weston, J., Yakhnenko, O.: Translating embeddings for modeling multi-relational data. In: Advances in Neural Information Processing Systems, pp. 2787–2795 (2013)
Cho, K., et al.: Learning phrase representations using RNN encoder-decoder for statistical machine translation. In: Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), pp. 1724–1734. Association for Computational Linguistics, Doha, October 2014. http://www.aclweb.org/anthology/D14-1179
Dettmers, T., Pasquale, M., Pontus, S., Riedel, S.: Convolutional 2D knowledge graph embeddings. In: Proceedings of the 32nd AAAI Conference on Artificial Intelligence, February 2018. https://arxiv.org/abs/1707.01476
Dong, X., et al.: Knowledge vault: a web-scale approach to probabilistic knowledge fusion. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 601–610. ACM (2014)
Garcia-Duran, A., Niepert, M.: KBLRN: end-to-end learning of knowledge base representations with latent, relational, and numerical features. arXiv preprint arXiv:1709.04676 (2017)
Kingma, D.P., Ba, J.: Adam: a method for stochastic optimization. In: 3rd International Conference for Learning Representations. ICLR (2015)
Le, Q., Mikolov, T.: Distributed representations of sentences and documents. In: International Conference on Machine Learning, pp. 1188–1196 (2014)
Lehmann, J., et al.: DBpedia - a large-scale, multilingual knowledge base extracted from Wikipedia. Semant. Web 6(2), 167–195 (2015)
Mahdisoltani, F., Biega, J., Suchanek, F.: YAGO3: a knowledge base from multilingual Wikipedias. In: 7th Biennial Conference on Innovative Data Systems Research, CIDR Conference (2014)
Nickel, M., Murphy, K., Tresp, V., Gabrilovich, E.: A review of relational machine learning for knowledge graphs. Proc. IEEE 104(1), 11–33 (2016)
Pezeshkpour, P., Irvine, C., Chen, L., Singh, S.: Embedding multimodal relational data (2017)
Xie, R., Liu, Z., Luan, H., Sun, M.: Image-embodied knowledge representation learning. In: Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence, IJCAI-2017, pp. 3140–3146 (2017). https://doi.org/10.24963/ijcai.2017/438
Sun, Z., Hu, W., Li, C.: Cross-lingual entity alignment via joint attribute-preserving embedding. In: d’Amato, C., et al. (eds.) ISWC 2017. LNCS, vol. 10587, pp. 628–644. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68288-4_37
Tay, Y., Tuan, L.A., Phan, M.C., Hui, S.C.: Multi-task neural network for non-discrete attribute prediction in knowledge graphs. In: Proceedings of the 2017 ACM on Conference on Information and Knowledge Management, pp. 1029–1038. ACM (2017)
Thoma, S., Rettinger, A., Both, F.: Towards holistic concept representations: embedding relational knowledge, visual attributes, and distributional word semantics. In: d’Amato, C., et al. (eds.) ISWC 2017. LNCS, vol. 10587, pp. 694–710. Springer, Cham (2017). https://doi.org/10.1007/978-3-319-68288-4_41
Toutanova, K., Chen, D.: Observed versus latent features for knowledge base and text inference. In: Proceedings of the 3rd Workshop on Continuous Vector Space Models and their Compositionality, pp. 57–66 (2015)
Toutanova, K., Chen, D., Pantel, P., Poon, H., Choudhury, P., Gamon, M.: Representing text for joint embedding of text and knowledge bases. In: EMNLP, vol. 15, pp. 1499–1509 (2015)
Trouillon, T., Welbl, J., Riedel, S., Gaussier, É., Bouchard, G.: Complex embeddings for simple link prediction. In: International Conference on Machine Learning, pp. 2071–2080 (2016)
Tu, C., Liu, H., Liu, Z., Sun, M.: CANE: context-aware network embedding for relation modeling. In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics: Long Papers, vol. 1, pp. 1722–1731 (2017)
Wu, Y., Wang, Z.: Knowledge graph embedding with numeric attributes of entities. In: Proceedings of the Third Workshop on Representation Learning for NLP, pp. 132–136 (2018)
Xie, R., Liu, Z., Jia, J., Luan, H., Sun, M.: Representation learning of knowledge graphs with entity descriptions. In: AAAI, pp. 2659–2665 (2016)
Xu, J., Chen, K., Qiu, X., Huang, X.: Knowledge graph representation with jointly structural and textual encoding. In: IJCAI (2016)
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Kristiadi, A., Khan, M.A., Lukovnikov, D., Lehmann, J., Fischer, A. (2019). Incorporating Literals into Knowledge Graph Embeddings. In: Ghidini, C., et al. The Semantic Web – ISWC 2019. ISWC 2019. Lecture Notes in Computer Science(), vol 11778. Springer, Cham. https://doi.org/10.1007/978-3-030-30793-6_20
Download citation
DOI: https://doi.org/10.1007/978-3-030-30793-6_20
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-30792-9
Online ISBN: 978-3-030-30793-6
eBook Packages: Computer ScienceComputer Science (R0)