Abstract
Advances in digital technologies have made interaction between computers and humans essential. Natural Language Generation (NLG) provides techniques that can facilitate and improve this type of interaction. However, existing approaches in this field are usually developed ad hoc for specific tasks, purposes and domains, which hinders the development of flexible and adaptable multi-domain NLG systems. This paper presents HanaNLG, a hybrid generic NLG approach focused on the surface realisation stage. HanaNLG combines statistical and knowledge-based techniques and is able to generate text independently of the domain. In particular, it exploits language models in conjunction with semantic knowledge, which gives flexibility to the whole generation process and minimises the high cost of developing common NLG components such as grammars. This joint perspective contributes to advancing the NLG field by providing greater flexibility in (i) producing text for different domains and (ii) increasing the variety of vocabulary that appears in the generated text. To assess its effectiveness, HanaNLG was tested in two domains: (i) NLG for assistive technologies and (ii) NLG for creating opinionated sentences. The results are positive (almost 99% of the generated sentences in both domains are original and well constructed) and show that our approach is capable of generating text for different domains. More importantly, combining language models with semantic knowledge enhances the quality of the generated text, improving on the results obtained by methods that rely only on statistical techniques.
Notes
- 2. A set of synonyms used in WordNet that are related to a term.
Acknowledgment
This research has been partially funded by the Generalitat Valenciana through the project “SIIA: Tecnologías del lenguaje humano para una sociedad inclusiva, igualitaria, y accesible” (PROMETEU/2018/089).
Copyright information
© 2023 Springer Nature Switzerland AG
About this paper
Cite this paper
Barros, C., Lloret, E. (2023). HanaNLG: A Flexible Hybrid Approach for Natural Language Generation. In: Gelbukh, A. (eds) Computational Linguistics and Intelligent Text Processing. CICLing 2019. Lecture Notes in Computer Science, vol 13452. Springer, Cham. https://doi.org/10.1007/978-3-031-24340-0_38
DOI: https://doi.org/10.1007/978-3-031-24340-0_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-24339-4
Online ISBN: 978-3-031-24340-0
eBook Packages: Computer Science, Computer Science (R0)