Abstract
This paper reports an initial study assessing the viability of multi-document summarization techniques for automatically captioning geo-referenced images. Automatic captioning requires summarizing multiple Web documents that contain information related to an image’s location. We use several state-of-the-art summarization systems to generate generic and query-based multi-document summaries and evaluate them using ROUGE metrics [24] relative to human-generated summaries. Results show that query-based summaries perform better than generic ones and are thus more appropriate for image captioning, that is, for generating short descriptions of the location or place captured in the image. For our future work on automatic image captioning, this result suggests that developing the query-based summarizer further and biasing it to account for user-specific requirements will prove worthwhile.
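To make the evaluation concrete, the following is a minimal sketch of an n-gram recall score in the spirit of ROUGE-N [24]. It is not the official ROUGE package and not the evaluation setup used in this study: the function names, the whitespace tokenization, and the toy sentences are assumptions introduced purely for illustration.

```python
from collections import Counter


def ngrams(tokens, n):
    """Count the n-grams (as tuples) occurring in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n_recall(candidate, references, n=2):
    """Simplified ROUGE-N recall: clipped n-gram matches between the
    candidate and each reference, divided by the total number of
    reference n-grams. Tokenization is naive lowercased whitespace
    splitting (an assumption; the real package offers stemming and
    stop-word options)."""
    cand_counts = ngrams(candidate.lower().split(), n)
    matched = 0
    total = 0
    for ref in references:
        ref_counts = ngrams(ref.lower().split(), n)
        # Clip each match at the number of times the candidate has the n-gram.
        matched += sum(min(count, cand_counts[gram])
                       for gram, count in ref_counts.items())
        total += sum(ref_counts.values())
    return matched / total if total else 0.0


# Hypothetical toy data, not taken from the paper's corpus.
model_summaries = ["the abbey is a gothic church in london"]
generic_summary = "the church is a large building"
query_based_summary = "the abbey is a gothic church"

generic_score = rouge_n_recall(generic_summary, model_summaries, n=1)
query_score = rouge_n_recall(query_based_summary, model_summaries, n=1)
```

In this toy case the query-focused candidate shares more unigrams with the human-written model summary than the generic one, so it receives the higher recall. This only illustrates the kind of comparison that the reported ROUGE results rest on; it says nothing about the actual scores obtained in the study.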
References
Aker, A., Gaizauskas, R.: Summary generation for toponym-referenced images using object type language models. In: Proceedings of the International Conference on Recent Advances in Natural Language Processing (RANLP), Borovets (2009)
Aker, A., Gaizauskas, R.: Model summaries for location-related images. In: Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC), Valletta (2010)
Barnard, K., Duygulu, P., Forsyth, D., de Freitas, N., Blei, D., Jordan, M.: Matching words and pictures. J. Mach. Learn. Res. 3, 1107–1135 (2003)
Bellare, K., Das Sarma, A., Loiwal, N., Mehta, V., Ramakrishnan, G., Bhattacharyya, P.: Generic text summarization using WordNet. In: Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC), Lisbon (2004)
Carenini, G., Ng, R., Pauls, A.: Multi-document summarization of evaluative text. In: Proceedings of the 11th Conference of the European Chapter of the Association for Computational Linguistics (EACL), Trento (2006)
Cesarano, C., Mazzeo, A., Picariello, A.: A system for summary-document similarity in notary domain. In: Proceedings of the International Workshop on Database and Expert Systems Applications, Regensburg (2007)
Chowdary, C., Kumar, P.S.: Update summarizer using MMR approach. In: Proceedings of the Text Analysis Conference (TAC), Gaithersburg (2008)
Deschacht, K., Moens, M.: Text analysis for automatic image annotation. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), Prague (2007)
El-haj, M., Hammo, B.: Evaluation of query-based Arabic text summarization system. In: Proceedings of the International Conference on Natural Language Processing and Software Engineering, Beijing (2008)
Erkan, G., Radev, D.: LexRank: graph-based lexical centrality as salience in text summarization. J. Artif. Intell. Res. 22, 457–479 (2004)
Fan, J., Gao, Y., Luo, H., Keim, D., Li, Z.: A novel approach to enable semantic and visual image summarization for exploratory image search. In: Proceedings of the 1st ACM International Conference on Multimedia Information Retrieval (MIR), Vancouver (2008)
Ferrández, O., Micol, D., Muñoz, R., Palomar, M.: A perspective-based approach for solving textual entailment recognition. In: Proceedings of the ACL-PASCAL Workshop on Textual Entailment and Paraphrasing, Prague (2007)
Fiszman, M., Rindflesch, T., Kilicoglu, H.: Abstraction summarization for managing the biomedical research literature. In: Proceedings of the HLT-NAACL Workshop on Computational Lexical Semantics, Boston (2004)
Givón, T.: Syntax: A Functional-Typological Introduction, vol. II. John Benjamins Publishing Company, Amsterdam/Philadelphia (1990)
Glickman, O.: Applied textual entailment. Ph.D. thesis, Bar Ilan University (2006)
Goldstein, J., Mittal, V., Carbonell, J., Kantrowitz, M.: Multi-document summarization by sentence extraction. In: Proceedings of the NAACL-ANLP Workshop on Automatic summarization, Seattle (2000)
Gotti, F., Lapalme, G., Nerima, L., Wehrli, E.: GOFAISUM: a symbolic summarizer for DUC. In: Proceedings of the Document Understanding Conference (DUC), Rochester (2007)
Gupta, S., Nenkova, A., Jurafsky, D.: Measuring importance and query relevance in topic-focused multi-document summarization. In: Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics (ACL), Prague. Demo and Poster Sessions (2007)
He, L., Sanocki, E., Gupta, A., Grudin, J.: Auto-summarization of audio-video presentations. In: Proceedings of the Seventh ACM International Conference on Multimedia (MULTIMEDIA), Orlando (1999)
Hsin-Hsi, C., Chuan-Jie, L.: A multilingual news summarizer. In: Proceedings of the 18th Conference on Computational Linguistics (COLING), Saarbrücken (2000)
Jaoua, M., Ben Hamadou, A.: Automatic text summarization of scientific articles based on classification of extract’s population. In: Proceedings of the 4th International Conference on Computational Linguistics and Intelligent Text Processing (CICLing), Mexico City (2003)
Kan, M.Y., McKeown, K., Klavans, J.: Domain-specific informative and indicative summarization for information retrieval. In: Proceedings of the Document Understanding Conference (DUC), New Orleans (2001)
Lesk, M.: Automatic sense disambiguation using machine readable dictionaries: how to tell a pine cone from an ice cream cone. In: Proceedings of SIGDOC, Toronto (1986)
Lin, C.Y.: ROUGE: a package for automatic evaluation of summaries. In: Proceedings of the Workshop on Text Summarization Branches Out (WAS), Barcelona (2004)
Lloret, E., Ferrández, O., Muñoz, R., Palomar, M.: A text summarization approach under the influence of textual entailment. In: Proceedings of the 5th Natural Language Processing and Cognitive Science Workshop, Barcelona (2008)
Lloret, E., Palomar, M.: A gradual combination of features for building automatic summarisation systems. In: Proceedings of the 12th International Conference on Text, Speech and Dialogue (TSD), Pilsen (2009)
Luhn, H.P.: The automatic creation of literature abstracts. IBM J. Res. Dev. 2, 159–165 (1958)
Mani, I.: Automatic Summarization. John Benjamins Publishing Company, Amsterdam/Philadelphia (2001)
Marcu, D.: Discourse trees are good indicators of importance in text. In: Mani, I., Maybury, M.T. (eds.) Advances in Automatic Text Summarization. MIT, Cambridge, MA (1999)
Mihalcea, R., Ceylan, H.: Explorations in automatic book summarization. In: Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague (2007)
Mori, Y., Takahashi, H., Oka, R.: Automatic word assignment to images based on image division and vector quantization. In: Proceedings of RIAO 2000: Content-Based Multimedia Information Access, Paris (2000)
Pan, J.Y., Yang, H.J., Duygulu, P., Faloutsos, C.: Automatic image captioning. In: Proceedings of the IEEE International Conference on Multimedia and Expo (ICME), Taipei (2004)
Pedersen, T., Patwardhan, S., Michelizzi, J.: WordNet::Similarity – measuring the relatedness of concepts. In: Proceedings of the National Conference on Artificial Intelligence (AAAI), San Jose (2004)
Plaza, L., Díaz, A., Gervás, P.: Concept-graph based biomedical automatic summarization using ontologies. In: Proceedings of the 3rd Textgraphs Workshop on Graph-based Algorithms for Natural Language Processing, Manchester, pp. 53–56 (2008)
Plaza, L., Díaz, A., Gervás, P.: Automatic summarization of news using WordNet concept graphs. In: Proceedings of the Informatics IADIS International Conference, Algarve (2009)
Plaza, L., Lloret, E., Aker, A.: Improving automatic image captioning using text summarization techniques. In: Proceedings of the 13th International Conference on Text, Speech and Dialogue (TSD), Brno (2010)
Qiu, L., Kan, M.Y., Chua, T.S.: A public reference implementation of the RAP anaphora resolution algorithm. In: Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC), Lisbon (2004)
Radev, D., Blair-Goldensohn, S., Zhang, Z.: Experiments in single and multidocument summarization using MEAD. In: Proceedings of the Document Understanding Conference (DUC), New Orleans (2001)
Radev, D., Allison, T., Blair-Goldensohn, S., Blitzer, J., Çelebi, A., Dimitrov, S., Drabek, E., Hakim, A., Lam, W., Liu, D., Otterbacher, J., Qi, H., Saggion, H., Teufel, S., Topper, M., Winkel, A., Zhang, Z.: MEAD – a platform for multidocument multilingual text summarization. In: Proceedings of the 4th International Conference on Language Resources and Evaluation (LREC), Lisbon (2004)
Radev, D., Jing, H., Styś, M., Tam, D.: Centroid-based summarization of multiple documents. Inf. Process. Manage. 40(6), 919–938 (2004)
Radev, D., Blitzer, J., Winkel, A., Allison, T., Topper, M.: MEAD documentation v3.10. Tech. rep. URL http://www.summarization.com/mead/ (2006). Accessed June 2011
Saggion, H.: SUMMA: A robust and adaptable summarization tool. Rev. Trait. Automat. Lang. 49(2), 103–125 (2008)
Saggion, H., Gaizauskas, R.: Multi-document summarization by cluster/profile relevance and redundancy removal. In: Proceedings of the Document Understanding Conference (DUC), Boston (2004)
Saggion, H., Teufel, S., Radev, D., Lam, W.: Meta-evaluation of summaries in a cross-lingual environment using content-based metrics. In: Proceedings of the 19th International Conference on Computational Linguistics (COLING), Taipei (2002)
Salton, G.: Automatic Text Processing. Addison-Wesley Longman Publishing Co., Inc., Boston, MA, USA (1988)
Salton, G., Buckley, C.: Term-Weighting Approaches in Automatic Text Retrieval. Pergamon, Tarrytown, NY, USA (1988)
Salton, G., Lesk, M.: Computer evaluation of indexing and text processing. ACM J. 15(1), 8–36 (1968)
Schilder, F., Kondadadi, R.: FastSum: fast and accurate query-based multi-document summarization. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL): Human Language Technologies. Short Papers, Columbus (2008)
Sekine, S., Nobata, C.: A survey for multi-document summarization. In: Proceedings of the HLT-NAACL Workshop on Text Summarization, Edmonton (2003)
Spärck Jones, K.: Automatic summarizing: factors and directions. In: Mani, I., Maybury, M.T. (eds.) Advances in Automatic Text Summarization, pp. 1–14. MIT, Cambridge, MA (1999)
Steinberger, J., Poesio, M., Kabadjov, M., Ježek, K.: Two uses of anaphora resolution in summarization. Inf. Process. Manage. 43(6), 1663–1680 (2007)
Sun, J.T., Shen, D., Zeng, H.J., Yang, Q., Lu, Y., Chen, Z.: Web-page summarization using clickthrough data. In: Proceedings of the 28th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval, Salvador (2005)
Svore, K., Vanderwende, L., Burges, C.: Enhancing single-document summarization by combining RankNet and third-party sources. In: Proceedings of the Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning (EMNLP-CoNLL), Prague (2007)
Titov, I., McDonald, R.: A joint model of text and aspect ratings for sentiment summarization. In: Proceedings of the 46th Annual Meeting of the Association for Computational Linguistics (ACL): Human Language Technologies, Columbus (2008)
Trappey, A., Trappey, C., Wu, C.Y.: Automatic patent document summarization for collaborative knowledge systems and services. J. Syst. Sci. Syst. Eng. 1, 71–94 (2009)
Westerveld, T.: Image retrieval: content versus context. In: Proceedings of RIAO 2000: Content-Based Multimedia Information Access, Paris (2000)
Yoo, I., Hu, X., Song, I.Y.: A coherent graph-based semantic clustering and summarization approach for biomedical literature and a new summarization evaluation method. BMC Bioinformatics 8(9) (2007)
Zechner, K., Waibel, A.: DiaSumm: flexible summarization of spontaneous dialogues in unrestricted domains. In: Proceedings of the 18th Conference on Computational Linguistics (COLING), Saarbrücken (2000)
Acknowledgements
This work was supported by the EU-funded TRIPOD project (IST-FP6-045335) and by the Spanish Government through the FPU program and the projects TIN2009-14659-C03-01, TSI 020312-2009-44, and TIN2009-13391-C04-01; by Conselleria d’Educació – Generalitat Valenciana (grant no. PROMETEO/2009/119 and ACOMP/2010/286); and the FPI program (BES-2007-16268) from the Spanish Ministry of Science and Innovation (project TEXT-MESS (TIN2006-15265-C06-01)). We would like to thank Horacio Saggion for his support with SUMMA. We are also grateful to Emina Kurtic for comments on the previous versions of this paper.
Copyright information
© 2013 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Aker, A., Plaza, L., Lloret, E., Gaizauskas, R. (2013). Multi-Document Summarization Techniques for Generating Image Descriptions: A Comparative Analysis. In: Poibeau, T., Saggion, H., Piskorski, J., Yangarber, R. (eds) Multi-source, Multilingual Information Extraction and Summarization. Theory and Applications of Natural Language Processing. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-28569-1_14
DOI: https://doi.org/10.1007/978-3-642-28569-1_14
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-28568-4
Online ISBN: 978-3-642-28569-1
eBook Packages: Computer Science (R0)