Aligning Gaussian-Topic with Embedding Network for Summarization Ranking

  • Conference paper
Web and Big Data (APWeb-WAIM 2017)

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 10366)

Abstract

Query-oriented summarization addresses the problem of information overload and helps people grasp the main ideas of a document collection in a short time. Because summaries are composed of sentences, the basic task in producing a salient summary is to select quality sentences with respect to both user-specific queries and multiple documents. Sentence embeddings have been shown to be effective in summarization tasks. However, embedding-based methods lack the latent topic structure of the content, so a summary built in vector space alone can hardly capture multi-topical content. In this paper, our proposed model incorporates topical aspects into continuous vector representations, jointly learning semantically rich representations encoded as vectors. Leveraging a topic-filtering and embedding-ranking model, the summarizer then selects the most salient sentences. Experiments demonstrate the outstanding performance of our proposed model in terms of topic prominence and semantic coherence.
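The two-stage pipeline the abstract describes (topic filtering followed by embedding-based ranking against the query) can be sketched generically. This is a toy illustration under stated assumptions, not the paper's Gaussian-topic model: the bag-of-words "embedding", the `rank_sentences` helper, and the hand-picked `topic_words` are all hypothetical stand-ins for the learned components.

```python
# Toy sketch of topic filtering + embedding-based sentence ranking.
# The real model learns topics and embeddings jointly; here both are
# replaced by trivial stand-ins to show the control flow only.
from collections import Counter
import math

def embed(text):
    """Toy sentence 'embedding': a bag-of-words count vector."""
    return Counter(text.lower().split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    num = sum(a[w] * b[w] for w in set(a) & set(b))
    den = math.sqrt(sum(v * v for v in a.values())) * \
          math.sqrt(sum(v * v for v in b.values()))
    return num / den if den else 0.0

def rank_sentences(sentences, query, topic_words, k=2):
    # Stage 1, topic filtering: keep sentences that mention a topic word.
    kept = [s for s in sentences
            if set(s.lower().split()) & topic_words]
    # Stage 2, embedding ranking: order survivors by similarity to the query.
    q = embed(query)
    kept.sort(key=lambda s: cosine(embed(s), q), reverse=True)
    return kept[:k]

sentences = [
    "New hydroelectric projects are planned on the river.",
    "The stadium hosted a football match yesterday.",
    "Dam construction problems delayed the hydroelectric project.",
]
query = "What hydroelectric projects are planned or in progress?"
topic_words = {"hydroelectric", "dam", "river"}
print(rank_sentences(sentences, query, topic_words))
```

The filter discards the off-topic football sentence before ranking, so the query similarity only has to discriminate among on-topic candidates.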

Notes

  1. In DUC, the word-based query is also called a “title”, such as “New hydroelectric projects”.

  2. In DUC, the sentence-based query is also called a “narrative”, such as “What hydroelectric projects are planned or in progress and what problems are associated with them?”.

  3. http://duc.nist.gov/data.html.

  4. In DUC, the query is also called a “narrative” or “topic”.


Acknowledgments

This work was supported by the National Natural Science Foundation of China (Grant No. 61602036), the National Basic Research Program of China (973 Program, Grant No. 2013CB329303), and the Beijing Advanced Innovation Center for Imaging Technology (BAICIT-2016007).

Author information

Corresponding author

Correspondence to Yang Gao.

Copyright information

© 2017 Springer International Publishing AG

About this paper

Cite this paper

Wei, L., Huang, H., Gao, Y., Wei, X., Feng, C. (2017). Aligning Gaussian-Topic with Embedding Network for Summarization Ranking. In: Chen, L., Jensen, C., Shahabi, C., Yang, X., Lian, X. (eds) Web and Big Data. APWeb-WAIM 2017. Lecture Notes in Computer Science, vol 10366. Springer, Cham. https://doi.org/10.1007/978-3-319-63579-8_46

Download citation

  • DOI: https://doi.org/10.1007/978-3-319-63579-8_46

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-63578-1

  • Online ISBN: 978-3-319-63579-8

  • eBook Packages: Computer Science, Computer Science (R0)
