An Evolutionary Approach to Automatic Keyword Selection for Twitter Data Analysis

  • Conference paper
Hybrid Artificial Intelligent Systems (HAIS 2020)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12344)

Abstract

In this paper, we propose an approach to intelligent, automatic keyword selection for Twitter data collection and analysis. The approach combines deep learning and evolutionary computing. To provide context for its application, we present the algorithm through a case study of public health surveillance over Twitter, a field that has attracted considerable interest. We also describe an optimization objective function specific to the keyword selection problem, as well as metrics for evaluating Twitter keywords, namely reach and tweet retrieval power, in addition to traditional metrics such as precision. In our experiments, our evolutionary computing approach achieved a tweet retrieval power of 0.55, compared to 0.35 achieved by the baseline human approach.

Supported by Public Health England.

Notes

  1. https://www.urbandictionary.com.

Author information

Corresponding author

Correspondence to Oduwa Edo-Osagie.

Copyright information

© 2020 Springer Nature Switzerland AG

About this paper

Cite this paper

Edo-Osagie, O., Iglesia, B.D.L., Lake, I., Edeghere, O. (2020). An Evolutionary Approach to Automatic Keyword Selection for Twitter Data Analysis. In: de la Cal, E.A., Villar Flecha, J.R., Quintián, H., Corchado, E. (eds) Hybrid Artificial Intelligent Systems. HAIS 2020. Lecture Notes in Computer Science, vol 12344. Springer, Cham. https://doi.org/10.1007/978-3-030-61705-9_14

  • DOI: https://doi.org/10.1007/978-3-030-61705-9_14

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-030-61704-2

  • Online ISBN: 978-3-030-61705-9

  • eBook Packages: Computer Science, Computer Science (R0)
