Abstract
As the quantity of social and online analytics data has drastically increased, a wide variety of methods have been deployed to make sense of this data, typically via computational and algorithmic approaches. However, in many cases, these approaches trade one form of complexity for another by ignoring the principles of human cognitive processing. In this perspective chapter, we propose employing Personas as an alternative way of making large volumes of online user analytics information useful to end users of user and customer analytics, with results applicable in software development, business sectors, the communication industry, and other domains where understanding online user behavior is deemed important. Toward this end, we have developed a system that automatically generates data-driven Personas from social media and online analytics data. The system is capable of handling hundreds of millions of user interactions from tens of thousands of pieces of content on YouTube, Facebook, and Google Analytics, while retaining the privacy of individual users of those channels. Our approach (1) identifies and prioritizes user segments by their online behavior, (2) associates the segments with demographic data, and (3) creates rich Persona profiles by dynamically adding characteristics, such as names, photos, and descriptive quotes. This chapter characterizes the currently open research problems in automatic Persona generation, such as de-aggregation of data, cross-platform data mapping, filtering of toxic comments, and choosing the right information content according to end-user needs. Addressing these problems requires the use of state-of-the-art techniques of computer and information science within one system and benefits greatly from interdisciplinary collaboration.
Overall, the research agenda set in this work aims at achieving the vision of automatic user profiling using diverse online and social media platforms and advanced data processing methods, with the end goal of making complex analytics data more useful for human decision makers, especially those working with online content.
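The three steps of the approach can be illustrated with a minimal Python sketch. It is not the system described in this chapter: the view counts, the placeholder names, and the function build_personas are all hypothetical, and segments are formed by a simple "most-viewed content" rule that stands in for whatever segmentation technique the actual system applies. The input contains only group-level totals, in keeping with the privacy property stated above.

```python
from collections import defaultdict

# Hypothetical aggregated view counts (assumed data, for illustration only):
# (gender, age range, location) -> {content id: number of views}.
# Only group-level totals appear; no individual user is represented.
VIEWS = {
    ("Male", "35-44", "London"): {"video_a": 900, "video_b": 100},
    ("Female", "25-34", "Doha"): {"video_a": 50, "video_b": 700},
    ("Male", "25-34", "Doha"): {"video_a": 30, "video_b": 400},
}


def build_personas(views, names):
    """Toy version of the three steps; a stand-in, not the chapter's method."""
    # Step 1: form behavioral segments by grouping demographic groups
    # according to the content they view most.
    segments = defaultdict(list)
    for group, counts in views.items():
        top_content = max(counts, key=counts.get)
        segments[top_content].append((group, sum(counts.values())))

    # Prioritize segments by their total number of views, largest first.
    ranked = sorted(
        segments.items(),
        key=lambda item: sum(total for _, total in item[1]),
        reverse=True,
    )

    personas = []
    for (content, members), name in zip(ranked, names):
        # Step 2: associate the segment with demographic data by picking
        # the demographic group that contributes the most views.
        (gender, age, location), _ = max(members, key=lambda m: m[1])
        # Step 3: enrich the profile; here only a placeholder name is added.
        personas.append({
            "name": name,
            "gender": gender,
            "age": age,
            "location": location,
            "top_content": content,
            "total_views": sum(total for _, total in members),
        })
    return personas


personas = build_personas(VIEWS, names=["Persona 1", "Persona 2"])
for persona in personas:
    print(persona)
```

In this toy data set, the two Doha groups fall into the same segment because both mostly watch video_b, and that segment is ranked first because its combined views exceed those of the London group. A real system would also attach photos and descriptive quotes in step 3, which this sketch omits.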
Notes
- 1.
For example, instead of knowing that a person named John Meyers watched a video about women’s rights in Pakistan, we only know how many times the group he represents, e.g., [Male, 35–44, London], watched a particular video.
- 2.
A live demo is available at https://Persona.qcri.org/.
- 5.
Application programming interface, used for accessing a remote system; APG uses APIs for data retrieval.
Acknowledgements
We would like to thank the employees of the Al Jazeera Media Network, Qatar Airways, and Qatar Foundation who have collaborated with us on this project.
Copyright information
© 2019 Springer-Verlag London Ltd., part of Springer Nature
About this chapter
Cite this chapter
Salminen, J., Jansen, B.J., An, J., Kwak, H., Jung, SG. (2019). Automatic Persona Generation for Online Content Creators: Conceptual Rationale and a Research Agenda. In: Personas - User Focused Design. Human–Computer Interaction Series. Springer, London. https://doi.org/10.1007/978-1-4471-7427-1_8
DOI: https://doi.org/10.1007/978-1-4471-7427-1_8
Publisher Name: Springer, London
Print ISBN: 978-1-4471-7426-4
Online ISBN: 978-1-4471-7427-1
eBook Packages: Computer Science, Computer Science (R0)