Abstract
Social networks have become a popular way for Internet users to express their thoughts and exchange real-time information. The growing number of topic-oriented resources in social networks has drawn increasing attention and driven the development of topic detection. Text-only topic detection originates from text mining and document clustering, and aims to automatically identify topics from massive data in an unsupervised manner. With the development of the mobile Internet, user-generated content in social networks often contains multimodal data, such as images and videos. Multimodal topic detection poses the new challenge of fusing and aligning heterogeneous features from different modalities, which has received limited attention in existing research. To address this problem, we adopt a Graph Fusion Network (GFN) based encoder and a multilayer perceptron (MLP) decoder to hierarchically fuse information from images and texts. The proposed method regards multimodal features as vertices and models the interactions between modalities with edges, layer by layer. The fused representations therefore contain rich semantic information and explicit multimodal dynamics, which help improve the performance of multimodal topic detection. Experimental results on a real-world multimodal topic detection dataset demonstrate that our model performs favorably against all the baseline methods.
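The vertex-and-edge fusion idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the two-layer graph (two unimodal vertices feeding one bimodal vertex), the softmax edge weights, the random parameters, the vector sizes, and every function name (`fuse`, `decode`, `build_params`) are assumptions made for illustration. The decoder's output is likewise assumed here to be a distribution over a hypothetical number of topics.

```python
import math
import random

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dense(x, weights, bias, activation=None):
    """Fully connected layer; weights holds one row per output unit."""
    out = [sum(w * xi for w, xi in zip(row, x)) + b
           for row, b in zip(weights, bias)]
    return [activation(o) for o in out] if activation else out

def init_layer(rng, n_in, n_out):
    """Random weights and zero bias (untrained, illustration only)."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)]
               for _ in range(n_out)]
    return weights, [0.0] * n_out

def build_params(dim, hidden, n_topics, seed=0):
    rng = random.Random(seed)
    return {
        "score": init_layer(rng, dim, 1),          # vertex importance scorer
        "bimodal": init_layer(rng, 2 * dim, dim),  # builds the bimodal vertex
        "hidden": init_layer(rng, 2 * dim, hidden),
        "out": init_layer(rng, hidden, n_topics),
    }

def fuse(text_vec, image_vec, params):
    """Assumed two-layer graph fusion: unimodal vertices, then a bimodal vertex."""
    # Layer 1: each modality is a vertex; score it to obtain edge weights.
    scores = [dense(v, *params["score"])[0] for v in (text_vec, image_vec)]
    alpha = softmax(scores)
    # Layer 2: the bimodal vertex aggregates both unimodal vertices
    # through the weighted edges.
    weighted = [alpha[0] * x for x in text_vec] + [alpha[1] * x for x in image_vec]
    bimodal = dense(weighted, *params["bimodal"], activation=math.tanh)
    # Readout: concatenate a unimodal summary with the bimodal vertex.
    unimodal = [alpha[0] * t + alpha[1] * i for t, i in zip(text_vec, image_vec)]
    return unimodal + bimodal, alpha

def decode(fused, params):
    """MLP decoder mapping the fused vector to assumed topic probabilities."""
    hidden = dense(fused, *params["hidden"], activation=math.tanh)
    return softmax(dense(hidden, *params["out"]))

params = build_params(dim=4, hidden=6, n_topics=3)
text_vec = [0.2, -0.1, 0.4, 0.3]   # stand-in for a text feature vector
image_vec = [0.5, 0.1, -0.2, 0.0]  # stand-in for an image feature vector
fused, alpha = fuse(text_vec, image_vec, params)
topic_probs = decode(fused, params)
```

In this sketch the edge weights `alpha` decide how much each modality contributes to the bimodal vertex, which is one simple way to make cross-modal interactions explicit in the fused representation.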
Acknowledgements
This research is supported by the NSFC-Xinjiang Joint Fund (No. U1903128), NSFC General Technology Joint Fund for Basic Research (No. U1836109, No. U1936206), Natural Science Foundation of Tianjin, China (No. 18ZXZNGX00110, No. 18ZXZNGX00200), and the Fundamental Research Funds for the Central Universities, Nankai University (No. 63211128).
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, Y., Song, K., Cai, X., Tuergong, Y., Yuan, L., Zhang, Y. (2021). Multimodal Topic Detection in Social Networks with Graph Fusion. In: Xing, C., Fu, X., Zhang, Y., Zhang, G., Borjigin, C. (eds) Web Information Systems and Applications. WISA 2021. Lecture Notes in Computer Science, vol 12999. Springer, Cham. https://doi.org/10.1007/978-3-030-87571-8_3
DOI: https://doi.org/10.1007/978-3-030-87571-8_3
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-87570-1
Online ISBN: 978-3-030-87571-8
eBook Packages: Computer Science, Computer Science (R0)