Abstract
Large-scale data are ubiquitous in today’s real-world applications, including learning on cross-media data. We propose a semi-supervised learning method, named Multiple Binary Subspace Regression, for concept detection on cross-media data. To mine the features common to multiple modalities, we map the data of each modality onto its corresponding subspace for dimensionality reduction, so that all modalities are simultaneously projected onto the same subspace-level representation. All subspaces are constrained to be binary, so the subsequent computation involves only additions and no multiplications. The dimensionality reduction to a binary subspace and the concept detection on this subspace are optimized jointly, which yields a semi-supervised model. To handle large-scale data, the method can be readily implemented on a MapReduce-based Hadoop system. Empirical studies demonstrate competitive convergence, efficiency, and scalability in comparison with state-of-the-art methods.
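The addition-only property of a binary subspace can be illustrated with a minimal sketch. It assumes the subspace is stored as a 0/1 matrix whose rows are basis vectors; the function names `project_binary` and `project_dense` and the toy data are hypothetical and are not taken from the article, and the sketch shows only the projection step, not the Multiple Binary Subspace Regression method itself.

```python
def project_binary(x, basis):
    """Project x onto a binary basis using additions only.

    x:     list of d feature values.
    basis: k rows of d entries, each 0 or 1 (assumed layout, for illustration).
    """
    projected = []
    for row in basis:
        total = 0.0
        for value, bit in zip(x, row):
            if bit:  # a 1 selects the feature, a 0 skips it; no multiplication
                total += value
        projected.append(total)
    return projected


def project_dense(x, basis):
    """Reference projection using ordinary dot products (with multiplications)."""
    return [sum(v * b for v, b in zip(x, row)) for row in basis]


x = [0.5, 2.0, -1.0, 3.0]
basis = [[1, 0, 1, 0],
         [0, 1, 1, 1]]

print(project_binary(x, basis))  # [-0.5, 4.0]
print(project_dense(x, basis))   # same values, computed with multiplications
```

Both functions return the same coordinates; the binary version replaces each multiply-accumulate with a conditional add, which is the saving the abstract refers to.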

Acknowledgments
This work is supported in part by the National Basic Research Program of China (2012CB316400) and Zhejiang Provincial Engineering Center on Media Data Cloud Processing and Analysis. Z. Zhang is also supported by US NSF (CCF-1017828).
Cite this article
Zhao, X., Zhang, C. & Zhang, Z. Distributed cross-media multiple binary subspace learning. Int J Multimed Info Retr 4, 153–164 (2015). https://doi.org/10.1007/s13735-015-0081-4