Abstract
This paper proposes a large-scale human action recognition system that combines the big data processing framework Spark with the Graphics Processing Unit (GPU), so as to exploit both the efficient in-memory computing of Spark and the fine-grained parallel computing capacity of the GPU for visual data processing. Several key algorithms for human action recognition, including trajectory-based feature extraction, Gaussian Mixture Model (GMM) generation, and Fisher Vector (FV) encoding, are implemented on the proposed GPU-based Spark framework. Experimental results on the benchmark human action dataset Hollywood-2 demonstrate that the proposed framework dramatically accelerates human action recognition.
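The abstract names Fisher Vector encoding without describing it. As a rough, CPU-only illustration of that step, the sketch below encodes a set of local descriptors against a diagonal-covariance GMM using the commonly used gradients with respect to the component means and variances. It is not the GPU-based Spark implementation described in the paper: the function names (`gaussian_log_density`, `posteriors`, `fisher_vector`), the diagonal-covariance assumption, and the omission of any post-normalization are all assumptions made for illustration.

```python
import math


def gaussian_log_density(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at descriptor x."""
    return -0.5 * sum(
        math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def posteriors(x, weights, means, variances):
    """Soft assignment of descriptor x to each GMM component."""
    logs = [
        math.log(w) + gaussian_log_density(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(logs)  # subtract the maximum for numerical stability
    exps = [math.exp(value - peak) for value in logs]
    total = sum(exps)
    return [e / total for e in exps]


def fisher_vector(descriptors, weights, means, variances):
    """Illustrative Fisher Vector: mean gradients, then variance gradients.

    Assumes a diagonal-covariance GMM (hypothetical setup, not taken from
    the paper). The result has 2 * K * D entries for K components and
    D-dimensional descriptors.
    """
    count = len(descriptors)
    num_components = len(weights)
    dim = len(means[0])
    grad_mu = [[0.0] * dim for _ in range(num_components)]
    grad_sigma = [[0.0] * dim for _ in range(num_components)]

    for x in descriptors:
        gamma = posteriors(x, weights, means, variances)
        for k in range(num_components):
            for d in range(dim):
                # Whitened deviation of the descriptor from component k.
                z = (x[d] - means[k][d]) / math.sqrt(variances[k][d])
                grad_mu[k][d] += gamma[k] * z
                grad_sigma[k][d] += gamma[k] * (z * z - 1.0)

    encoded = []
    for k in range(num_components):
        scale = count * math.sqrt(weights[k])
        encoded.extend(g / scale for g in grad_mu[k])
    for k in range(num_components):
        scale = count * math.sqrt(2.0 * weights[k])
        encoded.extend(g / scale for g in grad_sigma[k])
    return encoded
```

Each descriptor is processed independently inside the outer loop before the per-component sums are accumulated, which is the kind of structure that lends itself to fine-grained parallel execution.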
This work was supported in part by the National Natural Science Foundation of China under Grant 61472281, the “Shu Guang” project of Shanghai Municipal Education Commission and Shanghai Education Development Foundation under Grant 12SG23, and the Program for Professor of Special Appointment (Eastern Scholar) at the Shanghai Institutions of Higher Learning under Grant GZ2015005.
© 2016 Springer International Publishing AG
About this paper
Cite this paper
Wang, H., Zheng, X., Xiao, B. (2016). Accelerating Large-Scale Human Action Recognition with GPU-Based Spark. In: Chen, E., Gong, Y., Tie, Y. (eds) Advances in Multimedia Information Processing - PCM 2016. PCM 2016. Lecture Notes in Computer Science, vol 9917. Springer, Cham. https://doi.org/10.1007/978-3-319-48896-7_66
DOI: https://doi.org/10.1007/978-3-319-48896-7_66
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-48895-0
Online ISBN: 978-3-319-48896-7
eBook Packages: Computer Science (R0)