Abstract
We propose a simple yet effective approach to pedestrian detection that outperforms the current state of the art. Our new features are built from low-level visual features and spatial pooling. Incorporating spatial pooling improves translational invariance and thus the robustness of the detection process. We then directly optimise the partial area under the ROC curve (pAUC) measure, which concentrates detection performance in the range of most practical importance. The combination of these factors leads to a pedestrian detector that outperforms all competitors on all of the standard benchmark datasets. We advance the state of the art by lowering the average miss rate from 13% to 11% on the INRIA benchmark, from 41% to 37% on the ETH benchmark, from 51% to 42% on the TUD-Brussels benchmark, and from 36% to 29% on the Caltech-USA benchmark.
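The two ingredients named in the abstract, spatial pooling and the pAUC measure, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `max_pool` and `partial_auc`, the patch size, the stride, and the false-positive-rate range are all assumptions chosen for illustration, and the sketch only evaluates the pAUC of given scores rather than optimising it during training.

```python
import numpy as np


def max_pool(feature_map, patch=2, stride=2):
    """Max-pool a 2-D feature map over patch x patch windows (hypothetical helper)."""
    h, w = feature_map.shape
    out_h = (h - patch) // stride + 1
    out_w = (w - patch) // stride + 1
    pooled = np.empty((out_h, out_w), dtype=feature_map.dtype)
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            # Keep only the strongest response inside the window, so a small
            # shift of that response within the window does not change the output.
            pooled[i, j] = feature_map[r:r + patch, c:c + patch].max()
    return pooled


def partial_auc(pos_scores, neg_scores, fpr_lo=0.0, fpr_hi=0.1):
    """Empirical pAUC over the FPR range [fpr_lo, fpr_hi], normalised to [0, 1].

    Simplifying assumption: the range is rounded outward to whole negatives,
    so fractional boundary terms are ignored. Ties count as half a win.
    """
    pos = np.asarray(pos_scores, dtype=float)
    # Sort negatives by descending score: the highest-scoring negatives are the
    # ones that become false positives first as the threshold is lowered.
    neg = np.sort(np.asarray(neg_scores, dtype=float))[::-1]
    lo = int(np.floor(fpr_lo * neg.size + 1e-9))
    hi = int(np.ceil(fpr_hi * neg.size - 1e-9))
    hard = neg[lo:hi]
    if pos.size == 0 or hard.size == 0:
        raise ValueError("need at least one positive and one negative in range")
    wins = (pos[:, None] > hard[None, :]).sum()
    ties = (pos[:, None] == hard[None, :]).sum()
    return (wins + 0.5 * ties) / (pos.size * hard.size)


if __name__ == "__main__":
    # A single strong response, shifted by one pixel inside the same window.
    a = np.zeros((4, 4))
    a[0, 0] = 1.0
    b = np.zeros((4, 4))
    b[1, 1] = 1.0
    print(np.array_equal(max_pool(a), max_pool(b)))

    pos = [0.9, 0.8, 0.4]
    neg = [0.95, 0.5, 0.3, 0.2, 0.1, 0.05, 0.04, 0.03, 0.02, 0.01]
    print(partial_auc(pos, neg, 0.0, 0.2))  # low false-positive range only
    print(partial_auc(pos, neg, 0.0, 1.0))  # full AUC for comparison
```

In the toy data, one high-scoring negative barely affects the full AUC but sharply lowers the pAUC over the low false-positive range, which is the part of the ROC curve the abstract describes as most important in practice.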
Copyright information
© 2014 Springer International Publishing Switzerland
About this paper
Cite this paper
Paisitkriangkrai, S., Shen, C., van den Hengel, A. (2014). Strengthening the Effectiveness of Pedestrian Detection with Spatially Pooled Features. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8692. Springer, Cham. https://doi.org/10.1007/978-3-319-10593-2_36
DOI: https://doi.org/10.1007/978-3-319-10593-2_36
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10592-5
Online ISBN: 978-3-319-10593-2
eBook Packages: Computer Science, Computer Science (R0)