Fig. 9
From: Tubelets: Unsupervised Action Proposals from Spatiotemporal Super-Voxels

Comparing representations on UCF Sports: bag-of-words, Fisher vector and CNN features. Performance is measured by AUC for overlap thresholds \(\sigma \) from 0.1 to 0.6, following Tian et al. (2013). The best AUC is obtained when Fisher vector and CNN features are combined for the Tubelet representation.
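
As a rough illustration of how an AUC value per threshold \(\sigma \) might be computed, the sketch below assumes a simplified reading of the protocol: each detection has a classifier score and an overlap with the ground truth, a detection counts as positive when its overlap is at least \(\sigma \), and the AUC is the area under the resulting ROC curve. The function names `roc_auc` and `auc_per_threshold` and all input values are hypothetical and are not taken from the paper, whose exact evaluation may differ.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via the pairwise rank statistic.

    Equals the fraction of (positive, negative) pairs in which the
    positive sample receives the higher score; ties count as 0.5.
    Returns None when one of the two classes is empty.
    """
    pos = [s for s, is_pos in zip(scores, labels) if is_pos]
    neg = [s for s, is_pos in zip(scores, labels) if not is_pos]
    if not pos or not neg:
        return None
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def auc_per_threshold(scores, overlaps, sigmas):
    """Compute one ROC AUC per overlap threshold sigma.

    Assumption (not from the paper): a detection is treated as a
    positive sample when its overlap with the ground truth is >= sigma.
    """
    return {
        sigma: roc_auc(scores, [o >= sigma for o in overlaps])
        for sigma in sigmas
    }


# Made-up illustrative values, NOT results from the paper.
scores = [0.9, 0.8, 0.6, 0.4, 0.3]
overlaps = [0.7, 0.5, 0.15, 0.35, 0.05]
sigmas = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6]

result = auc_per_threshold(scores, overlaps, sigmas)
for sigma in sigmas:
    print(sigma, result[sigma])
```

Plotting such per-threshold values against \(\sigma \) gives one curve per representation, which is the kind of comparison the figure reports.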