Time–frequency recurrent transformer with diversity constraint for dense video captioning

P Li, P Zhang, T Wang, H Xiao - Information Processing & Management, 2023 - Elsevier

Describing a long video using multiple sentences, i.e., dense video captioning, is a very
challenging task. Existing methods neglect the important fact that actions of several
tempos (a.k.a. frequencies) evolve over time in a video, and they do not handle the phrase
repetition issue well. Therefore, we propose a Time-Frequency recurrent Transformer with
Diversity constraint (TFTD) for dense video captioning. Its basic idea is to develop a
time–frequency memory module, which not only stores the history of the past sentences and …
