Existing temporal action detection algorithms cannot distinguish complete and incomplete actions while this property is essential in many applications. To tackle this challenge, we proposed the action progression networks (APN), a novel model that predicts action progression of video frames with continuous numbers. Using the progression sequence of test video, on the top of the APN, a complete action searching algorithm (CAS) was designed to detect complete actions only. With the usage of frame-level fine-grained temporal structure modeling and detecting actions according to their whole temporal context, our framework can locate actions precisely and is good at avoiding incomplete action detection. We evaluated our framework on a new dataset (DFMAD-70) collected by ourselves which contains both complete and incomplete actions. Our framework got good temporal localization results with 95.77% average precision when the IoU threshold is 0.5. On the benchmark THUMOS14, an incomplete-ignostic dataset, our framework still obtain competitive performance. Copyright © 2021 IEEE.
|Title of host publication||Proceedings of ICPR 2020: 25th International Conference on Pattern Recognition|
|Place of Publication||USA|
|Publication status||Published - 2021|