Abstract: Video understanding has become crucial in computer vision research due to the vast amounts of video data generated from various sources. Efficient video analysis, especially the extraction ...