Abstract: The Large Vision-Language Model (LVLM) has achieved impressive performance in the field of visual-language understanding. However, its ability to understand longer videos is still limited ...