arXiv:2604.20937v1 Announce Type: new Abstract: Video Large Language Models (Video LLMs) incur high inference latency due to the large number of visual tokens fed to the LLM. To address this, training-free visual token pruning has emerged as a way to reduce computational cost; however, existing methods are validated primarily on Multiple-Choice Question Answering (MCQA) benchmarks, where coarse…
Sink-Token-Aware Pruning for Fine-Grained Video Understanding in Efficient Video LLMs
Kibum Kim, Jiwan Kim, Kyle Min, Yueqi Wang, Jinyoung Moon, Julian McAuley, Chanyoung Park · arXiv cs.LG
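The truncated abstract describes training-free visual token pruning only at a high level. As a rough illustration of the general idea, the sketch below keeps the visual tokens that receive the most text-to-visual attention and drops the rest before they reach the LLM. This is a minimal sketch under assumed inputs: the function name, the scoring rule (mean cross-attention from text queries), and the keep ratio are illustrative choices, not the paper's sink-token-aware method.

```python
# Minimal sketch of training-free visual token pruning (illustrative only,
# not the paper's sink-token-aware method).
import torch

def prune_visual_tokens(visual_tokens: torch.Tensor,
                        attn_weights: torch.Tensor,
                        keep_ratio: float = 0.25) -> torch.Tensor:
    """Keep the visual tokens that receive the most attention.

    visual_tokens: (num_visual, dim) visual token embeddings.
    attn_weights:  (num_text, num_visual) attention from text queries to
                   visual keys, e.g. averaged over the heads of one layer.
    keep_ratio:    fraction of visual tokens to retain.
    """
    # Score each visual token by the average attention it receives.
    scores = attn_weights.mean(dim=0)                # (num_visual,)
    k = max(1, int(keep_ratio * visual_tokens.size(0)))
    # Keep the top-k tokens, re-sorted to preserve their original order.
    keep_idx = scores.topk(k).indices.sort().values
    return visual_tokens[keep_idx]

# Toy usage: 1,024 visual tokens reduced to 256 before entering the LLM.
vis = torch.randn(1024, 768)
attn = torch.rand(32, 1024)
pruned = prune_visual_tokens(vis, attn, keep_ratio=0.25)
print(pruned.shape)  # torch.Size([256, 768])
```

In a real pipeline, attn_weights would typically be read out of one of the LLM's attention layers rather than supplied separately. Note that plain attention-score pruning like this can be skewed by attention sinks, tokens that attract large attention mass without carrying content, which is the failure mode the paper's title suggests it addresses.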