MoIIE: Mixture of Intra- and Inter-Modality Experts for Large Vision Language Models
arXiv:2508.09779v2

Abstract: Large Vision-Language Models (LVLMs) have demonstrated remarkable performance across multi-modal tasks by scaling model size and training data. However, these dense LVLMs incur significant computational costs and motivate the exploration of sparse…