arXiv preprint arXiv:2408.11796 (2024)

13 Sharath Turuvekere Sreenivas, Saurav Muralidharan, Raviraj Joshi, Marcin Chochowski, Ameya Sunil Mahabaleshwarkar, Gerald Shen, Jiaqi Zeng, Zijia Chen, Yoshi Suhara, Shizhe Diao · 2024 · arXiv 2408.11796

3 Pith papers cite this work. Polarity classification is still indexing.

3 Pith papers citing it

read on arXiv browse 3 citing papers

representative citing papers

Computational Lesions in Multilingual Language Models Separate Shared and Language-specific Brain Alignment

cs.CL · 2026-04-12 · unverdicted · novelty 7.0

Lesioning a shared core in multilingual LLMs drops whole-brain fMRI encoding correlation by 60.32%, while language-specific lesions selectively weaken predictions only for the matched native language.

Structural Pruning of Large Vision Language Models: A Comprehensive Study on Pruning Dynamics, Recovery, and Data Efficiency

cs.CL · 2026-04-27 · conditional · novelty 5.0

Widthwise pruning of LVLM language backbones combined with supervised finetuning and hidden-state distillation recovers over 95% performance using just 5% of data across 3B-7B models.

Ministral 3

cs.CL · 2026-01-13 · unverdicted · novelty 4.0

Ministral 3 releases 3B/8B/14B parameter-efficient language models with base, instruction, and reasoning variants derived via iterative pruning and distillation, including image understanding capabilities.

citing papers explorer

Showing 3 of 3 citing papers.

Computational Lesions in Multilingual Language Models Separate Shared and Language-specific Brain Alignment cs.CL · 2026-04-12 · unverdicted · none · ref 28
Lesioning a shared core in multilingual LLMs drops whole-brain fMRI encoding correlation by 60.32%, while language-specific lesions selectively weaken predictions only for the matched native language.
Structural Pruning of Large Vision Language Models: A Comprehensive Study on Pruning Dynamics, Recovery, and Data Efficiency cs.CL · 2026-04-27 · conditional · none · ref 66
Widthwise pruning of LVLM language backbones combined with supervised finetuning and hidden-state distillation recovers over 95% performance using just 5% of data across 3B-7B models.
Ministral 3 cs.CL · 2026-01-13 · unverdicted · none · ref 24
Ministral 3 releases 3B/8B/14B parameter-efficient language models with base, instruction, and reasoning variants derived via iterative pruning and distillation, including image understanding capabilities.

arXiv preprint arXiv:2408.11796 (2024)

fields

years

verdicts

representative citing papers

citing papers explorer