CanViT is the first task- and policy-agnostic AVFM pretrained via passive-to-active dense latent distillation on 13.2M scenes and 1B random glimpses, achieving 38.5% ADE20K mIoU in one glimpse and 84.5% ImageNet-1k top-1 after fine-tuning.
Majaj, Rishi Rajalingham, Elias B
10 Pith papers cite this work. Polarity classification is still indexing.
roles: background 1
polarities: background 1
representative citing papers
NEvo performs evolutionary search guided by a dynamic voxel-level encoding model to synthesize videos that maximize predicted activity in target brain ROIs, recovering known selectivities and revealing temporal dynamics differences.
Alignment of vision-language models with human V1-V3 early visual cortex negatively predicts resistance to sycophantic gaslighting attacks.
CHASMBrain uses dual-stream Mamba in a coarse-to-fine hierarchy to predict fMRI from images, reporting 0.429 Pearson correlation and 0.261 MSE on NSD with causal evidence that patch and CLS streams specialize to early versus higher visual cortex.
Hybrid JEMs at intermediate generative-discriminative balance maximize human alignment on perceptual similarity, gloss, uncertainty, robustness, cue conflict, and feature attribution benchmarks.
Feature visualization on TRIBE v2 brain encoders recovers the known ventral visual hierarchy from V1 to V4 and produces distinctive patterns for MT, FFA, and PPA, with optimized stimuli driving ~4x higher activation than natural images.
Decoding alignment metrics can remain high and unchanged even when encoding manifold topology is causally altered, so they do not imply similar function or computation across neural populations.
A zero-shot visual world model trained on one child's experience achieves broad competence on physical understanding benchmarks while matching developmental behavioral patterns.
V1 digital twins with comparable neural prediction accuracy differ in linear probe performance, unit tuning, and hidden-layer eigenspectra.
LITcoder introduces a modular open-source library for constructing, benchmarking, and comparing neural encoding models that map continuous stimuli such as stories to fMRI brain data.
citing papers explorer
- LITcoder: A General-Purpose Library for Building and Comparing Encoding Models