Exploratory Experience Shapes the Geometry of Predictive Representations

Abdelrahman Sharafeldin; Advay Balakrishnan; Hannah Choi; Kseniia Shilova

arxiv: 2605.27929 · v1 · pith:2DD4PFBDnew · submitted 2026-05-27 · 🧬 q-bio.NC · cs.LG

Pith reviewed 2026-06-29 09:36 UTC · model grok-4.3

keywords: exploration, exploitation, predictive coding, representational geometry, maze navigation, latent space, mouse behavior, active sensing

The pith

Exploratory behavior produces more spatially organized predictive representations that preserve maze transitions in both agents and mice.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper examines how the balance between exploration and exploitation in behavior shapes the geometry of internal predictive representations. It introduces an artificial agent whose controllable parameter switches between information-gain-driven exploration and reward-driven exploitation while updating a predictive-coding model of future states. Exploratory regimes yield latent representations organized around spatial locations and transition structure, whereas exploitative regimes produce less organized ones. When the identical model is trained on real mouse trajectories through the same maze, animals with broader visitation patterns generate geometries that align with those of exploratory agents, while restricted patterns align with exploitative ones. This establishes a direct link between behavioral regime and the structure of learned predictive models.

Core claim

Exploratory agents develop representations that are more spatially organized and better preserve the structure of maze transitions in latent space. In contrast, exploitative agents learn less organized representations. More exploratory mice show representational geometries that closely match those of exploratory agents.

What carries the argument

Predictive-coding perception model updated from the agent's own trajectories, predicting future maze states and reward probability, with a controllable parameter selecting actions by expected information gain during exploration or by predicted reward during exploitation.
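
The switching rule described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `select_action`, the per-step mixing probability `p_reward`, and the dictionaries `info_gain` and `reward_pred` are hypothetical stand-ins for the model's expected-information-gain estimates and reward predictions. The paper's agent may switch regimes over whole episodes rather than per step.

```python
import random


def select_action(actions, info_gain, reward_pred, p_reward, rng):
    """Pick one action from `actions` (all names here are illustrative).

    With probability `p_reward` the step is reward-driven: the action with
    the highest predicted reward is chosen. Otherwise the step is
    exploratory: an action is sampled in proportion to its expected
    information gain (EIG).
    """
    if rng.random() < p_reward:
        # Reward-driven step: greedy on predicted reward.
        return max(actions, key=lambda a: reward_pred[a])

    # Exploratory step: sample proportionally to non-negative EIG.
    weights = [max(info_gain[a], 0.0) for a in actions]
    if sum(weights) == 0.0:
        # No action is informative; fall back to a uniform choice.
        return rng.choice(actions)
    return rng.choices(actions, weights=weights, k=1)[0]


# Hypothetical values for three locally valid actions at one maze node.
rng = random.Random(0)
actions = ["left", "right", "back"]
eig = {"left": 0.0, "right": 2.0, "back": 0.0}
reward = {"left": 0.9, "right": 0.1, "back": 0.2}

greedy_choice = select_action(actions, eig, reward, 1.0, rng)
explore_choice = select_action(actions, eig, reward, 0.0, rng)
```

Setting `p_reward` to 0 or 1 recovers the two pure regimes; intermediate values mix them, which is one simple way a single parameter could control the balance.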

Load-bearing premise

That training the same predictive-coding model on mouse trajectories produces representations directly comparable to those from the artificial agent, with differences in mouse visitation patterns reflecting the same exploratory versus exploitative regimes as the agent's parameter.

What would settle it

The claim would be undercut if training the model on trajectories from the most exploratory mice produced latent geometries that fail to match the spatial organization and transition preservation seen in exploratory agents.

Figures

Figures reproduced from arXiv: 2605.27929 by Abdelrahman Sharafeldin, Advay Balakrishnan, Hannah Choi, Kseniia Shilova.

**Figure 1.** Predictive-coding agent with exploration-exploitation switching. Top: Overview of the action-perception loop. At each step, the agent evaluates locally valid actions using its current predictive model. In the exploratory regime, actions are sampled according to expected information gain (EIG); in the reward-driven regime, actions are selected using a value map constructed from learned reward predictions an…

**Figure 2.** Behavioral consequences of exploration-exploitation balance. A: Example trajectories of an exploratory agent and a more reward-driven agent. The reward-driven agent visits the water port more often, whereas the exploratory agent samples a broader range of maze branches. B: Evolution of the learned reward map over training. The reward map is constructed from recent experience by averaging predicted reward a…

**Figure 3.** Exploration shapes the geometry of predictive latent representations in agents and mice. A: UMAP visualizations of prior latent states for exploratory and reward-driven agents, and for two example mice. Exploratory agents develop depth-aligned, branched latent spaces with clearer separation between outward (root-to-leaf) and inward (leaf-to-root) transitions. More reward-driven agents show less organized l…

**Figure 4.** Additional task and mouse-trajectory visualizations. A: Binary-tree maze used in the task. Nodes are colored and labeled by index; node 116 is the fixed water-port node. B: Example trajectories from exploratory mouse B5 and more reward-focused mouse C8. Color indicates normalized step within the plotted bouts. C8 repeatedly follows trajectories toward the water-port, whereas B5 samples the maze more broadl…

**Figure 5.** Additional analyses of reward-map learning and behavioral regimes. A: Transformation of the learned reward map into a value map. B: Example learned reward maps from individual agents, shown without averaging across random seeds. C: Relationship between reward-driven switching probability and behavioral regime. Top: Duration of reward-driven episodes as a function of the switching probability p. Bottom: Ex…

**Figure 6.** Single-unit spatial and path-transition tuning. A: Selected units from the recurrent state h_t show spatial tuning across the maze. Examples include units tuned to broad maze regions, smaller subregions, terminal leaves, and specific paths. Similar tuning patterns are observed in a long-trained agent and in a combined-mouse model. B: Path-transition tuning of h_t units along an example trajectory. Columns co…
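
The captions above refer to a binary-tree maze, depth-aligned latent structure, and outward (root-to-leaf) versus inward (leaf-to-root) transitions. As a minimal sketch of those notions, the code below assumes a full binary tree with the node count the paper's Methods give as N = 127, numbered heap-style (root 0, children of node `n` at `2n + 1` and `2n + 2`). The numbering scheme and all function names are assumptions made for illustration; the paper's own node labels, such as the water-port node 116, may follow a different convention.

```python
N_NODES = 127  # full binary tree: 2**7 - 1 nodes, depths 0 through 6


def parent(node):
    """Parent index under the assumed heap numbering; the root has none."""
    return None if node == 0 else (node - 1) // 2


def depth(node):
    """Number of edges between `node` and the root."""
    d = 0
    while node != 0:
        node = (node - 1) // 2
        d += 1
    return d


def tree_distance(a, b):
    """Number of edges on the unique path between nodes a and b."""
    steps = 0
    while a != b:
        # With heap numbering the larger index is never shallower, so moving
        # it to its parent always heads toward the common ancestor.
        if a > b:
            a = (a - 1) // 2
        else:
            b = (b - 1) // 2
        steps += 1
    return steps


def transition_direction(src, dst):
    """Label a single step as outward (toward leaves) or inward (toward root)."""
    if parent(dst) == src:
        return "outward"
    if parent(src) == dst:
        return "inward"
    raise ValueError("src and dst are not adjacent in the tree")


# Leaves are the nodes whose first child index would fall outside the tree.
leaves = [n for n in range(N_NODES) if 2 * n + 1 >= N_NODES]
```

Depth and tree distance computed this way give a ground-truth geometry against which a latent space can be compared.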

Original abstract

Active sensing links behavior and learning through an action-perception loop: actions determine the observations used to update internal predictive models of perception, which subsequently guide the next actions. Predictive-coding frameworks provide a natural way to model this process, since internal representations are continuously updated to predict future observations. Here, we ask how exploratory and exploitative behavioral strategies shape these internal predictive representations. We build an online learning agent in a tree-like maze with a controllable parameter regulating the balance between exploratory and exploitative regimes. The agent updates a predictive-coding-based perception model from experience generated by its own behavior. The model predicts both future maze states and reward probability, allowing the agent to select actions either by expected information gain during exploration or by predicted reward during exploitation. We show that the resulting internal predictive representations depend strongly on the agent's behavioral regime. Exploratory agents develop representations that are more spatially organized and better preserve the structure of maze transitions in latent space. In contrast, exploitative agents learn less organized representations. We then train this predictive model on natural trajectories of water-deprived mice navigating the same maze and compare the resulting representations with those learned from agent trajectories. More exploratory mice show representational geometries that closely match those of exploratory agents, whereas mice with more restricted visitation patterns resemble reward-driven, exploitative agents. Together, these findings suggest that exploration enables predictive models to form generalized internal representations by organizing latent space around both spatial location and transition context in artificial agents and animals.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

Exploratory behavior produces more spatially organized predictive representations than exploitative behavior in both agents and mice, but the mouse comparison rests on an untested assumption about matched trajectory statistics.

The paper's main result is that a predictive-coding agent in a tree maze develops latent representations that are more spatially organized and better preserve transition structure when its behavior is biased toward exploration rather than reward-driven exploitation. The same model trained on mouse trajectories yields geometries that align with the exploratory agents when the mice show broader visitation patterns.

What is new is the side-by-side application of the identical model to both controllable agent trajectories and real mouse paths in the same environment. The agent setup with an explicit balance parameter is clean enough to isolate the regime effect on representation geometry.

The mouse comparison is the weaker part. Exploratory agents generate broader coverage by design, while mice are split post-hoc on a visitation metric. If the geometry metrics are sensitive to the empirical distribution of states and transitions rather than the regime itself, the reported alignment could be an artifact of unequal data statistics. The abstract gives no sign that visit counts or transition matrices were equalized before the geometry comparison, and the stress-test concern on this point stands until the methods show otherwise.
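
The coverage statistics at issue here are straightforward to tabulate. The sketch below is an assumption-laden illustration rather than the paper's analysis: it takes trajectories as lists of integer node indices and reports visit counts, empirical transition counts, the fraction of states visited, and a normalized visitation entropy. The function name `coverage_stats` and the choice of entropy as a breadth measure are hypothetical; the paper's visitation metric is not specified in the abstract.

```python
import math
from collections import Counter


def coverage_stats(trajectories, n_states):
    """Summarize how broadly a set of trajectories covers the state space.

    `trajectories` is a list of node-index sequences; `n_states` (> 1) is
    the total number of maze nodes.
    """
    visits = Counter()
    transitions = Counter()
    for traj in trajectories:
        visits.update(traj)
        # Count consecutive (source, destination) pairs within each trajectory.
        transitions.update(zip(traj, traj[1:]))

    total = sum(visits.values())
    entropy = -sum((c / total) * math.log(c / total) for c in visits.values())
    return {
        "visits": visits,
        "transitions": transitions,
        "fraction_visited": len(visits) / n_states,
        # 1.0 means perfectly uniform visitation; values near 0 mean the
        # trajectories concentrate on very few states.
        "normalized_entropy": entropy / math.log(n_states),
    }


# Toy example on a four-state space.
narrow = coverage_stats([[0, 1, 0, 2]], n_states=4)
uniform = coverage_stats([[0, 1, 2, 3]], n_states=4)
```

Reporting these quantities for each mouse group and each agent condition would show directly whether the compared datasets differ in coverage.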

This is worth a serious referee for readers working on predictive coding, exploration, and representational geometry in both artificial agents and systems neuroscience. The central idea is coherent and the agent experiments look reproducible on their own terms. The animal link needs tighter controls on data coverage before the claim can be taken as settled.

Referee Report

1 major / 2 minor

Summary. The paper claims that a predictive-coding agent in a tree-like maze develops more spatially organized latent representations that better preserve transition structure when its behavior is biased toward exploration (via a controllable balance parameter) rather than exploitation; the same model trained on water-deprived mouse trajectories yields geometries that align with the exploratory-agent regime for mice showing broader visitation and with the exploitative regime for mice showing restricted visitation, suggesting that exploratory experience shapes generalized internal predictive representations in both artificial agents and animals.

Significance. If the central comparison holds after controlling for experienced transition statistics, the result would link behavioral regime directly to the geometry of predictive representations and provide a concrete, testable bridge between controllable artificial agents and biological data.

major comments (1)

[Results on mouse–agent representational comparison] The central mouse–agent alignment claim (abstract and Results on mouse trajectories) rests on the assumption that post-hoc partitioning of mice by visitation metric isolates the same exploratory/exploitative regime as the agent’s controllable balance parameter. The abstract gives no indication that training data volume, visit counts, or empirical transition matrices were equalized across conditions before geometry comparison; if the reported metrics (spatial organization, transition preservation) are sensitive to the distribution of experienced transitions rather than the regime per se, the alignment could be an artifact of unequal coverage.

minor comments (2)

[Methods] Clarify the exact definition and units of the “balance parameter” and how it is held fixed versus varied across agent runs.
[Results] Specify the precise geometry metrics (e.g., which distance or correlation measure quantifies “spatial organization” and “transition preservation”) and report effect sizes with confidence intervals.
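
To make the second minor comment concrete, one candidate metric is the correlation between pairwise distances in latent space and pairwise distances in the maze graph. The sketch below is only an example of what such a definition could look like. It is not the metric used in the paper, and the names `pearson` and `distance_alignment` are hypothetical.

```python
import math
from itertools import combinations


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / math.sqrt(vx * vy)


def distance_alignment(latents, graph_dist):
    """Correlate latent-space distances with graph distances.

    `latents` maps each node to its (mean) latent vector; `graph_dist` is a
    function returning the maze distance between two nodes.
    """
    latent_d, graph_d = [], []
    for a, b in combinations(sorted(latents), 2):
        latent_d.append(math.dist(latents[a], latents[b]))
        graph_d.append(graph_dist(a, b))
    return pearson(latent_d, graph_d)


# Toy check on a three-node line graph with one-dimensional latents.
def line_dist(a, b):
    return abs(a - b)


ordered = distance_alignment({0: (0.0,), 1: (1.0,), 2: (2.0,)}, line_dist)
scrambled = distance_alignment({0: (0.0,), 1: (2.0,), 2: (1.0,)}, line_dist)
```

A score near 1 indicates that latent geometry mirrors maze geometry. Whatever metric the authors use, stating it at this level of precision would let readers attach effect sizes and confidence intervals to it.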

Simulated Author's Rebuttal

1 response · 0 unresolved

We thank the referee for their constructive comments, which highlight an important methodological consideration for the mouse–agent comparison. We address the concern point-by-point below and will incorporate revisions to strengthen the manuscript.

Referee: [Results on mouse–agent representational comparison] The central mouse–agent alignment claim (abstract and Results on mouse trajectories) rests on the assumption that post-hoc partitioning of mice by visitation metric isolates the same exploratory/exploitative regime as the agent’s controllable balance parameter. The abstract gives no indication that training data volume, visit counts, or empirical transition matrices were equalized across conditions before geometry comparison; if the reported metrics (spatial organization, transition preservation) are sensitive to the distribution of experienced transitions rather than the regime per se, the alignment could be an artifact of unequal coverage.

Authors: We agree that equalizing experienced transition statistics is essential to isolate the effect of behavioral regime from coverage differences. The visitation metric used for partitioning directly operationalizes the regime (broader vs. restricted exploration), and the agent’s controllable parameter produces analogous differences in coverage. However, the referee correctly notes that the abstract does not explicitly state controls for data volume or transition matrices. In the revision we will: (1) report visit counts, trajectory numbers, and transition-matrix statistics for each mouse group in the abstract and Results; (2) add a supplementary analysis that subsamples mouse trajectories to match the empirical transition distributions and total visit counts of the agent conditions as closely as possible, then recompute the geometry metrics (spatial organization and transition preservation); and (3) verify whether the mouse–agent alignment persists under these matched conditions. If the alignment remains, it supports that regime per se shapes the representations; if not, we will qualify the claim accordingly. This control directly addresses the potential artifact raised.

revision: yes
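
The matching control proposed in the response can be illustrated with a deliberately simple scheme: downsample two sets of transitions so that every transition type occurs equally often in both. This is a sketch under strong assumptions, not the authors' planned analysis. The function name `matched_transition_sample` is hypothetical, and sampling individual transitions breaks trajectory continuity, which a recurrent predictive model may depend on; a real control would more likely select whole bouts.

```python
import random
from collections import Counter, defaultdict


def matched_transition_sample(trans_a, trans_b, seed=0):
    """Downsample two transition lists to identical per-type counts.

    Each input is a list of (source, destination) pairs. Transition types
    present in only one dataset are dropped from both outputs.
    """
    rng = random.Random(seed)
    count_a, count_b = Counter(trans_a), Counter(trans_b)
    shared = count_a.keys() & count_b.keys()
    # Target count per transition type: the smaller of the two datasets.
    target = {t: min(count_a[t], count_b[t]) for t in shared}

    def subsample(transitions):
        by_type = defaultdict(list)
        for i, t in enumerate(transitions):
            by_type[t].append(i)
        keep = []
        for t, k in target.items():
            keep.extend(rng.sample(by_type[t], k))
        # Preserve the original temporal order of the kept transitions.
        return [transitions[i] for i in sorted(keep)]

    return subsample(trans_a), subsample(trans_b)


# Toy example: dataset A over-represents (0, 1); dataset B over-represents (1, 0).
mouse_like = [(0, 1)] * 5 + [(1, 0)] * 2 + [(0, 2)]
agent_like = [(0, 1)] * 2 + [(1, 0)] * 4
sub_a, sub_b = matched_transition_sample(mouse_like, agent_like)
```

After matching, any remaining difference in geometry metrics cannot be attributed to unequal transition counts, which is the artifact the referee raised.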

Circularity Check

0 steps flagged

No circularity: representations shaped by regime-specific trajectories; mouse comparison is external validation

The paper constructs an agent whose behavior parameter controls trajectory statistics, then trains the same predictive model on those trajectories and measures geometry metrics on the resulting latents. This is an empirical demonstration that different input distributions produce different geometries, not a self-definitional loop or a fitted parameter renamed as prediction. The mouse analysis partitions real trajectories post-hoc by a visitation metric and applies the identical model; no equations or self-citations are shown that would make the geometry metrics reduce to the regime parameter by construction. The derivation chain is self-contained against the external mouse data and does not rely on load-bearing self-citation.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The claim rests on the predictive-coding framework as a model of the action-perception loop and on the assumption that the same model architecture can be trained on both synthetic and biological trajectories.

free parameters (1)

balance parameter between exploration and exploitation
Controllable parameter regulating the balance between exploratory and exploitative regimes in the agent.

axioms (1)

domain assumption: Predictive-coding frameworks provide a natural way to model the action-perception loop
Stated directly in the abstract as the modeling basis.

A Methods

A.1 Maze environment

The maze is represented as a full binary tree with N = 127...

[1] [1]

The free-energy principle: a unified brain theory? Nature Reviews Neuroscience, 11(2), 127-138

Karl Friston. The free-energy principle: a unified brain theory?Nature Reviews Neuroscience, 11(2): 127–138, February 2010. ISSN 1471-0048. doi: 10.1038/nrn2787. URL https://doi.org/10.1038/ nrn2787

work page doi:10.1038/nrn2787 2010

[2] [2]

Little and Friedrich T

Daniel Y . Little and Friedrich T. Sommer. Learning and exploration in action-perception loops.Frontiers in Neural Circuits, V olume 7 - 2013, 2013. ISSN 1662-5110. doi: 10.3389/fncir.2013.00037. URLhttps:// www.frontiersin.org/journals/neural-circuits/articles/10.3389/fncir.2013.00037

work page doi:10.3389/fncir.2013.00037 2013

[3] [3]

Theoretical perspectives on active sensing.Current Opinion in Behavioral Sciences, 11:100–108, 2016

Scott Cheng-Hsin Yang, Daniel M Wolpert, and Máté Lengyel. Theoretical perspectives on active sensing.Current Opinion in Behavioral Sciences, 11:100–108, 2016. ISSN 2352-1546. doi: https:// doi.org/10.1016/j.cobeha.2016.06.009. URL https://www.sciencedirect.com/science/article/ pii/S2352154616301255. Computational modeling

work page doi:10.1016/j.cobeha.2016.06.009 2016

[4] [4]

The active inference approach to ecological perception: General information dynamics for natural and artificial embodied cognition

Adam Linson, Andy Clark, Subramanian Ramamoorthy, and Karl Friston. The active inference approach to ecological perception: General information dynamics for natural and artificial embodied cognition. Frontiers in Robotics and AI, V olume 5 - 2018, 2018. ISSN 2296-9144. doi: 10.3389/frobt.2018.00021. URL https://www.frontiersin.org/journals/robotics-and-ai...

work page doi:10.3389/frobt.2018.00021 2018

[5] [5]

Cognitive maps in rats and men.Psychological Review, 55(4):189–208, 1948

Edward C Tolman. Cognitive maps in rats and men.Psychological Review, 55(4):189–208, 1948. doi: 10.1037/h0061626. URLhttps://psycnet.apa.org/record/1949-00103-001

work page doi:10.1037/h0061626 1948

[6] [6]

Clarendon Press, Oxford, UK,

John O’Keefe and Lynn Nadel.The Hippocampus as a Cognitive Map. Clarendon Press, Oxford, UK,

[7] [7]

Rikhye, Nishad Gothoskar, J

Dileep George, Rajeev V . Rikhye, Nishad Gothoskar, J. Swaroop Guntupalli, Antoine Dedieu, and Miguel Lázaro-Gredilla. Clone-structured graph representations enable flexible learning and vicarious evaluation of cognitive maps.Nature Communications, 12(1):2392, 2021. doi: 10.1038/s41467-021-22559-5. URL https://www.nature.com/articles/s41467-021-22559-5

work page doi:10.1038/s41467-021-22559-5 2021

[8] [8]

Swaroop Guntupalli, Guangyao Zhou, Carter Wendelken, Miguel Lázaro- Gredilla, and Dileep George

Rajkumar Vasudeva Raju, J. Swaroop Guntupalli, Guangyao Zhou, Carter Wendelken, Miguel Lázaro- Gredilla, and Dileep George. Space is a latent sequence: A theory of the hippocampus.Science Advances, 10(31):eadm8470, 2024. doi: 10.1126/sciadv.adm8470. URL https://www.science.org/doi/abs/ 10.1126/sciadv.adm8470

work page doi:10.1126/sciadv.adm8470 2024

[9]

Daniel Levenstein, Aleksei Efremov, Roy Henha Eyono, Adrien Peyrache, and Blake Richards. Sequential predictive learning is a unifying theory for hippocampal representation and replay. bioRxiv, 2024. doi: 10.1101/2024.04.28.591528. URL https://www.biorxiv.org/content/early/2024/04/29/2024.04.28.591528

[10]

Matthew Rosenberg, Tony Zhang, Pietro Perona, and Markus Meister. Mice in a labyrinth show rapid learning, sudden insight, and efficient exploration. eLife, 10:e66175, July 2021. ISSN 2050-084X. doi: 10.7554/eLife.66175. URL https://doi.org/10.7554/eLife.66175

[11]

Rajesh P. N. Rao and Dana H. Ballard. Predictive coding in the visual cortex: a functional interpretation of some extra-classical receptive-field effects. Nature Neuroscience, 2(1):79–87, January 1999. doi: 10.1038/4580

[12]

Karl Friston. A theory of cortical responses. Philosophical Transactions of the Royal Society B: Biological Sciences, 360(1456):815–836, April 2005. doi: 10.1098/rstb.2005.1622.

[13]

Abdelrahman Sharafeldin, Nabil Imam, and Hannah Choi. Active sensing with predictive coding and uncertainty minimization. Patterns, 5(6):100983, June 2024. ISSN 2666-3899. doi: 10.1016/j.patter.2024.100983. URL https://doi.org/10.1016/j.patter.2024.100983

[14]

Nicholas J Butko and Javier R Movellan. Infomax control of eye movements. IEEE Transactions on Autonomous Mental Development, 2(2):91–107, 2010

[15]

Rajesh P. N. Rao, Dimitrios C. Gklezakos, and Vishwas Sathish. Active predictive coding: A unifying neural model for active perception, compositional learning, and hierarchical planning. Neural Computation, 36(1):1–32, December 2023. ISSN 0899-7667. doi: 10.1162/neco_a_01627. URL https://doi.org/10.1162/neco_a_01627

[16]

Karl Friston, Rick Adams, Laurent Perrinet, and Michael Breakspear. Perceptions as hypotheses: Saccades as experiments. Frontiers in Psychology, Volume 3, 2012. ISSN 1664-1078. doi: 10.3389/fpsyg.2012.00151. URL https://www.frontiersin.org/journals/psychology/articles/10.3389/fpsyg.2012.00151

[17]

Alexander Tschantz, Beren Millidge, Anil K. Seth, and Christopher L. Buckley. Reinforcement learning through active inference. 2020. URL https://arxiv.org/abs/2002.12636

[18]

Zoe C. Ashwood, Nicholas A. Roy, Iris R. Stone, Anne E. Urai, Anne K. Churchland, Alexandre Pouget, Jonathan W. Pillow, and The International Brain Laboratory. Mice alternate between discrete strategies during perceptual decision-making. Nature Neuroscience, 25(2):201–212, February 2022. ISSN 1546-1726. doi: 10.1038/s41593-021-01007-z. URL https://doi.org/1...

[19]

Hao Zhu, Brice De La Crompe, Gabriel Kalweit, Artur Schneider, Maria Kalweit, Ilka Diester, and Joschka Boedecker. Multi-intention inverse Q-learning for interpretable behavior representation. Transactions on Machine Learning Research, 2024. URL https://openreview.net/forum?id=hrKHkmLUFk

[20]

Jingyang Ke, Feiyang Wu, Jiyi Wang, Jeffrey E. Markowitz, and Anqi Wu. Inverse reinforcement learning with switching rewards and history dependency for characterizing animal behaviors. In Proceedings of the 42nd International Conference on Machine Learning, 2025. URL https://openreview.net/forum?id=yUxVZBYaQA. ICML 2025 poster

[21]

Peter Dayan. Improving generalization for temporal difference learning: The successor representation. Neural Computation, 5(4):613–624, 1993. doi: 10.1162/neco.1993.5.4.613

[22]

Kimberly L. Stachenfeld, Matthew M. Botvinick, and Samuel J. Gershman. The hippocampus as a predictive map. Nature Neuroscience, 20(11):1643–1653, November 2017. ISSN 1546-1726. doi: 10.1038/nn.4650. URL https://doi.org/10.1038/nn.4650

[23]

Stefano Recanatesi, Matthew Farrell, Guillaume Lajoie, Sophie Deneve, Mattia Rigotti, and Eric Shea-Brown. Predictive learning as a network mechanism for extracting low-dimensional latent space representations. Nature Communications, 12(1):1417, March 2021. ISSN 2041-1723. doi: 10.1038/s41467-021-21696-1. URL https://doi.org/10.1038/s41467-021-21696-1

[24]

Andrea Banino, Caswell Barry, Benigno Uria, Charles Blundell, Timothy Lillicrap, Piotr Mirowski, Alexander Pritzel, Martin J. Chadwick, Thomas Degris, Joseph Modayil, Greg Wayne, Hubert Soyer, Fabio Viola, Brian Zhang, Ross Goroshin, Neil Rabinowitz, Razvan Pascanu, Charlie Beattie, Stig Petersen, Amir Sadik, Stephen Gaffney, Helen King, Koray Kavukcuoglu...

doi: 10.1038/s41586-018-0102-6 (2018)

[25]

James C.R. Whittington, Timothy H. Muller, Shirley Mark, Guifen Chen, Caswell Barry, Neil Burgess, and Timothy E.J. Behrens. The Tolman-Eichenbaum Machine: Unifying space and relational memory through generalization in the hippocampal formation. Cell, 183(5):1249–1263.e23, 2020. ISSN 0092-8674. doi: 10.1016/j.cell.2020.10.024. URL https://w...

[26]

Zhaoze Wang, Ronald W. Di Tullio, Spencer Rooke, and Vijay Balasubramanian. Time makes space: Emergence of place fields in networks encoding temporally continuous sensory experiences. bioRxiv, page 2024.08.11.607484, July 2025. doi: 10.1101/2024.08.11.607484. Preprint

[27]

Ian Cone and Claudia Clopath. Latent representations in hippocampal network model co-evolve with behavioral exploration of task structure. Nature Communications, 15(1):687, January 2024. ISSN 2041-1723. doi: 10.1038/s41467-024-44871-6. URL https://doi.org/10.1038/s41467-024-44871-6.

[29]

Yusi Chen, Huanqiu Zhang, Mia Cameron, and Terrence Sejnowski. Predictive sequence learning in the hippocampal formation. Neuron, 112(15):2645–2658.e4, 2024. ISSN 0896-6273. doi: 10.1016/j.neuron.2024.05.024. URL https://www.sciencedirect.com/science/article/pii/S0896627324003714

[30]

David Dupret, Joseph O’Neill, Barty Pleydell-Bouverie, and Jozsef Csicsvari. The reorganization and reactivation of hippocampal maps predict spatial memory performance. Nature Neuroscience, 13(8):995–1002, August 2010. doi: 10.1038/nn.2599

[31]

Huanqiu Zhang, P. Dylan Rich, Albert K. Lee, and Tatyana O. Sharpee. Hippocampal spatial representations exhibit a hyperbolic geometry that expands with experience. Nature Neuroscience, 26(1):131–139, January 2023. ISSN 1546-1726. doi: 10.1038/s41593-022-01212-4. URL https://doi.org/10.1038/s41593-022-01212-4

[32]

Wei Guo, Jie J. Zhang, Jonathan P. Newman, and Matthew A. Wilson. Latent learning drives sleep-dependent plasticity in distinct CA1 subpopulations. Cell Reports, 43(12):115028, December 2024. doi: 10.1016/j.celrep.2024.115028

[33]

Edward H. Nieh, Manuel Schottdorf, Nicolas W. Freeman, Ryan J. Low, Sam Lewallen, Sue Ann Koay, Lucas Pinto, Jeffrey L. Gauthier, Carlos D. Brody, and David W. Tank. Geometry of abstract learned knowledge in the hippocampus. Nature, 595(7865):80–84, July 2021. ISSN 1476-4687. doi: 10.1038/s41586-021-03652-7. URL https://doi.org/10.1038/s41586-021-03652-7

[34]

Shinya Nakai, Takuma Kitanishi, and Kenji Mizuseki. Distinct manifold encoding of navigational information in the subiculum and hippocampus. Science Advances, 10(5):eadi4471, 2024. doi: 10.1126/sciadv.adi4471. URL https://www.science.org/doi/abs/10.1126/sciadv.adi4471.

A Methods

A.1 Maze environment

The maze is represented as a full binary tree with N = 127...
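The maze description above is cut off after N = 127. As a minimal sketch, assuming N counts the nodes of the tree (a full binary tree of depth 6 has 2^7 - 1 = 127 nodes), such a structure could be built with heap-style indexing. The function name `build_full_binary_tree`, the integer node labels, and the adjacency-dictionary layout are illustrative assumptions, not the authors' implementation.

```python
def build_full_binary_tree(depth):
    """Return an undirected adjacency dict for a full binary tree.

    Nodes are labeled 0..n-1 in heap order (an assumed convention):
    node 0 is the root and node k has parent (k - 1) // 2.
    """
    n_nodes = 2 ** (depth + 1) - 1
    adjacency = {node: [] for node in range(n_nodes)}
    for node in range(1, n_nodes):
        parent = (node - 1) // 2
        adjacency[parent].append(node)  # edge parent -> child
        adjacency[node].append(parent)  # edge child -> parent
    return adjacency


# Depth 6 gives 127 nodes, matching N = 127 under the stated assumption.
maze = build_full_binary_tree(depth=6)

# End nodes (leaves) are the nodes with exactly one neighbor.
leaves = [node for node, neighbors in maze.items() if len(neighbors) == 1]
```

Under this labeling the tree has 64 leaves and 63 branching nodes, and each transition an agent can take corresponds to one entry in `maze[node]`.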