Clock-state olfactory search in turbulent flows using Q-learning: The geometry of plume recovery
abstract
Finding an odor source in a turbulent flow requires effectively leveraging the history of olfactory observations into a robust navigation strategy. In this work, we use tabular Q-learning to train an olfactory search agent with a minimal memory of past observations: only a running clock since the last whiff. This agent learns an interpretable strategy to recover the plume which combines well-known behaviors observed in insects: surging, casting, and a return downwind. While achieving good performance on data from direct numerical simulations of turbulence, the agent is limited by an inability to adapt its strategy to the local intermittency level; we show that providing more flexibility improves robustness.
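As a rough illustration of the setup the abstract describes, the sketch below shows tabular Q-learning where the only state is a clock counting steps since the last whiff. It is a minimal sketch under stated assumptions, not the authors' implementation: the clock cap `N_CLOCK`, the four-action set, the learning rate, the discount factor, and all function names are hypothetical choices made for illustration.

```python
import random

N_CLOCK = 20   # hypothetical cap: clock values saturate at N_CLOCK - 1
N_ACTIONS = 4  # hypothetical action set, e.g. upwind, downwind, crosswind left/right


def make_q_table():
    """Q-table with one row per clock value and one column per action."""
    return [[0.0] * N_ACTIONS for _ in range(N_CLOCK)]


def next_clock(clock, whiff):
    """Reset the clock on a whiff; otherwise increment it, saturating at the cap."""
    return 0 if whiff else min(clock + 1, N_CLOCK - 1)


def q_update(Q, clock, action, reward, new_clock, alpha=0.1, gamma=0.99):
    """Apply one standard tabular Q-learning update in place."""
    td_target = reward + gamma * max(Q[new_clock])
    Q[clock][action] += alpha * (td_target - Q[clock][action])


def epsilon_greedy(Q, clock, epsilon, rng=random):
    """Pick a random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(N_ACTIONS)
    row = Q[clock]
    return row.index(max(row))
```

Because the state is a single integer, the learned policy can be read directly off the table as one preferred action per elapsed time since the last detection, which is what makes a strategy of this kind interpretable.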
fields
physics.flu-dyn (1)
years
2026 (1)
verdicts
UNVERDICTED (1)
representative citing papers
Smart strategies to navigate turbulent odor plumes reorienting to local wind
Reinforcement learning policies using elapsed time since odor detection and exponentially filtered local wind direction outperform cast-and-surge in simulated turbulent plumes with mild mean wind and show optimal performance at intermediate memory times in isotropic turbulence.