Real-time reinforcement learning for turbulent state-dependent control in a bluff-body wake
Pith reviewed 2026-05-21 22:38 UTC · model grok-4.3
The pith
A reinforcement learning agent learns real-time state-dependent control of a turbulent bluff-body wake from sparse onboard sensors alone and reduces drag with net energy savings.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The REACT agent autonomously converges to a policy that reduces aerodynamic drag with net energy savings by dynamically suppressing spatiotemporally coherent flow structures in the bluff-body wake, performing two to four times better than model-based baseline controllers. It also learns a single offline policy that remains effective across Reynolds numbers from 86,400 to 518,400 by training in a nondimensional space and conditioning on Reynolds number for temporal adaptation.
What carries the argument
The REACT reinforcement learning agent trained directly from sparse onboard sensor measurements in a nondimensional state-reward space with Reynolds-number conditioning.
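As a rough illustration of what a nondimensional state with Reynolds-number conditioning could look like, the sketch below converts gauge pressures to pressure coefficients, measures time in convective units, and appends log10(Re) as a conditioning feature. The class and function names, the reference length, and the fluid properties are assumptions made for this example; the paper's actual state vector and normalization are not given here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowCondition:
    """Free-stream state; default fluid properties are assumed values for air."""
    u_inf: float          # free-stream speed [m/s]
    length: float         # reference length, e.g. body height [m]
    rho: float = 1.2      # density [kg/m^3] (assumed)
    nu: float = 1.5e-5    # kinematic viscosity [m^2/s] (assumed)

    @property
    def reynolds(self) -> float:
        return self.u_inf * self.length / self.nu

    @property
    def dynamic_pressure(self) -> float:
        return 0.5 * self.rho * self.u_inf ** 2


def nondimensional_observation(gauge_pressures_pa, cond):
    """Hypothetical observation: pressure coefficients followed by log10(Re)."""
    cp = [p / cond.dynamic_pressure for p in gauge_pressures_pa]
    return cp + [math.log10(cond.reynolds)]


def nondimensional_time(t_seconds, cond):
    """Time measured in convective units L / U."""
    return t_seconds * cond.u_inf / cond.length


# Illustrative numbers only: doubling the speed quadruples the raw pressures,
# but the pressure coefficients seen by the agent stay the same.
slow = FlowCondition(u_inf=10.0, length=0.2)
fast = FlowCondition(u_inf=20.0, length=0.2)
obs_slow = nondimensional_observation([-12.0, -6.0], slow)
obs_fast = nondimensional_observation([-48.0, -24.0], fast)
```

In this toy case the two observations differ only in their last entry, the Reynolds-number feature, which is the sense in which amplitudes can be approximately Reynolds-number-invariant while the policy still knows which regime it is in.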
If this is right
- The policy suppresses spatiotemporally coherent instabilities rather than adjusting only the mean flow.
- Net energy savings accompany the drag reduction because the control avoids unnecessary actuation.
- A single policy generalizes across a factor-of-six range in Reynolds number without retraining.
- State-dependent, dynamics-aware control outperforms representative quasi-steady baselines in this turbulent regime.
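The net-energy point in the list above comes down to a simple balance: the power saved by lowering drag has to exceed the power spent on actuation. The sketch below works through that balance with the standard drag-power relation P = 0.5 ρ U³ A C_d. All numbers and function names are placeholders chosen for illustration and are not taken from the paper, whose exact energy accounting is not given here.

```python
def drag_power(cd, rho, u_inf, frontal_area):
    """Power needed to overcome drag: P = 0.5 * rho * U^3 * A * Cd [W]."""
    return 0.5 * rho * u_inf ** 3 * frontal_area * cd


def net_power_saving(cd_baseline, cd_controlled, actuation_power,
                     rho, u_inf, frontal_area):
    """Drag power saved by control minus the power spent on actuation [W]."""
    saved = (drag_power(cd_baseline, rho, u_inf, frontal_area)
             - drag_power(cd_controlled, rho, u_inf, frontal_area))
    return saved - actuation_power


# Placeholder values (not from the paper): a 10 % drag reduction at 20 m/s.
flow = dict(rho=1.2, u_inf=20.0, frontal_area=0.05)
cheap = net_power_saving(0.30, 0.27, actuation_power=2.0, **flow)
costly = net_power_saving(0.30, 0.27, actuation_power=10.0, **flow)
print(f"net saving with cheap actuation:  {cheap:+.1f} W")
print(f"net saving with costly actuation: {costly:+.1f} W")
```

The same drag reduction yields a positive balance with light actuation and a negative one with heavy actuation, which is why a policy that avoids unnecessary actuation matters for the net figure.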
Where Pith is reading between the lines
- The same sensor-only learning approach could be tested on other separated flows such as airfoils or vehicles at scale.
- If the nondimensional formulation holds at even higher Reynolds numbers, model-free control might extend to industrial turbulent systems.
- Similar agents could be examined for multi-objective goals such as simultaneous drag and noise reduction.
Load-bearing premise
Sparse onboard sensor measurements alone contain sufficient information for the reinforcement learning agent to discover and stably execute a high-performance state-dependent control policy in a real high-Reynolds-number turbulent environment without any turbulence model or prior flow physics knowledge.
What would settle it
Deploy the learned policy at a Reynolds number well above 518400 or with substantially fewer sensors and measure whether drag reduction and net energy savings collapse.
Original abstract
Controlling turbulent dynamics remains a major challenge because of its chaotic, multi-scale dynamics, which strongly influence the performance of many fluid systems. Here we report REACT (Reinforcement Learning for Environmental Adaptation and Control of Turbulence), an autonomous reinforcement learning framework for real-time state-dependent control of turbulent wake dynamics in a real wind-tunnel environment. Deployed on an Ahmed-body model equipped solely with onboard sensors and servo-actuated surfaces, REACT learns directly from sparse experimental measurements in a wind-tunnel environment, bypassing empirical turbulence models. The agent autonomously converges to a policy that reduces aerodynamic drag while achieving net energy savings. Without prior knowledge of flow physics, it discovers that dynamically suppressing spatiotemporally coherent flow structures in the bluff-body wake maximizes energy efficiency, achieving two to four times greater performance than model-based baseline controllers. We contrast the state-dependent, dynamics-aware policy of REACT with representative quasi-steady, mean-flow-oriented policies learned by standard reinforcement learning baselines, which deliver lower drag reduction and no direct suppression of coherent instabilities in this turbulent-wake regime. Finally, by training in a nondimensional state-reward space whose amplitudes are approximately Reynolds-number-invariant, and by conditioning on Reynolds number for temporal adaptation, REACT learns a single offline policy that remains effective across the tested Reynolds-number range 86,400 to 518,400, without retraining. These results demonstrate autonomous closed-loop reinforcement learning control in a high-Reynolds-number wind-tunnel environment and suggest a path toward data-driven state-dependent control of turbulent flows.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents REACT, an autonomous reinforcement learning framework for real-time state-dependent control of turbulent wake dynamics behind a bluff body in a wind-tunnel experiment. Using only onboard sensors and actuators on an Ahmed-body model, the agent learns a policy that reduces drag and achieves net energy savings by suppressing spatiotemporally coherent flow structures, outperforming model-based baselines by a factor of two to four. The approach uses a nondimensional state-reward space to generalize across Reynolds numbers from 86,400 to 518,400 without retraining.
Significance. If the central claims hold under additional verification, this would represent a notable experimental demonstration of model-free RL for high-Re turbulent flow control without turbulence models or prior physics knowledge. The cross-Re generalization via nondimensional scaling and the explicit contrast with quasi-steady baselines are strengths that could inform future data-driven aerodynamics work.
major comments (3)
- Abstract and results on performance: the claim of 'two to four times greater performance' and 'direct suppression of coherent instabilities' is not supported by reported error bars, number of independent runs, or statistical tests, which is load-bearing for assessing robustness over model-based baselines.
- Methods section on sensor configuration: no observability metric, sensor placement diagram, or wake-velocity reconstruction error from the sparse onboard pressure/force measurements is provided, leaving open whether the MDP is sufficiently rich to discover and stabilize suppression of spatiotemporally coherent structures rather than quasi-steady mean-flow adjustment.
- RL framework and reward section: the reward weights and scaling factors are listed as free parameters without full specification or sensitivity analysis, which directly affects reproducibility of the reported convergence to a structure-suppressing policy.
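The observability question in the second major comment can be illustrated on a toy problem. The sketch below builds a finite-horizon observability Gramian for a two-state linear oscillator watched by a single sensor. It is a stand-in under strong assumptions (linear dynamics, two states), not the paper's wake or sensor layout, and every function name is hypothetical.

```python
import math


def rotation_dynamics(theta, decay=0.98):
    """2x2 state matrix: rotate by theta per step and shrink by `decay`."""
    c, s = math.cos(theta), math.sin(theta)
    return [[decay * c, -decay * s],
            [decay * s, decay * c]]


def observability_gramian(a, c, horizon):
    """Finite-horizon Gramian W = sum_k (C A^k)^T (C A^k) for a 2-state system."""
    row = list(c)  # row vector C A^k, starting at k = 0
    w = [[0.0, 0.0], [0.0, 0.0]]
    for _ in range(horizon):
        for i in range(2):
            for j in range(2):
                w[i][j] += row[i] * row[j]
        # Advance one step: row <- row @ A
        row = [row[0] * a[0][0] + row[1] * a[1][0],
               row[0] * a[0][1] + row[1] * a[1][1]]
    return w


def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


sensor = [1.0, 0.0]  # the single sensor reads only the first state
w_oscillating = observability_gramian(rotation_dynamics(2 * math.pi / 20), sensor, 50)
w_static = observability_gramian(rotation_dynamics(0.0), sensor, 50)
print("det W, oscillating mode:", det2(w_oscillating))
print("det W, non-rotating mode:", det2(w_static))
```

With rotation, the dynamics carry the hidden state into the sensor's view and the Gramian determinant is positive; without rotation the second state never reaches the sensor and the determinant is zero. An observability metric for the sparse onboard sensors would need to draw the same kind of distinction for the dominant wake modes.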
minor comments (2)
- Clarify the exact nondimensionalization procedure for the state-reward space and how Reynolds-number conditioning is implemented in the policy network.
- Figure captions for flow visualizations should explicitly label the coherent structures being suppressed and include quantitative measures of suppression (e.g., modal energy reduction).
Simulated Author's Rebuttal
We thank the referee for their thoughtful and constructive review. The comments highlight important aspects of statistical robustness, observability, and reproducibility that we address point by point below. We have prepared revisions to strengthen these elements while preserving the core contributions of the work.
Point-by-point responses
- Referee: Abstract and results on performance: the claim of 'two to four times greater performance' and 'direct suppression of coherent instabilities' is not supported by reported error bars, number of independent runs, or statistical tests, which is load-bearing for assessing robustness over model-based baselines.
  Authors: We agree that explicit statistical support is necessary to substantiate the performance claims. In the revised manuscript we will report results aggregated over five independent experimental runs per controller, include standard-error bars on all drag-reduction and energy-savings metrics, and add two-sample t-tests confirming that the observed 2–4× improvement relative to the model-based baselines is statistically significant (p < 0.01). We will also include spectral analysis of wake-velocity time series demonstrating statistically significant attenuation of the dominant coherent-structure frequencies under the REACT policy. These additions directly address the robustness concern while leaving the reported performance ratios unchanged.
  Revision: yes
- Referee: Methods section on sensor configuration: no observability metric, sensor placement diagram, or wake-velocity reconstruction error from the sparse onboard pressure/force measurements is provided, leaving open whether the MDP is sufficiently rich to discover and stabilize suppression of spatiotemporally coherent structures rather than quasi-steady mean-flow adjustment.
  Authors: We acknowledge the absence of these details. The revised Methods section will include (i) a labeled diagram of the pressure-tap and force-sensor locations on the Ahmed-body model, (ii) an observability Gramian analysis of the chosen state vector, and (iii) quantitative reconstruction error metrics (RMS and spectral) obtained by comparing sparse-sensor estimates against simultaneous PIV measurements in a subset of runs. These additions will demonstrate that the state space captures the essential dynamics of the dominant wake instabilities, supporting the claim that the learned policy targets coherent-structure suppression rather than purely mean-flow adjustment.
  Revision: yes
- Referee: RL framework and reward section: the reward weights and scaling factors are listed as free parameters without full specification or sensitivity analysis, which directly affects reproducibility of the reported convergence to a structure-suppressing policy.
  Authors: We will expand the reward-function description to provide the exact numerical values of all weights and scaling factors used in the reported experiments. In addition, we will include a sensitivity study showing that the emergence of the structure-suppressing policy remains consistent across a ±20 % variation in the primary reward coefficients. These revisions will enable full reproducibility without altering the policy or performance results presented in the original manuscript.
  Revision: yes
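The statistical additions promised in the first response (standard-error bars and a two-sample t-test over five runs per controller) can be sketched as follows. The choice of Welch's unequal-variance form of the test is an assumption, since the rebuttal does not say which variant is meant, and the run values are placeholders rather than measured data.

```python
import math
from statistics import mean, variance


def standard_error(samples):
    """Standard error of the mean, using the sample variance (n - 1)."""
    return math.sqrt(variance(samples) / len(samples))


def welch_t(a, b):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom."""
    va = variance(a) / len(a)
    vb = variance(b) / len(b)
    t = (mean(a) - mean(b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df


# Placeholder drag-reduction percentages for five runs each (not real data).
react_runs = [8.0, 9.0, 10.0, 11.0, 12.0]
baseline_runs = [2.0, 3.0, 4.0, 5.0, 6.0]

t_stat, dof = welch_t(react_runs, baseline_runs)
print(f"REACT:    {mean(react_runs):.1f} +/- {standard_error(react_runs):.2f}")
print(f"baseline: {mean(baseline_runs):.1f} +/- {standard_error(baseline_runs):.2f}")
print(f"Welch t = {t_stat:.2f}, df = {dof:.1f}")
```

Turning the statistic into a p-value requires a t-distribution CDF, which the standard library lacks; `scipy.stats.ttest_ind(a, b, equal_var=False)` performs the same test and returns the p-value directly.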
Circularity Check
No significant circularity: experimental RL results rest on physical measurements
Full rationale
The paper reports an empirical demonstration of model-free RL control in a physical wind-tunnel experiment on an Ahmed body. Performance metrics (drag reduction, energy savings, wake structure suppression) are obtained by direct comparison against physical baselines and quasi-steady policies, not by deriving quantities from fitted parameters or self-referential equations. The nondimensional state-reward space and Reynolds-number conditioning are presented as practical design choices for generalization rather than as outputs of a closed mathematical derivation. No load-bearing self-citations, uniqueness theorems, or ansatzes that reduce to the target claim appear in the text. The central results therefore remain self-contained against external experimental benchmarks.
Axiom & Free-Parameter Ledger
free parameters (1)
- reward weights and scaling factors
axioms (1)
- domain assumption: Reinforcement learning algorithms converge to a useful policy when trained on sparse, noisy experimental measurements from a real turbulent flow.
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  "REACT learns directly from sparse experimental measurements... autonomously converges to a policy that reduces aerodynamic drag while achieving net energy savings... dynamically suppressing spatiotemporally coherent flow structures"
- IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
  Relation between the paper passage and the cited Recognition theorem: unclear.
  "physics-informed training that recasts data in terms of dimensionless physical groups... Reynolds-conditioned learning"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.
Acknowledgements. We acknowledge support from the UKRI AI for Net Zero grant EP/Y005619/1. J.Z. is supported by the President’s Scholarship at Imperial College London.
Author contribution. J.Z. developed the learning algo...