arxiv: 2604.15463 · v2 · submitted 2026-04-16 · 💱 q-fin.PM · math.OC

Risk-Sensitive Investment Management via Free Energy-Entropy Duality

Sebastien Lleo, Wolfgang Runggaldier

Pith reviewed 2026-05-10 09:00 UTC · model grok-4.3

keywords: risk-sensitive portfolio · benchmarked investment · free energy-entropy duality · linear-quadratic-Gaussian game · entropic regularization · Kelly strategy · stochastic differential game · factor model

The pith

The free energy-entropy duality reformulates benchmarked risk-sensitive portfolio problems as linear-quadratic-Gaussian games under an equivalent measure with entropic regularization, producing quadratic value functions and explicit affine feedback controls.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that the free energy-entropy duality converts a benchmarked risk-sensitive portfolio problem in a factor-based setting into an equivalent linear-quadratic-Gaussian stochastic differential game. This reformulation includes entropic regularization and delivers a quadratic value function together with explicit affine feedback controls for the optimal allocations. A sympathetic reader would care because the approach supplies a direct analytical solution and clarifies the meaning of risk sensitivity through two interpretations of the optimal portfolio: as a fractional Kelly strategy and as an entropically adjusted Kelly portfolio. It also embeds the earlier Kuroda-Nagai change-of-measure technique in a more general framework and lends itself to reinforcement-learning implementations. A simple implementation on U.S. equity data illustrates tractability and numerically confirms that the two approaches agree.

Core claim

The duality yields a direct solution of the benchmarked problem by reformulating it as a linear-quadratic-Gaussian stochastic differential game under a suitable equivalent probability measure, with an entropic regularization. The resulting value function is quadratic, the optimal controls are explicit affine feedback maps, and the optimal allocation admits two complementary interpretations: as a fractional Kelly strategy and as a Kelly portfolio adjusted via the entropic regularization.
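To make the "fractional Kelly" reading concrete, here is a minimal sketch in the simplest textbook setting: constant coefficients, no factors, no benchmark. It assumes the criterion is parameterised as -(1/theta) log E[exp(-theta log V_T)], a convention common in the risk-sensitive literature but not confirmed by this page for the paper itself. Under that assumption the objective rate is r + h'(mu - r) - 0.5 (1 + theta) h' cov h, whose maximiser is the Kelly portfolio scaled by 1/(1 + theta). All function names and numbers below are hypothetical, and the paper's factor-model formula will contain additional terms not shown here.

```python
import numpy as np

def kelly_weights(mu, r, cov):
    """Growth-optimal (Kelly) weights: cov^{-1} (mu - r)."""
    return np.linalg.solve(cov, mu - r)

def fractional_kelly_weights(mu, r, cov, theta):
    """Maximiser of r + h'(mu - r) - 0.5 * (1 + theta) * h' cov h.

    Assumption: constant coefficients and theta > -1; theta -> 0
    recovers the Kelly portfolio.
    """
    if theta <= -1.0:
        raise ValueError("theta must be greater than -1")
    return kelly_weights(mu, r, cov) / (1.0 + theta)

def risk_sensitive_rate(h, mu, r, cov, theta):
    """Per-unit-time value of the risk-sensitive criterion for fixed weights h."""
    return r + h @ (mu - r) - 0.5 * (1.0 + theta) * (h @ cov @ h)

# Hypothetical inputs, for illustration only.
mu = np.array([0.08, 0.05])                       # expected returns
r = 0.02                                          # risk-free rate
cov = np.array([[0.04, 0.006], [0.006, 0.0225]])  # return covariance
theta = 1.0                                       # risk sensitivity

h_kelly = kelly_weights(mu, r, cov)
h_rs = fractional_kelly_weights(mu, r, cov, theta)
```

With theta = 1 the sketch returns exactly half of the Kelly weights, which is the sense in which larger risk sensitivity shrinks the allocation toward the risk-free asset.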

What carries the argument

The free energy-entropy duality, which recasts the risk-sensitive objective as an entropically regularized linear-quadratic-Gaussian game under a changed probability measure.
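In its generic form, the duality states that log E_P[exp(f)] equals the supremum over measures Q of E_Q[f] - H(Q|P), where H is relative entropy, and that the supremum is attained at Q* proportional to P exp(f). The sketch below checks this identity numerically on a small discrete probability space. It illustrates the general variational formula only; the finite state space, function names, and random inputs are assumptions made for the example and are not the paper's continuous-time setting.

```python
import numpy as np

def free_energy(p, f):
    """log E_P[exp(f)] for a discrete reference measure p."""
    return float(np.log(np.sum(p * np.exp(f))))

def relative_entropy(q, p):
    """H(Q|P) = sum_i q_i log(q_i / p_i), with the convention 0 log 0 = 0."""
    mask = q > 0
    return float(np.sum(q[mask] * np.log(q[mask] / p[mask])))

def dual_objective(q, p, f):
    """E_Q[f] - H(Q|P), the quantity maximised over Q in the duality."""
    return float(np.sum(q * f)) - relative_entropy(q, p)

def gibbs_measure(p, f):
    """Maximiser of the dual objective: Q* proportional to p * exp(f)."""
    w = p * np.exp(f)
    return w / w.sum()

rng = np.random.default_rng(0)
p = rng.dirichlet(np.ones(5))   # hypothetical reference measure
f = rng.normal(size=5)          # hypothetical payoff / cost function

q_star = gibbs_measure(p, f)
lhs = free_energy(p, f)
rhs = dual_objective(q_star, p, f)

# Any other measure gives a value no larger than the free energy.
q_other = rng.dirichlet(np.ones(5))
gap = lhs - dual_objective(q_other, p, f)
```

Here `lhs` and `rhs` agree up to floating-point error, and `gap` is nonnegative (it equals the relative entropy of `q_other` with respect to `q_star`). The entropy term is what plays the role of the regularizer once the risk-sensitive objective is rewritten as a game against the choice of measure.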

Load-bearing premise

The free energy-entropy duality applies directly to the benchmarked risk-sensitive portfolio problem in the factor-based setting while preserving the quadratic structure and explicit controls.

What would settle it

A direct solution of the original risk-sensitive problem that produces a non-quadratic value function or an optimal allocation that cannot be expressed as the claimed fractional Kelly strategy would show the duality reformulation does not hold.

Figures

Figures reproduced from arXiv: 2604.15463 by Sebastien Lleo, Wolfgang Runggaldier.

Original abstract

We study a benchmarked risk-sensitive portfolio problem in a factor-based setting to bring together three strands of the literature: benchmarked risk-sensitive investment management, the Kuroda-Nagai change-of-measure method, and the free energy-entropy duality of Dai Pra et al. (1996). We show that the duality yields a direct solution of the benchmarked problem by reformulating it as a linear-quadratic-Gaussian stochastic differential game under a suitable equivalent probability measure, with an entropic regularization. The resulting value function is quadratic, the optimal controls are explicit affine feedback maps, and the optimal allocation admits two complementary interpretations: as a fractional Kelly strategy and as a Kelly portfolio adjusted via the entropic regularization. This formulation, therefore, contributes both a direct analytical route to the solution and a clearer interpretation of risk sensitivity, thereby embedding the classical Kuroda-Nagai change-of-measure approach within a more general framework. An added benefit of this formulation is that it is suitable for implementation via an RL algorithm. A simple implementation on U.S. equity data illustrates the tractability of the framework and numerically confirms the equivalence of the two approaches.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

This paper applies free energy-entropy duality to benchmarked risk-sensitive portfolio problems, yielding explicit quadratic solutions and dual Kelly interpretations in a factor model.

The main point is that the authors use the free energy-entropy duality to recast a benchmarked risk-sensitive investment problem as an LQG stochastic differential game with entropic regularization. This produces a quadratic value function, explicit affine controls, and two views of the optimal allocation as a fractional Kelly strategy or a regularized Kelly portfolio. It also places the Kuroda-Nagai change-of-measure inside a wider setup that supports RL methods. The U.S. equity example shows the two approaches line up numerically and that the framework is workable in practice.

They do a clean job of linking the three literature strands without adding restrictions that would break the quadratic structure or explicit forms. The RL angle is a practical plus that goes beyond pure theory.

The soft spots are limited and mostly about verification. The abstract states the duality applies directly in the factor setting, but the full derivations would need checking to confirm no hidden conditions affect the controls or value function. The numerical illustration is confirmatory rather than a full stress test, so the risk sensitivity parameter and regularization strength could influence outcomes depending on calibration, though this does not touch the core analytical claims. No circularity appears in the reformulation itself.

This work suits specialists in quantitative finance who focus on stochastic control, risk measures, or information-theoretic portfolio methods. A reader already comfortable with LQG games or Kelly criteria would pick up the new interpretations and RL link quickly. The reasoning is coherent and engages the cited literature directly. I would bring it to a reading group for the Kelly views and RL angle. I would not cite it in my own work soon because the scope is narrow. It deserves peer review to examine the derivations and perhaps expand the numerical checks.

Referee Report

2 major / 2 minor

Summary. The paper studies a benchmarked risk-sensitive portfolio problem in a factor-based setting, combining elements from benchmarked risk-sensitive investment management, the Kuroda-Nagai change-of-measure method, and the free energy-entropy duality of Dai Pra et al. (1996). It claims that the duality reformulates the problem as a linear-quadratic-Gaussian stochastic differential game under an equivalent probability measure with entropic regularization. This yields a quadratic value function, explicit affine feedback optimal controls, and dual interpretations of the optimal allocation as a fractional Kelly strategy and as a Kelly portfolio adjusted via entropic regularization. The framework is presented as suitable for RL implementation, with a numerical illustration on U.S. equity data confirming tractability and equivalence of the approaches.

Significance. If the reformulation holds without structural breakdown, the work offers a direct analytical route to solutions of the benchmarked problem and embeds the Kuroda-Nagai approach in a more general duality framework, providing clearer interpretations of risk sensitivity. The explicit affine controls and RL compatibility are practical strengths. The numerical example on U.S. equity data supports implementability, though its value as confirmation depends on parameter independence.

major comments (2)

[Main derivation / reformulation section] The weakest assumption is that the free energy-entropy duality applies directly to the benchmarked factor-based problem, preserving the LQG structure and explicit controls. The derivation (following the model setup) must explicitly verify that the benchmark process and factor dynamics introduce no terms that invalidate the quadratic value function or affine form; otherwise the central claim does not follow.
[Numerical illustration section] Numerical confirmation on U.S. equity data: the paper states it illustrates tractability and confirms equivalence, but the risk sensitivity parameter and entropic regularization strength are free parameters. Without explicit reporting of how they are selected (independent of the confirmation dataset), the numerical results risk circularity and do not fully substantiate the analytical equivalence.

minor comments (2)

[Abstract / Introduction] The abstract and introduction should include a brief equation or statement clarifying the precise form of the entropic regularization term in the equivalent measure to aid readability.
[References] Ensure the full bibliographic details for Dai Pra et al. (1996) and any other key references are complete and consistent.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and detailed comments. We address each major comment point by point below, indicating the revisions planned for the next version of the manuscript.

Referee: [Main derivation / reformulation section] The weakest assumption is that the free energy-entropy duality applies directly to the benchmarked factor-based problem, preserving the LQG structure and explicit controls. The derivation (following the model setup) must explicitly verify that the benchmark process and factor dynamics introduce no terms that invalidate the quadratic value function or affine form; otherwise the central claim does not follow.

Authors: We agree that an explicit verification step will improve transparency. In the revised manuscript we will insert, immediately after the model setup, a short lemma-style paragraph confirming that the augmented state vector (factors plus benchmark process) retains linear dynamics and that the objective remains quadratic under the entropic change of measure. Because the benchmark enters linearly and the running cost is quadratic in the deviation from the benchmark, no nonlinear terms are generated; the LQG structure is therefore preserved and the quadratic value function together with the affine feedback controls follow directly from the duality. This addition will make the central claim fully rigorous without altering any existing results.
Revision: yes
Referee: [Numerical illustration section] Numerical confirmation on U.S. equity data: the paper states it illustrates tractability and confirms equivalence, but the risk sensitivity parameter and entropic regularization strength are free parameters. Without explicit reporting of how they are selected (independent of the confirmation dataset), the numerical results risk circularity and do not fully substantiate the analytical equivalence.

Authors: The analytical equivalence between the original problem and the dual game is established in Sections 3–4 independently of any numerical choice. The U.S. equity example serves only to demonstrate computational tractability. In the revision we will add an explicit statement that the risk-sensitivity parameter and the entropic-regularization strength are fixed a priori using representative values drawn from the existing risk-sensitive investment literature (e.g., moderate risk aversion levels reported in prior studies) and are not tuned or validated on the confirmation dataset. We will also reiterate that the theoretical equivalence holds for any fixed parameter pair, thereby eliminating any appearance of circularity.
Revision: yes

Circularity Check

0 steps flagged

No significant circularity; derivation relies on external duality and standard reformulation

The paper's central derivation applies the free energy-entropy duality from the external reference Dai Pra et al. (1996) to reformulate the benchmarked risk-sensitive problem as an equivalent LQG stochastic differential game under a changed measure, yielding a quadratic value function and explicit affine feedback controls. This step is a direct mathematical equivalence under the cited duality and does not reduce to self-definition, fitted inputs renamed as predictions, or load-bearing self-citations. The numerical U.S. equity illustration is described only as confirming tractability and equivalence of approaches, without any claim that fitted parameters generate the analytical results. The framework remains self-contained against the external duality and stochastic control theory.

Axiom & Free-Parameter Ledger

2 free parameters · 3 axioms · 0 invented entities

The paper rests on the prior duality result and standard stochastic control assumptions. Free parameters are the risk sensitivity coefficient and entropic regularization strength, both typical in such models and likely chosen or fitted. No new entities are postulated.

free parameters (2)

risk sensitivity parameter
Controls the degree of risk aversion in the objective function; appears as a free parameter in the risk-sensitive criterion.
entropic regularization strength
Determines the weight of the entropy term in the duality reformulation; required to obtain the equivalent game.

axioms (3)

domain assumption: Asset returns follow a linear factor model with affine drift and diffusion coefficients.
Invoked in the factor-based setting to enable the LQG structure.
domain assumption: The free energy-entropy duality holds for the class of problems considered, as established by Dai Pra et al. (1996).
Central assumption enabling the change to an equivalent measure and game formulation.
standard math: Solutions to the resulting linear-quadratic-Gaussian stochastic differential game exist and are quadratic.
Standard existence result in stochastic control invoked to guarantee explicit affine controls.
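
For orientation, the three axioms above can be written out in a generic form. The block below is a sketch under assumed notation, borrowed from the usual conventions of the risk-sensitive asset management literature rather than taken from the paper: X is the factor process, S^i the asset prices, W a Brownian motion, gamma the drift perturbation that defines the changed measure, and b, B, Lambda, a, A, Sigma, Q, q, k, alpha, beta are hypothetical coefficient names. Constant diffusion coefficients are assumed for simplicity.

```latex
% Hypothetical notation; a sketch of the structure, not the paper's equations.
\begin{align*}
  % Axiom 1: linear factor dynamics and affine asset drift
  dX_t &= (b + B X_t)\,dt + \Lambda\,dW_t, \\
  \frac{dS^i_t}{S^i_t} &= (a + A X_t)_i\,dt + \sum_{k} \Sigma_{ik}\,dW^k_t, \\[4pt]
  % Axiom 2: entropy of a Girsanov change of measure with drift \gamma
  \left.\frac{dQ^{\gamma}}{dP}\right|_{\mathcal{F}_T}
    &= \exp\!\Big(\int_0^T \gamma_t^{\top} dW_t
       - \tfrac12 \int_0^T |\gamma_t|^2\,dt\Big),
  \qquad
  H(Q^{\gamma}\,|\,P) = \tfrac12\,\mathbb{E}^{Q^{\gamma}}\!\Big[\int_0^T |\gamma_t|^2\,dt\Big], \\[4pt]
  % Axiom 3: quadratic value function and affine feedback control
  \Phi(t,x) &= \tfrac12\,x^{\top} Q(t)\,x + x^{\top} q(t) + k(t),
  \qquad
  h^{*}(t,x) = \alpha(t) + \beta(t)\,x .
\end{align*}
```

The quadratic entropy cost in gamma is what keeps the game linear-quadratic: with linear state dynamics and quadratic running costs, a quadratic ansatz for the value function typically reduces the problem to Riccati-type ordinary differential equations for Q, q, and k.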

Reference graph

Works this paper leans on

30 extracted references

[1] Bensoussan, A. (1992). Stochastic Control of Partially Observable Systems. Cambridge University Press.
[2] Bielecki, T. & Pliska, S. (1999). Risk-sensitive dynamic asset management. Applied Mathematics and Optimization, 39, 337–360.
[3] Bielecki, T. & Pliska, S. (2000). Risk sensitive asset management with transaction costs. Finance and Stochastics, 4, 1–33.
[4] Bielecki, T. & Pliska, S. (2003). Economic properties of the risk sensitive criterion for portfolio management. The Review of Accounting and Finance, 2(2), 3–17.
[5] Bielecki, T. & Pliska, S. (2004). Risk sensitive intertemporal CAPM. IEEE Transactions on Automatic Control, 49(3), 420–432.
[6] Bielecki, T. R., Chen, T., & Cialenco, I. (2022). Risk-Sensitive Markov Decision Problems under Model Uncertainty: Finite Time Horizon Case. In G. Yin & T. Zariphopoulou (Eds.), Stochastic Analysis, Filtering, and Stochastic Optimization: A Commemorative Volume to Honor Mark H. A. Davis's Contributions (pp. 33–52). Cham: Springer International ...
[7] Carhart, M. (1997). On persistence in mutual fund performance. The Journal of Finance, 52(1), 57–82.
[8] Dai Pra, P., Meneghini, L., & Runggaldier, W. J. (1996). Connections between stochastic control and dynamic games. Mathematics of Control, Signals and Systems, 9(4), 303–326.
[9] Davis, M. & Lleo, S. (2008). Risk-sensitive benchmarked asset management. Quantitative Finance, 8(4), 415–426.
[10] Davis, M. & Lleo, S. (2011). Jump-diffusion risk-sensitive asset management I: Diffusion factor model. SIAM Journal on Financial Mathematics, 2, 22–54.
[11] Davis, M. & Lleo, S. (2013a). Black-Litterman in continuous time: The case for filtering. Quantitative Finance Letters, 1(1), 30–35.
[12] Davis, M. & Lleo, S. (2013b). Jump-diffusion risk-sensitive asset management II: Jump-diffusion factor model. SIAM Journal on Control and Optimization, 51(2), 1441
[13] Davis, M. & Lleo, S. (2013c). Jump-diffusion risk-sensitive benchmarked asset management. In H. Gassmann & W. Ziemba (Eds.), Stochastic Programming: Applications in Finance, Energy, Planning and Logistics (pp. 97–128). World Scientific Publishing.
[14] Davis, M. & Lleo, S. (2014). Risk-Sensitive Investment Management, volume 19 of Advanced Series on Statistical Science and Applied Probability. World Scientific Publishing.
[15] Davis, M. & Lleo, S. (2015). Jump-diffusion asset-liability management via risk-sensitive control. OR Spectrum, 37(3), 655–675.
[16] Davis, M. & Lleo, S. (2021). Risk-sensitive benchmarked asset management with expert forecasts. Mathematical Finance, 31(4), 1162–1189.
[17] Davis, M. & Lleo, S. (2024). Jump-diffusion risk-sensitive benchmarked asset management with traditional and alternative data. Annals of Operations Research, 336, 661–689.
[18] Fama, E. & French, K. R. (2015). A five-factor asset pricing model. Journal of Financial Economics, 116(1), 1–22.
[19] Fleming, W. & Soner, H. (2006). Controlled Markov Processes and Viscosity Solutions, volume 25 of Stochastic Modeling and Applied Probability. Springer-Verlag, 2nd edition.
[20] Hata, H. (2017). Risk-sensitive asset management in a general diffusion factor model: Risk-seeking case. Japan Journal of Industrial and Applied Mathematics, 34(1), 59–98.
[21] Hata, H. (2018). Risk-sensitive portfolio optimization problem for a large trader with inside information. Japan Journal of Industrial and Applied Mathematics, 35(3), 1037–1063.
[22] Hata, H. (2021). Risk-Sensitive Asset Management with Lognormal Interest Rates. Asia-Pacific Financial Markets, 28(2), 169–206.
[23] Hata, H. & Iida, Y. (2006). A risk-sensitive stochastic control approach to an optimal investment problem with partial information. Finance and Stochastics, 10(3), 395–426.
[24] Kuroda, K. & Nagai, H. (2002). Risk-sensitive portfolio optimization on infinite time horizon. Stochastics and Stochastics Reports, 73, 309–331.
[25] Liptser, R. & Shiryaev, A. (2004). Statistics of Random Processes: I. General Theory. Probability and Its Applications. Springer-Verlag, 2nd edition.
[26] Lleo, S. & MacLean, L. C. (2025). Dual dominance: how Harry Markowitz and William Ziemba impacted portfolio management. Annals of Operations Research, 346(1), 181–216.
[27] Lleo, S. & Runggaldier, W. (2024). On the separation of estimation and control in risk-sensitive investment problems under incomplete observation. European Journal of Operational Research, 316(1), 200–214.
[28] Lleo, S. & Runggaldier, W. (2026). Reinforcement learning for risk-sensitive investment management: A free energy-entropy duality approach. Preprint.
[29] Meneghini, L. (1994). Modelli Risolvibili per Problemi di Controllo di Sistemi Dinamici Imprecisi Multivariati. PhD thesis, Università degli Studi di Padova.
[30] Nagai, H. & Peng, S. (2002). Risk-sensitive dynamic portfolio optimization with partial information on infinite time horizon. The Annals of Applied Probability, 12(1), 173–195.