A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem

Zhengyao Jiang, Dixing Xu, Jinjun Liang

classification q-fin.CP cs.AI q-fin.PM

keywords framework, learning, three, financial, management, portfolio, cryptocurrency, deep

Financial portfolio management is the process of constant redistribution of a fund into different financial products. This paper presents a financial-model-free Reinforcement Learning framework to provide a deep machine learning solution to the portfolio management problem. The framework consists of the Ensemble of Identical Independent Evaluators (EIIE) topology, a Portfolio-Vector Memory (PVM), an Online Stochastic Batch Learning (OSBL) scheme, and a fully exploiting and explicit reward function. The framework is realized in three instances in this work: a Convolutional Neural Network (CNN), a basic Recurrent Neural Network (RNN), and a Long Short-Term Memory (LSTM). These are examined, along with a number of recently reviewed or published portfolio-selection strategies, in three back-test experiments with a trading period of 30 minutes in a cryptocurrency market. Cryptocurrencies are electronic and decentralized alternatives to government-issued money, with Bitcoin as the best-known example. All three instances of the framework take the top three positions in all experiments, outdistancing the other compared trading algorithms. Even with a high commission rate of 0.25% in the back-tests, the framework is able to achieve at least 4-fold returns in 50 days.
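
To put the headline figure in per-period terms, the following is a minimal arithmetic sketch, not taken from the paper. It assumes the market trades around the clock (so every day contributes 48 half-hour periods) and that growth compounds at a constant geometric average; the constant and function names are hypothetical, chosen only for illustration.

```python
import math

# Figures quoted in the abstract.
PERIOD_MINUTES = 30   # trading period
BACKTEST_DAYS = 50    # span over which the return is reported
TOTAL_RETURN = 4.0    # "at least 4-fold": final value / initial value


def periods_in(days: int, period_minutes: int) -> int:
    """Number of trading periods in `days`, assuming 24-hour trading (an assumption)."""
    return days * 24 * 60 // period_minutes


def per_period_growth(total_return: float, n_periods: int) -> float:
    """Constant growth factor g such that g ** n_periods == total_return."""
    return math.exp(math.log(total_return) / n_periods)


n = periods_in(BACKTEST_DAYS, PERIOD_MINUTES)
g = per_period_growth(TOTAL_RETURN, n)
print(f"{n} periods, average growth per period: {(g - 1) * 100:.4f}%")
```

Under these assumptions, a 4-fold return corresponds to an average gain of a small fraction of a percent per 30-minute period. Because the reported return is already net of fees, this average is what remains after the 0.25% commission has been paid on whatever is traded.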

Forward citations

Cited by 6 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

Prediction Arena: Benchmarking AI Models on Real-World Prediction Markets
cs.LG 2026-03 unverdicted novelty 7.0

Frontier AI models lose 16-31% trading on Kalshi over 57 days but show better results on Polymarket, with platform design strongly affecting outcomes and prediction accuracy mattering more than research volume.
A Meta Reinforcement Learning Approach to Goals-Based Wealth Management
cs.LG 2026-05 unverdicted novelty 6.0

MetaRL pre-trained on GBWM problems delivers near-optimal dynamic strategies in 0.01s achieving 97.8% of DP optimal utility and handles larger problems where DP fails.
SBCA: Cross-Modal BERT-driven Actor-Critic for Multi-Asset Portfolio Optimization
q-fin.CP 2026-05 unverdicted novelty 6.0

SBCA is a reinforcement learning framework using BERT cross-modal fusion and Actor-Critic to integrate price data with sentiment text for multi-asset portfolio optimization with practical trading constraints.
When Missing Becomes Structure: Intent-Preserving Policy Completion from Financial KOL Discourse
cs.LG 2026-04 unverdicted novelty 6.0

KICL completes execution decisions in KOL financial discourse using offline RL, achieving top returns and Sharpe ratios with no unsupported trades or direction changes on YouTube and X data from 2022-2025.
Portfolio Optimization Proxies under Label Scarcity and Regime Shifts via Bayesian and Deterministic Students under Semi-Supervised Sandwich Training
cs.LG 2026-04 unverdicted novelty 4.0

A semi-supervised teacher-student framework enables neural networks to proxy CVaR portfolio optimization using synthetic data augmentation for scarce labels and regime shifts.
A Systematic Review of Recent Advancements in PINN Augmented Deep Learning and Mathematical Modeling for Efficient Portfolio Management
math.OC 2026-04 unverdicted novelty 2.0

A systematic review of physics-informed neural networks and mathematical modeling approaches for portfolio optimization and management in finance.