Emission-Aware Reinforcement Learning for Sustainable Electric Vehicle Charging and Carbon Dioxide Reduction Under Varying Renewable Penetration
Pith reviewed 2026-06-30 13:06 UTC · model grok-4.3
The pith
An emission-aware reinforcement learning agent using Soft Actor-Critic reduces EV charging carbon intensity to 23.96 gCO2/kWh under 50% wind penetration.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The SAC agent achieves a carbon intensity as low as 23.96 grams of carbon dioxide per kilowatt-hour under 50% wind penetration, representing up to 87% emission reduction versus the uncontrolled baseline, and outperforms the external graph-based Power Distribution Network benchmark. Transformer overload stays below 7 kWh across scenarios compared with up to 1093 kWh for the As Fast As Possible heuristic, and renewable self-consumption reaches 52% under combined wind and solar supply.
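The three headline metrics can be made concrete with a minimal sketch. The definitions below are assumptions chosen for illustration (the summary does not give the paper's exact formulas), and every function and parameter name is hypothetical: carbon intensity is taken as grid-import emissions divided by energy delivered to EVs, overload as energy above a transformer limit, and self-consumption as the share of on-site renewable energy absorbed by charging.

```python
from typing import Sequence


def carbon_intensity_g_per_kwh(
    grid_import_kwh: Sequence[float],
    grid_ci_g_per_kwh: Sequence[float],
    ev_energy_kwh: Sequence[float],
) -> float:
    """Grid-import emissions divided by total energy delivered to EVs (assumed definition)."""
    emissions_g = sum(e * ci for e, ci in zip(grid_import_kwh, grid_ci_g_per_kwh))
    delivered_kwh = sum(ev_energy_kwh)
    return emissions_g / delivered_kwh if delivered_kwh > 0 else 0.0


def transformer_overload_kwh(
    load_kw: Sequence[float], limit_kw: float, dt_h: float
) -> float:
    """Energy above the transformer limit, summed over all intervals of length dt_h."""
    return sum(max(0.0, p - limit_kw) * dt_h for p in load_kw)


def renewable_self_consumption(
    renewable_kwh: Sequence[float], ev_energy_kwh: Sequence[float]
) -> float:
    """Share of on-site renewable energy that EV charging absorbs in the same interval."""
    generated = sum(renewable_kwh)
    used = sum(min(r, e) for r, e in zip(renewable_kwh, ev_energy_kwh))
    return used / generated if generated > 0 else 0.0


def reduction_pct(baseline: float, controlled: float) -> float:
    """Percentage reduction of a metric relative to a baseline value."""
    return 100.0 * (baseline - controlled) / baseline
```

Under these definitions, energy served directly from on-site renewables counts as zero-emission, which is why a high renewable share can push the reported intensity far below the grid average.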
What carries the argument
The argument rests on the multi-objective reward of the Soft Actor-Critic agent, which penalizes carbon emissions, curtailed renewables, and unmet demand inside the EV2Gym environment, with time-varying carbon intensity included in the state.
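A reward of this shape might look like the sketch below. It assumes a simple weighted sum of the three penalties; the summary does not report the paper's weights or exact functional form, so `RewardWeights`, its default values, and `emission_aware_reward` are hypothetical names and placeholders, not the authors' implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RewardWeights:
    # Placeholder weights: the paper's actual values are not given in the summary.
    emissions: float = 1.0
    curtailment: float = 1.0
    unmet_demand: float = 1.0


def emission_aware_reward(
    grid_import_kwh: float,
    carbon_intensity_g_per_kwh: float,
    curtailed_kwh: float,
    unmet_demand_kwh: float,
    weights: RewardWeights = RewardWeights(),
) -> float:
    """Negative weighted sum of emissions, curtailed renewables, and unmet demand."""
    # Emissions caused by energy drawn from the grid in this time step, in kg CO2.
    emissions_kg = grid_import_kwh * carbon_intensity_g_per_kwh / 1000.0
    penalty = (
        weights.emissions * emissions_kg
        + weights.curtailment * curtailed_kwh
        + weights.unmet_demand * unmet_demand_kwh
    )
    return -penalty
```

With equal weights, the same 10 kWh of grid import costs more reward when the grid is at 300 gCO2/kWh than at 50 gCO2/kWh, which is the mechanism that would steer charging toward low-emission periods.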
If this is right
- Charging schedules align with low-emission grid periods while keeping overloads low.
- Renewable self-consumption improves to 52% under combined wind and solar supply.
- The agent outperforms both heuristic and model-predictive baselines across all tested renewable shares.
- Grid compliance and user satisfaction are preserved alongside the emission reductions.
Where Pith is reading between the lines
- The same reward structure could be adapted to include vehicle-to-grid discharge during high-carbon periods.
- Scaling the method to residential chargers would require adjusting the state for different arrival and departure patterns.
- Embedding live carbon-price signals in the reward might further increase alignment with market incentives.
Load-bearing premise
The EV2Gym simulator with behind-the-meter solar and wind profiles and EirGrid carbon intensity data sufficiently represents real-world EV user behavior, grid constraints, and renewable variability.
What would settle it
A controlled deployment on live EVSE units connected to a real distribution feeder that records actual carbon intensity and measures whether emission reductions reach the simulated 87% level versus an uncontrolled baseline.
Original abstract
The rapid growth of Electric Vehicle (EV) adoption challenges power distribution networks through peak load spikes, voltage instability, and transformer overloads from uncoordinated charging. While Model Predictive Control (MPC) and standard Reinforcement Learning (RL) methods have addressed these issues, existing approaches rarely treat real-time carbon intensity or fluctuating renewable energy (RE) availability as primary scheduling objectives, leaving substantial decarbonisation potential unrealised. This paper proposes an emission-aware RL strategy based on the Soft Actor Critic (SAC) algorithm, with a multi-objective reward that penalises carbon emissions, curtailed on-site renewables, and unmet user demand. The agent is trained within a unified benchmarking framework on the EV2Gym platform, incorporating behind-the-meter solar and wind profiles, time-varying EirGrid carbon intensity data, and realistic workplace EV behaviour across 25 Electric Vehicle Supply Equipment (EVSE) units. Nine control strategies, including heuristics, emission-aware MPC variants, and the proposed RL agent, are compared under five renewable penetration scenarios (0%-50%) over ten independent runs each. The RL agent achieves a carbon intensity as low as 23.96 grams of carbon dioxide per kilowatt-hour under 50% wind penetration, representing up to 87% emission reduction versus the uncontrolled baseline, and outperforms the external graph-based Power Distribution Network (PDN) benchmark. Transformer overload remains below 7 kWh across scenarios, against up to 1093 kWh for the As Fast As Possible (AFAP) heuristic, and renewable self-consumption reaches 52% under combined wind and solar supply. Embedding carbon intensity forecasts into the RL state and reward aligns charging with low-emission periods while preserving grid compliance and user satisfaction.
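The abstract's closing point, that carbon intensity forecasts are embedded in the RL state, can be illustrated with a small sketch. It assumes a flat observation vector made of time of day, per-charger state of charge, and a short window of upcoming carbon intensity values. The function name, the layout, the scaling constant, and the use of the known future series as a stand-in for a forecast are all assumptions for illustration, not the EV2Gym observation format.

```python
from typing import List, Sequence


def build_observation(
    step: int,
    steps_per_day: int,
    soc: Sequence[float],
    carbon_g_per_kwh: Sequence[float],
    horizon: int,
    ci_scale: float = 500.0,  # hypothetical normalisation constant
) -> List[float]:
    """Concatenate time of day, state of charge, and a scaled carbon intensity window."""
    time_of_day = (step % steps_per_day) / steps_per_day
    # Stand-in for a forecast: the next `horizon` values of the known series.
    window = list(carbon_g_per_kwh[step : step + horizon])
    # Near the end of the series, pad by repeating the last available value.
    last = window[-1] if window else 0.0
    window += [last] * (horizon - len(window))
    forecast = [ci / ci_scale for ci in window]
    return [time_of_day, *soc, *forecast]
```

Keeping the forecast window in the observation lets the policy see upcoming low-carbon periods directly instead of having to infer them from time of day alone.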
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript proposes an emission-aware Soft Actor-Critic (SAC) reinforcement learning agent for EV charging scheduling. It uses a multi-objective reward penalizing carbon emissions, renewable curtailment, and unmet demand, trained in the EV2Gym simulator incorporating behind-the-meter solar/wind profiles, EirGrid carbon intensity data, and realistic workplace EV behavior across 25 EVSE units. Nine strategies (heuristics, MPC variants, RL) are compared over five renewable penetration scenarios (0-50%) in ten runs each. The central claims are that the RL agent reaches a carbon intensity of 23.96 gCO2/kWh at 50% wind penetration (up to 87% reduction vs. uncontrolled baseline), outperforms the graph-based PDN benchmark, keeps transformer overload below 7 kWh (vs. 1093 kWh for AFAP), and achieves 52% renewable self-consumption.
Significance. If the simulation results are robust and transferable, the work demonstrates that embedding carbon forecasts into RL state and reward can yield substantial emission reductions in EV charging while maintaining grid compliance and user satisfaction, outperforming both simple heuristics and MPC under varying renewable levels. The multi-scenario evaluation and use of real carbon intensity data provide a useful benchmark for sustainable control strategies.
major comments (2)
- [Abstract/Methods] Abstract and Methods: The reported performance numbers (23.96 gCO2/kWh, 87% reduction, outperformance vs. PDN) are obtained from ten independent runs, but the manuscript supplies no reward-function weights for the multi-objective SAC, no SAC training hyperparameters, and no statistical significance tests or sensitivity analysis on user-arrival stochasticity; these omissions make the central quantitative claims impossible to reproduce or to assess for robustness.
- [Methods] Methods: The 87% emission-reduction claim and outperformance results rest entirely on the EV2Gym simulator's internal models of EV user behavior, behind-the-meter renewable profiles, and grid constraints, yet no validation against empirical charging-session logs, grid telemetry, or sensitivity tests on renewable variability/curtailment dynamics is reported; this is load-bearing because any systematic mismatch would render the metrics simulation artifacts rather than transferable findings.
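The significance testing requested in the first major comment could be sketched as follows. This is an exact paired sign-flip permutation test written with the standard library only; it is a substitute technique chosen for self-containment, not the paired t-test or Wilcoxon test mentioned elsewhere in the report, and the function name is hypothetical. It assumes the two strategies are evaluated on the same ten seeds so that runs can be paired.

```python
from itertools import product
from typing import Sequence, Tuple


def paired_sign_flip_test(
    a: Sequence[float], b: Sequence[float]
) -> Tuple[float, float]:
    """Return (mean paired difference, exact two-sided p-value).

    Under the null hypothesis the sign of each paired difference is
    arbitrary, so every sign pattern is enumerated and the fraction with a
    mean at least as extreme as the observed one is the p-value.
    """
    if len(a) != len(b) or not a:
        raise ValueError("a and b must be non-empty and of equal length")
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    observed = abs(sum(diffs)) / n
    extreme = 0
    for signs in product((1.0, -1.0), repeat=n):
        flipped = abs(sum(s * d for s, d in zip(signs, diffs))) / n
        # Small tolerance so floating-point ties count as "at least as extreme".
        if flipped >= observed - 1e-12:
            extreme += 1
    return sum(diffs) / n, extreme / 2 ** n
```

With ten paired runs the loop enumerates 2^10 = 1024 sign patterns, so the smallest attainable two-sided p-value is 2/1024, roughly 0.002; that floor is worth keeping in mind when reading any significance claim based on ten runs.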
Simulated Author's Rebuttal
We thank the referee for the constructive comments on reproducibility and robustness. We address each major point below and will revise the manuscript to improve transparency and add requested analyses where feasible.
- Referee: [Abstract/Methods] Abstract and Methods: The reported performance numbers (23.96 gCO2/kWh, 87% reduction, outperformance vs. PDN) are obtained from ten independent runs, but the manuscript supplies no reward-function weights for the multi-objective SAC, no SAC training hyperparameters, and no statistical significance tests or sensitivity analysis on user-arrival stochasticity; these omissions make the central quantitative claims impossible to reproduce or to assess for robustness.
Authors: We agree the manuscript omitted these details. In the revision we will explicitly report the reward-function weights for emissions, curtailment and unmet demand; the complete SAC hyperparameters (learning rate, discount factor, batch size, network architecture, entropy coefficient schedule); results of statistical significance tests (e.g., paired t-tests or Wilcoxon tests across the ten runs); and a sensitivity analysis that perturbs user-arrival distributions while keeping other factors fixed. revision: yes
- Referee: [Methods] Methods: The 87% emission-reduction claim and outperformance results rest entirely on the EV2Gym simulator's internal models of EV user behavior, behind-the-meter renewable profiles, and grid constraints, yet no validation against empirical charging-session logs, grid telemetry, or sensitivity tests on renewable variability/curtailment dynamics is reported; this is load-bearing because any systematic mismatch would render the metrics simulation artifacts rather than transferable findings.
Authors: The study is a controlled simulation benchmark using real EirGrid carbon-intensity traces and the EV2Gym environment. We will add sensitivity experiments that systematically vary renewable generation profiles and curtailment parameters to quantify robustness. Full empirical validation against proprietary charging-session logs and grid telemetry is not possible in the current work because such datasets were unavailable to the authors. revision: partial
- Empirical validation of simulator models against real-world charging-session logs and grid telemetry, as such data were not accessible.
Circularity Check
No significant circularity; results are forward simulation comparisons
The paper's central claims consist of empirical performance metrics (carbon intensity of 23.96 gCO2/kWh, up to 87% reduction vs baseline, outperformance vs heuristics/MPC/PDN) obtained by training an SAC agent with a multi-objective reward in the EV2Gym simulator and running forward simulations under five renewable scenarios. No equations, fitted parameters, or self-citations are shown that reduce these metrics to the inputs by construction. The reward design and state embedding are explicit choices, not tautological reductions. Comparisons to external baselines and ten independent runs make the evaluation self-contained and falsifiable outside the paper's fitted values.