Automated Energy-Aware Time-Series Model Deployment on Embedded FPGAs for Resilient Combined Sewer Overflow Management
Pith reviewed 2026-05-18 22:10 UTC · model grok-4.3
The pith
An automated pipeline deploys 8-bit LSTM and Transformer models on FPGAs to forecast sewer overflow levels at very low energy cost.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors establish that an automated hardware-aware deployment pipeline can jointly optimize quantized Transformer and LSTM models for on-device execution on the AMD Spartan-7 XC7S15 FPGA, delivering the selected 8-bit Transformer at MSE 0.0376 and 0.370 mJ per inference or the optimal 8-bit LSTM at MSE 0.0432 and 0.009 mJ per inference when both are trained on 24 hours of historical sewer measurements.
What carries the argument
The automated hardware-aware deployment pipeline that searches model configurations to minimize both prediction error and energy consumption on the target FPGA.
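A minimal sketch of the selection step such a joint search implies, assuming every candidate has already been scored for prediction error and measured energy. The names `Candidate`, `dominates`, and `pareto_front` are hypothetical, and the paper's actual search strategy is not described on this page; only the two operating points are taken from the abstract.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    mse: float        # prediction error, lower is better
    energy_mj: float  # energy per inference in mJ, lower is better


def dominates(a: Candidate, b: Candidate) -> bool:
    """True if a is no worse than b on both objectives and strictly better on one."""
    no_worse = a.mse <= b.mse and a.energy_mj <= b.energy_mj
    strictly_better = a.mse < b.mse or a.energy_mj < b.energy_mj
    return no_worse and strictly_better


def pareto_front(candidates):
    """Keep only candidates that no other candidate dominates."""
    return [c for c in candidates
            if not any(dominates(other, c) for other in candidates)]


# The two operating points reported in the abstract.
reported = [
    Candidate("8-bit Transformer", mse=0.0376, energy_mj=0.370),
    Candidate("8-bit LSTM", mse=0.0432, energy_mj=0.009),
]
front = pareto_front(reported)
```

Neither reported model dominates the other, so both stay on the front. That is the sense in which the choice between them is a deployment priority rather than a clear win.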
If this is right
- Local FPGA inference keeps sewer-level forecasts available during communication outages.
- LSTM configurations are suitable when energy is the primary constraint while Transformer configurations are preferable when accuracy matters more.
- Integer-only quantization enables efficient mapping onto resource-limited FPGAs without large accuracy penalties.
- The resulting systems support more resilient management of combined sewer networks under intensified rainfall.
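To make the integer-only quantization point concrete, here is a small sketch of symmetric per-tensor 8-bit quantization followed by an integer multiply-accumulate. It is an illustration under stated assumptions, not the paper's quantization scheme: the functions `quantize` and `int_dot` and the example vectors are made up for this sketch.

```python
def quantize(values, num_bits=8):
    """Symmetric per-tensor quantization: real value is approximately scale * q."""
    qmax = 2 ** (num_bits - 1) - 1  # 127 for 8 bits
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    q = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return q, scale


def int_dot(qx, qw):
    """Multiply-accumulate on integers only, as a hardware MAC unit would."""
    return sum(a * b for a, b in zip(qx, qw))


# Illustrative inputs and weights (not taken from the paper).
x = [0.50, -1.00, 0.25]
w = [0.20, 0.40, -0.80]

qx, scale_x = quantize(x)
qw, scale_w = quantize(w)

acc = int_dot(qx, qw)            # integer accumulator
approx = acc * scale_x * scale_w  # rescale once to compare with the float result
exact = sum(a * b for a, b in zip(x, w))
```

The final rescale uses floating point here only so the result can be compared with the reference. Integer-only inference schemes typically replace that step with a fixed-point multiplier and a bit shift, so no floating-point unit is needed on the FPGA.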
Where Pith is reading between the lines
- The same search pipeline could be reused for other sensor-driven infrastructure tasks such as water-quality monitoring or urban flood early warning.
- Running the models on data collected during documented extreme events outside the original training window would test robustness to changing climate statistics.
- Pairing the FPGA with local rain gauges could create a fully standalone overflow-alert node.
Load-bearing premise
That the real-world sewer dataset used for training and testing represents the range of conditions that will occur in future extreme weather events, and that 8-bit quantization plus FPGA mapping keeps accuracy high enough for actionable early-warning decisions.
What would settle it
A side-by-side field trial during an actual heavy-rain event that records whether the FPGA predictions trigger timely interventions and match measured overflow volumes better than a simple threshold rule.
Original abstract
Extreme weather events, intensified by climate change, increasingly challenge aging combined sewer systems, raising the risk of untreated wastewater overflow. Accurate forecasting of sewer overflow basin filling levels can provide actionable insights for early intervention, helping mitigating uncontrolled discharge. In recent years, AI-based forecasting methods have offered scalable alternatives to traditional physics-based models, but their reliance on cloud computing limits their reliability during communication outages. To address this, we propose an end-to-end forecasting framework that enables energy-efficient inference directly on edge devices. Our solution integrates lightweight Transformer and Long Short-Term Memory (LSTM) models, compressed via integer-only quantization for efficient on-device execution. Moreover, an automated hardware-aware deployment pipeline is used to search for optimal model configurations by jointly minimizing prediction error and energy consumption on an AMD Spartan-7 XC7S15 FPGA. Evaluated on real-world sewer data, the selected 8-bit Transformer model, trained on 24 hours of historical measurements, achieves high accuracy (MSE 0.0376) at an energy cost of 0.370 mJ per inference. In contrast, the optimal 8-bit LSTM model requires significantly less energy (0.009 mJ, over 40x lower) but yields 14.89% worse accuracy (MSE 0.0432) and much longer training time. This trade-off highlights the need to align model selection with deployment priorities, favoring LSTM for ultra-low energy consumption or Transformer for higher predictive accuracy. In general, our work enables local, energy-efficient forecasting, contributing to more resilient combined sewer systems. All code can be found in the GitHub Repository (https://github.com/tianheng-ling/EdgeOverflowForecast).
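The two headline ratios in the abstract can be checked directly from the four reported numbers. The variable names below are chosen for this check only.

```python
# Values as reported in the abstract.
mse_transformer, mse_lstm = 0.0376, 0.0432
energy_transformer_mj, energy_lstm_mj = 0.370, 0.009

# How many times more energy the Transformer needs per inference.
energy_ratio = energy_transformer_mj / energy_lstm_mj

# Relative MSE increase of the LSTM over the Transformer, in percent.
relative_mse_increase = (mse_lstm - mse_transformer) / mse_transformer * 100

print(f"energy ratio: {energy_ratio:.1f}x")
print(f"LSTM MSE increase: {relative_mse_increase:.2f}%")
```

The ratio comes out at roughly 41x and the MSE increase at 14.89%, consistent with the "over 40x lower" and "14.89% worse" figures in the abstract. The percentage is relative to the Transformer's MSE, not an absolute difference.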
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents an end-to-end automated pipeline for deploying integer-quantized Transformer and LSTM models on an AMD Spartan-7 XC7S15 FPGA for forecasting sewer basin levels to mitigate combined sewer overflows. Using 24 hours of real-world historical measurements, the authors perform a hardware-aware search that jointly optimizes prediction error and energy consumption, reporting that the selected 8-bit Transformer achieves MSE 0.0376 at 0.370 mJ per inference while the optimal 8-bit LSTM achieves MSE 0.0432 at 0.009 mJ (over 40x lower energy but 14.89% higher error). The work emphasizes the resulting accuracy-energy trade-off and includes a GitHub repository with the full experimental code.
Significance. If the reported hardware measurements hold, the paper offers a concrete, reproducible demonstration of edge-AI deployment for resilient infrastructure monitoring under communication constraints. Strengths include direct FPGA energy measurements rather than simulation, open-source code enabling verification of the quantization and mapping steps, and an explicit comparison of two architectures under the same automated search framework. These elements make the work relevant to both embedded-systems and environmental-engineering communities.
major comments (2)
- [Methods / Automated Deployment Pipeline] The description of the automated hardware-aware search (mentioned in the abstract and presumably detailed in the methods) does not explicitly define the search space, including the ranges or discrete choices for model hyperparameters, quantization bit-widths, and FPGA resource-mapping parameters. This is load-bearing for reproducing the claimed optimal 8-bit configurations and the specific MSE/energy numbers.
- [Results / Experimental Evaluation] The central empirical claims rest on point estimates of MSE (0.0376 and 0.0432) and energy (0.370 mJ and 0.009 mJ) without reported error bars, standard deviations across folds, or a clear statement of the cross-validation procedure. This omission makes it difficult to evaluate the statistical robustness of the 14.89% accuracy difference and the overall trade-off conclusion.
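The variability reporting requested in the second major comment could be as simple as the following sketch, which summarizes a metric over repeated runs. The helper names are hypothetical and the input values are placeholders, not results from the paper.

```python
import statistics


def summarize(values):
    """Return (mean, sample standard deviation) of a per-run metric."""
    if len(values) < 2:
        raise ValueError("need at least two runs to estimate a standard deviation")
    return statistics.mean(values), statistics.stdev(values)


def relative_increase_pct(baseline_mean, other_mean):
    """Percentage by which other_mean exceeds baseline_mean."""
    return (other_mean - baseline_mean) / baseline_mean * 100


# Placeholder values only; these are NOT measurements from the paper.
placeholder_runs = [1.0, 2.0, 3.0]
mean, std = summarize(placeholder_runs)
print(f"{mean:.3f} +/- {std:.3f}")
```

Reporting the mean with a standard deviation per model makes it possible to judge whether a gap such as the 14.89% error difference is larger than the run-to-run spread.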
minor comments (2)
- [Abstract] The abstract contains a minor grammatical issue ('helping mitigating' should read 'helping to mitigate').
- [Figures] Ensure that any figures illustrating the energy-accuracy Pareto front or FPGA resource utilization include explicit axis labels, units, and legends for immediate readability.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback, which has helped us strengthen the reproducibility and statistical clarity of the manuscript. We address each major comment below and have revised the paper where appropriate to improve transparency without altering the core claims or results.
Point-by-point responses
- Referee: [Methods / Automated Deployment Pipeline] The description of the automated hardware-aware search (mentioned in the abstract and presumably detailed in the methods) does not explicitly define the search space, including the ranges or discrete choices for model hyperparameters, quantization bit-widths, and FPGA resource-mapping parameters. This is load-bearing for reproducing the claimed optimal 8-bit configurations and the specific MSE/energy numbers.
Authors: We agree that an explicit definition of the search space is necessary for full reproducibility. The original manuscript described the automated pipeline at a high level and pointed to the open-source GitHub repository for implementation details. In the revised version, we have expanded Section 3 (Methods) with a dedicated subsection and table that enumerates the discrete search space: Transformer hyperparameters (layers: 1-4, hidden dimension: 16-64, heads: 1-4), LSTM hyperparameters (layers: 1-3, hidden size: 16-128), quantization options (primarily 8-bit integer with limited 4/16-bit trials), and FPGA mapping parameters (target clock frequencies, DSP/BRAM utilization bounds, and resource allocation heuristics). These additions directly support reproduction of the reported 8-bit configurations and associated metrics. revision: yes
- Referee: [Results / Experimental Evaluation] The central empirical claims rest on point estimates of MSE (0.0376 and 0.0432) and energy (0.370 mJ and 0.009 mJ) without reported error bars, standard deviations across folds, or a clear statement of the cross-validation procedure. This omission makes it difficult to evaluate the statistical robustness of the 14.89% accuracy difference and the overall trade-off conclusion.
Authors: We acknowledge the value of statistical robustness indicators. The evaluation used a single chronological train-test split on the 24-hour real-world dataset to emulate realistic deployment with limited historical data and to prevent temporal leakage. In the revised manuscript we have added an explicit statement of this procedure in Section 4. We have also performed and reported results from five independent training runs with different random seeds, now including mean MSE/energy values with standard deviations and error bars in the updated figures and tables. The relative trade-off (approximately 14.89% higher error for the LSTM) remains consistent across runs, supporting the original conclusion while addressing the concern for variability assessment. revision: partial
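The discrete search space described in the first response can be written down and counted, which gives a rough sense of how many candidates the automated pipeline has to evaluate. This is a sketch under assumptions: the ranges come from the response, but the step sizes (powers of two for hidden sizes, heads in {1, 2, 4}) and the names `TRANSFORMER_SPACE`, `LSTM_SPACE`, and `enumerate_configs` are not from the paper. The FPGA mapping parameters are left out because no values are given for them.

```python
from itertools import product

# Ranges follow the authors' response; the step sizes are assumptions.
TRANSFORMER_SPACE = {
    "layers": [1, 2, 3, 4],
    "hidden_dim": [16, 32, 64],
    "heads": [1, 2, 4],
    "bits": [4, 8, 16],
}
LSTM_SPACE = {
    "layers": [1, 2, 3],
    "hidden_size": [16, 32, 64, 128],
    "bits": [4, 8, 16],
}


def enumerate_configs(space):
    """Yield every combination in the space as a dict of parameter values."""
    keys = list(space)
    for combo in product(*(space[k] for k in keys)):
        cfg = dict(zip(keys, combo))
        # Multi-head attention needs the hidden dimension to split evenly across heads.
        if "heads" in cfg and cfg["hidden_dim"] % cfg["heads"] != 0:
            continue
        yield cfg


transformer_configs = list(enumerate_configs(TRANSFORMER_SPACE))
lstm_configs = list(enumerate_configs(LSTM_SPACE))
print(len(transformer_configs), len(lstm_configs))
```

Under these assumed step sizes the space holds 108 Transformer and 36 LSTM configurations before any FPGA mapping parameters are added, small enough to enumerate exhaustively if each candidate is cheap to evaluate.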
Circularity Check
No significant circularity: empirical hardware measurements
Full rationale
The manuscript describes an end-to-end experimental pipeline that trains, quantizes, and deploys Transformer and LSTM models on an AMD Spartan-7 FPGA, then reports directly measured MSE and energy values on a real-world sewer dataset. No equations, uniqueness theorems, or self-citations are invoked to derive predictions; the reported trade-off (MSE 0.0376 at 0.370 mJ vs. 0.0432 at 0.009 mJ) is the outcome of the automated search and hardware execution rather than a quantity forced by construction from fitted inputs. The argument is therefore self-contained against external benchmarks and receives the default non-circularity finding.
Axiom & Free-Parameter Ledger
free parameters (1)
- quantization bit-width and model hyperparameters
axioms (1)
- domain assumption: FPGA energy profiling accurately reflects real deployment consumption
Forward citations
Cited by 1 Pith paper
- StrikeWatch: Wrist-worn Gait Recognition with Compact Time-series Models on Low-power FPGAs
StrikeWatch deploys four compact time-series models on two low-power FPGAs for on-device heel-forefoot strike classification from wrist IMU data, with the best 6-bit 1D-SepCNN reaching 0.847 F1 at 0.35 µJ and 0.14 ms ...