Hybrid Edge-HPC Systems for Low-Latency Data-Driven Inference
Pith reviewed 2026-05-25 05:40 UTC · model grok-4.3
The pith
RBF decouples low-latency edge inference from delayed HPC model updates using asynchronous surrogate models.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
RBF is a hybrid edge-HPC architecture that integrates low-latency inference with asynchronous simulation-driven model improvement by deploying lightweight surrogate models at the edge and incorporating improved models as they become available from HPC, targeting settings where updates are limited by simulation throughput and scheduling delays.
What carries the argument
Reverse Backfill (RBF) uses opportunistic HPC computation to improve model accuracy rather than system utilization. It decouples inference from simulation and training through pluggable surrogate models orchestrated across edge, private 5G, cloud, and HPC resources.
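The decoupling described above can be pictured with a minimal Python sketch. It is an illustration only, not the paper's implementation: the `Surrogate` interface, the `MeanSurrogate` toy model, the `EdgeInferenceService` class, and the version-number rule are all hypothetical names and assumptions introduced here. The idea shown is that the edge keeps answering from whatever surrogate it currently holds, while a newer model can be installed at any time without pausing inference.

```python
from __future__ import annotations

import threading
from typing import Protocol, Sequence


class Surrogate(Protocol):
    """Hypothetical plug-in interface: sensor readings in, one prediction out."""

    version: int

    def predict(self, readings: Sequence[float]) -> float: ...


class MeanSurrogate:
    """Toy stand-in for a lightweight edge model (assumption, not from the paper)."""

    def __init__(self, version: int, scale: float) -> None:
        self.version = version
        self.scale = scale

    def predict(self, readings: Sequence[float]) -> float:
        return self.scale * sum(readings) / len(readings)


class EdgeInferenceService:
    """Serves predictions from the current surrogate; updates arrive asynchronously."""

    def __init__(self, initial: Surrogate) -> None:
        self._model = initial
        self._lock = threading.Lock()

    def infer(self, readings: Sequence[float]) -> tuple[int, float]:
        # Take a reference under the lock, then predict outside it so a slow
        # prediction never blocks an incoming model update.
        with self._lock:
            model = self._model
        return model.version, model.predict(readings)

    def install(self, candidate: Surrogate) -> bool:
        # Accept only strictly newer models, so late or out-of-order
        # deliveries cannot roll the edge back to an older surrogate.
        with self._lock:
            if candidate.version <= self._model.version:
                return False
            self._model = candidate
            return True


service = EdgeInferenceService(MeanSurrogate(version=1, scale=1.0))
before = service.infer([2.0, 4.0])
accepted = service.install(MeanSurrogate(version=2, scale=0.5))
stale_rejected = not service.install(MeanSurrogate(version=1, scale=9.0))
after = service.infer([2.0, 4.0])
```

Returning the model version alongside each prediction is a design choice of this sketch: it lets downstream consumers tell which surrogate produced a value, which matters when updates are irregular.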
If this is right
- Continuous low-latency inference is maintained despite irregular model updates.
- Model fidelity improves progressively as new simulations complete.
- The system quantifies impacts of delayed updates on prediction accuracy in real deployments.
- Pluggable surrogates enable adaptation to various domains and infrastructures.
Where Pith is reading between the lines
- Applications in other sensor-driven domains like environmental monitoring could benefit from similar decoupling.
- Reducing the frequency of full HPC runs might become feasible if surrogates hold accuracy well.
- Extending the approach to include feedback loops where edge data informs the next simulations could be tested.
Load-bearing premise
Lightweight surrogate models at the edge maintain usable accuracy during intervals between asynchronous HPC model updates.
What would settle it
Demonstrating that edge surrogate predictions deviate unacceptably from ground truth during typical delays between model updates in the screenhouse CFD deployment would disprove the claim.
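One way such a test could be set up is sketched below in Python. Everything here is assumed for illustration: the function names `active_version` and `stale_violations`, the tolerance, and the toy data are hypothetical and do not come from the paper or its deployment. The sketch replays time-stamped samples against whichever model was in service at each moment and reports the samples whose absolute error exceeds a chosen tolerance.

```python
from bisect import bisect_right


def active_version(update_times, t):
    """Index of the model in service at time t (0 = initially deployed model)."""
    return bisect_right(update_times, t)


def stale_violations(samples, update_times, predictors, tolerance):
    """Return (time, abs_error) for every sample whose error exceeds tolerance.

    samples:      list of (time, sensor_input, ground_truth)
    update_times: sorted times at which newer models reached the edge
    predictors:   predictors[i] is the model in service after i updates
    """
    violations = []
    for t, x, truth in samples:
        model = predictors[active_version(update_times, t)]
        error = abs(model(x) - truth)
        if error > tolerance:
            violations.append((t, error))
    return violations


# Toy data (entirely made up): the true response shifts from 1.0*x to 1.5*x
# at t=5, but the retrained model only reaches the edge at t=8.
samples = [(t, 2.0, (1.0 if t < 5 else 1.5) * 2.0) for t in range(10)]
predictors = [lambda x: 1.0 * x, lambda x: 1.5 * x]
violations = stale_violations(
    samples, update_times=[8], predictors=predictors, tolerance=0.5
)
```

In this toy run the violations fall exactly in the window between the change in the physical process and the arrival of the updated model, which is the interval the premise depends on.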
Original abstract
Emerging cyber-physical systems increasingly require low-latency inference from streaming sensor data while maintaining models that reflect complex and evolving physical processes. In many domains, however, model updates depend on high-fidelity simulations and training executed on remote high-performance computing (HPC) systems under batch scheduling. This creates a fundamental mismatch between the responsiveness required at the edge and the cost, throughput, and availability of simulation-driven model updates. We present RBF (Reverse Backfill), a hybrid edge-HPC learning and inference architecture that integrates low-latency edge inference with asynchronous, simulation-driven model improvement. RBF targets simulation-bounded settings in which model updates are constrained by simulation throughput and HPC scheduling delays, and reinterprets HPC backfilling by using opportunistic computation to improve model accuracy rather than system utilization. RBF decouples inference from simulation and training by deploying lightweight surrogate models at the edge while incorporating improved models asynchronously as they become available. The architecture supports pluggable surrogate models and orchestrates computation across heterogeneous infrastructure spanning edge devices, private 5G, cloud, and HPC resources. We instantiate RBF using a real-world digital agriculture deployment that couples edge sensing with computational fluid dynamics (CFD) simulations to infer airflow patterns in a large agricultural screenhouse. Our evaluation characterizes end-to-end system behavior under realistic constraints, quantifying simulation latency, training cost, inference throughput, and the impact of delayed model updates on prediction accuracy. Results demonstrate that RBF enables continuous, low-latency inference while improving model fidelity over time despite delayed and irregular model updates.
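The abstract does not spell out how simulations are chosen for idle HPC capacity, so the following Python sketch is only a guess at the general shape of the idea. It swaps in a simple greedy heuristic: among candidate simulations that fit an idle window, prefer those with the highest estimated accuracy gain per node-hour. The `SimJob` fields, the `expected_gain` estimate, the job names, and the `pick_for_window` function are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SimJob:
    name: str
    nodes: int
    hours: float
    expected_gain: float  # hypothetical estimate of accuracy improvement


def pick_for_window(jobs, free_nodes, window_hours):
    """Greedily fill an idle window with the jobs of highest gain per node-hour."""
    # Keep only jobs that fit the window in both duration and node count.
    fitting = [j for j in jobs if j.hours <= window_hours and j.nodes <= free_nodes]
    fitting.sort(key=lambda j: j.expected_gain / (j.nodes * j.hours), reverse=True)
    chosen, remaining = [], free_nodes
    for job in fitting:
        if job.nodes <= remaining:
            chosen.append(job)
            remaining -= job.nodes
    return chosen


# Made-up candidates for illustration only.
candidates = [
    SimJob("coarse-north-wind", nodes=2, hours=1.0, expected_gain=0.04),
    SimJob("fine-north-wind", nodes=8, hours=3.0, expected_gain=0.10),
    SimJob("coarse-gusts", nodes=2, hours=2.0, expected_gain=0.06),
    SimJob("fine-gusts", nodes=4, hours=6.0, expected_gain=0.12),
]
chosen = pick_for_window(candidates, free_nodes=8, window_hours=4.0)
```

The contrast with conventional backfilling is in the sort key: a utilization-driven scheduler would rank by how well jobs fill the gap, whereas this sketch ranks by estimated benefit to the model.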
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript introduces RBF (Reverse Backfill), a hybrid edge-HPC architecture that decouples low-latency inference at the edge (via lightweight surrogate models) from asynchronous, simulation-driven model updates on HPC systems. It targets simulation-bounded cyber-physical systems and is instantiated in a real digital agriculture deployment coupling edge sensing with CFD simulations for airflow inference in a screenhouse. The evaluation characterizes end-to-end behavior including simulation latency, training cost, inference throughput, and the impact of delayed model updates on prediction accuracy, claiming that RBF enables continuous low-latency inference with improving fidelity over time despite irregular updates.
Significance. If the results hold, this work is significant for addressing the responsiveness mismatch between edge devices and batch-scheduled HPC in data-driven applications. The real-world deployment and explicit quantification of delayed-update effects on accuracy directly test the viability of edge surrogates between asynchronous updates, rather than leaving the assumption unexamined. The architecture's support for pluggable surrogates and orchestration across edge, private 5G, cloud, and HPC resources is a practical strength.
Minor comments (2)
- [Abstract] The claim that 'results demonstrate' the benefits would be strengthened by including one or two key quantitative outcomes (e.g., an accuracy delta or latency values with error bars) in the abstract itself.
- The manuscript would benefit from a short table summarizing the measured end-to-end metrics (simulation latency, inference throughput, accuracy under delay) for quick reference.
Simulated Author's Rebuttal
We thank the referee for their positive assessment of the work and for recommending minor revision. No major comments were provided in the report.
Circularity Check
No significant circularity; architecture and evaluation are self-contained
The paper describes a hybrid edge-HPC architecture (RBF) and reports an empirical evaluation on a digital agriculture deployment that measures simulation latency, training cost, inference throughput, and accuracy impact from delayed updates. No equations, fitted parameters renamed as predictions, self-citations used as load-bearing uniqueness theorems, or ansatzes smuggled via prior work appear in the provided text. The central claim is supported by direct measurement of the decoupling assumption rather than by construction or redefinition of inputs. This is the normal case of an externally falsifiable systems paper.