Fair and Efficient Scheduling for Sensor Networks via Online Whittle Index Policy
Pith reviewed 2026-05-12 01:02 UTC · model grok-4.3
The pith
An online Whittle index policy using Age of Incorrect Information cuts sensor network transmissions by up to 70 percent compared to round-robin polling while keeping estimation errors within acceptable limits.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The paper establishes that an online state-estimation procedure can compute Whittle indices for the Age of Incorrect Information metric without prior knowledge of transition dynamics, yielding WAoII and FWAoII policies that schedule node polling in wake-up radio networks. These policies reduce packet transmissions by up to 70 percent relative to round-robin polling while keeping root-mean-square error within acceptable application tolerances on both real and synthetic data sets.
What carries the argument
The online Whittle Index AoII (WAoII) policy, derived by estimating unknown transition dynamics from observed states and then applying the index policy of the resulting restless multi-armed bandit formulation of AoII minimization.
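The scheduling step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the common linear-time form of AoII (the age grows by one slot while the monitor's estimate is wrong and resets when it is right), and the names `update_aoii`, `select_nodes_to_poll`, and `placeholder_index` are hypothetical. The placeholder ranks nodes by raw AoII and merely stands in for the Whittle index, whose closed form is not given on this page. In a pull-based deployment the monitor cannot observe the true state between polls, so a real scheduler would work with an estimate of this quantity.

```python
from typing import Callable, List, Sequence


def update_aoii(age: int, true_state: int, monitor_estimate: int) -> int:
    """Linear-time AoII: grow by one slot while the monitor is wrong, reset when right."""
    return age + 1 if true_state != monitor_estimate else 0


def select_nodes_to_poll(
    ages: Sequence[int],
    index_fn: Callable[[int, int], float],
    budget: int,
) -> List[int]:
    """Return the ids of the `budget` nodes with the largest index values."""
    scored = [(index_fn(node, age), node) for node, age in enumerate(ages)]
    # Sort by descending index; break ties by node id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [node for _, node in scored[:budget]]


def placeholder_index(node: int, age: int) -> float:
    # Stand-in only (assumption): ranks by raw AoII.
    # The paper's Whittle index, computed from estimated dynamics, would go here.
    return float(age)
```

Keeping the index function as a parameter separates the ranking rule from the polling budget, so a learned index can replace the placeholder without touching the selection logic.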
Load-bearing premise
The online state-estimation step recovers enough information about the unknown transition dynamics to produce reliable Whittle indices that correctly rank which nodes to poll.
What would settle it
A controlled deployment in which the state estimator converges to inaccurate transition estimates and the resulting WAoII policy either transmits at least as many packets as round-robin or produces root-mean-square error above the stated application tolerance.
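The falsification condition above can be written as a small check. This is a sketch under stated assumptions: the function names `rmse` and `claim_refuted` and the `tolerance` parameter are hypothetical, and the page does not say what tolerance the paper uses, so the caller must supply one.

```python
import math
from typing import Sequence


def rmse(truth: Sequence[float], estimate: Sequence[float]) -> float:
    """Root-mean-square error between two equally long series."""
    if len(truth) != len(estimate) or len(truth) == 0:
        raise ValueError("series must be non-empty and of equal length")
    squared = sum((t - e) ** 2 for t, e in zip(truth, estimate))
    return math.sqrt(squared / len(truth))


def claim_refuted(
    tx_policy: int,
    tx_round_robin: int,
    policy_rmse: float,
    tolerance: float,  # application-specific; not defined on this page
) -> bool:
    """True if the policy sends at least as many packets as round-robin
    or its RMSE exceeds the stated tolerance."""
    return tx_policy >= tx_round_robin or policy_rmse > tolerance
```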
Original abstract
Wake-Up Radio (WUR) enables resource-constrained, battery-powered sensor nodes to remain in a low-power deep sleep state while continuously listening for a Wake-Up Signal (WUS). Sensor nodes only wake and transmit data after receiving the WUS, significantly reducing energy consumption. However, polling nodes whose transmitted data provides little or no meaningful update to the remote monitor can still result in unnecessary energy usage and increased storage overhead. To address this issue, this paper uses the Age of Incorrect Information (AoII) metric to prioritise the polling of nodes that provide informative updates to the remote monitor. Determining the optimal set of nodes to poll based on AoII can be formulated as a Restless Multi-Armed Bandit (RMAB) problem, which traditionally requires prior knowledge of the monitored process transition dynamics. Since such dynamics are often unknown in practical deployments, we propose an online learning framework based on state estimation to derive Whittle Index AoII (WAoII) and Fair Whittle Index AoII (FWAoII) policies without assuming known transition probabilities. The proposed policies efficiently schedule node polling while adapting to unknown process behaviour. Experimental evaluation using both real-world and synthetic datasets demonstrates that the proposed online WAoII policy can reduce packet transmissions by up to 70% compared to the widely used Round Robin (RR) polling strategy, while maintaining Root Mean Squared Error (RMSE) values within acceptable application error tolerances. These results demonstrate the effectiveness of WAoII and FWAoII as energy-efficient polling techniques for low-power WUR sensor networks.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper formulates polling scheduling for Wake-Up Radio sensor networks as a Restless Multi-Armed Bandit (RMAB) problem using the Age of Incorrect Information (AoII) metric to prioritize informative updates. It proposes online WAoII and FWAoII policies that use state estimation to compute Whittle indices without assuming known transition probabilities, and reports that these policies reduce packet transmissions by up to 70% versus Round Robin while keeping RMSE within acceptable tolerances on real-world and synthetic datasets.
Significance. If the online state estimation reliably recovers the underlying dynamics, the work provides a practical, adaptive scheduling method that extends battery life in resource-constrained WUR networks without requiring prior process models. The experimental results on both real and synthetic traces constitute a concrete strength, demonstrating measurable transmission savings while respecting application-level error bounds.
major comments (2)
- [online learning framework and WAoII/FWAoII policy derivation] The online learning framework (state estimation for unknown transition probabilities) provides no convergence guarantees, error bounds, or robustness analysis for the recovered dynamics used to compute Whittle indices. This is load-bearing for the central claim, as inaccurate indices would invalidate the prioritization that produces the reported 70% transmission reduction.
- [Experimental Evaluation] The manuscript reports RMSE values within tolerances and up to 70% savings versus RR but supplies no quantitative comparison of estimated versus true transition probabilities, no statistical significance tests across runs, and no tests under non-stationarity or observation noise. Without these, it is unclear whether the performance generalizes beyond the specific traces.
minor comments (2)
- [Abstract] The abstract states that RMSE remains 'within acceptable application error tolerances' but does not define or justify those tolerances or link them to specific application requirements.
- [Proposed online learning framework] Notation for the estimated state and the online estimator could be clarified with an explicit algorithm box or pseudocode to improve reproducibility.
Simulated Author's Rebuttal
We thank the referee for the constructive and detailed feedback. We address each major comment point by point below, indicating where revisions will be made to the manuscript.
- Referee: [online learning framework and WAoII/FWAoII policy derivation] The online learning framework (state estimation for unknown transition probabilities) provides no convergence guarantees, error bounds, or robustness analysis for the recovered dynamics used to compute Whittle indices. This is load-bearing for the central claim, as inaccurate indices would invalidate the prioritization that produces the reported 70% transmission reduction.
  Authors: We agree that the manuscript does not include formal convergence guarantees, error bounds, or a dedicated robustness analysis for the state estimation step. The estimation uses online frequency counts of observed transitions, a standard method for learning unknown Markov dynamics, but we did not derive Whittle-index-specific bounds or prove convergence rates in the RMAB setting. In the revision we will add a dedicated subsection on the estimation procedure, recall its known asymptotic consistency under standard ergodicity assumptions, and include empirical plots of estimation error versus sample size on the synthetic traces. We will also discuss how index computation is affected by moderate estimation error. These additions will clarify the practical reliability of the approach while acknowledging that a full theoretical analysis remains future work.
  Revision: partial
- Referee: [Experimental Evaluation] The manuscript reports RMSE values within tolerances and up to 70% savings versus RR but supplies no quantitative comparison of estimated versus true transition probabilities, no statistical significance tests across runs, and no tests under non-stationarity or observation noise. Without these, it is unclear whether the performance generalizes beyond the specific traces.
  Authors: We accept that the current experimental section lacks these quantitative checks. In the revised manuscript we will: (i) add direct comparisons (tables and plots) of estimated versus ground-truth transition probabilities on all synthetic datasets, reporting L1 or total-variation error; (ii) repeat all experiments over 20 independent runs and report mean performance with standard deviation together with paired statistical significance tests (t-tests or Wilcoxon signed-rank) against Round-Robin; (iii) introduce new experiments that inject controlled non-stationarity (abrupt or gradual changes in transition matrices) and additive observation noise, measuring degradation in transmission savings and RMSE. These results will be placed in an expanded experimental section to support claims of generalizability.
  Revision: yes
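The frequency-count estimation mentioned in the first response, and the estimated-versus-true comparison promised in the second, might look roughly like the sketch below. It is an illustration only: the class and function names are hypothetical, the additive-smoothing `prior` is an assumption introduced here to avoid empty rows, and the sketch assumes each observed pair of states comes from consecutive time slots, which a polled network does not guarantee.

```python
from typing import List, Sequence


class TransitionEstimator:
    """Online frequency-count estimate of a finite-state Markov transition matrix."""

    def __init__(self, n_states: int, prior: float = 1.0) -> None:
        # `prior` is an assumed smoothing pseudo-count, not taken from the paper.
        if prior <= 0:
            raise ValueError("prior must be positive so unseen rows stay defined")
        self.n_states = n_states
        self.prior = prior
        self.counts = [[0] * n_states for _ in range(n_states)]

    def observe(self, prev_state: int, next_state: int) -> None:
        """Record one observed transition prev_state -> next_state."""
        self.counts[prev_state][next_state] += 1

    def matrix(self) -> List[List[float]]:
        """Return the smoothed row-stochastic estimate."""
        rows = []
        for row in self.counts:
            total = sum(row) + self.prior * self.n_states
            rows.append([(c + self.prior) / total for c in row])
        return rows


def total_variation_error(
    p_hat: Sequence[Sequence[float]], p_true: Sequence[Sequence[float]]
) -> float:
    """Worst-row total-variation distance: max over rows of half the L1 gap."""
    return max(
        0.5 * sum(abs(a - b) for a, b in zip(row_hat, row_true))
        for row_hat, row_true in zip(p_hat, p_true)
    )
```

Tracking `total_variation_error` against the number of observed transitions on a synthetic trace with known dynamics would give the kind of estimation-error-versus-sample-size curve the authors say they will add.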
Circularity Check
No significant circularity; derivation is self-contained via standard RMAB theory plus empirical validation
The paper formulates AoII-based polling as an RMAB, adopts the standard Whittle index policy, and augments it with an online state-estimation procedure to handle unknown transition probabilities. The reported 70% transmission reduction is an empirical outcome measured on held-out real-world and synthetic traces, not a quantity that reduces by construction to parameters fitted inside the same experiment or to a self-citation chain. No self-definitional steps, fitted-input predictions, or load-bearing self-citations appear in the derivation; the online estimator is presented as an independent approximation whose accuracy is tested externally rather than assumed tautologically.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: polling decisions in WUR sensor networks can be modeled as a Restless Multi-Armed Bandit problem.