When Mobile Crowdsourcing Meets Queueing Systems: Human-in-the-Loop Learning
Pith reviewed 2026-06-26 21:40 UTC · model grok-4.3
The pith
Dynamic side payments coordinate crowdsourced queue choices to bound price of anarchy below 2 with exact budget balance.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
Myopic decentralized server choices in endogenous queueing systems can produce infinite price of anarchy; a dynamic side-payment mechanism that charges and rewards customers periodically coordinates congestion management and information acquisition across heterogeneous servers, guarantees PoA below 2, and maintains ex-post budget balance.
What carries the argument
dynamic side-payment mechanism that periodically charges some customers and rewards others
If this is right
- Myopic choices induce infinite price of anarchy by over-exploring likely congested servers.
- In the single-server case the lower bound on price of anarchy falls as buffer size increases.
- In the multi-server case the upper bound on price of anarchy falls as the number of servers grows.
- The mechanism coordinates congestion management and information acquisition while keeping ex-post budget balance.
- Real-dataset experiments show strong average-case performance in addition to the worst-case PoA bound.
Where Pith is reading between the lines
- The same side-payment structure might stabilize other crowdsourced systems in which individual reports affect shared state, such as traffic apps or shared-bike availability.
- Extending the mechanism to allow limited forward-looking customer behavior could further reduce the required payment magnitudes.
- Testing the payments in a live mobile app would reveal whether users actually respond to the small, repeated transfers as predicted.
Load-bearing premise
Customers choose servers to maximize only their immediate service utility and ignore the value their observations create for future customers.
What would settle it
A deployment or simulation in which the side-payment mechanism is used yet the realized price of anarchy exceeds 2 or the budget fails to balance ex post.
Figures
read the original abstract
In service systems, customers now rely on congestion information before deciding which queue or server to join, from restaurants and theme-park attractions to road networks. We study this setting as human-in-the-loop learning (HILL), where customers both consume and generate time-sensitive congestion information through crowdsourcing platforms. Because congestion reports become stale, efficient system operation requires continued exploration of servers whose current states are uncertain. Yet selfish customers avoid such exploration when it reduces their immediate service utility, even though their observations would benefit future customers. We analyze this tension between individual incentives and system-wide learning in queueing systems with endogenous congestion. We first show that myopic server choices can induce an infinite price of anarchy (PoA): decentralized customers may cause arbitrarily large efficiency losses by overexploring servers that are likely congested. In the single-server case, we prove that the lower bound on PoA decreases as buffer size grows, while in the multi-server case the upper bound decreases as the number of servers increases. We further show that existing informational, non-monetary mechanisms for exploration-exploitation with exogenous information fail in our setting, as customers' choices directly reshape the queue states and still lead to infinite PoA. To address this challenge, we design a dynamic side-payment mechanism that periodically charges some customers and rewards others, discouraging excessive exploration while maintaining ex-post budget balance. The mechanism coordinates congestion management and information acquisition across heterogeneous servers, and guarantees PoA below 2. Beyond worst-case analysis, experiments using real datasets demonstrate that the proposed mechanism also achieves strong average-case performance.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper studies human-in-the-loop learning in queueing systems where myopic, selfish customers both consume and generate congestion information via crowdsourcing. It shows that myopic server choices yield infinite price of anarchy (PoA), that prior non-monetary mechanisms for exploration-exploitation also yield infinite PoA in this endogenous setting, and that a proposed dynamic side-payment mechanism restores a PoA bound below 2 while preserving ex-post budget balance. The analysis further establishes that the lower bound on PoA decreases with buffer size in the single-server case and the upper bound decreases with the number of servers in the multi-server case. Real-dataset experiments are reported to show strong average-case performance.
Significance. If the PoA < 2 guarantee and ex-post budget balance hold under the stated model, the result would be significant for mechanism design at the intersection of game theory and queueing theory. It directly addresses the exploration-exploitation tension induced by myopic incentives in systems with endogenous congestion, provides explicit dependence of PoA bounds on system parameters (buffer size, number of servers), and includes empirical validation on real data. The combination of a constant-factor PoA bound with budget balance is a notable theoretical contribution in this setting.
minor comments (2)
- Abstract: the description of the real-dataset experiments does not name the datasets, the performance metrics, or the baselines used; adding these details would improve clarity without affecting the central claims.
- Abstract: the model of customer utilities, the precise definition of the dynamic side-payment rule, and the statement of the PoA theorem are not shown; while these belong in the body, a brief parenthetical reference to the relevant theorem number in the abstract would help readers locate the formal result.
Simulated Author's Rebuttal
We thank the referee for the positive assessment of our work on human-in-the-loop learning in queueing systems and for recommending minor revision. No major comments were provided in the report.
Circularity Check
No significant circularity identified
full rationale
The paper's derivation proceeds via standard theoretical steps: establishing infinite PoA under myopic selfish choices, deriving parameter-dependent bounds on PoA for single- and multi-server cases, showing failure of prior non-monetary mechanisms, and constructing a dynamic side-payment mechanism whose PoA<2 and ex-post budget-balance properties are stated as theorem-level results. No step reduces a claimed prediction or bound to a fitted parameter, self-citation, or definitional equivalence; the argument chain is self-contained against external queueing and mechanism-design benchmarks.
Axiom & Free-Parameter Ledger
axioms (2)
- domain assumption Customers are myopic and selfish, maximizing only immediate service utility.
- domain assumption Congestion reports become stale over time.
Reference graph
Works this paper leans on
-
[1]
Asymptotically optimal downlink scheduling over markovian fading channels,
W. Ouyang, A. Eryilmaz, and N. B. Shroff, “Asymptotically optimal downlink scheduling over markovian fading channels,” in2012 Proceed- ings IEEE INFOCOM. IEEE, 2012, pp. 1224–1232
2012
-
[2]
On queue-length information when customers travel to a queue,
R. Hassin and R. Roet-Green, “On queue-length information when customers travel to a queue,”Manufacturing & Service Operations Management, 2020
2020
-
[3]
Efficient inaccuracy: User-generated information sharing in a queue,
J. Wang and M. Hu, “Efficient inaccuracy: User-generated information sharing in a queue,”Management Science, vol. 66, no. 10, pp. 4648– 4666, 2020
2020
-
[4]
Food delivery service and restaurant: Friend or foe?
M. Chen, M. Hu, and J. Wang, “Food delivery service and restaurant: Friend or foe?”Management Science, 2022
2022
-
[5]
There is now an app to help you skip restaurant queues,
D. Phelan, “There is now an app to help you skip restaurant queues,” https://www.timeout.com/london/blog/theres-now-an-app-to-h elp-you-skip-restaurant-queues-012717, 2017
2017
-
[6]
Wait times for Disneyland app review: View user-submitted waiting times for Disneyland 2021,
AppPicker, “Wait times for Disneyland app review: View user-submitted waiting times for Disneyland 2021,” https://www.apppicker.com/review s/10509, 2021
2021
-
[7]
Transforming the holiday park experience with smart park- ing,
Pavemint, “Transforming the holiday park experience with smart park- ing,” https://www.pavemint.com/blog/holiday-park-smart-parking, 2022
2022
-
[8]
Distributed trip selection game for Public Bike System with crowdsourcing,
J. Zhang, P. Lu, Z. Li, and J. Gan, “Distributed trip selection game for Public Bike System with crowdsourcing,” inIEEE INFOCOM 2018-IEEE Conference on Computer Communications. IEEE, 2018, pp. 2717–2725
2018
-
[9]
Waze: App that combines Crowdsourcing with GPS naviga- tion,
N. Epson, “Waze: App that combines Crowdsourcing with GPS naviga- tion,” https://www.bluelabellabs.com/blog/waze-an-app-that-combines-c rowdsourcing-with-gps-navigation, 2021
2021
-
[10]
MyTSA Mobile Application,
MyTSA, “MyTSA Mobile Application,” https://www.dhs.gov/publicat ion/dhstsapia-028-mytsa-mobile-application, 2017
2017
-
[11]
Introduction to multi-armed bandits,
A. Slivkins, “Introduction to multi-armed bandits,”Foundations and Trends® in Machine Learning, vol. 12, no. 1-2, pp. 1–286, 2019
2019
-
[12]
Nonstationary bandit learning via pre- dictive sampling,
Y . Liu, B. Van Roy, and K. Xu, “Nonstationary bandit learning via pre- dictive sampling,” inInternational Conference on Artificial Intelligence and Statistics. PMLR, 2023, pp. 6215–6244
2023
-
[13]
Restless-ucb, an efficient and low- complexity algorithm for online restless bandits,
S. Wang, L. Huang, and J. Lui, “Restless-ucb, an efficient and low- complexity algorithm for online restless bandits,”Advances in Neural Information Processing Systems, vol. 33, pp. 11 878–11 889, 2020
2020
-
[14]
Learning un- known service rates in queues: A multiarmed bandit approach,
S. Krishnasamy, R. Sen, R. Johari, and S. Shakkottai, “Learning un- known service rates in queues: A multiarmed bandit approach,”Opera- tions Research, vol. 69, no. 1, pp. 315–330, 2021
2021
-
[15]
When congestion games meet mobile crowd- sourcing: Selective information disclosure,
H. Li and L. Duan, “When congestion games meet mobile crowd- sourcing: Selective information disclosure,” inProceedings of the AAAI Conference on Artificial Intelligence, vol. 37, no. 5, 2023, pp. 5739–5746
2023
-
[16]
Multi-armed- bandit-based spectrum scheduling algorithms in wireless networks: A survey,
F. Li, D. Yu, H. Yang, J. Yu, K. Holger, and X. Cheng, “Multi-armed- bandit-based spectrum scheduling algorithms in wireless networks: A survey,”IEEE Wireless Communications, vol. 27, no. 1, pp. 24-30, 2020
2020
-
[17]
Competitive Multi-armed Bandit Games for Resource Sharing,
H. Li, and L. Duan, “Competitive Multi-armed Bandit Games for Resource Sharing,”IEEE Transactions on Mobile Computing, to appear, 2025
2025
-
[18]
Distributed learning for dynamic congestion games,
H. Li, and L. Duan, “Distributed learning for dynamic congestion games,”IEEE International Symposium on Information Theory (ISIT), pp. 3654–3659, 2024
2024
-
[19]
Optimal control of admission to a queueing system,
S. Stidham, “Optimal control of admission to a queueing system,”IEEE Transactions on Automatic Control, vol. 30, no. 8, pp. 705–713, 1985
1985
-
[20]
M/g/1 queue with event-dependent arrival rates,
B. Legros, “M/g/1 queue with event-dependent arrival rates,”Queueing Systems, vol. 89, pp. 269–301, 2018
2018
-
[21]
Recommending paths: Follow or not follow?
Y . Li, C. Courcoubetis, and L. Duan, “Recommending paths: Follow or not follow?” inIEEE INFOCOM 2019-IEEE Conference on Computer Communications. IEEE, 2019, pp. 928–936
2019
-
[22]
Virtues of patience in strategic queuing systems
J. Gaitonde, and E. Tardos, “Virtues of patience in strategic queuing systems” inProceedings of the 22nd ACM Conference on Economics and Computation. IEEE, 2021, pp. 520–540
2021
-
[23]
Sharing delay information in service systems: a literature survey,
R. Ibrahim, “Sharing delay information in service systems: a literature survey,” inQueueing Systems. Springer, vol. 89, no. 1, 2018, pp. 49–79
2018
-
[24]
Dynamic information design: a simple problem on optimal sequential information disclosure,
F. Farhadi and D. Teneketzis, “Dynamic information design: a simple problem on optimal sequential information disclosure,”Dynamic Games and Applications, vol. 12, no. 2, pp. 443–484, 2022
2022
-
[25]
Bayesian exploration: Incentivizing exploration in bayesian games,
Y . Mansour, A. Slivkins, V . Syrgkanis, and Z. S. Wu, “Bayesian exploration: Incentivizing exploration in bayesian games,”Operations Research, vol. 70, no. 2, pp. 1105–1127, 2022
2022
-
[26]
Signaling service quality through queue disclosure,
P. Guo, M. Haviv, Z. Luo, and Y . Wang, “Signaling service quality through queue disclosure,”Manufacturing & Service Operations Man- agement, 2022
2022
-
[27]
B ¨orgers and D
T. B ¨orgers and D. Krahmer,An introduction to the theory of mechanism design. Oxford University Press, USA, 2015
2015
-
[28]
Dynamic routing for social in- formation sharing,
Y . Li, C. A. Courcoubetis, and L. Duan, “Dynamic routing for social in- formation sharing,”IEEE Journal on Selected Areas in Communications, vol. 35, no. 3, pp. 571–585, 2017
2017
-
[29]
The effectiveness of subsidies and tolls in congestion games,
B. L. Ferguson, P. N. Brown, and J. R. Marden, “The effectiveness of subsidies and tolls in congestion games,”IEEE Transactions on Automatic Control, 2021
2021
-
[30]
Spatio-temporal pricing for rideshar- ing platforms,
H. Ma, F. Fang, and D. C. Parkes, “Spatio-temporal pricing for rideshar- ing platforms,”Operations Research, 2021
2021
-
[31]
Informational incentives for congestion games,
H. Tavafoghi and D. Teneketzis, “Informational incentives for congestion games,” in2017 55th Annual Allerton Conference on Communication, Control, and Computing (Allerton). IEEE, 2017, pp. 1285–1292
2017
-
[32]
A traffic congestion analysis by user equilibrium and system optimum with incomplete information,
Q. Zhang, S. Q. Liu, and M. Masoud, “A traffic congestion analysis by user equilibrium and system optimum with incomplete information,” Journal of Combinatorial Optimization, pp. 1–14, 2020
2020
-
[33]
Optimal control of a queueing system with two heterogeneous servers,
W. Lin and P. Kumar, “Optimal control of a queueing system with two heterogeneous servers,”IEEE Transactions on Automatic control, vol. 29, no. 8, pp. 696–703, 1984
1984
-
[34]
Discrete-time queuing theory,
T. Meisling, “Discrete-time queuing theory,”Operations Research, vol. 6, no. 1, pp. 96–105, 1958
1958
-
[35]
Hassin,Rational queueing
R. Hassin,Rational queueing. CRC press, 2016
2016
-
[36]
Minimizing the age of information through queues,
A. Bedewy, Y . Sun and N. B. Shroff, “Minimizing the age of information through queues,”IEEE Transactions on Information Theory, vol. 65, no. 8, pp. 5215–5232, 2019
2019
-
[37]
Online pricing incentive to sample fresh informa- tion,
H. Li and L. Duan, “Online pricing incentive to sample fresh informa- tion,”IEEE Transactions on Network Science and Engineering, vol. 10, no. 1, pp. 514–526, 2022
2022
-
[38]
Applying a new device in the optimization of exponen- tial queuing systems,
S. A. Lippman, “Applying a new device in the optimization of exponen- tial queuing systems,”Operations Research, vol. 23, no. 4, pp. 687–710, 1975
1975
-
[39]
Dynamic programming,
R. Bellman, “Dynamic programming,”Science, vol. 153, no. 3731, pp. 34–37, 1966
1966
-
[40]
Approximate Dynamic Programming: Solving the curses of dimensionality,
W. B. Powell, “Approximate Dynamic Programming: Solving the curses of dimensionality,”John Wiley & Sons, vol. 703, 2007
2007
-
[41]
Reinforcement learning: An introduction,
R. Sutton, and A. G. Barto, “Reinforcement learning: An introduction,” MIT press Cambridge, vol. 1, no. 1, 1998. IEEE TRANSACTIONS ON NETWORKING 12
1998
-
[42]
Worst-case equilibria,
E. Koutsoupias and C. Papadimitriou, “Worst-case equilibria,” inAnnual symposium on theoretical aspects of computer science. Springer, 1999, pp. 404–413
1999
-
[43]
Individually rational, budget-balanced mechanisms and allocation of surplus,
G. Kosenok and S. Severinov, “Individually rational, budget-balanced mechanisms and allocation of surplus,”Journal of Economic Theory, vol. 140, no. 1, pp. 126–161, 2008
2008
-
[44]
The impossibility of strategy-proof, pareto efficient, and individually rational rules for fractional matching,
S. Alva and V . Manjunath, “The impossibility of strategy-proof, pareto efficient, and individually rational rules for fractional matching,”Games and Economic Behavior, vol. 119, pp. 15–29, 2020
2020
-
[45]
Talabat Dubai detailed data for Feb 2023,
Talabat, “Talabat Dubai detailed data for Feb 2023,” https://www.kagg le.com/datasets/zivaank/data-feb-2023?resource=download, 2023
2023
-
[46]
To Analyze and Regulate Human-in-the-Loop Learning for Congestion Games,
H. Li, and L. Duan, “To Analyze and Regulate Human-in-the-Loop Learning for Congestion Games,”IEEE Transactions on Networking, to appear, 2025. Hongbo Li(M’24) received the B.Sc. degree from Shanghai Jiao Tong University (SJTU), China, in 2019, and the Ph.D. degree from Singapore Univer- sity of Technology and Design (SUTD), Singapore, in 2024. He is curr...
2025
-
[47]
He currently serves as the Institute Director of the NSF AI Institute for Future Edge Networks and Distributed Intelligence (AI-EDGE)
In 2007, he joined The Ohio State University, Columbus, OH, USA, where he holds the Ohio Eminent Scholar Endowed Chair in networking and communications, with the Departments of ECE and CSE. He currently serves as the Institute Director of the NSF AI Institute for Future Edge Networks and Distributed Intelligence (AI-EDGE). He is also the director of the n...
2007
discussion (0)
Sign in with ORCID, Apple, or X to comment. Anyone can read and Pith papers without signing in.