Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense

Linan Huang; Quanyan Zhu

arxiv: 1907.01396 · v1 · pith:B2RT3GU6new · submitted 2019-07-01 · 💻 cs.CR

Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense

Linan Huang , Quanyan Zhu This is my paper

Pith reviewed 2026-05-25 12:20 UTC · model grok-4.3

classification 💻 cs.CR

keywords strategic learningcyber defensedefensive deceptionmoving target defensehoneypot engagementuncertainty reductionpolicy convergenceautonomous adaptation

0 comments

The pith

Strategic learning enables three cyber defense schemes to converge to optimal policies under progressive uncertainty.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes three defense schemes that actively engage attackers to raise their costs and collect information, forming an active, autonomous, and adaptive cyber defense approach. These schemes—defensive deception, feedback-driven moving target defense, and adaptive honeypot engagement—operate under three increasing levels of uncertainty about parameters, payoffs, and the full environment. To address the unknowns, the paper applies three tailored strategic learning schemes that all rely on the same feedback loop of sensation, estimation, and actions. This loop reinforces policies that yield better outcomes until they reach optimal performance in an autonomous way. The work also aims to quantify the resulting security gains against the costs of running the defenses.

Core claim

The paper claims that the three defense schemes, facing parameter uncertainty, payoff uncertainty, and environmental uncertainty in turn, each use a distinct strategic learning scheme, yet all three schemes share an identical feedback structure of sensation, estimation, and actions. Through this structure the most rewarding policies are reinforced repeatedly, allowing them to converge autonomously and adaptively to the optimal policies even when the defender lacks complete knowledge of the attacker.

What carries the argument

The shared feedback structure of sensation, estimation, and actions that reinforces rewarding policies until convergence in each of the three learning schemes.

If this is right

Defensive deception can detect and counter attacks while handling parameter uncertainty.
Moving target defense can adapt dynamically under payoff uncertainty through feedback.
Honeypot engagement can manage environmental uncertainty by learning from interactions.
All schemes increase attacker costs while gathering threat information without full prior knowledge.
The security benefits of the defenses can be weighed against their implementation costs.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

The common feedback structure may allow the same learning pattern to be reused in other security settings that involve partial information about opponents.
Convergence under uncertainty could support defenses that update in real time as attacker tactics evolve.
The quantified security-cost tradeoff might guide decisions on when to deploy active measures versus static ones.

Load-bearing premise

The learning schemes can estimate the unknowns and reduce uncertainty enough for the policies to converge to the optimal ones.

What would settle it

A simulation or deployment test in which policies under one of the three schemes fail to converge when uncertainty stays high despite repeated application of the sensation-estimation-action feedback.

Figures

Figures reproduced from arXiv: 1907.01396 by Linan Huang, Quanyan Zhu.

**Figure 5.** Figure 5: a has an attack surface π1(c1,1) = {v1,1, v1,2} while configuration c1,2 in Fig. 5b reveals two vulnerabilities v1,2, v1,3 ∈ π1(c1,2). Then, if the attacker takes action a1,1 and the defender changes the configuration from c1,1 to c1,2, the attack is deterred at the first layer. v1,1 v1,2 v1,3 v2,1 v2,2 v2,3 v3,1 v3,2 v3,3 v4,1 v4,2 v4,3 c1,1 c2,1 c3,1 c4,2 (a) Attack surface π1(c1,1) = {v1,1, v1,2 }. v1,1… view at source ↗

**Figure 8.** Figure 8: The honeynet in red emulates and shares the same structure as the targeted [PITH_FULL_IMAGE:figures/full_fig_p016_8.png] view at source ↗

**Figure 11.** Figure 11: Convergence results of Q-learning over SMDP. 5 Conclusion and Discussion This chapter has introduced three defense schemes, i.e., defensive deception to detect and counter adversarial deception, feedback-driven Moving Target Defense (MTD) to increase the attacker’s probing and reconnaissance costs, and adaptive honeypot engagement to gather fundamental threat information. These schemes satisfy the Princip… view at source ↗

read the original abstract

The increasing instances of advanced attacks call for a new defense paradigm that is active, autonomous, and adaptive, named as the \texttt{`3A'} defense paradigm. This chapter introduces three defense schemes that actively interact with attackers to increase the attack cost and gather threat information, i.e., defensive deception for detection and counter-deception, feedback-driven Moving Target Defense (MTD), and adaptive honeypot engagement. Due to the cyber deception, external noise, and the absent knowledge of the other players' behaviors and goals, these schemes possess three progressive levels of information restrictions, i.e., from the parameter uncertainty, the payoff uncertainty, to the environmental uncertainty. To estimate the unknown and reduce uncertainty, we adopt three different strategic learning schemes that fit the associated information restrictions. All three learning schemes share the same feedback structure of sensation, estimation, and actions so that the most rewarding policies get reinforced and converge to the optimal ones in autonomous and adaptive fashions. This work aims to shed lights on proactive defense strategies, lay a solid foundation for strategic learning under incomplete information, and quantify the tradeoff between the security and costs.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

The paper organizes three cyber defenses around increasing uncertainty levels and pairs each with a learning scheme that uses a shared feedback loop to converge, but the convergence details are asserted rather than derived in the provided text.

read the letter

The core contribution is a clean mapping: defensive deception to parameter uncertainty with one learning scheme, feedback-driven MTD to payoff uncertainty with another, and adaptive honeypots to environmental uncertainty with a third. All three are said to share a sensation-estimation-action loop that reinforces rewarding policies until they reach optima. This framing of the 3A defense paradigm and the explicit tie to progressive uncertainty levels is the new piece; it gives a structured way to think about active defenses that gather information while raising attacker costs. The paper does a decent job laying out the motivation and the high-level unification without overclaiming domain-wide impact. The shared feedback structure is a reasonable organizing principle drawn from standard reinforcement ideas. The soft spot is the convergence claim. The abstract states that the schemes converge to optimal policies under the respective uncertainties, yet supplies no equations, fixed-point arguments, or simulation results to show why the loop actually produces that outcome rather than cycling or settling at suboptimal points. If the full text contains reproducible derivations or falsifiable checks, that would strengthen it; without them the claim stays at the level of a plausible extension of existing RL under incomplete information. The citation pattern and free-parameter count look typical for this subfield, with no obvious circularity in the abstract. This paper is aimed at researchers already working on game-theoretic or learning-based cyber defense who want a taxonomy that links specific techniques to uncertainty regimes. A reader looking for new theorems or large-scale validation will find less here. It is coherent enough on its own terms to merit a serious referee who can check the technical backing in the full manuscript.

Referee Report

1 major / 0 minor

Summary. The paper introduces a '3A' (active, adaptive, autonomous) cyber defense paradigm and presents three schemes—defensive deception for detection/counter-deception, feedback-driven Moving Target Defense (MTD), and adaptive honeypot engagement—under progressive uncertainty levels (parameter, payoff, environmental). It adopts three strategic learning schemes that share a sensation-estimation-action feedback structure to reinforce rewarding policies and achieve convergence to optimal policies in an autonomous, adaptive manner, with the goal of increasing attack costs, gathering threat information, and quantifying security-cost tradeoffs.

Significance. If the convergence claims under incomplete information are rigorously established, the work would provide a conceptual foundation for proactive defense strategies and strategic learning in cyber settings with uncertainty, potentially influencing research on adaptive MTD and deception-based defenses.

major comments (1)

[Abstract] Abstract: The central claim that the three learning schemes 'converge to the optimal ones' via the shared feedback structure is asserted without any derivations, equations, convergence proofs, simulation results, or validation steps; this is load-bearing for the paper's contribution on strategic learning under uncertainty.

Simulated Author's Rebuttal

1 responses · 0 unresolved

We thank the referee for their review and the opportunity to clarify the manuscript. The concern regarding the abstract's assertion of convergence is addressed point-by-point below. The full paper provides the supporting analysis as summarized in the abstract.

read point-by-point responses

Referee: [Abstract] Abstract: The central claim that the three learning schemes 'converge to the optimal ones' via the shared feedback structure is asserted without any derivations, equations, convergence proofs, simulation results, or validation steps; this is load-bearing for the paper's contribution on strategic learning under uncertainty.

Authors: The abstract is a concise summary of the paper's contributions and does not contain the full technical details, which is standard. The three strategic learning schemes are developed in detail in Sections 3 (defensive deception under parameter uncertainty), 4 (feedback-driven MTD under payoff uncertainty), and 5 (adaptive honeypot engagement under environmental uncertainty). Each section derives the sensation-estimation-action feedback loop, presents the specific learning algorithm (e.g., Q-learning variants or regret-matching), provides convergence proofs to Nash equilibria or optimal policies under the respective uncertainty model using stochastic approximation or Lyapunov analysis, and includes equations for value estimation and policy updates. Section 6 presents simulation results across multiple scenarios validating convergence rates, attack cost increases, and security-cost tradeoffs. If the referee prefers, we can revise the abstract to explicitly reference these sections. revision: partial

Circularity Check

0 steps flagged

No significant circularity identified

full rationale

The paper describes three defense schemes under progressive uncertainty levels and adopts corresponding strategic learning schemes that share a sensation-estimation-action feedback loop, with the claim that rewarding policies are reinforced to converge to optima. This is a high-level assertion aligned with standard reinforcement learning under incomplete information. No equations, fitted parameters presented as predictions, or load-bearing self-citations are exhibited in the text that would reduce the central claim to its inputs by construction. The derivation chain remains self-contained against external benchmarks such as RL convergence results.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the domain assumption that strategic learning under the three described uncertainty regimes will produce convergence to optimal policies via the shared sensation-estimation-action loop; no free parameters or invented entities are named in the abstract.

axioms (1)

domain assumption Strategic learning schemes can estimate unknown parameters, payoffs, or environments in cyber defense interactions and thereby reduce uncertainty sufficiently for policy convergence.
Invoked in the abstract when stating that the schemes 'estimate the unknown and reduce uncertainty' and 'converge to the optimal ones'.

pith-pipeline@v0.9.0 · 5718 in / 1253 out tokens · 24285 ms · 2026-05-25T12:20:10.389865+00:00 · methodology

discussion (0)

Reference graph

Works this paper leans on

70 extracted references · 70 canonical work pages · 2 internal anchors

[1]

Verizon 2019 data breach investigations report,

“Verizon 2019 data breach investigations report,” 2019

work page 2019
[2]

Combatting cyber risks in the supply chain,

D. Shackleford, “Combatting cyber risks in the supply chain,”SANS. org, 2015

work page 2015
[3]

Adaptive strategic cyber defense for advanced persistent threats in criticalinfrastructurenetworks,

L. Huang and Q. Zhu, “Adaptive strategic cyber defense for advanced persistent threats in criticalinfrastructurenetworks,” ACMSIGMETRICSPerformanceEvaluationReview ,vol.46, no. 2, pp. 52–56, 2019

work page 2019
[4]

Analysis and computation of adaptive defense strategies against advanced persistent threatsforcyber-physicalsystems,

——, “Analysis and computation of adaptive defense strategies against advanced persistent threatsforcyber-physicalsystems,”in InternationalConferenceonDecisionandGameTheory for Security. Springer, 2018, pp. 205–226

work page 2018
[5]

A Dynamic Games Approach to Proactive Defense Strategies against AdvancedPersistentThreatsinCyber-PhysicalSystems,

L. Huang and Q. Zhu, “A Dynamic Games Approach to Proactive Defense Strategies against AdvancedPersistentThreatsinCyber-PhysicalSystems,” arXive-prints,p.arXiv:1906.09687, Jun 2019

work page arXiv 1906
[6]

Game-theoretic approach to feedback-driven multi-stage moving target defense,

Q. Zhu and T. Başar, “Game-theoretic approach to feedback-driven multi-stage moving target defense,” inInternational Conference on Decision and Game Theory for Security. Springer, 2013, pp. 246–263. 22 Linan Huang and Quanyan Zhu

work page 2013
[7]

Adaptive Honeypot Engagement through Reinforcement Learning of Semi-Markov Decision Processes,

L. Huang and Q. Zhu, “Adaptive Honeypot Engagement through Reinforcement Learning of Semi-Markov Decision Processes,”arXiv e-prints, p. arXiv:1906.12182, Jun 2019

work page arXiv 1906
[8]

A Game-Theoretic Taxonomy and Survey of Defensive Deception for Cybersecurity and Privacy

J. Pawlick, E. Colbert, and Q. Zhu, “A game-theoretic taxonomy and survey of defensive deception for cybersecurity and privacy,”arXiv preprint arXiv:1712.05441, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017
[9]

Integrating cyber-d&d into adversary modeling for active cyber defense,

F. J. Stech, K. E. Heckman, and B. E. Strom, “Integrating cyber-d&d into adversary modeling for active cyber defense,” inCyber deception. Springer, 2016, pp. 1–22

work page 2016
[10]

Activecyberdefensewithdenialanddeception:Acyber-wargameexperiment,

K. E. Heckman, M. J. Walsh, F. J. Stech, T. A. O’boyle, S. R. DiCato, and A. F. Herber, “Activecyberdefensewithdenialanddeception:Acyber-wargameexperiment,” computers& security, vol. 37, pp. 72–77, 2013

work page 2013
[11]

R-locker:Thwartingran- somware action through a honeyﬁle-based approach,

J.Gómez-Hernández,L.Álvarez-González,andP.García-Teodoro,“R-locker:Thwartingran- somware action through a honeyﬁle-based approach,”Computers & Security, vol. 73, pp. 389–398, 2018

work page 2018
[12]

Changing the game: The art of deceiving sophisticated attackers,

N. Virvilis, B. Vanautgaerden, and O. S. Serrano, “Changing the game: The art of deceiving sophisticated attackers,” in2014 6th International Conference On Cyber Conﬂict (CyCon 2014). IEEE, 2014, pp. 87–97

work page 2014
[13]

Modelingandanalysisofleakydeceptionusingsignaling games with evidence,

J.Pawlick,E.Colbert,andQ.Zhu,“Modelingandanalysisofleakydeceptionusingsignaling games with evidence,”IEEE Transactions on Information Forensics and Security, 2018

work page 2018
[14]

SpringerScience&BusinessMedia,2011,vol.54

S.Jajodia,A.K.Ghosh,V.Swarup,C.Wang,andX.S.Wang, Movingtargetdefense:creating asymmetricuncertaintyforcyberthreats . SpringerScience&BusinessMedia,2011,vol.54

work page 2011
[15]

Countering code-injection attacks with instruction-set randomization,

G. S. Kc, A. D. Keromytis, and V. Prevelakis, “Countering code-injection attacks with instruction-set randomization,” inProceedings of the 10th ACM conference on Computer and communications security. ACM, 2003, pp. 272–280

work page 2003
[16]

Deceptive routing in relay networks,

A. Clark, Q. Zhu, R. Poovendran, and T. Başar, “Deceptive routing in relay networks,” in International Conference on Decision and Game Theory for Security. Springer, 2012, pp. 171–185

work page 2012
[17]

Markov modeling of moving target defense games,

H. Maleki, S. Valizadeh, W. Koch, A. Bestavros, and M. van Dijk, “Markov modeling of moving target defense games,” inProceedings of the 2016 ACM Workshop on Moving Target Defense. ACM, 2016, pp. 81–92

work page 2016
[18]

A methodology for intelligent honeypot deployment and active engagement of attackers,

C. R. Hecker, “A methodology for intelligent honeypot deployment and active engagement of attackers,” Ph.D. dissertation, 2012

work page 2012
[19]

Deceptive attack and defense game in honeypot-enablednetworksfortheinternetofthings,

Q. D. La, T. Q. Quek, J. Lee, S. Jin, and H. Zhu, “Deceptive attack and defense game in honeypot-enablednetworksfortheinternetofthings,” IEEEInternetofThingsJournal ,vol.3, no. 6, pp. 1025–1035, 2016

work page 2016
[20]

Optimal Timing in Dynamic and Robust Attacker Engagement During Advanced Persistent Threats

J. Pawlick, T. T. H. Nguyen, and Q. Zhu, “Optimal timing in dynamic and robust attacker engagement during advanced persistent threats,”CoRR, vol. abs/1707.08031, 2017. [Online]. Available: http://arxiv.org/abs/1707.08031

work page internal anchor Pith review Pith/arXiv arXiv 2017
[21]

A Stackelberg game perspective on the conﬂict between machine learning and data obfuscation,

J. Pawlick and Q. Zhu, “A Stackelberg game perspective on the conﬂict between machine learning and data obfuscation,” in Information Forensics and Security (WIFS), 2016 IEEE International Workshop on . IEEE, 2016, pp. 1–6. [Online]. Available: http://ieeexplore.ieee.org/abstract/document/7823893/

work page arXiv 2016
[22]

Deployment and exploitation of deceptive honeybots in social networks,

Q. Zhu, A. Clark, R. Poovendran, and T. Basar, “Deployment and exploitation of deceptive honeybots in social networks,” inDecision and Control (CDC), 2013 IEEE 52nd Annual Conference on. IEEE, 2013, pp. 212–219

work page 2013
[23]

Hybrid learning in stochastic games and its applications in network security,

Q. Zhu, H. Tembine, and T. Basar, “Hybrid learning in stochastic games and its applications in network security,”Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, pp. 305–329, 2013

work page 2013
[24]

Interference aware routing game for cognitive radio multi-hop networks,

Q. Zhu, Z. Yuan, J. B. Song, Z. Han, and T. Başar, “Interference aware routing game for cognitive radio multi-hop networks,”Selected Areas in Communications, IEEE Journal on, vol. 30, no. 10, pp. 2006–2015, 2012

work page 2006
[25]

Game-theoreticanalysisofnodecaptureandcloningattack with multiple attackers in wireless sensor networks,

Q.Zhu,L.Bushnell,andT.Basar,“Game-theoreticanalysisofnodecaptureandcloningattack with multiple attackers in wireless sensor networks,” inDecision and Control (CDC), 2012 IEEE 51st Annual Conference on. IEEE, 2012, pp. 3404–3411

work page 2012
[26]

Deceptive routing games,

Q. Zhu, A. Clark, R. Poovendran, and T. Başar, “Deceptive routing games,” inDecision and Control (CDC), 2012 IEEE 51st Annual Conference on. IEEE, 2012, pp. 2704–2711. Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense 23

work page 2012
[27]

A stochastic game model for jamming in multi-channel cognitive radio systems

Q. Zhu, H. Li, Z. Han, and T. Basar, “A stochastic game model for jamming in multi-channel cognitive radio systems.” inICC, 2010, pp. 1–6

work page 2010
[28]

Secure and practical output feedback control for cloud-enabled cyber- physical systems,

Z. Xu and Q. Zhu, “Secure and practical output feedback control for cloud-enabled cyber- physical systems,” inCommunications and Network Security (CNS), 2017 IEEE Conference on. IEEE, 2017, pp. 416–420

work page 2017
[29]

A Game-Theoretic Approach to Secure Control of Communication-Based Train Control Systems Under Jamming Attacks,

——, “A Game-Theoretic Approach to Secure Control of Communication-Based Train Control Systems Under Jamming Attacks,” inProceedings of the 1st International Workshop on Safe Control of Connected and Autonomous Vehicles. ACM, 2017, pp. 27–34. [Online]. Available: http://dl.acm.org/citation.cfm?id=3055381

work page 2017
[30]

Cross-layer secure cyber-physical control system design for networked 3d printers,

——, “Cross-layer secure cyber-physical control system design for networked 3d printers,” in American Control Conference (ACC), 2016. IEEE, 2016, pp. 1191–1196. [Online]. Available: http://ieeexplore.ieee.org/abstract/document/7525079/

work page arXiv 2016
[31]

Modeling, analysis, and mitigation of dynamic botnet formation in wireless iot networks,

M. J. Farooq and Q. Zhu, “Modeling, analysis, and mitigation of dynamic botnet formation in wireless iot networks,”IEEE Transactions on Information Forensics and Security, 2019

work page 2019
[32]

A cyber-physical game framework for secure and resilient multi-agent autonomous systems,

Z. Xu and Q. Zhu, “A cyber-physical game framework for secure and resilient multi-agent autonomous systems,” inDecision and Control (CDC), 2015 IEEE 54th Annual Conference on. IEEE, 2015, pp. 5156–5161

work page 2015
[33]

Alarge-scalemarkovgameapproachtodynamicprotectionof interdependent infrastructure networks,

L.Huang,J.Chen,andQ.Zhu,“Alarge-scalemarkovgameapproachtodynamicprotectionof interdependent infrastructure networks,” inInternational Conference on Decision and Game Theory for Security. Springer, 2017, pp. 357–376

work page 2017
[34]

Adynamicgameanalysisanddesignofinfrastructurenetwork protection and recovery,

J.Chen,C.Touati,andQ.Zhu,“Adynamicgameanalysisanddesignofinfrastructurenetwork protection and recovery,”ACM SIGMETRICS Performance Evaluation Review, vol. 45, no. 2, p. 128, 2017

work page 2017
[35]

A hybrid stochastic game for secure control of cyber-physical systems,

F. Miao, Q. Zhu, M. Pajic, and G. J. Pappas, “A hybrid stochastic game for secure control of cyber-physical systems,”Automatica, vol. 93, pp. 55–63, 2018

work page 2018
[36]

Resilient control of cyber-physical systems againstdenial-of-serviceattacks,

Y. Yuan, Q. Zhu, F. Sun, Q. Wang, and T. Basar, “Resilient control of cyber-physical systems againstdenial-of-serviceattacks,”in ResilientControlSystems(ISRCS),20136thInternational Symposium on. IEEE, 2013, pp. 54–59

work page 2013
[37]

GADAPT: A Sequential Game-Theoretic Framework for Designing Defense-in-Depth Strategies Against Advanced Persistent Threats,

S. Rass and Q. Zhu, “GADAPT: A Sequential Game-Theoretic Framework for Designing Defense-in-Depth Strategies Against Advanced Persistent Threats,” inDecision and Game TheoryforSecurity ,ser.LectureNotesinComputerScience,Q.Zhu,T.Alpcan,E.Panaousis, M. Tambe, and W. Casey, Eds. Cham: Springer International Publishing, 2016, vol. 9996, pp. 314–326

work page 2016
[38]

Dynamicinterferenceminimizationrouting game for on-demand cognitive pilot channel,

Q.Zhu,Z.Yuan,J.B.Song,Z.Han,andT.Basar,“Dynamicinterferenceminimizationrouting game for on-demand cognitive pilot channel,” inGlobal Telecommunications Conference (GLOBECOM 2010), 2010 IEEE. IEEE, 2010, pp. 1–6

work page 2010
[39]

Strategic defense against deceptive civilian gps spooﬁng of unmanned aerial vehicles,

T. Zhang and Q. Zhu, “Strategic defense against deceptive civilian gps spooﬁng of unmanned aerial vehicles,” inInternational Conference on Decision and Game Theory for Security. Springer, 2017, pp. 213–233

work page 2017
[40]

Analysis and computation of adaptive defense strategies against ad- vancedpersistentthreatsforcyber-physicalsystems,

L. Huang and Q. Zhu, “Analysis and computation of adaptive defense strategies against ad- vancedpersistentthreatsforcyber-physicalsystems,”in InternationalConferenceonDecision and Game Theory for Security, 2018

work page 2018
[41]

Adaptivestrategiccyberdefenseforadvancedpersistentthreatsincriticalinfrastructure networks,

——,“Adaptivestrategiccyberdefenseforadvancedpersistentthreatsincriticalinfrastructure networks,” inACM SIGMETRICS Performance Evaluation Review, 2018

work page 2018
[42]

Flip the cloud: Cyber-physical signaling games in the presenceofadvancedpersistentthreats,

J. Pawlick, S. Farhang, and Q. Zhu, “Flip the cloud: Cyber-physical signaling games in the presenceofadvancedpersistentthreats,”in DecisionandGameTheoryforSecurity . Springer, 2015, pp. 289–308

work page 2015
[43]

A dynamic bayesian security game framework for strategic defense mechanism design,

S. Farhang, M. H. Manshaei, M. N. Esfahani, and Q. Zhu, “A dynamic bayesian security game framework for strategic defense mechanism design,” inDecision and Game Theory for Security. Springer, 2014, pp. 319–328

work page 2014
[44]

Dynamicpolicy-basedidsconﬁguration,

Q.ZhuandT.Başar,“Dynamicpolicy-basedidsconﬁguration,”in DecisionandControl,2009 held jointly with the 2009 28th Chinese Control Conference. CDC/CCC 2009. Proceedings of the 48th IEEE Conference on. IEEE, 2009, pp. 8600–8605

work page 2009
[45]

Networksecurityconﬁgurations:Anonzero-sumstochastic gameapproach,

Q.Zhu,H.Tembine,andT.Basar,“Networksecurityconﬁgurations:Anonzero-sumstochastic gameapproach,”in AmericanControlConference(ACC),2010 . IEEE,2010,pp.1059–1064. 24 Linan Huang and Quanyan Zhu

work page 2010
[46]

Heterogeneouslearninginzero-sumstochasticgameswith incomplete information,

Q.Zhu,H.Tembine,andT.Başar,“Heterogeneouslearninginzero-sumstochasticgameswith incomplete information,” in49th IEEE conference on decision and control (CDC). IEEE, 2010, pp. 219–224

work page 2010
[47]

Security as a Service for Cloud-Enabled Internet of Controlled Things under Advanced Persistent Threats: A Contract Design Approach,

J. Chen and Q. Zhu, “Security as a Service for Cloud-Enabled Internet of Controlled Things under Advanced Persistent Threats: A Contract Design Approach,” IEEE Transactions on Information Forensics and Security , 2017. [Online]. Available: http://ieeexplore.ieee.org/abstract/document/7954676/

work page arXiv 2017
[48]

ABi-LevelGameApproachtoAttack-AwareCyberInsurance ofComputerNetworks,

R.Zhang,Q.Zhu,andY.Hayel,“ABi-LevelGameApproachtoAttack-AwareCyberInsurance ofComputerNetworks,” IEEEJournalonSelectedAreasinCommunications ,vol.35,no.3,pp. 779–794, 2017. [Online]. Available: http://ieeexplore.ieee.org/abstract/document/7859343/

work page arXiv 2017
[49]

Attack-aware cyber insurance of interdependent computer networks,

R. Zhang and Q. Zhu, “Attack-aware cyber insurance of interdependent computer networks,” 2016

work page 2016
[50]

Compliance control: Managed vulnera- bility surface in social-technological systems via signaling games,

W. A. Casey, Q. Zhu, J. A. Morales, and B. Mishra, “Compliance control: Managed vulnera- bility surface in social-technological systems via signaling games,” inProceedings of the 7th ACM CCS International Workshop on Managing Insider Security Threats. ACM, 2015, pp. 53–62

work page 2015
[51]

Attack-aware cyber insurance for risk sharing in computer networks,

Y. Hayel and Q. Zhu, “Attack-aware cyber insurance for risk sharing in computer networks,” in Decision and Game Theory for Security. Springer, 2015, pp. 22–34

work page 2015
[52]

Epidemic protection over heterogeneous networks using evolutionary poisson games,

——, “Epidemic protection over heterogeneous networks using evolutionary poisson games,” IEEETransactionsonInformationForensicsandSecurity ,vol.12,no.8,pp.1786–1800,2017

work page 2017
[53]

Guidex:Agame-theoreticincentive-basedmech- anismforintrusiondetectionnetworks,

Q.Zhu,C.Fung,R.Boutaba,andT.Başar,“Guidex:Agame-theoreticincentive-basedmech- anismforintrusiondetectionnetworks,” SelectedAreasinCommunications,IEEEJournalon , vol. 30, no. 11, pp. 2220–2230, 2012

work page 2012
[54]

Tragedy of anticommons in digital right management of medical records

Q. Zhu, C. A. Gunter, and T. Basar, “Tragedy of anticommons in digital right management of medical records.” inHealthSec, 2012

work page 2012
[55]

Agame-theoreticalapproachtoincentivedesignin collaborative intrusion detection networks,

Q.Zhu,C.Fung,R.Boutaba,andT.Başar,“Agame-theoreticalapproachtoincentivedesignin collaborative intrusion detection networks,” inGame Theory for Networks, 2009. GameNets’

work page 2009
[56]

IEEE, 2009, pp

International Conference on. IEEE, 2009, pp. 384–392

work page 2009
[57]

A game theoretic investigation of deception in network security,

T. E. Carroll and D. Grosu, “A game theoretic investigation of deception in network security,” Security and Commun. Nets., vol. 4, no. 10, pp. 1162–1172, 2011

work page 2011
[58]

A Stackelberg game perspective on the conﬂict between machine learninganddataobfuscation,

J. Pawlick and Q. Zhu, “A Stackelberg game perspective on the conﬂict between machine learninganddataobfuscation,” IEEEIntl.WorkshoponInform.ForensicsandSecurity ,2016

work page 2016
[59]

DynamicdiﬀerentialprivacyforADMM-baseddistributedclassiﬁcation learning,

T.ZhangandQ.Zhu,“DynamicdiﬀerentialprivacyforADMM-baseddistributedclassiﬁcation learning,” IEEE Transactions on Information Forensics and Security, vol. 12, no. 1, pp. 172–187, 2017. [Online]. Available: http://ieeexplore.ieee.org/abstract/document/7563366/

work page arXiv 2017
[60]

Phy-layerlocationprivacy-preservingaccesspointselection mechanism in next-generation wireless networks,

S.Farhang,Y.Hayel,andQ.Zhu,“Phy-layerlocationprivacy-preservingaccesspointselection mechanism in next-generation wireless networks,” inCommunications and Network Security (CNS), 2015 IEEE Conference on. IEEE, 2015, pp. 263–271

work page 2015
[61]

Distributedprivacy-preservingcollaborativeintrusiondetectionsystems for vanets,

T.ZhangandQ.Zhu,“Distributedprivacy-preservingcollaborativeintrusiondetectionsystems for vanets,”IEEE Transactions on Signal and Information Processing over Networks, vol. 4, no. 1, pp. 148–161, 2018

work page 2018
[62]

gPath: A game-theoretic path selection algorithm to protect tor’s anonymity,

N. Zhang, W. Yu, X. Fu, and S. K. Das, “gPath: A game-theoretic path selection algorithm to protect tor’s anonymity,” inDecision and Game Theory for Security. Springer, 2010, pp. 58–71

work page 2010
[63]

Security games with unknown adversarial strategies,

A. Garnaev, M. Baykal-Gursoy, and H. V. Poor, “Security games with unknown adversarial strategies,”IEEE transactions on cybernetics, vol. 46, no. 10, pp. 2291–2299, 2015

work page 2015
[64]

Distributed strategic learning with application to network security,

Q. Zhu, H. Tembine, and T. Başar, “Distributed strategic learning with application to network security,” inProceedings of the 2011 American Control Conference. IEEE, 2011, pp. 4057– 4062

work page 2011
[65]

Multi-agentreinforcementlearningforintrusiondetection:Acase study and evaluation,

A.ServinandD.Kudenko,“Multi-agentreinforcementlearningforintrusiondetection:Acase study and evaluation,” inGerman Conference on Multiagent System Technologies. Springer, 2008, pp. 159–170

work page 2008
[66]

Distributedbayesianlearninginmultiagentsystems:Improvingour understanding of its capabilities and limitations,

P.M.DjurićandY.Wang,“Distributedbayesianlearninginmultiagentsystems:Improvingour understanding of its capabilities and limitations,”IEEE Signal Processing Magazine, vol. 29, no. 2, pp. 65–76, 2012. Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense 25

work page 2012
[67]

Coordination in multiagent reinforcement learning: a bayesianapproach,

G. Chalkiadakis and C. Boutilier, “Coordination in multiagent reinforcement learning: a bayesianapproach,”in ProceedingsofthesecondinternationaljointconferenceonAutonomous agents and multiagent systems. ACM, 2003, pp. 709–716

work page 2003
[68]

Distributedreinforcementlearningforpowerlimitedmany-core system performance optimization,

Z.ChenandD.Marculescu,“Distributedreinforcementlearningforpowerlimitedmany-core system performance optimization,” inProceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition. EDA Consortium, 2015, pp. 1521–1526

work page 2015
[69]

Games with incomplete information played by “bayesian

J. C. Harsanyi, “Games with incomplete information played by “bayesian” players, i–iii part i. the basic model,”Management science, vol. 14, no. 3, pp. 159–182, 1967

work page 1967
[70]

Transfer learning for reinforcement learning domains: A survey,

M. E. Taylor and P. Stone, “Transfer learning for reinforcement learning domains: A survey,” Journal of Machine Learning Research, vol. 10, no. Jul, pp. 1633–1685, 2009

work page 2009

[1] [1]

Verizon 2019 data breach investigations report,

“Verizon 2019 data breach investigations report,” 2019

work page 2019

[2] [2]

Combatting cyber risks in the supply chain,

D. Shackleford, “Combatting cyber risks in the supply chain,”SANS. org, 2015

work page 2015

[3] [3]

Adaptive strategic cyber defense for advanced persistent threats in criticalinfrastructurenetworks,

L. Huang and Q. Zhu, “Adaptive strategic cyber defense for advanced persistent threats in criticalinfrastructurenetworks,” ACMSIGMETRICSPerformanceEvaluationReview ,vol.46, no. 2, pp. 52–56, 2019

work page 2019

[4] [4]

Analysis and computation of adaptive defense strategies against advanced persistent threatsforcyber-physicalsystems,

——, “Analysis and computation of adaptive defense strategies against advanced persistent threatsforcyber-physicalsystems,”in InternationalConferenceonDecisionandGameTheory for Security. Springer, 2018, pp. 205–226

work page 2018

[5] [5]

A Dynamic Games Approach to Proactive Defense Strategies against AdvancedPersistentThreatsinCyber-PhysicalSystems,

L. Huang and Q. Zhu, “A Dynamic Games Approach to Proactive Defense Strategies against AdvancedPersistentThreatsinCyber-PhysicalSystems,” arXive-prints,p.arXiv:1906.09687, Jun 2019

work page arXiv 1906

[6] [6]

Game-theoretic approach to feedback-driven multi-stage moving target defense,

Q. Zhu and T. Başar, “Game-theoretic approach to feedback-driven multi-stage moving target defense,” inInternational Conference on Decision and Game Theory for Security. Springer, 2013, pp. 246–263. 22 Linan Huang and Quanyan Zhu

work page 2013

[7] [7]

Adaptive Honeypot Engagement through Reinforcement Learning of Semi-Markov Decision Processes,

L. Huang and Q. Zhu, “Adaptive Honeypot Engagement through Reinforcement Learning of Semi-Markov Decision Processes,”arXiv e-prints, p. arXiv:1906.12182, Jun 2019

work page arXiv 1906

[8] [8]

A Game-Theoretic Taxonomy and Survey of Defensive Deception for Cybersecurity and Privacy

J. Pawlick, E. Colbert, and Q. Zhu, “A game-theoretic taxonomy and survey of defensive deception for cybersecurity and privacy,”arXiv preprint arXiv:1712.05441, 2017

work page internal anchor Pith review Pith/arXiv arXiv 2017

[9] [9]

Integrating cyber-d&d into adversary modeling for active cyber defense,

F. J. Stech, K. E. Heckman, and B. E. Strom, “Integrating cyber-d&d into adversary modeling for active cyber defense,” inCyber deception. Springer, 2016, pp. 1–22

work page 2016

[10] [10]

Activecyberdefensewithdenialanddeception:Acyber-wargameexperiment,

K. E. Heckman, M. J. Walsh, F. J. Stech, T. A. O’boyle, S. R. DiCato, and A. F. Herber, “Activecyberdefensewithdenialanddeception:Acyber-wargameexperiment,” computers& security, vol. 37, pp. 72–77, 2013

work page 2013

[11] [11]

R-locker:Thwartingran- somware action through a honeyﬁle-based approach,

J.Gómez-Hernández,L.Álvarez-González,andP.García-Teodoro,“R-locker:Thwartingran- somware action through a honeyﬁle-based approach,”Computers & Security, vol. 73, pp. 389–398, 2018

work page 2018

[12] [12]

Changing the game: The art of deceiving sophisticated attackers,

N. Virvilis, B. Vanautgaerden, and O. S. Serrano, “Changing the game: The art of deceiving sophisticated attackers,” in2014 6th International Conference On Cyber Conﬂict (CyCon 2014). IEEE, 2014, pp. 87–97

work page 2014

[13] [13]

Modelingandanalysisofleakydeceptionusingsignaling games with evidence,

J.Pawlick,E.Colbert,andQ.Zhu,“Modelingandanalysisofleakydeceptionusingsignaling games with evidence,”IEEE Transactions on Information Forensics and Security, 2018

work page 2018

[14] [14]

SpringerScience&BusinessMedia,2011,vol.54

S.Jajodia,A.K.Ghosh,V.Swarup,C.Wang,andX.S.Wang, Movingtargetdefense:creating asymmetricuncertaintyforcyberthreats . SpringerScience&BusinessMedia,2011,vol.54

work page 2011

[15] [15]

Countering code-injection attacks with instruction-set randomization,

G. S. Kc, A. D. Keromytis, and V. Prevelakis, “Countering code-injection attacks with instruction-set randomization,” inProceedings of the 10th ACM conference on Computer and communications security. ACM, 2003, pp. 272–280

work page 2003

[16] [16]

Deceptive routing in relay networks,

A. Clark, Q. Zhu, R. Poovendran, and T. Başar, “Deceptive routing in relay networks,” in International Conference on Decision and Game Theory for Security. Springer, 2012, pp. 171–185

work page 2012

[17] [17]

Markov modeling of moving target defense games,

H. Maleki, S. Valizadeh, W. Koch, A. Bestavros, and M. van Dijk, “Markov modeling of moving target defense games,” inProceedings of the 2016 ACM Workshop on Moving Target Defense. ACM, 2016, pp. 81–92

work page 2016

[18] [18]

A methodology for intelligent honeypot deployment and active engagement of attackers,

C. R. Hecker, “A methodology for intelligent honeypot deployment and active engagement of attackers,” Ph.D. dissertation, 2012

work page 2012

[19] [19]

Deceptive attack and defense game in honeypot-enablednetworksfortheinternetofthings,

Q. D. La, T. Q. Quek, J. Lee, S. Jin, and H. Zhu, “Deceptive attack and defense game in honeypot-enablednetworksfortheinternetofthings,” IEEEInternetofThingsJournal ,vol.3, no. 6, pp. 1025–1035, 2016

work page 2016

[20] [20]

Optimal Timing in Dynamic and Robust Attacker Engagement During Advanced Persistent Threats

J. Pawlick, T. T. H. Nguyen, and Q. Zhu, “Optimal timing in dynamic and robust attacker engagement during advanced persistent threats,”CoRR, vol. abs/1707.08031, 2017. [Online]. Available: http://arxiv.org/abs/1707.08031

work page internal anchor Pith review Pith/arXiv arXiv 2017

[21] [21]

A Stackelberg game perspective on the conﬂict between machine learning and data obfuscation,

J. Pawlick and Q. Zhu, “A Stackelberg game perspective on the conﬂict between machine learning and data obfuscation,” in Information Forensics and Security (WIFS), 2016 IEEE International Workshop on . IEEE, 2016, pp. 1–6. [Online]. Available: http://ieeexplore.ieee.org/abstract/document/7823893/

work page arXiv 2016

[22] [22]

Deployment and exploitation of deceptive honeybots in social networks,

Q. Zhu, A. Clark, R. Poovendran, and T. Basar, “Deployment and exploitation of deceptive honeybots in social networks,” inDecision and Control (CDC), 2013 IEEE 52nd Annual Conference on. IEEE, 2013, pp. 212–219

work page 2013

[23] [23]

Hybrid learning in stochastic games and its applications in network security,

Q. Zhu, H. Tembine, and T. Basar, “Hybrid learning in stochastic games and its applications in network security,”Reinforcement Learning and Approximate Dynamic Programming for Feedback Control, pp. 305–329, 2013

work page 2013

[24] [24]

Interference aware routing game for cognitive radio multi-hop networks,

Q. Zhu, Z. Yuan, J. B. Song, Z. Han, and T. Başar, “Interference aware routing game for cognitive radio multi-hop networks,”Selected Areas in Communications, IEEE Journal on, vol. 30, no. 10, pp. 2006–2015, 2012

work page 2006

[25] [25]

Game-theoreticanalysisofnodecaptureandcloningattack with multiple attackers in wireless sensor networks,

Q.Zhu,L.Bushnell,andT.Basar,“Game-theoreticanalysisofnodecaptureandcloningattack with multiple attackers in wireless sensor networks,” inDecision and Control (CDC), 2012 IEEE 51st Annual Conference on. IEEE, 2012, pp. 3404–3411

work page 2012

[26] [26]

Deceptive routing games,

Q. Zhu, A. Clark, R. Poovendran, and T. Başar, “Deceptive routing games,” inDecision and Control (CDC), 2012 IEEE 51st Annual Conference on. IEEE, 2012, pp. 2704–2711. Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense 23

work page 2012

[27] [27]

A stochastic game model for jamming in multi-channel cognitive radio systems

Q. Zhu, H. Li, Z. Han, and T. Basar, “A stochastic game model for jamming in multi-channel cognitive radio systems.” inICC, 2010, pp. 1–6

work page 2010

[28] [28]

Secure and practical output feedback control for cloud-enabled cyber- physical systems,

Z. Xu and Q. Zhu, “Secure and practical output feedback control for cloud-enabled cyber- physical systems,” inCommunications and Network Security (CNS), 2017 IEEE Conference on. IEEE, 2017, pp. 416–420

work page 2017

[29] [29]

A Game-Theoretic Approach to Secure Control of Communication-Based Train Control Systems Under Jamming Attacks,

——, “A Game-Theoretic Approach to Secure Control of Communication-Based Train Control Systems Under Jamming Attacks,” inProceedings of the 1st International Workshop on Safe Control of Connected and Autonomous Vehicles. ACM, 2017, pp. 27–34. [Online]. Available: http://dl.acm.org/citation.cfm?id=3055381

work page 2017

[30] [30]

Cross-layer secure cyber-physical control system design for networked 3d printers,

——, “Cross-layer secure cyber-physical control system design for networked 3d printers,” in American Control Conference (ACC), 2016. IEEE, 2016, pp. 1191–1196. [Online]. Available: http://ieeexplore.ieee.org/abstract/document/7525079/

work page arXiv 2016

[31] [31]

Modeling, analysis, and mitigation of dynamic botnet formation in wireless iot networks,

M. J. Farooq and Q. Zhu, “Modeling, analysis, and mitigation of dynamic botnet formation in wireless iot networks,”IEEE Transactions on Information Forensics and Security, 2019

work page 2019

[32] [32]

A cyber-physical game framework for secure and resilient multi-agent autonomous systems,

Z. Xu and Q. Zhu, “A cyber-physical game framework for secure and resilient multi-agent autonomous systems,” inDecision and Control (CDC), 2015 IEEE 54th Annual Conference on. IEEE, 2015, pp. 5156–5161

work page 2015

[33] [33]

Alarge-scalemarkovgameapproachtodynamicprotectionof interdependent infrastructure networks,

L.Huang,J.Chen,andQ.Zhu,“Alarge-scalemarkovgameapproachtodynamicprotectionof interdependent infrastructure networks,” inInternational Conference on Decision and Game Theory for Security. Springer, 2017, pp. 357–376

work page 2017

[34] [34]

Adynamicgameanalysisanddesignofinfrastructurenetwork protection and recovery,

J.Chen,C.Touati,andQ.Zhu,“Adynamicgameanalysisanddesignofinfrastructurenetwork protection and recovery,”ACM SIGMETRICS Performance Evaluation Review, vol. 45, no. 2, p. 128, 2017

work page 2017

[35] [35]

A hybrid stochastic game for secure control of cyber-physical systems,

F. Miao, Q. Zhu, M. Pajic, and G. J. Pappas, “A hybrid stochastic game for secure control of cyber-physical systems,”Automatica, vol. 93, pp. 55–63, 2018

work page 2018

[36] [36]

Resilient control of cyber-physical systems againstdenial-of-serviceattacks,

Y. Yuan, Q. Zhu, F. Sun, Q. Wang, and T. Basar, “Resilient control of cyber-physical systems againstdenial-of-serviceattacks,”in ResilientControlSystems(ISRCS),20136thInternational Symposium on. IEEE, 2013, pp. 54–59

work page 2013

[37] [37]

GADAPT: A Sequential Game-Theoretic Framework for Designing Defense-in-Depth Strategies Against Advanced Persistent Threats,

S. Rass and Q. Zhu, “GADAPT: A Sequential Game-Theoretic Framework for Designing Defense-in-Depth Strategies Against Advanced Persistent Threats,” inDecision and Game TheoryforSecurity ,ser.LectureNotesinComputerScience,Q.Zhu,T.Alpcan,E.Panaousis, M. Tambe, and W. Casey, Eds. Cham: Springer International Publishing, 2016, vol. 9996, pp. 314–326

work page 2016

[38] [38]

Dynamicinterferenceminimizationrouting game for on-demand cognitive pilot channel,

Q.Zhu,Z.Yuan,J.B.Song,Z.Han,andT.Basar,“Dynamicinterferenceminimizationrouting game for on-demand cognitive pilot channel,” inGlobal Telecommunications Conference (GLOBECOM 2010), 2010 IEEE. IEEE, 2010, pp. 1–6

work page 2010

[39] [39]

Strategic defense against deceptive civilian gps spooﬁng of unmanned aerial vehicles,

T. Zhang and Q. Zhu, “Strategic defense against deceptive civilian gps spooﬁng of unmanned aerial vehicles,” inInternational Conference on Decision and Game Theory for Security. Springer, 2017, pp. 213–233

work page 2017

[40] [40]

Analysis and computation of adaptive defense strategies against ad- vancedpersistentthreatsforcyber-physicalsystems,

L. Huang and Q. Zhu, “Analysis and computation of adaptive defense strategies against ad- vancedpersistentthreatsforcyber-physicalsystems,”in InternationalConferenceonDecision and Game Theory for Security, 2018

work page 2018

[41] [41]

Adaptivestrategiccyberdefenseforadvancedpersistentthreatsincriticalinfrastructure networks,

——,“Adaptivestrategiccyberdefenseforadvancedpersistentthreatsincriticalinfrastructure networks,” inACM SIGMETRICS Performance Evaluation Review, 2018

work page 2018

[42] [42]

Flip the cloud: Cyber-physical signaling games in the presenceofadvancedpersistentthreats,

J. Pawlick, S. Farhang, and Q. Zhu, “Flip the cloud: Cyber-physical signaling games in the presenceofadvancedpersistentthreats,”in DecisionandGameTheoryforSecurity . Springer, 2015, pp. 289–308

work page 2015

[43] [43]

A dynamic bayesian security game framework for strategic defense mechanism design,

S. Farhang, M. H. Manshaei, M. N. Esfahani, and Q. Zhu, “A dynamic bayesian security game framework for strategic defense mechanism design,” inDecision and Game Theory for Security. Springer, 2014, pp. 319–328

work page 2014

[44] [44]

Dynamicpolicy-basedidsconﬁguration,

Q.ZhuandT.Başar,“Dynamicpolicy-basedidsconﬁguration,”in DecisionandControl,2009 held jointly with the 2009 28th Chinese Control Conference. CDC/CCC 2009. Proceedings of the 48th IEEE Conference on. IEEE, 2009, pp. 8600–8605

work page 2009

[45] [45]

Networksecurityconﬁgurations:Anonzero-sumstochastic gameapproach,

Q.Zhu,H.Tembine,andT.Basar,“Networksecurityconﬁgurations:Anonzero-sumstochastic gameapproach,”in AmericanControlConference(ACC),2010 . IEEE,2010,pp.1059–1064. 24 Linan Huang and Quanyan Zhu

work page 2010

[46] [46]

Heterogeneouslearninginzero-sumstochasticgameswith incomplete information,

Q.Zhu,H.Tembine,andT.Başar,“Heterogeneouslearninginzero-sumstochasticgameswith incomplete information,” in49th IEEE conference on decision and control (CDC). IEEE, 2010, pp. 219–224

work page 2010

[47] [47]

Security as a Service for Cloud-Enabled Internet of Controlled Things under Advanced Persistent Threats: A Contract Design Approach,

J. Chen and Q. Zhu, “Security as a Service for Cloud-Enabled Internet of Controlled Things under Advanced Persistent Threats: A Contract Design Approach,” IEEE Transactions on Information Forensics and Security , 2017. [Online]. Available: http://ieeexplore.ieee.org/abstract/document/7954676/

work page arXiv 2017

[48] [48]

ABi-LevelGameApproachtoAttack-AwareCyberInsurance ofComputerNetworks,

R.Zhang,Q.Zhu,andY.Hayel,“ABi-LevelGameApproachtoAttack-AwareCyberInsurance ofComputerNetworks,” IEEEJournalonSelectedAreasinCommunications ,vol.35,no.3,pp. 779–794, 2017. [Online]. Available: http://ieeexplore.ieee.org/abstract/document/7859343/

work page arXiv 2017

[49] [49]

Attack-aware cyber insurance of interdependent computer networks,

R. Zhang and Q. Zhu, “Attack-aware cyber insurance of interdependent computer networks,” 2016

work page 2016

[50] [50]

Compliance control: Managed vulnera- bility surface in social-technological systems via signaling games,

W. A. Casey, Q. Zhu, J. A. Morales, and B. Mishra, “Compliance control: Managed vulnera- bility surface in social-technological systems via signaling games,” inProceedings of the 7th ACM CCS International Workshop on Managing Insider Security Threats. ACM, 2015, pp. 53–62

work page 2015

[51] [51]

Attack-aware cyber insurance for risk sharing in computer networks,

Y. Hayel and Q. Zhu, “Attack-aware cyber insurance for risk sharing in computer networks,” in Decision and Game Theory for Security. Springer, 2015, pp. 22–34

work page 2015

[52] [52]

Epidemic protection over heterogeneous networks using evolutionary poisson games,

——, “Epidemic protection over heterogeneous networks using evolutionary poisson games,” IEEETransactionsonInformationForensicsandSecurity ,vol.12,no.8,pp.1786–1800,2017

work page 2017

[53] [53]

Guidex:Agame-theoreticincentive-basedmech- anismforintrusiondetectionnetworks,

Q.Zhu,C.Fung,R.Boutaba,andT.Başar,“Guidex:Agame-theoreticincentive-basedmech- anismforintrusiondetectionnetworks,” SelectedAreasinCommunications,IEEEJournalon , vol. 30, no. 11, pp. 2220–2230, 2012

work page 2012

[54] [54]

Tragedy of anticommons in digital right management of medical records

Q. Zhu, C. A. Gunter, and T. Basar, “Tragedy of anticommons in digital right management of medical records.” inHealthSec, 2012

work page 2012

[55] [55]

Agame-theoreticalapproachtoincentivedesignin collaborative intrusion detection networks,

Q.Zhu,C.Fung,R.Boutaba,andT.Başar,“Agame-theoreticalapproachtoincentivedesignin collaborative intrusion detection networks,” inGame Theory for Networks, 2009. GameNets’

work page 2009

[56] [56]

IEEE, 2009, pp

International Conference on. IEEE, 2009, pp. 384–392

work page 2009

[57] [57]

A game theoretic investigation of deception in network security,

T. E. Carroll and D. Grosu, “A game theoretic investigation of deception in network security,” Security and Commun. Nets., vol. 4, no. 10, pp. 1162–1172, 2011

work page 2011

[58] [58]

A Stackelberg game perspective on the conﬂict between machine learninganddataobfuscation,

J. Pawlick and Q. Zhu, “A Stackelberg game perspective on the conﬂict between machine learninganddataobfuscation,” IEEEIntl.WorkshoponInform.ForensicsandSecurity ,2016

work page 2016

[59] [59]

DynamicdiﬀerentialprivacyforADMM-baseddistributedclassiﬁcation learning,

T.ZhangandQ.Zhu,“DynamicdiﬀerentialprivacyforADMM-baseddistributedclassiﬁcation learning,” IEEE Transactions on Information Forensics and Security, vol. 12, no. 1, pp. 172–187, 2017. [Online]. Available: http://ieeexplore.ieee.org/abstract/document/7563366/

work page arXiv 2017

[60] [60]

Phy-layerlocationprivacy-preservingaccesspointselection mechanism in next-generation wireless networks,

S.Farhang,Y.Hayel,andQ.Zhu,“Phy-layerlocationprivacy-preservingaccesspointselection mechanism in next-generation wireless networks,” inCommunications and Network Security (CNS), 2015 IEEE Conference on. IEEE, 2015, pp. 263–271

work page 2015

[61] [61]

Distributedprivacy-preservingcollaborativeintrusiondetectionsystems for vanets,

T.ZhangandQ.Zhu,“Distributedprivacy-preservingcollaborativeintrusiondetectionsystems for vanets,”IEEE Transactions on Signal and Information Processing over Networks, vol. 4, no. 1, pp. 148–161, 2018

work page 2018

[62] [62]

gPath: A game-theoretic path selection algorithm to protect tor’s anonymity,

N. Zhang, W. Yu, X. Fu, and S. K. Das, “gPath: A game-theoretic path selection algorithm to protect tor’s anonymity,” inDecision and Game Theory for Security. Springer, 2010, pp. 58–71

work page 2010

[63] [63]

Security games with unknown adversarial strategies,

A. Garnaev, M. Baykal-Gursoy, and H. V. Poor, “Security games with unknown adversarial strategies,”IEEE transactions on cybernetics, vol. 46, no. 10, pp. 2291–2299, 2015

work page 2015

[64] [64]

Distributed strategic learning with application to network security,

Q. Zhu, H. Tembine, and T. Başar, “Distributed strategic learning with application to network security,” inProceedings of the 2011 American Control Conference. IEEE, 2011, pp. 4057– 4062

work page 2011

[65] [65]

Multi-agentreinforcementlearningforintrusiondetection:Acase study and evaluation,

A.ServinandD.Kudenko,“Multi-agentreinforcementlearningforintrusiondetection:Acase study and evaluation,” inGerman Conference on Multiagent System Technologies. Springer, 2008, pp. 159–170

work page 2008

[66] [66]

Distributedbayesianlearninginmultiagentsystems:Improvingour understanding of its capabilities and limitations,

P.M.DjurićandY.Wang,“Distributedbayesianlearninginmultiagentsystems:Improvingour understanding of its capabilities and limitations,”IEEE Signal Processing Magazine, vol. 29, no. 2, pp. 65–76, 2012. Strategic Learning for Active, Adaptive, and Autonomous Cyber Defense 25

work page 2012

[67] [67]

Coordination in multiagent reinforcement learning: a bayesianapproach,

G. Chalkiadakis and C. Boutilier, “Coordination in multiagent reinforcement learning: a bayesianapproach,”in ProceedingsofthesecondinternationaljointconferenceonAutonomous agents and multiagent systems. ACM, 2003, pp. 709–716

work page 2003

[68] [68]

Distributedreinforcementlearningforpowerlimitedmany-core system performance optimization,

Z.ChenandD.Marculescu,“Distributedreinforcementlearningforpowerlimitedmany-core system performance optimization,” inProceedings of the 2015 Design, Automation & Test in Europe Conference & Exhibition. EDA Consortium, 2015, pp. 1521–1526

work page 2015

[69] [69]

Games with incomplete information played by “bayesian

J. C. Harsanyi, “Games with incomplete information played by “bayesian” players, i–iii part i. the basic model,”Management science, vol. 14, no. 3, pp. 159–182, 1967

work page 1967

[70] [70]

Transfer learning for reinforcement learning domains: A survey,

M. E. Taylor and P. Stone, “Transfer learning for reinforcement learning domains: A survey,” Journal of Machine Learning Research, vol. 10, no. Jul, pp. 1633–1685, 2009

work page 2009