Inductive general game playing
Pith reviewed 2026-05-25 17:45 UTC · model grok-4.3
The pith
Inductive general game playing inverts the GGP task so that a learner must recover game rules from traces alone.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The central claim is that the IGGP problem, obtained by inverting the GGP task, is hard for existing ILP systems. On an automatically generated dataset of traces from 50 games, most games cannot be learned correctly, and the best system solves only 40 percent of the tasks perfectly.
What carries the argument
The automatic IGGP task generator that converts any GGP game description into a set of positive and negative game traces together with background knowledge, thereby producing the 50-game benchmark.
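As a rough illustration of what such a generator does, the sketch below plays a toy game at random and labels ground atoms as positive or negative examples for a `next` predicate. The toy counter game, the function names (`legal_moves`, `next_state`, `generate_examples`), and the labeling scheme are assumptions made for illustration; the paper's actual generator works on GGP game descriptions and is not reproduced here.

```python
import random

# Hypothetical toy game (not from the paper): a counter that starts at 0,
# may be incremented or reset, and is terminal when it reaches LIMIT.
LIMIT = 3
ACTIONS = ("inc", "reset")


def legal_moves(count):
    """Actions allowed in a state; 'reset' is illegal when the counter is 0."""
    return [a for a in ACTIONS if a == "inc" or count > 0]


def next_state(count, action):
    """Ground-truth transition rule that a learner would have to recover."""
    return count + 1 if action == "inc" else 0


def generate_examples(num_traces, max_steps=50, seed=0):
    """Play random games and label candidate next-state atoms.

    For each observed (state, action) pair, the atom
    ("next", state, action, s2) is a positive example when s2 is the true
    successor and a negative example for every other candidate value of s2.
    """
    rng = random.Random(seed)
    positives, negatives = set(), set()
    for _trace in range(num_traces):
        count = 0
        for _step in range(max_steps):  # cap trace length so play always stops
            if count >= LIMIT:
                break
            action = rng.choice(legal_moves(count))
            successor = next_state(count, action)
            for candidate in range(LIMIT + 1):
                atom = ("next", count, action, candidate)
                (positives if candidate == successor else negatives).add(atom)
            count = successor
    return positives, negatives


if __name__ == "__main__":
    pos, neg = generate_examples(num_traces=20)
    print(len(pos), "positive and", len(neg), "negative examples")
```

In this sketch the negatives come from enumerating every candidate successor, which is only feasible because the toy state space is tiny; a generator for real games would need some other way to choose negatives.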
If this is right
- IGGP supplies a growing collection of relational learning problems, because new tasks can be generated as games are added to the GGP competition each year.
- Any ILP advance that improves performance on the benchmark directly advances the ability to learn game rules from demonstration traces.
- Because each task is derived from an existing GGP game, the same rule set can be used both for induction and for subsequent play evaluation.
- The 40 percent ceiling suggests that standard ILP bias and search strategies are insufficient for theories that must capture turn-based dynamics and win conditions.
Where Pith is reading between the lines
- Systems that combine ILP with explicit search or planning modules may be needed to handle the long-horizon consistency required by full game rules.
- The same generator could be applied to other domains where an agent must recover transition rules from observed trajectories, such as robotics or process mining.
- If the benchmark grows, it offers a natural testbed for measuring progress in learning from partial observations without hand-crafted background knowledge.
Load-bearing premise
The generated traces contain sufficient information and lack systematic biases that would prevent even an ideal learner from recovering the correct rules.
What would settle it
A run of any existing ILP system on the 50-game dataset that perfectly solves more than 40 percent of the tasks would falsify the empirical claim.
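To make the threshold concrete, the headline figure can be read as the share of tasks on which a system reaches perfect accuracy. The sketch below computes that share from per-task accuracies; the task names and scores are made-up placeholders, not results from the paper, and the helper names are hypothetical.

```python
def perfect_share(accuracies):
    """Fraction of tasks solved perfectly (accuracy exactly 1.0)."""
    if not accuracies:
        raise ValueError("no tasks supplied")
    solved = sum(1 for acc in accuracies.values() if acc == 1.0)
    return solved / len(accuracies)


def exceeds_reported_ceiling(accuracies, ceiling=0.40):
    """True when the perfect-solve share is strictly above the ceiling."""
    return perfect_share(accuracies) > ceiling


# Placeholder scores for five hypothetical tasks (illustrative only).
example_scores = {
    "task_a": 1.0,
    "task_b": 0.93,
    "task_c": 1.0,
    "task_d": 0.50,
    "task_e": 0.87,
}
print(perfect_share(example_scores))             # 2 of the 5 tasks are perfect
print(exceeds_reported_ceiling(example_scores))  # matching the ceiling is not enough
```

The strict inequality mirrors the wording above: matching 40 percent would reproduce the reported figure, and only a higher share would contradict it.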
Original abstract
General game playing (GGP) is a framework for evaluating an agent's general intelligence across a wide range of tasks. In the GGP competition, an agent is given the rules of a game (described as a logic program) that it has never seen before. The task is for the agent to play the game, thus generating game traces. The winner of the GGP competition is the agent that gets the best total score over all the games. In this paper, we invert this task: a learner is given game traces and the task is to learn the rules that could produce the traces. This problem is central to inductive general game playing (IGGP). We introduce a technique that automatically generates IGGP tasks from GGP games. We introduce an IGGP dataset which contains traces from 50 diverse games, such as Sudoku, Sokoban, and Checkers. We claim that IGGP is difficult for existing inductive logic programming (ILP) approaches. To support this claim, we evaluate existing ILP systems on our dataset. Our empirical results show that most of the games cannot be correctly learned by existing systems. The best performing system solves only 40% of the tasks perfectly. Our results suggest that IGGP poses many challenges to existing approaches. Furthermore, because we can automatically generate IGGP tasks from GGP games, our dataset will continue to grow with the GGP competition, as new games are added every year. We therefore think that the IGGP problem and dataset will be valuable for motivating and evaluating future research.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The paper inverts the GGP competition by defining the IGGP problem: given game traces generated from unknown logic-program rules, learn the rules. It describes an automatic procedure to generate IGGP tasks from existing GGP games, releases a dataset of traces from 50 games (Sudoku, Sokoban, Checkers, etc.), and reports an empirical evaluation in which off-the-shelf ILP systems are applied to the tasks; the best system solves only 40% of tasks perfectly. The authors conclude that IGGP exposes limitations of current ILP methods and that the dataset will grow with future GGP games.
Significance. If the generated tasks are verifiably solvable by the original rules and the traces supply sufficient positive/negative examples, the 40% figure would constitute a concrete, extensible benchmark that isolates relational induction difficulties beyond toy domains. The automatic generation pipeline and the promise of perpetual growth with the GGP competition are genuine strengths.
major comments (2)
- [IGGP task generation section] The central empirical claim (the best ILP system solves 40% of tasks perfectly) presupposes that each generated task is solvable in principle, i.e., that the original GGP rules achieve perfect fit on the supplied traces and that the traces distinguish the target theory from alternatives. No verification of this property (coverage of state transitions, action preconditions, or negative examples) is reported; without it the 40% result cannot be attributed to ILP limitations rather than benchmark construction.
- [Evaluation section] The abstract states an empirical result (40% perfect solutions) yet supplies no details on which ILP systems were tested, how traces were split into training/test, error bars across games, or failure modes (e.g., timeouts, memory exhaustion, or incorrect but consistent hypotheses). These omissions make the headline number impossible to interpret or reproduce.
minor comments (2)
- The paper should explicitly list the ILP systems, their parameter settings, and the exact success criterion (perfect reconstruction of all rules vs. partial correctness).
- Clarify whether the 50-game dataset is released with the paper and, if so, provide a persistent URL or repository reference.
Simulated Author's Rebuttal
We thank the referee for their thorough review and constructive comments on our paper. We address each of the major comments below.
- Referee: [IGGP task generation section] The central empirical claim (the best ILP system solves 40% of tasks perfectly) presupposes that each generated task is solvable in principle, i.e., that the original GGP rules achieve perfect fit on the supplied traces and that the traces distinguish the target theory from alternatives. No verification of this property (coverage of state transitions, action preconditions, or negative examples) is reported; without it the 40% result cannot be attributed to ILP limitations rather than benchmark construction.
  Authors: We agree that explicitly verifying that each IGGP task is solvable by the original rules is crucial for validating the benchmark. Although the tasks are generated automatically from the GGP rules, the manuscript does not explicitly demonstrate that the traces provide sufficient positive and negative examples or that the original rules achieve perfect coverage. We will revise the IGGP task generation section to include such verification, for example by reporting the accuracy of the original rules on the generated traces for all 50 games. This will strengthen the claim that the results reflect limitations of ILP systems.
  Revision: yes
- Referee: [Evaluation section] The abstract states an empirical result (40% perfect solutions) yet supplies no details on which ILP systems were tested, how traces were split into training/test, error bars across games, or failure modes (e.g., timeouts, memory exhaustion, or incorrect but consistent hypotheses). These omissions make the headline number impossible to interpret or reproduce.
  Authors: We acknowledge that the evaluation section lacks detail about the experimental setup. The manuscript mentions the best-performing system but does not provide the full list of tested systems, the precise train/test splits used for the traces, statistical measures such as error bars, or breakdowns of failure cases. We will expand the evaluation section to include these details so that the results are reproducible and the 40% figure can be properly interpreted.
  Revision: yes
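The check the authors commit to, scoring the original rules against the generated examples, could look roughly like the sketch below. It represents a rule set as a Python predicate over atoms and reports plain accuracy. The example atoms, the `reference_rules` and `always_true` functions, and the choice of plain accuracy as the metric are all assumptions for illustration, not the paper's evaluation code.

```python
def accuracy(hypothesis, positives, negatives):
    """Share of labeled atoms that the hypothesis classifies correctly."""
    total = len(positives) + len(negatives)
    if total == 0:
        raise ValueError("no examples supplied")
    correct = sum(1 for atom in positives if hypothesis(atom))
    correct += sum(1 for atom in negatives if not hypothesis(atom))
    return correct / total


def is_solvable_by(hypothesis, positives, negatives):
    """True when the hypothesis covers every positive and no negative."""
    return accuracy(hypothesis, positives, negatives) == 1.0


# Hypothetical task: atoms are (state, action, successor) triples for a
# counter game in which 'inc' adds one and 'reset' returns to zero.
positives = {(0, "inc", 1), (1, "inc", 2), (2, "reset", 0)}
negatives = {(0, "inc", 0), (1, "inc", 1), (2, "reset", 2)}


def reference_rules(atom):
    """Stand-in for the original game rules."""
    state, action, successor = atom
    return successor == (state + 1 if action == "inc" else 0)


def always_true(atom):
    """Over-general hypothesis that accepts every atom."""
    return True


print(is_solvable_by(reference_rules, positives, negatives))
print(accuracy(always_true, positives, negatives))
```

A perfect score for the reference rules only shows that the task is consistent with them. Whether the examples also rule out competing theories, the referee's second concern, would need a separate check, such as confirming that over-general hypotheses like `always_true` score below 1.0.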
Circularity Check
No circularity: empirical evaluation on externally generated tasks
The paper introduces an automatic procedure to generate IGGP tasks directly from existing GGP games (whose rules are known independently) and then measures the performance of off-the-shelf ILP systems on the resulting dataset. The headline result (best system solves 40% perfectly) is a direct empirical measurement on these new tasks; it does not reduce to any fitted parameter, self-definition, or self-citation chain. The generation step ensures consistency between traces and original rules by construction, but the evaluation tests recovery by external systems and therefore supplies independent evidence. No load-bearing step matches any of the enumerated circularity patterns.
Axiom & Free-Parameter Ledger
axioms (1)
- Domain assumption: game traces produced by GGP rule sets are informationally sufficient for an ideal learner to recover the original rules.
Forward citations
Cited by 1 Pith paper
- Logical reduction of metarules: derivation reduction produces smaller equivalent metarule sets that outperform subsumption and entailment reductions on ILP tasks in accuracy and speed.