arxiv: 1907.09273 · v2 · pith:HFJWCXHOnew · submitted 2019-07-22 · 💻 cs.AI · cs.CL

Why Build an Assistant in Minecraft?

Pith reviewed 2026-05-24 18:19 UTC · model grok-4.3

keywords: Minecraft · natural language understanding · dialogue learning · AI assistant · language grounding · game environment · interactive learning

The pith

Building an open assistant in Minecraft advances natural language understanding and learning from dialogue.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper lays out a rationale for a research program centered on constructing an open assistant inside the game Minecraft. This setup is intended to tackle longstanding issues in natural language understanding and acquiring skills through dialogue. Minecraft supplies a shared, modifiable environment where language can be tied directly to actions, observations, and creative tasks. A reader following the argument would see the game as a controlled yet open-ended testbed that could yield transferable insights into how agents interpret and respond to instructions. The authors position this effort as a practical step toward more capable dialogue-based AI systems.

Core claim

The authors maintain that an open assistant developed within Minecraft supplies a workable route to measurable progress on natural language understanding and the capacity to learn from ongoing dialogue with humans or other agents.

What carries the argument

An open assistant inside Minecraft that uses language to interact with the game's dynamic, creative world and to learn from dialogue.

Load-bearing premise

The interactive and creative features of Minecraft act as a suitable stand-in for real-world language use that will transfer to wider natural language understanding problems.

What would settle it

An experiment showing that an assistant trained to converse and act inside Minecraft produces no measurable gains on standard dialogue or instruction-following benchmarks outside the game.

Original abstract

In this document we describe a rationale for a research program aimed at building an open "assistant" in the game Minecraft, in order to make progress on the problems of natural language understanding and learning from dialogue.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 2 minor

Summary. The manuscript presents a rationale for a research program to build an open assistant in the game Minecraft, with the aim of advancing natural language understanding and learning from dialogue through interactive gameplay.

Significance. If the rationale holds and the program is executed, Minecraft could serve as an accessible, rich testbed for developing dialogue-capable AI systems that learn through creative interaction, potentially complementing existing NLU benchmarks with more open-ended scenarios.

minor comments (2)
  1. The rationale would be strengthened by explicit discussion of evaluation metrics or milestones that would demonstrate progress on NLU goals within the Minecraft setting.
  2. Consider adding references to prior work on game-based AI environments (e.g., other sandbox games or dialogue agents) to better situate the proposal.

Simulated Authors' Rebuttal

0 responses · 0 unresolved

We thank the referee for their positive summary, significance assessment, and recommendation of minor revision. No major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity in proposal document

full rationale

The paper is explicitly a high-level rationale for a proposed research program rather than a derivation of results, theorems, or empirical findings. It contains no equations, fitted parameters, quantitative predictions, or load-bearing self-citations that reduce claims to inputs by construction. The central argument is prospective and conditional on future work, with no internal steps that can be shown to be equivalent to their own premises.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

This is a position paper outlining reasons for a research direction. It does not introduce or rely on new free parameters, axioms, or invented entities in a technical sense.
