Why Build an Assistant in Minecraft?
Pith reviewed 2026-05-24 18:19 UTC · model grok-4.3
The pith
Building an open assistant in Minecraft advances natural language understanding and learning from dialogue.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
The authors maintain that an open assistant developed within Minecraft supplies a workable route to measurable progress on natural language understanding and the capacity to learn from ongoing dialogue with humans or other agents.
What carries the argument
An open assistant inside Minecraft that uses language to interact with the game's dynamic, creative world and to learn from dialogue.
Load-bearing premise
The interactive and creative features of Minecraft act as a suitable stand-in for real-world language use that will transfer to wider natural language understanding problems.
What would settle it
An experiment showing that an assistant trained to converse and act inside Minecraft produces no measurable gains on standard dialogue or instruction-following benchmarks outside the game.
Original abstract
In this document we describe a rationale for a research program aimed at building an open "assistant" in the game Minecraft, in order to make progress on the problems of natural language understanding and learning from dialogue.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. The manuscript presents a rationale for a research program to build an open assistant in the game Minecraft, with the aim of advancing natural language understanding and learning from dialogue through interactive gameplay.
Significance. If the rationale holds and the program is executed, Minecraft could serve as an accessible, rich testbed for developing dialogue-capable AI systems that learn through creative interaction, potentially complementing existing NLU benchmarks with more open-ended scenarios.
minor comments (2)
- The rationale would be strengthened by explicit discussion of evaluation metrics or milestones that would demonstrate progress on NLU goals within the Minecraft setting.
- Consider adding references to prior work on game-based AI environments (e.g., other sandbox games or dialogue agents) to better situate the proposal.
Simulated Author's Rebuttal
We thank the referee for their positive summary, significance assessment, and recommendation of minor revision. No major comments were raised in the report.
Circularity Check
No significant circularity in proposal document
full rationale
The paper is explicitly a high-level rationale for a proposed research program rather than a derivation of results, theorems, or empirical findings. It contains no equations, fitted parameters, quantitative predictions, or load-bearing self-citations that reduce claims to inputs by construction. The central argument is prospective and conditional on future work, with no internal steps that can be shown to be equivalent to their own premises.
Axiom & Free-Parameter Ledger
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/RealityFromDistinction.lean, theorem reality_from_one_distinction (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: rationale for a research program aimed at building an open “assistant” in the game Minecraft, in order to make progress on the problems of natural language understanding and learning from dialogue
- IndisputableMonolith/Cost/FunctionalEquation.lean, theorem washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: modular ML systems that can improve themselves from data while keeping well defined interfaces
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.