Solving the Rubik's Cube Without Human Knowledge

2018 · cs.AI · arXiv 1805.07470

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

A generally intelligent agent must be able to teach itself how to solve problems in complex domains with minimal human supervision. Recently, deep reinforcement learning algorithms combined with self-play have achieved superhuman proficiency in Go, Chess, and Shogi without human data or domain knowledge. In these environments, a reward is always received at the end of the game; however, for many combinatorial optimization environments, rewards are sparse and episodes are not guaranteed to terminate. We introduce Autodidactic Iteration: a novel reinforcement learning algorithm that is able to teach itself how to solve the Rubik's Cube with no human assistance. Our algorithm is able to solve 100% of randomly scrambled cubes while achieving a median solve length of 30 moves, less than or equal to that of solvers that employ human domain knowledge.

representative citing papers

Learning Empirically Admissible Neural Heuristics for Combinatorial Search

cs.LG · 2026-06-03 · unverdicted · novelty 5.0

Presents a framework for training empirically admissible neural heuristics via an underestimating Bellman operator, an asymmetric loss, and a validation calibration offset, reporting reduced node expansions with no observed admissibility violations on small puzzles.

Learning to Solve a Rubik's Cube with a Dexterous Hand

cs.RO · 2019-07-26 · unverdicted · novelty 5.0

Hierarchical RL combines a model-based cube solver with a model-free hand controller to solve Rubik's cubes in simulation, achieving 90.3% success on 1400 random scrambles.

citing papers explorer

Showing 2 of 2 citing papers.

Learning Empirically Admissible Neural Heuristics for Combinatorial Search

cs.LG · 2026-06-03 · unverdicted · none · ref 5 · internal anchor

Learning to Solve a Rubik's Cube with a Dexterous Hand

cs.RO · 2019-07-26 · unverdicted · none · ref 21 · internal anchor
