arxiv: 2605.12077 · v1 · submitted 2026-05-12 · 💻 cs.CV · cs.AI

The Missing GAP: From Solving Square Jigsaw Puzzles to Handling Real World Archaeological Fragments

Ofir Itzhak Shahar , Gur Elkin , Ohad Ben-Shahar

Pith reviewed 2026-05-13 05:54 UTC · model grok-4.3

keywords jigsaw puzzle solving · archaeological fragments · flow matching · vision transformer · irregular shapes · eroded pieces · GAP dataset · fragment reassembly

The pith

New GAP datasets and PuzzleFlow framework let computers reassemble irregular eroded fragments better than prior jigsaw solvers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates GAP, a collection of jigsaw puzzle datasets that use synthetic pieces of arbitrary shapes carrying erosion patterns drawn from a learned model of real archaeological fragments. It introduces PuzzleFlow, a method built on Vision Transformers and flow-matching that predicts how to fit these complex pieces together. Earlier work stayed inside the narrow setting of clean square pieces with straight cuts. A reader would care because actual broken artifacts almost never match that square ideal, so a method that works on eroded irregular shards opens a route to digital reconstruction of physical objects.

Core claim

We introduce GAP, a set of novel jigsaw puzzles datasets containing synthetic, heavily eroded pieces of unrestricted shapes, generated by a learned distribution of real-world archaeological fragments. We also introduce PuzzleFlow, a novel ViT and Flow-Matching based framework for jigsaw puzzle solving, capable of handling complex puzzle pieces and demonstrating superior performance on GAP when compared to both classic and recent prominent works in this domain.

What carries the argument

PuzzleFlow, a Vision Transformer paired with flow-matching that models the reassembly of irregular, eroded pieces; the GAP datasets supply the training and test distribution generated from archaeological fragment statistics.
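
To make the flow-matching ingredient concrete, here is a minimal sketch of the generic continuous, straight-line formulation: interpolate between a noise sample and a target, and regress a velocity toward their difference. This is an assumption for illustration, not the paper's implementation; the Figure 8 caption says PuzzleFlow's head predicts logits, so the actual variant may well be discrete. The piece coordinates and the `oracle` stand-in for a trained network are hypothetical.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight-line path from x0 (noise) to x1 (target) at time t."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Regression target along that path: d/dt x_t = x1 - x0 (constant in t)."""
    return [b - a for a, b in zip(x0, x1)]

def fm_loss(pred_v, x0, x1):
    """Mean squared error between a predicted velocity and the target velocity."""
    u = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(pred_v, u)) / len(u)

def euler_sample(velocity_fn, x0, steps=10):
    """Integrate dx/dt = velocity_fn(x, t) from t=0 to t=1 with Euler steps."""
    x, dt = list(x0), 1.0 / steps
    for k in range(steps):
        v = velocity_fn(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

rng = random.Random(0)
x1 = [0.0, 0.0, 1.0, 0.0, 0.0, 1.0]        # hypothetical 2D positions of 3 pieces
x0 = [rng.gauss(0.0, 1.0) for _ in x1]     # noise sample of the same size

def oracle(x, t):
    """Stand-in for a trained network: the exact velocity for this (x0, x1) pair."""
    return target_velocity(x0, x1)

x_end = euler_sample(oracle, x0)           # follows the path from noise to target
```

With the exact velocity, Euler integration lands on the target up to floating-point error. A trained model would replace `oracle` with a network conditioned on fragment features and the timestep.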

Load-bearing premise

The learned distribution that produces the synthetic eroded pieces in GAP captures enough of the shape and damage variation present in actual physical archaeological shards.
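
According to the Figure 6 caption, the learned distribution is a VAE over 128×128 binary fragment masks with a 64-dimensional latent space, trained with the reparameterization trick. The sketch below shows only that sampling step and the standard closed-form KL term for a diagonal Gaussian. The encoder and decoder networks are omitted, and the function names are illustrative rather than taken from the paper.

```python
import math
import random

LATENT_DIM = 64  # latent size stated in the Figure 6 caption

def reparameterize(mu, logvar, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I) and sigma = exp(0.5 * logvar)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, logvar)]

def kl_to_standard_normal(mu, logvar):
    """KL( N(mu, diag(exp(logvar))) || N(0, I) ), summed over latent dimensions."""
    return 0.5 * sum(m * m + math.exp(lv) - 1.0 - lv for m, lv in zip(mu, logvar))

def sample_prior(rng, dim=LATENT_DIM):
    """Draw a latent code from the prior; a trained decoder would map it to a mask."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]

rng = random.Random(0)
mu = [0.0] * LATENT_DIM       # hypothetical encoder outputs for one mask
logvar = [0.0] * LATENT_DIM
z = reparameterize(mu, logvar, rng)   # stochastic code used during training
z_prior = sample_prior(rng)           # code used when generating new fragments
```

Because the randomness enters only through `eps`, gradients can flow through `mu` and `logvar`, which is what allows the encoder to be trained. How well decoded samples match real shards is the premise stated above.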

What would settle it

PuzzleFlow would lose its reported performance advantage when tested on a held-out set of real, non-synthetic archaeological fragments whose erosion statistics were never seen during GAP generation.

Figures

Figures reproduced from arXiv: 2605.12077 by Gur Elkin, Ofir Itzhak Shahar, Ohad Ben-Shahar.

**Figure 1.** Archaeological Puzzle Reconstruction. A puzzle from GAP-5 dataset (left) features irregularly-shaped, heavily eroded fragments generated from real archaeological artifact distributions. PuzzleFlow (right) successfully reconstructs these challenging puzzles by learning holistic visual relationships across entire fragment surfaces, rather than relying on boundary continuity.

**Figure 2.** Visual comparison of puzzle erosion patterns across …

**Figure 3.** Puzzle Generation Pipeline for GAP-3 and GAP-5 Datasets. Both datasets follow the same four-step generation process: (a) Source images from The Metropolitan Museum of Art Open Access collection (CC0 1.0 Universal Public Domain Dedication); (b) Grid overlay defining puzzle piece boundaries; (c) VAE-based fragment generation creating irregular, archaeologically-realistic piece shapes; (d) Random shuffling pr…

**Figure 4.** Qualitative comparison: real archaeological fragments …

**Figure 5.** PCA embedding of geometric features (63.2% vari…

**Figure 6.** Fragment Generator Architecture. Our VAE encodes 128×128 binary fragment masks through four convolutional layers into a 64-dimensional latent space, then reconstructs synthetic fragments via transposed convolutions. The reparameterization trick enables sampling diverse fragments during training while maintaining archaeological realism.

**Figure 7.** Distribution comparison via box plots. Real (RePAIR) fragments shown in blue, synthetic (VAE) fragments in orange. Boxes indicate interquartile ranges (IQR), horizontal lines show medians, whiskers extend to 1.5×IQR, and circles represent outliers. Core shape properties (area, solidity) exhibit high similarity, while edge complexity metrics show expected smoothing effects from VAE reconstruction.

**Figure 8.** PuzzleFlow Architecture. Individual puzzle fragments are processed through a pretrained ViT backbone to extract 768-dimensional visual features. These features are combined with position embeddings (encoding current fragment placements) and time embeddings (encoding flow matching timestep), then passed through 4 additional transformer layers for cross-piece reasoning. The output head predicts logits over …

**Figure 9.** Qualitative Results. Representative examples of PuzzleFlow solving GAP puzzles. Top rows: Successful reconstructions on GAP-3 (left) and GAP-5 (right) with heavily eroded fragments. Bottom rows: Challenging failure cases where erosion or visual ambiguity leads to errors. PuzzleFlow successfully handles irregular fragment geometries and leverages global visual patterns, though some puzzles with extreme eros…
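
The box-plot conventions in the Figure 7 caption (median line, IQR box, whiskers tied to 1.5×IQR, circles for outliers) can be made concrete with a short sketch. The sample values are hypothetical and not taken from the paper. Two choices are assumptions: the quartile convention (`method="inclusive"`), and reading "whiskers extend to 1.5×IQR" as extending to the most extreme data point inside the fences, which is the common Tukey-style convention.

```python
import statistics

def box_plot_stats(values):
    """Median, IQR, whisker ends, and outliers under the 1.5 x IQR rule."""
    data = sorted(values)
    # Quartile convention is an assumption; plotting libraries differ slightly.
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "median": median,
        "iqr": iqr,
        "whiskers": (inside[0], inside[-1]),  # most extreme points inside the fences
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Hypothetical feature values (e.g. fragment areas in arbitrary units); not paper data.
stats = box_plot_stats([1, 2, 3, 4, 5, 6, 7, 8, 100])
```

Running the same function on a real and a synthetic sample of one feature gives the numbers behind one pair of boxes in such a plot.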

Original abstract

Jigsaw puzzle solving has been an increasingly popular task in the computer vision research community. Recent works have utilized cutting-edge architectures and computational approaches to reassemble groups of pieces into a coherent image, while achieving increasingly good results on well established datasets. However, most of these approaches share a common, restricting setting: operating solely on strictly square puzzle pieces. In this work, we introduce GAP, a set of novel jigsaw puzzles datasets containing synthetic, heavily eroded pieces of unrestricted shapes, generated by a learned distribution of real-world archaeological fragments. We also introduce PuzzleFlow, a novel ViT and Flow-Matching based framework for jigsaw puzzle solving, capable of handling complex puzzle pieces and demonstrating superior performance on GAP when compared to both classic and recent prominent works in this domain.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note private letter to a colleague

GAP and PuzzleFlow extend square jigsaw work to irregular eroded pieces but stay inside synthetic data with no real fragment tests.

This paper's main contribution is the GAP dataset of synthetic jigsaw pieces with irregular, eroded shapes drawn from real archaeological fragments, plus PuzzleFlow, which uses a Vision Transformer and flow matching to solve them. It does a solid job of shifting away from the standard square-piece setup that most prior work uses. Generating the pieces from a learned distribution of actual fragments is a good way to make the problem more realistic than random cuts or perfect squares. The claim that PuzzleFlow outperforms classic and recent methods on this data is the core result.

The soft spot is the lack of any test on physical fragments. The title talks about handling real-world archaeological pieces, but all experiments are on the synthetic GAP set. If the learned erosion model misses important real-world variations in shape, texture, or damage, the performance advantage may not hold up outside the simulation. The abstract mentions superior performance without numbers or details on the protocol, which makes it hard to judge how strong the evidence is.

This work is aimed at computer vision researchers focused on puzzle assembly or applications in cultural heritage. A reader working on similar reconstruction tasks would find the dataset useful to try. It shows clear thinking in adapting existing architectures to a new setting.

I would send it for peer review. The new dataset alone is worth referee attention, and the authors can address the synthetic-to-real gap in revisions.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces the GAP dataset of synthetic jigsaw puzzles featuring heavily eroded, irregularly shaped pieces generated from a learned distribution of real-world archaeological fragments. It proposes PuzzleFlow, a ViT and flow-matching framework for reassembling such pieces, and claims superior performance relative to classic and recent baselines on GAP.

Significance. If substantiated, the work extends jigsaw puzzle solving beyond square pieces to irregular eroded fragments, offering a new benchmark (GAP) and method (PuzzleFlow) with potential relevance to digital archaeology. The use of a learned distribution to synthesize realistic damage is a constructive idea, and the ViT/flow-matching combination is a fresh technical direction for the task. The primary limitation is that all claims remain within the synthetic regime.

major comments (2)

[Abstract] Abstract: the assertion of 'superior performance on GAP' is presented without any metrics, baselines, error bars, or experimental protocol. This is load-bearing for the central empirical claim and must be remedied with quantitative results in the experimental section.
[Introduction / Experiments] Introduction and Experiments sections: the title and abstract promise utility for 'real world archaeological fragments,' yet all reported results are confined to synthetic GAP data generated from a learned distribution. No direct evaluation on physical shards is described, leaving the transferability of the performance gap untested and the weakest assumption (faithful reproduction of real fracture, erosion, and imaging statistics) unaddressed.

minor comments (1)

[Abstract] Abstract: a single sentence summarizing the quantitative gains (e.g., accuracy or IoU improvement) would improve clarity for readers.
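
The reporting these comments ask for (a metric plus error bars) can be sketched as follows. Per-piece placement accuracy is an assumed metric here, since this page does not state which metrics the paper reports. The permutations are hypothetical 9-piece examples, not results from the paper.

```python
import math
import statistics

def placement_accuracy(predicted, truth):
    """Fraction of pieces assigned to their ground-truth grid cell."""
    if len(predicted) != len(truth):
        raise ValueError("permutations must have the same length")
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

def mean_and_stderr(values):
    """Sample mean and standard error of the mean (needs at least two values)."""
    mean = statistics.fmean(values)
    stderr = statistics.stdev(values) / math.sqrt(len(values))
    return mean, stderr

# Hypothetical predictions for three 9-piece puzzles; not paper data.
truth = list(range(9))
predictions = [
    [0, 1, 2, 3, 4, 5, 6, 7, 8],   # fully correct
    [1, 0, 2, 3, 4, 5, 6, 7, 8],   # two pieces swapped
    [0, 1, 2, 3, 4, 5, 8, 6, 7],   # three pieces cycled among cells
]
per_puzzle = [placement_accuracy(p, truth) for p in predictions]
mean_acc, stderr_acc = mean_and_stderr(per_puzzle)
```

Reporting `mean_acc` together with `stderr_acc` for each method and dataset, computed on the same test puzzles, is one simple way to give the abstract's "superior performance" a checkable form.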

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback. We address each major comment below and have revised the manuscript to improve clarity and completeness where feasible.

Referee: [Abstract] Abstract: the assertion of 'superior performance on GAP' is presented without any metrics, baselines, error bars, or experimental protocol. This is load-bearing for the central empirical claim and must be remedied with quantitative results in the experimental section.

Authors: We agree that the abstract should quantitatively support the central claim. The experimental section already details the metrics, baselines, error bars, and protocol for PuzzleFlow versus prior methods on GAP. In the revised manuscript we have updated the abstract to include key quantitative results (e.g., accuracy and reconstruction metrics with comparisons) and a brief reference to the evaluation protocol.

Revision: yes

Referee: [Introduction / Experiments] Introduction and Experiments sections: the title and abstract promise utility for 'real world archaeological fragments,' yet all reported results are confined to synthetic GAP data generated from a learned distribution. No direct evaluation on physical shards is described, leaving the transferability of the performance gap untested and the weakest assumption (faithful reproduction of real fracture, erosion, and imaging statistics) unaddressed.

Authors: We acknowledge the limitation. GAP is synthesized from a learned distribution of real archaeological fragments to capture irregular shapes and erosion patterns, providing a scalable and controlled benchmark. We have revised the introduction and experiments to explicitly state that all quantitative results are on synthetic data, to describe the validation steps used when learning the fragment distribution, and to note that direct transfer to physical shards remains untested. Direct evaluation on real artifacts is not feasible in the current work due to limited access to physical shards and imaging equipment; we have added a dedicated limitations paragraph and future-work statement addressing this gap.

Revision: partial

standing simulated objections not resolved

Direct evaluation on physical archaeological shards cannot be added without new real-world data collection and imaging, which is outside the scope and resources of the present study.

Circularity Check

0 steps flagged

No significant circularity; claims rest on new dataset and independent benchmark comparisons

The paper introduces GAP as a new synthetic dataset generated from a learned distribution of real archaeological fragments and proposes PuzzleFlow as a novel ViT + flow-matching architecture. Superiority is claimed via direct empirical comparisons against classic and recent baselines on held-out portions of GAP. No equations, self-citations, or fitted parameters are shown to reduce the reported performance metrics to the generation process or prior author work by construction. The evaluation remains statistically independent of the input distribution once the train/test split is performed, satisfying the criteria for a self-contained result against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 2 invented entities

Review based on abstract only; no explicit free parameters, axioms, or invented entities beyond the high-level dataset and model names are described.

invented entities (2)

GAP dataset (no independent evidence)
purpose: Benchmark for jigsaw solving with irregular eroded pieces
Synthetic pieces generated from learned distribution of real archaeological fragments

PuzzleFlow framework (no independent evidence)
purpose: Solver for complex non-square puzzle pieces
Combines ViT and flow-matching

pith-pipeline@v0.9.0 · 5436 in / 1143 out tokens · 60149 ms · 2026-05-13T05:54:53.350716+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

IndisputableMonolith/Foundation/RealityFromDistinction.lean · reality_from_one_distinction · unclear
Paper passage: PuzzleFlow, a novel ViT and Flow-Matching based framework for jigsaw puzzle solving... superior performance on GAP

IndisputableMonolith/Cost/FunctionalEquation.lean · washburn_uniqueness_aczel · unclear
Paper passage: VAE trained on 958 binary masks from the RePAIR dataset... irregular, archaeologically-realistic piece shapes

What do these tags mean?

matches: The paper's claim is directly supported by a theorem in the formal canon.
supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
uses: The paper appears to rely on the theorem as machinery.
contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

88 extracted references · 88 canonical work pages · 4 internal anchors

[1]

Dis- crete tabu search for graph matching

Kamil Adamczewski, Yumin Suh, and Kyoung Mu Lee. Dis- crete tabu search for graph matching. InProceedings of the IEEE international conference on computer vision, pages 109–117, 2015. 7

work page 2015
[2]

A generative flow for conditional sampling via optimal transport.arXiv preprint arXiv:2307.04102, 2023

Jason Alfonso, Ricardo Baptista, Anupam Bhakta, Noam Gal, Alfin Hou, Isa Lyubimova, Daniel Pocklington, Josef Sajonz, Giulio Trigila, and Ryan Tsai. A generative flow for conditional sampling via optimal transport.arXiv preprint arXiv:2307.04102, 2023. 5

work page arXiv 2023
[3]

Devel- opment of captcha system based on puzzle

Firkhan Ali Bin Hamid Ali and Farhana Bt Karim. Devel- opment of captcha system based on puzzle. In2014 interna- tional conference on computer, communications, and control technology (I4CT), pages 426–428. IEEE, 2014. 1

work page 2014
[4]

Solving jigsaw puz- zles with eroded boundaries

Dov Bridger, Dov Danon, and Ayellet Tal. Solving jigsaw puz- zles with eroded boundaries. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 3526–3535, 2020. 2

work page 2020
[5]

Domain generalization by solving jigsaw puzzles

Fabio M Carlucci, Antonio D’Innocente, Silvia Bucci, Bar- bara Caputo, and Tatiana Tommasi. Domain generalization by solving jigsaw puzzles. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2229–2238, 2019. 1

work page 2019
[6]

Jigsaw-vit: Learning jigsaw puzzles in vision transformer.Pattern Recognition Letters, 166:53–60,

Yingyi Chen, Xi Shen, Yahui Liu, Qinghua Tao, and Jo- han AK Suykens. Jigsaw-vit: Learning jigsaw puzzles in vision transformer.Pattern Recognition Letters, 166:53–60,

work page
[7]

A prob- abilistic image jigsaw puzzle solver

Taeg Sang Cho, Shai Avidan, and William T Freeman. A prob- abilistic image jigsaw puzzle solver. In2010 IEEE Computer society conference on computer vision and pattern recogni- tion, pages 183–190. IEEE, 2010. 4, 6

work page 2010
[8]

A multiscale method for the reassembly of two-dimensional fragmented objects.IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9):1239–1251, 2002

Helena Cristina da Gama Leitao and Jorge Stolfi. A multiscale method for the reassembly of two-dimensional fragmented objects.IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9):1239–1251, 2002. 1

work page 2002
[9]

Jigsaw puzzles, edge matching, and polyomino packing: Connections and complexity.Graphs and Combinatorics, 23(Suppl 1):195– 208, 2007

Erik D Demaine and Martin L Demaine. Jigsaw puzzles, edge matching, and polyomino packing: Connections and complexity.Graphs and Combinatorics, 23(Suppl 1):195– 208, 2007. 2

work page 2007
[10]

Solving archae- ological puzzles.Pattern Recognition, 119:108065, 2021

Niv Derech, Ayellet Tal, and Ilan Shimshoni. Solving archae- ological puzzles.Pattern Recognition, 119:108065, 2021. 1

work page 2021
[11]

An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

Alexey Dosovitskiy. An image is worth 16x16 words: Trans- formers for image recognition at scale.arXiv preprint arXiv:2010.11929, 2020. 5, 4

work page internal anchor Pith review Pith/arXiv arXiv 2010
[12]

Seq2seq models reconstruct visual jigsaw puzzles without seeing them

Gur Elkin, Ofir Itzhak Shahar, and Ohad Ben-Shahar. Seq2seq models reconstruct visual jigsaw puzzles without seeing them. arXiv preprint arXiv:2511.06315, 2025. 3, 6, 7, 4

work page arXiv 2025
[13]

Freeman and L

H. Freeman and L. Garder. Apictorial jigsaw puzzles: The computer solution of a problem in pattern recognition.IEEE Transactions on Electronic Computers, EC-13(2):118–127,

work page
[14]

A novel image based captcha using jigsaw puzzle

Haichang Gao, Dan Yao, Honggang Liu, Xiyang Liu, and Liming Wang. A novel image based captcha using jigsaw puzzle. In2010 13th IEEE international conference on com- putational science and engineering, pages 351–356. IEEE,

work page
[15]

A test of the” jigsaw puzzle” model for protein folding by mul- tiple methionine substitutions within the core of t4 lysozyme

Nadine C Gassner, Walter A Baase, and Brian W Matthews. A test of the” jigsaw puzzle” model for protein folding by mul- tiple methionine substitutions within the core of t4 lysozyme. Proceedings of the National Academy of Sciences, 93(22): 12155–12158, 1996. 1

work page 1996
[16]

Positional diffusion: Graph-based diffusion models for set ordering.Pattern Recognition Letters, 186:272–278, 2024

Francesco Giuliari, Gianluca Scarpellini, Stefano Fiorini, Stu- art James, Pietro Morerio, Yiming Wang, and Alessio Del Bue. Positional diffusion: Graph-based diffusion models for set ordering.Pattern Recognition Letters, 186:272–278, 2024. 2

work page 2024
[17]

From square pieces to brick walls: The next challenge in solving jigsaw puzzles

Shir Gur and Ohad Ben-Shahar. From square pieces to brick walls: The next challenge in solving jigsaw puzzles. InPro- ceedings of the IEEE international conference on computer vision, pages 4029–4037, 2017. 6

work page 2017
[18]

Pic- torial and apictorial polygonal jigsaw puzzles from arbitrary number of crossing cuts.International Journal of Computer Vision, 132(9):3428–3462, 2024

Peleg Harel, Ofir Itzhak Shahar, and Ohad Ben-Shahar. Pic- torial and apictorial polygonal jigsaw puzzles from arbitrary number of crossing cuts.International Journal of Computer Vision, 132(9):3428–3462, 2024. 1

work page 2024
[19]

Solving jigsaw puzzles with vision transformers.Pattern Analysis and Applications, 28(2):110, 2025

Ga¨el Heck, Nicolas Lerm ´e, and Sylvie Le H ´egarat-Mascle. Solving jigsaw puzzles with vision transformers.Pattern Analysis and Applications, 28(2):110, 2025. 2

work page 2025
[20]

Reassemblenet: Learnable keypoints and diffusion for 2d fresco reconstruction.arXiv preprint arXiv:2505.21117, 2025

Adeela Islam, Stefano Fiorini, Stuart James, Pietro Morerio, and Alessio Del Bue. Reassemblenet: Learnable keypoints and diffusion for 2d fresco reconstruction.arXiv preprint arXiv:2505.21117, 2025. 1

work page arXiv 2025
[21]

Jigsaw puzzle solving as a consistent labeling problem

Marina Khoroshiltseva, Ben Vardi, Alessandro Torcinovich, Arianna Traviglia, Ohad Ben-Shahar, and Marcello Pelillo. Jigsaw puzzle solving as a consistent labeling problem. In International Conference on Computer Analysis of Images and Patterns, pages 392–402. Springer, 2021. 2

work page 2021
[22]

Solving jigsaw puzzles by predicting fragment’s coordinate based on vision transformer.Expert Systems with Applications, 272: 126776, 2025

Garam Kim, Hyeonseong Cho, and Hyoungsik Nam. Solving jigsaw puzzles by predicting fragment’s coordinate based on vision transformer.Expert Systems with Applications, 272: 126776, 2025. 2, 6, 7, 4

work page 2025
[23]

Auto-Encoding Variational Bayes

Diederik P Kingma and Max Welling. Auto-encoding vari- ational bayes.arXiv preprint arXiv:1312.6114, 2013. 3, 1

work page internal anchor Pith review Pith/arXiv arXiv 2013
[24]

Scientific puzzle solv- ing: Current techniques and applications

Florian Kleber and Robert Sablatnig. Scientific puzzle solv- ing: Current techniques and applications. InProceedings of the Computer Applications and Quantitative Methods in Archaeology Conference, 2009. 1

work page 2009
[25]

Jigsawnet: Shredded image reassem- bly using convolutional neural network and loop-based com- position.IEEE Transactions on Image Processing, 28(8): 4000–4015, 2019

Canyu Le and Xin Li. Jigsawnet: Shredded image reassem- bly using convolutional neural network and loop-based com- position.IEEE Transactions on Image Processing, 28(8): 4000–4015, 2019. 3

work page 2019
[26]

Jigsawgan: Auxiliary learning for solving jigsaw puzzles with generative adversarial networks.IEEE Transac- tions on Image Processing, 31:513–524, 2021

Ru Li, Shuaicheng Liu, Guangfu Wang, Guanghui Liu, and Bing Zeng. Jigsawgan: Auxiliary learning for solving jigsaw puzzles with generative adversarial networks.IEEE Transac- tions on Image Processing, 31:513–524, 2021. 2, 6, 7 9

work page 2021
[27]

Hi- erarchical fragmented image reassembly using a bundle-of- superpixel representation.Computer Aided Geometric Design, 71:220–230, 2019

Xin Li, Kang Xie, Wenxing Hong, and Celong Liu. Hi- erarchical fragmented image reassembly using a bundle-of- superpixel representation.Computer Aided Geometric Design, 71:220–230, 2019. 3

work page 2019
[28]

Flow Matching for Generative Modeling

Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling. arXiv preprint arXiv:2210.02747, 2022. 5

work page internal anchor Pith review Pith/arXiv arXiv 2022
[29]

Solving masked jigsaw puzzles with diffusion vision transformers

Jinyang Liu, Wondmgezahu Teshome, Sandesh Ghimire, Mario Sznaier, and Octavia Camps. Solving masked jigsaw puzzles with diffusion vision transformers. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 23009–23018, 2024. 2, 6, 7, 4

work page 2024
[30]

Decoupled Weight Decay Regularization

Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization.arXiv preprint arXiv:1711.05101, 2017. 6, 5

work page internal anchor Pith review Pith/arXiv arXiv 2017
[31]

A survey on com- putational solutions for reconstructing complete objects by reassembling their fractured parts

Jiaxin Lu, Yongqing Liang, Huijun Han, Jiacheng Hua, Jun- feng Jiang, Xin Li, and Qixing Huang. A survey on com- putational solutions for reconstructing complete objects by reassembling their fractured parts. InComputer Graphics Forum, page e70081. Wiley Online Library, 2025. 2

work page 2025
[32]

Mitochondrial dna as a genomic jigsaw puzzle.Science, 318(5849):415–415, 2007

William Marande and Gertraud Burger. Mitochondrial dna as a genomic jigsaw puzzle.Science, 318(5849):415–415, 2007. 1

work page 2007
[33]

Jigsaw puzzle solving techniques and applications: a survey.The Visual Computer, 39(10):4405–4421, 2023

Smaragda Markaki and Costas Panagiotakis. Jigsaw puzzle solving techniques and applications: a survey.The Visual Computer, 39(10):4405–4421, 2023. 2

work page 2023
[34]

Self-supervised learning of pretext-invariant representations

Ishan Misra and Laurens van der Maaten. Self-supervised learning of pretext-invariant representations. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6707–6717, 2020. 1

work page 2020
[35]

Unsupervised learning of visual representations by solving jigsaw puzzles

Mehdi Noroozi and Paolo Favaro. Unsupervised learning of visual representations by solving jigsaw puzzles. InEuropean conference on computer vision, pages 69–84. Springer, 2016. 1

work page 2016
[36]

The metropolitan museum of art open access dataset

The Metropolitan Museum of Art. The metropolitan museum of art open access dataset. https : / / www . metmuseum . org / about - the - met / policies - and- documents/open- access , 2017. Dataset li- censed under Creative Commons Zero (CC0). 3, 4, 1

work page 2017
[37]

Solving convex partition visual jigsaw puzzles.arXiv preprint arXiv:2511.04450, 2025

Yaniv Ohayon, Ofir Itzhak Shahar, and Ohad Ben-Shahar. Solving convex partition visual jigsaw puzzles.arXiv preprint arXiv:2511.04450, 2025. 1

work page arXiv 2025
[38]

Deepzzle: Solving visual jigsaw puzzles with deep learn- ing and shortest path optimization.IEEE Transactions on Image Processing, 29:3569–3581, 2020

Marie-Morgane Paumard, David Picard, and Hedi Tabia. Deepzzle: Solving visual jigsaw puzzles with deep learn- ing and shortest path optimization.IEEE Transactions on Image Processing, 29:3569–3581, 2020. 2, 3, 4, 6, 7

work page 2020
[39]

A fully automated greedy square jigsaw puzzle solver

Dolev Pomeranz, Michal Shemesh, and Ohad Ben-Shahar. A fully automated greedy square jigsaw puzzle solver. InCVPR 2011, pages 9–16. IEEE, 2011. 2, 4, 6, 7, 5

work page 2011
[40]

Masked jigsaw puzzle: A versatile position embedding for vision transformers

Bin Ren, Yahui Liu, Yue Song, Wei Bi, Rita Cucchiara, Nicu Sebe, and Wei Wang. Masked jigsaw puzzle: A versatile position embedding for vision transformers. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20382–20391, 2023. 2

work page 2023
[41]

A novel hybrid scheme using genetic algorithms and deep learning for the reconstruction of portuguese tile panels

Daniel Rika, Dror Sholomon, Eli David, and Nathan S Ne- tanyahu. A novel hybrid scheme using genetic algorithms and deep learning for the reconstruction of portuguese tile panels. InProceedings of the genetic and evolutionary computation conference, pages 1319–1327, 2019. 1

work page 2019
[42]

Ten: Twin embedding networks for the jigsaw puzzle problem with eroded boundaries.arXiv preprint arXiv:2203.06488, 2022

Daniel Rika, Dror Sholomon, Eli David, and Nathan S Ne- tanyahu. Ten: Twin embedding networks for the jigsaw puzzle problem with eroded boundaries.arXiv preprint arXiv:2203.06488, 2022. 2

work page arXiv 2022
[43]

Solv- ing jigsaw puzzles in the wild: Human-guided reconstruction of cultural heritage fragments

Omidreza Safaei, Sinem Aslan, Sebastiano Vascon, Luca Palmieri, Marina Khoroshiltseva, and Marcello Pelillo. Solv- ing jigsaw puzzles in the wild: Human-guided reconstruction of cultural heritage fragments. In2025 IEEE 35th Interna- tional Workshop on Machine Learning for Signal Processing (MLSP), pages 1–6. IEEE, 2025. 1

work page 2025
[44]

Diffassemble: A unified graph-diffusion model for 2d and 3d reassembly

Gianluca Scarpellini, Stefano Fiorini, Francesco Giuliari, Pietro Moreiro, and Alessio Del Bue. Diffassemble: A unified graph-diffusion model for 2d and 3d reassembly. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 28098–28108, 2024. 2, 6, 7

work page 2024
[45]

Pair- wise alignment & compatibility for arbitrarily irregular image fragments.arXiv preprint arXiv:2507.09767, 2025

Ofir Itzhak Shahar, Gur Elkin, and Ohad Ben-Shahar. Pair- wise alignment & compatibility for arbitrarily irregular image fragments.arXiv preprint arXiv:2507.09767, 2025. 1, 2

work page arXiv 2025
[46]

A genetic algorithm-based solver for very large jigsaw puzzles

Dror Sholomon, Omid David, and Nathan S Netanyahu. A genetic algorithm-based solver for very large jigsaw puzzles. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 1767–1774, 2013. 2, 4, 6, 7

work page 2013
[47]

Dnn-buddies: A deep neural network-based estimation metric for the jigsaw puzzle problem

Dror Sholomon, Omid E David, and Nathan S Netanyahu. Dnn-buddies: A deep neural network-based estimation metric for the jigsaw puzzle problem. InInternational Conference on Artificial Neural Networks, pages 170–178. Springer, 2016. 2

work page 2016
[48]

Wall painting recon- struction using a genetic algorithm.Journal on Computing and Cultural Heritage (JOCCH), 11(1):1–17, 2017

Elena Sizikova and Thomas Funkhouser. Wall painting recon- struction using a genetic algorithm.Journal on Computing and Cultural Heritage (JOCCH), 11(1):1–17, 2017. 1

work page 2017
[49]

Siamese-discriminant deep re- inforcement learning for solving jigsaw puzzles with large eroded gaps

Xingke Song, Jiahuan Jin, Chenglin Yao, Shihe Wang, Jian- feng Ren, and Ruibin Bai. Siamese-discriminant deep re- inforcement learning for solving jigsaw puzzles with large eroded gaps. InProceedings of the AAAI Conference on Ar- tificial Intelligence, pages 2303–2311, 2023. 1, 2, 3, 4, 6, 7

work page 2023
[50]

Xingke Song, Xiaoying Yang, Jianfeng Ren, Ruibin Bai, and Xudong Jiang. Solving jigsaw puzzle of large eroded gaps using puzzlet discriminant network. In ICASSP 2023-2023 IEEE international conference on acoustics, speech and signal processing (ICASSP), pages 1–5. IEEE, 2023. 2, 7
[51]

Xingke Song, Jianxu Shangguan, Yiran Li, Jialu Zhang, Jianfeng Ren, Ruibin Bai, Xin Chen, and Xudong Jiang. Ceari: Co-evolutionary agents for reassembling and inpainting puzzles with gaps and missing pieces. In Proceedings of the 33rd ACM International Conference on Multimedia, pages 2634–2642, 2025. 2
[52]

Xingke Song, Xiaoying Yang, Chenglin Yao, Jianfeng Ren, Ruibin Bai, Xin Chen, and Xudong Jiang. ERL-MPP: Evolutionary reinforcement learning with multi-head puzzle perception for solving large-scale jigsaw puzzles of eroded gaps. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 6968–6977, 2025. 2, 7
[53]

Davide Talon, Alessio Del Bue, and Stuart James. Ganzzle: Reframing jigsaw puzzle solving as a retrieval task using a generative mental image. In 2022 IEEE international conference on image processing (ICIP), pages 4083–4087. IEEE,
[54]

Davide Talon, Alessio Del Bue, and Stuart James. Ganzzle++: Generative approaches for jigsaw puzzle solving as local to global assignment in latent spatial representations. Pattern Recognition Letters, 187:35–41, 2025. 2
[55]

Re-assembling the past: The repair dataset and benchmark for real world 2d and 3d puzzle solving. Advances in Neural Information Processing Systems, 37:30076–30105, 2024

Theodore Tsesmelis, Luca Palmieri, Marina Khoroshiltseva, Adeela Islam, Gur Elkin, Ofir I Shahar, Gianluca Scarpellini, Stefano Fiorini, Yaniv Ohayon, Nadav Alali, Sinem Aslan, Pietro Morerio, Sebastiano Vascon, Elena Gravina, Maria Napolitano, Giuseppe Scarpati, Gabriel Zuchtriegel, Alexandra Spühler, Michel Fuchs, Stuart James, Ohad Ben-Shahar, Marce...
[56]

Anna Ukovich, Giovanni Ramponi, Haralambos Doulaverakis, Yiannis Kompatsiaris, and MG Strintzis. Shredded document reconstruction using mpeg-7 standard descriptors. In Proceedings of the Fourth IEEE International Symposium on Signal Processing and Information Technology, 2004., pages 334–337. IEEE, 2004. 1
[57]

Ben Vardi, Alessandro Torcinovich, Marina Khoroshiltseva, Marcello Pelillo, and Ohad Ben-Shahar. Multi-phase relaxation labeling for square jigsaw puzzle solving. arXiv preprint arXiv:2303.14793, 2023. 2
[58]

Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017. 6
[59]

Lucas V Warren, Fernanda Quaglio, Claudio Riccomini, Marcello Guimarães Simões, Daniel Gustavo Poire, Nicolás Misailidis Strikis, Luis E Anelli, and Pedro Carlos Strikis. The puzzle assembled: Ediacaran guide fossil cloudina reveals an old proto-gondwana seaway. Geology, 42(5):391–394, 2014. 1
[60]

Chen Wei, Lingxi Xie, Xutong Ren, Yingda Xia, Chi Su, Jiaying Liu, Qi Tian, and Alan L Yuille. Iterative reorganization with weak spatial constraints: Solving arbitrary jigsaw puzzles for unsupervised representation learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 1910–1919, 2019. 1
[61]

Andrew R Willis and David B Cooper. Computational reconstruction of ancient artifacts. IEEE Signal processing magazine, 25(4):65–83, 2008. 1
[62]

Hedong Xu, Jing Zheng, Ziwei Zhuang, and Suohai Fan. A solution to reconstruct cross-cut shredded text documents based on character recognition and genetic algorithm. In Abstract and applied analysis, page 829602. Wiley Online Library, 2014. 1
[63]

Zhuoning Xu and Xinyan Liu. Vlhsa: Vision-language hierarchical semantic alignment for jigsaw puzzle solving with eroded gaps. arXiv preprint arXiv:2509.25202, 2025. 2, 7
[64]

Rui Yu, Chris Russell, and Lourdes Agapito. Solving jigsaw puzzles with linear programming. arXiv preprint arXiv:1511.04472, 2015. 2
[65]

Kang Zhang and Xin Li. A graph-based optimization algorithm for fragmented image reassembly. Graphical Models, 76(5):484–495, 2014. 3
[66]

Fuqing Zhao, Xuan He, Yi Zhang, Wenchang Lei, Weimin Ma, Chuck Zhang, and Houbin Song. A jigsaw puzzle inspired algorithm for solving large-scale no-wait flow shop scheduling problems. Applied Intelligence, 50:87–100, 2020. 1

The Missing GAP: From Solving Square Jigsaw Puzzles to Handling Real World Archaeological Fragments
Supplementary Material
[67]

Introduction

This supplementary material provides comprehensive technical documentation for all components of our work. We organize the content into three main sections: (1) the GAP dataset generation pipeline and statistical validation (Section 8), (2) complete implementation details and training configurations for PuzzleFlow and all baseline methods...
[68]

GAP Dataset: Generation and Validation

This section details the complete pipeline for generating the GAP (Generated Archaeological-fragments Puzzles) datasets, including the fragment generator architecture, training procedure, source data collection, and comprehensive statistical validation against real archaeological fragments.

8.1. Fragment Generator ...
[69]

API Query: Query collectionapi.metmuseum.org for public domain objects

[70]

Filtering: Apply isPublicDomain=True AND title NOT LIKE '%fragment%' to ensure that the selected images are indeed categorized as public domain, while filtering out images of already fragmented artifacts
[71]

Sampling: Random selection of 40,000 unique object IDs (20,000 for GAP-3, 20,000 for GAP-5)

[72]

Download: Parallel retrieval with 20 workers and retry logic for failed requests

[73]

Storage: Full-resolution primary images with complete metadata

Figure 6. Fragment Generator Architecture. Our VAE encodes 128×128 binary fragment masks through four convolutional layers into a 64-dimensional latent space, then reconstructs synthetic fragments via transposed convolutions. The reparameterization trick enables sampling diverse fragments during training while maintaining archaeological realism.
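The reparameterization trick mentioned in the Figure 6 caption can be illustrated with a minimal sketch. It is not the authors' implementation: only the 64-dimensional latent size comes from the caption, while the function name `reparameterize`, the log-variance parameterization, and the plain-Python lists are assumptions made for illustration.

```python
import math
import random

LATENT_DIM = 64  # latent size stated in the Figure 6 caption


def reparameterize(mu, logvar, rng):
    """Draw z = mu + sigma * eps, with eps ~ N(0, 1) and sigma = exp(0.5 * logvar).

    The randomness lives entirely in eps, so z stays a deterministic
    (differentiable) function of mu and logvar. Hypothetical helper, not
    taken from the paper's code.
    """
    if len(mu) != len(logvar):
        raise ValueError("mu and logvar must have the same length")
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, logvar)]


# Example: one latent sample around a zero mean with unit variance.
rng = random.Random(0)
z = reparameterize([0.0] * LATENT_DIM, [0.0] * LATENT_DIM, rng)
```

In a full VAE the encoder would produce `mu` and `logvar` from a fragment mask, and the decoder would map `z` back to a mask; both networks are omitted here.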
[74]

Metadata: CSV files with object ID, title, artist information, date/period, department, culture, medium, and dimensions (where available in the MET’s original metadata)

8.2.2. Collection Diversity Statistics

Analysis of the 40,000 collected images reveals exceptional temporal, geographical, and medium diversity:

Departmental Distribution:
• 19 unique ...
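The filtering and sampling steps of the collection pipeline listed earlier can be sketched offline as follows. This is an illustrative sketch, not the authors' script. The record fields `objectID`, `title`, and `isPublicDomain` follow the Met's public object records. The helper names, the case-insensitive reading of `NOT LIKE '%fragment%'`, the fixed seed, and the even split are assumptions. No network request is made.

```python
import random

# Endpoint pattern of the Met's public collection API, shown for context only.
OBJECT_URL = "https://collectionapi.metmuseum.org/public/collection/v1/objects/{object_id}"


def keep_record(record):
    """Filtering step: isPublicDomain=True AND title NOT LIKE '%fragment%'.

    Assumption: the LIKE match is treated as case-insensitive.
    """
    title = record.get("title") or ""
    return bool(record.get("isPublicDomain")) and "fragment" not in title.lower()


def sample_ids(records, n_total, seed=0):
    """Sampling step: draw n_total unique object IDs from the filtered records."""
    ids = sorted({r["objectID"] for r in records if keep_record(r)})
    if n_total > len(ids):
        raise ValueError("not enough eligible objects to sample from")
    return random.Random(seed).sample(ids, n_total)


def split_in_half(ids):
    """Split the sample evenly, mirroring the 20,000 / 20,000 GAP-3 / GAP-5 split."""
    half = len(ids) // 2
    return ids[:half], ids[half:]
```

With the real collection, `sample_ids(records, 40000)` followed by `split_in_half` would reproduce the sizes quoted in the sampling step; downloading and retry logic are left out of this sketch.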
[75]

Area $A = \sum_{i,j} M(i, j)$: Total number of foreground pixels, providing an absolute size measure in px²
[76]

Perimeter $P$: Length of the fragment boundary computed via contour tracing, measured in pixels. Captures edge extent and complexity
[77]

Aspect Ratio $r = w_{\text{bbox}} / h_{\text{bbox}}$: Ratio of minimum bounding rectangle width to height. Note that original fragments were normalized to square bounding boxes (aspect ratio ≈ 1) pre-training to ensure consistent input dimensions (128×128 pixels), resulting in distributions centered near unity
[78]

Solidity $S = A / A_{\text{hull}}$: Ratio of fragment area to its convex hull area. $S = 1$ for convex fragments; $S < 1$ quantifies boundary concavity depth. Formally, $A_{\text{hull}} = \text{Area}(\text{ConvexHull}(M))$
[79]

Circularity $C = 4\pi A / P^2$: Isoperimetric quotient comparing shape to a circle. $C = 1$ for perfect circles; $C < 1$ for irregular shapes. Invariant to scaling
[80]

Compactness $K = P^2 / A$: Inverse measure of shape efficiency. Lower values indicate more compact shapes; higher values reflect irregular boundaries. Related to circularity by $K = 4\pi / C$
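The six shape descriptors defined above can be computed from a binary mask with a short, dependency-free sketch. It is illustrative only and makes several assumptions that differ from the text: the perimeter is approximated by counting exposed pixel edges rather than by contour tracing (which overestimates diagonal boundaries), the bounding rectangle is axis-aligned, and the convex hull is taken over pixel corners so that solidity never exceeds 1. All function names are hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) < 3:
        return pts

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula for a simple polygon."""
    n = len(vertices)
    twice = sum(vertices[k][0] * vertices[(k + 1) % n][1]
                - vertices[(k + 1) % n][0] * vertices[k][1] for k in range(n))
    return abs(twice) / 2.0


def shape_metrics(mask):
    """mask: list of rows of 0/1 values, i.e. M(i, j) in the text."""
    fg = [(i, j) for i, row in enumerate(mask) for j, v in enumerate(row) if v]
    if not fg:
        raise ValueError("mask has no foreground pixels")
    fg_set = set(fg)
    area = len(fg)  # A = sum of M(i, j)
    # Assumption: perimeter = number of pixel edges facing background.
    perimeter = sum((i + di, j + dj) not in fg_set
                    for i, j in fg
                    for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)))
    rows = [i for i, _ in fg]
    cols = [j for _, j in fg]
    width = max(cols) - min(cols) + 1   # axis-aligned bounding box
    height = max(rows) - min(rows) + 1
    corners = [(j + dx, i + dy) for i, j in fg for dx in (0, 1) for dy in (0, 1)]
    hull_area = polygon_area(convex_hull(corners))
    return {
        "area": area,
        "perimeter": perimeter,
        "aspect_ratio": width / height,
        "solidity": area / hull_area,
        "circularity": 4 * math.pi * area / perimeter ** 2,
        "compactness": perimeter ** 2 / area,
    }
```

Because of the edge-count perimeter, absolute circularity values from this sketch are lower than those obtained with contour tracing; the relation $K = 4\pi / C$ holds regardless of how the perimeter is measured.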
