pith. machine review for the scientific record.

arxiv: 2605.12077 · v1 · submitted 2026-05-12 · 💻 cs.CV · cs.AI

Recognition: 2 theorem links · Lean Theorem

The Missing GAP: From Solving Square Jigsaw Puzzles to Handling Real World Archaeological Fragments

Pith reviewed 2026-05-13 05:54 UTC · model grok-4.3

classification 💻 cs.CV cs.AI
keywords jigsaw puzzle solving · archaeological fragments · flow matching · vision transformer · irregular shapes · eroded pieces · GAP dataset · fragment reassembly

The pith

New GAP datasets and PuzzleFlow framework let computers reassemble irregular eroded fragments better than prior jigsaw solvers.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper creates GAP, a collection of jigsaw puzzle datasets that use synthetic pieces of arbitrary shapes carrying erosion patterns drawn from a learned model of real archaeological fragments. It introduces PuzzleFlow, a method built on Vision Transformers and flow-matching that predicts how to fit these complex pieces together. Earlier work stayed inside the narrow setting of clean square pieces with straight cuts. A reader would care because actual broken artifacts almost never match that square ideal, so a method that works on eroded irregular shards opens a route to digital reconstruction of physical objects.

Core claim

We introduce GAP, a set of novel jigsaw puzzles datasets containing synthetic, heavily eroded pieces of unrestricted shapes, generated by a learned distribution of real-world archaeological fragments. We also introduce PuzzleFlow, a novel ViT and Flow-Matching based framework for jigsaw puzzle solving, capable of handling complex puzzle pieces and demonstrating superior performance on GAP when compared to both classic and recent prominent works in this domain.

What carries the argument

PuzzleFlow, a Vision Transformer paired with flow-matching that models the reassembly of irregular, eroded pieces; the GAP datasets supply the training and test distribution generated from archaeological fragment statistics.
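
As a rough illustration of the flow-matching ingredient, the sketch below uses the generic continuous formulation with a straight-line path between a noise sample and a target. This is an assumption made for illustration, not PuzzleFlow itself: the Figure 8 caption indicates that the model's output head predicts logits, so the paper's variant may well differ. The function names and the toy target vector are hypothetical.

```python
import random

def interpolate(x0, x1, t):
    """Point on the straight path from x0 (t=0) to x1 (t=1)."""
    return [(1.0 - t) * a + t * b for a, b in zip(x0, x1)]

def target_velocity(x0, x1):
    """Regression target for the velocity field along the straight path."""
    return [b - a for a, b in zip(x0, x1)]

def flow_matching_loss(predict_velocity, x0, x1, t):
    """Mean squared error between predicted and target velocity at time t."""
    x_t = interpolate(x0, x1, t)
    pred = predict_velocity(x_t, t)
    target = target_velocity(x0, x1)
    return sum((p - q) ** 2 for p, q in zip(pred, target)) / len(target)

def euler_integrate(velocity, x0, steps=10):
    """Transport x0 to t=1 by integrating dx/dt = velocity(x, t) with Euler steps."""
    x = list(x0)
    dt = 1.0 / steps
    for k in range(steps):
        v = velocity(x, k * dt)
        x = [xi + dt * vi for xi, vi in zip(x, v)]
    return x

# Toy check: an oracle velocity field that already knows the target.
rng = random.Random(0)
x0 = [rng.gauss(0.0, 1.0) for _ in range(4)]  # noise sample
x1 = [0.0, 1.0, 2.0, 3.0]                     # hypothetical target encoding

def oracle(x, t):
    # Stand-in for a trained network; ignores its inputs on purpose.
    return target_velocity(x0, x1)

x_final = euler_integrate(oracle, x0, steps=10)
loss = flow_matching_loss(oracle, x0, x1, t=0.5)
```

In a trained system the oracle would be replaced by a network conditioned on fragment features and the timestep; the sketch only shows the training target and the sampling loop.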

Load-bearing premise

The learned distribution that produces the synthetic eroded pieces in GAP captures enough of the shape and damage variation present in actual physical archaeological shards.

What would settle it

An evaluation of PuzzleFlow on a held-out set of real, non-synthetic archaeological fragments whose erosion statistics were never seen during GAP generation. The premise would fail if PuzzleFlow lost its reported performance advantage there.

Figures

Figures reproduced from arXiv: 2605.12077 by Gur Elkin, Ofir Itzhak Shahar, Ohad Ben-Shahar.

Figure 1
Archaeological Puzzle Reconstruction. A puzzle from GAP-5 dataset (left) features irregularly-shaped, heavily eroded fragments generated from real archaeological artifact distributions. PuzzleFlow (right) successfully reconstructs these challenging puzzles by learning holistic visual relationships across entire fragment surfaces, rather than relying on boundary continuity.
Figure 2
Visual comparison of puzzle erosion patterns across …
Figure 3
Puzzle Generation Pipeline for GAP-3 and GAP-5 Datasets. Both datasets follow the same four-step generation process: (a) Source images from The Metropolitan Museum of Art Open Access collection (CC0 1.0 Universal Public Domain Dedication); (b) Grid overlay defining puzzle piece boundaries; (c) VAE-based fragment generation creating irregular, archaeologically-realistic piece shapes; (d) Random shuffling pr…
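
Steps (b) and (d) of this pipeline can be sketched in a few lines. The example below is a minimal illustration under stated assumptions: the image is a nested list of pixel values, the grid is uniform, and step (c), the VAE-based fragment generation, is omitted entirely. The helper names `split_into_grid` and `shuffle_pieces` are hypothetical and do not come from the paper.

```python
import random

def split_into_grid(image, rows, cols):
    """Cut a 2D image (list of pixel rows) into rows*cols cells, row-major."""
    height, width = len(image), len(image[0])
    cell_h, cell_w = height // rows, width // cols
    cells = []
    for r in range(rows):
        for c in range(cols):
            cell = [row[c * cell_w:(c + 1) * cell_w]
                    for row in image[r * cell_h:(r + 1) * cell_h]]
            cells.append(cell)
    return cells

def shuffle_pieces(cells, seed=0):
    """Return shuffled pieces plus, for each shuffled slot, its true grid index."""
    order = list(range(len(cells)))
    random.Random(seed).shuffle(order)
    return [cells[i] for i in order], order

# Toy 6x6 "image" whose pixel values encode their own coordinates.
image = [[10 * y + x for x in range(6)] for y in range(6)]
cells = split_into_grid(image, rows=3, cols=3)  # 3x3 grid chosen for illustration
pieces, ground_truth = shuffle_pieces(cells, seed=42)
```

Keeping the permutation alongside the shuffled pieces gives a solver's output something to be scored against.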
Figure 4
Qualitative comparison: real archaeological fragments …
Figure 5
PCA embedding of geometric features (63.2% vari…
Figure 6
Fragment Generator Architecture. Our VAE encodes 128×128 binary fragment masks through four convolutional layers into a 64-dimensional latent space, then reconstructs synthetic fragments via transposed convolutions. The reparameterization trick enables sampling diverse fragments during training while maintaining archaeological realism.
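
The reparameterization trick named in this caption can be shown without any deep-learning library. The sketch below assumes the standard VAE formulation of Kingma and Welling, with a diagonal Gaussian posterior and a standard normal prior; the caption states only the 64-dimensional latent size, so the KL term and the placeholder encoder outputs are assumptions rather than details taken from the paper.

```python
import math
import random

LATENT_DIM = 64  # latent size stated in the Figure 6 caption

def reparameterize(mu, log_var, rng):
    """Sample z = mu + sigma * eps with eps ~ N(0, I), sigma = exp(0.5 * log_var)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0)
            for m, lv in zip(mu, log_var)]

def kl_to_standard_normal(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dimensions."""
    return -0.5 * sum(1.0 + lv - m * m - math.exp(lv)
                      for m, lv in zip(mu, log_var))

rng = random.Random(0)
mu = [0.0] * LATENT_DIM       # placeholder encoder outputs, not learned values
log_var = [0.0] * LATENT_DIM
z = reparameterize(mu, log_var, rng)
kl = kl_to_standard_normal(mu, log_var)
```

Because the randomness enters only through `eps`, gradients can flow through `mu` and `log_var` during training; at generation time, new fragments would come from decoding samples drawn from the prior.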
Figure 7
Distribution comparison via box plots. Real (RePAIR) fragments shown in blue, synthetic (VAE) fragments in orange. Boxes indicate interquartile ranges (IQR), horizontal lines show medians, whiskers extend to 1.5×IQR, and circles represent outliers. Core shape properties (area, solidity) exhibit high similarity, while edge complexity metrics show expected smoothing effects from VAE reconstruction.
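
For readers who want the box-plot conventions in this caption spelled out, the sketch below computes them for a single sample. It assumes the common convention that whiskers stop at the most extreme data point within 1.5×IQR of the box, and it uses the "inclusive" quartile method from Python's `statistics` module; the paper's plotting tool may compute quartiles slightly differently. The input values are made up for illustration and are not data from the paper.

```python
import statistics

def box_plot_stats(values):
    """Median, quartiles, 1.5*IQR whisker ends, and outliers for one sample."""
    data = sorted(values)
    q1, median, q3 = statistics.quantiles(data, n=4, method="inclusive")
    iqr = q3 - q1
    low_fence, high_fence = q1 - 1.5 * iqr, q3 + 1.5 * iqr
    inside = [v for v in data if low_fence <= v <= high_fence]
    return {
        "q1": q1, "median": median, "q3": q3, "iqr": iqr,
        "whisker_low": min(inside),   # last data point inside the lower fence
        "whisker_high": max(inside),  # last data point inside the upper fence
        "outliers": [v for v in data if v < low_fence or v > high_fence],
    }

# Made-up feature values for illustration only.
stats = box_plot_stats([0.80, 0.82, 0.85, 0.86, 0.88, 0.90, 0.91, 0.30])
```

Running the same function on a real and a synthetic sample of one feature would give the numbers behind one pair of boxes in such a figure.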
Figure 8
PuzzleFlow Architecture. Individual puzzle fragments are processed through a pretrained ViT backbone to extract 768-dimensional visual features. These features are combined with position embeddings (encoding current fragment placements) and time embeddings (encoding flow matching timestep), then passed through 4 additional transformer layers for cross-piece reasoning. The output head predicts logits over …
Figure 9
Qualitative Results. Representative examples of PuzzleFlow solving GAP puzzles. Top rows: Successful reconstructions on GAP-3 (left) and GAP-5 (right) with heavily eroded fragments. Bottom rows: Challenging failure cases where erosion or visual ambiguity leads to errors. PuzzleFlow successfully handles irregular fragment geometries and leverages global visual patterns, though some puzzles with extreme eros…
original abstract

Jigsaw puzzle solving has been an increasingly popular task in the computer vision research community. Recent works have utilized cutting-edge architectures and computational approaches to reassemble groups of pieces into a coherent image, while achieving increasingly good results on well established datasets. However, most of these approaches share a common, restricting setting: operating solely on strictly square puzzle pieces. In this work, we introduce GAP, a set of novel jigsaw puzzles datasets containing synthetic, heavily eroded pieces of unrestricted shapes, generated by a learned distribution of real-world archaeological fragments. We also introduce PuzzleFlow, a novel ViT and Flow-Matching based framework for jigsaw puzzle solving, capable of handling complex puzzle pieces and demonstrating superior performance on GAP when compared to both classic and recent prominent works in this domain.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces the GAP dataset of synthetic jigsaw puzzles featuring heavily eroded, irregularly shaped pieces generated from a learned distribution of real-world archaeological fragments. It proposes PuzzleFlow, a ViT and flow-matching framework for reassembling such pieces, and claims superior performance relative to classic and recent baselines on GAP.

Significance. If substantiated, the work extends jigsaw puzzle solving beyond square pieces to irregular eroded fragments, offering a new benchmark (GAP) and method (PuzzleFlow) with potential relevance to digital archaeology. The use of a learned distribution to synthesize realistic damage is a constructive idea, and the ViT/flow-matching combination is a fresh technical direction for the task. The primary limitation is that all claims remain within the synthetic regime.

major comments (2)
  1. [Abstract] Abstract: the assertion of 'superior performance on GAP' is presented without any metrics, baselines, error bars, or experimental protocol. This is load-bearing for the central empirical claim and must be remedied with quantitative results in the experimental section.
  2. [Introduction / Experiments] Introduction and Experiments sections: the title and abstract promise utility for 'real world archaeological fragments,' yet all reported results are confined to synthetic GAP data generated from a learned distribution. No direct evaluation on physical shards is described, leaving the transferability of the performance gap untested and the weakest assumption (faithful reproduction of real fracture, erosion, and imaging statistics) unaddressed.
minor comments (1)
  1. [Abstract] Abstract: a single sentence summarizing the quantitative gains (e.g., accuracy or IoU improvement) would improve clarity for readers.

Simulated Author's Rebuttal

2 responses · 1 unresolved

We thank the referee for the constructive feedback. We address each major comment below and have revised the manuscript to improve clarity and completeness where feasible.

point-by-point responses
  1. Referee: [Abstract] Abstract: the assertion of 'superior performance on GAP' is presented without any metrics, baselines, error bars, or experimental protocol. This is load-bearing for the central empirical claim and must be remedied with quantitative results in the experimental section.

    Authors: We agree that the abstract should quantitatively support the central claim. The experimental section already details the metrics, baselines, error bars, and protocol for PuzzleFlow versus prior methods on GAP. In the revised manuscript we have updated the abstract to include key quantitative results (e.g., accuracy and reconstruction metrics with comparisons) and a brief reference to the evaluation protocol.
    revision: yes

  2. Referee: [Introduction / Experiments] Introduction and Experiments sections: the title and abstract promise utility for 'real world archaeological fragments,' yet all reported results are confined to synthetic GAP data generated from a learned distribution. No direct evaluation on physical shards is described, leaving the transferability of the performance gap untested and the weakest assumption (faithful reproduction of real fracture, erosion, and imaging statistics) unaddressed.

    Authors: We acknowledge the limitation. GAP is synthesized from a learned distribution of real archaeological fragments to capture irregular shapes and erosion patterns, providing a scalable and controlled benchmark. We have revised the introduction and experiments to explicitly state that all quantitative results are on synthetic data, to describe the validation steps used when learning the fragment distribution, and to note that direct transfer to physical shards remains untested. Direct evaluation on real artifacts is not feasible in the current work due to limited access to physical shards and imaging equipment; we have added a dedicated limitations paragraph and future-work statement addressing this gap.
    revision: partial

standing simulated objections not resolved
  • Direct evaluation on physical archaeological shards cannot be added without new real-world data collection and imaging, which is outside the scope and resources of the present study.

Circularity Check

0 steps flagged

No significant circularity; claims rest on new dataset and independent benchmark comparisons

full rationale

The paper introduces GAP as a new synthetic dataset generated from a learned distribution of real archaeological fragments and proposes PuzzleFlow as a novel ViT + flow-matching architecture. Superiority is claimed via direct empirical comparisons against classic and recent baselines on held-out portions of GAP. No equations, self-citations, or fitted parameters are shown to reduce the reported performance metrics to the generation process or prior author work by construction. The evaluation remains statistically independent of the input distribution once the train/test split is performed, satisfying the criteria for a self-contained result against external benchmarks.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 2 invented entities

Review based on abstract only; no explicit free parameters, axioms, or invented entities beyond the high-level dataset and model names are described.

invented entities (2)
  • GAP dataset · no independent evidence
    purpose: Benchmark for jigsaw solving with irregular eroded pieces
    Synthetic pieces generated from learned distribution of real archaeological fragments
  • PuzzleFlow framework · no independent evidence
    purpose: Solver for complex non-square puzzle pieces
    Combines ViT and flow-matching

pith-pipeline@v0.9.0 · 5436 in / 1143 out tokens · 60149 ms · 2026-05-13T05:54:53.350716+00:00 · methodology

Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.

What do these tags mean?
  • matches: The paper's claim is directly supported by a theorem in the formal canon.
  • supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
  • extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
  • uses: The paper appears to rely on the theorem as machinery.
  • contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
  • unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.

Reference graph

Works this paper leans on

88 extracted references · 88 canonical work pages · 4 internal anchors

  1. [1]

    Dis- crete tabu search for graph matching

    Kamil Adamczewski, Yumin Suh, and Kyoung Mu Lee. Dis- crete tabu search for graph matching. InProceedings of the IEEE international conference on computer vision, pages 109–117, 2015. 7

  2. [2]

    A generative flow for conditional sampling via optimal transport.arXiv preprint arXiv:2307.04102, 2023

    Jason Alfonso, Ricardo Baptista, Anupam Bhakta, Noam Gal, Alfin Hou, Isa Lyubimova, Daniel Pocklington, Josef Sajonz, Giulio Trigila, and Ryan Tsai. A generative flow for conditional sampling via optimal transport.arXiv preprint arXiv:2307.04102, 2023. 5

  3. [3]

    Devel- opment of captcha system based on puzzle

    Firkhan Ali Bin Hamid Ali and Farhana Bt Karim. Devel- opment of captcha system based on puzzle. In2014 interna- tional conference on computer, communications, and control technology (I4CT), pages 426–428. IEEE, 2014. 1

  4. [4]

    Solving jigsaw puz- zles with eroded boundaries

    Dov Bridger, Dov Danon, and Ayellet Tal. Solving jigsaw puz- zles with eroded boundaries. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 3526–3535, 2020. 2

  5. [5]

    Domain generalization by solving jigsaw puzzles

    Fabio M Carlucci, Antonio D’Innocente, Silvia Bucci, Bar- bara Caputo, and Tatiana Tommasi. Domain generalization by solving jigsaw puzzles. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 2229–2238, 2019. 1

  6. [6]

    Jigsaw-vit: Learning jigsaw puzzles in vision transformer.Pattern Recognition Letters, 166:53–60,

    Yingyi Chen, Xi Shen, Yahui Liu, Qinghua Tao, and Jo- han AK Suykens. Jigsaw-vit: Learning jigsaw puzzles in vision transformer.Pattern Recognition Letters, 166:53–60,

  7. [7]

    A prob- abilistic image jigsaw puzzle solver

    Taeg Sang Cho, Shai Avidan, and William T Freeman. A prob- abilistic image jigsaw puzzle solver. In2010 IEEE Computer society conference on computer vision and pattern recogni- tion, pages 183–190. IEEE, 2010. 4, 6

  8. [8]

    A multiscale method for the reassembly of two-dimensional fragmented objects.IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9):1239–1251, 2002

    Helena Cristina da Gama Leitao and Jorge Stolfi. A multiscale method for the reassembly of two-dimensional fragmented objects.IEEE Transactions on Pattern Analysis and Machine Intelligence, 24(9):1239–1251, 2002. 1

  9. [9]

    Jigsaw puzzles, edge matching, and polyomino packing: Connections and complexity.Graphs and Combinatorics, 23(Suppl 1):195– 208, 2007

    Erik D Demaine and Martin L Demaine. Jigsaw puzzles, edge matching, and polyomino packing: Connections and complexity.Graphs and Combinatorics, 23(Suppl 1):195– 208, 2007. 2

  10. [10]

    Solving archae- ological puzzles.Pattern Recognition, 119:108065, 2021

    Niv Derech, Ayellet Tal, and Ilan Shimshoni. Solving archae- ological puzzles.Pattern Recognition, 119:108065, 2021. 1

  11. [11]

    An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale

    Alexey Dosovitskiy. An image is worth 16x16 words: Trans- formers for image recognition at scale.arXiv preprint arXiv:2010.11929, 2020. 5, 4

  12. [12]

    Seq2seq models reconstruct visual jigsaw puzzles without seeing them

    Gur Elkin, Ofir Itzhak Shahar, and Ohad Ben-Shahar. Seq2seq models reconstruct visual jigsaw puzzles without seeing them. arXiv preprint arXiv:2511.06315, 2025. 3, 6, 7, 4

  13. [13]

    Freeman and L

    H. Freeman and L. Garder. Apictorial jigsaw puzzles: The computer solution of a problem in pattern recognition.IEEE Transactions on Electronic Computers, EC-13(2):118–127,

  14. [14]

    A novel image based captcha using jigsaw puzzle

    Haichang Gao, Dan Yao, Honggang Liu, Xiyang Liu, and Liming Wang. A novel image based captcha using jigsaw puzzle. In2010 13th IEEE international conference on com- putational science and engineering, pages 351–356. IEEE,

  15. [15]

    A test of the” jigsaw puzzle” model for protein folding by mul- tiple methionine substitutions within the core of t4 lysozyme

    Nadine C Gassner, Walter A Baase, and Brian W Matthews. A test of the” jigsaw puzzle” model for protein folding by mul- tiple methionine substitutions within the core of t4 lysozyme. Proceedings of the National Academy of Sciences, 93(22): 12155–12158, 1996. 1

  16. [16]

    Positional diffusion: Graph-based diffusion models for set ordering.Pattern Recognition Letters, 186:272–278, 2024

    Francesco Giuliari, Gianluca Scarpellini, Stefano Fiorini, Stu- art James, Pietro Morerio, Yiming Wang, and Alessio Del Bue. Positional diffusion: Graph-based diffusion models for set ordering.Pattern Recognition Letters, 186:272–278, 2024. 2

  17. [17]

    From square pieces to brick walls: The next challenge in solving jigsaw puzzles

    Shir Gur and Ohad Ben-Shahar. From square pieces to brick walls: The next challenge in solving jigsaw puzzles. InPro- ceedings of the IEEE international conference on computer vision, pages 4029–4037, 2017. 6

  18. [18]

    Pic- torial and apictorial polygonal jigsaw puzzles from arbitrary number of crossing cuts.International Journal of Computer Vision, 132(9):3428–3462, 2024

    Peleg Harel, Ofir Itzhak Shahar, and Ohad Ben-Shahar. Pic- torial and apictorial polygonal jigsaw puzzles from arbitrary number of crossing cuts.International Journal of Computer Vision, 132(9):3428–3462, 2024. 1

  19. [19]

    Solving jigsaw puzzles with vision transformers.Pattern Analysis and Applications, 28(2):110, 2025

    Ga¨el Heck, Nicolas Lerm ´e, and Sylvie Le H ´egarat-Mascle. Solving jigsaw puzzles with vision transformers.Pattern Analysis and Applications, 28(2):110, 2025. 2

  20. [20]

    Reassemblenet: Learnable keypoints and diffusion for 2d fresco reconstruction.arXiv preprint arXiv:2505.21117, 2025

    Adeela Islam, Stefano Fiorini, Stuart James, Pietro Morerio, and Alessio Del Bue. Reassemblenet: Learnable keypoints and diffusion for 2d fresco reconstruction.arXiv preprint arXiv:2505.21117, 2025. 1

  21. [21]

    Jigsaw puzzle solving as a consistent labeling problem

    Marina Khoroshiltseva, Ben Vardi, Alessandro Torcinovich, Arianna Traviglia, Ohad Ben-Shahar, and Marcello Pelillo. Jigsaw puzzle solving as a consistent labeling problem. In International Conference on Computer Analysis of Images and Patterns, pages 392–402. Springer, 2021. 2

  22. [22]

    Solving jigsaw puzzles by predicting fragment’s coordinate based on vision transformer.Expert Systems with Applications, 272: 126776, 2025

    Garam Kim, Hyeonseong Cho, and Hyoungsik Nam. Solving jigsaw puzzles by predicting fragment’s coordinate based on vision transformer.Expert Systems with Applications, 272: 126776, 2025. 2, 6, 7, 4

  23. [23]

    Auto-Encoding Variational Bayes

    Diederik P Kingma and Max Welling. Auto-encoding vari- ational bayes.arXiv preprint arXiv:1312.6114, 2013. 3, 1

  24. [24]

    Scientific puzzle solv- ing: Current techniques and applications

    Florian Kleber and Robert Sablatnig. Scientific puzzle solv- ing: Current techniques and applications. InProceedings of the Computer Applications and Quantitative Methods in Archaeology Conference, 2009. 1

  25. [25]

    Jigsawnet: Shredded image reassem- bly using convolutional neural network and loop-based com- position.IEEE Transactions on Image Processing, 28(8): 4000–4015, 2019

    Canyu Le and Xin Li. Jigsawnet: Shredded image reassem- bly using convolutional neural network and loop-based com- position.IEEE Transactions on Image Processing, 28(8): 4000–4015, 2019. 3

  26. [26]

    Jigsawgan: Auxiliary learning for solving jigsaw puzzles with generative adversarial networks.IEEE Transac- tions on Image Processing, 31:513–524, 2021

    Ru Li, Shuaicheng Liu, Guangfu Wang, Guanghui Liu, and Bing Zeng. Jigsawgan: Auxiliary learning for solving jigsaw puzzles with generative adversarial networks.IEEE Transac- tions on Image Processing, 31:513–524, 2021. 2, 6, 7 9

  27. [27]

    Hi- erarchical fragmented image reassembly using a bundle-of- superpixel representation.Computer Aided Geometric Design, 71:220–230, 2019

    Xin Li, Kang Xie, Wenxing Hong, and Celong Liu. Hi- erarchical fragmented image reassembly using a bundle-of- superpixel representation.Computer Aided Geometric Design, 71:220–230, 2019. 3

  28. [28]

    Flow Matching for Generative Modeling

    Yaron Lipman, Ricky TQ Chen, Heli Ben-Hamu, Maximilian Nickel, and Matt Le. Flow matching for generative modeling. arXiv preprint arXiv:2210.02747, 2022. 5

  29. [29]

    Solving masked jigsaw puzzles with diffusion vision transformers

    Jinyang Liu, Wondmgezahu Teshome, Sandesh Ghimire, Mario Sznaier, and Octavia Camps. Solving masked jigsaw puzzles with diffusion vision transformers. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 23009–23018, 2024. 2, 6, 7, 4

  30. [30]

    Decoupled Weight Decay Regularization

    Ilya Loshchilov and Frank Hutter. Decoupled weight decay regularization.arXiv preprint arXiv:1711.05101, 2017. 6, 5

  31. [31]

    A survey on com- putational solutions for reconstructing complete objects by reassembling their fractured parts

    Jiaxin Lu, Yongqing Liang, Huijun Han, Jiacheng Hua, Jun- feng Jiang, Xin Li, and Qixing Huang. A survey on com- putational solutions for reconstructing complete objects by reassembling their fractured parts. InComputer Graphics Forum, page e70081. Wiley Online Library, 2025. 2

  32. [32]

    Mitochondrial dna as a genomic jigsaw puzzle.Science, 318(5849):415–415, 2007

    William Marande and Gertraud Burger. Mitochondrial dna as a genomic jigsaw puzzle.Science, 318(5849):415–415, 2007. 1

  33. [33]

    Jigsaw puzzle solving techniques and applications: a survey.The Visual Computer, 39(10):4405–4421, 2023

    Smaragda Markaki and Costas Panagiotakis. Jigsaw puzzle solving techniques and applications: a survey.The Visual Computer, 39(10):4405–4421, 2023. 2

  34. [34]

    Self-supervised learning of pretext-invariant representations

    Ishan Misra and Laurens van der Maaten. Self-supervised learning of pretext-invariant representations. InProceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 6707–6717, 2020. 1

  35. [35]

    Unsupervised learning of visual representations by solving jigsaw puzzles

    Mehdi Noroozi and Paolo Favaro. Unsupervised learning of visual representations by solving jigsaw puzzles. InEuropean conference on computer vision, pages 69–84. Springer, 2016. 1

  36. [36]

    The metropolitan museum of art open access dataset

    The Metropolitan Museum of Art. The metropolitan museum of art open access dataset. https : / / www . metmuseum . org / about - the - met / policies - and- documents/open- access , 2017. Dataset li- censed under Creative Commons Zero (CC0). 3, 4, 1

  37. [37]

    Solving convex partition visual jigsaw puzzles.arXiv preprint arXiv:2511.04450, 2025

    Yaniv Ohayon, Ofir Itzhak Shahar, and Ohad Ben-Shahar. Solving convex partition visual jigsaw puzzles.arXiv preprint arXiv:2511.04450, 2025. 1

  38. [38]

    Deepzzle: Solving visual jigsaw puzzles with deep learn- ing and shortest path optimization.IEEE Transactions on Image Processing, 29:3569–3581, 2020

    Marie-Morgane Paumard, David Picard, and Hedi Tabia. Deepzzle: Solving visual jigsaw puzzles with deep learn- ing and shortest path optimization.IEEE Transactions on Image Processing, 29:3569–3581, 2020. 2, 3, 4, 6, 7

  39. [39]

    A fully automated greedy square jigsaw puzzle solver

    Dolev Pomeranz, Michal Shemesh, and Ohad Ben-Shahar. A fully automated greedy square jigsaw puzzle solver. InCVPR 2011, pages 9–16. IEEE, 2011. 2, 4, 6, 7, 5

  40. [40]

    Masked jigsaw puzzle: A versatile position embedding for vision transformers

    Bin Ren, Yahui Liu, Yue Song, Wei Bi, Rita Cucchiara, Nicu Sebe, and Wei Wang. Masked jigsaw puzzle: A versatile position embedding for vision transformers. InProceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 20382–20391, 2023. 2

  41. [41]

    A novel hybrid scheme using genetic algorithms and deep learning for the reconstruction of portuguese tile panels

    Daniel Rika, Dror Sholomon, Eli David, and Nathan S Ne- tanyahu. A novel hybrid scheme using genetic algorithms and deep learning for the reconstruction of portuguese tile panels. InProceedings of the genetic and evolutionary computation conference, pages 1319–1327, 2019. 1

  42. [42]

    Ten: Twin embedding networks for the jigsaw puzzle problem with eroded boundaries.arXiv preprint arXiv:2203.06488, 2022

    Daniel Rika, Dror Sholomon, Eli David, and Nathan S Ne- tanyahu. Ten: Twin embedding networks for the jigsaw puzzle problem with eroded boundaries.arXiv preprint arXiv:2203.06488, 2022. 2

  43. [43]

    Solv- ing jigsaw puzzles in the wild: Human-guided reconstruction of cultural heritage fragments

    Omidreza Safaei, Sinem Aslan, Sebastiano Vascon, Luca Palmieri, Marina Khoroshiltseva, and Marcello Pelillo. Solv- ing jigsaw puzzles in the wild: Human-guided reconstruction of cultural heritage fragments. In2025 IEEE 35th Interna- tional Workshop on Machine Learning for Signal Processing (MLSP), pages 1–6. IEEE, 2025. 1

  44. [44]

    Diffassemble: A unified graph-diffusion model for 2d and 3d reassembly

    Gianluca Scarpellini, Stefano Fiorini, Francesco Giuliari, Pietro Moreiro, and Alessio Del Bue. Diffassemble: A unified graph-diffusion model for 2d and 3d reassembly. InProceed- ings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 28098–28108, 2024. 2, 6, 7

  45. [45]

    Pair- wise alignment & compatibility for arbitrarily irregular image fragments.arXiv preprint arXiv:2507.09767, 2025

    Ofir Itzhak Shahar, Gur Elkin, and Ohad Ben-Shahar. Pair- wise alignment & compatibility for arbitrarily irregular image fragments.arXiv preprint arXiv:2507.09767, 2025. 1, 2

  46. [46]

    A genetic algorithm-based solver for very large jigsaw puzzles

    Dror Sholomon, Omid David, and Nathan S Netanyahu. A genetic algorithm-based solver for very large jigsaw puzzles. InProceedings of the IEEE conference on computer vision and pattern recognition, pages 1767–1774, 2013. 2, 4, 6, 7

  47. [47]

    Dnn-buddies: A deep neural network-based estimation metric for the jigsaw puzzle problem

    Dror Sholomon, Omid E David, and Nathan S Netanyahu. Dnn-buddies: A deep neural network-based estimation metric for the jigsaw puzzle problem. InInternational Conference on Artificial Neural Networks, pages 170–178. Springer, 2016. 2

  48. [48]

    Wall painting recon- struction using a genetic algorithm.Journal on Computing and Cultural Heritage (JOCCH), 11(1):1–17, 2017

    Elena Sizikova and Thomas Funkhouser. Wall painting recon- struction using a genetic algorithm.Journal on Computing and Cultural Heritage (JOCCH), 11(1):1–17, 2017. 1

  49. [49]

    Siamese-discriminant deep re- inforcement learning for solving jigsaw puzzles with large eroded gaps

    Xingke Song, Jiahuan Jin, Chenglin Yao, Shihe Wang, Jian- feng Ren, and Ruibin Bai. Siamese-discriminant deep re- inforcement learning for solving jigsaw puzzles with large eroded gaps. InProceedings of the AAAI Conference on Ar- tificial Intelligence, pages 2303–2311, 2023. 1, 2, 3, 4, 6, 7

  50. [50]

    Solving jigsaw puzzle of large eroded gaps using puzzlet discriminant network

    Xingke Song, Xiaoying Yang, Jianfeng Ren, Ruibin Bai, and Xudong Jiang. Solving jigsaw puzzle of large eroded gaps using puzzlet discriminant network. InICASSP 2023-2023 IEEE international conference on acoustics, speech and sig- nal processing (ICASSP), pages 1–5. IEEE, 2023. 2, 7

  51. [51]

    Ceari: Co-evolutionary agents for reassembling and inpainting puz- zles with gaps and missing pieces

    Xingke Song, Jianxu Shangguan, Yiran Li, Jialu Zhang, Jian- feng Ren, Ruibin Bai, Xin Chen, and Xudong Jiang. Ceari: Co-evolutionary agents for reassembling and inpainting puz- zles with gaps and missing pieces. InProceedings of the 33rd ACM International Conference on Multimedia, pages 2634–2642, 2025. 2

  52. [52]

    ERL-MPP: Evolu- tionary reinforcement learning with multi-head puzzle percep- tion for solving large-scale jigsaw puzzles of eroded gaps

    Xingke Song, Xiaoying Yang, Chenglin Yao, Jianfeng Ren, Ruibin Bai, Xin Chen, and Xudong Jiang. ERL-MPP: Evolu- tionary reinforcement learning with multi-head puzzle percep- tion for solving large-scale jigsaw puzzles of eroded gaps. In Proceedings of the AAAI Conference on Artificial Intelligence, pages 6968–6977, 2025. 2, 7

  53. [53]

    Ganzzle: Reframing jigsaw puzzle solving as a retrieval task using a generative mental image

    Davide Talon, Alessio Del Bue, and Stuart James. Ganzzle: Reframing jigsaw puzzle solving as a retrieval task using a generative mental image. In2022 IEEE international confer- ence on image processing (ICIP), pages 4083–4087. IEEE,

  54. [54]

    Ganzzle++: Generative approaches for jigsaw puzzle solving as local to global assignment in latent spatial representations.Pattern Recognition Letters, 187:35–41, 2025

    Davide Talon, Alessio Del Bue, and Stuart James. Ganzzle++: Generative approaches for jigsaw puzzle solving as local to global assignment in latent spatial representations.Pattern Recognition Letters, 187:35–41, 2025. 2

  55. [55]

    Re-assembling the past: The repair dataset and benchmark for real world 2d and 3d puzzle solving.Advances in Neural Information Process- ing Systems, 37:30076–30105, 2024

    Theodore Tsesmelis, Luca Palmieri, Marina Khoroshiltseva, Adeela Islam, Gur Elkin, Ofir I Shahar, Gianluca Scarpellini, Stefano Fiorini, Yaniv Ohayon, Nadav Alali, Sinem Aslan, Pietro Morerio, Sebastiano Vascon, Elena gravina, Maria Napolitano, Giuseppe Scarpati, Gabriel zuchtriegel, Alexan- dra Sp¨uhler, Michel Fuchs, Stuart James, Ohad Ben-Shahar, Marce...

  56. [56]

    Shredded document reconstruction using mpeg-7 standard descriptors

    Anna Ukovich, Giovanni Ramponi, Haralambos Doulaverakis, Yiannis Kompatsiaris, and MG Strintzis. Shredded document reconstruction using mpeg-7 standard descriptors. In Proceedings of the Fourth IEEE International Symposium on Signal Processing and Information Technology, 2004., pages 334–337. IEEE, 2004. 1

  57. [57]

    Multi-phase relaxation labeling for square jigsaw puzzle solving. arXiv preprint arXiv:2303.14793, 2023

    Ben Vardi, Alessandro Torcinovich, Marina Khoroshiltseva, Marcello Pelillo, and Ohad Ben-Shahar. Multi-phase relaxation labeling for square jigsaw puzzle solving. arXiv preprint arXiv:2303.14793, 2023. 2

  58. [58]

    Attention is all you need. Advances in neural information processing systems, 30, 2017

    Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N Gomez, Łukasz Kaiser, and Illia Polosukhin. Attention is all you need. Advances in neural information processing systems, 30, 2017. 6

  59. [59]

    The puzzle assembled: Ediacaran guide fossil cloudina reveals an old proto-gondwana seaway. Geology, 42(5):391–394, 2014

    Lucas V Warren, Fernanda Quaglio, Claudio Riccomini, Marcello Guimarães Simões, Daniel Gustavo Poire, Nicolás Misailidis Strikis, Luis E Anelli, and Pedro Carlos Strikis. The puzzle assembled: Ediacaran guide fossil cloudina reveals an old proto-gondwana seaway. Geology, 42(5):391–394, 2014. 1

  60. [60]

    Iterative reorganization with weak spatial constraints: Solving arbitrary jigsaw puzzles for unsupervised representation learning

    Chen Wei, Lingxi Xie, Xutong Ren, Yingda Xia, Chi Su, Jiaying Liu, Qi Tian, and Alan L Yuille. Iterative reorganization with weak spatial constraints: Solving arbitrary jigsaw puzzles for unsupervised representation learning. In Proceedings of the IEEE/CVF conference on computer vision and pattern recognition, pages 1910–1919, 2019. 1

  61. [61]

    Computational reconstruction of ancient artifacts. IEEE Signal processing magazine, 25(4):65–83, 2008

    Andrew R Willis and David B Cooper. Computational reconstruction of ancient artifacts. IEEE Signal processing magazine, 25(4):65–83, 2008. 1

  62. [62]

    A solution to reconstruct cross-cut shredded text documents based on character recognition and genetic algorithm

    Hedong Xu, Jing Zheng, Ziwei Zhuang, and Suohai Fan. A solution to reconstruct cross-cut shredded text documents based on character recognition and genetic algorithm. In Abstract and applied analysis, page 829602. Wiley Online Library, 2014. 1

  63. [63]

    Vlhsa: Vision-language hierarchical semantic alignment for jigsaw puzzle solving with eroded gaps. arXiv preprint arXiv:2509.25202, 2025

    Zhuoning Xu and Xinyan Liu. Vlhsa: Vision-language hierarchical semantic alignment for jigsaw puzzle solving with eroded gaps. arXiv preprint arXiv:2509.25202, 2025. 2, 7

  64. [64]

    Solving jigsaw puzzles with linear programming. arXiv preprint arXiv:1511.04472, 2015

    Rui Yu, Chris Russell, and Lourdes Agapito. Solving jigsaw puzzles with linear programming. arXiv preprint arXiv:1511.04472, 2015. 2

  65. [65]

    A graph-based optimization algorithm for fragmented image reassembly. Graphical Models, 76(5):484–495, 2014

    Kang Zhang and Xin Li. A graph-based optimization algorithm for fragmented image reassembly. Graphical Models, 76(5):484–495, 2014. 3

  66. [66]

    A jigsaw puzzle inspired algorithm for solving large-scale no-wait flow shop scheduling problems. Applied Intelligence, 50:87–100, 2020

    Fuqing Zhao, Xuan He, Yi Zhang, Wenchang Lei, Weimin Ma, Chuck Zhang, and Houbin Song. A jigsaw puzzle inspired algorithm for solving large-scale no-wait flow shop scheduling problems. Applied Intelligence, 50:87–100, 2020. 1

    The Missing GAP: From Solving Square Jigsaw Puzzles to Handling Real World Archaeological Fragments
    Supplementary Material

  67. [67]

    Introduction

    This supplementary material provides comprehensive technical documentation for all components of our work. We organize the content into three main sections: (1) the GAP dataset generation pipeline and statistical validation (Section 8), (2) complete implementation details and training configurations for PuzzleFlow and all baseline methods...

  68. [68]

    GAP Dataset: Generation and Validation

    This section details the complete pipeline for generating the GAP (Generated Archaeological-fragments Puzzles) datasets, including the fragment generator architecture, training procedure, source data collection, and comprehensive statistical validation against real archaeological fragments.

    8.1. Fragment Generator ...

  69. [69]

    API Query: Query collectionapi.metmuseum.org for public domain objects

  70. [70]

    Filtering: Apply isPublicDomain=True AND title NOT LIKE '%fragment%' to ensure that selected images are categorized as public domain, while filtering out images of already fragmented artifacts

  71. [71]

    Sampling: Random selection of 40,000 unique object IDs (20,000 for GAP-3, 20,000 for GAP-5)

  72. [72]

    Download: Parallel retrieval with 20 workers and retry logic for failed requests

  73. [73]

    Storage: Full-resolution primary images with complete metadata

    Figure 6. Fragment Generator Architecture. Our VAE encodes 128×128 binary fragment masks through four convolutional layers into a 64-dimensional latent space, then reconstructs synthetic fragments via transposed convolutions. The reparameterization trick enables sampling diverse fragments during training while maintaining archaeological realism.

  74. [74]

    Metadata: CSV files with object ID, title, artist information, date/period, department, culture, medium, and dimensions (where available in the MET's original metadata)

    8.2.2. Collection Diversity Statistics

    Analysis of the 40,000 collected images reveals exceptional temporal, geographical, and medium diversity:

    Departmental Distribution:

    • 19 unique ...
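
The collection steps listed above (filtering, sampling, and parallel download with retries) can be illustrated with a minimal sketch. It is not the authors' code. The record field `objectID`, the case-insensitive reading of `NOT LIKE '%fragment%'`, the even GAP-3/GAP-5 split order, the retry count, and the injected `fetch` callable are all assumptions made for illustration; only the filter condition, the 20 workers, and the 20,000/20,000 split come from the text.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def is_eligible(record):
    """Mirror of: isPublicDomain=True AND title NOT LIKE '%fragment%'.

    Case-insensitive matching is an assumption; SQL LIKE behaviour varies.
    """
    title = (record.get("title") or "").lower()
    return bool(record.get("isPublicDomain")) and "fragment" not in title


def sample_object_ids(records, n_total, seed=0):
    """Pick n_total unique eligible IDs and split them evenly in two."""
    eligible = sorted({r["objectID"] for r in records if is_eligible(r)})
    if len(eligible) < n_total:
        raise ValueError("not enough eligible objects to sample from")
    chosen = random.Random(seed).sample(eligible, n_total)
    half = n_total // 2
    return {"GAP-3": chosen[:half], "GAP-5": chosen[half:]}


def fetch_with_retry(fetch, object_id, max_attempts=3):
    """Call the hypothetical fetch(object_id), retrying on any exception."""
    last_error = None
    for _ in range(max_attempts):
        try:
            return fetch(object_id)
        except Exception as exc:  # broad on purpose: this is only a sketch
            last_error = exc
    raise last_error


def download_all(object_ids, fetch, workers=20, max_attempts=3):
    """Fetch every ID with a thread pool; returns {object_id: result}."""
    ids = list(object_ids)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = pool.map(
            lambda oid: fetch_with_retry(fetch, oid, max_attempts), ids
        )
        return dict(zip(ids, results))
```

In the setting described above, `sample_object_ids(records, 40_000)` would yield 20,000 IDs per dataset, and `fetch` would wrap whatever HTTP client is used against the museum API.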

  75. [75]

    Area $A = \sum_{i,j} M(i,j)$: Total number of foreground pixels, providing an absolute size measure in px²

  76. [76]

    Perimeter $P$: Length of the fragment boundary computed via contour tracing, measured in pixels. Captures edge extent and complexity

  77. [77]

    Aspect Ratio $r = w_{\text{bbox}} / h_{\text{bbox}}$: Ratio of minimum bounding rectangle width to height. Note that original fragments were normalized to square bounding boxes (aspect ratio ≈ 1) pre-training to ensure consistent input dimensions (128×128 pixels), resulting in distributions centered near unity

  78. [78]

    Solidity $S = A / A_{\text{hull}}$: Ratio of fragment area to its convex hull area. $S = 1$ for convex fragments; $S < 1$ quantifies boundary concavity depth. Formally, $A_{\text{hull}} = \mathrm{Area}(\mathrm{ConvexHull}(M))$

  79. [79]

    Circularity $C = 4\pi A / P^2$: Isoperimetric quotient comparing shape to a circle. $C = 1$ for perfect circles; $C < 1$ for irregular shapes. Invariant to scaling

  80. [80]

    Compactness $K = P^2 / A$: Inverse measure of shape efficiency. Lower values indicate more compact shapes; higher values reflect irregular boundaries. Related to circularity by $K = 4\pi / C$
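
The six descriptors above can be computed from a binary mask with a short, dependency-free sketch. It is only an illustration under stated assumptions, not the authors' implementation: the perimeter is approximated by counting exposed pixel edges rather than by contour tracing, the bounding box is axis-aligned, and the convex hull is taken over pixel corners. The function names are hypothetical.

```python
import math


def convex_hull(points):
    """Andrew's monotone chain; returns hull vertices in order."""
    pts = sorted(set(points))

    def cross(o, a, b):
        return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

    def half(seq):
        chain = []
        for p in seq:
            while len(chain) >= 2 and cross(chain[-2], chain[-1], p) <= 0:
                chain.pop()
            chain.append(p)
        return chain[:-1]

    return half(pts) + half(reversed(pts))


def polygon_area(poly):
    """Shoelace formula for a simple polygon."""
    total = 0
    for (x1, y1), (x2, y2) in zip(poly, poly[1:] + poly[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def shape_descriptors(mask):
    """mask: list of rows of 0/1 values with at least one foreground pixel."""
    h, w = len(mask), len(mask[0])
    fg = [(i, j) for i in range(h) for j in range(w) if mask[i][j]]
    if not fg:
        raise ValueError("mask has no foreground pixels")

    def filled(i, j):
        return 0 <= i < h and 0 <= j < w and bool(mask[i][j])

    area = len(fg)  # A = sum of M(i, j)
    # Assumption: perimeter = number of pixel edges facing the background.
    perimeter = sum(
        1
        for i, j in fg
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1))
        if not filled(i + di, j + dj)
    )
    rows = [i for i, _ in fg]
    cols = [j for _, j in fg]
    aspect_ratio = (max(cols) - min(cols) + 1) / (max(rows) - min(rows) + 1)
    # Hull over the corners of every foreground pixel, as (x, y) points.
    corners = {(j + dx, i + dy) for i, j in fg for dx in (0, 1) for dy in (0, 1)}
    hull_area = polygon_area(convex_hull(corners))
    return {
        "area": area,
        "perimeter": perimeter,
        "aspect_ratio": aspect_ratio,
        "solidity": area / hull_area,                      # S = A / A_hull
        "circularity": 4 * math.pi * area / perimeter**2,  # C = 4*pi*A / P^2
        "compactness": perimeter**2 / area,                # K = P^2 / A
    }


if __name__ == "__main__":
    l_shape = [[1, 0, 0], [1, 0, 0], [1, 1, 1]]
    print(shape_descriptors(l_shape))
```

Because the edge-count perimeter overestimates the length of diagonal boundaries, circularity values from this sketch will generally be lower than those obtained with contour tracing; the identity $K = 4\pi / C$ holds regardless of which perimeter estimate is used.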
