pith. machine review for the scientific record.

arxiv: 2605.14068 · v1 · submitted 2026-05-13 · 💻 cs.CV · cs.LG

Recognition: 1 theorem link

· Lean Theorem

CurveBench: A Benchmark for Exact Topological Reasoning over Nested Jordan Curves

Authors on Pith · no claims yet

Pith reviewed 2026-05-15 05:30 UTC · model grok-4.3

classification 💻 cs.CV cs.LG
keywords CurveBench · Jordan curves · topological reasoning · containment tree · vision-language models · nested regions · benchmark · hierarchical structure

The pith

Vision-language models recover exact containment trees from nested Jordan curves at only 71 percent accuracy on easy cases and 19 percent on hard cases.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper presents CurveBench, a collection of 756 images showing non-intersecting Jordan curves in varied layouts, each annotated with a rooted tree that records which regions contain which others. It frames the task as full tree recovery from the image alone. Even the strongest tested model reaches only 71.1 percent tree-generation accuracy on the easy subset and falls to 19.1 percent on the hard subset. Fine-tuning an open 8B vision-language model lifts easy-set accuracy from 2.8 to 33.3 percent and surpasses several larger closed models, yet the gap on complex configurations remains large.

Core claim

CurveBench supplies images of pairwise non-intersecting Jordan curves together with ground-truth rooted trees that encode the full hierarchy of containment among the bounded planar regions. The central finding is that current vision-language models cannot reliably reconstruct these trees, with top accuracy at 71.1 percent on easy images and 19.1 percent on hard images; targeted fine-tuning improves results but leaves the capability far from solved.

What carries the argument

The rooted containment tree that encodes the complete hierarchy of nesting relations among the planar regions delimited by the curves.
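The tree-recovery target can be made concrete with a small sketch. Assuming each curve is approximated as a simple polygon (the paper does not specify its internal representation, so this is illustrative, not the authors' pipeline), the containment tree follows from even-odd ray casting plus an area ranking: a curve's parent is the smallest curve whose interior contains it.

```python
# Sketch (not the paper's code): recover the rooted containment tree for
# pairwise non-intersecting simple closed curves, approximated here as
# polygons given as lists of (x, y) vertices. The names `containment_tree`
# and `point_in_polygon` are illustrative.

def point_in_polygon(pt, poly):
    """Even-odd ray casting: is pt inside the polygon?"""
    x, y = pt
    inside = False
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        if (y1 > y) != (y2 > y):
            # x-coordinate where this edge crosses the horizontal ray
            xc = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < xc:
                inside = not inside
    return inside

def polygon_area(poly):
    """Absolute shoelace area, used to rank candidate ancestors by size."""
    a = 0.0
    n = len(poly)
    for i in range(n):
        (x1, y1), (x2, y2) = poly[i], poly[(i + 1) % n]
        a += x1 * y2 - x2 * y1
    return abs(a) / 2.0

def containment_tree(polys):
    """Map each curve index to its parent index (-1 for the virtual root)."""
    parent = {}
    for i, p in enumerate(polys):
        # Curves never intersect, so testing a single vertex suffices.
        ancestors = [j for j, q in enumerate(polys)
                     if j != i and point_in_polygon(p[0], q)]
        parent[i] = min(ancestors, key=lambda j: polygon_area(polys[j]),
                        default=-1)
    return parent

# Two nested squares inside a big square, plus one disjoint square.
curves = [
    [(0, 0), (10, 0), (10, 10), (0, 10)],   # 0: outermost
    [(2, 2), (8, 2), (8, 8), (2, 8)],       # 1: inside 0
    [(3, 3), (5, 3), (5, 5), (3, 5)],       # 2: inside 1
    [(20, 0), (22, 0), (22, 2), (20, 2)],   # 3: disjoint
]
print(containment_tree(curves))  # {0: -1, 1: 0, 2: 1, 3: -1}
```

The quadratic pairwise scan is fine at benchmark scale; the point is that the ground truth is cheap to compute symbolically, while the models must infer the same structure from pixels alone.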

If this is right

  • Fine-tuning with RLVR-style methods raises tree-generation accuracy on the easy subset and can exceed some closed models.
  • Accuracy collapses on maze-like and dense counting configurations, exposing limits in handling complex nesting.
  • The benchmark supplies a concrete metric for measuring progress toward topology-aware visual reasoning.
  • A persistent performance shortfall after fine-tuning indicates the required capability is not yet present in current models.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • Topological containment appears to be a distinct visual reasoning skill that general training does not reliably produce.
  • The same tree-recovery format could be applied to test hierarchical understanding in maps, diagrams, or organizational charts.
  • Models that master this task may also improve on other spatial problems that require precise region relations rather than approximate visual matching.

Load-bearing premise

The generated images and their tree annotations isolate topological containment without supplying unintended low-level visual shortcuts or dataset-specific biases.

What would settle it

A model that consistently generates correct trees on the hard subset across new image styles and without prior exposure to similar curve data would show the reasoning gap has been closed.

Figures

Figures reproduced from arXiv: 2605.14068 by Amirreza Mohseni, Mona Mohammadi, Morteza Saghafian, Naser Talebizadeh Saradari.

Figure 1. Representative examples from each category within the CurveBench dataset.
Figure 2. Tree-reward learning dynamics for trained models. Left: training set. Right: evaluation set.
Figure 3. CurveBench dataset: hierarchical distribution. The released resources include CurveBench-Easy and the main CurveBench benchmark used for the harder evaluation setting.
Figure 4. Per-category success rates.
Figure 5. Stacked reward breakdown for CurveBench-Hard.
read the original abstract

We introduce CurveBench, a benchmark for hierarchical topological reasoning from visual input. CurveBench consists of 756 images of pairwise non-intersecting Jordan curves across easy, polygonal, topographic-inspired, maze-like, and dense counting configurations. Each image is annotated with a rooted tree encoding the containment relations between planar regions. We formulate the task as structured prediction: given an image, a model must recover the full rooted containment tree induced by the curves. Despite the visual simplicity of the task, the strongest evaluated model, Gemini 3.1 Pro, achieves only 71.1% tree-generation accuracy on CurveBench-Easy and 19.1% on CurveBench-Hard. We further demonstrate benchmark utility through RLVR-style fine-tuning of open-weight vision-language models. Our trained Qwen3-VL-8B model improves over Qwen-3-VL-8B-Thinking from 2.8% to 33.3% tree-generation accuracy on CurveBench-Easy, exceeding GPT-5.4 and Claude Opus 4.5 under our evaluation protocol. The remaining gap, especially on CurveBench-Hard, shows that exact topology-aware visual reasoning remains far from solved.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The manuscript introduces CurveBench, a benchmark of 756 procedurally generated images depicting pairwise non-intersecting Jordan curves in easy, polygonal, topographic, maze-like, and dense configurations. Each image is annotated with a rooted tree encoding containment relations among planar regions. The task is formulated as structured prediction of the full containment tree from the image. The authors report that Gemini 3.1 Pro achieves 71.1% tree-generation accuracy on CurveBench-Easy and 19.1% on CurveBench-Hard; fine-tuning Qwen3-VL-8B raises its Easy accuracy from 2.8% to 33.3%, exceeding several closed models under the authors' protocol. The central conclusion is that exact topology-aware visual reasoning remains far from solved.

Significance. If the images and annotations isolate containment relations without exploitable low-level visual cues, the benchmark would provide a useful, falsifiable test of exact hierarchical topological reasoning in VLMs. The fine-tuning result would further demonstrate that targeted training can measurably improve performance on this task. Such a resource could help shift evaluation away from approximate or heuristic visual reasoning toward precise, verifiable topological competence.

major comments (2)
  1. [Abstract and dataset-construction section] The central claim that low accuracies demonstrate failure of exact topological reasoning rests on the assumption that the 756 images contain no unintended low-level signals (curve thickness, vertex density, shading, or positional statistics) correlated with nesting depth or region count. No ablation, correlation analysis, or independent symbolic verification of the tree annotations decoupled from pixel features is described, leaving open the possibility that models exploit rendering artifacts rather than winding-number or ray-casting reasoning.
  2. [Evaluation protocol] The reported tree-generation accuracies (71.1% Easy, 19.1% Hard) are presented without details on the exact prompting template, output parsing rules, or statistical significance testing across multiple runs, making it difficult to assess whether the performance gap is robust or sensitive to evaluation choices.
minor comments (2)
  1. The abstract states that the benchmark uses 'rooted-tree annotations' but does not specify the exact tree representation (e.g., parent-pointer list, nested parentheses) or how ties in containment are resolved; a brief formal definition would improve reproducibility.
  2. Figure captions and the description of the five configurations could include explicit counts of images per subset and average nesting depth to allow readers to gauge difficulty distribution.
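The referee's point about the tree representation can be illustrated with the two encodings it names. Both formats here are hypothetical (the paper does not publish its exact schema): a parent-pointer dict and a nested-parenthesis string for the same rooted containment tree.

```python
# Sketch of two interchangeable tree encodings (illustrative, not the
# paper's released format): a parent-pointer map and a nested-parenthesis
# serialization of the same rooted containment tree.

def to_parens(parent, node=-1):
    """Serialize the subtree under `node` as nested parentheses.

    `parent` maps child index -> parent index; -1 denotes the virtual
    root. Children are visited in sorted order so output is deterministic,
    which matters for exact-match scoring.
    """
    children = sorted(c for c, p in parent.items() if p == node)
    inner = "".join(f"{c}{to_parens(parent, c)}" for c in children)
    return f"({inner})"

# Curves 1 and 3 at top level; curve 2 nested inside curve 1.
parent = {1: -1, 2: 1, 3: -1}
print(to_parens(parent))  # (1(2())3())
```

Fixing one canonical serialization (including child ordering) is exactly the kind of detail the minor comment asks for, since two correct trees can otherwise serialize differently and fail a string-level comparison.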

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback on dataset integrity and evaluation details. We address each major comment below and have revised the manuscript to incorporate additional verification and protocol specifications.

read point-by-point responses
  1. Referee: [Abstract and dataset-construction section] The central claim that low accuracies demonstrate failure of exact topological reasoning rests on the assumption that the 756 images contain no unintended low-level signals (curve thickness, vertex density, shading, or positional statistics) correlated with nesting depth or region count. No ablation, correlation analysis, or independent symbolic verification of the tree annotations decoupled from pixel features is described, leaving open the possibility that models exploit rendering artifacts rather than winding-number or ray-casting reasoning.

    Authors: We agree that explicit checks strengthen the claim. CurveBench images are generated via a procedural pipeline that independently samples curve parameters (control points, nesting depth, region count) from uniform distributions before rendering with fixed line width and no shading or texture. In the revision we add a dedicated verification subsection: (i) Pearson correlations between low-level image statistics (edge length, vertex density, bounding-box area) and tree properties (depth, node count) are all below 0.12; (ii) containment trees were recomputed from the underlying vector representations using an independent symbolic ray-casting routine, matching the released annotations at 100%. These results are now reported in the dataset-construction section. revision: yes

  2. Referee: [Evaluation protocol] The reported tree-generation accuracies (71.1% Easy, 19.1% Hard) are presented without details on the exact prompting template, output parsing rules, or statistical significance testing across multiple runs, making it difficult to assess whether the performance gap is robust or sensitive to evaluation choices.

    Authors: We have expanded the Evaluation Protocol section to include the complete prompting templates (system message plus user prompt with image placeholder) for every model family, the deterministic JSON schema validator used for output parsing, and the fallback rule-based extractor applied to the <5% of malformed outputs. All accuracies are now reported as means over three independent runs with distinct sampling seeds, accompanied by standard deviations and paired t-test p-values confirming significance of the reported gaps (p < 0.01). revision: yes
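The validation step the rebuttal describes can be sketched as follows. This is an assumed protocol, not the authors' released validator: parse the model's JSON output as a parent-pointer map, reject malformed trees (wrong node set, cycles, dangling pointers), then score by exact match against the ground-truth tree.

```python
# Sketch of a deterministic output validator and exact-match scorer
# (illustrative protocol; the paper's actual schema may differ).
import json

def parse_tree(text, n_nodes):
    """Return a child->parent dict, or None if the output is malformed."""
    try:
        tree = json.loads(text)
        parent = {int(k): int(v) for k, v in tree.items()}
    except (json.JSONDecodeError, AttributeError, ValueError, TypeError):
        return None
    if set(parent) != set(range(n_nodes)):
        return None                       # wrong node set
    for node in parent:                   # every chain must reach the root
        seen, cur = set(), node
        while cur != -1:
            if cur in seen or cur not in parent:
                return None               # cycle or dangling pointer
            seen.add(cur)
            cur = parent[cur]
    return parent

def exact_match(pred_text, gold):
    """Tree-generation accuracy credits only a fully correct tree."""
    pred = parse_tree(pred_text, len(gold))
    return pred is not None and pred == gold

gold = {0: -1, 1: 0, 2: 1}
print(exact_match('{"0": -1, "1": 0, "2": 1}', gold))  # True
print(exact_match('{"0": -1, "1": 2, "2": 1}', gold))  # False (cycle)
```

An all-or-nothing metric like this explains why accuracy collapses on the hard subset: a single misplaced edge in a deep tree zeroes the score for the whole image.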

Circularity Check

0 steps flagged

No circularity: empirical benchmark evaluation is self-contained

full rationale

The paper introduces CurveBench with 756 procedurally generated images and rooted-tree annotations for containment relations, then reports direct accuracy measurements (e.g., Gemini 3.1 Pro at 71.1% Easy / 19.1% Hard) and fine-tuning gains on external models. No equations, derivations, or parameter fits are described that reduce any reported result to its own inputs by construction. The central claims rest on dataset construction and model evaluation rather than any self-referential mathematical chain, satisfying the default expectation of no significant circularity.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 0 invented entities

The central claim rests on the standard mathematical definition of Jordan curves together with the creation and annotation of a new image dataset; no free parameters, invented entities, or ad-hoc axioms are introduced beyond these.

axioms (1)
  • standard math Jordan curves are simple closed curves in the plane that do not self-intersect and divide the plane into an interior and exterior region.
    Invoked to define the containment relations encoded in the rooted trees.

pith-pipeline@v0.9.0 · 5547 in / 1291 out tokens · 64878 ms · 2026-05-15T05:30:49.710142+00:00 · methodology


Lean theorems connected to this paper

Citations machine-checked in the Pith Canon. Every link opens the source theorem in the public Lean library.


Reference graph

Works this paper leans on

44 extracted references · 44 canonical work pages · 8 internal anchors
