Semantic Triplet Restoration: A Novel Protocol for Hierarchical Table Understanding in Large Language Models

Dingrui Yang; Fangxin Shang; Yibin Zhao; Yuqi Wang

arxiv: 2605.31550 · v1 · pith:6LC74A7Unew · submitted 2026-05-29 · 💻 cs.CL

Pith reviewed 2026-06-28 22:18 UTC · model grok-4.3

classification 💻 cs.CL

keywords: table question answering, semantic triplets, hierarchical tables, large language models, token reduction, table understanding, query routing

The pith

Semantic triplets restore table hierarchy for LLMs and reduce input tokens versus HTML.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Table question answering depends on recovering relations hidden in two-dimensional layouts, merged cells, and hierarchical headers. Standard pipelines serialize tables as HTML or Markdown, which adds markup and leaves models to infer alignments from row and column spans. The paper proposes Semantic Triplet Restoration (STR), which rewrites every cell as an atomic fact of the form <item path, feature path, value>, and introduces TripletQL, a lightweight router that selects a suitable rendering or filtered subset of triplets for a given question. Experiments across four Chinese and English table-QA benchmarks show that the triplet form matches or exceeds HTML baselines while using fewer tokens. The advantage widens when the model is small or the table context is long.

Core claim

The paper claims that rewriting each cell as an explicit semantic triplet lets large language models recover implicit hierarchical relations without the overhead of layout markup. Each triplet consists of an item path for the row-wise entity, a feature path for the hierarchical attribute, and the cell value. The paper further claims that a query-aware router can select appropriate renderings or filtered subsets, so that table question answering performance matches or exceeds HTML-based methods at lower token counts.

What carries the argument

Semantic Triplet Restoration protocol that converts cells to <item path, feature path, value> triplets together with the TripletQL query-aware router for selecting renderings.
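
The page does not give the paper's path construction rules, so the following is only a minimal sketch of the general idea, under stated assumptions: a table is a list of rows of `(text, rowspan, colspan)` cells, the first `header_rows` rows are column headers, and the first `header_cols` columns are row headers. The names `expand_grid`, `dedupe`, `to_triplets`, and `render` are hypothetical and are not taken from the STR code release.

```python
from typing import Dict, Iterable, List, Tuple

Cell = Tuple[str, int, int]  # (text, rowspan, colspan); assumed input format
Triplet = Tuple[Tuple[str, ...], Tuple[str, ...], str]


def expand_grid(rows: List[List[Cell]]) -> List[List[str]]:
    """Copy each spanning cell's text into every grid position it covers."""
    grid: Dict[Tuple[int, int], str] = {}
    for r, row in enumerate(rows):
        c = 0
        for text, rowspan, colspan in row:
            while (r, c) in grid:  # skip positions filled by earlier spans
                c += 1
            for dr in range(rowspan):
                for dc in range(colspan):
                    grid[(r + dr, c + dc)] = text
            c += colspan
    n_rows = max(r for r, _ in grid) + 1
    n_cols = max(c for _, c in grid) + 1
    return [[grid.get((r, c), "") for c in range(n_cols)] for r in range(n_rows)]


def dedupe(path: Iterable[str]) -> Tuple[str, ...]:
    """Drop empty labels and adjacent repeats produced by span expansion."""
    out: List[str] = []
    for label in path:
        if label and (not out or out[-1] != label):
            out.append(label)
    return tuple(out)


def to_triplets(rows: List[List[Cell]], header_rows: int, header_cols: int) -> List[Triplet]:
    """Emit one (item path, feature path, value) triplet per data cell."""
    grid = expand_grid(rows)
    triplets: List[Triplet] = []
    for r in range(header_rows, len(grid)):
        item_path = dedupe(grid[r][:header_cols])
        for c in range(header_cols, len(grid[r])):
            feature_path = dedupe(grid[h][c] for h in range(header_rows))
            triplets.append((item_path, feature_path, grid[r][c]))
    return triplets


def render(triplet: Triplet) -> str:
    item, feature, value = triplet
    return f"{' > '.join(item)} | {' > '.join(feature)} | {value}"


# Two header rows ("Sales" spans Q1 and Q2); "East" spans two data rows.
rows = [
    [("Region", 2, 1), ("City", 2, 1), ("Sales", 1, 2)],
    [("Q1", 1, 1), ("Q2", 1, 1)],
    [("East", 2, 1), ("Shanghai", 1, 1), ("10", 1, 1), ("12", 1, 1)],
    [("Hangzhou", 1, 1), ("7", 1, 1), ("9", 1, 1)],
]
triplets = to_triplets(rows, header_rows=2, header_cols=2)
for t in triplets:
    print(render(t))  # first line: East > Shanghai | Sales > Q1 | 10
```

Collapsing adjacent repeats in `dedupe` is a design choice of this sketch: it keeps a header that spans several header rows from appearing twice in a path, but it would also merge two genuinely identical adjacent labels.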

If this is right

STR matches or exceeds HTML baselines on four table-QA benchmarks in Chinese and English.
STR uses fewer input tokens than HTML representations of the same tables.
The token and accuracy benefits of STR increase as model size decreases.
The benefits of STR increase as table context length increases.

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The triplet format may allow direct integration with semantic parsers that operate on path-like structures rather than markup.
Smaller models deployed under token budgets could become viable for table tasks that currently require larger models.
The approach might generalize to other grid-like or hierarchical data such as forms or spreadsheets if the same path-based encoding is applied.

Load-bearing premise

Converting cells to <item path, feature path, value> triplets preserves all necessary hierarchical and alignment information without loss, and TripletQL can choose renderings without introducing new errors or omissions.
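
One concrete way this premise could fail is when two different cells receive the same item path and feature path, so the triplets no longer distinguish them. The sketch below is an editorial illustration, not a check described in the paper; it assumes triplets are plain `(item_path, feature_path, value)` tuples, and `find_collisions` is a hypothetical helper name.

```python
from collections import Counter
from typing import List, Tuple

Triplet = Tuple[Tuple[str, ...], Tuple[str, ...], str]
PathKey = Tuple[Tuple[str, ...], Tuple[str, ...]]


def find_collisions(triplets: List[Triplet]) -> List[PathKey]:
    """Return every (item path, feature path) pair that occurs more than once."""
    counts = Counter((item, feature) for item, feature, _ in triplets)
    return [key for key, n in counts.items() if n > 1]


# Hypothetical data: two table rows carry identical row labels.
triplets = [
    (("East", "Shanghai"), ("Sales", "Q1"), "10"),
    (("East", "Shanghai"), ("Sales", "Q2"), "12"),
    (("East", "Shanghai"), ("Sales", "Q1"), "11"),  # same paths, different value
]
collisions = find_collisions(triplets)
print(collisions)
```

An empty result does not prove the conversion is lossless; it only rules out this one kind of ambiguity.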

What would settle it

An evaluation on a held-out table-QA benchmark where the triplet method produces lower accuracy than an HTML baseline on the same questions despite using comparable or fewer tokens.

Figures

Figures reproduced from arXiv: 2605.31550 by Dingrui Yang, Fangxin Shang, Yibin Zhao, Yuqi Wang.

**Figure 2.** Processing pipeline of the TripletQL agent.

**Figure 3.** Visualization of visual grid prediction and cell

**Figure 4.** Input token cost by sub-task. [Plot text captured with this caption: panels "Accuracy vs Context Scale" (x-axis: Context Scale, 8k to 64k; y-axis: Accuracy (%)) and "Token Usage Scaling" (y-axis: Avg. Input Tokens), each comparing HTML Baseline with STR (Ours); annotated token changes: -49.3%, -49.0%, -48.2%, -51.9%.]

**Figure 5.** Token scaling on TQA-Bench.

4.4 Cross-Model Results. The same pattern appears across GLM-4.5-Air, LongCat-Flash-Lite, and Qwen3-0.6B: the smaller the model, the larger the gain from STR.

| Model | HTML F1 (↑) | Agent F1 (↑) | ∆ (Relative) |
| --- | --- | --- | --- |
| GLM-4.5-Air | 91.05 | 92.69 | +1.64 (+1.80%) |
| LongCat-Lite | 85.61 | 89.15 | +3.54 (+4.13%) |
| Qwen3-0.6B | 46.34 | 51.44 | +5.10 (+11.00%) |

**Figure 6.** Performance gains across model scales. The gain is larger on smaller models. With HTML, the model still has to read structural tags and rebuild the 2D layout for itself. STR does that work before reasoning and passes the semantic relations directly to the model, which is why smaller models benefit the most.

5 Conclusion. We presented Semantic Triplet Restoration (STR), which rewrites tables into explicit s…

**Figure 8.** DeepSeek-OCR collapse case (2): structured

**Figure 9.** Visualization of the representative-sub-task

Original abstract

Table question answering requires models to recover semantic relations encoded implicitly by two-dimensional layout, merged cells, and hierarchical headers. Current pipelines typically use HTML or Markdown as intermediate table representations, but these layout-oriented serializations introduce markup overhead and require large language models to infer header-cell alignments from row and column spans. We propose Semantic Triplet Restoration (STR), a protocol that rewrites each cell as an atomic fact <item path, feature path, value>, where the item path specifies the row-wise entity, the feature path specifies the hierarchical attribute, and the value contains the cell content. We also present TripletQL, a lightweight query-aware router that uses STR to select an appropriate rendering or filtered subset of triplets for each question. Across four Chinese and English table-QA benchmarks, STR matches or improves upon HTML-based baselines while reducing input tokens. The relative benefit grows for smaller language models and longer table contexts, suggesting that explicit semantic representations are especially useful under constrained inference budgets. Code and data are available at https://github.com/Phoenix-ni/STR.git .
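
The abstract describes TripletQL only as a lightweight query-aware router, without giving its selection logic. To make the idea of a "filtered subset of triplets" concrete, the sketch below substitutes a plain keyword-overlap heuristic; it is not the paper's router. The function names `words` and `select_triplets`, the tuple layout, and the sample data are all assumptions made for illustration.

```python
import re
from typing import List, Set, Tuple

Triplet = Tuple[Tuple[str, ...], Tuple[str, ...], str]


def words(text: str) -> Set[str]:
    """Lowercased word set; a crude stand-in for real query analysis."""
    return set(re.findall(r"\w+", text.lower()))


def select_triplets(question: str, triplets: List[Triplet]) -> List[Triplet]:
    """Keep the triplets whose path labels share the most words with the question.

    Falls back to the full list when no label overlaps, so nothing is dropped
    for questions that need the whole table.
    """
    query = words(question)
    scored = []
    for triplet in triplets:
        item_path, feature_path, _ = triplet
        labels = words(" ".join(item_path + feature_path))
        scored.append((len(labels & query), triplet))
    best = max((score for score, _ in scored), default=0)
    if best == 0:
        return list(triplets)
    return [triplet for score, triplet in scored if score == best]


# Hypothetical triplets for a small sales table.
table_triplets = [
    (("East", "Shanghai"), ("Sales", "Q1"), "10"),
    (("East", "Shanghai"), ("Sales", "Q2"), "12"),
    (("East", "Hangzhou"), ("Sales", "Q1"), "7"),
    (("East", "Hangzhou"), ("Sales", "Q2"), "9"),
]
selected = select_triplets("What were Q2 sales in Hangzhou?", table_triplets)
print(selected)
```

Word matching of this kind would not segment Chinese text, and any filter of this sort can drop a triplet the question actually needs, which is the omission risk the editorial analysis raises about the selection step.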

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note: private letter to a colleague

STR rewrites tables as explicit triplets to cut tokens and help small models on QA, but without path construction rules the gains cannot be attributed to the representation.

The main takeaway is that this paper replaces HTML table serializations with a semantic triplet format plus a query router called TripletQL, and reports that the change matches or beats baselines on four table-QA benchmarks while using fewer tokens, especially for smaller models and longer tables.

What is new is the explicit separation of row-wise entities from hierarchical attributes instead of relying on layout markup and spans. The released code and data make the protocol testable, which is a practical step forward from the abstract alone.

The approach is straightforward and targets a real pain point in table pipelines, where models must infer alignments. Releasing the GitHub link lets others check the implementation directly.

The soft spots sit in the missing details. The abstract gives no rules for building paths when cells are merged or headers are multi-level, so it is unclear whether the triplet step actually preserves all row-entity and attribute alignments or quietly drops information. TripletQL's selection step adds another possible source of omission. No error bars, baseline descriptions, or statistical tests are supplied, which leaves the performance numbers unverified.

This is for people working on table QA and efficient inference who want to try different serializations. A reader focused on smaller models or long contexts could get value from the idea and the code.

It deserves a serious referee because the core change is simple and the artifacts are public, even if the writeup needs expansion on construction rules and experiments. I would send it to review rather than desk reject.

Referee Report

2 major / 1 minor

Summary. The manuscript introduces Semantic Triplet Restoration (STR), a protocol that rewrites each table cell as an atomic triplet <item path, feature path, value> to explicitly encode row-wise entities and hierarchical attributes, along with TripletQL, a lightweight query-aware router that selects renderings or filtered subsets. The central empirical claim is that STR matches or exceeds HTML-based baselines across four Chinese and English table-QA benchmarks while reducing input tokens, with relative gains increasing for smaller language models and longer table contexts.

Significance. If the results hold after verification of the representation rules and experimental details, the approach could provide a more token-efficient alternative to layout-oriented serializations for hierarchical table understanding, particularly under constrained inference budgets. The public release of code and data at the cited GitHub repository is a clear strength that aids reproducibility.

major comments (2)

[Abstract] Abstract: performance results on four benchmarks are stated without any description of experimental setup, baselines, error bars, or statistical tests. This prevents verification of whether the data support the claimed improvements and is load-bearing for the paper's primary contribution.
[STR protocol description] The construction rules for item paths and feature paths (especially under merged cells, row/column spans, and multi-level headers) are not provided. The central claim that the triplet format preserves all necessary hierarchical alignments and semantics without loss (required to attribute gains to the representation rather than TripletQL or model behavior) cannot be evaluated without these rules.

minor comments (1)

Notation for the triplet format is introduced clearly in the abstract but would benefit from an explicit example table showing path construction for a merged cell.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for these constructive comments. We address each major point below and will revise the manuscript accordingly to improve verifiability.

Referee: [Abstract] Abstract: performance results on four benchmarks are stated without any description of experimental setup, baselines, error bars, or statistical tests. This prevents verification of whether the data support the claimed improvements and is load-bearing for the paper's primary contribution.

Authors: We agree that the abstract lacks sufficient experimental context. In the revised manuscript we will expand the abstract to briefly name the four benchmarks, note the HTML baselines, and indicate that results report means with standard deviations from multiple runs along with statistical tests detailed in Section 4. revision: yes
Referee: [STR protocol description] The construction rules for item paths and feature paths (especially under merged cells, row/column spans, and multi-level headers) are not provided. The central claim that the triplet format preserves all necessary hierarchical alignments and semantics without loss (required to attribute gains to the representation rather than TripletQL or model behavior) cannot be evaluated without these rules.

Authors: We acknowledge that the current manuscript does not supply explicit construction rules for item and feature paths under merged cells, spans, or multi-level headers. We will add a dedicated subsection with formal rules and examples for these cases to demonstrate semantic preservation and to support attribution of gains to the triplet representation. revision: yes

Circularity Check

0 steps flagged

No circularity: empirical protocol evaluated on external benchmarks

The paper defines STR as a rewriting protocol into <item path, feature path, value> triplets and TripletQL as a selection router, then reports empirical results on four table-QA benchmarks. No equations, fitted parameters, or derivations are present that reduce a claimed prediction to the input representation by construction. The performance claim is tested against external HTML baselines rather than being forced by the definition of the triplet format itself. Any self-citations (none visible in the provided text) would not be load-bearing for the central empirical result.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 2 invented entities

Review is based solely on the abstract; full paper may contain additional parameters or assumptions not visible here.

axioms (1)

Domain assumption: The triplet format <item path, feature path, value> captures all relevant hierarchical semantic relations present in the original table layout.
This premise underpins the claim that STR is at least as effective as HTML while using fewer tokens.

invented entities (2)

Semantic Triplet Restoration (STR) protocol (no independent evidence)
Purpose: Rewrite table cells as explicit atomic facts for LLM consumption.
New method introduced to replace layout-oriented serializations.
TripletQL router (no independent evidence)
Purpose: Query-aware selection of triplet renderings or subsets.
New lightweight component paired with STR.

pith-pipeline@v0.9.1-grok · 5723 in / 1307 out tokens · 29933 ms · 2026-06-28T22:18:34.613861+00:00 · methodology
