When TableQA Meets Noise: A Dual Denoising Framework for Complex Questions and Large-scale Tables

Dong Jin; Jian Yang; Shenghao Ye; Shuangwu Chen; Xiaofeng Jiang; Yikai Shen; Yu Guo; Yunpeng Hou

arxiv: 2509.17680 · v2 · pith:FFPGXXUQnew · submitted 2025-09-22 · 💻 cs.CL

Shenghao Ye, Yu Guo, Dong Jin, Yikai Shen, Yunpeng Hou, Shuangwu Chen, Jian Yang, Xiaofeng Jiang

classification 💻 cs.CL

keywords table, reasoning, complex, denoising, questions, tables, large-scale, pruning


Table question answering (TableQA) is a fundamental task in natural language processing (NLP). The strong reasoning capabilities of large language models (LLMs) have brought significant advances in this field. However, as real-world applications involve increasingly complex questions and larger tables, substantial noisy data is introduced, which severely degrades reasoning performance. To address this challenge, we focus on improving two core capabilities: Relevance Filtering, which identifies and retains information truly relevant to reasoning, and Table Pruning, which reduces table size while preserving essential content. Based on these principles, we propose EnoTab, a dual denoising framework for complex questions and large-scale tables. Specifically, we first perform Evidence-based Question Denoising by decomposing the question into minimal semantic units and filtering out those irrelevant to answer reasoning based on consistency and usability criteria. Then, we propose Evidence Tree-guided Table Denoising, which constructs an explicit and transparent table pruning path to remove irrelevant data step by step. At each pruning step, we observe the intermediate state of the table and apply a post-order node rollback mechanism to handle abnormal table states, ultimately producing a highly reliable sub-table for final answer reasoning. Finally, extensive experiments show that EnoTab achieves outstanding performance on TableQA tasks with complex questions and large-scale tables, confirming its effectiveness.
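The step-by-step pruning with rollback described in the abstract can be illustrated with a minimal sketch. This is not EnoTab's implementation: the abstract gives no data structures or criteria, so the `EvidenceNode` class, the `is_abnormal` check (here, simply "the table became empty"), and the example filters are all assumptions made for illustration. The sketch visits a small tree of pruning steps in post-order, inspects the intermediate table after each step, and keeps the previous table whenever a step produces an abnormal state.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List

Row = Dict[str, object]
Table = List[Row]


@dataclass
class EvidenceNode:
    """Hypothetical tree node: one pruning step plus the steps it depends on."""
    name: str
    prune: Callable[[Table], Table]
    children: List["EvidenceNode"] = field(default_factory=list)


def is_abnormal(table: Table) -> bool:
    # Assumption: an empty table is treated as an abnormal state.
    return len(table) == 0


def prune_post_order(node: EvidenceNode, table: Table, log: List[str]) -> Table:
    """Apply children first, then this node; roll back a step that breaks the table."""
    for child in node.children:
        table = prune_post_order(child, table, log)
    candidate = node.prune(table)
    if is_abnormal(candidate):
        log.append(f"rollback:{node.name}")
        return table  # keep the table as it was before this step
    log.append(f"apply:{node.name}")
    return candidate


# Toy table and toy evidence tree (both invented for this example).
table: Table = [
    {"city": "Paris", "year": 2020, "sales": 10},
    {"city": "Paris", "year": 2021, "sales": 12},
    {"city": "Rome", "year": 2021, "sales": 7},
]

tree = EvidenceNode(
    "keep_columns",
    lambda t: [{k: r[k] for k in ("city", "sales")} for r in t],
    children=[
        EvidenceNode("city_is_paris", lambda t: [r for r in t if r["city"] == "Paris"]),
        # This filter matches nothing, so it triggers a rollback.
        EvidenceNode("year_is_1999", lambda t: [r for r in t if r["year"] == 1999]),
    ],
)

log: List[str] = []
sub_table = prune_post_order(tree, table, log)
print(sub_table)
print(log)
```

In this toy run the filter on year 1999 would empty the table, so that step is rolled back while the other two steps are applied. A real system would need a richer notion of "abnormal" than emptiness; the choice here only keeps the example small.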

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

One Refiner to Unlock Them All: Inference-Time Reasoning Elicitation via Reinforcement Query Refinement
cs.CL · 2026-04 · unverdicted · novelty 5.0

ReQueR trains a single RL-based query refiner with an adaptive curriculum to decompose raw queries into structured logic, delivering 1.7-7.2% absolute gains on reasoning tasks across diverse LLMs and generalizing to u...