EviLink: Multi-Path Schema Linking with Uncertainty-Guided Evidence Acquisition for Large-Scale Text-to-SQL

Chao Hu; Chen Hou; Danqing Huang; Dazhen Deng; Defeng Xie; Haoxuan Li; Haozhe Feng; Huawei Zheng; Peng Chen; Sen Yang

arxiv: 2605.29670 · v1 · pith:EAHR4XWKnew · submitted 2026-05-28 · 💻 cs.CL · cs.AI

Huawei Zheng, Sen Yang, Zhaorui Yang, Yuhui Zhang, Haozhe Feng, Haoxuan Li, Xuan Yi, Chao Hu, Defeng Xie, Chen Hou, Danqing Huang, Wei Chen, Yingcai Wu, Peng Chen, Dazhen Deng

Pith reviewed 2026-06-29 07:50 UTC · model grok-4.3

keywords schema linking · text-to-sql · multi-path inference · uncertainty estimation · evidence acquisition · large-scale databases · sql generation

The pith

Schema linking for Text-to-SQL improves when multiple plausible paths guide uncertainty-based evidence selection.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper establishes that schema linking should not treat selection as a deterministic choice around one SQL realization. Complex questions often admit several valid SQL paths, each requiring different schema items. By generating multiple hypotheses and using uncertainty estimates to separate items that are required regardless of path from those that vary, the system acquires evidence only for the uncertain cases. This reframing produces a better trade-off between completeness, relevance, and token usage. Experiments on BIRD-Dev and Spider2-Snow confirm higher field-level recall at lower average token cost while also lifting the performance of a fixed downstream SQL generator.

Core claim

EviLink reframes schema linking as uncertainty-aware schema-need inference over multiple plausible SQL paths. It combines multi-hypothesis schema grounding with uncertainty-guided evidence acquisition to distinguish required schema items from path-dependent uncertain ones and acquires evidence only where needed, improving the balance among schema completeness, schema relevance, and token cost.

What carries the argument

Multi-hypothesis schema grounding paired with uncertainty-guided evidence acquisition, which identifies items needed across all paths while limiting evidence collection to uncertain positions.
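
As an illustration of how items needed across all paths might be separated from path-dependent ones, here is a minimal Python sketch. It assumes each hypothesis has already been reduced to a set of schema item identifiers and applies a unanimity rule: an item counts as required only if every hypothesis selects it. The function name `initialize_sets`, the example identifiers, and the voting rule itself are illustrative assumptions, not the paper's actual procedure.

```python
from collections import Counter
from typing import Iterable


def initialize_sets(
    hypotheses: list[set[str]], all_items: Iterable[str]
) -> tuple[set[str], set[str], set[str]]:
    """Partition schema items by how many hypotheses selected them.

    Assumed rule (hypothetical): unanimous items are required, partially
    selected items are uncertain, never-selected items stay candidates.
    """
    k = len(hypotheses)
    if k == 0:
        raise ValueError("need at least one hypothesis")
    # Each hypothesis is a set, so an item gets at most one vote per hypothesis.
    votes = Counter(item for selected in hypotheses for item in selected)
    required = {item for item, n in votes.items() if n == k}
    uncertain = {item for item, n in votes.items() if 0 < n < k}
    candidate = set(all_items) - required - uncertain
    return required, uncertain, candidate


# Hypothetical schema and three hypothetical SQL paths for one question.
schema_items = {
    "orders.id", "orders.customer_id", "orders.created_at",
    "customers.id", "customers.region", "customers.name",
    "order_summary.customer_id",
}
hypotheses = [
    {"orders.id", "orders.customer_id", "customers.id"},
    {"orders.id", "orders.customer_id", "customers.id", "customers.region"},
    {"orders.id", "order_summary.customer_id"},
]

required, uncertain, candidate = initialize_sets(hypotheses, schema_items)
print("required: ", sorted(required))
print("uncertain:", sorted(uncertain))
print("candidate:", sorted(candidate))
```

Under this rule only the uncertain set would be passed on to evidence acquisition. A softer threshold than unanimity is equally plausible; the page does not say which rule the authors use.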

If this is right

On Spider2-Snow the method reaches 90.15 percent field-level strict recall while using 123.30K average tokens.
The same procedure improves downstream SQL generation accuracy when the generator is held fixed.
The approach yields a measurable improvement in the three-way balance of completeness, relevance, and token cost on both BIRD-Dev and Spider2-Snow.
Evidence acquisition occurs selectively rather than uniformly across the schema.
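
To make the recall figure above concrete, the sketch below computes a strict recall rate under one common reading of the term: a question counts as a hit only if every gold field appears in the linked schema. The page does not define the metric, so this definition, the function name `strict_recall_rate`, and the toy data are assumptions for illustration only.

```python
def strict_recall_rate(gold: list[set[str]], predicted: list[set[str]]) -> float:
    """Fraction of questions whose gold fields are all present in the prediction."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    if not gold:
        raise ValueError("need at least one question")
    # A question is a hit only when the gold set is a subset of the predicted set.
    hits = sum(1 for g, p in zip(gold, predicted) if g <= p)
    return hits / len(gold)


# Hypothetical gold and linked field sets for four questions.
gold_fields = [{"a.x", "a.y"}, {"b.z"}, {"c.u", "c.v"}, {"d.w"}]
linked_fields = [{"a.x", "a.y", "a.k"}, {"b.z"}, {"c.u"}, {"d.w", "d.q"}]

rate = strict_recall_rate(gold_fields, linked_fields)
print(f"strict recall rate: {rate:.2%}")
```

The third question misses `c.v`, so it contributes nothing even though half of its gold fields were found. That all-or-nothing behavior is what makes a strict metric harsher than averaged per-field recall.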

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the authors make directly.

The same uncertainty signal could be reused to prune evidence in other retrieval-augmented generation settings that face large, ambiguous contexts.
If path sampling is cheap, the method suggests a general template for any task where one input maps to several valid outputs with differing context needs.
An extension could test whether the same multi-path uncertainty logic transfers to schema linking in other query languages or in visual question answering over large tables.

Load-bearing premise

Multiple plausible SQL paths can be generated reliably and uncertainty estimates over those paths accurately mark which schema items are required rather than path-specific.

What would settle it

A dataset of ambiguous questions where EviLink's uncertainty scores systematically omit a critical schema item that appears in every valid path, producing lower recall than single-path baselines.

Figures

Figures reproduced from arXiv: 2605.29670 by Chao Hu, Chen Hou, Danqing Huang, Dazhen Deng, Defeng Xie, Haoxuan Li, Haozhe Feng, Huawei Zheng, Peng Chen, Sen Yang, Wei Chen, Xuan Yi, Yingcai Wu, Yuhui Zhang, Zhaorui Yang.

**Figure 1.** From single-path schema linking to uncertainty-guided multi-path reasoning. (A) Many existing formulations rely on a single SQL path, deterministic schema decisions, and static evidence provision. (B) Human engineers instead consider multiple plausible SQL paths, separate required from uncertain schema items, and inspect evidence selectively. (C) We reframe schema linking as uncertainty-guided multi-path r…

**Figure 2.** Overview of EviLink. The first stage performs multi-hypothesis schema grounding: (A) Multi-Hypothesis proposes multiple plausible SQL paths, (B) Consolidate and Cross-Vote aggregates their schema selections, and (C) Initialize Sets separates required, uncertain, and candidate items. The second stage performs uncertainty-guided agentic refinement: (D) Agentic Refine provides evidence and decision tools, (E)…

**Figure 3.** Sensitivity to the hypothesis budget K. Results are reported on the Spider2-Snow ablation subset. K = 1 corresponds to single-hypothesis grounding, while moderate multi-hypothesis settings provide a more stable balance between schema-linking quality and token cost. We use K = 4 as the default setting.

**Figure 4.** Tool design for uncertainty-guided agentic refinement. The six tools support targeted evidence acquisition, …

**Figure 5.** System prompt excerpt: multi-hypothesis schema-need elicitation.

**Figure 6.** System prompt excerpt: table selection.

**Figure 7.** System prompt excerpt: field selection.

**Figure 8.** System prompt excerpt: uncertainty-guided agentic refinement.

Original abstract

Schema linking is a difficult and important step in large-scale Text-to-SQL, where systems must identify a compact yet sufficient schema context from large and ambiguous databases. Existing methods often treat schema linking as deterministic selection around a single SQL path, but complex questions may admit multiple valid realizations with different schema needs. We reframe schema linking as uncertainty-aware schema-need inference over multiple plausible SQL paths, where the system distinguishes required schema items from path-dependent uncertain ones and acquires evidence only where needed. We instantiate this reframing with EviLink, which combines multi-hypothesis schema grounding with uncertainty-guided evidence acquisition. Experiments on BIRD-Dev and Spider2-Snow show that this perspective improves the balance among schema completeness, schema relevance, and token cost. On Spider2-Snow, EviLink achieves 90.15% field-level strict recall rate, uses 123.30K average tokens, and improves downstream SQL generation under a fixed generator.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Desk Editor's Note · private letter to a colleague

EviLink reframes schema linking as multi-path uncertainty-aware inference and reports concrete gains on Spider2-Snow, but the abstract leaves the mechanics thin.

The paper's central move is to stop treating schema linking as picking items for one SQL path and instead run inference across multiple plausible paths, then use uncertainty to acquire evidence only where the paths disagree on what is needed. This is the main novelty relative to prior deterministic methods.

They implement it as EviLink with multi-hypothesis schema grounding and uncertainty-guided acquisition. The results on Spider2-Snow are 90.15% field-level strict recall at 123.30K average tokens, plus better SQL generation with a fixed generator. They also test on BIRD-Dev and claim a better balance of completeness, relevance, and cost.

This is a practical step because real questions can have several valid SQL realizations that require different schema subsets. Distinguishing required from path-dependent items makes sense for controlling token use in large databases.

The soft spot is that the abstract and high-level description do not include the method details, baseline comparisons, or error analysis needed to fully verify the claims. The assumption that multiple paths can be generated reliably and that uncertainty distinguishes the right items needs checking in the full paper. No load-bearing flaw is visible from what is presented, but the evidence is still thin without those sections.

This work is for people building or evaluating Text-to-SQL systems on big schemas. It is worth sending out for peer review because the problem is important in the subfield and the proposed reframing is distinct enough to merit referee scrutiny.

Referee Report

0 major / 2 minor

Summary. The paper reframes schema linking for large-scale Text-to-SQL as uncertainty-aware inference over multiple plausible SQL paths rather than deterministic selection around a single path. It introduces EviLink, which performs multi-hypothesis schema grounding combined with uncertainty-guided evidence acquisition to distinguish required schema items from path-dependent ones. Experiments on BIRD-Dev and Spider2-Snow are reported to show improved balance among schema completeness, relevance, and token cost; on Spider2-Snow the method achieves 90.15% field-level strict recall while using 123.30K average tokens and yields better downstream SQL generation under a fixed generator.

Significance. If the empirical claims hold under rigorous validation, the work offers a meaningful advance for Text-to-SQL on large ambiguous databases by explicitly modeling query ambiguity through multi-path uncertainty. The reported recall/token trade-off on Spider2-Snow is practically relevant, and the reframing could influence future systems that must avoid both under- and over-retrieval of schema context. No machine-checked proofs or parameter-free derivations are claimed, but the empirical focus on falsifiable downstream improvement is a strength.

minor comments (2)

Abstract: the description of how multi-hypothesis paths are generated and how uncertainty is quantified would benefit from one additional sentence to allow readers to assess the weakest assumption identified in the review process.
The manuscript should include a brief error analysis or ablation on cases where uncertainty estimates fail to surface critical schema items, even if only in the appendix.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of the work's significance and for recognizing the practical relevance of the recall/token trade-off on Spider2-Snow. The report does not enumerate any specific major comments, so we have no point-by-point rebuttals to provide at this stage.

Circularity Check

0 steps flagged

No significant circularity; empirical method with no derivations or self-referential reductions

The paper describes an empirical reframing of schema linking as uncertainty-aware inference over multiple SQL paths, instantiated as EviLink, and evaluated on BIRD-Dev and Spider2-Snow benchmarks. No equations, parameters, or derivations are present in the provided text. Claims rest on experimental metrics (e.g., 90.15% recall, token usage) rather than any chain that reduces by construction to fitted inputs or self-citations. The central premise does not invoke uniqueness theorems, ansatzes smuggled via citation, or renaming of known results. This is a self-contained empirical contribution with no load-bearing steps that match the enumerated circularity patterns.

Axiom & Free-Parameter Ledger

0 free parameters · 0 axioms · 0 invented entities

Abstract-only review supplies no equations, parameters, or explicit assumptions beyond the high-level reframing; ledger entries cannot be populated.

pith-pipeline@v0.9.1-grok · 5744 in / 933 out tokens · 20434 ms · 2026-06-29T07:50:20.263714+00:00 · methodology

Reference graph

Works this paper leans on

39 extracted references · 3 canonical work pages

[1]

C3: Zero-shot Text-to-SQL with ChatGPT. Preprint, arXiv:2307.07306. Dawei Gao, Haibin Wang, Yaliang Li, Xiuyu Sun, Yichen Qian, Bolin Ding, and Jingren Zhou. 2024. Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation. Proc. VLDB Endow., 17(5):1132–1145. Zhifeng Hao, Qibin Song, Ruichu Cai, and Boyan Xu
[2]

Text-to-SQL as Dual-State Reasoning: Integrating Adaptive Context and Progressive Generation. Preprint, arXiv:2511.21402. George Katsogiannis-Meimarakis, Katsiaryna Mirylenka, Paolo Scotton, Francesco Fusco, and Abdel Labbi. 2026. In-depth Analysis of LLM-based Schema Linking. In International Conference on Extending Database Technology, pages 117–130. D...
[3]

Solid-SQL: Enhanced Schema-linking based In-context Learning for Robust Text-to-SQL. In Proceedings of the 31st International Conference on Computational Linguistics, pages 9793–9803. Karime Maamari, Fadhil Abubaker, Daniel Jaroslawicz, and Amine Mhedhbi. 2024. The Death of Schema Linking? Text-to-SQL in the Age of Well-Reasoned Language Models. Preprint,...
[4]

get_field_stats(field_ids) Purpose: request available L3-level statistics beyond inline L2 evidence for uncertain working-set fields. Params: field_ids, a list of working-set field identifiers. Returns: available L3 profiling increments, such as ranges, summary statistics, histograms, and distributional signals when available. Constraint: use only whe...
[5]

probe_sql(sql) Purpose: run a lightweight SQL probe to verify whether a join works or whether a value range overlaps the question. Params: sql, a diagnostic query written by the agent. Returns: up to 5 rows, or the raw database error if execution fails. Constraint: runs with automatic LIMIT <= 5 and runtime timeout 30s; use for verification, not open...
[6]

list_tables() Purpose: inspect currently unselected tables, mainly during the recall-recovery phase. Params: none. Returns: a compact list of unselected tables; isomorphic tables may be grouped to expose partitions or duplicated schemas. Constraint: this tool only exposes candidate tables. The agent must call keep(...) to add any table to the final schema
[7]

list_columns(table_id) Purpose: expand one candidate table when the agent suspects Stage 1 may have missed useful schema items. Params: table_id, the identifier of one candidate table. Returns: columns of the table with L2 evidence, including name, type, description, samples, and null_ratio when available. Constraint: this tool only exposes columns. Any ...
[8]

keep(item_ids) Purpose: retain schema items in the final linked schema. Params: item_ids, a list of schema item identifiers. Returns: the specified items are added to the final keep set. Constraint: accepts working-set or candidate-pool items. Field-level keep may propagate across isomorphic siblings; table-level keep does not
[9]

discard(item_ids, reason) Purpose: remove clearly unnecessary working-set items from the final linked schema. Params: item_ids, a list of working-set item identifiers; reason, a one-line semantic justification. Returns: the specified items are removed from the final schema. Constraint: reason is required and must be grounded in the item's own...
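
The probe contract described in the tool list above (at most 5 rows, a 30s runtime limit, and the raw database error on failure) can be sketched as follows. This is a minimal illustration, not the authors' tool: it uses Python's built-in `sqlite3` only because it ships with the standard library, it caps rows by fetching at most five rather than rewriting the query with LIMIT, and the tables and data are hypothetical.

```python
import sqlite3
import time

MAX_ROWS = 5            # row cap taken from the tool description
TIMEOUT_SECONDS = 30.0  # runtime limit taken from the tool description


def probe_sql(conn: sqlite3.Connection, sql: str,
              max_rows: int = MAX_ROWS, timeout: float = TIMEOUT_SECONDS):
    """Run a diagnostic query; return up to max_rows rows or an error string."""
    deadline = time.monotonic() + timeout
    # SQLite calls the handler periodically; a non-zero return aborts the query.
    conn.set_progress_handler(
        lambda: 1 if time.monotonic() > deadline else 0, 1000
    )
    try:
        cursor = conn.execute(sql)
        return cursor.fetchmany(max_rows)
    except sqlite3.Error as exc:
        # Surface the raw database error instead of raising.
        return f"{type(exc).__name__}: {exc}"
    finally:
        conn.set_progress_handler(None, 0)  # clear the handler


# Hypothetical tables used to check whether a join produces rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, customer_id INTEGER)")
conn.execute("CREATE TABLE customers (id INTEGER, region TEXT)")
conn.executemany("INSERT INTO orders VALUES (?, ?)",
                 [(i, i % 3) for i in range(20)])
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [(0, "EU"), (1, "US"), (2, "APAC")])

join_rows = probe_sql(
    conn,
    "SELECT o.id, c.region FROM orders o JOIN customers c ON o.customer_id = c.id",
)
error = probe_sql(conn, "SELECT * FROM missing_table")
print(join_rows)
print(error)
```

Returning the error as a value rather than raising keeps a failed probe usable as evidence: a missing table or a type mismatch tells the agent something about the schema too.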
[10]

Any table whose name or field names align with an entity, concept, attribute, filter value, or synonym in the question
[11]

Bridge / junction / mapping / relationship tables that could connect two entities mentioned in the question, even when a direct foreign key also exists
[12]

When several tables share the same structure and represent partitions of one logical dataset by date, shard suffix, region, version, or duplicated schema, and the question's scope may touch more than one partition, select ALL such partitions
[13]

When the same entity is recorded in multiple tables at different granularities or from different sources, such as detail vs. summary, primary vs. history / audit, or two independent providers of the same measure, select ALL such candidates
[14]

Cross-database and cross-schema tables whose names or semantics match a concept in the question. Case and schema boundaries do not exempt a table
[15]

When the question carries any temporal scope, such as a date range, a year, “recent,” or “latest,” select every time-series or time-keyed table whose coverage could overlap that scope
[16]

Tables whose name or visible field names signal a structural connector role: co-occurrence, mapping, link, transition, or event tables between two or more concepts already named by the question. Such tables often expose multiple identifier-shaped columns and may carry timestamp or status fields. Default to keeping connector tables when both endpoint...
[17]

every excluded table has a concrete reason why no correct SQL could use it
[18]

any table without such a reason has been moved back to the selected set
[19]

every selected table name matches an input table exactly, case-sensitive. [OUTPUT FORMAT] Return ONLY a single-line minified JSON object with: {"selected_tables":["<DB>.<SCHEMA>.<TABLE>", ...]} Use the EXACT full path string from the input. Do NOT prepend literal words such as db, schema, or database. [OUTPUT RULES]
[20]

Use exact full table names from the input, case-sensitive
[21]

selected_tables must be non-empty and contain no duplicates. Figure 6: System prompt excerpt: table selection. System Prompt: Field Selection You are a schema linker. Select every field that COULD be needed to answer the question, within the already-filtered tables. [INPUT] You will receive a natural-language question, an optional structured hypo...
[22]

Any field whose name, type, category, or table context aligns with an entity, attribute, filter value, synonym, or display target in the question
[23]

Every JOIN key must be selected on BOTH sides of the join. If you select an id / code / key on one table, select its counterpart on every other table it could join to
[24]

When the same concept, such as entity id, timestamp, location, status, or name, can be encoded by columns in several retained tables, select it in EVERY such table. This also applies to same-schema partition / shard tables. Do NOT deduplicate across tables
[25]

Select time, date, and ordering columns whenever the question mentions any temporal or ranking notion and the column could plausibly carry that semantics
[26]

When a filter or measure admits multiple encodings, such as flag vs. NULL-ness, name vs. code, or dedicated column vs. descriptive pattern match, select ALL candidate columns
[27]

When the question implies a hierarchy, such as country / state / county / zip, year / month / day, or category / subcategory, select identifiers at EVERY level that could be touched
[28]

Within a retained table, default-keep retrieval-unit fields: primary identifiers, foreign-key-shaped columns, categorical / status / type / flag columns, human-readable name / title / label columns, and timestamp / date / time columns
[29]

Hypothesis text is a HINT, not a whitelist. A field not mentioned in the hypothesis may still be required
[30]

Surface-form alignment between the question and a column is a positive signal. If question tokens and column-name segments share a non-trivial substring, default-keep the column unless a concrete exclusion reason exists. [EXCLUDE ONLY IF CERTAIN] A field may be omitted ONLY if it is an engineering / ETL artifact with no business semantics AND the q...
[31]

every excluded field has a concrete reason why no correct SQL could use it
[32]

every selected JOIN key has counterparts selected on all tables it could join to
[33]

question tokens have been re-scanned against retained-table column names
[34]

every selected field name matches an input field exactly, case-sensitive. [OUTPUT FORMAT] Return ONLY a single-line minified JSON object with: {"selected_fields":["<DB>.<SCHEMA>.<TABLE>.<FIELD>", ...]} Use the EXACT full path string from the input. Do NOT prepend literal words such as db, schema, or database. [OUTPUT RULES]
[35]

Use exact full field names from the input, case-sensitive
[36]

selected_fields must contain no duplicates. Figure 7: System prompt excerpt: field selection. System Prompt: Uncertainty-Guided Agentic Refinement You are a schema verifier. Two hard rules govern every action you take: [RULE 1] Recall First A correct SQL may exist along multiple plausible reasoning paths. Any schema item that could be needed by a...
[37]

Read the L2 view and decide whether it is enough for keep or discard
[38]

If more evidence is needed, use L3 statistics or SQL probes
[39]

Or leave the item undecided. Undecided items default to KEEP at convergence. [DISCARD REQUIRES A JUSTIFICATION] Every discard call must include a one-line reason explaining why no plausible SQL path could need this field. Valid reasons cite the field's own semantics: ETL bookkeeping, internal system metadata, or L3 evidence showing irrelevant / co...
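
The decision rules quoted in the prompt excerpts above (keep accepts any item, discard needs a reason and applies only to working-set items, undecided items default to KEEP) can be illustrated with a small state object. This is a hedged sketch: the class name `RefinementState`, the `finalize` method, the example identifiers, and the choice to let a later keep override an earlier discard are all assumptions made for illustration, not details confirmed by the page.

```python
from dataclasses import dataclass, field


@dataclass
class RefinementState:
    """Tracks keep/discard decisions over a working set of schema items."""

    working_set: set[str]
    kept: set[str] = field(default_factory=set)
    discarded: dict[str, str] = field(default_factory=dict)  # item -> reason

    def keep(self, item_ids: list[str]) -> None:
        # keep accepts working-set or candidate-pool items, so no membership check.
        # Assumption: a later keep overrides an earlier discard.
        for item in item_ids:
            self.discarded.pop(item, None)
            self.kept.add(item)

    def discard(self, item_ids: list[str], reason: str) -> None:
        # The reason is mandatory; an empty justification is rejected.
        if not reason.strip():
            raise ValueError("discard requires a one-line justification")
        unknown = [i for i in item_ids if i not in self.working_set]
        if unknown:
            raise KeyError(f"not in working set: {unknown}")
        for item in item_ids:
            self.kept.discard(item)
            self.discarded[item] = reason.strip()

    def finalize(self) -> set[str]:
        # Undecided working-set items default to KEEP at convergence.
        undecided = self.working_set - self.kept - set(self.discarded)
        return self.kept | undecided


# Hypothetical working set with one candidate-pool item added via keep.
state = RefinementState(
    working_set={"sales.id", "sales.amount", "sales.etl_batch_id"}
)
state.keep(["sales.id", "regions.name"])
state.discard(["sales.etl_batch_id"],
              "ETL bookkeeping column with no business semantics")
final_schema = state.finalize()
print(sorted(final_schema))
```

Here `sales.amount` is never decided and still lands in the final schema, which reflects the recall-first bias of the prompt: removing an item takes an explicit justification, while keeping one takes nothing.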
