pith. machine review for the scientific record.

arxiv: 2604.10212 · v1 · submitted 2026-04-11 · 💻 cs.CL

Recognition: unknown

Relational Probing: LM-to-Graph Adaptation for Financial Prediction

Changhong Jin, Rian Dolphin, Ruihai Dong, Yingjie Niu

Authors on Pith · no claims yet

Pith reviewed 2026-05-10 15:40 UTC · model grok-4.3

classification 💻 cs.CL
keywords relational probing · language models · relational graphs · financial prediction · stock trend prediction · graph adaptation · small language models

The pith

Language models can induce relational graphs directly from hidden states for financial stock-trend prediction by swapping their output head and training jointly with the task model.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper proposes Relational Probing to adapt language models for structured outputs in financial prediction tasks. Instead of generating text autoregressively with the standard head, a lightweight relation head extracts relationships from the model's hidden states to build a graph of financial entities. This head trains end-to-end together with a downstream model that predicts stock trends. The method keeps semantic understanding from the language model while enforcing strict graph structure and avoiding the cost of sequential decoding. A reader would care because it integrates language understanding with graph-based reasoning at lower inference cost than prompting pipelines.
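To make the mechanism concrete, here is a minimal sketch of what such a relation head could look like: pooled per-entity hidden states are scored pairwise with a bilinear map to produce a soft adjacency matrix. The pooling choice, the bilinear form, and all names are illustrative assumptions; the paper's exact parameterization is not given in the material above.

```python
# Minimal sketch of a relation head: LM hidden states -> adjacency matrix.
# The bilinear scorer and per-entity pooling are illustrative assumptions,
# not the paper's exact parameterization.
import torch
import torch.nn as nn


class RelationHead(nn.Module):
    """Maps per-entity LM hidden states to a soft adjacency matrix."""

    def __init__(self, hidden_dim: int, rel_dim: int = 128):
        super().__init__()
        self.src_proj = nn.Linear(hidden_dim, rel_dim)    # "source" role projection
        self.dst_proj = nn.Linear(hidden_dim, rel_dim)    # "target" role projection
        self.bilinear = nn.Bilinear(rel_dim, rel_dim, 1)  # pairwise edge scorer

    def forward(self, entity_states: torch.Tensor) -> torch.Tensor:
        # entity_states: (N, hidden_dim), one pooled hidden state per entity/ticker.
        n = entity_states.size(0)
        src = self.src_proj(entity_states)                # (N, rel_dim)
        dst = self.dst_proj(entity_states)                # (N, rel_dim)
        # Score every ordered pair (i, j) with the bilinear map.
        src_pairs = src.unsqueeze(1).expand(n, n, -1).reshape(n * n, -1)
        dst_pairs = dst.unsqueeze(0).expand(n, n, -1).reshape(n * n, -1)
        logits = self.bilinear(src_pairs, dst_pairs).view(n, n)
        return torch.sigmoid(logits)                      # soft adjacency in [0, 1]
```

Because the head reads hidden states directly, producing the full graph costs one forward pass rather than a token-by-token decode, which is the efficiency argument the paper makes against prompting pipelines.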

Core claim

Relational Probing replaces the standard language-model head with a relation head that induces a relational graph directly from language-model hidden states and is trained jointly with the downstream task model for stock-trend prediction. This both learns semantic representations and preserves the strict structure of the induced relational graph, enabling language-model outputs to be reshaped into task-specific formats for downstream models.

What carries the argument

The relation head, a lightweight module that takes hidden states from the upstream small language model and produces a relational graph for joint optimization with the stock-trend predictor.

If this is right

  • Stock-trend prediction accuracy improves consistently over co-occurrence baselines while inference cost stays competitive with prompting methods.
  • Small language models, defined as those fine-tunable end-to-end on a single 24GB GPU, can be adapted this way using Qwen3 backbones of 0.6B to 4B parameters.
  • The induced graphs remain strictly structured rather than being softened by autoregressive generation.
  • Language model outputs become directly usable in task-specific graph formats without separate decoding steps (a minimal joint-training sketch follows this list).
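Read together, these points amount to one training recipe: a single loss on trend labels updates both the relation head and the downstream predictor, with no text decoding anywhere in the loop. The sketch below illustrates that recipe under assumed shapes and a toy predictor; it is not the paper's configuration.

```python
# Sketch of joint end-to-end training: relation head + downstream trend predictor
# optimized against a single trend loss. Shapes, the toy predictor, and optimizer
# settings are illustrative placeholders, not the paper's setup.
import torch
import torch.nn as nn

hidden_dim, num_tickers = 1024, 50
relation_head = RelationHead(hidden_dim)                  # from the earlier sketch
predictor = nn.Linear(hidden_dim + num_tickers, 3)        # toy up/flat/down head per ticker
params = list(relation_head.parameters()) + list(predictor.parameters())
optimizer = torch.optim.AdamW(params, lr=1e-4)
criterion = nn.CrossEntropyLoss()


def training_step(entity_states: torch.Tensor, trend_labels: torch.Tensor) -> float:
    # entity_states: (num_tickers, hidden_dim) pooled LM hidden states; the upstream
    # LM is treated as frozen/precomputed here purely for brevity.
    adjacency = relation_head(entity_states)              # (N, N) induced soft graph
    # Toy "graph use": concatenate each node's features with its adjacency row.
    node_inputs = torch.cat([entity_states, adjacency], dim=-1)
    logits = predictor(node_inputs)                       # (N, 3) trend logits per ticker
    loss = criterion(logits, trend_labels)                # trend loss drives graph induction
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

In the full method the upstream language model supplies the entity states and may itself receive gradients; the point of the sketch is only that graph induction and trend prediction share one objective.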

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same head-swapping and joint-training pattern could apply to other domains that need graphs extracted from text, such as supply-chain or regulatory networks.
  • Joint training may reshape the language model's hidden states to emphasize relational features more than standard pretraining does.
  • The operational definition of small language models given in the paper offers one way to compare efficiency claims across future adaptation methods.

Load-bearing premise

Hidden states from the upstream language model contain relational information about financial entities that a lightweight relation head can extract.
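In probing terms, this premise could be checked by freezing the language model and training only the relation head against externally labeled relations; above-chance edge recovery would indicate that the hidden states carry relational signal. The setup below is hypothetical, not an experiment described in the paper, and the labeled adjacency is assumed to come from an external source.

```python
# Hypothetical probe of the premise: freeze the LM, train only the relation head
# on externally labeled entity-pair relations, and measure edge-recovery accuracy.
# The labeled adjacency and data pipeline are assumptions for illustration.
import torch
import torch.nn.functional as F

probe = RelationHead(hidden_dim=1024)                      # only the probe is trainable
optimizer = torch.optim.AdamW(probe.parameters(), lr=1e-3)


def probe_step(frozen_entity_states: torch.Tensor, true_adjacency: torch.Tensor) -> float:
    # frozen_entity_states: (N, hidden_dim) from the frozen LM (no gradients upstream).
    # true_adjacency: (N, N) binary matrix of externally labeled relations.
    pred = probe(frozen_entity_states.detach())            # soft adjacency in [0, 1]
    loss = F.binary_cross_entropy(pred, true_adjacency.float())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    # Edge-recovery accuracy at a 0.5 threshold as a crude probe metric.
    acc = ((pred > 0.5).float() == true_adjacency.float()).float().mean().item()
    return acc
```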

What would settle it

An experiment in which joint training of the relation head with the downstream stock-trend model produces no accuracy gain or a loss relative to a co-occurrence baseline or separately trained components.

Figures

Figures reproduced from arXiv: 2604.10212 by Changhong Jin, Rian Dolphin, Ruihai Dong, Yingjie Niu.

Figure 1
Figure 1: End-to-end Language Model → Relational Probing → GAT framework for financial trend prediction: (a) news encoding; (b) a lightweight Relation Head (RH) maps language-model hidden states to adjacency matrices; (c) the GAT uses the dynamic graphs with node features to predict next-step trends. Let V = {1, …, N} denote the ticker set. We initialize learnable ticker embeddings E = {e_1, …, e_N} ∈ ℝ^{N×d}…
Figure 2
Figure 2: Prompt used to extract relations from news.
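For orientation, the GAT stage in Figure 1 consumes the induced adjacency together with node features such as the ticker embeddings E. The layer below is a generic single-head graph-attention update in the spirit of Veličković et al. [16], shown only to make the data flow concrete; it is not the authors' implementation.

```python
# Generic masked graph-attention update over the induced adjacency, in the spirit
# of GAT; the single-head form and shapes are illustrative, not the paper's model.
import torch
import torch.nn as nn
import torch.nn.functional as F


class SimpleGraphAttention(nn.Module):
    def __init__(self, in_dim: int, out_dim: int):
        super().__init__()
        self.proj = nn.Linear(in_dim, out_dim, bias=False)
        self.attn = nn.Linear(2 * out_dim, 1, bias=False)

    def forward(self, node_feats: torch.Tensor, adjacency: torch.Tensor) -> torch.Tensor:
        # node_feats: (N, in_dim), e.g. learnable ticker embeddings E in R^{N x d};
        # adjacency:  (N, N) graph from the relation head (binary or soft).
        h = self.proj(node_feats)                          # (N, out_dim)
        n = h.size(0)
        pairs = torch.cat(
            [h.unsqueeze(1).expand(n, n, -1), h.unsqueeze(0).expand(n, n, -1)], dim=-1
        )                                                  # (N, N, 2*out_dim)
        scores = F.leaky_relu(self.attn(pairs).squeeze(-1))  # (N, N) raw attention
        # Mask attention to edges; for a soft graph one might weight scores instead.
        scores = scores.masked_fill(adjacency <= 0, float("-inf"))
        weights = torch.softmax(scores, dim=-1)            # attention over neighbors
        weights = torch.nan_to_num(weights)                # isolated nodes -> zero rows
        return weights @ h                                 # (N, out_dim) updated node states
```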
read the original abstract

Language models can be used to identify relationships between financial entities in text. However, while structured output mechanisms exist, prompting-based pipelines still incur autoregressive decoding costs and decouple graph construction from downstream optimization. We propose Relational Probing, which replaces the standard language-model head with a relation head that induces a relational graph directly from language-model hidden states and is trained jointly with the downstream task model for stock-trend prediction. This approach both learns semantic representations and preserves the strict structure of the induced relational graph. It enables language-model outputs to go beyond text, allowing them to be reshaped into task-specific formats for downstream models. To enhance reproducibility, we provide an operational definition of small language models (SLMs): models that can be fine-tuned end-to-end on a single 24GB GPU under specified batch-size and sequence-length settings. Experiments use Qwen3 backbones (0.6B/1.7B/4B) as upstream SLMs and compare against a co-occurrence baseline. Relational Probing yields consistent performance improvements at competitive inference cost.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper proposes Relational Probing, which replaces the standard LM head with a lightweight relation head that induces a relational graph directly from upstream LM hidden states (Qwen3 0.6B/1.7B/4B backbones). The graph is trained jointly end-to-end with a downstream stock-trend predictor; the method is compared only against a co-occurrence baseline and is claimed to deliver consistent performance gains at competitive inference cost while preserving graph structure and enabling LM outputs to be reshaped for structured downstream tasks. An operational definition of small language models (SLMs) is supplied for reproducibility.

Significance. If the central claim holds after proper controls, the work would demonstrate a practical, low-cost route for adapting frozen or lightly tuned LMs into task-specific graph representations for financial prediction, avoiding autoregressive prompting overhead. The joint-training formulation and SLM operationalization are potentially useful contributions for resource-constrained structured-prediction settings.

major comments (2)
  1. [Experiments] Experiments section: the manuscript reports gains only versus a co-occurrence baseline and supplies no ablations that isolate the relational graph (e.g., frozen relation head, random-graph controls, or direct prediction from LM states without the graph). Because the central claim attributes improvements to extractable relational information and joint optimization, the absence of these controls leaves open the possibility that gains arise from extra parameters or end-to-end fine-tuning rather than the induced graph structure. (A simple random-graph control is sketched after this list.)
  2. [Results] Results and evaluation: the abstract asserts 'consistent performance improvements' yet the manuscript provides neither quantitative metrics with error bars, dataset statistics, nor statistical significance tests for the reported gains. Without these, the magnitude, reliability, and generalizability of the claimed advantage cannot be assessed.
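To illustrate the second of these suggested controls: a random-graph baseline can be as simple as relabeling the induced adjacency with a random node permutation, which preserves edge density and degree statistics while destroying the specific relations the head extracts. The sketch below is an illustration of the control, not an experiment reported in the paper.

```python
# Illustrative random-graph control: permute node identities in the induced
# adjacency so density and degree statistics are preserved but the specific
# ticker-to-ticker relations are destroyed. A sketch of the suggested ablation only.
import torch


def permuted_graph_control(adjacency: torch.Tensor) -> torch.Tensor:
    # adjacency: (N, N) induced (soft or binary) graph from the relation head.
    n = adjacency.size(0)
    perm = torch.randperm(n)
    # Relabel nodes: A_control[i, j] = A[perm[i], perm[j]].
    return adjacency[perm][:, perm]
```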
minor comments (2)
  1. The precise parameterization and output dimensionality of the relation head are not fully specified; a diagram or equation showing how hidden states are mapped to edge probabilities while 'preserving strict graph structure' would improve clarity.
  2. The operational definition of SLMs (single 24 GB GPU, batch size, sequence length) should be stated explicitly in a dedicated subsection or table so that the reproducibility claim can be verified.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback and the recommendation for major revision. We address each major comment point by point below, agreeing where revisions are warranted to strengthen the claims.

read point-by-point responses
  1. Referee: [Experiments] Experiments section: the manuscript reports gains only versus a co-occurrence baseline and supplies no ablations that isolate the relational graph (e.g., frozen relation head, random-graph controls, or direct prediction from LM states without the graph). Because the central claim attributes improvements to extractable relational information and joint optimization, the absence of these controls leaves open the possibility that gains arise from extra parameters or end-to-end fine-tuning rather than the induced graph structure.

    Authors: We agree that the current set of experiments does not fully isolate the contribution of the induced relational graph. To address this, the revised manuscript will incorporate the suggested ablations: a frozen relation head, random-graph controls, and direct prediction from LM hidden states without the graph. These additions will help demonstrate that the observed gains stem from the relational structure and joint optimization rather than additional parameters or fine-tuning alone. revision: yes

  2. Referee: [Results] Results and evaluation: the abstract asserts 'consistent performance improvements' yet the manuscript provides neither quantitative metrics with error bars, dataset statistics, nor statistical significance tests for the reported gains. Without these, the magnitude, reliability, and generalizability of the claimed advantage cannot be assessed.

    Authors: We acknowledge that stronger statistical reporting is needed to support the abstract's claim. In the revised version, we will add quantitative performance metrics with error bars (standard deviation across runs), full dataset statistics, and statistical significance tests (such as paired t-tests) for the gains over the baseline. This will allow readers to better evaluate the magnitude, reliability, and generalizability of the results. revision: yes
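For concreteness, a paired significance test of the kind proposed here might look like the snippet below; the per-seed accuracies are placeholders, not results from the paper.

```python
# Illustrative paired t-test across random seeds, as proposed in the rebuttal.
# The per-seed accuracy arrays are hypothetical placeholders, not reported results.
import numpy as np
from scipy import stats

relational_probing_acc = np.array([0.561, 0.574, 0.569, 0.558, 0.571])     # hypothetical
cooccurrence_baseline_acc = np.array([0.542, 0.551, 0.547, 0.539, 0.553])  # hypothetical

t_stat, p_value = stats.ttest_rel(relational_probing_acc, cooccurrence_baseline_acc)
gains = relational_probing_acc - cooccurrence_baseline_acc
print(f"mean gain = {gains.mean():.3f} ± {gains.std(ddof=1):.3f}, "
      f"t = {t_stat:.2f}, p = {p_value:.4f}")
```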

Circularity Check

0 steps flagged

No significant circularity; method is an independent architectural proposal

full rationale

The paper proposes Relational Probing by replacing the standard LM head with a relation head that induces a graph from hidden states and trains it jointly with a downstream stock-trend predictor, then compares results to a co-occurrence baseline. No equations, derivations, fitted parameters renamed as predictions, or self-citations appear in the abstract or described text that would reduce the claimed performance gains to quantities defined by the inputs themselves. The design choices (head replacement, joint training, SLM operational definition) are presented as novel adaptations rather than self-referential or forced by prior results from the same authors. This is a standard empirical method paper whose central claim rests on external comparisons, not on any load-bearing step that collapses to its own definition or fit.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 1 invented entity

Review performed on abstract only; ledger entries are therefore minimal and provisional.

axioms (1)
  • domain assumption · Language-model hidden states encode relational semantics between financial entities that can be extracted by a lightweight head.
    Implicit premise required for the relation head to be useful.
invented entities (1)
  • Relation head · no independent evidence
    purpose: Induces a relational graph directly from LM hidden states instead of autoregressive decoding.
    New architectural component introduced by the paper.

pith-pipeline@v0.9.0 · 5490 in / 1204 out tokens · 49982 ms · 2026-05-10T15:40:44.017673+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

24 extracted references · 16 canonical work pages · 3 internal anchors

  1. [1]

    Adam: A Method for Stochastic Optimization

    Diederik P. Kingma and Jimmy Ba. Adam: A method for stochastic optimization. arXiv preprint arXiv:1412.6980.

  2. [2]

    Probing linguistic features of sentence-level representations in neural relation extraction

    Christoph Alt, Aleksandra Gabryszak, and Leonhard Hennig. Probing linguistic features of sentence-level representations in neural relation extraction. arXiv preprint arXiv:2004.08134.

  3. [3]

    ChatGPT informed graph neural network for stock movement prediction

    Zihan Chen, Lei Nico Zheng, Cheng Lu, Jialu Yuan, and Di Zhu. ChatGPT informed graph neural network for stock movement prediction. arXiv preprint arXiv:2306.03763, 2023.

  4. [4]

    Economic links and predictable returns

    Lauren Cohen and Andrea Frazzini. Economic links and predictable returns. The Journal of Finance, 63(4):1977–2011.

  5. [5]

    Deep biaffine attention for neural dependency parsing

    Timothy Dozat and Christopher D. Manning. Deep biaffine attention for neural dependency parsing. arXiv preprint arXiv:1611.01734.

  6. [6]

    Span-based joint entity and relation extraction with transformer pre-training

    Markus Eberts and Adrian Ulges. Span-based joint entity and relation extraction with transformer pre-training. arXiv preprint arXiv:1909.07755.

  7. [7]

    Temporal relational ranking for stock prediction

    Fuli Feng, Xiangnan He, Xiang Wang, Cheng Luo, Yiqun Liu, and Tat-Seng Chua. Temporal relational ranking for stock prediction. ACM Transactions on Information Systems (TOIS), 37(2):1–30.

  8. [8]

    Fire: A dataset for financial relation extraction

    Hassan Hamad, Abhinav Kumar Thakur, Nijil Kolleri, Sujith Pulikodan, and Keith Chugg. Fire: A dataset for financial relation extraction. In Findings of the Association for Computational Linguistics: NAACL 2024, pp. 3628–3642.

  9. [9]

    Instruct and extract: Instruction tuning for on-demand information extraction

    Yizhu Jiao, Ming Zhong, Sha Li, Ruining Zhao, Siru Ouyang, Heng Ji, and Jiawei Han. Instruct and extract: Instruction tuning for on-demand information extraction. arXiv preprint arXiv:2310.16040.

  10. [10]

    Revisiting large language models as zero-shot relation extractors

    Guozheng Li, Peng Wang, and Wenjun Ke. Revisiting large language models as zero-shot relation extractors. arXiv preprint arXiv:2310.05028.

  11. [11]

    Learning to generalize for cross-domain QA

    Yingjie Niu, Linyi Yang, Ruihai Dong, and Yue Zhang. Learning to generalize for cross-domain QA. arXiv preprint arXiv:2305.08208.

  12. [12]

    NGAT: A node-level graph attention network for long-term stock prediction

    Yingjie Niu, Mingchuan Zhao, Valerio Poti, and Ruihai Dong. NGAT: A node-level graph attention network for long-term stock prediction. arXiv preprint arXiv:2507.02018.

  13. [13]

    Does corporate headquarters location matter for stock returns?

    Christo Pirinsky and Qinghai Wang. Does corporate headquarters location matter for stock returns? The Journal of Finance, 61(4):1991–2015.

  14. [14]

    Stock selection via spatiotemporal hypergraph attention network: A learning to rank approach

    Ramit Sawhney, Shivam Agarwal, Arnav Wadhwa, Tyler Derr, and Rajiv Ratn Shah. Stock selection via spatiotemporal hypergraph attention network: A learning to rank approach. In Proceedings of the AAAI Conference on Artificial Intelligence, volume 35, pp. 497–504.

  15. [15]

    Research on financial trend prediction technology using LSTM

    Yu Sun. Research on financial trend prediction technology using LSTM. In Proceedings of the 2024 International Conference on Economic Data Analytics and Artificial Intelligence, pp. 58–63.

  16. [16]

    Graph Attention Networks

    Petar Veličković, Guillem Cucurull, Arantxa Casanova, Adriana Romero, Pietro Lio, and Yoshua Bengio. Graph attention networks. arXiv preprint arXiv:1710.10903.

  17. [17]

    Simultaneously self-attending to all mentions for full-abstract biological relation extraction

    Patrick Verga, Emma Strubell, and Andrew McCallum. Simultaneously self-attending to all mentions for full-abstract biological relation extraction. arXiv preprint arXiv:1802.10569.

  18. [18]

    Revisiting relation extraction in the era of large language models

    Somin Wadhwa, Silvio Amir, and Byron C. Wallace. Revisiting relation extraction in the era of large language models. In Proceedings of the Conference of the Association for Computational Linguistics, volume 2023, pp. 15566.

  19. [19]

    A review on graph neural network methods in financial applications

    Jianian Wang, Sheng Zhang, Yanghua Xiao, and Rui Song. A review on graph neural network methods in financial applications. arXiv preprint arXiv:2111.15367.

  20. [20]

    InstructUIE: Multi-task instruction tuning for unified information extraction

    Xiao Wang, Weikang Zhou, Can Zu, Han Xia, Tianze Chen, Yuansen Zhang, Rui Zheng, Junjie Ye, Qi Zhang, Tao Gui, et al. InstructUIE: Multi-task instruction tuning for unified information extraction. arXiv preprint arXiv:2304.08085.

  21. [21]

    ChatIE: Zero-shot information extraction via chatting with ChatGPT

    Xiang Wei, Xingyu Cui, Ning Cheng, Xiaobin Wang, Xin Zhang, Shen Huang, Pengjun Xie, Jinan Xu, Yufeng Chen, Meishan Zhang, et al. ChatIE: Zero-shot information extraction via chatting with ChatGPT. arXiv preprint arXiv:2302.10205, 2023.

  22. [22]

    Qwen3 Technical Report

    An Yang, Anfeng Li, Baosong Yang, Beichen Zhang, Binyuan Hui, Bo Zheng, Bowen Yu, Chang Gao, Chengen Huang, Chenxu Lv, et al. Qwen3 technical report. arXiv preprint arXiv:2505.09388.

  23. [23]

    Grasping the essentials: Tailoring large language models for zero-shot relation extraction

    Sizhe Zhou, Yu Meng, Bowen Jin, and Jiawei Han. Grasping the essentials: Tailoring large language models for zero-shot relation extraction. arXiv preprint arXiv:2402.11142.

  24. [24]

    Appendix A: The Use of Large Language Models (LLMs)

    We utilized a large language model (LLM) as a general-purpose assist tool to aid or polish the writing. The LLM's role was strictly limited to improving grammar, spelling, and clarity. It was not used for research ideation, experiment design, data analysis, code, or gener...