pith. machine review for the scientific record.

arxiv: 2605.00120 · v1 · submitted 2026-04-30 · 💻 cs.CV · cs.CR · cs.LG

Recognition: unknown

GAFSV-Net: A Vision Framework for Online Signature Verification

Authors on Pith: no claims yet

Pith reviewed 2026-05-09 21:04 UTC · model grok-4.3

classification 💻 cs.CV · cs.CR · cs.LG
keywords online signature verification · Gramian Angular Field · GASF · GADF · ConvNeXt · deep learning · biometrics · computer vision

The pith

Converting online signature sequences into six-channel asymmetric Gramian Angular Field images allows 2D vision models to outperform traditional sequence-based methods for distinguishing genuine signatures from forgeries.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses the restriction of online signature verification methods to 1D sequence models by transforming kinematic time series into images suitable for 2D networks. It shows that encoding pen speed, pressure derivative, and direction angle into complementary GASF and GADF matrices captures temporal structure that improves forgery detection even with few enrollment samples. A sympathetic reader would care because this shift lets established vision architectures be applied to a biometric task plagued by high variability and skilled attacks. The evaluations on DeepSignDB and BiosecurID indicate that the gains from 2D encoding remain consistent regardless of training details.

Core claim

GAFSV-Net represents each signature as a six-channel asymmetric Gramian Angular Field image by encoding three kinematic channels into complementary GASF and GADF matrices that capture pairwise temporal co-occurrence and directional transition structure. A dual-branch ConvNeXt-Tiny encoder processes the GASF and GADF branches independently with bidirectional cross-attention before metric-space projection via semi-hard triplet loss with skilled-forgery hard-negative injection. Verification uses cosine similarity to a small enrollment prototype, and the method outperforms all sequence-based baselines on DeepSignDB and BiosecurID.

What carries the argument

Six-channel asymmetric Gramian Angular Field image encoding of kinematic sequences, processed by a dual-branch ConvNeXt-Tiny encoder with bidirectional cross-attention.
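The encoding this rests on can be sketched in a few lines. The following minimal NumPy version uses the standard Wang and Oates GASF/GADF definitions; the paper's asymmetric variant is not specified in this review, so the symmetric convention, the resampling length, and the min-max rescaling are assumptions of this sketch:

```python
import numpy as np

def gaf_channels(x, size=64):
    """Encode one 1D kinematic signal as standard GASF/GADF matrices.

    Assumption: standard (symmetric) GAF definitions; the paper's
    *asymmetric* variant is not specified here.
    """
    # Resample to a fixed length so every signature yields a size x size image.
    x = np.interp(np.linspace(0, len(x) - 1, size), np.arange(len(x)), x)
    lo, hi = x.min(), x.max()
    x = 2 * (x - lo) / (hi - lo + 1e-8) - 1            # rescale to [-1, 1]
    phi = np.arccos(np.clip(x, -1.0, 1.0))             # polar angular encoding
    gasf = np.cos(phi[:, None] + phi[None, :])         # pairwise temporal co-occurrence
    gadf = np.sin(phi[:, None] - phi[None, :])         # directional transition structure
    return gasf, gadf

def six_channel_image(speed, dpressure, angle, size=64):
    # Stack GASF and GADF planes for the three kinematic channels -> (6, size, size).
    planes = []
    for sig in (speed, dpressure, angle):
        gasf, gadf = gaf_channels(sig, size)
        planes.extend([gasf, gadf])
    return np.stack(planes)
```

Under these definitions the GASF plane is symmetric and the GADF plane antisymmetric, which is what gives the two branches complementary structure to attend over.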

If this is right

  • The representational gain of 2D temporal encoding is consistent and independent of training procedure.
  • Ablations show measurable contribution from the six-channel design, dual-branch processing, and cross-attention.
  • Pretrained 2D vision backbones become directly applicable to online signature verification.
  • Cosine similarity verification against small enrollment prototypes works reliably under high intra-class variability.
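The last point can be made concrete. A minimal sketch of enrollment and verification, assuming the prototype is the renormalized mean of the few enrollment embeddings and using an illustrative threshold (the paper's exact prototype construction and threshold selection are not stated in this summary):

```python
import numpy as np

def normalize(v):
    return v / (np.linalg.norm(v, axis=-1, keepdims=True) + 1e-12)

def enroll(embeddings):
    # Prototype = renormalized mean of the enrollment embeddings.
    # Assumption: the mean is one common convention, not necessarily the paper's.
    return normalize(normalize(np.asarray(embeddings)).mean(axis=0))

def verify(query, prototype, threshold=0.7):
    # Cosine similarity of the query embedding to the enrollment prototype.
    score = float(normalize(query) @ prototype)
    return score, score >= threshold
```

The threshold would in practice be tuned on a development split to a target equal-error-rate operating point.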

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The same angular-field encoding could be tested on other sequential biometric signals such as keystroke or gait data.
  • Larger vision backbones might widen the observed gap between image and sequence approaches.
  • End-to-end optimization of the field parameters instead of fixed GASF/GADF definitions is a natural next experiment.

Load-bearing premise

Converting raw 1D kinematic sequences into six-channel asymmetric GASF and GADF images preserves all discriminative information without loss.

What would settle it

Training and evaluating an otherwise identical network directly on the raw 1D temporal sequences and finding equal or higher verification accuracy than the six-channel image version on DeepSignDB.

Figures

Figures reproduced from arXiv: 2605.00120 by Himanshu Singhal, Suresh Sundaram.

Figure 1. Raw signature trajectories (a), Asymmetric GASF (b), …
Figure 2. Overview of the proposed dual-branch architecture for online signature verification. Three kinematic signals (speed v(t), pressure derivative ṗ(t), and direction angle θ(t)) are encoded into GASF and GADF images, forming two 3-channel inputs processed by independent ConvNeXt-Tiny backbones. The resulting feature maps are tokenized, refined through intra-branch self-attention, and fused via bidirectional c…
Figure 3. Illustration of the semi-hard triplet loss.
read the original abstract

Online signature verification (OSV) requires distinguishing skilled forgeries from genuine samples under high intra-class variability and with very few enrollment samples. Existing deep learning methods operate directly on raw temporal sequences, restricting them to 1D architectures and preventing the use of pretrained 2D vision backbones. We bridge this gap with GAFSV-Net, which represents each signature as a six-channel asymmetric Gramian Angular Field image: three kinematic channels (pen speed, pressure derivative, direction angle) are each encoded into complementary GASF and GADF matrices that capture pairwise temporal co-occurrence and directional transition structure respectively. A dual-branch ConvNeXt-Tiny encoder processes GASF and GADF independently, with bidirectional cross-attention enabling each branch to query discriminative patterns from the other before metric-space projection. Training uses semi-hard triplet loss with skilled-forgery hard-negative injection; verification is performed via cosine similarity against a small enrollment prototype. We evaluate on DeepSignDB and BiosecurID, outperforming all sequence-based baselines trained under identical objectives, demonstrating that the representational gain of 2D temporal encoding is consistent and independent of training procedure, with ablations characterising each design choice's contribution.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

0 major / 3 minor

Summary. The manuscript introduces GAFSV-Net for online signature verification. It encodes each signature's 1D kinematic time series (pen speed, pressure derivative, direction angle) into a six-channel asymmetric Gramian Angular Field image by computing complementary GASF and GADF matrices per channel. These images are processed by a dual-branch ConvNeXt-Tiny encoder with bidirectional cross-attention, trained via semi-hard triplet loss with skilled-forgery hard-negative injection. Verification uses cosine similarity against a small enrollment prototype. The central claim is that this 2D temporal encoding yields consistent outperformance over sequence-based baselines on DeepSignDB and BiosecurID, with the gain independent of training procedure and ablations quantifying each design choice.
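The training objective described in this summary can be sketched for a single anchor. Here "skilled-forgery hard-negative injection" is read as adding forgery embeddings to the candidate negative pool before mining; the margin value and the fallback to the hardest negative are illustrative assumptions, not the paper's stated procedure:

```python
import numpy as np

def semi_hard_triplet_loss(anchor, positive, negatives, margin=0.2):
    """Semi-hard mining sketch: among candidate negatives (including any
    injected skilled-forgery embeddings), prefer one that is farther than
    the positive but still within the margin; otherwise fall back to the
    hardest (closest) negative."""
    d_ap = np.linalg.norm(anchor - positive)
    d_an = np.linalg.norm(anchor - negatives, axis=1)
    semi = d_an[(d_an > d_ap) & (d_an < d_ap + margin)]
    d_neg = semi.min() if semi.size else d_an.min()
    return max(d_ap - d_neg + margin, 0.0)
```

Semi-hard negatives keep gradients informative: a negative already far outside the margin contributes zero loss, while one closer than the positive can destabilize early training.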

Significance. If the reported results hold, the work is significant because it demonstrates that invertible 2D image encodings of temporal data enable effective use of pretrained 2D vision backbones (ConvNeXt-Tiny) in a domain previously restricted to 1D architectures, producing gains that are robust across training procedures. The explicit invertibility of GASF/GADF (recoverable from the diagonal or via arccos/arcsin) and the ablation of standard components (cross-attention, triplet loss) provide a clear, falsifiable basis for the representational advantage. This could extend to other kinematic or time-series verification tasks and encourages exploration of image-based encodings for sequential biometrics.
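The invertibility the report notes can be checked directly for the standard GASF under a [0, 1] rescaling convention (the paper's asymmetric variant may differ): the diagonal satisfies G_ii = cos(2φ_i) = 2x_i² − 1, so each rescaled sample is recovered exactly from the diagonal.

```python
import numpy as np

def gasf(x):
    # Standard (symmetric) GASF; assumes x is already rescaled to [0, 1].
    phi = np.arccos(np.clip(x, 0.0, 1.0))
    return np.cos(phi[:, None] + phi[None, :])

def invert_gasf_diagonal(g):
    # G_ii = cos(2*phi_i) = 2*x_i**2 - 1  =>  x_i = sqrt((G_ii + 1) / 2)
    return np.sqrt((np.diag(g) + 1.0) / 2.0)
```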

minor comments (3)
  1. Abstract: the claim of 'consistent outperformance' and 'ablations characterising each design choice' would be strengthened by including at least one key quantitative result (e.g., EER reduction on DeepSignDB) and a brief statement of dataset splits or statistical testing.
  2. The six-channel construction is described clearly, but a small illustrative figure showing the GASF/GADF encoding of a single kinematic channel would improve accessibility for readers unfamiliar with Gramian Angular Fields.
  3. Section on verification procedure: the enrollment prototype construction and cosine-similarity threshold selection should be stated more explicitly to allow exact reproduction.

Simulated Author's Rebuttal

0 responses · 0 unresolved

We thank the referee for the positive assessment of our manuscript, the accurate summary of GAFSV-Net, and the recommendation for minor revision. The significance discussion correctly identifies the core contribution of invertible 2D encodings enabling pretrained vision backbones for online signature verification. No major comments were raised in the report.

Circularity Check

0 steps flagged

No significant circularity detected

full rationale

The paper's core contribution is an empirical representational change: converting 1D kinematic signature sequences into six-channel asymmetric GASF/GADF images to enable 2D vision backbones, with performance gains measured via direct comparison to sequence baselines on DeepSignDB and BiosecurID under matched training objectives, plus ablations of design choices. No equations, derivations, or load-bearing steps reduce any claimed result to a fitted parameter, self-definition, or self-citation chain by construction. The invertibility of the GAF transform is noted but does not create circularity, as the benefit is externally validated rather than assumed.

Axiom & Free-Parameter Ledger

1 free parameter · 1 axiom · 0 invented entities

The central claim rests on the unproven premise that the chosen GAF encoding is information-preserving for signature discrimination and that the dual-branch cross-attention architecture adds value beyond a single-branch baseline; these are domain assumptions not independently validated in the abstract.

free parameters (1)
  • selection of three kinematic channels
    Pen speed, pressure derivative, and direction angle are chosen without stated justification or ablation against other possible features.
axioms (1)
  • domain assumption Gramian Angular Field matrices capture pairwise temporal co-occurrence and directional transitions that are more discriminative than raw sequences for skilled forgery detection
    Invoked when the paper states that GASF and GADF encode complementary structure enabling 2D vision backbones.

pith-pipeline@v0.9.0 · 5511 in / 1456 out tokens · 38029 ms · 2026-05-09T21:04:36.693093+00:00 · methodology

discussion (0)


Reference graph

Works this paper leans on

26 extracted references · 4 canonical work pages · 1 internal anchor

  1. [1] S. Bai, J. Z. Kolter, and V. Koltun. An empirical evaluation of generic convolutional and recurrent networks for sequence modeling. arXiv preprint arXiv:1803.01271, 2018.

  2. [2] T. Chen, S. Kornblith, M. Norouzi, and G. Hinton. A simple framework for contrastive learning of visual representations. In International Conference on Machine Learning, pages 1597–1607. PMLR, 2020.

  3. [3] J. Fierrez, J. Galbally, J. Ortega-Garcia, M. R. Freire, F. Alonso-Fernandez, D. Ramos, D. T. Toledano, J. Gonzalez-Rodriguez, J.-L. Siguero, S. Garcia-Salicetti, et al. BiosecurID: a multimodal biometric database. Pattern Analysis and Applications, 13(2):235–246, 2010.

  4. [4] J. Fierrez, J. Ortega-Garcia, D. Ramos, and J. Gonzalez-Rodriguez. HMM-based on-line signature verification: Feature extraction and signature modeling. Pattern Recognition Letters, 28(16):2325–2334, 2007.

  5. [5] M. Goswami, K. Szafer, A. Choudhry, Y. Cai, S. Li, and A. Dubrawski. MOMENT: A family of open time-series foundation models. arXiv preprint arXiv:2402.03885, 2024.

  6. [6] N. Hatami, Y. Gavet, and J. Debayle. Classification of time-series images using deep convolutional neural networks. In Proceedings of SPIE — Tenth International Conference on Machine Vision, volume 10696, page 106960Y, 2018.

  7. [7] D. Impedovo and G. Pirlo. Automatic signature verification: The state of the art. IEEE Transactions on Systems, Man, and Cybernetics, Part C (Applications and Reviews), 38(5):609–635, 2008.

  8. [8] A. Kholmatov and B. Yanikoglu. Identity authentication using improved online signature verification method. Pattern Recognition Letters, 26(15):2400–2408, 2005.

  9. [9] S. Lai and L. Jin. Recurrent neural network for online signature verification. arXiv preprint arXiv:2002.10119, 2020.

  10. [10] Z. Liu, H. Mao, C.-Y. Wu, C. Feichtenhofer, T. Darrell, and S. Xie. A ConvNet for the 2020s. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pages 11976–11986, 2022.

  11. [11] P. Melzi, R. Vera-Rodriguez, R. Tolosana, and J. Fierrez. Exploring transformers for on-line handwritten signature verification. arXiv preprint arXiv:2307.10532, 2023.

  12. [12] M. E. Munich and P. Perona. Continuous dynamic time warping for translation-invariant curve alignment with applications to signature verification. In Proceedings of the Seventh International Conference on Computer Vision, pages 108–115. IEEE, 1999.

  13. [13] D. Muramatsu and T. Matsumoto. An HMM on-line signature verification algorithm. In Proceedings of the 4th International Conference on Audio- and Video-Based Biometric Person Authentication, AVBPA'03, pages 233–241, Berlin, Heidelberg.

  14. [14] A. Sharma and S. Sundaram. On the exploration of information from the DTW cost matrix for online signature verification. IEEE Transactions on Cybernetics, 48(2):611–624, 2018.

  15. [15] R. Tolosana, R. Vera-Rodriguez, J. Fierrez, and J. Ortega-Garcia. Exploring recurrent neural networks for on-line handwritten signature biometrics. IEEE Access, 6:5128–5138, 2018.

  16. [16] R. Tolosana, R. Vera-Rodriguez, J. Fierrez, and J. Ortega-Garcia. Online signature verification based on a single template via elastic subsequence matching. IET Biometrics, 8(1):37–46, 2019.

  17. [17] R. Tolosana, R. Vera-Rodriguez, J. Fierrez, and J. Ortega-Garcia. DeepSign: Deep on-line signature verification. IEEE Transactions on Biometrics, Behavior, and Identity Science, 3(2):229–239, 2021.

  18. [18] R. Tolosana, R. Vera-Rodriguez, J. Fierrez, and J. Ortega-Garcia. DeepSignDB: A large-scale database for online handwritten signature biometric verification. Pattern Recognition Letters, 150:112–120, 2021.

  19. [19] A. Vaswani, N. Shazeer, N. Parmar, J. Uszkoreit, L. Jones, A. N. Gomez, Ł. Kaiser, and I. Polosukhin. Attention is all you need. Advances in Neural Information Processing Systems, 30, 2017.

  20. [20] C. S. Vorugunti, A. Gautam, and V. Pulabaigari. A hybrid transformer and convolution signature network for online signature verification. In Proceedings of the International Joint Conference on Biometrics (IJCB), pages 1–9. IEEE, 2023.

  21. [21] C. S. Vorugunti, A. Gautam, and V. Pulabaigari. OSVConTramer: A hybrid CNN and transformer based online signature verification. In Proceedings of the International Joint Conference on Biometrics (IJCB), pages 1–10. IEEE, 2023.

  22. [22] C. S. Vorugunti and V. Pulabaigari. OSVNet: Convolutional siamese network for writer independent online signature verification. In Proceedings of the International Conference on Document Analysis and Recognition (ICDAR), pages 1470–

  23. [23] T. Wang and P. Isola. Understanding contrastive representation learning through alignment and uniformity on the hypersphere. In International Conference on Machine Learning, pages 9929–9939. PMLR, 2020.

  24. [24] Z. Wang and T. Oates. Encoding time series as images for visual inspection and classification using tiled convolutional neural networks. In Workshops at the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015.

  25. [25] P. Wei, H. Li, and P. Hu. Inverse discriminative networks for handwritten signature verification. In 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 5757–5765, 2019.

  26. [26] R. Wightman. PyTorch Image Models. https://github.com/rwightman/pytorch-image-models, 2019.