hub

GeoNum: Bridging numerical continuity and language semantics via geometric embedding

Evgenii Nikishin, Remo Abachi, Rishabh Agarwal, Pierre-Luc Bacon · 2022 · DOI 10.1609/aaai

29 Pith papers cite this work. Polarity classification is still indexing.

29 Pith papers citing it

open at publisher browse 29 citing papers

hub tools

JSON dossier citing papers JSON publisher DOI

representative citing papers

Same Image, Different Meanings: Toward Retrieval of Context-Dependent Meanings

cs.IR · 2026-05-13 · unverdicted · novelty 7.0

Image meanings grow more context-dependent with semantic abstraction, requiring narrative grounding for accurate retrieval at higher levels.

IGT-OMD: Implicit Gradient Transport for Decision-Focused Learning under Delayed Feedback

cs.LG · 2026-05-12 · unverdicted · novelty 7.0

IGT-OMD reduces gradient transport error from quadratic to linear in delay length for delayed bilevel optimization and achieves sublinear regret with adaptive steps.

Large Language Models as Amortized Pareto-Front Generators for Constrained Bi-Objective Convex Optimization

cs.AI · 2026-05-12 · unverdicted · novelty 7.0

DIPS fine-tunes LLMs to output ordered feasible decision vectors approximating Pareto fronts for constrained bi-objective convex problems, reaching 95-98% normalized hypervolume with 0.16s inference.

Closing the Theory-Practice Gap in Spiking Transformers via Effective Dimension

cs.LG · 2026-04-17 · unverdicted · novelty 7.0

Spiking attention is a universal approximator of permutation-equivariant functions with ε-approximation requiring Ω(L_f² nd / ε²) spikes, but low effective dimensions (47-89) allow T=4 timesteps in practice.

NL2SQLBench: A Modular Benchmarking Framework for LLM-Enabled NL2SQL Solutions

cs.DB · 2026-04-13 · conditional · novelty 7.0

NL2SQLBench is a new modular benchmarking framework that evaluates LLM NL2SQL methods across three core modules on existing datasets, exposing large accuracy gaps and computational inefficiency.

Tencent Advertising Algorithm Challenge 2025: All-Modality Generative Recommendation

cs.IR · 2026-04-04 · accept · novelty 7.0

Releases TencentGR-1M and TencentGR-10M datasets with baselines for all-modality generative recommendation in advertising, including weighted evaluation for conversions.

No One Knows the State of the Art in Geospatial Foundation Models

cs.CV · 2026-05-12 · accept · novelty 6.0

An audit of 152 papers reveals that geospatial foundation models lack standardized evaluations, training controls, and weight releases, so no one knows the state of the art.

CauSim: Scaling Causal Reasoning with Increasingly Complex Causal Simulators

cs.AI · 2026-05-09 · unverdicted · novelty 6.0

CauSim turns scarce causal reasoning labels into scalable supervised data by having LLMs incrementally construct complex executable structural causal models.

NICE FACT: Diagnosing and Calibrating VLMs in Quantitative Reasoning for Kinematic Physics

cs.CV · 2026-05-08 · unverdicted · novelty 6.0

VLMs fail to identify visual preconditions or apply physical laws in kinematic physics tasks, as shown by new FACT diagnostics and NICE calibration methods evaluated on six state-of-the-art models.

Hallucination as an Anomaly: Dynamic Intervention via Probabilistic Circuits

cs.CL · 2026-05-07 · unverdicted · novelty 6.0

Probabilistic circuits detect LLM hallucinations as residual-stream anomalies with up to 99% AUROC and enable dynamic correction that raises truthfulness scores while cutting unnecessary output corruption.

Adversarial Graph Neural Network Benchmarks: Towards Practical and Fair Evaluation

cs.LG · 2026-05-07 · unverdicted · novelty 6.0

A large-scale standardized benchmark of GNN attacks and defenses reveals that target node selection and attacked-model training process can completely distort measured attack effectiveness.

Identifier-Free Code Embedding Models for Scalable Search

cs.CR · 2026-05-05 · unverdicted · novelty 6.0

A fine-tuned Qwen3-Embedding model with contrastive learning outperforms baselines on bidirectional source-to-decompiled code association and generalizes to constant-algorithm tasks.

SIFT-VTON: Geometric Correspondence Supervision on Cross-Attention for Virtual Try-On

cs.CV · 2026-05-02 · unverdicted · novelty 6.0

SIFT-VTON adds explicit geometric supervision from SIFT keypoints to diffusion-based virtual try-on to improve spatial alignment and detail preservation.

Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation

cs.CV · 2026-05-02 · unverdicted · novelty 6.0

CoE applies vision-language models directly to document screenshots to deliver pixel-level bounding-box attribution for evidence in iterative retrieval-augmented generation, outperforming text baselines on visual-layout tasks.

Tail allocation for conformal prediction intervals

stat.ME · 2026-04-28 · unverdicted · novelty 6.0

TA-CQR adaptively allocates miscoverage tails to produce shortest single-interval conformal prediction sets with exact marginal coverage and provides oracle inequalities for length.

Provably Adaptive Linear Approximation for the Shapley Value and Beyond

cs.LG · 2026-04-09 · unverdicted · novelty 6.0

Introduces Adalina, the first adaptive linear-time linear-space randomized algorithm for semi-value approximation with provable O(n/ε² log(1/δ)) query complexity and explicit MSE minimization.

Deep Image Clustering Based on Curriculum Learning and Density Information

cs.CV · 2026-03-31 · unverdicted · novelty 6.0

IDCL adds density-based curriculum learning and density-core guidance to deep image clustering, claiming superior robustness, faster convergence, and flexibility on benchmark datasets.

How Creatives Approach GenAI Image Generation: Tensions Between Structured Guidance, Self-Experimentation, and Creative Autonomy

cs.HC · 2026-05-11 · accept · novelty 5.0

Creatives prefer self-experimentation over structured guidance for GenAI image tools to preserve creative freedom, even when guidance aids AI literacy.

Signal Reshaping for GRPO in Weak-Feedback Agentic Code Repair

cs.AI · 2026-05-08 · unverdicted · novelty 5.0

Reshaping outcome rewards, process signals, and rollout comparability in GRPO raises strict compile-and-semantic accuracy in agentic code repair from 0.385 to 0.535 under weak feedback.

Architecture-agnostic Lipschitz-constant Bayesian header and its application to resolve semantically proximal classification errors with vision transformers

cs.CV · 2026-05-07 · unverdicted · novelty 5.0

LipB-ViT adds bi-Lipschitz Bayesian layers to vision transformers and uses uncertainty-aware fusion to identify corrupted labels with over 93% recall at 15% noise, beating kNN baselines.

Bias in the Loop: Auditing LLM-as-a-Judge for Software Engineering

cs.SE · 2026-04-18 · unverdicted · novelty 5.0

LLM judges for code tasks show high sensitivity to prompt biases that systematically favor certain options, changing accuracy and model rankings even when code is unchanged.

Consistency Analysis of Sentiment Predictions using Syntactic & Semantic Context Assessment Summarization (SSAS)

cs.CL · 2026-04-16 · unverdicted · novelty 5.0

SSAS improves LLM sentiment prediction consistency and data quality by up to 30% on three review datasets via syntactic and semantic context assessment summarization.

Weak-to-Strong Knowledge Distillation Accelerates Visual Learning

cs.CV · 2026-04-16 · unverdicted · novelty 5.0

Weak-to-strong knowledge distillation applied early and then turned off accelerates convergence to target performance in visual learning tasks by factors of 1.7-4.8x.

MyoVision: A Mobile Research Tool and NEATBoost-Attention Ensemble Framework for Real Time Chicken Breast Myopathy Detection

cs.LG · 2026-04-15 · unverdicted · novelty 5.0

Smartphone transillumination imaging paired with a neuroevolution-tuned ensemble model classifies chicken breast myopathies at 82.4% accuracy on 336 fillets, matching costly hyperspectral systems.

citing papers explorer

Showing 29 of 29 citing papers.

Same Image, Different Meanings: Toward Retrieval of Context-Dependent Meanings cs.IR · 2026-05-13 · unverdicted · none · ref 22
Image meanings grow more context-dependent with semantic abstraction, requiring narrative grounding for accurate retrieval at higher levels.
IGT-OMD: Implicit Gradient Transport for Decision-Focused Learning under Delayed Feedback cs.LG · 2026-05-12 · unverdicted · none · ref 13
IGT-OMD reduces gradient transport error from quadratic to linear in delay length for delayed bilevel optimization and achieves sublinear regret with adaptive steps.
Large Language Models as Amortized Pareto-Front Generators for Constrained Bi-Objective Convex Optimization cs.AI · 2026-05-12 · unverdicted · none · ref 22
DIPS fine-tunes LLMs to output ordered feasible decision vectors approximating Pareto fronts for constrained bi-objective convex problems, reaching 95-98% normalized hypervolume with 0.16s inference.
Closing the Theory-Practice Gap in Spiking Transformers via Effective Dimension cs.LG · 2026-04-17 · unverdicted · none · ref 23
Spiking attention is a universal approximator of permutation-equivariant functions with ε-approximation requiring Ω(L_f² nd / ε²) spikes, but low effective dimensions (47-89) allow T=4 timesteps in practice.
NL2SQLBench: A Modular Benchmarking Framework for LLM-Enabled NL2SQL Solutions cs.DB · 2026-04-13 · conditional · none · ref 75
NL2SQLBench is a new modular benchmarking framework that evaluates LLM NL2SQL methods across three core modules on existing datasets, exposing large accuracy gaps and computational inefficiency.
Tencent Advertising Algorithm Challenge 2025: All-Modality Generative Recommendation cs.IR · 2026-04-04 · accept · none · ref 77
Releases TencentGR-1M and TencentGR-10M datasets with baselines for all-modality generative recommendation in advertising, including weighted evaluation for conversions.
No One Knows the State of the Art in Geospatial Foundation Models cs.CV · 2026-05-12 · accept · none · ref 6
An audit of 152 papers reveals that geospatial foundation models lack standardized evaluations, training controls, and weight releases, so no one knows the state of the art.
CauSim: Scaling Causal Reasoning with Increasingly Complex Causal Simulators cs.AI · 2026-05-09 · unverdicted · none · ref 22
CauSim turns scarce causal reasoning labels into scalable supervised data by having LLMs incrementally construct complex executable structural causal models.
NICE FACT: Diagnosing and Calibrating VLMs in Quantitative Reasoning for Kinematic Physics cs.CV · 2026-05-08 · unverdicted · none · ref 8
VLMs fail to identify visual preconditions or apply physical laws in kinematic physics tasks, as shown by new FACT diagnostics and NICE calibration methods evaluated on six state-of-the-art models.
Hallucination as an Anomaly: Dynamic Intervention via Probabilistic Circuits cs.CL · 2026-05-07 · unverdicted · none · ref 10
Probabilistic circuits detect LLM hallucinations as residual-stream anomalies with up to 99% AUROC and enable dynamic correction that raises truthfulness scores while cutting unnecessary output corruption.
Adversarial Graph Neural Network Benchmarks: Towards Practical and Fair Evaluation cs.LG · 2026-05-07 · unverdicted · none · ref 16
A large-scale standardized benchmark of GNN attacks and defenses reveals that target node selection and attacked-model training process can completely distort measured attack effectiveness.
Identifier-Free Code Embedding Models for Scalable Search cs.CR · 2026-05-05 · unverdicted · none · ref 6
A fine-tuned Qwen3-Embedding model with contrastive learning outperforms baselines on bidirectional source-to-decompiled code association and generalizes to constant-algorithm tasks.
SIFT-VTON: Geometric Correspondence Supervision on Cross-Attention for Virtual Try-On cs.CV · 2026-05-02 · unverdicted · none · ref 26
SIFT-VTON adds explicit geometric supervision from SIFT keypoints to diffusion-based virtual try-on to improve spatial alignment and detail preservation.
Chain of Evidence: Pixel-Level Visual Attribution for Iterative Retrieval-Augmented Generation cs.CV · 2026-05-02 · unverdicted · none · ref 9
CoE applies vision-language models directly to document screenshots to deliver pixel-level bounding-box attribution for evidence in iterative retrieval-augmented generation, outperforming text baselines on visual-layout tasks.
Tail allocation for conformal prediction intervals stat.ME · 2026-04-28 · unverdicted · none · ref 8
TA-CQR adaptively allocates miscoverage tails to produce shortest single-interval conformal prediction sets with exact marginal coverage and provides oracle inequalities for length.
Provably Adaptive Linear Approximation for the Shapley Value and Beyond cs.LG · 2026-04-09 · unverdicted · none · ref 2
Introduces Adalina, the first adaptive linear-time linear-space randomized algorithm for semi-value approximation with provable O(n/ε² log(1/δ)) query complexity and explicit MSE minimization.
Deep Image Clustering Based on Curriculum Learning and Density Information cs.CV · 2026-03-31 · unverdicted · none · ref 40
IDCL adds density-based curriculum learning and density-core guidance to deep image clustering, claiming superior robustness, faster convergence, and flexibility on benchmark datasets.
How Creatives Approach GenAI Image Generation: Tensions Between Structured Guidance, Self-Experimentation, and Creative Autonomy cs.HC · 2026-05-11 · accept · none · ref 9
Creatives prefer self-experimentation over structured guidance for GenAI image tools to preserve creative freedom, even when guidance aids AI literacy.
Signal Reshaping for GRPO in Weak-Feedback Agentic Code Repair cs.AI · 2026-05-08 · unverdicted · none · ref 54
Reshaping outcome rewards, process signals, and rollout comparability in GRPO raises strict compile-and-semantic accuracy in agentic code repair from 0.385 to 0.535 under weak feedback.
Architecture-agnostic Lipschitz-constant Bayesian header and its application to resolve semantically proximal classification errors with vision transformers cs.CV · 2026-05-07 · unverdicted · none · ref 13
LipB-ViT adds bi-Lipschitz Bayesian layers to vision transformers and uses uncertainty-aware fusion to identify corrupted labels with over 93% recall at 15% noise, beating kNN baselines.
Bias in the Loop: Auditing LLM-as-a-Judge for Software Engineering cs.SE · 2026-04-18 · unverdicted · none · ref 20
LLM judges for code tasks show high sensitivity to prompt biases that systematically favor certain options, changing accuracy and model rankings even when code is unchanged.
Consistency Analysis of Sentiment Predictions using Syntactic & Semantic Context Assessment Summarization (SSAS) cs.CL · 2026-04-16 · unverdicted · none · ref 54
SSAS improves LLM sentiment prediction consistency and data quality by up to 30% on three review datasets via syntactic and semantic context assessment summarization.
Weak-to-Strong Knowledge Distillation Accelerates Visual Learning cs.CV · 2026-04-16 · unverdicted · none · ref 60
Weak-to-strong knowledge distillation applied early and then turned off accelerates convergence to target performance in visual learning tasks by factors of 1.7-4.8x.
MyoVision: A Mobile Research Tool and NEATBoost-Attention Ensemble Framework for Real Time Chicken Breast Myopathy Detection cs.LG · 2026-04-15 · unverdicted · none · ref 24
Smartphone transillumination imaging paired with a neuroevolution-tuned ensemble model classifies chicken breast myopathies at 82.4% accuracy on 336 fillets, matching costly hyperspectral systems.
Case-Grounded Evidence Verification: A Framework for Constructing Evidence-Sensitive Supervision cs.CL · 2026-04-10 · unverdicted · none · ref 43
A supervision construction procedure generates explicit support and controlled non-support examples (counterfactual and topic-related negatives) without manual annotation, producing verifiers that demonstrate genuine evidence dependence in radiology tasks.
MemCoT: Test-Time Scaling through Memory-Driven Chain-of-Thought cs.MA · 2026-04-09 · unverdicted · none · ref 56
MemCoT redefines long-context reasoning as iterative stateful search with zoom-in/zoom-out memory perception and dual short-term memories, claiming SOTA results on LoCoMo and LongMemEval-S benchmarks.
A Graph-Enhanced Defense Framework for Explainable Fake News Detection with LLM cs.CL · 2026-04-08 · unverdicted · none · ref 74
G-Defense builds claim-centered graphs from sub-claims, applies RAG for evidence and competing explanations, then uses graph inference to detect fake news veracity and generate intuitive explanation graphs, claiming SOTA results.
Democratizing the medieval English legal tradition cs.CV · 2026-05-01 · conditional · none · ref 12
A new dataset and open-source OCR pipeline transcribes medieval English legal manuscripts at up to 88% word accuracy using CNN+LSTM and language model correction.
PaliGemma: A versatile 3B VLM for transfer cs.CV · 2024-07-10 · unverdicted · none · ref 168
PaliGemma is an open 3B VLM based on SigLIP and Gemma that achieves strong performance on nearly 40 diverse open-world tasks including benchmarks, remote-sensing, and segmentation.

GeoNum: Bridging numerical continuity and language semantics via geometric embedding

hub tools

fields

years

verdicts

representative citing papers

citing papers explorer