arxiv: 2604.22546 · v6 · pith:KYU76IKUnew · submitted 2026-04-24 · 💻 cs.CV

ReLIC-SGG: Relation Lattice Completion for Open-Vocabulary Scene Graph Generation

Pith reviewed 2026-05-14 21:27 UTC · model grok-4.3

classification 💻 cs.CV
keywords scene graph generation · open-vocabulary learning · relation lattice · incomplete annotations · positive-unlabeled learning · semantic consistency · visual language models

The pith

ReLIC-SGG infers missing relations in open-vocabulary scene graphs using a semantic predicate lattice.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

The paper addresses incomplete annotations in open-vocabulary scene graph generation, where many true relations go unlabeled and several predicates can describe the same interaction at different levels of granularity. It introduces a framework that models unannotated object-pair relations as latent variables instead of assuming they are negatives. A semantic relation lattice is constructed to capture how predicates relate through similarity, entailment, and contradiction. This lattice helps infer which missing relations are likely true based on visual-language compatibility, surrounding graph context, and semantic consistency, which the paper reports improves handling of rare and unseen predicates.
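To make the "latent variable" idea concrete, here is a minimal sketch of how three evidence scores for an unannotated object pair could be fused into a probability that the relation actually holds. The function name `latent_positive_prob`, the weights, and the logistic form are all assumptions made for illustration; the summary does not say how the paper combines these signals.

```python
import math


def latent_positive_prob(vl_score, context_score, consistency_score,
                         weights=(1.0, 1.0, 1.0), bias=0.0):
    """Fuse three raw evidence scores into P(relation is a missing positive).

    Hypothetical sketch: inputs are on a logit scale, and the weights and
    bias stand in for parameters that would be learned or tuned.
    """
    w_vl, w_ctx, w_con = weights
    logit = (w_vl * vl_score + w_ctx * context_score
             + w_con * consistency_score + bias)
    # Numerically stable sigmoid: avoid exp() of a large positive number.
    if logit >= 0:
        return 1.0 / (1.0 + math.exp(-logit))
    e = math.exp(logit)
    return e / (1.0 + e)


# An unannotated pair with strong visual-language and context support,
# scored once with lattice agreement and once with lattice disagreement.
supported = latent_positive_prob(2.0, 1.0, 1.0)
conflicted = latent_positive_prob(2.0, 1.0, -3.0)
```

The point of the sketch is only that an unannotated pair receives a soft probability rather than a hard negative label; a relation the lattice contradicts ends up with a lower probability than one it supports.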

Core claim

ReLIC-SGG builds a semantic relation lattice to model similarity, entailment, and contradiction among open-vocabulary predicates, and uses it to infer missing positive relations from visual-language compatibility, graph context, and semantic consistency. A positive-unlabeled graph learning objective reduces false-negative supervision, while lattice-guided decoding produces compact and semantically consistent scene graphs.

What carries the argument

The semantic relation lattice that encodes relationships like similarity, entailment, and contradiction between predicates to infer and complete missing positive relations.
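A small data-structure sketch may help picture what such a lattice stores. The class `RelationLattice`, its method names, and the example edges are hypothetical and chosen for illustration; similarity edges are left out for brevity, and nothing here is taken from the paper's actual construction.

```python
from collections import defaultdict


class RelationLattice:
    """Toy predicate lattice with entailment and contradiction edges."""

    def __init__(self):
        self.entails = defaultdict(set)      # specific -> more general predicates
        self.contradicts = defaultdict(set)  # symmetric incompatibility

    def add_entailment(self, specific, general):
        self.entails[specific].add(general)

    def add_contradiction(self, a, b):
        self.contradicts[a].add(b)
        self.contradicts[b].add(a)

    def implied(self, predicate):
        """Every predicate reachable through entailment edges."""
        seen, stack = set(), [predicate]
        while stack:
            for general in self.entails[stack.pop()]:
                if general not in seen:
                    seen.add(general)
                    stack.append(general)
        return seen

    def consistent(self, a, b):
        """True if nothing a implies contradicts anything b implies."""
        left = self.implied(a) | {a}
        right = self.implied(b) | {b}
        return not any(q in self.contradicts[p] for p in left for q in right)


# Illustrative edges only; the real lattice would be built from data.
lattice = RelationLattice()
lattice.add_entailment("standing on", "on")
lattice.add_entailment("resting on", "on")
lattice.add_entailment("on", "supported by")
lattice.add_contradiction("on", "under")
```

With these edges, an annotated "standing on" implies "on" and "supported by", so those unannotated predicates are plausible missing positives, while "under" is ruled out because it contradicts the implied "on".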

If this is right

  • Improves recognition of rare and unseen predicates on benchmarks.
  • Recovers more missing relations in conventional, open-vocabulary, and panoptic SGG tasks.
  • Reduces the impact of false-negative labels during training via positive-unlabeled learning.
  • Generates scene graphs that are more compact and semantically consistent through guided decoding.
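The positive-unlabeled point in the list above can be illustrated with the standard non-negative PU risk estimator from the PU-learning literature. This is a swapped-in, generic formulation, not the paper's objective, which the summary does not spell out; the class prior `prior` and the example scores are assumptions.

```python
import math
from statistics import fmean


def logistic_loss(score, label):
    """Log-loss of a raw score for label +1 or -1, in a stable form."""
    z = -label * score
    return max(z, 0.0) + math.log1p(math.exp(-abs(z)))


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk: annotated pairs are positive, the rest unlabeled.

    prior is the assumed fraction of true positives among unlabeled pairs.
    """
    risk_pos_as_pos = fmean([logistic_loss(s, +1) for s in pos_scores])
    risk_pos_as_neg = fmean([logistic_loss(s, -1) for s in pos_scores])
    risk_unl_as_neg = fmean([logistic_loss(s, -1) for s in unl_scores])
    # Unlabeled data mixes positives and negatives: remove the estimated
    # positive share from the negative risk and clamp it at zero.
    neg_risk = max(0.0, risk_unl_as_neg - prior * risk_pos_as_neg)
    return prior * risk_pos_as_pos + neg_risk


pos = [2.0, 1.5]          # scores of annotated triplets
unl = [1.8, -2.0, -1.5]   # scores of unannotated pairs, one likely positive
risk = nn_pu_risk(pos, unl, prior=0.3)
```

Setting `prior=0.0` recovers the conventional treatment in which every unannotated pair is a negative, which is the false-negative supervision the paper aims to reduce.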

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • This method may generalize to other vision tasks where labels are incomplete, such as action recognition or visual question answering.
  • Semantic lattices could be combined with large language models to dynamically expand relation vocabularies.
  • Testing on datasets with varying annotation densities would reveal how much the lattice compensates for human annotation gaps.

Load-bearing premise

That the semantic relation lattice plus visual-language compatibility and graph context can reliably distinguish true missing positives from true negatives without introducing more errors than it removes.

What would settle it

A manual inspection or human evaluation of inferred relations showing that more than half of the newly added relations are incorrect or that overall graph accuracy decreases.
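The check described above amounts to measuring precision on a human-verified subset of the inferred relations. The following is a minimal sketch with made-up triplets and judgments; the function name and the data are illustrative, not results from the paper.

```python
def inferred_precision(inferred, verified):
    """Precision of inferred triplets among those a human has judged.

    inferred: iterable of (subject, predicate, object) triplets.
    verified: dict mapping a triplet to True (correct) or False (incorrect).
    Triplets without a judgment are skipped rather than counted as wrong.
    """
    judged = [verified[t] for t in inferred if t in verified]
    if not judged:
        raise ValueError("no inferred triplet has a human judgment")
    return sum(judged) / len(judged)


# Hypothetical inferred relations and human judgments.
inferred = [
    ("man", "standing on", "board"),
    ("cup", "on", "table"),
    ("dog", "riding", "car"),
    ("tree", "behind", "house"),   # not judged, so ignored
]
verified = {
    ("man", "standing on", "board"): True,
    ("cup", "on", "table"): True,
    ("dog", "riding", "car"): False,
}
precision = inferred_precision(inferred, verified)
mostly_wrong = precision < 0.5   # the "more than half incorrect" criterion
```

Skipping unjudged triplets is a design choice: counting them as errors would reintroduce the very assumption (unannotated means negative) that the paper argues against.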

Figures

Figures reproduced from arXiv: 2604.22546 by Amir Hosseini, Sara Farahani, Suiyang Guang, Xinyi Li.

Figure 1: Overall framework of ReLIC-SGG. Given detected objects, the model first proposes open-vocabulary relation candidates for each …
Original abstract

Open-vocabulary scene graph generation (SGG) aims to describe visual scenes with flexible relation phrases beyond a fixed predicate set. Existing methods usually treat annotated triplets as positives and all unannotated object-pair relations as negatives. However, scene graph annotations are inherently incomplete: many valid relations are missing, and the same interaction can be described at different granularities, e.g., "on", "standing on", "resting on", and "supported by". This issue becomes more severe in open-vocabulary SGG due to the much larger relation space. We propose ReLIC-SGG, a relation-incompleteness-aware framework that treats unannotated relations as latent variables rather than definite negatives. ReLIC-SGG builds a semantic relation lattice to model similarity, entailment, and contradiction among open-vocabulary predicates, and uses it to infer missing positive relations from visual-language compatibility, graph context, and semantic consistency. A positive-unlabeled graph learning objective further reduces false-negative supervision, while lattice-guided decoding produces compact and semantically consistent scene graphs. Experiments on conventional, open-vocabulary, and panoptic SGG benchmarks show that ReLIC-SGG improves rare and unseen predicate recognition and better recovers missing relations.
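The abstract does not describe how lattice-guided decoding works, so the following is only one plausible reading, sketched as a greedy filter for a single object pair. The function `decode_pair`, the threshold, and the example scores are hypothetical; `entails` is assumed to be transitively closed already.

```python
def decode_pair(candidates, entails, contradicts, threshold=0.5):
    """Greedily pick a compact, consistent predicate set for one object pair.

    candidates: list of (predicate, score) pairs.
    entails: dict mapping a predicate to the set of predicates it implies.
    contradicts: set of frozenset({a, b}) pairs that cannot hold together.
    """
    kept = []
    for pred, score in sorted(candidates, key=lambda c: c[1], reverse=True):
        if score < threshold:
            break  # remaining candidates score even lower
        if any(frozenset((pred, k)) in contradicts for k in kept):
            continue  # conflicts with a higher-scoring predicate
        if any(pred in entails.get(k, set()) for k in kept):
            continue  # redundant: a kept, more specific predicate implies it
        # Prefer the specific predicate: drop kept ones that it implies.
        kept = [k for k in kept if k not in entails.get(pred, set())]
        kept.append(pred)
    return kept


entails = {"standing on": {"on"}}
contradicts = {frozenset(("on", "under")), frozenset(("standing on", "under"))}
candidates = [("on", 0.9), ("standing on", 0.8), ("under", 0.7),
              ("near", 0.6), ("holding", 0.2)]
result = decode_pair(candidates, entails, contradicts)
```

In this example the decoder keeps "standing on" and "near": "on" is dropped as redundant, "under" as contradictory, and "holding" falls below the threshold. Preferring the more specific predicate over a higher-scoring general one is a design choice of this sketch, not something the abstract states.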

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. ReLIC-SGG proposes a relation-incompleteness-aware framework for open-vocabulary scene graph generation. It treats unannotated object-pair relations as latent variables rather than negatives, constructs a semantic relation lattice to model predicate similarity/entailment/contradiction, and infers missing positives from visual-language compatibility, graph context, and semantic consistency. A positive-unlabeled learning objective is introduced, and lattice-guided decoding is used to produce consistent graphs. Experiments on conventional, open-vocabulary, and panoptic SGG benchmarks are reported to show gains on rare and unseen predicates plus better recovery of missing relations.

Significance. If the lattice-based inference is shown to add more true positives than false positives, the work would meaningfully address a core limitation of SGG datasets (incomplete annotations) and improve open-vocabulary predicate recognition. The combination of semantic lattice modeling with positive-unlabeled learning is a targeted response to granularity and missing-relation issues that prior methods largely ignore.

major comments (2)
  1. [§3] §3 (Method): The central claim that the semantic relation lattice reliably infers missing positives rests on the assumption that lattice similarity plus visual-language and graph context distinguish true missing relations from true negatives. No quantitative validation (e.g., precision of inferred relations on a held-out verified subset) is provided to confirm net error reduction, which is load-bearing for the reported gains on rare/unseen predicates.
  2. [§4] §4 (Experiments): Improvements on rare and unseen predicates are reported across benchmarks, but the results lack ablations that isolate the lattice inference component from the positive-unlabeled objective and decoding strategy. Without these controls it is unclear whether the lattice itself drives the claimed recovery of missing relations.
minor comments (2)
  1. [§3.1] Notation for the lattice construction (e.g., how entailment scores are computed for arbitrary open-vocabulary phrases) should be formalized with explicit equations rather than descriptive text.
  2. [§4] Figure captions and table headers would benefit from explicit statements of which metrics are computed only on annotated positives versus on the full inferred graph.

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive and insightful comments on our manuscript. We address each major comment point-by-point below, providing clarifications on our design choices and outlining the revisions we plan to incorporate.

  1. Referee: [§3] §3 (Method): The central claim that the semantic relation lattice reliably infers missing positives rests on the assumption that lattice similarity plus visual-language and graph context distinguish true missing relations from true negatives. No quantitative validation (e.g., precision of inferred relations on a held-out verified subset) is provided to confirm net error reduction, which is load-bearing for the reported gains on rare/unseen predicates.

    Authors: We agree that a direct quantitative validation of the lattice inference precision on a held-out verified subset would provide stronger support for the claim of net error reduction. While the reported gains on rare and unseen predicates across multiple benchmarks provide indirect evidence that the combination of lattice similarity, visual-language compatibility, and graph context effectively recovers true positives, we acknowledge the value of explicit verification. In the revised manuscript, we will add an analysis on a manually verified subset of inferred relations, reporting precision to demonstrate that the lattice inference yields more true positives than false positives. revision: yes

  2. Referee: [§4] §4 (Experiments): Improvements on rare and unseen predicates are reported across benchmarks, but the results lack ablations that isolate the lattice inference component from the positive-unlabeled objective and decoding strategy. Without these controls it is unclear whether the lattice itself drives the claimed recovery of missing relations.

    Authors: We thank the referee for highlighting the need for clearer component isolation. The current results show overall improvements from the full framework, but we agree that targeted ablations would better attribute the gains to the lattice inference. In the revision, we will include additional ablation experiments that disable the lattice-based inference (while keeping the positive-unlabeled objective and lattice-guided decoding active) to isolate and quantify its specific contribution to recovering missing relations on rare and unseen predicates. revision: yes

Circularity Check

0 steps flagged

No significant circularity in derivation chain


The paper introduces ReLIC-SGG by constructing a semantic relation lattice to model entailment and contradiction among open-vocabulary predicates, then infers latent positives via visual-language compatibility, graph context, and a positive-unlabeled learning objective. No equations, derivations, or self-citations are shown that reduce the claimed improvements in rare/unseen predicate recognition or missing-relation recovery to quantities defined by the method's own fitted inputs or prior self-referential results. The framework relies on independently motivated components evaluated against external benchmarks, rendering the central claims self-contained rather than circular by construction.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axioms · 1 invented entities

Framework rests on the assumption that a hand-constructed or learned semantic lattice accurately encodes entailment and contradiction among arbitrary predicates; no independent verification of lattice quality is described.

axioms (1)
  • domain assumption: Unannotated object-pair relations can be treated as latent variables whose positive status is inferable from visual-language compatibility and lattice semantics.
    Core modeling choice stated in abstract; if false, the positive-unlabeled objective collapses.
invented entities (1)
  • semantic relation lattice (no independent evidence)
    purpose: Model similarity, entailment, and contradiction among open-vocabulary predicates to infer missing positives.
    New structure introduced to guide inference; independent evidence of its accuracy is not provided in abstract.

pith-pipeline@v0.9.0 · 5527 in / 1139 out tokens · 31234 ms · 2026-05-14T21:27:15.145639+00:00 · methodology
