ReLIC-SGG: Relation Lattice Completion for Open-Vocabulary Scene Graph Generation
Pith reviewed 2026-05-14 21:27 UTC · model grok-4.3
The pith
ReLIC-SGG infers missing relations in open-vocabulary scene graphs using a semantic predicate lattice.
A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.
Core claim
ReLIC-SGG builds a semantic relation lattice to model similarity, entailment, and contradiction among open-vocabulary predicates, and uses it to infer missing positive relations from visual-language compatibility, graph context, and semantic consistency. A positive-unlabeled graph learning objective reduces false-negative supervision, while lattice-guided decoding produces compact and semantically consistent scene graphs.
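As a rough illustration of what lattice-guided decoding could look like for a single object pair, the sketch below greedily keeps high-scoring predicates and drops any candidate that is already implied by, or contradicts, a kept one. The function name `decode_pair`, the `min_score` threshold, the example scores, and the greedy rule itself are assumptions made for illustration; this summary does not describe the paper's actual decoding procedure.

```python
def decode_pair(candidates, entails, contradicts, min_score=0.5):
    """Greedy, lattice-aware selection of predicates for one object pair.

    candidates:  dict mapping predicate -> score (hypothetical model output)
    entails:     dict mapping predicate -> set of predicates it implies
                 (assumed to be transitively closed already)
    contradicts: set of frozenset({a, b}) pairs that cannot co-occur
    """
    kept = []
    ranked = sorted(candidates.items(), key=lambda kv: kv[1], reverse=True)
    for pred, score in ranked:
        if score < min_score:
            break  # remaining candidates score even lower
        # Redundant: a kept, more specific predicate already implies this one.
        redundant = any(pred in entails.get(k, set()) for k in kept)
        # Conflicting: this predicate contradicts something already kept.
        conflicting = any(frozenset((pred, k)) in contradicts for k in kept)
        if not redundant and not conflicting:
            kept.append(pred)
    return kept


# Hypothetical scores for one (subject, object) pair.
candidates = {"standing on": 0.9, "on": 0.8, "under": 0.6, "near": 0.55, "holding": 0.2}
entails = {"standing on": {"on"}}
contradicts = {frozenset(("standing on", "under"))}
print(decode_pair(candidates, entails, contradicts))
```

Because the rule is greedy, a general predicate that outscores its more specific variant would be kept alongside it; a real decoder would need an explicit policy for that case.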
What carries the argument
The semantic relation lattice, which encodes similarity, entailment, and contradiction between predicates and is used to infer missing positive relations.
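To make the idea of a predicate lattice concrete, here is a minimal sketch of a data structure holding entailment edges and contradiction pairs. The class `PredicateLattice`, its method names, and the example predicates' relationships are hypothetical; the paper's construction (for example, how entailment scores are computed for arbitrary phrases) is not specified in this summary, and similarity edges are omitted for brevity.

```python
class PredicateLattice:
    """Toy lattice: directed entailment edges plus symmetric contradiction pairs."""

    def __init__(self):
        self.parents = {}            # predicate -> set of directly entailed predicates
        self.contradictions = set()  # frozenset({a, b}) pairs

    def add_entailment(self, specific, general):
        self.parents.setdefault(specific, set()).add(general)

    def add_contradiction(self, a, b):
        self.contradictions.add(frozenset((a, b)))

    def entailed(self, predicate):
        """Every predicate reachable through entailment edges (transitive closure)."""
        seen, stack = set(), [predicate]
        while stack:
            for parent in self.parents.get(stack.pop(), ()):
                if parent not in seen:
                    seen.add(parent)
                    stack.append(parent)
        return seen

    def contradicts(self, a, b):
        """True if anything implied by a is marked contradictory to anything implied by b."""
        closure_a = self.entailed(a) | {a}
        closure_b = self.entailed(b) | {b}
        return any(frozenset((x, y)) in self.contradictions
                   for x in closure_a for y in closure_b)


lattice = PredicateLattice()
lattice.add_entailment("standing on", "on")   # assumed: "standing on" implies "on"
lattice.add_entailment("resting on", "on")    # assumed: "resting on" implies "on"
lattice.add_contradiction("on", "under")      # assumed contradiction
print(lattice.entailed("standing on"))
print(lattice.contradicts("standing on", "under"))
```

Checking contradictions over the entailment closure means a conflict declared once at the general level ("on" versus "under") is inherited by the more specific predicates.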
If this is right
- Improves recognition of rare and unseen predicates on benchmarks.
- Recovers more missing relations in conventional, open-vocabulary, and panoptic SGG tasks.
- Reduces the impact of false-negative labels during training via positive-unlabeled learning.
- Generates scene graphs that are more compact and semantically consistent through guided decoding.
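The positive-unlabeled point above can be illustrated with a standard estimator from the PU-learning literature. The sketch below uses the non-negative PU risk with a sigmoid surrogate loss; this is a swapped-in, generic technique, not the paper's graph-level objective, which this summary does not spell out. The class prior `prior` (the assumed fraction of unlabeled pairs that are truly positive) and all scores are hypothetical inputs.

```python
import math


def sigmoid_loss(score, label):
    """Sigmoid surrogate loss 1 / (1 + exp(label * score)); label is +1 or -1."""
    return 1.0 / (1.0 + math.exp(label * score))


def mean(values):
    values = list(values)
    return sum(values) / len(values)


def nn_pu_risk(pos_scores, unl_scores, prior):
    """Non-negative PU risk: unlabeled pairs are not assumed to be negatives.

    pos_scores: model scores for annotated (positive) relations
    unl_scores: model scores for unannotated (unlabeled) relations
    prior:      assumed fraction of positives among unlabeled relations
    """
    pos_as_pos = mean(sigmoid_loss(s, +1) for s in pos_scores)
    pos_as_neg = mean(sigmoid_loss(s, -1) for s in pos_scores)
    unl_as_neg = mean(sigmoid_loss(s, -1) for s in unl_scores)
    # Estimated risk on true negatives; clamped at zero so it cannot go negative.
    neg_risk = max(0.0, unl_as_neg - prior * pos_as_neg)
    return prior * pos_as_pos + neg_risk


print(nn_pu_risk([2.0, 1.5], [-1.0, 0.5, -2.0], prior=0.3))
```

The subtraction removes the expected contribution of hidden positives from the "unlabeled as negative" term, which is what reduces false-negative supervision compared with treating every unannotated pair as a negative.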
Where Pith is reading between the lines
- This method may generalize to other vision tasks where labels are incomplete, such as action recognition or visual question answering.
- Semantic lattices could be combined with large language models to dynamically expand relation vocabularies.
- Testing on datasets with varying annotation densities would reveal how much the lattice compensates for human annotation gaps.
Load-bearing premise
That the semantic relation lattice plus visual-language compatibility and graph context can reliably distinguish true missing positives from true negatives without introducing more errors than it removes.
What would settle it
A manual inspection or human evaluation of inferred relations showing that more than half of the newly added relations are incorrect or that overall graph accuracy decreases.
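The check described above reduces to a precision computation over human judgments. A minimal sketch, assuming each inferred relation receives one boolean verdict; the function names and the example verdicts are hypothetical, and the 0.5 threshold mirrors the "more than half incorrect" criterion.

```python
def inferred_precision(judgments):
    """Fraction of inferred relations that annotators judged correct.

    judgments: list of booleans, True when a human marked the relation as valid.
    """
    if not judgments:
        raise ValueError("no verified relations to score")
    return sum(judgments) / len(judgments)


def passes_check(judgments, threshold=0.5):
    """Fails when more than half of the inferred relations are incorrect."""
    return inferred_precision(judgments) >= threshold


verdicts = [True, True, False, True]  # hypothetical annotator verdicts
print(inferred_precision(verdicts), passes_check(verdicts))
```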
Original abstract
Open-vocabulary scene graph generation (SGG) aims to describe visual scenes with flexible relation phrases beyond a fixed predicate set. Existing methods usually treat annotated triplets as positives and all unannotated object-pair relations as negatives. However, scene graph annotations are inherently incomplete: many valid relations are missing, and the same interaction can be described at different granularities, e.g., "on", "standing on", "resting on", and "supported by". This issue becomes more severe in open-vocabulary SGG due to the much larger relation space. We propose ReLIC-SGG, a relation-incompleteness-aware framework that treats unannotated relations as latent variables rather than definite negatives. ReLIC-SGG builds a semantic relation lattice to model similarity, entailment, and contradiction among open-vocabulary predicates, and uses it to infer missing positive relations from visual-language compatibility, graph context, and semantic consistency. A positive-unlabeled graph learning objective further reduces false-negative supervision, while lattice-guided decoding produces compact and semantically consistent scene graphs. Experiments on conventional, open-vocabulary, and panoptic SGG benchmarks show that ReLIC-SGG improves rare and unseen predicate recognition and better recovers missing relations.
Editorial analysis
A structured set of objections, weighed in public.
Referee Report
Summary. ReLIC-SGG proposes a relation-incompleteness-aware framework for open-vocabulary scene graph generation. It treats unannotated object-pair relations as latent variables rather than negatives, constructs a semantic relation lattice to model predicate similarity/entailment/contradiction, and infers missing positives from visual-language compatibility, graph context, and semantic consistency. A positive-unlabeled learning objective is introduced, and lattice-guided decoding is used to produce consistent graphs. Experiments on conventional, open-vocabulary, and panoptic SGG benchmarks are reported to show gains on rare and unseen predicates plus better recovery of missing relations.
Significance. If the lattice-based inference is shown to add more true positives than false positives, the work would meaningfully address a core limitation of SGG datasets (incomplete annotations) and improve open-vocabulary predicate recognition. The combination of semantic lattice modeling with positive-unlabeled learning is a targeted response to granularity and missing-relation issues that prior methods largely ignore.
major comments (2)
- [§3] §3 (Method): The central claim that the semantic relation lattice reliably infers missing positives rests on the assumption that lattice similarity plus visual-language and graph context distinguish true missing relations from true negatives. No quantitative validation (e.g., precision of inferred relations on a held-out verified subset) is provided to confirm net error reduction, which is load-bearing for the reported gains on rare/unseen predicates.
- [§4] §4 (Experiments): Improvements on rare and unseen predicates are reported across benchmarks, but the results lack ablations that isolate the lattice inference component from the positive-unlabeled objective and decoding strategy. Without these controls it is unclear whether the lattice itself drives the claimed recovery of missing relations.
minor comments (2)
- [§3.1] Notation for the lattice construction (e.g., how entailment scores are computed for arbitrary open-vocabulary phrases) should be formalized with explicit equations rather than descriptive text.
- [§4] Figure captions and table headers would benefit from explicit statements of which metrics are computed only on annotated positives versus on the full inferred graph.
Simulated Author's Rebuttal
We thank the referee for the constructive and insightful comments on our manuscript. We address each major comment point-by-point below, providing clarifications on our design choices and outlining the revisions we plan to incorporate.
- Referee: [§3] §3 (Method): The central claim that the semantic relation lattice reliably infers missing positives rests on the assumption that lattice similarity plus visual-language and graph context distinguish true missing relations from true negatives. No quantitative validation (e.g., precision of inferred relations on a held-out verified subset) is provided to confirm net error reduction, which is load-bearing for the reported gains on rare/unseen predicates.
Authors: We agree that a direct quantitative validation of the lattice inference precision on a held-out verified subset would provide stronger support for the claim of net error reduction. While the reported gains on rare and unseen predicates across multiple benchmarks provide indirect evidence that the combination of lattice similarity, visual-language compatibility, and graph context effectively recovers true positives, we acknowledge the value of explicit verification. In the revised manuscript, we will add an analysis on a manually verified subset of inferred relations, reporting precision to demonstrate that the lattice inference yields more true positives than false positives. revision: yes
- Referee: [§4] §4 (Experiments): Improvements on rare and unseen predicates are reported across benchmarks, but the results lack ablations that isolate the lattice inference component from the positive-unlabeled objective and decoding strategy. Without these controls it is unclear whether the lattice itself drives the claimed recovery of missing relations.
Authors: We thank the referee for highlighting the need for clearer component isolation. The current results show overall improvements from the full framework, but we agree that targeted ablations would better attribute the gains to the lattice inference. In the revision, we will include additional ablation experiments that disable the lattice-based inference (while keeping the positive-unlabeled objective and lattice-guided decoding active) to isolate and quantify its specific contribution to recovering missing relations on rare and unseen predicates. revision: yes
Circularity Check
No significant circularity in derivation chain
The paper introduces ReLIC-SGG by constructing a semantic relation lattice to model entailment and contradiction among open-vocabulary predicates, then infers latent positives via visual-language compatibility, graph context, and a positive-unlabeled learning objective. No equations, derivations, or self-citations are shown that reduce the claimed improvements in rare/unseen predicate recognition or missing-relation recovery to quantities defined by the method's own fitted inputs or prior self-referential results. The framework relies on independently motivated components evaluated against external benchmarks, rendering the central claims self-contained rather than circular by construction.
Axiom & Free-Parameter Ledger
axioms (1)
- domain assumption: Unannotated object-pair relations can be treated as latent variables whose positive status is inferable from visual-language compatibility and lattice semantics.
invented entities (1)
- semantic relation lattice: no independent evidence
Lean theorems connected to this paper
- IndisputableMonolith/Foundation/AbsoluteFloorClosure.lean, IndisputableMonolith/Cost/FunctionalEquation.lean: reality_from_one_distinction, washburn_uniqueness_aczel (tag: unclear)
  Relation between the paper passage and the cited Recognition theorem: unclear.
  Paper passage: "builds a semantic relation lattice to model similarity, entailment, and contradiction among open-vocabulary predicates, and uses it to infer missing positive relations from visual-language compatibility, graph context, and semantic consistency"
What do these tags mean?
- matches: The paper's claim is directly supported by a theorem in the formal canon.
- supports: The theorem supports part of the paper's argument, but the paper may add assumptions or extra steps.
- extends: The paper goes beyond the formal theorem; the theorem is a base layer rather than the whole result.
- uses: The paper appears to rely on the theorem as machinery.
- contradicts: The paper's claim conflicts with a theorem or certificate in the canon.
- unclear: Pith found a possible connection, but the passage is too broad, indirect, or ambiguous to say the theorem truly supports the claim.