pith. machine review for the scientific record.

arxiv: 1602.00753 · v1 · submitted 2016-02-02 · 💻 cs.AI · cs.CV

Recognition: unknown

Are Elephants Bigger than Butterflies? Reasoning about Sizes of Objects

keywords size, information, sizes, visual, human, method, objects, reasoning

Human vision greatly benefits from the information about sizes of objects. The role of size in several visual reasoning tasks has been thoroughly explored in human perception and cognition. However, the impact of the information about sizes of objects is yet to be determined in AI. We postulate that this is mainly attributed to the lack of a comprehensive repository of size information. In this paper, we introduce a method to automatically infer object sizes, leveraging visual and textual information from the web. By maximizing the joint likelihood of textual and visual observations, our method learns reliable relative size estimates, with no explicit human supervision. We introduce the relative size dataset and show that our method outperforms competitive textual and visual baselines in reasoning about size comparisons.

This paper has not been read by Pith yet.

Forward citations

Cited by 1 Pith paper

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Language Is Not All You Need: Aligning Perception with Language Models

    cs.CL 2023-02 conditional novelty 7.0

    Kosmos-1 shows strong zero-shot and few-shot results on language tasks, image captioning, visual QA, OCR-free document understanding, and image recognition guided by text instructions.