Learning Graphical Models of Images, Videos and Their Spatial Transformations

2 Pith papers cite this work. Polarity classification is still indexing.

abstract

Mixtures of Gaussians, factor analyzers (probabilistic PCA) and hidden Markov models are staples of static and dynamic data modeling and image and video modeling in particular. We show how topographic transformations in the input, such as translation and shearing in images, can be accounted for in these models by including a discrete transformation variable. The resulting models perform clustering, dimensionality reduction and time-series analysis in a way that is invariant to transformations in the input. Using the EM algorithm, these transformation-invariant models can be fit to static data and time series. We give results on filtering microscopy images, face and facial pose clustering, handwritten digit modeling and recognition, video clustering, object tracking, and removal of distractions from video sequences.
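The central idea of the abstract, a mixture of Gaussians extended with a discrete transformation variable and fit by EM, can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes 1-D signals, circular shifts as the only transformations, a uniform prior over shifts, and isotropic noise with a fixed standard deviation; the function name `fit_shift_invariant_mixture` and all parameter names are hypothetical.

```python
import numpy as np

def fit_shift_invariant_mixture(X, n_clusters, n_iters=30, sigma=0.5, seed=0):
    """EM for a Gaussian mixture with a discrete circular-shift variable.

    Simplified model (an assumption of this sketch, not the paper's full model):
    x = roll(mu_c, t) + isotropic noise, with a uniform prior over shifts t
    and a fixed noise standard deviation `sigma`.
    """
    rng = np.random.default_rng(seed)
    n, d = X.shape
    # Initialise cluster means from randomly chosen data points.
    mu = X[rng.choice(n, n_clusters, replace=False)].copy()
    pi = np.full(n_clusters, 1.0 / n_clusters)
    for _ in range(n_iters):
        # shifted[c, t] is the mean of cluster c rolled by t positions.
        shifted = np.stack([np.roll(mu, t, axis=1) for t in range(d)], axis=1)
        # E-step: joint posterior over (cluster, shift) for every sample.
        diff = X[:, None, None, :] - shifted[None]            # (n, C, T, d)
        log_r = -0.5 * (diff ** 2).sum(-1) / sigma ** 2
        log_r += np.log(pi + 1e-12)[None, :, None]
        log_r -= log_r.max(axis=(1, 2), keepdims=True)        # numerical stability
        r = np.exp(log_r)
        r /= r.sum(axis=(1, 2), keepdims=True)
        # M-step: undo each candidate shift, then average with posterior weights.
        unshifted = np.stack([np.roll(X, -t, axis=1) for t in range(d)], axis=1)
        weights = r.sum(axis=(0, 2))                          # (C,)
        mu = np.einsum("nct,ntd->cd", r, unshifted) / (weights[:, None] + 1e-12)
        pi = weights / n
    return mu, pi, r

# Toy data: two prototypes, each observed under random circular shifts.
demo_rng = np.random.default_rng(0)
protos = np.array([[0., 0., 3., 1., 0., 0., 0., 0.],
                   [2., 0., 0., 0., 2., 1., 0., 0.]])
labels = demo_rng.integers(2, size=40)
X_demo = np.stack([np.roll(protos[k], demo_rng.integers(8)) for k in labels])
X_demo = X_demo + 0.01 * demo_rng.standard_normal(X_demo.shape)
mu_demo, pi_demo, r_demo = fit_shift_invariant_mixture(X_demo, n_clusters=2)
# Hard cluster assignment with the shift variable marginalised out.
assignments = r_demo.sum(axis=2).argmax(axis=1)
```

Summing the posterior over the shift axis gives cluster responsibilities that do not depend on how each sample was shifted, which is the sense in which the clustering is transformation-invariant. Like any EM fit, the result depends on initialisation and may settle in a local optimum.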

fields: cs.CV (2) · years: 2026 (2) · verdicts: unverdicted (2)

representative citing papers

Perceptual 3D Simulation With Physical World Modeling

cs.CV · 2026-06-25 · unverdicted · novelty 5.0

P3Sim integrates a probabilistic physical world model with geometric conditioning and persistent memory to simulate 3D scenes under partial observations and incomplete transforms.

citing papers explorer

  • Perceptual 3D Simulation With Physical World Modeling · cs.CV · 2026-06-25 · unverdicted · none · ref 6 · internal anchor

    P3Sim integrates a probabilistic physical world model with geometric conditioning and persistent memory to simulate 3D scenes under partial observations and incomplete transforms.

  • Physical Object Understanding with a Physically Controllable World Model · cs.CV · 2026-05-30 · unverdicted · none · ref 12 · internal anchor

    Autoregressive probabilistic world models trained on raw videos yield emergent object segmentation, 3D controllability, and physical relationship inference via multi-future motion correlation analysis.