
arxiv: 2606.26863 · v1 · pith:JQT5DJRNnew · submitted 2026-06-25 · 💻 cs.CV

Rolling Shutter Relative Pose Estimation Made Practical

Pith reviewed 2026-06-26 05:38 UTC · model grok-4.3

classification 💻 cs.CV
keywords rolling shutter · relative pose estimation · affine correspondences · algebraic solver · two-view geometry · RANSAC · computer vision

The pith

Rolling shutter relative pose estimation becomes practical with a solver that needs only seven affine correspondences.

A machine-rendered reading of the paper's core claim, the machinery that carries it, and where it could break.

Rolling shutter cameras dominate consumer devices yet relative pose solvers have stayed impractical because they require at least twenty point matches, inflating the cost of robust estimation. The paper derives new RS-corrected affine constraints that incorporate the coupling between point perturbations and the row-dependent essential matrix, supplying two extra equations per correspondence. These constraints support a linearized algebraic solver that recovers pose and rolling-shutter motion parameters from seven affine correspondences by exploiting the small physical size of the rolling-shutter effects. The solver projects out twelve rolling-shutter unknowns via the null space and solves the resulting degree-twenty system with action matrices. If the approach holds, robust rolling-shutter pose estimation becomes fast enough for everyday use while also recovering translational velocity that point-only methods cannot reliably obtain.
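
To make the null-space step concrete, here is a minimal Python sketch of eliminating unknowns that enter a system linearly. It assumes a system of the form A r + b = 0 in which the matrix A multiplying the 12 rolling-shutter unknowns r is known numerically. The names A, b, r and the count of 21 equations (7 correspondences times 3 equations each) are illustrative assumptions, not the paper's actual formulation, and the data are random placeholders.

```python
import numpy as np

def left_null_space(A, tol=1e-10):
    """Return an orthonormal basis N (as columns) such that N.T @ A is zero."""
    U, s, _ = np.linalg.svd(A, full_matrices=True)
    rank = int(np.sum(s > tol * s[0]))
    return U[:, rank:]

rng = np.random.default_rng(0)
m, k = 21, 12                      # assumed: 7 ACs x 3 equations, 12 RS unknowns
A = rng.standard_normal((m, k))    # placeholder coefficients of the RS unknowns
b = rng.standard_normal(m)         # placeholder for the part that does not involve r

N = left_null_space(A)             # shape (21, 9) when A has full column rank

# Projecting the equations with N.T removes every term that multiplies r,
# so the projected residual is the same for any value of r.
r1 = rng.standard_normal(k)
r2 = rng.standard_normal(k)
proj1 = N.T @ (A @ r1 + b)
proj2 = N.T @ (A @ r2 + b)
print(N.shape, np.allclose(proj1, proj2))
```

In this toy setting the m - k projected equations depend only on b, which is where the remaining (pose) unknowns would live in a real solver.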

Core claim

We derive RS-corrected affine constraints that account for the coupling between point perturbations and the row-dependent essential matrix, providing two equations per correspondence beyond the standard epipolar constraint. Building on these constraints, we develop a linearized algebraic solver that estimates pose and RS motion from only 7 ACs. The solver exploits the physical smallness of RS parameters to linearize the constraints, eliminates the 12 RS unknowns via null-space projection, and solves the remaining degree-20 system via action matrices in 1.2 ms. On the TUM RS benchmark the method achieves the best pose and RS parameter accuracy among tested solvers and supplies accurate transl

What carries the argument

RS-corrected affine constraints that couple point perturbations with the row-dependent essential matrix and supply two additional equations per correspondence.

If this is right

  • The solver achieves the best pose and RS parameter accuracy among all tested methods on the TUM RS benchmark.
  • Unlike the other RS solvers tested, it recovers accurate translational velocity, a quantity that is poorly conditioned when estimated from point correspondences alone.
  • Accuracy remains comparable to the standard five-point algorithm when the same solver is applied to global-shutter data on EuRoC MAV.
  • Each solve completes in 1.2 milliseconds, making RANSAC-based robust estimation feasible.
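
The link between sample size and RANSAC cost can be illustrated with the standard iteration-count formula, N = log(1 - p) / log(1 - w^s), where s is the sample size, w the inlier ratio and p the desired confidence. The sketch below is a generic calculation, not code from the paper; the inlier ratios and the 0.99 confidence are arbitrary choices for illustration.

```python
import math

def ransac_iterations(sample_size, inlier_ratio, confidence=0.99):
    """Samples needed so that, with probability `confidence`,
    at least one drawn sample contains only inliers."""
    p_good = inlier_ratio ** sample_size
    if p_good <= 0.0:
        return math.inf
    if p_good >= 1.0:
        return 1
    # log1p keeps precision when p_good is tiny (large sample sizes).
    return math.ceil(math.log(1.0 - confidence) / math.log1p(-p_good))

for w in (0.8, 0.5):
    n7 = ransac_iterations(7, w)     # seven affine correspondences
    n20 = ransac_iterations(20, w)   # twenty point correspondences
    print(f"inlier ratio {w}: s=7 -> {n7} iterations, s=20 -> {n20} iterations")
```

The per-iteration solver time then multiplies these counts, which is why both the sample size and the 1.2 ms figure matter for practicality.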

Where Pith is reading between the lines

These are editorial extensions of the paper, not claims the author makes directly.

  • The linearization step may extend to other small-parameter camera models such as those with mild radial distortion.
  • Accurate per-frame velocity estimates could improve motion prediction inside rolling-shutter video pipelines or visual odometry.
  • Lower sample size in RANSAC could allow rolling-shutter geometry to be used inside real-time mobile SLAM systems.

Load-bearing premise

The physical size of rolling-shutter parameters is small enough that linearizing the constraints around them loses negligible accuracy.
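
One way to get a feel for this premise is to look at how fast a first-order approximation degrades as the motion grows. The sketch below compares an exact rotation (Rodrigues' formula) with the linearization R ≈ I + [ω]× and reports the error. It is a generic small-angle check; the paper's actual parameterization and operating range are not given in this summary, so the angles used are arbitrary assumptions.

```python
import numpy as np

def skew(w):
    """Cross-product matrix [w]x of a 3-vector."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rodrigues(w):
    """Exact rotation matrix for the axis-angle vector w."""
    theta = np.linalg.norm(w)
    K = skew(w)
    if theta < 1e-12:
        return np.eye(3) + K
    return (np.eye(3)
            + (np.sin(theta) / theta) * K
            + ((1.0 - np.cos(theta)) / theta**2) * (K @ K))

def linearization_error(theta, axis=(0.0, 0.0, 1.0)):
    """Frobenius norm of R_exact - (I + [w]x) for a rotation by theta radians."""
    a = np.asarray(axis, dtype=float)
    w = theta * a / np.linalg.norm(a)
    return np.linalg.norm(rodrigues(w) - (np.eye(3) + skew(w)))

for theta in (0.005, 0.01, 0.02, 0.05):
    print(f"theta = {theta:.3f} rad, linearization error = {linearization_error(theta):.3e}")
```

Doubling the angle roughly quadruples the error, which is the quadratic remainder that a linearized solver neglects.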

What would settle it

Running the seven-affine-correspondence solver on the TUM RS benchmark and checking whether its pose, rolling-shutter parameter, and translational velocity errors are lower than those of prior rolling-shutter solvers while also matching five-point accuracy on the EuRoC global-shutter set.

Figures

Figures reproduced from arXiv: 2606.26863 by Daniel Barath.

Figure 1: Affine correspondences on a rolling shutter image pair.
Figure 2: Rolling shutter two-view geometry. A 3D point P is observed at rows τ1 and τ2 in two cameras whose poses vary during readout. Each camera is parameterized by a reference pose (R, t) and per-frame angular and translational velocities (ωk, vk), yielding 17 degrees of freedom in total. Affine correspondences in geometry estimation. An AC augments a point match with the local 2 × 2 affine transformation betwee…
Figure 3: Synthetic solver evaluation. Median errors over 500 trials for four metrics (rows: rotation, translation, ω error, v error) as a function of three noise parameters (columns: point noise, affine noise, RS magnitude α). GS-5PC is excluded from the RS parameter rows as it does not estimate RS parameters. RS-7AC achieves pose accuracy competitive with RS-44PC while using 6× fewer correspondences, and is the o…

Rolling shutter (RS) cameras equip virtually all consumer devices, yet RS-aware relative pose estimation has remained impractical: the state-of-the-art solver requires a minimum of 20 point correspondences, making RANSAC-based robust estimation prohibitively expensive due to the exponential dependence of the iteration count on the sample size. We make RS relative pose estimation practical by introducing affine correspondences (ACs) into the RS two-view geometry. We derive novel RS-corrected affine constraints that account for the coupling between point perturbations and the row-dependent essential matrix, providing two equations per correspondence beyond the standard epipolar constraint. Building on these constraints, we develop a linearized algebraic solver that estimates pose and RS motion from only 7 ACs. The solver exploits the physical smallness of RS parameters to linearize the constraints, eliminates the 12 RS unknowns via null-space projection, and solves the remaining degree-20 system via action matrices in 1.2 ms. On the TUM RS benchmark, our method achieves the best pose and RS parameter accuracy among all tested methods and, uniquely among RS solvers, provides accurate translational velocity estimates, which are poorly conditioned from point correspondences alone due to a $\vec{v}$-$\vec{t}$ coupling. On the global-shutter EuRoC MAV dataset, the solver achieves comparable accuracy to the standard 5-point algorithm, demonstrating that it generalizes well to the GS setting. Code is at https://github.com/danini/rolling_shutter_made_practical.

Editorial analysis

A structured set of objections, weighed in public.

Desk editor's note, referee report, simulated authors' rebuttal, and a circularity audit. Tearing a paper down is the easy half of reading it; the pith above is the substance, this is the friction.

Referee Report

2 major / 2 minor

Summary. The paper claims that rolling-shutter relative pose estimation can be made practical by introducing affine correspondences (ACs). It derives novel RS-corrected affine constraints that couple point perturbations to the row-dependent essential matrix, yielding two extra equations per AC beyond the epipolar constraint. These are linearized under the assumption that RS velocities are small, the 12 RS unknowns are eliminated by null-space projection, and the resulting degree-20 system is solved via action matrices from only 7 ACs in 1.2 ms. On the TUM RS benchmark the method reportedly outperforms prior RS solvers in pose and RS-parameter accuracy and uniquely recovers accurate translational velocity; on EuRoC it matches the standard 5-point algorithm.

Significance. If the linearization is valid within the operating regime of real RS sequences, the reduction from 20 to 7 correspondences would make robust RS-aware estimation computationally feasible for the first time, removing a long-standing practical barrier. The public code release and the reported ability to recover translational velocity (normally poorly conditioned from points alone) are concrete strengths that would be cited by follow-up work.

major comments (2)
  1. [§4] §4 (Linearized algebraic solver): the central linearization drops all quadratic and higher terms in the RS velocities after forming the RS-corrected affine constraints. No remainder bound, perturbation analysis, or numerical check of the neglected terms against correspondence noise is supplied, nor are the actual rotational/translational velocity magnitudes measured on the TUM sequences reported relative to the linearization threshold. Because the accuracy advantage and the translational-velocity recovery both rest on this approximation, the omission is load-bearing.
  2. [§5.2] §5.2 (TUM RS experiments): the claim that the method “achieves the best pose and RS parameter accuracy among all tested methods” is presented without an ablation that isolates the contribution of the linearization versus the use of ACs. A controlled comparison that re-linearizes a 20-point solver or that reports bias versus ground-truth velocity magnitude would be required to substantiate the superiority.
minor comments (2)
  1. [§3] The notation for the row-dependent essential matrix and the affine correction terms is introduced without an explicit summary table; a compact notation table would improve readability.
  2. [§4.3] The action-matrix construction is stated to be degree-20, but the precise monomial basis and the eigenvalue extraction step are not detailed; a short algorithmic box would help readers reproduce the 1.2 ms timing.
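
For readers unfamiliar with the action-matrix idea mentioned here, the following toy sketch shows the principle in the simplest possible case: a single polynomial in one variable. The matrix of "multiplication by x" in the quotient ring, written in the monomial basis (1, x, ..., x^(n-1)), has the polynomial's roots as eigenvalues. The paper's degree-20 system is multivariate and its monomial basis is not given in this summary, so this is an illustration of the technique only; the cubic used is an arbitrary example.

```python
import numpy as np

def action_matrix_univariate(coeffs):
    """Matrix of multiplication by x modulo p(x), in the basis (1, x, ..., x^(n-1)).

    coeffs: polynomial coefficients, highest degree first, e.g. [1, -6, 11, -6].
    """
    c = np.asarray(coeffs, dtype=float)
    c = c / c[0]                       # make the polynomial monic
    n = len(c) - 1
    M = np.zeros((n, n))
    M[1:, :-1] = np.eye(n - 1)         # x * x^j = x^(j+1) for j < n-1
    M[:, -1] = -c[:0:-1]               # x * x^(n-1) = x^n, reduced modulo p
    return M

# p(x) = (x - 1)(x - 2)(x - 3) = x^3 - 6x^2 + 11x - 6
M = action_matrix_univariate([1, -6, 11, -6])
roots = np.sort(np.linalg.eigvals(M).real)
print(roots)
```

In a multivariate solver the same eigenvalue extraction is applied to a larger matrix built from the reduced system, and the cost of that eigendecomposition is typically a large part of the runtime.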

Simulated Author's Rebuttal

2 responses · 0 unresolved

We thank the referee for the constructive feedback. We address each major comment below and indicate the planned revisions.

  1. Referee: [§4] §4 (Linearized algebraic solver): the central linearization drops all quadratic and higher terms in the RS velocities after forming the RS-corrected affine constraints. No remainder bound, perturbation analysis, or numerical check of the neglected terms against correspondence noise is supplied, nor are the actual rotational/translational velocity magnitudes measured on the TUM sequences reported relative to the linearization threshold. Because the accuracy advantage and the translational-velocity recovery both rest on this approximation, the omission is load-bearing.

    Authors: We agree that a formal perturbation analysis and remainder bound are absent from the current manuscript. The linearization is motivated by the physical smallness of RS velocities, which is supported by the method's performance on real sequences. In the revision we will add the measured rotational and translational velocity magnitudes on the TUM RS sequences together with a numerical check of the size of the neglected quadratic terms relative to typical correspondence noise. revision: yes

  2. Referee: [§5.2] §5.2 (TUM RS experiments): the claim that the method “achieves the best pose and RS parameter accuracy among all tested methods” is presented without an ablation that isolates the contribution of the linearization versus the use of ACs. A controlled comparison that re-linearizes a 20-point solver or that reports bias versus ground-truth velocity magnitude would be required to substantiate the superiority.

    Authors: The reported superiority is shown via direct comparison to existing point-based RS solvers on the benchmark. To isolate the linearization contribution we will add, in the revised manuscript, an ablation that reports bias as a function of ground-truth velocity magnitude on the TUM data and, where feasible, a controlled comparison against a re-linearized higher-point formulation. revision: yes

Circularity Check

0 steps flagged

Derivation chain is self-contained algebraic construction with no circular reductions


The paper derives novel RS-corrected affine constraints from the geometry of row-dependent essential matrices and point perturbations, then applies a standard small-parameter linearization (explicitly justified by physical smallness of RS velocities), followed by null-space elimination of the 12 RS unknowns and reduction to a degree-20 action-matrix solver. None of these steps reduce the output to fitted inputs, self-citations, or renamed known results by construction; the linearization is an explicit approximation rather than a data-driven fit, and external benchmark validation (TUM RS, EuRoC) supplies independent content. No load-bearing self-citation or self-definitional loop is present in the provided derivation.

Axiom & Free-Parameter Ledger

0 free parameters · 1 axiom · 0 invented entities

The method rests on standard algebraic-geometry primitives (null-space projection, action matrices) plus one domain assumption about small rolling-shutter motion; no new entities or fitted constants are introduced in the abstract.

axioms (1)
  • domain assumption: Physical smallness of RS parameters permits linearization of the constraints.
    Invoked to obtain the linearized system solved by action matrices.


