pith. machine review for the scientific record.

arxiv: 1905.06326 · v3 · submitted 2019-05-15 · 💻 cs.CV · cs.GR

Recognition: unknown

Synthetic Defocus and Look-Ahead Autofocus for Casual Videography

Authors on Pith: no claims yet
classification: 💻 cs.CV · cs.GR
keywords: focus · video · autofocus · casual · cinema · create · dataset · deliver
0 comments
Original abstract

In cinema, large camera lenses create beautiful shallow depth of field (DOF), but make focusing difficult and expensive. Accurate cinema focus usually relies on a script and a person to control focus in realtime. Casual videographers often crave cinematic focus, but fail to achieve it. We either sacrifice shallow DOF, as in smartphone videos; or we struggle to deliver accurate focus, as in videos from larger cameras. This paper is about a new approach in the pursuit of cinematic focus for casual videography. We present a system that synthetically renders refocusable video from a deep DOF video shot with a smartphone, and analyzes future video frames to deliver context-aware autofocus for the current frame. To create refocusable video, we extend recent machine learning methods designed for still photography, contributing a new dataset for machine training, a rendering model better suited to cinema focus, and a filtering solution for temporal coherence. To choose focus accurately for each frame, we demonstrate autofocus that looks at upcoming video frames and applies AI-assist modules such as motion, face, audio and saliency detection. We also show that autofocus benefits from machine learning and a large-scale video dataset with focus annotation, where we use our RVR-LAAF GUI to create this sizable dataset efficiently. We deliver, for example, a shallow DOF video where the autofocus transitions onto each person before she begins to speak. This is impossible for conventional camera autofocus because it would require seeing into the future.
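The look-ahead autofocus idea in the abstract — inspecting future frames so focus can land on a subject before she speaks — can be sketched as a simple focus planner. This is a minimal illustration, not the paper's implementation: the event and depth inputs here are hypothetical stand-ins for what the paper derives from its AI-assist modules (motion, face, audio, and saliency detection) and rendered depth.

```python
# Hedged sketch of look-ahead autofocus: for each upcoming event (e.g., a
# face about to speak), start a smooth focus rack early enough that focus
# arrives at the subject's depth before the event begins. Inputs are
# hypothetical; the paper computes them from AI-assist detection modules.

def plan_focus(depths, events, lookahead=30, ramp=10):
    """depths[t]  : depth of the default subject at frame t
       events     : {onset_frame: target_depth} for upcoming points of
                    interest (e.g., a speaker at her speech onset)
       lookahead  : how many future frames the planner may inspect
       ramp       : frames over which focus racks to the new target
       returns    : one focus distance per frame"""
    n = len(depths)
    plan = list(depths)                       # default: track current subject
    for onset, target in sorted(events.items()):
        start = max(0, onset - ramp)          # begin the rack early...
        if onset - start > lookahead:
            continue                          # ...but only within the window
        for t in range(start, min(onset, n)):
            a = (t - start + 1) / (onset - start)   # linear rack, 0 -> 1
            plan[t] = plan[t] * (1 - a) + target * a
        for t in range(onset, n):             # hold focus on the new subject
            plan[t] = target
    return plan
```

Because the planner sees `ramp` frames into the future, focus reaches the speaker's depth by the frame before the speech onset — the behavior the abstract notes is impossible for a conventional camera, which cannot see ahead.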

This paper has not been read by Pith yet.

discussion (0)

Sign in with ORCID, Apple, or X to comment. Anyone can read Pith papers without signing in.

Forward citations

Cited by 2 Pith papers

Reviewed papers in the Pith corpus that reference this work. Sorted by Pith novelty score.

  1. Towards Photorealistic and Efficient Bokeh Rendering via Diffusion Framework

    cs.CV 2026-05 unverdicted novelty 6.0

MagicBokeh is a diffusion framework using focus-aware masked attention and a degradation-aware depth module to jointly optimize bokeh rendering and super-resolution for efficient photorealistic results on low-resolution...

  2. Towards Photorealistic and Efficient Bokeh Rendering via Diffusion Framework

    cs.CV 2026-05 unverdicted novelty 6.0

    MagicBokeh uses a single diffusion model with alternative training, focus-aware masked attention, and degradation-aware depth estimation to produce photorealistic bokeh on low-res zoomed images.