GenSfM: Structure-from-Motion with a Non-Parametric Camera Model


CVPR2025 (Highlight)

Yihan Wang1*     Linfei Pan2*     Marc Pollefeys2,3     Viktor Larsson4    

     1EPFL      2ETH Zurich      3Microsoft      4Lund University

*Equal Contribution




Abstract

In this paper, we present a new generic Structure-from-Motion pipeline, GenSfM, that uses a non-parametric camera projection model. The model is self-calibrated during the reconstruction process and can fit a wide variety of cameras, ranging from simple low-distortion pinhole cameras to more extreme optical systems such as fisheye or catadioptric cameras. The key component in our framework is an adaptive calibration procedure that can estimate partial calibrations, only modeling regions of the image where sufficient constraints are available. In experiments, we show that our method achieves comparable accuracy to traditional Structure-from-Motion pipelines in easy scenarios, and outperforms them in cases where they are unable to self-calibrate their parametric models. Code can be found here.




Video




Overview

The overall design of the framework follows classical incremental pipelines. Without a known calibration or a specific parametric model, we collect initial 2D-3D correspondences using the radial alignment constraint, as in Hruby et al. As images are iteratively registered to the 3D model, we progressively calibrate the camera by fitting a non-parametric distortion map initialized with an implicit distortion model.
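As a rough illustration of why the radial alignment constraint needs no calibration, the sketch below evaluates it for a single 2D-3D match. It assumes a radially symmetric camera with a known distortion center; the function names `radial_alignment_residual` and `project_radial` are hypothetical and are not taken from the GenSfM code.

```python
import numpy as np


def radial_alignment_residual(R, t, X, x, center):
    """Radial alignment residual for one 2D-3D correspondence.

    R: (3, 3) rotation, t: (3,) translation, X: (3,) world point,
    x: (2,) observed pixel, center: (2,) assumed distortion center.
    """
    Xc = R @ X + t        # 3D point in the camera frame
    u, v = x - center     # pixel relative to the distortion center
    # 2D cross product: zero when (u, v) is parallel to (Xc[0], Xc[1]).
    # Only the first two rows of [R | t] enter the residual, so the
    # forward translation and the radial profile are left unconstrained.
    return u * Xc[1] - v * Xc[0]


def project_radial(Xc, center, radius_of_theta):
    """Project a camera-frame point with an arbitrary radial profile r(theta).

    Assumes the point does not lie on the optical axis.
    """
    rho = np.hypot(Xc[0], Xc[1])
    theta = np.arctan2(rho, Xc[2])    # angle to the optical axis
    direction = Xc[:2] / rho          # unit direction in the image plane
    return center + radius_of_theta(theta) * direction
```

Projecting the same point with a pinhole profile (`r = f * tan(theta)`) or an equidistant fisheye profile (`r = f * theta`) gives a residual of zero up to floating-point error for the correct pose, which is what allows correspondences to be collected before any distortion model is chosen.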





Results


Calibration results on BabelCalib



Reconstruction results on inputs ranging from pinhole-like images to severely distorted images. In each row, from left to right, the reconstructions come from COLMAP, RadialSfM with poses upgraded and bundle-adjusted using an implicit distortion model, and our pipeline.



Undistortion results on fisheye and catadioptric images using the estimated calibration map.
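A minimal sketch of how a radial calibration map could be used for undistortion, assuming the map is available as tabulated samples of image radius against viewing angle. The tabulated form, the linear interpolation, and the name `undistort_points` are assumptions made for illustration and may differ from the representation used in GenSfM.

```python
import numpy as np


def undistort_points(pts, center, radii, thetas, f_out):
    """Map distorted pixels to an ideal pinhole image with focal length f_out.

    pts: (N, 2) distorted pixels, center: (2,) distortion center,
    radii, thetas: increasing samples of the map r -> theta (hypothetical
    tabulated form), f_out: focal length of the output pinhole image.
    Returns the undistorted points and a validity mask.
    """
    pts = np.asarray(pts, dtype=float)
    d = pts - center
    r = np.linalg.norm(d, axis=1)
    theta = np.interp(r, radii, thetas)   # viewing angle for each pixel radius
    # Only radii covered by the (possibly partial) calibration are valid,
    # and a pinhole image cannot represent angles of 90 degrees or more.
    valid = (r >= radii[0]) & (r <= radii[-1]) & (theta < np.pi / 2)
    r_new = f_out * np.tan(theta)         # pinhole radius for the same angle
    scale = np.divide(r_new, r, out=np.ones_like(r), where=r > 0)
    return center + d * scale[:, None], valid
```

The validity mask reflects the idea of a partial calibration: pixels whose radius falls outside the calibrated range are flagged rather than extrapolated.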







Challenging Reconstructions