## M3DW – Multiview 3D Warps

• A. Del Bue and A. Bartoli, "Multiview 3D Warps," in 13th International Conference on Computer Vision (ICCV 2011), Barcelona, Spain, 2011.
@inproceedings{DelBue:Bartoli:2011,   author = {A. {Del Bue} and A. Bartoli},   title = {Multiview 3D Warps},   booktitle = {13th International Conference on Computer Vision (ICCV 2011), Barcelona, Spain},   year = {2011},   month = {November} }
 Paper (PDF) Supplemental Material (PDF) Video (WMV)

### Introduction

M3DW is a dense 3D modelling approach using single images that bridges classical image warping methods and the Non-rigid Structure from Motion (NRSfM) framework. Multiview 3D warps combine the advantages of both: they have an explicit 3D component and a set of 3D deformations combined with a projection to 2D. They thus capture both the time-varying shape of the dense deforming body and the camera pose. This brings several advantages over the classical solutions: thanks to our feature-based estimation method for the multiview 3D warps, one can not only augment the original images but also retarget or clone the observed body’s 3D deformations by changing the pose.

### Method Description

Our algorithm takes as input a set of 2D image trajectories extracted from a video sequence showing a deforming body:

Video sequence with 2D image point tracks

The 2D image trajectories may also contain missing data. In general, 2D points are easily occluded when the object undergoes strong rigid motion and deformations.
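As a minimal sketch of how such tracks could be arranged, the snippet below stacks them into a measurement matrix with one block of two columns per frame, together with a binary visibility mask (the role played by the matrix $D$ in the cost function of the estimation section). The function name, the `(f, n, 2)` input layout and the use of NaN to mark missing observations are assumptions made for illustration, not part of the released method.

```python
import numpy as np

def build_measurements(tracks):
    """Stack tracks of shape (f, n, 2) into Q (n, 2f) and a visibility mask D (n, 2f).

    Missing observations are marked with NaN in `tracks` (an assumed convention).
    """
    f, n, _ = tracks.shape
    visible = ~np.isnan(tracks).any(axis=2)                  # (f, n), True where tracked
    # Per-frame n x 2 blocks placed side by side: Q = [Q_1 | ... | Q_f]
    Q = np.nan_to_num(tracks).transpose(1, 0, 2).reshape(n, 2 * f)
    # Each visibility flag covers both the x and the y column of its frame
    D = np.repeat(visible.T, 2, axis=1).astype(float)
    return Q, D

# Toy example: 2 frames, 3 points, the third point occluded in the second frame
tracks = np.arange(12, dtype=float).reshape(2, 3, 2)
tracks[1, 2] = np.nan
Q, D = build_measurements(tracks)
```

Missing entries are filled with zeros in `Q`; they carry no information because the corresponding entries of `D` are zero.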

#### M3DW initialisation

The figure shows the three stages of the M3DW initialisation.

From the point tracks we extract a mean 3D shape using a rigid SfM procedure (left figure). This gives an initial template of the deforming 3D body, so no explicit template has to be provided manually, which can be a difficult task in most real examples.
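The page does not say which rigid SfM procedure is used. As one possible illustration, the sketch below applies a rank-3 factorisation in the style of Tomasi and Kanade to complete tracks stored as an `n x 2f` matrix. It is a simplification under stated assumptions: there is no missing data, and the result is only defined up to a 3 x 3 affine ambiguity, because the metric upgrade that enforces orthographic cameras is omitted. The function name is hypothetical.

```python
import numpy as np

def affine_rigid_factorisation(Q):
    """Rank-3 factorisation of complete tracks Q (n, 2f).

    Returns a mean shape S (n, 3) and stacked cameras M (3, 2f) such that
    S @ M approximates the centred tracks. No metric upgrade is applied.
    """
    # Removing the centroid of each column cancels the per-frame 2D translation
    Qc = Q - Q.mean(axis=0, keepdims=True)
    U, s, Vt = np.linalg.svd(Qc, full_matrices=False)
    root = np.sqrt(s[:3])
    S = U[:, :3] * root              # shape basis, scaled by sqrt of the singular values
    M = root[:, None] * Vt[:3]       # camera rows, scaled the same way
    return S, M
```

For a truly rigid scene the product `S @ M` reproduces the centred tracks exactly; for a deforming body it gives the best rank-3 approximation, which is what makes it usable as a mean-shape template.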

We then place a set of control points around the 3D shape (black dots) and build the 3D warping function directly in the 3D metric space (central figure). This lets us impose multiview relations between the 3D warp and the image correspondences in the video. In contrast, classical image warps are in general restricted to pairwise image relations.

It is then possible to augment the 3D shape with a dense surface obtained by simple interpolation (right figure). When the dense mesh is reprojected into the image plane (green dots), it models the image deformation given the learned 3D warps.
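To make the role of the control points concrete, here is a small sketch of a control-point-driven 3D warp. It assumes a thin-plate-spline-style radial basis warp with kernel $\rho(r) = -r$, an affine term and a regularisation weight `lam`; the exact definitions used in the paper may differ. All function names and the toy data are hypothetical. Once the control-point positions for a frame are known, the same warp moves both the sparse template and any dense set of points.

```python
import numpy as np

def kernel(X, C):
    """Radial kernel rho(r) = -r between X (n, 3) and C (k, 3); returns (n, k)."""
    return -np.linalg.norm(X[:, None, :] - C[None, :, :], axis=2)

def lifted(X, C):
    """Rows [rho(|x - c_1|), ..., rho(|x - c_k|), x^T, 1]; returns (n, k + 4)."""
    return np.hstack([kernel(X, C), X, np.ones((len(X), 1))])

def coefficient_map(C, lam=1e-3):
    """Matrix (k + 4, k) mapping control-point targets to warp coefficients."""
    k = len(C)
    Ct = np.hstack([C, np.ones((k, 1))])              # control points in homogeneous form
    A = np.zeros((k + 4, k + 4))
    A[:k, :k] = kernel(C, C) + lam * np.eye(k)        # regularised kernel block
    A[:k, k:] = Ct
    A[k:, :k] = Ct.T                                  # side conditions on the radial weights
    return np.linalg.inv(A)[:, :k]

# Toy data: 8 control points on a cube around a sparse template
rng = np.random.default_rng(0)
centres = np.array([[x, y, z] for x in (-1.0, 1.0)
                    for y in (-1.0, 1.0) for z in (-1.0, 1.0)])
template = rng.uniform(-0.5, 0.5, size=(20, 3))       # sparse template points
dense = rng.uniform(-0.5, 0.5, size=(500, 3))         # stand-in for an interpolated surface

E_lam = coefficient_map(centres)
P_i = centres + 0.1 * rng.standard_normal(centres.shape)   # displaced control points

sparse_deformed = lifted(template, centres) @ E_lam @ P_i
dense_deformed = lifted(dense, centres) @ E_lam @ P_i
```

With this construction, leaving the control points where they are (`P_i = centres`) reproduces the input points, so the warp deviates from the identity only through the control-point displacements.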

The resulting deforming dense mesh

#### M3DW estimation

The estimation of the M3DW given a set of 2D image trajectories is formulated as an optimisation problem with non-linear constraints given by the camera projection into the image plane. Details can be found in the paper (section 4); in the following we provide a visualisation of the main steps. Given the 3D warp function and the multiview relations, we can define the M3DW model as:

$\underbrace{\left[ \begin{array}{c|c|c}Q_1&\ldots&Q_f\end{array} \right]}_{Q} = L E_{\lambda}\underbrace{\left[ \begin{array}{c|c|c}P_{1}&\ldots&P_f\end{array} \right]}_{P}\underbrace{\left[ \begin{array}{ccc}M_{1} &\ldots&0\\\vdots&\ddots &\vdots\\0& \ldots&M_{f}\end{array} \right]}_{M}$

where:

• The matrix $Q$ contains the 2D image points, partitioned per frame as $Q = \left[ \begin{array}{c|c|c}Q_1&\ldots&Q_f \end{array} \right]$ ;
• The matrices $L$ and $E_{\lambda}$ define the 3D warp given the initialised 3D template;
• The matrix $P$ contains the 3D control-point displacements, which model the deformations of the object shape. The control-point positions at each frame are given by $\left[ \begin{array}{c|c|c}P_{1}&\ldots&P_f \end{array} \right]$ ;
• The block-diagonal matrix $M$ maps the 3D motion of the control points (and thus the deformations) into the video sequence.
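The block structure of the model can be checked numerically. The sketch below uses random stand-ins for every matrix and hypothetical sizes (`n` image points, `k` control points, `f` frames), with the product $L E_{\lambda}$ collapsed into a single matrix `LE`. The dimensions are an assumption consistent with the equation above: each $P_i$ is `k x 3`, each $M_i$ is `3 x 2`, and each $Q_i$ is `n x 2`.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k, f = 20, 8, 5                        # hypothetical sizes: points, control points, frames

LE = rng.standard_normal((n, k))          # stand-in for the product L @ E_lambda
P_blocks = [rng.standard_normal((k, 3)) for _ in range(f)]

def random_orthographic(rng):
    """A 3 x 2 matrix with orthonormal columns, i.e. M_i^T M_i = I_2."""
    R, _ = np.linalg.qr(rng.standard_normal((3, 3)))
    return R[:, :2]

M_blocks = [random_orthographic(rng) for _ in range(f)]

P = np.hstack(P_blocks)                   # (k, 3f): control points for all frames
M = np.zeros((3 * f, 2 * f))              # block-diagonal camera matrix
for i, M_i in enumerate(M_blocks):
    M[3 * i:3 * i + 3, 2 * i:2 * i + 2] = M_i

Q = LE @ P @ M                            # (n, 2f): predicted 2D tracks for all frames
```

Because $M$ is block-diagonal, the two columns of `Q` belonging to frame `i` depend only on `P_blocks[i]` and `M_blocks[i]`, so the model decouples frame by frame once the warp is fixed.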

Modelling the image deformations amounts to estimating the control-point displacements and the camera projection matrices (a simple orthographic camera model is used here). The associated cost function is:

$\min_{ P, M} \left\| D \odot ( L E_{\lambda} P M - Q) \right\|^2 \:\:\:\: \text{subject to} \:\:\:\: M_i^\top M_i = I_2$

where the equality constraints $M_i^\top M_i = I_2$ refer to an orthographic camera model at each frame $i$. The matrix $D$ masks the missing data in the 2D image trajectories. The solution is given by an Augmented Lagrangian Multipliers (ALM) method for matrix factorisations called BALM.
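The snippet below is not BALM itself. It only illustrates two ingredients of the problem: evaluating the masked cost, and mapping an arbitrary 3 x 2 matrix onto the constraint set $M_i^\top M_i = I_2$. The projection uses the standard SVD result for the closest matrix with orthonormal columns in the Frobenius norm. The norm in the cost is assumed to be the Frobenius norm, `LE` stands for the product $L E_{\lambda}$, and the function names are hypothetical.

```python
import numpy as np

def masked_cost(D, LE, P, M, Q):
    """Squared Frobenius norm of the masked residual D * (LE @ P @ M - Q)."""
    residual = D * (LE @ P @ M - Q)       # element-wise mask removes missing entries
    return float((residual ** 2).sum())

def project_orthographic(A):
    """Closest 3 x 2 matrix with orthonormal columns to A (Frobenius norm)."""
    U, _, Vt = np.linalg.svd(A, full_matrices=False)
    return U @ Vt                         # drop the singular values, keep the rotations
```

A simple way to use the two pieces together is to check that an estimate stays feasible: after any unconstrained update of a camera block, `project_orthographic` returns the nearest block that satisfies the constraint, and `masked_cost` measures how much the fit to the visible tracks changed.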

#### M3DW applications

Shape augmentation: Given the sparse template obtained by rigid SfM, we augment the shape by surface interpolation, obtaining a dense 3D mesh. Given the 3D mesh and the learned warping functions, it is possible to map the dense deformation field into the video sequence.

The augmentation starting from a 3D sparse mesh and ending with 2D dense image deformations

The figure on the right shows the reprojection of the deforming mesh into the image plane. Notice how the surface bending accurately describes the real image motion.

Deformation cloning: The learned deformation field is independent of the imaging conditions (i.e. the camera pose) and the shape motion. Thus the learned dense warp in the metric space can be easily reused to augment a new sequence or to transform the existing one.

The blue surface shows the original deformation estimated from a video sequence. The smaller red mesh is cloned from it, then rescaled by a factor of two and rotated.
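Since the deformations live in 3D metric space, cloning amounts to applying a similarity transform to every deformed mesh. The sketch below is an illustration under assumptions: the function names are hypothetical, the rotation is taken about the z axis, and the rescaling is read as a scale of 0.5 because the cloned mesh is described as smaller.

```python
import numpy as np

def rotation_z(angle):
    """Rotation matrix for a rotation of `angle` radians about the z axis."""
    c, s = np.cos(angle), np.sin(angle)
    return np.array([[c, -s, 0.0],
                     [s, c, 0.0],
                     [0.0, 0.0, 1.0]])

def clone_deformation(shapes, scale, R, t):
    """Apply x -> scale * R @ x + t to every vertex of every frame.

    shapes: (f, n, 3) deformed meshes, one per frame.
    """
    return scale * shapes @ R.T + t

# Toy example: 3 frames of a 10-vertex mesh, halved in size, rotated and shifted
rng = np.random.default_rng(0)
shapes = rng.standard_normal((3, 10, 3))
clone = clone_deformation(shapes, 0.5, rotation_z(np.pi / 4), np.array([1.0, 0.0, 0.0]))
```

The cloned sequence keeps the same deformation over time and can be reprojected with any new 3 x 2 orthographic camera, for example `clone[i] @ M_new`.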

Image retexturing: The video shows the retexturing of the paper image sequence, where a synthetic texture is added to the bending paper. The augmentation is made by first projecting the dense mesh obtained in the shape augmentation step and then retargeting the texture, here the ICCV 2011 logo, onto the video.
