In [0]:
#@title Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

Camera calibration


Cameras are complex pieces of hardware that capture 2D images of 3D objects. To model a real camera precisely, one needs to estimate many parameters, including lens distortion, ISO, focal length, and exposure time. In the following, we restrict our focus to the projective camera model.

This model is composed of two parameters that are often referred to as intrinsic parameters:

  • the principal point, which is the projection of the optical center onto the image plane. Ideally, the principal point is close to the center of the image.
  • the focal length, which is the distance between the optical center and the image plane. This parameter controls the level of zoom; see the projection sketch below.
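To make these two parameters concrete, here is a minimal sketch of the pinhole projection they define. The function name and sample values are illustrative additions, not part of the original notebook; TensorFlow Graphics implements the same operation as perspective.project, which is used later in this notebook.


In [0]:
import numpy as np

def project_point(point_3d, focal, principal_point):
  """Pinhole projection of a 3D point (camera coordinates) to pixels.

  Illustrative sketch, not the library API. Assumes point_3d = (x, y, z)
  with z > 0, focal = (fx, fy), and principal_point = (cx, cy).
  """
  x, y, z = point_3d
  fx, fy = focal
  cx, cy = principal_point
  return np.array((fx * x / z + cx, fy * y / z + cy))

# A point 1000 units in front of the camera projects close to the principal
# point; the focal length scales its offset from the principal point.
print(project_point((10.0, 5.0, 1000.0), (300.0, 300.0), (200.0, 150.0)))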

This notebook illustrates how to use TensorFlow Graphics to estimate the intrinsic parameters of a projective camera. Recovering these parameters is particularly important for several tasks, including 3D reconstruction.

In this Colab, the goal is to recover the intrinsic parameters of a camera given an observation and correspondences between the observation and a render of the current solution. Things are kept simple by inserting only a rectangle in the 3D scene and using it as the source of correspondences during the optimization. The minimization is performed using the Levenberg-Marquardt algorithm.

Setup & Imports

If TensorFlow Graphics is not installed on your system, the following cell installs the TensorFlow Graphics package for you.


In [0]:
!pip install tensorflow_graphics

Now that TensorFlow Graphics is installed, let's import everything needed to run the demo contained in this notebook.


In [0]:
import tensorflow as tf
import matplotlib.pyplot as plt
import numpy as np

#################################
# Imports the necessary modules #
#################################

from tensorflow_graphics.math.optimizer import levenberg_marquardt
from tensorflow_graphics.rendering.camera import perspective
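
Note: the code below calls .numpy() on TensorFlow tensors, so it assumes eager execution, which is the default in TensorFlow 2.x. Under TensorFlow 1.x you would need to enable it first, e.g. with tf.compat.v1.enable_eager_execution().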

Understanding the perspective camera model

To illustrate how this model works, we will assume that there is nothing in the scene in front of the camera but a rectangle. Let's first define a function that renders this rectangle as observed by our camera.


In [0]:
def render_rectangle(rectangle_vertices, focal, principal_point, image_dimensions):
  """Renders a rectangle on the image plane.

  Args:
    rectangle_vertices: the two opposite 3D corners (top-left and
      bottom-right) of a rectangle.
    focal: the focal lengths of a projective camera.
    principal_point: the position of the principal point on the image plane.
    image_dimensions: the dimensions (in pixels) of the image.

  Returns:
    A 2D image of the 3D rectangle.
  """
  image = np.zeros((int(image_dimensions[0]), int(image_dimensions[1]), 3))
  vertices_2d = perspective.project(rectangle_vertices, focal, principal_point)
  vertices_2d_np = vertices_2d.numpy()
  top_left_corner = np.maximum(vertices_2d_np[0, :], (0, 0)).astype(int)
  bottom_right_corner = np.minimum(
      vertices_2d_np[1, :],
      (image_dimensions[1] - 1, image_dimensions[0] - 1)).astype(int)
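  # Fill the clamped rectangle with a color gradient: each pixel's color
  # encodes its position within the rectangle, which makes correspondences
  # easy to see when comparing two renders.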
  for x in range(top_left_corner[0], bottom_right_corner[0] + 1):
    for y in range(top_left_corner[1], bottom_right_corner[1] + 1):
      c1 = float(bottom_right_corner[0] + 1 -
                 x) / float(bottom_right_corner[0] + 1 - top_left_corner[0])
      c2 = float(bottom_right_corner[1] + 1 -
                 y) / float(bottom_right_corner[1] + 1 - top_left_corner[1])
      image[y, x] = (c1, c2, 1)
  return image

The following cell defines default intrinsic parameters and renders the rectangle using these parameters.


In [0]:
# Sets up the vertices of the rectangle.
rectangle_depth = 1000.0
rectangle_vertices = np.array(
    ((-150.0, -75.0, rectangle_depth), (150.0, 75.0, rectangle_depth)))

# Sets up the size of the image plane.
image_width = 400
image_height = 300
image_dimensions = np.array((image_height, image_width), dtype=np.float64)

# Sets the horizontal and vertical focal lengths to be the same. The chosen
# focal length yields a vertical field of view of about 53 degrees.
focal_lengths = np.array((image_height, image_height), dtype=np.float64)
# Sets the principal point at the image center.
ideal_principal_point = np.array(
    (image_width, image_height), dtype=np.float64) / 2.0

# Let's see what our scene looks like using the intrinsic parameters defined above.
render = render_rectangle(rectangle_vertices, focal_lengths, ideal_principal_point,
                          image_dimensions)
_ = plt.imshow(render)
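
As a quick sanity check of the field-of-view comment above (this one-liner is an addition, not part of the original notebook):


In [0]:
# Vertical field of view implied by the focal length chosen above: ~53.13.
print(2.0 * np.degrees(np.arctan2(image_height / 2.0, focal_lengths[1])))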

The focal length and the position of the optical center have very different effects on the final image. Try different configurations of the sliders below to convince yourself of the effect of each of these parameters.


In [0]:
###############
# UI controls #
###############
#@title model parameters { vertical-output: false, run: "auto" }
focal_length_x = 300  #@param { type: "slider", min: 100.0, max: 500.0, step: 1.0 }
focal_length_y = 300  #@param { type: "slider", min: 100.0, max: 500.0, step: 1.0 }
optical_center_x = 200  #@param { type: "slider", min: 0.0, max: 400.0, step: 1.0 }
optical_center_y = 150.0  #@param { type: "slider", min: 0.0, max: 300.0, step: 1.0 }

render = render_rectangle(
    rectangle_vertices,
    np.array((focal_length_x, focal_length_y), dtype=np.float64),
    np.array((optical_center_x, optical_center_y), dtype=np.float64),
    image_dimensions)
_ = plt.imshow(render)

Optimizing intrinsic parameters

Every camera (e.g. a smartphone camera) comes with its own set of intrinsic parameters. Among other applications, precise intrinsic parameters are used in 3D scene reconstruction, robotics, and navigation systems.

A common way to estimate intrinsic parameters is to use a known 3D object. Using our current estimate of the intrinsic parameters, we can predict how the known 3D object should 'look' and compare that prediction to the actual observation.
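To make this concrete, the following sketch (the variable names and the 10% error are illustrative additions) measures, in pixels, how far the projected rectangle corners move when the focal length is off. This displacement is exactly the signal the optimization below exploits.


In [0]:
# How far do the projected corners move when the focal length is 10% off?
true_focal = np.array((300.0, 300.0), dtype=np.float64)
wrong_focal = true_focal * 1.1
center = np.array((200.0, 150.0), dtype=np.float64)
corners_true = perspective.project(rectangle_vertices, true_focal, center)
corners_wrong = perspective.project(rectangle_vertices, wrong_focal, center)
print(np.linalg.norm(corners_true.numpy() - corners_wrong.numpy(), axis=-1))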

Let's start by defining a few helper functions for plotting the results.


In [0]:
def plot_optimization_step(observation, prediction):
  plt.figure(figsize=(20, 10))
  ax = plt.subplot("131")
  ax.set_title("Observation")
  _ = ax.imshow(observation)
  ax = plt.subplot("132")
  ax.set_title("Prediction using estimated intrinsics")
  _ = ax.imshow(prediction)
  ax = plt.subplot("133")
  ax.set_title("Difference image")
  _ = ax.imshow(np.abs(observation - prediction))
  plt.show()


def print_errors(focal_error, center_error):
  print("Error focal length %f" % (focal_error,))
  print("Err principal point %f" % (center_error,))

Let's now define the ground-truth values of the intrinsic parameters we are looking for, and an initial guess for these parameters.


In [0]:
def build_parameters():
  # Constructs the intrinsic parameters we wish to recover.
  real_focal_lengths = focal_lengths * np.random.uniform(0.8, 1.2, size=(2,))
  real_principal_point = ideal_principal_point + (np.random.random(2) -
                                                  0.5) * image_width / 5.0

  # Initializes the first estimate of the intrinsic parameters.
  estimate_focal_lengths = tf.Variable(real_focal_lengths +
                                       (np.random.random(2) - 0.5) *
                                       image_width)
  estimate_principal_point = tf.Variable(real_principal_point +
                                         (np.random.random(2) - 0.5) *
                                         image_width / 4)
  return real_focal_lengths, real_principal_point, estimate_focal_lengths, estimate_principal_point

As described earlier, one can predict how the 3D object would look using the current estimate of the intrinsic parameters, and compare that prediction to the actual observation. The following function computes the residuals between the rectangle corners projected with the ground-truth intrinsics and with the current estimate; this is the distance we will seek to minimize.


In [0]:
def residuals(estimate_focal_lengths, estimate_principal_point):
  vertices_2d_gt = perspective.project(rectangle_vertices, real_focal_lengths,
                                       real_principal_point)
  vertices_2d_observed = perspective.project(rectangle_vertices,
                                             estimate_focal_lengths,
                                             estimate_principal_point)
  return vertices_2d_gt - vertices_2d_observed

All the pieces are now in place to solve the problem; let's give it a go!

Note: the residuals are minimized using Levenberg-Marquardt, which is particularly well suited to this nonlinear least-squares problem. First-order optimizers (e.g. Adam or gradient descent) could also be used; see the sketch at the end of this notebook.


In [0]:
# Samples intrinsic parameters to recover and an initial solution.
(real_focal_lengths, real_principal_point, estimate_focal_lengths,
 estimate_principal_point) = build_parameters()

# Constructs the observed image.
observation = render_rectangle(rectangle_vertices, real_focal_lengths,
                               real_principal_point, image_dimensions)

# Displays the initial solution.
print("Initial configuration:")
print_errors(
    np.linalg.norm(estimate_focal_lengths - real_focal_lengths),
    np.linalg.norm(estimate_principal_point - real_principal_point))
image = render_rectangle(rectangle_vertices, estimate_focal_lengths,
                         estimate_principal_point, image_dimensions)
plot_optimization_step(observation, image)

# Optimization.
_, (estimate_focal_lengths,
    estimate_principal_point) = levenberg_marquardt.minimize(
        residuals, (estimate_focal_lengths, estimate_principal_point), 1)

print("Predicted configuration:")
print_errors(
    np.linalg.norm(estimate_focal_lengths - real_focal_lengths),
    np.linalg.norm(estimate_principal_point - real_principal_point))
image = render_rectangle(rectangle_vertices, estimate_focal_lengths,
                         estimate_principal_point, image_dimensions)
plot_optimization_step(observation, image)
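
As noted earlier, a first-order optimizer could be used in place of Levenberg-Marquardt. The sketch below (an illustrative addition; the learning rate and iteration count are not tuned values) refines the recovered parameters further using plain gradient descent on the sum of squared residuals. Since pixel coordinates are much less sensitive to the focal length than to the principal point, the problem is poorly conditioned, and gradient descent converges far more slowly than Levenberg-Marquardt.


In [0]:
# Refines the solution with plain gradient descent on the squared residuals.
variables = [tf.Variable(estimate_focal_lengths),
             tf.Variable(estimate_principal_point)]
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
for _ in range(1000):
  with tf.GradientTape() as tape:
    loss = tf.reduce_sum(tf.square(residuals(*variables)))
  optimizer.apply_gradients(zip(tape.gradient(loss, variables), variables))
print("Sum of squared residuals after gradient descent: %f" % loss.numpy())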