In [1]:
### Import libraries
import numpy as np
import cv2
import glob
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import pickle
import time
import background as bg # import background.py
from IPython.display import HTML
%matplotlib inline
I start by preparing "object points", which are the (x, y, z) coordinates of the chessboard corners in the world. Here I assume the chessboard is fixed on the (x, y) plane at z=0, so the object points are the same for every calibration image. Thus, objp is just a replicated array of coordinates, and a copy of it is appended to objpoints every time I successfully detect all chessboard corners in a calibration image. imgpoints is appended with the (x, y) pixel position of each corner in the image plane for each successful chessboard detection.
I then use the output objpoints and imgpoints to compute the camera matrix and distortion coefficients using the cv2.calibrateCamera() function. I apply this distortion correction to the test images using the cv2.undistort() function and obtain these results:
In [ ]:
### Chessboard Corners
# Prepare object points
nx = 9
ny = 6
objp = np.zeros((nx*ny,3), np.float32)
objp[:,:2] = np.mgrid[0:nx, 0:ny].T.reshape(-1,2)
# Object points and Image points from all the images.
objpoints = [] # 3d points in real world space
imgpoints = [] # 2d points in image plane.
chessboard_corners = []
images = glob.glob('camera_cal/calibration*.jpg')
for fname in images:
    image = mpimg.imread(fname)
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
    # Find the chessboard corners
    ret, corners = cv2.findChessboardCorners(gray, (nx, ny), None)
    # If found, add object points and image points
    if ret:
        objpoints.append(objp)
        imgpoints.append(corners)
        # Draw the corners for the display grid below
        chessboard_corners.append(cv2.drawChessboardCorners(image, (nx, ny), corners, ret))
    else:
        chessboard_corners.append(image)
In [ ]:
# Create subplots in figure
fig = plt.figure(figsize=(20, 15))
for i in range(1, len(chessboard_corners) + 1):
    fig.add_subplot(5, 4, i)
    plt.imshow(chessboard_corners[i - 1])
    plt.axis('off')
fig.tight_layout()
plt.show()
In [ ]:
# Camera Calibration
img = mpimg.imread('camera_cal/calibration3.jpg')
img_size = (img.shape[1], img.shape[0])
# Calibrate Camera
ret, camera_mtx, camera_dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, img_size, None, None)
# Undistort Test Image
undistort_image = cv2.undistort(img, camera_mtx, camera_dist, None, camera_mtx)
bg.disp_img(img, undistort_image)
In [ ]:
# Save and load camera calibration pickle data
bg.save()
bg.load()
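The save() and load() helpers live in background.py. A minimal sketch of what they might look like, assuming they pickle the camera_mtx and camera_dist computed above (the calibration.p file name is an assumption):
def save(path='calibration.p'):  # hypothetical sketch; the path default is an assumption
    with open(path, 'wb') as f:
        pickle.dump({'mtx': camera_mtx, 'dist': camera_dist}, f)

def load(path='calibration.p'):
    with open(path, 'rb') as f:
        data = pickle.load(f)
    return data['mtx'], data['dist']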
To demonstrate this step, I apply the distortion correction to one of the test images, like this one:
It is clear that there are noticeable differences between the original and undistorted images, especially at the edges.
In [ ]:
image = mpimg.imread('test_images/straight_lines1.jpg')
undistorted, src_corners, dst_corners = bg.undistort_img(image)
bg.disp_img(image, undistorted)
Please refer to the color_threshold() function in background.py for the details of how I transform the undistorted image into a binary image; a sketch of the idea follows the cell below.
In [ ]:
threshold_image = bg.color_threshold(undistorted)
bg.disp_img(undistorted, threshold_image)
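The exact thresholding steps live in background.py. As a minimal sketch, a typical color/gradient threshold of this kind combines an HLS S-channel threshold with a Sobel-x gradient threshold; the function name and threshold values below are illustrative assumptions, not the ones in color_threshold():
def color_threshold_sketch(img, s_thresh=(170, 255), sx_thresh=(20, 100)):
    # The saturation channel picks up colored lane paint well
    hls = cv2.cvtColor(img, cv2.COLOR_RGB2HLS)
    s_channel = hls[:, :, 2]
    # Gradient in x on the lightness channel highlights near-vertical edges
    sobelx = np.absolute(cv2.Sobel(hls[:, :, 1], cv2.CV_64F, 1, 0))
    scaled = np.uint8(255 * sobelx / np.max(sobelx))
    # Combine the two binary masks
    binary = np.zeros_like(s_channel)
    binary[((s_channel >= s_thresh[0]) & (s_channel <= s_thresh[1])) |
           ((scaled >= sx_thresh[0]) & (scaled <= sx_thresh[1]))] = 1
    return binary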
Please refer to perspective_transform() in background.py for the details of the function; a sketch of the approach follows the cells below.
In [ ]:
warped_original, M_Inv = bg.perspective_transform(image)
bg.disp_img(image, warped_original)
In [ ]:
warped_image, M_Inv = bg.perspective_transform(threshold_image)
bg.disp_img(threshold_image, warped_image)
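A minimal sketch of such a perspective transform, assuming hand-picked source points on the lane trapezoid mapped to a destination rectangle (the specific fractions below are illustrative assumptions, not the coordinates used in perspective_transform()):
def perspective_transform_sketch(img):
    h, w = img.shape[:2]
    # Source trapezoid around the lane and destination rectangle (illustrative values)
    src = np.float32([[w * 0.45, h * 0.63], [w * 0.55, h * 0.63], [w * 0.90, h], [w * 0.10, h]])
    dst = np.float32([[w * 0.20, 0], [w * 0.80, 0], [w * 0.80, h], [w * 0.20, h]])
    M = cv2.getPerspectiveTransform(src, dst)
    M_inv = cv2.getPerspectiveTransform(dst, src)  # kept for unwarping the lane overlay later
    warped = cv2.warpPerspective(img, M, (w, h), flags=cv2.INTER_LINEAR)
    return warped, M_inv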
In background.py, the lane_coordinates() function computes a histogram for each of 10 equally sized horizontal sections of the thresholded, warped image. For each section I identify the histogram peaks and the corresponding coordinates while filtering out noise.
The indices obtained this way drive a sliding-window search: each window identifies which pixels belong to a lane line, and those pixel coordinates are stored in variables associated with each lane. A sketch of the idea follows.
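A minimal sketch of a histogram-plus-sliding-window search of this kind, assuming 10 windows, a fixed search margin, and a minimum pixel count for re-centering (the parameter values are illustrative, not the ones in lane_coordinates()):
def lane_coordinates_sketch(binary_warped, nwindows=10, margin=100, minpix=50):
    h = binary_warped.shape[0]
    # Histogram of the bottom half gives a starting x position for each lane line
    histogram = np.sum(binary_warped[h // 2:, :], axis=0)
    midpoint = histogram.shape[0] // 2
    leftx_current = np.argmax(histogram[:midpoint])
    rightx_current = midpoint + np.argmax(histogram[midpoint:])
    nonzeroy, nonzerox = binary_warped.nonzero()
    window_height = h // nwindows
    left_idx, right_idx = [], []
    for win in range(nwindows):
        y_low = h - (win + 1) * window_height
        y_high = h - win * window_height
        good_left = ((nonzeroy >= y_low) & (nonzeroy < y_high) &
                     (nonzerox >= leftx_current - margin) &
                     (nonzerox < leftx_current + margin)).nonzero()[0]
        good_right = ((nonzeroy >= y_low) & (nonzeroy < y_high) &
                      (nonzerox >= rightx_current - margin) &
                      (nonzerox < rightx_current + margin)).nonzero()[0]
        left_idx.append(good_left)
        right_idx.append(good_right)
        # Re-center the next window on the mean x of the pixels just found
        if len(good_left) > minpix:
            leftx_current = int(np.mean(nonzerox[good_left]))
        if len(good_right) > minpix:
            rightx_current = int(np.mean(nonzerox[good_right]))
    return np.concatenate(left_idx), np.concatenate(right_idx)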
In the identify_lane() function, the pixel coordinates obtained above are used to fit a 2nd-order polynomial (using numpy's polyfit function) to each of the left and right lane lines. These lane lines are then extrapolated over the full image height.
For a video clip, up to 20 frames of fits are saved (using global variables) and averaged, to replace the right lane line in any frame where very few lane points are detected. A sketch of the fit step follows.
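A minimal sketch of the fit-and-extrapolate step, assuming the index arrays select into the nonzero pixel coordinates as above (the names are illustrative):
def fit_lane_sketch(nonzerox, nonzeroy, lane_idx, img_height):
    # Fit x = A*y**2 + B*y + C; x is modeled as a function of y because lane lines are near-vertical
    fit = np.polyfit(nonzeroy[lane_idx], nonzerox[lane_idx], 2)
    # Extrapolate the line over the full image height
    plot_y = np.linspace(0, img_height - 1, img_height)
    fit_x = fit[0] * plot_y**2 + fit[1] * plot_y + fit[2]
    return fit, fit_x, plot_y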
The draw_lane_line() function draws the lane lines and fills the lane area on top of a blank image using OpenCV's polylines() and fillPoly() functions. This image is then unwarped using OpenCV's warpPerspective() function with the inverse matrix M_Inv. The output of this is shown in the 6th step below; a sketch of the drawing step follows.
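A minimal sketch of the draw-and-unwarp step, assuming the fitted x arrays and plot_y from the fit step and the M_Inv returned by perspective_transform() (the colors and blend weights are illustrative assumptions):
def draw_lane_sketch(undistorted, left_fit_x, right_fit_x, plot_y, M_inv):
    overlay = np.zeros_like(undistorted)
    # Build one closed polygon: down the left line, back up the right line
    left_pts = np.array([np.transpose(np.vstack([left_fit_x, plot_y]))])
    right_pts = np.array([np.flipud(np.transpose(np.vstack([right_fit_x, plot_y])))])
    pts = np.hstack((left_pts, right_pts)).astype(np.int32)
    cv2.fillPoly(overlay, pts, (0, 255, 0))
    cv2.polylines(overlay, pts, isClosed=False, color=(255, 0, 0), thickness=20)
    # Warp the overlay back to the original perspective and blend it in
    h, w = undistorted.shape[:2]
    unwarped = cv2.warpPerspective(overlay, M_inv, (w, h))
    return cv2.addWeighted(undistorted, 1.0, unwarped, 0.3, 0)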
In [ ]:
left_lane_idx, right_lane_idx = bg.lane_coordinates(warped_image)
image_lane_point = bg.draw_lane_points(warped_image,left_lane_idx, right_lane_idx)
bg.disp_img(warped_image, image_lane_point)
In [ ]:
img_size = [image.shape[1], image.shape[0]]
left_lane_y, right_lane_y, left_fit_x, right_fit_x, left_fit, right_fit = bg.identify_lane(left_lane_idx,
                                                                                           right_lane_idx, img_size)
out_img = bg.draw_curved_line(image_lane_point, right_fit)
out_img_2 = bg.draw_curved_line(out_img, left_fit)
bg.disp_img(warped_image, out_img_2)  # out_img_2 has both fitted lines drawn
The radius of curvature and the position of the vehicle are implemented in the functions lane_curvature() and distance_from_lane(), as defined in background.py.
Unit conversions:
xm_per_pix = 3.7/700 # metres per pixel in x dimension
ym_per_pix = 30/720 # metres per pixel in y dimension
A 2nd-order polynomial is then fit to the lane pixels after converting them to meters, and the radius of curvature is evaluated from the fit:
curvature = ((1 + (2*new_fit[0]*y_eval + new_fit[1])**2)**1.5)/np.absolute(2*new_fit[0])
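This is the standard radius-of-curvature formula for a 2nd-order fit $x = Ay^2 + By + C$, evaluated at $y_{eval}$:

$$R_{curve} = \frac{\left(1 + (2A\,y_{eval} + B)^2\right)^{3/2}}{\left|2A\right|}$$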
To calculate the distance of the car from the middle of the lane, the average x coordinate of the two lane lines is computed (using the bottommost points of the lanes) before the x coordinate of the image center is subtracted. Multiplying the result by xm_per_pix gives how far the car deviates from the dead centre of the current lane.
car_position = ((left_lane[-1] + right_lane[-1])//2 - img_center[0]) * xm_per_pix
The above values are then displayed as text on every video frame in the draw_lane_line() function using OpenCV's putText() function.
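A minimal sketch of the two measurements, mirroring the description above (the function names and argument shapes are illustrative; the real helpers live in background.py):
xm_per_pix = 3.7 / 700  # metres per pixel in x dimension (from above)
ym_per_pix = 30 / 720   # metres per pixel in y dimension (from above)

def lane_curvature_sketch(plot_y, fit_x):
    # Refit in metres, then evaluate the curvature at the bottom of the image
    new_fit = np.polyfit(plot_y * ym_per_pix, fit_x * xm_per_pix, 2)
    y_eval = np.max(plot_y) * ym_per_pix
    return ((1 + (2 * new_fit[0] * y_eval + new_fit[1])**2)**1.5) / np.absolute(2 * new_fit[0])

def distance_from_lane_sketch(left_lane, right_lane, img_center):
    # Negative means the car sits to the left of the lane centre
    return ((left_lane[-1] + right_lane[-1]) // 2 - img_center[0]) * xm_per_pix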
In [ ]:
img = mpimg.imread('test_images/test1.jpg')
final_img = bg.pipeline(img)
bg.disp_img(img, final_img)
In [ ]:
img = mpimg.imread('test_images/test2.jpg')
final_img = bg.pipeline(img)
bg.disp_img(img, final_img)
In [ ]:
img = mpimg.imread('test_images/test3.jpg')
final_img = bg.pipeline(img)
bg.disp_img(img, final_img)
In [ ]:
bg.make_video(video_path = "input_videos/project_video.mp4", file_out = "output_videos/project_output.mp4")
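bg.make_video() wraps the image pipeline over a video. A minimal sketch of such a helper, assuming it uses moviepy's VideoFileClip (an assumption; the HTML import at the top suggests the result is also embedded in the notebook):
from moviepy.editor import VideoFileClip  # assumption: background.py uses moviepy

def make_video_sketch(video_path, file_out):
    clip = VideoFileClip(video_path)
    # Apply the single-image pipeline to every frame
    processed = clip.fl_image(bg.pipeline)
    processed.write_videofile(file_out, audio=False)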
There is no doubt that my pipeline is flawed in a number of ways; here are some of the observations.
There are several ways to make it more robust.
In [ ]:
# Scratch work: a step-by-step look at the nonzero() indexing used by the
# sliding-window search above.
x = np.array([[1, 2, 3],
              [4, 5, 6],
              [7, 8, 9]])
nonzero = x.nonzero()            # tuple of (row indices, column indices)
nonzeroy = np.array(nonzero[0])  # y (row) coordinates of the non-zero entries
nonzerox = np.array(nonzero[1])  # x (column) coordinates of the non-zero entries
# Boolean window mask, then the positions where it is True
good = ((nonzeroy < 1) & (nonzerox > 0)).nonzero()[0]
# Indices from successive windows are collected in a list
a = []
a.append(good)
a.append(good)
a[1][0]