Project 4: Advanced Lane Finding

Importing all necessary libraries


In [1]:
### Import libraries
import numpy as np
import cv2
import glob
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import pickle
import time
import background as bg  # import background.py
from IPython.display import HTML
%matplotlib inline

Camera Calibration

1. Briefly state how you computed the camera matrix and distortion coefficients. Provide an example of a distortion corrected calibration image.

I start by preparing "object points", which will be the (x, y, z) coordinates of the chessboard corners in the world. Here I am assuming the chessboard is fixed on the (x, y) plane at z=0, such that the object points are the same for each calibration image. Thus, objp is just a replicated array of coordinates, and objpoints will be appended with a copy of it every time I successfully detect all chessboard corners in a test image. imgpoints will be appended with the (x, y) pixel position of each of the corners in the image plane with each successful chessboard detection.

I then used the output objpoints and imgpoints to compute the camera calibration and distortion coefficients using the cv2.calibrateCamera() function. I applied this distortion correction to the test images using the cv2.undistort() function and obtained these results:


In [ ]:
### Chessboard Corners
# Prepare object points
nx = 9
ny = 6
objp = np.zeros((nx*ny,3), np.float32)
objp[:,:2] = np.mgrid[0:nx, 0:ny].T.reshape(-1,2)

# Object points and Image points from all the images.
objpoints = [] # 3d points in real world space
imgpoints = [] # 2d points in image plane.
chessboard_corners = []

images = glob.glob('camera_cal/calibration*.jpg')

for fname in images:
    image = mpimg.imread(fname)
    gray = cv2.cvtColor(image, cv2.COLOR_RGB2GRAY)
    # Find chessboard corners
    ret, corners = cv2.findChessboardCorners(gray, (nx,ny), None)
    
    # If found, add object points, image points
    if ret == True:        
        objpoints.append(objp)
        imgpoints.append(corners)
        # Draw and display the corners
        chessboard_corners.append(cv2.drawChessboardCorners(image, (nx,ny), corners, ret))
    else:
        chessboard_corners.append(image)

Plot all images; only the ones with the correct grid size have corners drawn on them.


In [ ]:
# Create subplots in figure
fig = plt.figure(figsize=(20, 15))
for i, corner_img in enumerate(chessboard_corners, start=1):
    fig.add_subplot(5, 4, i)
    plt.imshow(corner_img)
    plt.axis('off')
fig.tight_layout()
plt.show()

Apply the camera calibration to a selected image.


In [ ]:
# Camera Calibration
img = mpimg.imread('camera_cal/calibration3.jpg')
img_size = (img.shape[1], img.shape[0])
# Calibrate Camera
ret, camera_mtx, camera_dist, rvecs, tvecs = cv2.calibrateCamera(objpoints, imgpoints, img_size, None, None)
# Undistort Test Image
undistort_image = cv2.undistort(img, camera_mtx, camera_dist, None, camera_mtx)
bg.disp_img(img, undistort_image)

In [ ]:
# Save and load camera calibration pickle data
bg.save()
bg.load()
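
bg.save() and bg.load() are defined in background.py and are not reproduced here. As a minimal sketch, assuming the helpers simply pickle the camera matrix and distortion coefficients computed above (the names and file path below are illustrative):

def save_calibration(mtx, dist, path='camera_cal/calibration_pickle.p'):
    # Persist the calibration so it does not need to be recomputed every run
    with open(path, 'wb') as f:
        pickle.dump({'mtx': mtx, 'dist': dist}, f)

def load_calibration(path='camera_cal/calibration_pickle.p'):
    # Restore the saved camera matrix and distortion coefficients
    with open(path, 'rb') as f:
        data = pickle.load(f)
    return data['mtx'], data['dist']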

Pipeline (single images)

1. Provide an example of a distortion-corrected image.

To demonstrate this step, I will describe how I apply the distortion correction to one of the test images:

  • Firstly, I load the image.
  • Secondly, I apply a function, undistort_img(), defined in background.py, to undistort the original image (a sketch of such a wrapper is given below).

It is clear that there are noticeable differences between the original and the undistorted image, especially at the edges.
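
The implementation of undistort_img() lives in background.py. As a minimal sketch, assuming the camera matrix and distortion coefficients computed during calibration, and with purely illustrative source/destination corners for the later perspective transform:

def undistort_img_sketch(image, mtx=camera_mtx, dist=camera_dist):
    # Undistort using the calibration computed in the camera calibration step
    undistorted = cv2.undistort(image, mtx, dist, None, mtx)
    # Assumed (illustrative) source/destination corners for the perspective transform
    src_corners = np.float32([[585, 460], [695, 460], [1127, 720], [203, 720]])
    dst_corners = np.float32([[320, 0], [960, 0], [960, 720], [320, 720]])
    return undistorted, src_corners, dst_corners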


In [ ]:
image = mpimg.imread('test_images/straight_lines1.jpg')
undistorted, src_corners, dst_corners = bg.undistort_img(image)
bg.disp_img(image, undistorted)

2. Describe how (and identify where in your code) you used color transforms, gradients or other methods to create a thresholded binary image. Provide an example of a binary image result.

Please refer to the color_threshold() function in background.py for details on how I transformed the undistorted image into a binary image. These are the steps I took:

  • Firstly, I defined a gradient threshold for the Sobel operator and a colour threshold for the HLS colour space.
  • Then, I created a gradient binary (sxbinary) and a thresholded colour channel (s_binary).
  • Finally, I combined them to form a new binary image that captures the best of both worlds (a sketch of such a function is given below).
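
The color_threshold() implementation itself is in background.py. A minimal sketch of the combination described above, assuming a Sobel-x gradient threshold and an HLS S-channel threshold (the threshold values below are illustrative):

def color_threshold_sketch(img, sx_thresh=(20, 100), s_thresh=(170, 255)):
    # Gradient binary: Sobel operator in the x direction on the grayscale image
    gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
    abs_sobelx = np.absolute(cv2.Sobel(gray, cv2.CV_64F, 1, 0))
    scaled_sobel = np.uint8(255 * abs_sobelx / np.max(abs_sobelx))
    sxbinary = np.zeros_like(scaled_sobel)
    sxbinary[(scaled_sobel >= sx_thresh[0]) & (scaled_sobel <= sx_thresh[1])] = 1
    # Colour binary: S channel of the HLS colour space
    s_channel = cv2.cvtColor(img, cv2.COLOR_RGB2HLS)[:, :, 2]
    s_binary = np.zeros_like(s_channel)
    s_binary[(s_channel >= s_thresh[0]) & (s_channel <= s_thresh[1])] = 1
    # Combine the two binaries into one thresholded image
    combined = np.zeros_like(sxbinary)
    combined[(sxbinary == 1) | (s_binary == 1)] = 1
    return combined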

In [ ]:
threshold_image = bg.color_threshold(undistorted)
bg.disp_img(undistorted, threshold_image)

3. Describe how (and identify where in your code) you performed a perspective transform and provide an example of a transformed image.

Please refer to perspective_transform() located in background.py for details on the components of the function.

What I did:

  • Undistort the vanilla image.
  • Get the image size.
  • Leverage cv2.getPerspectiveTransform to obtain M, the perspective transform matrix, from the previously defined source and destination points.
  • Perform the perspective transform using the cv2.warpPerspective function.
  • The outputs of perspective_transform() are the warped image and the inverse perspective transform matrix, which is used later to draw the lane area back onto the vanilla image (a sketch follows this list).
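
The full perspective_transform() lives in background.py. A minimal sketch of its core, assuming src and dst are the previously defined source and destination points as np.float32 arrays:

def perspective_transform_sketch(img, src, dst):
    img_size = (img.shape[1], img.shape[0])
    M = cv2.getPerspectiveTransform(src, dst)       # forward transform matrix
    M_inv = cv2.getPerspectiveTransform(dst, src)   # inverse, used later to unwarp the lane area
    warped = cv2.warpPerspective(img, M, img_size, flags=cv2.INTER_LINEAR)
    return warped, M_inv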

In [ ]:
warped_original, M_Inv = bg.perspective_transform(image)
bg.disp_img(image, warped_original)

In [ ]:
warped_image, M_Inv = bg.perspective_transform(threshold_image)
bg.disp_img(threshold_image, warped_image)

4. Describe how (and identify where in your code) you identified lane-line pixels and fit their positions with a polynomial?

  • First of all, in background.py, the lane_coordinates() function calculates the histogram of 10 equally sized sections of the thresholded image. For each section I identify the histogram peaks and the corresponding coordinates while filtering out noise.

  • The indices obtained above are then used to implement a sliding window technique in which each window identifies which pixels are part of the lane lines; the pixel coordinates are then stored in variables associated with each lane.

  • In the identify_lane() function, the pixel coordinates obtained above are used to fit a 2nd order polynomial (using numpy's polyfit function) to obtain the left and right lane lines (see the sketch after this list). These lane lines are then extrapolated based on the image dimensions. For a video clip, up to 20 frames are saved (using global variables) and averaged over, to replace the right lane line in any frame where very few lane points are detected.

  • The draw_lane_line() function draws the lane lines and fills the lane area using OpenCV's polylines() and fillPoly() functions on top of a blank image. This image is then unwarped using OpenCV's warpPerspective() function. The output of this is shown in step 6 below.
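
As a minimal sketch of the fitting step (the full sliding-window logic is in background.py), assuming lefty/leftx and righty/rightx hold the pixel coordinates collected by the windows for each lane:

# Fit a 2nd order polynomial x = A*y**2 + B*y + C to each lane line
left_fit = np.polyfit(lefty, leftx, 2)
right_fit = np.polyfit(righty, rightx, 2)
# Evaluate the fits over the full image height to extrapolate the lines
ploty = np.linspace(0, warped_image.shape[0] - 1, warped_image.shape[0])
left_fit_x = left_fit[0]*ploty**2 + left_fit[1]*ploty + left_fit[2]
right_fit_x = right_fit[0]*ploty**2 + right_fit[1]*ploty + right_fit[2]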


In [ ]:
left_lane_idx, right_lane_idx = bg.lane_coordinates(warped_image)
image_lane_point = bg.draw_lane_points(warped_image,left_lane_idx, right_lane_idx)
bg.disp_img(warped_image, image_lane_point)

In [ ]:
img_size = [image.shape[1], image.shape[0]]
left_lane_y, right_lane_y, left_fit_x, right_fit_x, left_fit, right_fit = bg.identify_lane(left_lane_idx,
                                                                                           right_lane_idx, img_size)
out_img = bg.draw_curved_line(image_lane_point, right_fit)
out_img_2 = bg.draw_curved_line(out_img, left_fit)
bg.disp_img(warped_image, out_img_2)

5. Describe how (and identify where in your code) you calculated the radius of curvature of the lane and the position of the vehicle with respect to center.

The radius of curvature and the position of the vehicle are implemented in the functions lane_curvature() and distance_from_lane() as defined in background.py.

Unit conversions:

xm_per_pix = 3.7/700 # metres per pixel in x dimension

ym_per_pix = 30/720 # metres per pixel in y dimension

A 2nd order polynomial is then fit to the lane pixels converted to meters.

curvature = ((1 + (2*new_fit[0]*y_eval + new_fit[1])**2)**1.5)/np.absolute(2*new_fit[0])

To calculate the distance of the car from the middle of the lane, the average x coordinate of the two lane lines is calculated (using the bottommost points of the lanes) before the image centre is subtracted. Multiplying the result by xm_per_pix gives how far the car has deviated from the dead centre of the current lane.

car_position = ((left_lane[-1] + right_lane[-1])//2 - img_center[0]) * xm_per_pix

The above values are then displayed as text on every video frame in the draw_lane_line() function using OpenCV's putText() function.
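
Putting the two formulas together, a minimal sketch of the calculation (assuming left_fit_x and right_fit_x are the fitted x positions evaluated at every y value of the warped image, as in the earlier sketch):

ym_per_pix = 30/720   # metres per pixel in y dimension
xm_per_pix = 3.7/700  # metres per pixel in x dimension
ploty = np.linspace(0, img_size[1] - 1, img_size[1])
y_eval = np.max(ploty) * ym_per_pix    # evaluate curvature at the bottom of the image, in metres
# Refit the left lane line in metre space, then apply the radius-of-curvature formula
new_fit = np.polyfit(ploty * ym_per_pix, left_fit_x * xm_per_pix, 2)
curvature = ((1 + (2*new_fit[0]*y_eval + new_fit[1])**2)**1.5) / np.absolute(2*new_fit[0])
# Offset of the car from the lane centre, assuming the camera is mounted at the image centre
car_position = ((left_fit_x[-1] + right_fit_x[-1])//2 - img_size[0]//2) * xm_per_pix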

6. Provide an example image of your result plotted back down onto the road such that the lane area is identified clearly.


In [ ]:
img = mpimg.imread('test_images/test1.jpg')
final_img = bg.pipeline(img)
bg.disp_img(img, final_img)

In [ ]:
img = mpimg.imread('test_images/test2.jpg')
final_img = bg.pipeline(img)
bg.disp_img(img, final_img)

In [ ]:
img = mpimg.imread('test_images/test3.jpg')
final_img = bg.pipeline(img)
bg.disp_img(img, final_img)

Creating videos


In [ ]:
bg.make_video(video_path = "input_videos/project_video.mp4", file_out = "output_videos/project_output.mp4")
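
make_video() is defined in background.py. A minimal sketch, assuming the moviepy package is used to feed every frame through the single-image pipeline:

from moviepy.editor import VideoFileClip

def make_video_sketch(video_path, file_out):
    clip = VideoFileClip(video_path)
    processed = clip.fl_image(bg.pipeline)            # run the pipeline on every frame
    processed.write_videofile(file_out, audio=False)  # write the annotated video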

Discussion

There is no doubt that my pipeline is flawed in a number of ways. Here are some observations:

  • The video pipeline seems to work best when lane lines are solid (the yellow lane line on the left-hand side), and its performance deteriorates significantly where broken lane lines are present (the white lane line).
  • The ambient lighting in the input video plays an overarching role in the quality of the lane marking drawn. Poor lighting caused by tree shadows, overpasses or unfavourable weather can wreak havoc on the output quality, in some cases rendering the pipeline useless.
  • When lane lines are badly worn, my model sometimes fails to detect them at all; this issue is present in the challenge video.
  • When the curvature is high, as in the harder challenge video where corners are short and sharp, the lane detection fails miserably once again.

There are several ways to make this more robust:

  • A better class structure for each lane line, to help keep track of the previous N frames (a hypothetical sketch is given below).
  • Better tuning for gradient-based thresholding, and trialling a variety of colour spaces.
  • An improved perspective transform that does not hard-code the source and destination points. One option is to use the Hough transform to identify lanes in a test image and use their end points.
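
For the first point, a minimal sketch of such a Line class (hypothetical, not part of the current code):

class Line:
    """Keeps the polynomial fits of the last n frames for one lane line."""
    def __init__(self, n=20):
        self.n = n
        self.recent_fits = []           # fits from the most recent n frames

    def add_fit(self, fit):
        self.recent_fits.append(fit)
        if len(self.recent_fits) > self.n:
            self.recent_fits.pop(0)     # drop the oldest frame

    def best_fit(self):
        # Average the stored coefficients to smooth the detection across frames
        return np.mean(self.recent_fits, axis=0)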

In [2]:
x = [[1,2,3],
     [4,5,6],
     [7,8,9]]

In [3]:
x


Out[3]:
[[1, 2, 3], [4, 5, 6], [7, 8, 9]]

In [4]:
x = np.array(x)

In [5]:
x


Out[5]:
array([[1, 2, 3],
       [4, 5, 6],
       [7, 8, 9]])

In [6]:
nonzero= x.nonzero()

In [7]:
nonzero


Out[7]:
(array([0, 0, 0, 1, 1, 1, 2, 2, 2], dtype=int64),
 array([0, 1, 2, 0, 1, 2, 0, 1, 2], dtype=int64))

In [8]:
nonzeroy=np.array(nonzero[0])

In [9]:
nonzeroy


Out[9]:
array([0, 0, 0, 1, 1, 1, 2, 2, 2], dtype=int64)

In [20]:
nonzerox=np.array(nonzero[1])

In [21]:
nonzerox


Out[21]:
array([0, 1, 2, 0, 1, 2, 0, 1, 2], dtype=int64)

In [39]:
good=((nonzeroy<1)&(nonzerox>0))

In [40]:
good


Out[40]:
array([False,  True,  True, False, False, False, False, False, False], dtype=bool)

In [45]:
good=((nonzeroy<1)&(nonzerox>0)).nonzero()

In [46]:
good


Out[46]:
(array([1, 2], dtype=int64),)

In [43]:
good=((nonzeroy<1)&(nonzerox>0)).nonzero()[0]

In [44]:
good


Out[44]:
array([0], dtype=int64)

In [30]:
a=[]

In [31]:
a.append(good)

In [32]:
a


Out[32]:
[array([0], dtype=int64)]

In [33]:
a.append(good)

In [34]:
a


Out[34]:
[array([0], dtype=int64), array([0], dtype=int64)]

In [38]:
a[1][0]


Out[38]:
0

In [ ]: