Before we can begin developing the pipeline for detecting lane lines, we need to account for the distortion in our camera. To do that, we need some information about the camera - we need to calibrate it.
By inspecting pictures of a chessboard pattern (a predictable, known pattern) we can find the distortion coefficients and camera matrix that we need to undistort images.
In [16]:
import cv2
import numpy as np
import glob
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
%matplotlib inline
In [17]:
ret, cMat, coefs, rvects, tvects = None, None, None, None, None
def calibrate_camera():
im_paths = glob.glob('./camera_cal_images/calibration*.jpg')
cb_shape = (9, 6) # Corners we expect to be detected on the chessboard
obj_points = [] # 3D points in real-world space
img_points = [] # 2D points in the image
for im_path in im_paths:
img = mpimg.imread(im_path)
obj_p = np.zeros((cb_shape[0]*cb_shape[1], 3), np.float32)
coords = np.mgrid[0:cb_shape[0], 0:cb_shape[1]].T.reshape(-1, 2) # x, y coords
obj_p[:,:2] = coords
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
found_all, corners = cv2.findChessboardCorners(gray, cb_shape, None)
if found_all:
obj_points.append(obj_p)
img_points.append(corners)
img = cv2.drawChessboardCorners(img, cb_shape, corners, found_all)
else:
print("Couldn't find all corners in image:", im_path)
    return cv2.calibrateCamera(obj_points, img_points, gray.shape[::-1], None, None) # calibrateCamera expects the image size as (width, height)
ret, cMat, coefs, rvects, tvects = calibrate_camera()
print(coefs)
In [18]:
def side_by_side_plot(im1, im2, im1_title=None, im2_title=None):
f, (ax1, ax2) = plt.subplots(1, 2, figsize=(20, 10))
f.tight_layout()
ax1.imshow(im1)
if im1_title: ax1.set_title(im1_title, fontsize=30)
ax2.imshow(im2)
if im2_title: ax2.set_title(im2_title, fontsize=30)
plt.subplots_adjust(left=0., right=1, top=0.9, bottom=0.)
def big_plot(img):
fig, ax = plt.subplots(figsize=(20, 10))
ax.imshow(img)
In [19]:
def undistort(img):
if cMat is None: raise Exception('The camera needs to be calibrated first.')
return cv2.undistort(img, cMat, coefs, None, cMat)
test_undistort = mpimg.imread('./camera_cal_images/not_enough_corners/calibration1.jpg')
undistorted = undistort(test_undistort)
cv2.imwrite('./example_images/undistorted_checkerboard.jpg', undistorted)
side_by_side_plot(test_undistort, undistorted, 'Original', 'Undistorted')
In [20]:
img = mpimg.imread('./test_images/test2.jpg')
undistorted = undistort(img)
side_by_side_plot(img, undistorted)
Both the chessboard and the road images above show the distorted image (left) and the undistorted image (right). It's much easier to see the effect in the chessboard image with all those parallel lines, but there are signs of it in the road image too: the yellow line, the hood of the car at the bottom, and the white car on the right.
In [21]:
def get_sobel_bin(img):
''' "img" should be 1-channel '''
sobel = cv2.Sobel(img, cv2.CV_64F, 1, 0, ksize=9) # x-direction gradient
abs_sobel = np.absolute(sobel)
scaled_sobel = np.uint8(255*abs_sobel/np.max(abs_sobel))
sobel_bin = np.zeros_like(scaled_sobel)
sobel_bin[(scaled_sobel >= 20) & (scaled_sobel <= 100)] = 1
return sobel_bin
def get_threshold(img, show=False):
''' "img" should be an undistorted image '''
# Color-space conversions
gray = cv2.cvtColor(img, cv2.COLOR_RGB2GRAY)
hls = cv2.cvtColor(img, cv2.COLOR_RGB2HLS)
s_channel = hls[:,:,2]
# Sobel gradient binaries
sobel_s_bin = get_sobel_bin(s_channel)
sobel_gray_bin = get_sobel_bin(gray)
sobel_comb_bin = np.zeros_like(sobel_s_bin)
sobel_comb_bin[(sobel_s_bin == 1) | (sobel_gray_bin == 1)] = 1
# HLS S-Channel binary
s_bin = np.zeros_like(s_channel)
s_bin[(s_channel >= 150) & (s_channel <= 255)] = 1
# Combine the binaries
comb_bin = np.zeros_like(sobel_comb_bin)
comb_bin[(sobel_comb_bin == 1) | (s_bin == 1)] = 1
gray_img = np.dstack((gray, gray, gray))
sobel_s_img = np.dstack((sobel_s_bin, sobel_s_bin, sobel_s_bin))*255
sobel_gray_img = np.dstack((sobel_gray_bin, sobel_gray_bin, sobel_gray_bin))*255
sobel_comb_img = np.dstack((sobel_comb_bin, sobel_comb_bin, sobel_comb_bin))*255
s_img = np.dstack((s_bin, s_bin, s_bin))*255
comb_img = np.dstack((comb_bin, comb_bin, comb_bin))*255
if show: side_by_side_plot(img, comb_img, 'Original', 'Thresholded')
return comb_img
Great! Looks like the thresholding method works.
Thresholding seemed to perform best using a combination of the s-channel from the HLS color space and the combined Sobel gradient. The combined Sobel gradient was created from both the grayscale and s-channel images, since each provided different, useful information.
We will combine this with the undistort method.
In [22]:
img = mpimg.imread('./test_images/test2.jpg')
undist = undistort(img)
threshold_img = get_threshold(undist, show=True)
cv2.imwrite('./example_images/threshold_test2.jpg', threshold_img)
Now that we can undistort images and threshold them to help isolate the important information, we need to further limit that information by looking only at the portion of the image we care about - the road.
To focus on the road portion of the image, we will shift our perspective to a top-down view of the road in front of the car. To test this function, we'll use a road image with straight lane lines.
In [23]:
def warp_to_lines(img, show=False):
''' "img" should be an undistorted image. '''
x_shape, y_shape = img.shape[1], img.shape[0]
middle_x = x_shape//2
top_y = 2*y_shape//3
top_margin = 93
bottom_margin = 450
points = [
(middle_x-top_margin, top_y),
(middle_x+top_margin, top_y),
(middle_x+bottom_margin, y_shape),
(middle_x-bottom_margin, y_shape)
]
'''
# This shows the area we are warping to
for i in range(len(points)):
cv2.line(img, points[i-1], points[i], [255, 0, 0], 2)
big_plot(img)
'''
src = np.float32(points)
dst = np.float32([
(middle_x-bottom_margin, 0),
(middle_x+bottom_margin, 0),
(middle_x+bottom_margin, y_shape),
(middle_x-bottom_margin, y_shape)
])
M = cv2.getPerspectiveTransform(src, dst)
Minv = cv2.getPerspectiveTransform(dst, src)
warped = cv2.warpPerspective(img, M, (x_shape, y_shape), flags=cv2.INTER_LINEAR)
if show: side_by_side_plot(img, warped, 'Original', 'Warped Perspective')
return warped, M, Minv
img = mpimg.imread('./test_images/straight_lines2.jpg')
img = undistort(img)
warped, M, Minv = warp_to_lines(img, show=True)
Great! We now have a function that can warp our perspective from a forward-facing road image to a top-down road image. The warped image will be perfect for isolating lane lines and measuring any curvature.
We will be applying this perspective warp to thresholded images.
In [24]:
img = mpimg.imread('./test_images/test2.jpg')
undist = undistort(img)
thresholded = get_threshold(undist)
warped, M, Minv = warp_to_lines(thresholded, show=True)
cv2.imwrite('./example_images/warped_threshold_test2.jpg', warped)
In [25]:
warped_bin = np.zeros_like(warped[:,:,0])
warped_bin[(warped[:,:,0] > 0)] = 1
histogram = np.sum(warped_bin[warped_bin.shape[0]//2:,:], axis=0) # Sum the columns in the bottom half of the image
plt.plot(histogram)
The histogram's primary left and right peaks should be our lane lines. As we follow the lines using a sliding-window search, we will begin our search at these two peaks.
In [26]:
def find_lane(warped, show=False):
# Create a binary version of the warped image
warped_bin = np.zeros_like(warped[:,:,0])
warped_bin[(warped[:,:,0] > 0)] = 1
vis_img = warped.copy() # The image we will draw on to show the lane-finding process
vis_img[vis_img > 0] = 255 # Max out non-black pixels so we can remove them later
# Sum the columns in the bottom portion of the image to create a histogram
histogram = np.sum(warped_bin[warped_bin.shape[0]//2:,:], axis=0)
    # Find the left and right peaks of the histogram
midpoint = histogram.shape[0]//2
left_x = np.argmax(histogram[:midpoint]) # x-position for the left window
right_x = np.argmax(histogram[midpoint:]) + midpoint # x-position for the right window
n_windows = 10
win_height = warped_bin.shape[0]//n_windows
margin = 80 # Determines how wide the window is
pix_to_recenter = margin*2 # If we find this many pixels in our window we will recenter (too few would be a bad recenter)
# Find the non-zero x and y indices
nonzero_ind = warped_bin.nonzero()
nonzero_y_ind = np.array(nonzero_ind[0])
nonzero_x_ind = np.array(nonzero_ind[1])
left_line_ind, right_line_ind = [], []
for win_i in range(n_windows):
win_y_low = warped_bin.shape[0] - (win_i+1)*win_height
win_y_high = warped_bin.shape[0] - (win_i)*win_height
win_x_left_low = max(0, left_x - margin)
win_x_left_high = left_x + margin
win_x_right_low = right_x - margin
win_x_right_high = min(warped_bin.shape[1]-1, right_x + margin)
# Draw the windows on the vis_img
rect_color, rect_thickness = (0, 255, 0), 3
cv2.rectangle(vis_img, (win_x_left_low, win_y_high), (win_x_left_high, win_y_low), rect_color, rect_thickness)
cv2.rectangle(vis_img, (win_x_right_low, win_y_high), (win_x_right_high, win_y_low), rect_color, rect_thickness)
# Record the non-zero pixels within the windows
left_ind = (
(nonzero_y_ind >= win_y_low) &
(nonzero_y_ind <= win_y_high) &
(nonzero_x_ind >= win_x_left_low) &
(nonzero_x_ind <= win_x_left_high)
).nonzero()[0]
right_ind = (
(nonzero_y_ind >= win_y_low) &
(nonzero_y_ind <= win_y_high) &
(nonzero_x_ind >= win_x_right_low) &
(nonzero_x_ind <= win_x_right_high)
).nonzero()[0]
left_line_ind.append(left_ind)
right_line_ind.append(right_ind)
# If there are enough pixels, re-align the window
if len(left_ind) > pix_to_recenter:
left_x = int(np.mean(nonzero_x_ind[left_ind]))
if len(right_ind) > pix_to_recenter:
right_x = int(np.mean(nonzero_x_ind[right_ind]))
# Combine the arrays of line indices
left_line_ind = np.concatenate(left_line_ind)
right_line_ind = np.concatenate(right_line_ind)
# Gather the final line pixel positions
left_x = nonzero_x_ind[left_line_ind]
left_y = nonzero_y_ind[left_line_ind]
right_x = nonzero_x_ind[right_line_ind]
right_y = nonzero_y_ind[right_line_ind]
# Color the lines on the vis_img
vis_img[left_y, left_x] = [254, 0, 0] # 254 so we can isolate the white 255 later
vis_img[right_y, right_x] = [0, 0, 254] # 254 so we can isolate the white 255 later
# Fit a 2nd-order polynomial to the lines
left_fit = np.polyfit(left_y, left_x, 2)
right_fit = np.polyfit(right_y, right_x, 2)
# Get our x/y vals for the fit lines
y_vals = np.linspace(0, warped_bin.shape[0]-1, warped_bin.shape[0])
left_x_vals = left_fit[0]*y_vals**2 + left_fit[1]*y_vals + left_fit[2]
right_x_vals = right_fit[0]*y_vals**2 + right_fit[1]*y_vals + right_fit[2]
if show:
fig, ax = plt.subplots(figsize=(20, 10))
ax.imshow(vis_img)
ax.plot(left_x_vals, y_vals, color='yellow')
ax.plot(right_x_vals, y_vals, color='yellow')
cv2.imwrite('./example_images/lane_detection_warped_test2.jpg', vis_img)
lane_lines_img = vis_img.copy()
lane_lines_img[lane_lines_img == 255] = 0 # This basically removes everything except the colored lane lines
return y_vals, left_x_vals, right_x_vals, left_fit, right_fit, lane_lines_img
y_vals, left_x_vals, right_x_vals, left_fit, right_fit, lane_lines_img = find_lane(warped, show=True)
Sweet! Looks like we were able to identify the lines. A few things happened here, as shown in the image above: the two histogram peaks seeded the starting window positions, the sliding windows tracked each line up the image (re-centering whenever enough pixels were found), and a second-order polynomial was fit to each line's pixels (the yellow curves).
With the lane lines identified in the last step, it's time to wrap up the pipeline by taking that information and drawing it back onto the original image. This will involve a reverse perspective transform, since the line/lane information was found from a warped perspective.
In [27]:
def draw_lane(img, lane_lines_img, y_vals, left_x_vals, right_x_vals, show=False):
# Prepare the x/y points for cv2.fillPoly()
left_points = np.array([np.vstack([left_x_vals, y_vals]).T])
right_points = np.array([np.flipud(np.vstack([right_x_vals, y_vals]).T)])
# right_points = np.array([np.vstack([right_x_vals, y_vals]).T])
points = np.hstack((left_points, right_points))
# Color the area between the lines (the lane)
lane = np.zeros_like(lane_lines_img) # Create a blank canvas to draw the lane on
cv2.fillPoly(lane, np.int_([points]), (0, 255, 0))
warped_lane_info = cv2.addWeighted(lane_lines_img, 1, lane, .3, 0)
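    # Note: Minv here is the global inverse perspective matrix computed earlier by warp_to_lines()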
unwarped_lane_info = cv2.warpPerspective(warped_lane_info, Minv, (img.shape[1], img.shape[0]))
drawn_img = cv2.addWeighted(img, 1, unwarped_lane_info, 1, 0)
if show: big_plot(drawn_img)
return drawn_img
drawn_img = draw_lane(img, lane_lines_img, y_vals, left_x_vals, right_x_vals, show=True)
In [28]:
def detect_lane_pipe(img, show=False):
''' "img" can be a path or a loaded image (for the movie pipeline) '''
if type(img) is str: img = mpimg.imread(img)
undist = undistort(img)
threshed = get_threshold(undist)
warped, M, Minv = warp_to_lines(threshed)
y_vals, left_x_vals, right_x_vals, left_fit, right_fit, lane_lines_img = find_lane(warped)
drawn_img = draw_lane(img, lane_lines_img, y_vals, left_x_vals, right_x_vals, show=show)
return drawn_img
drawn_img = detect_lane_pipe('./test_images/test2.jpg', show=True)
And there it is. 'detect_lane_pipe()' is our complete pipeline function for taking an image path and detecting the lane in that image.
All that parameter passing is becoming a bit unwieldy. Let's consolidate the values into a Lane class that will track information about the lane we detected in an image.
Not only will the Lane class be helpful for consolidating code, but it will also be useful later when we apply our pipeline to videos, frame by frame. For now, the focus will be code consolidation.
In [29]:
class Lane():
def __init__(self, y_vals, left_x_vals, right_x_vals, left_fit, right_fit, lane_lines_img):
self.y_vals = y_vals
self.left_x_vals = left_x_vals
self.right_x_vals = right_x_vals
self.left_fit = left_fit
self.right_fit = right_fit
self.lane_lines_img = lane_lines_img
In [30]:
def find_lane(warped, show=False):
# Create a binary version of the warped image
warped_bin = np.zeros_like(warped[:,:,0])
warped_bin[(warped[:,:,0] > 0)] = 1
vis_img = warped.copy() # The image we will draw on to show the lane-finding process
vis_img[vis_img > 0] = 255 # Max out non-black pixels so we can remove them later
# Sum the columns in the bottom portion of the image to create a histogram
histogram = np.sum(warped_bin[warped_bin.shape[0]//2:,:], axis=0)
    # Find the left and right peaks of the histogram
midpoint = histogram.shape[0]//2
left_x = np.argmax(histogram[:midpoint]) # x-position for the left window
right_x = np.argmax(histogram[midpoint:]) + midpoint # x-position for the right window
n_windows = 10
win_height = warped_bin.shape[0]//n_windows
margin = 50 # Determines how wide the window is
pix_to_recenter = margin*2 # If we find this many pixels in our window we will recenter (too few would be a bad recenter)
# Find the non-zero x and y indices
nonzero_ind = warped_bin.nonzero()
nonzero_y_ind = np.array(nonzero_ind[0])
nonzero_x_ind = np.array(nonzero_ind[1])
left_line_ind, right_line_ind = [], []
for win_i in range(n_windows):
win_y_low = warped_bin.shape[0] - (win_i+1)*win_height
win_y_high = warped_bin.shape[0] - (win_i)*win_height
win_x_left_low = max(0, left_x - margin)
win_x_left_high = left_x + margin
win_x_right_low = right_x - margin
win_x_right_high = min(warped_bin.shape[1]-1, right_x + margin)
# Draw the windows on the vis_img
rect_color, rect_thickness = (0, 255, 0), 3
cv2.rectangle(vis_img, (win_x_left_low, win_y_high), (win_x_left_high, win_y_low), rect_color, rect_thickness)
cv2.rectangle(vis_img, (win_x_right_low, win_y_high), (win_x_right_high, win_y_low), rect_color, rect_thickness)
# Record the non-zero pixels within the windows
left_ind = (
(nonzero_y_ind >= win_y_low) &
(nonzero_y_ind <= win_y_high) &
(nonzero_x_ind >= win_x_left_low) &
(nonzero_x_ind <= win_x_left_high)
).nonzero()[0]
right_ind = (
(nonzero_y_ind >= win_y_low) &
(nonzero_y_ind <= win_y_high) &
(nonzero_x_ind >= win_x_right_low) &
(nonzero_x_ind <= win_x_right_high)
).nonzero()[0]
left_line_ind.append(left_ind)
right_line_ind.append(right_ind)
# If there are enough pixels, re-align the window
if len(left_ind) > pix_to_recenter:
left_x = int(np.mean(nonzero_x_ind[left_ind]))
if len(right_ind) > pix_to_recenter:
right_x = int(np.mean(nonzero_x_ind[right_ind]))
# Combine the arrays of line indices
left_line_ind = np.concatenate(left_line_ind)
right_line_ind = np.concatenate(right_line_ind)
# Gather the final line pixel positions
left_x = nonzero_x_ind[left_line_ind]
left_y = nonzero_y_ind[left_line_ind]
right_x = nonzero_x_ind[right_line_ind]
right_y = nonzero_y_ind[right_line_ind]
# Color the lines on the vis_img
vis_img[left_y, left_x] = [254, 0, 0] # 254 so we can isolate the white 255 later
vis_img[right_y, right_x] = [0, 0, 254] # 254 so we can isolate the white 255 later
# Fit a 2nd-order polynomial to the lines
left_fit = np.polyfit(left_y, left_x, 2)
right_fit = np.polyfit(right_y, right_x, 2)
# Get our x/y vals for the fit lines
y_vals = np.linspace(0, warped_bin.shape[0]-1, warped_bin.shape[0])
left_x_vals = left_fit[0]*y_vals**2 + left_fit[1]*y_vals + left_fit[2]
right_x_vals = right_fit[0]*y_vals**2 + right_fit[1]*y_vals + right_fit[2]
if show:
fig, ax = plt.subplots(figsize=(20, 10))
ax.imshow(vis_img)
ax.plot(left_x_vals, y_vals, color='yellow')
ax.plot(right_x_vals, y_vals, color='yellow')
lane_lines_img = vis_img.copy()
lane_lines_img[lane_lines_img == 255] = 0 # This basically removes everything except the colored lane lines
# NEW - we can now just return a Lane obj
return Lane(y_vals, left_x_vals, right_x_vals, left_fit, right_fit, lane_lines_img)
# NEW - This function can now accept a Lane obj
def draw_lane(img, lane, show=False):
# Prepare the x/y points for cv2.fillPoly()
left_points = np.array([np.vstack([lane.left_x_vals, lane.y_vals]).T])
right_points = np.array([np.flipud(np.vstack([lane.right_x_vals, lane.y_vals]).T)])
# right_points = np.array([np.vstack([right_x_vals, y_vals]).T])
points = np.hstack((left_points, right_points))
# Color the area between the lines (the lane)
filled_lane = np.zeros_like(lane.lane_lines_img) # Create a blank canvas to draw the lane on
cv2.fillPoly(filled_lane, np.int_([points]), (0, 255, 0))
warped_lane_info = cv2.addWeighted(lane.lane_lines_img, 1, filled_lane, .3, 0)
unwarped_lane_info = cv2.warpPerspective(warped_lane_info, Minv, (img.shape[1], img.shape[0]))
drawn_img = cv2.addWeighted(img, 1, unwarped_lane_info, 1, 0)
if show: big_plot(drawn_img)
return drawn_img
def detect_lane_pipe(img, show=False):
''' "img" can be a path or a loaded image (for the movie pipeline) '''
if type(img) is str: img = mpimg.imread(img)
undist = undistort(img)
threshed = get_threshold(undist)
warped, M, Minv = warp_to_lines(threshed)
lane = find_lane(warped) # NEW - using the new lane class
return draw_lane(img, lane, show=show) # NEW - using the new lane class
drawn_img = detect_lane_pipe('./test_images/test2.jpg', show=True)
Now the pipeline identifies the lane while using the Lane class. This makes the code a little cleaner and allows us to store and use previous lane information.
Ok, so we have some code that's neat and makes a cool image, but is it useful? How do we use this code or image to direct a self-driving car? It turns out we can use the fit lines for each lane line to accomplish two things: measuring the curvature of the lane, and measuring how far the car is from the center of the lane. Both values help us determine steering angles.
Let's take a look at how we'll get these values.
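Concretely, these are the formulas the code below implements. For a second-order fit $x = Ay^2 + By + C$ (refit in world space, i.e. meters), the radius of curvature at a point $y$ is

$$R = \frac{\left(1 + (2Ay + B)^2\right)^{3/2}}{|2A|}$$

which we evaluate at the bottom of the image (closest to the car). The offset converts the pixel distance between the image center and the lane center at the bottom of the image into meters, using an assumed lane width of 3.7 m:

$$\text{offset} = 3.7\,\text{m} \times \frac{x_{\text{image center}} - x_{\text{lane center}}}{x_{\text{right}} - x_{\text{left}}}$$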
We'll want to add the curvature and offset values to the Lane class, so we'll expand it a bit. In the process, let's remove the constructor arguments, because that's getting a bit unwieldy too.
After that, we'll expand the "find_lane()" function to also calculate these new values and add them to the Lane class it already returns.
Finally, we'll display these new values in text on the final drawn image.
In [31]:
class Lane():
def __init__(self):
self.y_vals = None
self.left_x_vals = None
self.right_x_vals = None
self.lane_lines_img = None
self.left_curvature = None
self.right_curvature = None
self.offset = None
def calc_curvature(y_to_fit, x_to_fit, y_eval):
# Conversion factors for pixels to meters
m_per_pix_y, m_per_pix_x = 30/720, 3.7/700
# Fit a new polynomial to world-space (in meters)
fit = np.polyfit(y_to_fit*m_per_pix_y, x_to_fit*m_per_pix_x, 2)
curvature = ((1 + (2*fit[0]*(y_eval*m_per_pix_y) + fit[1])**2)**1.5) / np.absolute(2*fit[0])
return curvature
def calc_offset(left_x, right_x, img_center_x):
lane_width = abs(left_x - right_x)
lane_center_x = (left_x + right_x)//2
pix_offset = img_center_x - lane_center_x
lane_width_m = 3.7 # How wide we expect the lane to be in meters
return lane_width_m * (pix_offset/lane_width)
def find_lane(warped, show=False):
# Create a binary version of the warped image
warped_bin = np.zeros_like(warped[:,:,0])
warped_bin[(warped[:,:,0] > 0)] = 1
vis_img = warped.copy() # The image we will draw on to show the lane-finding process
vis_img[vis_img > 0] = 255 # Max out non-black pixels so we can remove them later
# Sum the columns in the bottom portion of the image to create a histogram
histogram = np.sum(warped_bin[warped_bin.shape[0]//2:,:], axis=0)
    # Find the left and right peaks of the histogram
midpoint = histogram.shape[0]//2
left_x = np.argmax(histogram[:midpoint]) # x-position for the left window
right_x = np.argmax(histogram[midpoint:]) + midpoint # x-position for the right window
n_windows = 10
win_height = warped_bin.shape[0]//n_windows
margin = 50 # Determines how wide the window is
pix_to_recenter = margin*2 # If we find this many pixels in our window we will recenter (too few would be a bad recenter)
# Find the non-zero x and y indices
nonzero_ind = warped_bin.nonzero()
nonzero_y_ind = np.array(nonzero_ind[0])
nonzero_x_ind = np.array(nonzero_ind[1])
left_line_ind, right_line_ind = [], []
for win_i in range(n_windows):
win_y_low = warped_bin.shape[0] - (win_i+1)*win_height
win_y_high = warped_bin.shape[0] - (win_i)*win_height
win_x_left_low = max(0, left_x - margin)
win_x_left_high = left_x + margin
win_x_right_low = right_x - margin
win_x_right_high = min(warped_bin.shape[1]-1, right_x + margin)
# Draw the windows on the vis_img
rect_color, rect_thickness = (0, 255, 0), 3
cv2.rectangle(vis_img, (win_x_left_low, win_y_high), (win_x_left_high, win_y_low), rect_color, rect_thickness)
cv2.rectangle(vis_img, (win_x_right_low, win_y_high), (win_x_right_high, win_y_low), rect_color, rect_thickness)
# Record the non-zero pixels within the windows
left_ind = (
(nonzero_y_ind >= win_y_low) &
(nonzero_y_ind <= win_y_high) &
(nonzero_x_ind >= win_x_left_low) &
(nonzero_x_ind <= win_x_left_high)
).nonzero()[0]
right_ind = (
(nonzero_y_ind >= win_y_low) &
(nonzero_y_ind <= win_y_high) &
(nonzero_x_ind >= win_x_right_low) &
(nonzero_x_ind <= win_x_right_high)
).nonzero()[0]
left_line_ind.append(left_ind)
right_line_ind.append(right_ind)
# If there are enough pixels, re-align the window
if len(left_ind) > pix_to_recenter:
left_x = int(np.mean(nonzero_x_ind[left_ind]))
if len(right_ind) > pix_to_recenter:
right_x = int(np.mean(nonzero_x_ind[right_ind]))
# Combine the arrays of line indices
left_line_ind = np.concatenate(left_line_ind)
right_line_ind = np.concatenate(right_line_ind)
# Gather the final line pixel positions
left_x = nonzero_x_ind[left_line_ind]
left_y = nonzero_y_ind[left_line_ind]
right_x = nonzero_x_ind[right_line_ind]
right_y = nonzero_y_ind[right_line_ind]
# Color the lines on the vis_img
vis_img[left_y, left_x] = [254, 0, 0] # 254 so we can isolate the white 255 later
vis_img[right_y, right_x] = [0, 0, 254] # 254 so we can isolate the white 255 later
# Fit a 2nd-order polynomial to the lines
left_fit = np.polyfit(left_y, left_x, 2)
right_fit = np.polyfit(right_y, right_x, 2)
# Get our x/y vals for the fit lines
y_vals = np.linspace(0, warped_bin.shape[0]-1, warped_bin.shape[0])
left_x_vals = left_fit[0]*y_vals**2 + left_fit[1]*y_vals + left_fit[2]
right_x_vals = right_fit[0]*y_vals**2 + right_fit[1]*y_vals + right_fit[2]
# Calculate real-world curvature for each lane line
left_curvature = calc_curvature(left_y, left_x, np.max(y_vals))
right_curvature = calc_curvature(right_y, right_x, np.max(y_vals))
# Calculate the center-lane offset of the car
offset = calc_offset(left_x_vals[-1], right_x_vals[-1], warped.shape[1]//2)
if show:
fig, ax = plt.subplots(figsize=(20, 10))
ax.imshow(vis_img)
ax.plot(left_x_vals, y_vals, color='yellow')
ax.plot(right_x_vals, y_vals, color='yellow')
lane_lines_img = vis_img.copy()
lane_lines_img[lane_lines_img == 255] = 0 # This basically removes everything except the colored lane lines
# Build the Lane object to return
lane = Lane()
lane.y_vals = y_vals
lane.left_x_vals = left_x_vals
lane.right_x_vals = right_x_vals
lane.lane_lines_img = lane_lines_img
lane.left_curvature = left_curvature
lane.right_curvature = right_curvature
lane.offset = offset
return lane
def draw_values(img, lane, show=False):
font = cv2.FONT_HERSHEY_SIMPLEX
scale = .7
color = (0, 0, 0)
line_type = cv2.LINE_AA
cv2.putText(
img,
'Left, Right Curvature: {:.2f}m, {:.2f}m'.format(lane.left_curvature, lane.right_curvature),
(20, 40), # origin point
font,
scale,
color,
lineType=line_type
)
cv2.putText(
img,
'Center-Lane Offset: {:.2f}m'.format(lane.offset),
(20, 80), # origin point
font,
scale,
color,
lineType=line_type
)
if show: big_plot(img)
return img
def draw_lane(img, lane, show=False):
# Prepare the x/y points for cv2.fillPoly()
left_points = np.array([np.vstack([lane.left_x_vals, lane.y_vals]).T])
right_points = np.array([np.flipud(np.vstack([lane.right_x_vals, lane.y_vals]).T)])
# right_points = np.array([np.vstack([right_x_vals, y_vals]).T])
points = np.hstack((left_points, right_points))
# Color the area between the lines (the lane)
filled_lane = np.zeros_like(lane.lane_lines_img) # Create a blank canvas to draw the lane on
cv2.fillPoly(filled_lane, np.int_([points]), (0, 255, 0))
warped_lane_info = cv2.addWeighted(lane.lane_lines_img, 1, filled_lane, .3, 0)
unwarped_lane_info = cv2.warpPerspective(warped_lane_info, Minv, (img.shape[1], img.shape[0]))
drawn_img = cv2.addWeighted(img, 1, unwarped_lane_info, 1, 0)
drawn_img = draw_values(drawn_img, lane)
if show: big_plot(drawn_img)
return drawn_img
drawn_img = detect_lane_pipe('./test_images/test2.jpg', show=True)
Great! The image displays like normal, but now the curvature and offset measurements are displayed in the top-left corner.
Using this pipeline we could annotate entire videos of driving, frame by frame.
Unfortunately, not all frames of a video provide enough information to perfectly detect the lane. To prevent erratic behavior, we need to implement some notion of filtering and smoothing. Basically, we'll assume things don't change drastically from frame to frame, and when a detection does change drastically we'll fall back on previous detections.
For filtering, we look at previous Lane detections and determine whether the current detection is appropriate. If the current detection is too far off from the previous ones, something's wrong.
For smoothing, we use the average of the current and previously stored findings as the final result. This ensures our updates don't differ sharply from the previous result and produces a smoother video.
In [32]:
# Filtering - Tests whether a new lane is appropriate compared to previously found lanes
def is_good_lane(lane):
global prev_lanes
good_line_diff = True
good_lane_area = True
# Measure the total x-pixel difference between the new and the old
if len(prev_lanes) > 0: # If we don't have any previous lanes yet, just assume True
prev_x_left = prev_lanes[0].left_x_vals
prev_x_right = prev_lanes[0].right_x_vals
current_x_left = lane.left_x_vals
current_x_right = lane.right_x_vals
left_diff = np.sum(np.absolute(prev_x_left - current_x_left))
right_diff = np.sum(np.absolute(prev_x_right - current_x_right))
lane_pixel_margin = 50 # How much different the new lane's x-values can be from the last lane
diff_threshold = lane_pixel_margin*len(prev_x_left)
if left_diff > diff_threshold or right_diff > diff_threshold:
print(diff_threshold, int(left_diff), int(right_diff))
print()
good_line_diff = False
# Make sure the area between the lane lines is appropriate (not too small or large)
lane_area = np.sum(np.absolute(np.subtract(lane.right_x_vals, lane.left_x_vals)))
    area_min, area_max = 400000, 800000 # Area thresholds
if lane_area < area_min or lane_area > area_max:
print('Bad lane area:', lane_area)
good_lane_area = False
return (good_line_diff and good_lane_area)
# Smoothing - Averages the stored lanes for a smoothing effect
def get_avg_lane():
global prev_lanes
if len(prev_lanes) == 0: return None
elif len(prev_lanes) == 1: return prev_lanes[0]
else: # More than 1 previous result to average together
n_lanes = len(prev_lanes)
new_lane = prev_lanes[0]
avg_lane = Lane() # The averaged lane we will return (with some defaults)
avg_lane.y_vals = new_lane.y_vals
avg_lane.lane_lines_img = new_lane.lane_lines_img
# Average the left and right lanes' x-values
left_avg = new_lane.left_x_vals
right_avg = new_lane.right_x_vals
for i in range(1, n_lanes):
left_avg = np.add(left_avg, prev_lanes[i].left_x_vals)
right_avg = np.add(right_avg, prev_lanes[i].right_x_vals)
avg_lane.left_x_vals = left_avg / n_lanes
avg_lane.right_x_vals = right_avg / n_lanes
# Average the curvatures and offsets
avg_lane.left_curvature = sum([lane.left_curvature for lane in prev_lanes])/n_lanes
avg_lane.right_curvature = sum([lane.right_curvature for lane in prev_lanes])/n_lanes
avg_lane.offset = sum([lane.offset for lane in prev_lanes])/n_lanes
return avg_lane
def detect_lane_pipe(img, show=False):
''' "img" can be a path or a loaded image (for the movie pipeline) '''
global prev_lanes
global n_bad_lanes
if type(img) is str: img = mpimg.imread(img)
n_lanes_to_keep = 5
    # To speed up processing, if we've been detecting the lane easily, skip detection on every other frame and reuse the average
if len(prev_lanes) == n_lanes_to_keep and n_bad_lanes == 0:
n_bad_lanes += 1
return draw_lane(img, get_avg_lane(), show=show)
undist = undistort(img)
threshed = get_threshold(undist)
warped, M, Minv = warp_to_lines(threshed)
lane = find_lane(warped)
if is_good_lane(lane): # If the lane is good (compared to previous lanes), add it to the list
n_bad_lanes = 0
prev_lanes.insert(0, lane)
if len(prev_lanes) > n_lanes_to_keep: prev_lanes.pop()
else:
n_bad_lanes += 1
# If we get stuck on some bad lanes, don't reinforce it, just clear them out.
if n_bad_lanes >= 12:
print('Resetting: too many bad lanes.')
n_bad_lanes = 0
prev_lanes = []
# If we start with some bad lanes, this will just skip the drawing
if len(prev_lanes) == 0: return img
return draw_lane(img, get_avg_lane(), show=show)
Great! "is_good_lane()" tells us whether the lane was appropriate based on two factors:
"get_avg_lane()" does exactly that - it averages the current and previous fit lines together to create a smoothing effect so the lane doesn't jump around and look jittery.
While single-image processing won't make use of the filtering and smoothing functions just implemented, let's make sure the pipeline still works as expected.
In [35]:
n_bad_lanes = 0
prev_lanes = []
drawn_img = detect_lane_pipe('./test_images/test2.jpg', show=True)
cv2.imwrite('./example_images/final.jpg', cv2.cvtColor(drawn_img, cv2.COLOR_RGB2BGR))
With the pipeline still working as expected for single images, it's time to test the filtering and smoothing functions on some videos.
To annotate an entire video we will use the convenient "moviepy" library. It will load an existing video file, break it up into frames, pass each frame through the lane detection pipeline, and then stitch them all back together into a new video. Let's get to it!
In [19]:
# Import everything needed to edit/save/watch video clips
from moviepy.editor import VideoFileClip
from IPython.display import HTML
In [ ]:
n_bad_lanes = 0
prev_lanes = []
f_name = 'project_video.mp4'
video = VideoFileClip('./videos/{}'.format(f_name)) # Load the original video
video = video.fl_image(detect_lane_pipe) # Pipe the video frames through the lane-detection pipeline
# video = video.subclip(39, 43) # Only process a portion
%time video.write_videofile('./videos_out/{}'.format(f_name), audio=False) # Write the new video
In [ ]:
# Challenge Video 1
n_bad_lanes = 0
prev_lanes = []
f_name = 'challenge_video.mp4'
video = VideoFileClip('./videos/{}'.format(f_name)) # Load the original video
video = video.fl_image(detect_lane_pipe) # Pipe the video frames through the lane-detection pipeline
# video = video.subclip(39, 43) # Only process a portion
%time video.write_videofile('./videos_out/{}'.format(f_name), audio=False) # Write the new video
In [ ]:
# Challenge Video 2
n_bad_lanes = 0
prev_lanes = []
f_name = 'harder_challenge_video.mp4'
video = VideoFileClip('./videos/{}'.format(f_name)) # Load the original video
video = video.fl_image(detect_lane_pipe) # Pipe the video frames through the lane-detection pipeline
# video = video.subclip(0, 10) # Only process a portion
%time video.write_videofile('./videos_out/{}'.format(f_name), audio=False) # Write the new video
The video outputs are available in the project repo in the "videos_out" folder. They can also be viewed on YouTube:
Project Video
If you look at the videos, the pipeline works very well on the standard project video. I'm very happy the pipeline doesn't cut out on either of the bridges, where the pavement color changes and there are lots of shadows. One thing I noticed while developing the pipeline on this video is that the s-channel image, while normally better, isn't sufficient on its own for the Sobel gradient when thresholding. However, calculating the Sobel gradient from both the grayscale and s-channel images enabled the lines to be reliably detected.
Challenge Video
The pipeline actually does a pretty decent job on the challenge video, too, but a couple of things cause it to slip up.
First, there are vertical lines in the pavement where two different pavement types have been joined (even different colors) or where the pavement has been damaged. Sometimes the pipeline confuses the right dotted lane line with these other lines. Implementing lane-area filtering helped with this problem, but it's still not perfect. I think limiting the area in which lines are searched for might solve this problem, as sketched below.
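As a rough illustration of that search-limiting idea, here is a minimal sketch (not part of the pipeline above). Instead of re-running the sliding windows every frame, it selects only the nonzero pixels within a margin of the previous frame's fit; it assumes we had also stored the previous polynomial coefficients, here a hypothetical "prev_fit".
# Hypothetical sketch - assumes the previous frame's 2nd-order fit was stored as "prev_fit"
def search_around_fit(warped_bin, prev_fit, margin=50):
    nonzero_y_ind, nonzero_x_ind = warped_bin.nonzero()
    # x-position of the previous fit line at each nonzero pixel's row
    fit_x = prev_fit[0]*nonzero_y_ind**2 + prev_fit[1]*nonzero_y_ind + prev_fit[2]
    # Keep only the pixels that fall within "margin" of the previous line
    within = (nonzero_x_ind > fit_x - margin) & (nonzero_x_ind < fit_x + margin)
    return nonzero_x_ind[within], nonzero_y_ind[within]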
Second, the underpass causes a huge loss of light, making the lane lines VERY difficult to detect. I didn't experiment with thresholding techniques on images from this area, but that's where I would start to solve the issue. I'm a little skeptical that thresholding alone would solve it, though; one idea is sketched below.
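One untested idea for the light loss: normalize local contrast with CLAHE (contrast-limited adaptive histogram equalization) before thresholding. A minimal sketch, assuming it would be applied to the grayscale image inside "get_threshold()":
# Untested idea - equalize local contrast before computing the Sobel gradient
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
gray_eq = clahe.apply(gray) # "gray" as computed in get_threshold()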
Harder Challenge Video
The harder challenge video lives up to its name! The pipeline doesn't do well on this one. While I might feel safe in a car using my pipeline on the first challenge video's road, I'm not going anywhere near a car using my pipeline on the harder challenge video's road! There are two things to note about the harder challenge video.
First, the pipeline clearly has the most trouble with the sharper turns. In part, I think this could be fixed by tweaking the perspective-warp area. I didn't check, but I suspect that on the sharp turns much of one of the lines isn't even in the image to be detected. A harder but more robust solution would be to either turn the camera with the car or use more cameras. On a sharp right turn, a camera facing that direction (perhaps with a better perspective, too) would provide better information.
Second, due to all the trees, the lighting varies drastically in this video. Sometimes you can actually see the interior of the car reflected in the glare. You might expect this to cause issues, but I was actually surprised to find that the pipeline wasn't too thrown off by it. Maybe some thresholding changes would help, but even from a human perspective those kinds of conditions can be difficult. Using more or different sensors would be one option. For a camera-only solution, techniques for remembering past detections and anticipating the road ahead could be implemented, so that a plan could still be executed when the sensors aren't providing good data.