In this section, we'll use linear regression to predict life expectancy from body mass index (BMI).

For our linear regression model, we'll be using scikit-learn's LinearRegression class.

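This class provides the fit() method, which fits the model to our data. Fitting the model means finding the line that best fits the training data, that is, the line that minimizes the sum of squared differences between the predicted and the actual values.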
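To make "best line" concrete, here is a minimal sketch that computes a least-squares line by hand with NumPy. The five data points are made up for illustration and are not taken from the BMI dataset; the helper name sum_squared_error is also just an illustrative choice.

```python
import numpy as np

# Made-up sample points (not from the BMI dataset)
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Closed-form least-squares estimates for a single feature
slope = np.sum((x - x.mean()) * (y - y.mean())) / np.sum((x - x.mean()) ** 2)
intercept = y.mean() - slope * x.mean()


def sum_squared_error(m, b):
    """Sum of squared differences between y and the line m * x + b."""
    return float(np.sum((y - (m * x + b)) ** 2))


best = sum_squared_error(slope, intercept)
worse = sum_squared_error(slope + 0.1, intercept)  # nudge the slope away from the fit

print("slope:", slope, "intercept:", intercept)
print("error of fitted line:", best)
print("error of nudged line:", worse)
```

Any line other than the fitted one gives a larger squared error on these points, which is the sense in which the fitted line is "best".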

We will then make predictions using the predict() method.
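As a rough sketch of the fit-then-predict workflow, the example below trains LinearRegression on a few made-up BMI and life expectancy pairs (chosen to lie exactly on a line, and not taken from the real dataset) and then predicts for two new BMI values. Note that predict() takes a 2-D input with one row per sample.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up training data: one feature (BMI) per row, one target per row
X = np.array([[18.0], [20.0], [22.0], [24.0]])
y = np.array([50.0, 55.0, 60.0, 65.0])

model = LinearRegression()
model.fit(X, y)

# predict() expects shape (n_samples, n_features), so each BMI is its own row
new_bmi = np.array([[21.0], [23.0]])
predictions = model.predict(new_bmi)
print(predictions)
```

Because the toy targets follow y = 2.5 * BMI + 5 exactly, the two predictions come out at about 57.5 and 62.5.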

Discover the Data

We'll be working with data on the average life expectancy at birth and the average BMI for males across the world. It includes three columns, containing the following data:

  • Country – The country the person was born in.
  • Life expectancy – The average life expectancy at birth for a person in that country.
  • BMI – The mean BMI of males in that country.

Load the Libraries


In [1]:
import pandas as pd
from sklearn import linear_model
import matplotlib.pyplot as plt

Load the Data


In [2]:
bmi_life_data = pd.read_csv('bmi_and_life_expectancy.csv')
x_values = bmi_life_data[['BMI']]
y_values = bmi_life_data[['Life expectancy']]
z_values = bmi_life_data[['Country']]
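The double brackets in the cell above matter: selecting with a list of column names returns a two-dimensional DataFrame, which is the shape scikit-learn expects for features, while single brackets return a one-dimensional Series. A small sketch with a hypothetical three-row table (the values are placeholders, not real data) shows the difference.

```python
import pandas as pd

# Hypothetical stand-in for the CSV, with placeholder values
df = pd.DataFrame({
    "Country": ["A", "B", "C"],
    "Life expectancy": [60.0, 70.0, 80.0],
    "BMI": [21.0, 24.0, 26.0],
})

as_frame = df[["BMI"]]   # list of columns -> DataFrame, 2-D
as_series = df["BMI"]    # single column label -> Series, 1-D

print(type(as_frame).__name__, as_frame.shape)
print(type(as_series).__name__, as_series.shape)
```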

Linear Regression Model


In [3]:
# Create the linear regression model, assign it to bmi_life_model,
# and fit it to the BMI and life expectancy data
bmi_life_model = linear_model.LinearRegression()
bmi_life_model.fit(x_values, y_values)


Out[3]:
LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)

Make the Prediction


In [4]:
# Make a prediction using the model
# For example, predict life expectancy for a BMI value of 21.07
# predict() expects a 2-D input of shape (n_samples, n_features)
laos_life_exp = bmi_life_model.predict([[21.07]])
print(laos_life_exp)


[[ 60.29220012]]
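The nested brackets in the output appear because the target was selected as a two-dimensional column, so the prediction is two-dimensional as well. The sketch below, which uses made-up numbers rather than the real dataset, shows that shape and checks that a prediction is simply the intercept plus the slope times the input, read from the fitted model's intercept_ and coef_ attributes.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Made-up data; the target is 2-D, like bmi_life_data[['Life expectancy']]
X = np.array([[19.0], [21.0], [23.0], [25.0]])
y = np.array([[52.0], [58.0], [64.0], [70.0]])

model = LinearRegression()
model.fit(X, y)

pred = model.predict([[22.0]])
print(pred.shape)  # one row per sample, one column per target

# With a 2-D target, coef_ has shape (n_targets, n_features)
# and intercept_ has shape (n_targets,)
slope = model.coef_[0, 0]
intercept = model.intercept_[0]
manual = intercept + slope * 22.0

print(pred[0, 0], manual)
```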