In [1]:
from IPython.core.display import HTML
styles = open("Style.css").read()
HTML(styles)
Out[1]:
In [2]:
import pandas as pd
import numpy as np
import rpy2.robjects as robjects
In [3]:
pi = robjects.r('pi')
pi[0]
Out[3]:
In [4]:
%load_ext rmagic
Run a linear regression in R, print a summary, and pass the resulting RMSE (the variable error) back to Python with the -o flag:
In [5]:
%%R -o error
set.seed(10)
y<-c(1:1000)
x1<-c(1:1000)*runif(1000,min=0,max=2)
x2<-(c(1:1000)*runif(1000,min=0,max=2))^2
x3<-log(c(1:1000)*runif(1000,min=0,max=2))
all_data<-data.frame(y,x1,x2,x3)
positions <- sample(nrow(all_data),size=floor((nrow(all_data)/4)*3))
training<- all_data[positions,]
testing<- all_data[-positions,]
lm_fit<-lm(y~x1+x2+x3,data=training)
print(summary(lm_fit))
predictions<-predict(lm_fit,newdata=testing)
error<-sqrt((sum((testing$y-predictions)^2))/nrow(testing))
In [6]:
print(error)
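For reference, the error computed in the R cell is the root mean squared error, sqrt(sum((y - prediction)^2) / n). The following is a minimal numpy sketch of the same formula; the helper name rmse and the toy values are illustrative assumptions, not part of the original notebook:

```python
import numpy as np

def rmse(actual, predicted):
    """Root mean squared error between two equal-length sequences."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    # Mean of squared residuals, then square root: same formula as the R cell.
    return float(np.sqrt(np.mean((actual - predicted) ** 2)))

# Toy values (hypothetical), only to show the call.
toy_actual = [1.0, 2.0, 3.0]
toy_predicted = [1.0, 2.0, 5.0]
toy_error = rmse(toy_actual, toy_predicted)
```

Dividing the sum by the number of test rows is the same as taking the mean, which is why np.mean is used here.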
First we create the data in R:
In [7]:
%%R -o training,testing
set.seed(10)
y<-c(1:1000)
x1<-c(1:1000)*runif(1000,min=0,max=2)
x2<-(c(1:1000)*runif(1000,min=0,max=2))^2
x3<-log(c(1:1000)*runif(1000,min=0,max=2))
all_data<-data.frame(y,x1,x2,x3)
positions <- sample(nrow(all_data),size=floor((nrow(all_data)/4)*3))
training<- all_data[positions,]
testing<- all_data[-positions,]
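The R cell above draws 75% of the row positions at random for training and keeps the remaining rows for testing. A rough numpy sketch of the same idea follows; it assumes a recent numpy (np.random.default_rng), the variable names are illustrative, and the seed will not reproduce R's random stream:

```python
import numpy as np

rng = np.random.default_rng(10)  # illustrative seed; R's set.seed(10) gives different draws
n_rows = 1000
n_train = (n_rows * 3) // 4      # same size as floor((nrow / 4) * 3) in R

# Sample row positions without replacement, as R's sample() does by default.
positions = rng.choice(n_rows, size=n_train, replace=False)

# A boolean mask makes "everything except positions" easy to express.
mask = np.zeros(n_rows, dtype=bool)
mask[positions] = True
train_idx = np.flatnonzero(mask)   # sorted, unlike R, which keeps the sampled order
test_idx = np.flatnonzero(~mask)
```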
The variables training and testing are now available as numpy arrays in the Python namespace because of the -o flag in the cell above. We'll create pandas DataFrames from them:
In [8]:
tr = pd.DataFrame(dict(zip(['y', 'x1', 'x2', 'x3'], training)))
te = pd.DataFrame(dict(zip(['y', 'x1', 'x2', 'x3'], testing)))
tr.head()
Out[8]:
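The dict(zip(...)) idiom above pairs each column name with one row of the array, so it relies on the array holding one row per R column. A small sketch with a made-up array, assuming that layout (the values and the names toy and toy_df are hypothetical):

```python
import numpy as np
import pandas as pd

columns = ['y', 'x1', 'x2', 'x3']

# Hypothetical stand-in for the array returned from R: one row per column.
toy = np.array([[1.0, 2.0, 3.0],      # y
                [0.5, 1.5, 2.5],      # x1
                [0.25, 2.25, 6.25],   # x2
                [-0.7, 0.4, 0.9]])    # x3

# zip pairs each name with one row; the dict then becomes the DataFrame's columns.
toy_df = pd.DataFrame(dict(zip(columns, toy)))
```

If the array were laid out with one row per observation instead, the names would be paired with observations, so it is worth checking the array's shape before building the DataFrame.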
Create a linear regression model with statsmodels and print a summary:
In [9]:
from statsmodels.formula.api import ols
lm = ols('y ~ x1 + x2 + x3', tr).fit()
lm.summary()
Out[9]:
Predict on the test set and compute the RMSE:
In [10]:
pred = lm.predict(te)
error = np.sqrt((sum((te.y - pred)**2)) / len(te))
error
Out[10]:
First we create the data (numpy arrays) in Python:
In [11]:
X = np.array([0,1,2,3,4])
Y = np.array([3,5,4,6,7])
We pass them into R using the -i flag, run a linear regression in R, print a summary, draw the diagnostic plots, and send the result back to Python:
In [12]:
%%R -i X,Y -o XYcoef
XYlm = lm(Y~X)
XYcoef = coef(XYlm)
print(summary(XYlm))
par(mfrow=c(2,2))
plot(XYlm)
The model coefficients come back from R as the variable XYcoef:
In [13]:
XYcoef
Out[13]:
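As a cross-check on the coefficients returned from R, the same simple least-squares line can be fitted in numpy. This is a sketch using np.polyfit, which the notebook itself does not use; note that R's coef reports the intercept first, while polyfit returns the highest-degree coefficient first:

```python
import numpy as np

X = np.array([0, 1, 2, 3, 4])
Y = np.array([3, 5, 4, 6, 7])

# Degree-1 fit: polyfit returns (slope, intercept), the reverse of R's order.
slope, intercept = np.polyfit(X, Y, deg=1)
```

For this data the fit works out by hand to an intercept of 3.2 and a slope of 0.9, since the means are 2 and 5 and the slope is 9/10.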