R in Jupyter

Install the conda environment with the src/install.bash script in this repo.



In [1]:

    
library(dplyr)
library(ggplot2)









    



Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union



In [2]:

    
head(economics)









    Out[2]:





	  date       pce   pop    psavert uempmed unemploy
	1 1967-06-30 507.8 198712 9.8     4.5     2944
	2 1967-07-31 510.9 198911 9.8     4.7     2945
	3 1967-08-31 516.7 199113 9       4.6     2958
	4 1967-09-30 513.3 199311 9.8     4.9     3143
	5 1967-10-31 518.5 199498 9.7     4.7     3066
	6 1967-11-30 526.2 199657 9.4     4.8     3018



In [3]:

    
a <- ggplot(data = economics, aes(x = date, y = unemploy))
a <- a + geom_line()
a



In [4]:

    
a <- ggplot(data = economics, aes(x = date, y = unemploy))
a <- a + geom_line()
a <- a + geom_smooth(method = "loess")
a

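R gotchas!

Subsetting a data frame with a logical condition returns an all-NA row wherever the condition evaluates to NA. Wrapping the condition in which() avoids this. See:

http://stackoverflow.com/questions/34191059/why-are-nas-in-newly-created-data-frame-when-using-logical-selection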



In [5]:

    
df <- data.frame(birth_state=c("Illinois", "Arizona", NA),
                 data_scientist=c("Kevin", "Matt", "Jonathan"))



In [6]:

    
df









    Out[6]:





	  birth_state data_scientist
	1 Illinois    Kevin
	2 Arizona     Matt
	3 NA          Jonathan



In [7]:

    
df$birth_state == "Arizona"









    Out[7]:





	FALSE
	TRUE
	NA
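
The NA in the third position appears because, in R, comparing anything with NA yields NA rather than FALSE. The following is a minimal base-R sketch of that behaviour and one way to neutralize it. The variable names (`states`, `is_arizona`, `missing`, `is_arizona_strict`) are illustrative and not part of the notebook.

```r
# Same values as the birth_state column, as a plain character vector
states <- c("Illinois", "Arizona", NA)

# Comparing against NA propagates NA instead of returning FALSE
is_arizona <- states == "Arizona"

# is.na() flags the missing entries explicitly
missing <- is.na(is_arizona)

# Guarding with !is.na() turns the NA into FALSE
is_arizona_strict <- !is.na(states) & states == "Arizona"

print(is_arizona)
print(is_arizona_strict)
```

The guarded vector contains only TRUE and FALSE, so it can be used as a row index without producing NA rows.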



In [8]:

    
just_Arizona <- df[df$birth_state=="Arizona",]



In [9]:

    
just_Arizona









    Out[9]:





	   birth_state data_scientist
	2  Arizona     Matt
	NA NA          NA



In [10]:

    
really_just_Arizona <- df[which(df$birth_state=="Arizona"),]



In [11]:

    
really_just_Arizona









    Out[11]:





	  birth_state data_scientist
	2 Arizona     Matt
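
Besides which(), a couple of other base-R idioms avoid the stray NA row. The sketch below rebuilds the same data frame so that it runs on its own. The result names `by_in` and `by_subset` are hypothetical and do not appear in the notebook.

```r
df <- data.frame(birth_state = c("Illinois", "Arizona", NA),
                 data_scientist = c("Kevin", "Matt", "Jonathan"))

# %in% never returns NA, so the logical index contains only TRUE/FALSE
by_in <- df[df$birth_state %in% "Arizona", ]

# subset() treats an NA condition as FALSE and drops that row
by_subset <- subset(df, birth_state == "Arizona")

print(by_in)
print(by_subset)
```

Both calls should return only the Arizona row. dplyr's filter(), from the package loaded at the top of this notebook, likewise drops rows where the condition is NA.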


