Chemometrics


Julien Wist / 2017 / Universidad del Valle
Andrés Bernal / 2017 / ???

An up-to-date version of this notebook can be found here: https://github.com/jwist/chemometrics/


In [1]:
options(repr.plot.width=4, repr.plot.height=4)

In [169]:
# we load a few packages
library(ggplot2)
library(corrplot)
library(reshape2)
library(caret)
library(MASS) # for LDA
library(klaR) # for pls
library(pls)
library(e1071)# for pls
library(pROC) # for pls
library(vegan)
require(scales)
require(gridExtra);


corrplot 0.84 loaded
Loading required package: lattice

Attaching package: ‘pls’

The following object is masked from ‘package:caret’:

    R2

The following object is masked from ‘package:corrplot’:

    corrplot

The following object is masked from ‘package:stats’:

    loadings

Type 'citation("pROC")' for a citation.

Attaching package: ‘pROC’

The following objects are masked from ‘package:stats’:

    cov, smooth, var

Loading required package: permute
This is vegan 2.5-2

Attaching package: ‘vegan’

The following object is masked from ‘package:pls’:

    scores

The following object is masked from ‘package:klaR’:

    rda

The following object is masked from ‘package:caret’:

    tolerance

Loading required package: scales
Loading required package: gridExtra

Let's load some NMR data


In [4]:
#rm(list=ls(all=TRUE))

load(url('https://github.com/jwist/chemometrics/raw/master/datasets/coffeeMulti.rda'))
load(url('https://github.com/jwist/chemometrics/raw/master/datasets/ppm.binned.rda'))

In [5]:
t<-coffeeMulti$gc$country; #levels(t)<-c("o", "c","o" )
# we perform ANOVA on the nmr data. 
#d <- data.frame("x"=I(as.matrix(coffeeMulti$nmrBin)),"country"=coffeeMulti$gc$country)
d <- data.frame("x"=I(as.matrix(coffeeMulti$nmrBin)),"country"=t)
colnames(d$x) <- round(ppm.binned,3) # put ppm of bins as tag for data column. This will be helpfull for tracking variables.
fit <- aov(x ~ country, data=d)
resultsF <- unlist( lapply(summary(fit),function(x) x[[4]][1]) ) # we extract F-values
# extract the p-value of K-W test for all NMR variables
#resultsK_W.p <- apply(d$x, 2, function(x) kruskal.test(x ~ coffeeMulti$gc$country)$p.value)
resultsK_W.p <- apply(d$x, 2, function(x) kruskal.test(x ~ t)$p.value)


# we look at some results to have an idea
b <- head(order(resultsF, decreasing = TRUE), 200) # select the most discriminant variable for further analysis
best <- b[1]
#best = 1150
d <- data.frame("x"=coffeeMulti$nmrBin[,best],"country"=t)
#d <- data.frame("x"=coffeeMulti$nmrBin[,best],"country"=coffeeMulti$gc$country)

In [6]:
boxplot(tapply(d$x,d$country,function(x) x))



In [7]:
# here we plot the distribution for each class
x <- tapply(d$x,d$country,function(x) x)
xlims <- range(unlist(tapply(d$x,d$country,function(x) unlist(range(density(x)$x)))))
ylims <- range(unlist(tapply(d$x,d$country,function(x) unlist(range(density(x)$y)))))
plot(density(x[1][[1]]),type='l', xlim=xlims, ylim=ylims, col=2)
lines(density(x[2][[1]]), col=3)
lines(density(x[3][[1]]), col=4)
legend(min(xlims),max(ylims),levels(d$country), col=2:4, pch="*")



In [8]:
ukeyHSD(fit, "country", ordered = TRUE);
plot(TukeyHSD(fit, "country"));



In [9]:
F <- resultsF > 4
index <- 1:length(resultsF)
plot(ppm.binned[F],as.numeric(coffeeMulti$nmrBin[1,F]), type="n", ylim=c(-1e7,5e8), xlim=c(7.5,5.5))
lines(ppm.binned,as.numeric(coffeeMulti$nmrBin[1,]), col=1)
for (i in c(4,6,8,15,30)) {
  F <- resultsF > i
  index <- 1:length(resultsF)
  points(ppm.binned[F],resultsF[F]*5e6,cex=0.5, col=i)
}



In [10]:
options(repr.plot.width=8, repr.plot.height=3)

In [11]:
par(mfrow=c(1,2))

x <- runif(200,min=-1,max=1)
y <- x^2
plot(x,y,pch=16)
rug(x,side=1,col="grey")
rug(y,side=2,col="grey")
#cor(x,y);
#chisq.test(x,y) ;

x <- runif(200,min=-1,max=1)
y <- x
plot(x,y,pch=16)
rug(x,side=1,col="grey")
rug(y,side=2,col="grey")
#cor(x,y);
#chisq.test(x,y) ;


Despite in both cases $x$ and $y$ are dependent, in the left case they are completely uncorrelated. So being uncorrelated is not the same as being independent, but independant variables are uncorrelated. Two variables are considered independent when their joint probability distribution is equal to the product of their individual marginal probability distributions: $$P_{X,Y}(x,y)=P_X(x)P_Y(y)$$

Two variables are uncorrelated when their correlation coefficient is zero or when their covariance is equal to zero.

high number of variable

  • The number of available variables (predictors) is much higher that the number of observation
  • overfitting
  • reduce the number of variables a task that is achieved by step-wise regression, etc.

multivariate analysis

However there are others and more robust alternatives to these methods that aknowledge the redundancy of information spanned by collinear predictors and try to convert collinearity of predictors (an issue) into reliability (a plusvalue).


In [15]:
options(repr.plot.width=4, repr.plot.height=4)
load('/Users/jul/metabo/paper_IRMS/resultsF.rda')
load('/Users/jul/metabo/paper_IRMS/coffeeMulti.rda')
load('/Users/jul/metabo/paper_IRMS/ppm.binned.rda')
colnames(coffeeMulti$nmrBin) <- round(ppm.binned,3)
country <- coffeeMulti$gc$country 
irms <- abs(coffeeMulti$irms$mean) 
vite <- coffeeMulti$gc$vitE
caff <- coffeeMulti$gc$caffeine 
fumeth <- coffeeMulti$gc$furanmethanol
ffilter <- resultsF > 15
x <- cbind(irms,
           vite,
           caff,
           fumeth,
           as.matrix(coffeeMulti$nmrBin[,ffilter]))
is(x)


Warning message in readChar(con, 5L, useBytes = TRUE):
“cannot open compressed file '/home/jul/metabo/paper_IRMS/resultsF.rda', probable reason 'No such file or directory'”
Error in readChar(con, 5L, useBytes = TRUE): cannot open the connection
Traceback:

1. load("/home/jul/metabo/paper_IRMS/resultsF.rda")
2. readChar(con, 5L, useBytes = TRUE)

In [13]:
corrplot::corrplot(cor(cbind(irms, vite, caff, fumeth)),
                   method='number')


Error in cbind(irms, vite, caff, fumeth): object 'irms' not found
Traceback:

1. corrplot::corrplot(cor(cbind(irms, vite, caff, fumeth)), method = "number")
2. cor(cbind(irms, vite, caff, fumeth))
3. is.data.frame(x)
4. cbind(irms, vite, caff, fumeth)

In [13]:
options(repr.plot.width=5, repr.plot.height=4)
qplot(x=Var1, y=Var2, data=melt(cor(x)), fill=value, geom="tile")



In [14]:
M <- cor(x)
M[M > 0.9] <- NA
qplot(x=Var1, y=Var2, data=melt(M), fill=value, geom="tile")



In [44]:
C <- findCorrelation(cor(x), 
                     cutoff = .90, 
                     verbose = FALSE)
#x <- x[ , -C ]

fvalues <- c(8.879,4.14,31.87,4.76,resultsF[ffilter])
# to ensure that both x and f-value with the same number of Var
fvalues <- fvalues[ -C ] 
x <- scale(x, center=TRUE)

FF <- c(7,20,22,24)
d <- data.frame("x"=I(as.matrix(x[-FF,])), "country"=country[-FF])
#d <- d[d$country!="Colombia",]

pca <- prcomp(d$x)

In [45]:
options(repr.plot.width=8, repr.plot.height=4)
pc <- c(1,2)
COLOR <- c(2:4)
ev <- c(round(pca$sdev[pc[1]]/sum(pca$sdev)*100,0),
        round(pca$sdev[pc[2]]/sum(pca$sdev)*100,0),
        round(pca$sdev[pc[3]]/sum(pca$sdev)*100,0))
plot(pca$x[,pc[1]], pca$x[,pc[2]], col=COLOR[d$country], cex=0.7, 
     xlab=paste0("PC ", pc[1], " (", ev[pc[1]], "%)"), 
     ylab=paste0("PC ", pc[2], " (", ev[pc[2]], "%)"))
legend("topright", legend=levels(country), col=COLOR, pch=1)



In [43]:
options(repr.plot.width=8, repr.plot.height=4)
pc <- c(1,2)
COLOR <- c(2:4)
ev <- c(round(pca$sdev[pc[1]]/sum(pca$sdev)*100,0),
        round(pca$sdev[pc[2]]/sum(pca$sdev)*100,0),
        round(pca$sdev[pc[3]]/sum(pca$sdev)*100,0))
plot(pca$x[,pc[1]], pca$x[,pc[2]], col=COLOR[d$country], cex=0.7, 
     xlab=paste0("PC ", pc[1], " (", ev[pc[1]], "%)"), 
     ylab=paste0("PC ", pc[2], " (", ev[pc[2]], "%)"))
legend("topright", legend=levels(country), col=COLOR, pch=1)



In [38]:
options(repr.plot.width=8, repr.plot.height=4)
pc <- c(1,2)
COLOR <- c(2:4)
ev <- c(round(pca$sdev[pc[1]]/sum(pca$sdev)*100,0),
        round(pca$sdev[pc[2]]/sum(pca$sdev)*100,0),
        round(pca$sdev[pc[3]]/sum(pca$sdev)*100,0))
plot(pca$x[,pc[1]], pca$x[,pc[2]], col=COLOR[d$country], cex=0.7, 
     xlab=paste0("PC ", pc[1], " (", ev[pc[1]], "%)"), 
     ylab=paste0("PC ", pc[2], " (", ev[pc[2]], "%)"))
legend("topright", legend=levels(country), col=COLOR, pch=1)



In [36]:
options(repr.plot.width=8, repr.plot.height=4)
pc <- c(1,2)
COLOR <- c(2:4)
ev <- c(round(pca$sdev[pc[1]]/sum(pca$sdev)*100,0),
        round(pca$sdev[pc[2]]/sum(pca$sdev)*100,0),
        round(pca$sdev[pc[3]]/sum(pca$sdev)*100,0))
plot(pca$x[,pc[1]], pca$x[,pc[2]], col=COLOR[d$country], cex=0.7, 
     xlab=paste0("PC ", pc[1], " (", ev[pc[1]], "%)"), 
     ylab=paste0("PC ", pc[2], " (", ev[pc[2]], "%)"))
legend("topright", legend=levels(country), col=COLOR, pch=1)



In [23]:
options(repr.plot.width=8, repr.plot.height=4)
pc <- c(1,3)
COLOR <- c(2:4)
PCH <- c(1,16)
pca <- prcomp(d$x)
#pca <- prcomp(d$x,scale=FALSE,center=TRUE)
layout(matrix(c(1,2), 1, 2, byrow = TRUE), widths=c(3,3), heights=c(1)) 
op <- par(mar = c(4,4,1,1))
biplot(pca,cex=0.7,cex.lab=0.7,cex.axis=0.7)
plot(pca$x[,pc[1]], pca$x[,pc[2]], col=COLOR[d$country], cex=PCH[1], xlab=paste0("PC ", pc[1], " (", round(pca$sdev[pc[1]]/sum(pca$sdev)*100,0), "%)"), ylab=paste0("PC ", pc[2], " (", round(pca$sdev[pc[2]]/sum(pca$sdev)*100,0), "%)"))
legend("topright", legend=levels(country), col=COLOR, pch=1)
par(op)



In [24]:
pairs(pca$x[,1:3],col=d$country)



In [25]:
options(repr.plot.width=4, repr.plot.height=4)
prin_comp<-rda(d$x, scale=TRUE)
pca_scores<-scores(prin_comp)
plot(pca_scores$sites[,1],pca_scores$sites[,2], pch=21, bg=as.numeric(d$country),
    arrows(0,0,pca_scores$species[,1],pca_scores$species[,2],lwd=1,length=0.2),
    xlab=paste0("PC ", pc[1], " (", round(pca$sdev[pc[1]]/sum(pca$sdev)*100,0), "%)"), ylab=paste0("PC ", pc[2], " (", round(pca$sdev[pc[2]]/sum(pca$sdev)*100,0), "%)"))
ordiellipse(prin_comp,d$country,conf=0.99)
text(pca_scores$species[,1],pca_scores$species[,2],colnames(d$x),cex=0.7,pos=4,col="red")



In [26]:
op <- par(mar = c(4,4,1,1))
plot(pca$rotation[,1], type='l', ylim=range(pca$rotation[,1:3]))
lines(pca$rotation[,2], type='l', col=2)
lines(pca$rotation[,3], type='l', col=3)
par(op)



In [27]:
options(repr.plot.width=8, repr.plot.height=4)
layout(matrix(c(1,2,3), 1, 3, byrow = TRUE), 
       widths=c(3,3,3), 
       heights=c(1)) 
op <- par(mar = c(4,4,0,0))
plot(pca$rotation[,1],pca$rotation[,2]); 
plot(pca$rotation[,1],pca$rotation[,3]); 
plot(pca$rotation[,2],pca$rotation[,3])
par(op)



In [28]:
options(repr.plot.width=4, repr.plot.height=4)
aload <- abs(pca$rotation)
contrib <- sweep(aload, 2, colSums(aload), "/")

plot(fvalues[],contrib[,pc[1]],
     xlab="F-values calculated from ANOVA", 
     ylab=paste0("contribution to PC",pc[1]))
text(fvalues[],contrib[,pc[1]],colnames(d$x),cex=0.7,pos=4,col="red")
legend('topleft', 
       legend=paste("corr=",round(cor(fvalues,contrib[,pc[1]]),2)))


Error in xy.coords(x, y, xlabel, ylabel, log): 'x' and 'y' lengths differ
Error in text.default(fvalues[], contrib[, pc[1]], colnames(d$x), cex = 0.7, : plot.new has not been called yet
Error in cor(fvalues, contrib[, pc[1]]): incompatible dimensions

In [29]:
x <- cbind(irms,vite,caff,fumeth,as.matrix(coffeeMulti$nmrBin[,ffilter]))
C <- findCorrelation(cor(x), cutoff = .90, verbose = FALSE)
fvalues <- c(8.879,4.14,31.87,4.76,resultsF[ffilter]) # we apply the filter
x <- scale(x, center=TRUE)
d <- data.frame("x"=I(as.matrix(x)), "country"=country)
rownames(d$x)<-1:nrow(d) #to have the numbers according to the index in the matrix
pca <- prcomp(d$x)
aload <- abs(pca$rotation)
contrib <- sweep(aload, 2, colSums(aload), "/")
op <- par(mar = c(4,4,0,0))

In [30]:
plot(fvalues[],contrib[,pc[1]],xlab="F-values calculated from ANOVA", ylab=paste0("contribution to PC",pc[1]))
text(fvalues[],contrib[,pc[1]],colnames(d$x),cex=0.7,pos=4,col="red")
legend('topleft', legend=paste("corr=",round(cor(fvalues,contrib[,pc[1]]),2)))
par(op)


conclusions

  • irms, gc-ms and nmr are not complementary
  • the same molecules are detected and therefore no additional value is gained

PLS


In [31]:
library(caret)
options(repr.plot.width=4, repr.plot.height=4)

In [32]:
m<-matrix(sample(0:140,140), nrow= 20,ncol =7, byrow="TRUE")

species<-c("1", "1", "1","1", "1", "1","1", "1", "1","1",
           "-1","-1","-1","-1","-1","-1","-1","-1","-1","-1")

M<-data.frame(m,species)
X<-m
Y<-as.numeric(species)

In [33]:
M[c(1,2,19,20),]


Out[33]:
X1X2X3X4X5X6X7species
154 63 17 10851 12088 1
266 10596 2 12132 90 1
1912774 11479 45 130113-1
2033 11937 40 82 68 0 -1

In [34]:
nf = dim(X)[1]
T = c()
P = c()
C = c()
W = c()

In [35]:
for (j in 1:nf) {
  w = (t(X) %*% Y) %*% solve(t(Y) %*% Y)
  w1 = t(w) %*% w
  w2 = abs(sqrt(w1))
  w = w %*% solve(w2)
  t = (X %*% w)
  t1 = t(t) %*% t
  c = t(Y) %*% t %*% solve(t1)
  c1 = t(c) %*% c
  u = Y %*% c %*% solve(c1)
  u1 = t(u) %*% u
  u2 = abs(sqrt(u1))
  p = (t(X) %*% t) %*% solve(t1)
  X = X - (t)%*%t(p)
  T = matrix(c(T, t))
  P = matrix(c(P, p))
  C = matrix(c(C, c))
  W = matrix(c(W, w))
}

In [36]:
T = matrix(T, ncol = nf)
T = scale(T, scale = TRUE, center = TRUE)
P = matrix(P, ncol = nf)
C = matrix(C, ncol = nf)
W = matrix(W, ncol = nf)

In [37]:
plot(T[,1], T[,2], col = factor(species))
text(T[,1], T[,2], species, pos = 2)


o-PLS


In [38]:
m<-matrix(sample(6e6:7e6,140), nrow= 20,ncol =7, byrow="TRUE")

In [39]:
species<-c("1", "1", "1","1", "1", "1","1", "1", "1","1",
           "-1","-1","-1","-1","-1","-1","-1","-1","-1","-1")

M<-data.frame(m,species)
X<-m
Y<-as.numeric(species)
nf=15

T = c()
P = c()
C = c()
W = c()
Tortho = c()
Portho = c()
Wortho = c()
Cortho = c()

In [40]:
for (j in 1:nf) {
  w = (t(X) %*% Y) %*% solve(t(Y) %*% Y)
  w=w%*%solve(norm(w))
  t = (X %*% w) %*% solve(t(w) %*% w)
  c = t(Y) %*% t %*% solve(t(t) %*% t)
  u = Y %*% c %*% solve(t(c) %*% c)
  p = (t(X) %*% t) %*% solve(t(t) %*% t)
  wortho = p - w
  wortho= p - t(((t(w)%*%p)%*% solve(t(w)%*%w))%*% t(w))
  wortho= wortho%*%solve(norm(wortho))
  tortho = X %*% wortho %*% solve(t(wortho) %*% wortho)
  portho = t(X) %*% tortho %*% solve(t(tortho) %*% tortho)
  cortho = t(Y) %*% tortho %*% solve(t(tortho) %*% tortho)
  X = X - tortho %*% t(portho)
  T = matrix(c(T, t))
  P = matrix(c(P, p))
  C = matrix(c(C, c))
  W = matrix(c(W, w))
  Tortho = matrix(c(Tortho, tortho))
  Portho = matrix(c(Portho, portho))
  Wortho = matrix(c(Wortho, wortho))
  Cortho = matrix(c(Cortho, cortho))
}

In [41]:
T = matrix(T, ncol = nf)
T = scale(T, scale = TRUE, center = TRUE)
P = matrix(P, ncol = nf)
C = matrix(C, ncol = nf)
W = matrix(W, ncol = nf)
Tortho = matrix(Tortho, ncol = nf)
Tortho = scale(Tortho, scale = TRUE, center = TRUE)
Portho = matrix(Portho, ncol = nf)
Wortho = matrix(Wortho, ncol = nf)
Xortho = Tortho %*% t(Portho)

In [42]:
plot(T[,1], Tortho[, 1], pch = 19, col=factor(species))
text(T[, 1], Tortho[,1],species, cex=1, pos=2)



In [51]:
m<-matrix(sample(6e6:7e6,14000), nrow= 20,ncol =700, byrow="TRUE")

In [52]:
species<-c("1", "1", "1","1", "1", "1","1", "1", "1","1",
           "-1","-1","-1","-1","-1","-1","-1","-1","-1","-1")

M<-data.frame(m,species)
X<-m
Y<-as.numeric(species)
nf=15

T = c()
P = c()
C = c()
W = c()
Tortho = c()
Portho = c()
Wortho = c()
Cortho = c()
for (j in 1:nf) {
  w = (t(X) %*% Y) %*% solve(t(Y) %*% Y)
  w=w%*%solve(norm(w))
  t = (X %*% w) %*% solve(t(w) %*% w)
  c = t(Y) %*% t %*% solve(t(t) %*% t)
  u = Y %*% c %*% solve(t(c) %*% c)
  p = (t(X) %*% t) %*% solve(t(t) %*% t)
  wortho = p - w
  wortho= p - t(((t(w)%*%p)%*% solve(t(w)%*%w))%*% t(w))
  wortho= wortho%*%solve(norm(wortho))
  tortho = X %*% wortho %*% solve(t(wortho) %*% wortho)
  portho = t(X) %*% tortho %*% solve(t(tortho) %*% tortho)
  cortho = t(Y) %*% tortho %*% solve(t(tortho) %*% tortho)
  X = X - tortho %*% t(portho)
  T = matrix(c(T, t))
  P = matrix(c(P, p))
  C = matrix(c(C, c))
  W = matrix(c(W, w))
  Tortho = matrix(c(Tortho, tortho))
  Portho = matrix(c(Portho, portho))
  Wortho = matrix(c(Wortho, wortho))
  Cortho = matrix(c(Cortho, cortho))
}
T = matrix(T, ncol = nf)
T = scale(T, scale = TRUE, center = TRUE)
P = matrix(P, ncol = nf)
C = matrix(C, ncol = nf)
W = matrix(W, ncol = nf)
Tortho = matrix(Tortho, ncol = nf)
Tortho = scale(Tortho, scale = TRUE, center = TRUE)
Portho = matrix(Portho, ncol = nf)
Wortho = matrix(Wortho, ncol = nf)
Xortho = Tortho %*% t(Portho)

In [53]:
plot(T[,1], Tortho[, 1], pch = 19, col=factor(species))
text(T[, 1], Tortho[,1],species, cex=1, pos=2)


Conclusions

  • this is a code for teaching (not validated)
  • we can get inspiration for creating a new publication style

In [236]:
m<-matrix(sample(1e6:7e6,28800), nrow= 64,ncol =450, byrow="TRUE")

In [237]:
64*45


2880

In [238]:
colnames(m) <- paste0("x",seq_along(1:450))

In [239]:
rownames(m) <- paste0("y",seq_along(1:64))

In [240]:
mData <- data.frame("x" = m)

In [247]:
s <- paste0("x.x",seq(2,63,1), "+", collapse = '')
s <- substr(s, 1, nchar(s)-1)
s


'x.x2+x.x3+x.x4+x.x5+x.x6+x.x7+x.x8+x.x9+x.x10+x.x11+x.x12+x.x13+x.x14+x.x15+x.x16+x.x17+x.x18+x.x19+x.x20+x.x21+x.x22+x.x23+x.x24+x.x25+x.x26+x.x27+x.x28+x.x29+x.x30+x.x31+x.x32+x.x33+x.x34+x.x35+x.x36+x.x37+x.x38+x.x39+x.x40+x.x41+x.x42+x.x43+x.x44+x.x45+x.x46+x.x47+x.x48+x.x49+x.x50+x.x51+x.x52+x.x53+x.x54+x.x55+x.x56+x.x57+x.x58+x.x59+x.x60+x.x61+x.x62+x.x63'

In [248]:
model <- lm(paste("x.x1 ~", s), data=mData)
summary(model)


Call:
lm(formula = paste("x.x1 ~", s), data = mData)

Residuals:
     y1      y2      y3      y4      y5      y6      y7      y8      y9     y10 
 228444  248729  557359 -763495  -92660  610082  365486 -587470  -59516 -273532 
    y11     y12     y13     y14     y15     y16     y17     y18     y19     y20 
-651450 -567236  685098  409284   92661 -279554  -41412  681456 -302780 -314626 
    y21     y22     y23     y24     y25     y26     y27     y28     y29     y30 
 729304 -572922 -411326 -202207  428707  174516  120759 1212497 -599863  154313 
    y31     y32     y33     y34     y35     y36     y37     y38     y39     y40 
 388374 -778428 -393978  133684  -18933 -890221   93630  463111 -194247 -305744 
    y41     y42     y43     y44     y45     y46     y47     y48     y49     y50 
-887493  141877 -842055  193078 -660011   48905 -313482  563133  101061   97946 
    y51     y52     y53     y54     y55     y56     y57     y58     y59     y60 
-388915  349020 -209368  533258   74180  554217  650975   69209 -110065  510699 
    y61     y62     y63     y64 
 410185 -117145   24239 -269341 

Coefficients:
              Estimate Std. Error t value Pr(>|t|)
(Intercept) -1.821e+07  6.911e+07  -0.264    0.836
x.x2         3.789e-01  1.674e+00   0.226    0.858
x.x3         1.910e-01  1.520e+00   0.126    0.920
x.x4        -8.649e-01  1.427e+00  -0.606    0.653
x.x5        -1.709e-01  1.464e+00  -0.117    0.926
x.x6         3.168e-02  1.133e+00   0.028    0.982
x.x7        -2.802e-01  9.881e-01  -0.284    0.824
x.x8        -4.358e-01  2.026e+00  -0.215    0.865
x.x9        -1.007e+00  2.565e+00  -0.393    0.762
x.x10        1.637e-01  1.019e+00   0.161    0.899
x.x11       -7.309e-01  2.102e+00  -0.348    0.787
x.x12       -7.967e-01  1.219e+00  -0.653    0.632
x.x13       -7.707e-02  8.856e-01  -0.087    0.945
x.x14        1.914e-01  1.074e+00   0.178    0.888
x.x15        4.558e-01  1.146e+00   0.398    0.759
x.x16       -2.308e-01  1.200e+00  -0.192    0.879
x.x17        3.424e-01  2.052e+00   0.167    0.895
x.x18       -2.482e-01  9.426e-01  -0.263    0.836
x.x19       -6.005e-01  1.377e+00  -0.436    0.738
x.x20        6.038e-01  8.329e-01   0.725    0.601
x.x21        2.958e-01  1.488e+00   0.199    0.875
x.x22        7.384e-01  2.399e+00   0.308    0.810
x.x23       -9.216e-01  1.762e+00  -0.523    0.693
x.x24        2.608e-01  1.235e+00   0.211    0.867
x.x25        9.870e-01  2.377e+00   0.415    0.749
x.x26        4.835e-01  1.407e+00   0.344    0.789
x.x27        2.981e-01  2.573e+00   0.116    0.927
x.x28        1.006e+00  1.774e+00   0.567    0.672
x.x29        1.494e-01  8.847e-01   0.169    0.893
x.x30        3.485e-01  8.606e-01   0.405    0.755
x.x31        6.421e-02  1.231e+00   0.052    0.967
x.x32       -9.711e-02  1.696e+00  -0.057    0.964
x.x33        8.155e-01  2.013e+00   0.405    0.755
x.x34       -4.847e-01  1.225e+00  -0.396    0.760
x.x35        6.480e-02  7.780e-01   0.083    0.947
x.x36        4.807e-01  1.742e+00   0.276    0.829
x.x37        8.239e-01  2.380e+00   0.346    0.788
x.x38       -3.876e-01  1.289e+00  -0.301    0.814
x.x39       -4.342e-01  1.161e+00  -0.374    0.772
x.x40        1.845e-01  1.284e+00   0.144    0.909
x.x41        7.603e-01  2.027e+00   0.375    0.772
x.x42       -4.295e-01  6.610e-01  -0.650    0.633
x.x43        1.193e-01  8.105e-01   0.147    0.907
x.x44        7.352e-01  1.182e+00   0.622    0.646
x.x45       -7.436e-01  8.543e-01  -0.870    0.544
x.x46        1.869e-01  1.510e+00   0.124    0.922
x.x47        1.190e+00  3.689e+00   0.323    0.801
x.x48        4.283e-01  1.272e+00   0.337    0.793
x.x49       -8.510e-02  1.054e+00  -0.081    0.949
x.x50       -1.190e-01  7.982e-01  -0.149    0.906
x.x51        6.494e-02  1.157e+00   0.056    0.964
x.x52       -8.586e-02  6.975e-01  -0.123    0.922
x.x53        9.418e-02  1.574e+00   0.060    0.962
x.x54        1.913e-01  1.587e+00   0.121    0.924
x.x55       -5.927e-01  2.145e+00  -0.276    0.828
x.x56       -2.929e-02  1.784e+00  -0.016    0.990
x.x57        6.238e-01  1.370e+00   0.455    0.728
x.x58        1.346e-01  1.345e+00   0.100    0.937
x.x59        8.996e-01  9.900e-01   0.909    0.530
x.x60        5.015e-01  2.225e+00   0.225    0.859
x.x61       -2.985e-01  7.780e-01  -0.384    0.767
x.x62        4.703e-01  1.233e+00   0.381    0.768
x.x63       -3.954e-01  1.046e+00  -0.378    0.770

Residual standard error: 3688000 on 1 degrees of freedom
Multiple R-squared:  0.9341,	Adjusted R-squared:  -3.15 
F-statistic: 0.2287 on 62 and 1 DF,  p-value: 0.9594

In [216]:
pt(1.8,df=62)


0.961636965220729

In [217]:
cor.test(mData$x.x1, mData$x.x23)
plot(mData$x.x1, mData$x.x23)


	Pearson's product-moment correlation

data:  mData$x.x1 and mData$x.x23
t = -1.1521, df = 638, p-value = 0.2497
alternative hypothesis: true correlation is not equal to 0
95 percent confidence interval:
 -0.12263152  0.03205036
sample estimates:
        cor 
-0.04556369 

In [ ]:


In [183]:
qplot(x=Var1, y=Var2, data=melt(cor(mData)), fill=value, geom="tile")



In [ ]:


In [46]:
#library(devtools)
#install_github("jwist/rLims")
library(rLims)
account <- list(server='mylims.univalle.edu.co',user='jessica.medina@correounivalle.edu.co',key='PDfEZzKo3K')
metaData <- lims.getUserSamplesMetaData(account=account)
metaData <- lims.selectSamples(metaData,c(7:8))
Filter <- list('experiment'=c('noesygpps1dcomp'))
data <- lims.getSpectra(metaData, Filter=Filter,OP='AND',dry=FALSE)
plot(data$spectra[[2]])
#dataSet <- lims.createDataSet(data)


Loading required package: rjson

Attaching package: 'rLims'

The following object is masked from 'package:ggplot2':

    lims

Warning message:
In lims.getParameters(entries): NA found in parameters, please check
[1] "number of spectra ordered by experiment type:"

                     cpmgpr1dcomp     hmbcgplpndqf  hsqcdietgpsisp2 
               3              104                2                2 
        mlevphpr  noesygpphwgx1jw      noesygppr1d noesygpps1d.comp 
               2                2               66                2 
 noesygpps1dcomp             zg30           zgpg30 
              96                1                2 
[1] "number of spectra ordered by solvents:"

                        CDCl3 COFFEEcalctemp     COFFEEmeoh            D2O 
             2              4              5             78             11 
        H2OD2O           MeOD 
           165             17 
[1] "number of spectra ordered by nucleus:"

13C  1H   C 
  2 278   2 
[1] "number of spectra ordered by temperature:"

  300 300.1 300.2 300.3 300.5 
  246    18     9     2     1 
[1] "total number of spectra:  282"
Warning message:
In file(file, "rt"): unable to connect to 'mylims.univalle.edu.co' on port 80.
Error in file(file, "rt"): cannot open the connection
Error in data$spectra: object of type 'closure' is not subsettable

In [ ]: