6. Latent-mixture models

  • 싸이그래머 / Cognitive Modeling: Bayesian [1]
  • 김무성

Contents

  • 6.1 Exam scores
  • 6.2 Exam scores with individual differences
  • 6.3 Twenty questions
  • 6.4 The two-country quiz
  • 6.5 Assessment of malingering
  • 6.6 Individual differences in malingering
  • 6.7 Alzheimer’s recall test cheating

6.1 Exam scores

  • Suppose a group of 15 people sit an exam made up of 40 true-or-false questions, and they get 21, 17, 21, 18, 22, 31, 31, 34, 34, 35, 35, 36, 39, 36, and 35 right. These scores suggest that the first 5 people were just guessing, but the last 10 had some level of knowledge.
  • One way to make statistical inferences along these lines is to assume there are two different groups of people.
    • These groups have different probabilities of success, with
      • the guessing group having a probability of 0.5, and
      • the knowledge group having a probability greater than 0.5.
    • Whether each person belongs to the first or the second group is a latent or unobserved variable that can take just two values.
  • Using this approach, the goal is to infer
    • to which group each person belongs, and also
    • the rate of success for the knowledge group.

  • The number of correct answers for the ith person is ki,
  • and is out of n = 40.
  • The probability of success on each question for the ith person is the rate θi.
    • This rate is either ψ, if the person is in the guessing group,
    • or φ if the person is in the knowledge group.
  • Which group the ith person belongs to is determined by a binary indicator variable zi, with zi = 0 if the ith person is in the guessing group, and zi = 1 if the ith person is in the knowledge group.
  • This type of model is known as a latent-mixture model,
    • because the data are assumed to be generated by two different processes that combine or mix, and
    • important properties of that mixture are unobserved or latent.
    • In this case,
      • the two components that mix are the guessing and knowledge processes, and
      • the group membership of each person is latent.
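
In statistical notation, the generative model (a restatement of the JAGS code below) is:

$$
\begin{aligned}
z_i &\sim \text{Bernoulli}(0.5)\\
\psi &= 0.5\\
\phi &\sim \text{Uniform}(0.5, 1)\\
\theta_i &= \begin{cases}\psi & \text{if } z_i = 0\\ \phi & \text{if } z_i = 1\end{cases}\\
k_i &\sim \text{Binomial}(\theta_i, n)
\end{aligned}
$$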

JAGS code


In [40]:
library(R2jags)

In [41]:
# Exam Scores
s <- "
model{
  # Each Person Belongs To One Of Two Latent Groups
  for (i in 1:p){
    z[i] ~ dbern(0.5)
  }
  # First Group Guesses
  psi <- 0.5
  # Second Group Has Some Unknown Greater Rate Of Success
  phi ~ dbeta(1,1)T(0.5,1)
  # Data Follow Binomial With Rate Given By Each Person's Group Assignment
  for (i in 1:p){
    theta[i] <- equals(z[i],0)*psi+equals(z[i],1)*phi
    k[i] ~ dbin(theta[i],n)
  }
}
"

In [43]:
k <- c(21,17,21,18,22,31,31,34,34,35,35,36,39,36,35)
p <- length(k) # number of people
n <- 40 # number of questions

In [44]:
data <- list("p", "k", "n") # to be passed on to JAGS
myinits <- list(
  list(phi = 0.75, z = round(runif(p)))) # Initial group assignment

In [45]:
# parameters to be monitored:
parameters <- c("phi","z")

In [46]:
# The following command calls JAGS with specific options.
# For a detailed description see the R2jags documentation.
samples <- jags(data, inits=myinits, parameters,
                textConnection(s), n.chains=1, n.iter=1000, 
                n.burnin=1, n.thin=1, DIC=T)
# Now the values for the monitored parameters are in the "samples" object, 
# ready for inspection.


module glm loaded
Compiling model graph
   Resolving undeclared variables
   Allocating nodes
   Graph Size: 112

Initializing model


In [47]:
samples


Out[47]:
Inference for Bugs model at "4", fit using jags,
 1 chains, each with 1000 iterations (first 1 discarded)
 n.sims = 999 iterations saved
         mu.vect sd.vect   2.5%    25%    50%    75%  97.5%
phi        0.863   0.017  0.827  0.853  0.863  0.875  0.896
z[1]       0.000   0.000  0.000  0.000  0.000  0.000  0.000
z[2]       0.000   0.000  0.000  0.000  0.000  0.000  0.000
z[3]       0.000   0.000  0.000  0.000  0.000  0.000  0.000
z[4]       0.000   0.000  0.000  0.000  0.000  0.000  0.000
z[5]       0.000   0.000  0.000  0.000  0.000  0.000  0.000
z[6]       0.993   0.083  1.000  1.000  1.000  1.000  1.000
z[7]       0.991   0.095  1.000  1.000  1.000  1.000  1.000
z[8]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[9]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[10]      1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[11]      1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[12]      1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[13]      1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[14]      1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[15]      1.000   0.000  1.000  1.000  1.000  1.000  1.000
deviance  68.625   2.002 67.450 67.564 67.892 68.763 74.421

DIC info (using the rule, pD = var(deviance)/2)
pD = 2.0 and DIC = 70.6
DIC is an estimate of expected predictive error (lower deviance is better).

In [48]:
plot(samples)
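
The posterior mean of each z[i] is the probability that the corresponding person belongs to the knowledge group. A minimal sketch for pulling these out of the R2jags fit object above (the BUGSoutput field is standard R2jags):

z.mean <- samples$BUGSoutput$mean$z  # posterior mean of each group indicator
round(z.mean, 3)  # near 0 for the first five people (guessing), near 1 for the rest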


Stan code


In [50]:
library(rstan)

In [51]:
model <- "
// Exam Scores
data { 
  int<lower=1> p;
  int<lower=0> k[p];
  int<lower=1> n;
}
transformed data {
  real psi;

  // First Group Guesses
  psi <- .5;
}
parameters {
  // Second Group Has Some Unknown Greater Rate Of Success
  real<lower=.5,upper=1> phi; 
} 
transformed parameters {
  vector[2] lp_parts[p];

  // Data Follow Binomial With Rate Given By Each Person's Group Assignment
  for (i in 1:p) {
    lp_parts[i,1] <- log(.5) + binomial_log(k[i], n, phi);
    lp_parts[i,2] <- log(.5) + binomial_log(k[i], n, psi); 
  }
}
model {
  for (i in 1:p)
    increment_log_prob(log_sum_exp(lp_parts[i]));  
}
generated quantities {
  int<lower=0,upper=1> z[p];
  
  for (i in 1:p) {
    vector[2] prob;
    prob <- softmax(lp_parts[i]);
    z[i] <- bernoulli_rng(prob[1]);
  }
}"

In [52]:
k <- c(21, 17, 21, 18, 22, 31, 31, 34, 34, 35, 35, 36, 39, 36, 35)
p <- length(k)  # number of people
n <- 40  # number of questions

In [53]:
data <- list(p=p, k=k, n=n) # to be passed on to Stan

In [54]:
myinits <- list(
  list(phi=.75))
  
parameters <- c("phi", "z")  # parameters to be monitored

In [56]:
# The following command calls Stan with specific options.
# For a detailed description type "?rstan".
samples <- stan(model_code=model,   
                data=data, 
                init=myinits,  # If not specified, gives random inits
                pars=parameters,
                iter=1000, 
                chains=1, 
                thin=1,
                # warmup = 100,  # Stands for burn-in; Default = iter/2
                # seed = 123  # Setting seed; Default is random seed
)
# Now the values for the monitored parameters are in the "samples" object, 
# ready for inspection.


TRANSLATING MODEL 'model' FROM Stan CODE TO C++ CODE NOW.
COMPILING THE C++ CODE FOR MODEL 'model' NOW.
In file included from file37947c79fd3.cpp:8:
In file included from /Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/model/model_header.hpp:17:
In file included from /Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/agrad/rev.hpp:5:
/Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/agrad/rev/chainable.hpp:87:17: warning: 'static' function 'set_zero_all_adjoints' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration]
    static void set_zero_all_adjoints() {
                ^
In file included from file37947c79fd3.cpp:8:
In file included from /Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/model/model_header.hpp:21:
/Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/io/dump.hpp:26:14: warning: function 'product' is not needed and will not be emitted [-Wunneeded-internal-declaration]
      size_t product(std::vector<size_t> dims) {
             ^
2 warnings generated.

SAMPLING FOR MODEL 'model' NOW (CHAIN 1).

Iteration:   1 / 1000 [  0%]  (Warmup)
Iteration: 100 / 1000 [ 10%]  (Warmup)
Iteration: 200 / 1000 [ 20%]  (Warmup)
Iteration: 300 / 1000 [ 30%]  (Warmup)
Iteration: 400 / 1000 [ 40%]  (Warmup)
Iteration: 500 / 1000 [ 50%]  (Warmup)
Iteration: 501 / 1000 [ 50%]  (Sampling)
Iteration: 600 / 1000 [ 60%]  (Sampling)
Iteration: 700 / 1000 [ 70%]  (Sampling)
Iteration: 800 / 1000 [ 80%]  (Sampling)
Iteration: 900 / 1000 [ 90%]  (Sampling)
Iteration: 1000 / 1000 [100%]  (Sampling)
#  Elapsed Time: 0.049615 seconds (Warm-up)
#                0.049192 seconds (Sampling)
#                0.098807 seconds (Total)


In [57]:
print(samples, digits=3)


Inference for Stan model: model.
1 chains, each with iter=1000; warmup=500; thin=1; 
post-warmup draws per chain=500, total post-warmup draws=500.

         mean se_mean    sd    2.5%     25%     50%     75%   97.5% n_eff  Rhat
phi     0.863   0.001 0.017   0.826   0.853   0.864   0.876   0.892   207 1.000
z[1]    0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000   500   NaN
z[2]    0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000   500   NaN
z[3]    0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000   500   NaN
z[4]    0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000   500   NaN
z[5]    0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000   500   NaN
z[6]    0.996   0.003 0.063   1.000   1.000   1.000   1.000   1.000   500 1.002
z[7]    1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000   500   NaN
z[8]    1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000   500   NaN
z[9]    1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000   500   NaN
z[10]   1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000   500   NaN
z[11]   1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000   500   NaN
z[12]   1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000   500   NaN
z[13]   1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000   500   NaN
z[14]   1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000   500   NaN
z[15]   1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000   500   NaN
lp__  -46.901   0.039 0.618 -48.641 -47.095 -46.642 -46.476 -46.425   246 1.000

Samples were drawn using NUTS(diag_e) at Sun Jul 19 00:38:49 2015.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

In [58]:
plot(samples)
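
As with the JAGS fit, the posterior means of the z draws act as classification probabilities. A minimal sketch using rstan's extract():

z.draws <- extract(samples)$z  # matrix of posterior draws (rows) by people (columns)
round(colMeans(z.draws), 3)    # posterior probability of knowledge-group membership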


6.2 Exam scores with individual differences

The previous example shows how latent-mixture modeling can treat data as coming from a mixture of sources, and infer properties of those latent groups. But the specific model has at least one big weakness: it assumes that all the people in the knowledge group have exactly the same rate of success on the questions.

  • One straightforward way to allow for individual differences in the knowledge group is to extend the model hierarchically.
  • This involves drawing the success rate for each of the people in the knowledge group from an over-arching distribution.
    • One convenient (but not perfect) choice for this “individual differences” distribution is a Gaussian.
    • It is a natural statistical model for individual variation, at least in the absence of any richer theory.
    • But it has the problem of allowing for success rates below zero and above one.
    • An inelegant but practical and effective way to deal with this is simply to restrict the sampled success rates to the valid range.
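
In notation, the extension replaces the single knowledge rate φ with person-specific rates drawn from a truncated Gaussian (a restatement of the JAGS model below, where λ is a precision and σ = 1/√λ):

$$
\begin{aligned}
\mu &\sim \text{Uniform}(0.5, 1)\\
\lambda &\sim \text{Gamma}(0.001, 0.001)\\
\phi_i &\sim \text{Gaussian}(\mu, \lambda)\ \text{truncated to } (0, 1)\\
\theta_i &= \begin{cases}\psi & \text{if } z_i = 0\\ \phi_i & \text{if } z_i = 1\end{cases}
\end{aligned}
$$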

JAGS code


In [88]:
# Exam Scores With Individual Differences
s <- "
model{
  # Data Follow Binomial With Rate Given By Each Person's Group Assignment
  for (i in 1:p){
    theta[i] <- equals(z[i],0)*psi+equals(z[i],1)*phi[i]
    k[i] ~ dbin(theta[i],n)
  }
  # Each Person Belongs To One Of Two Latent Groups
  for (i in 1:p){
    z[i] ~ dbern(0.5)
  }
  # The Second Group Allows Individual Differences
  for (i in 1:p){
    # Second Group Drawn From A Truncated Gaussian Distribution
    phi[i] ~ dnorm(mu,lambda)T(0,1)
  }   
  # First Group Guesses
  psi <- 0.5
  # Second Group Mean, Precision (And Standard Deviation)
  mu ~ dbeta(1,1)T(.5,1) # >0.5 Average Success Rate
  lambda ~ dgamma(.001,.001)
  sigma <- 1/sqrt(lambda) 
  # Posterior Predictive For Second Group
  predphi ~ dnorm(mu,lambda)T(0,1)
}
"

In [89]:
k <- c(21,17,21,18,22,31,31,34,34,35,35,36,39,36,35)
p <- length(k) # number of people
n <- 40 # number of questions

In [90]:
data <- list("p", "k", "n") # to be passed on to JAGS

In [91]:
myinits <- list(
  list(mu = 0.75, lambda = 1, z = round(runif(p)))) # Initial group assignment

In [92]:
# parameters to be monitored: 
parameters <- c("predphi","theta","z","mu","sigma")

In [93]:
# The following command calls JAGS with specific options.
# For a detailed description see the R2jags documentation.
samples <- jags(data, inits=myinits, parameters,
                textConnection(s), n.chains=1, n.iter=2000, 
                n.burnin=1000, n.thin=1, DIC=T)
# Now the values for the monitored parameters are in the "samples" object, 
# ready for inspection.


Compiling model graph
   Resolving undeclared variables
   Allocating nodes
   Graph Size: 132

Initializing model


In [94]:
samples


Out[94]:
Inference for Bugs model at "6", fit using jags,
 1 chains, each with 2000 iterations (first 1000 discarded)
 n.sims = 1000 iterations saved
          mu.vect sd.vect   2.5%    25%    50%    75%  97.5%
mu          0.868   0.033  0.796  0.851  0.867  0.883  0.943
predphi     0.862   0.064  0.703  0.831  0.865  0.899  0.978
sigma       0.058   0.035  0.018  0.035  0.049  0.071  0.146
theta[1]    0.501   0.012  0.500  0.500  0.500  0.500  0.500
theta[2]    0.500   0.004  0.500  0.500  0.500  0.500  0.500
theta[3]    0.501   0.014  0.500  0.500  0.500  0.500  0.500
theta[4]    0.500   0.003  0.500  0.500  0.500  0.500  0.500
theta[5]    0.501   0.015  0.500  0.500  0.500  0.500  0.500
theta[6]    0.823   0.053  0.714  0.798  0.833  0.857  0.895
theta[7]    0.823   0.047  0.724  0.799  0.830  0.854  0.893
theta[8]    0.854   0.038  0.772  0.831  0.858  0.879  0.919
theta[9]    0.853   0.036  0.778  0.832  0.857  0.878  0.917
theta[10]   0.865   0.038  0.780  0.842  0.867  0.891  0.930
theta[11]   0.864   0.038  0.778  0.842  0.865  0.892  0.932
theta[12]   0.876   0.039  0.792  0.852  0.875  0.902  0.948
theta[13]   0.921   0.040  0.843  0.891  0.924  0.955  0.986
theta[14]   0.876   0.036  0.801  0.853  0.876  0.901  0.943
theta[15]   0.864   0.037  0.783  0.842  0.867  0.889  0.930
z[1]        0.017   0.129  0.000  0.000  0.000  0.000  0.000
z[2]        0.002   0.045  0.000  0.000  0.000  0.000  0.000
z[3]        0.007   0.083  0.000  0.000  0.000  0.000  0.000
z[4]        0.004   0.063  0.000  0.000  0.000  0.000  0.000
z[5]        0.008   0.089  0.000  0.000  0.000  0.000  0.000
z[6]        0.993   0.083  1.000  1.000  1.000  1.000  1.000
z[7]        0.997   0.055  1.000  1.000  1.000  1.000  1.000
z[8]        1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[9]        1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[10]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[11]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[12]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[13]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[14]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[15]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
deviance   64.737   3.689 58.902 62.066 64.392 67.069 72.691

DIC info (using the rule, pD = var(deviance)/2)
pD = 6.8 and DIC = 71.5
DIC is an estimate of expected predictive error (lower deviance is better).

In [95]:
plot(samples)


Stan code


In [96]:
model <- "
// Exam Scores With Individual Differences
data { 
  int<lower=1> p;
  int<lower=0> k[p];
  int<lower=1> n;
}
transformed data {
  real psi;

  // First Group Guesses
  psi <- .5;
}
parameters {
  vector<lower=0,upper=1>[p] phi;
  real<lower=.5,upper=1> mu;  // Second Group Mean
  real<lower=0> lambda;
  real<lower=0,upper=1> predphi;
} 
transformed parameters {
  vector[2] lp_parts[p];
  real<lower=0> sigma;

  // Second Group Standard Deviation
  sigma <- inv_sqrt(lambda);
  
  // Data Follow Binomial With Rate Given By Each Person's Group Assignment
  // Each Person Belongs To One Of Two Latent Groups
  for (i in 1:p) {
    lp_parts[i, 1] <- log(.5) + binomial_log(k[i], n, phi[i]);
    lp_parts[i, 2] <- log(.5) + binomial_log(k[i], n, psi); 
  }
}
model {
  // Second Group Precision
  lambda ~ gamma(.001, .001);
  // Posterior Predictive For Second Group
  predphi ~ normal(mu, sigma)T[0,1];
  // Second Group Drawn From A Truncated Gaussian Distribution
  for (i in 1:p)
    phi[i] ~ normal(mu, sigma)T[0,1];
  
  for (i in 1:p)  
    increment_log_prob(log_sum_exp(lp_parts[i]));
}
generated quantities {
  int<lower=0,upper=1> z[p];
  vector<lower=0,upper=1>[p] theta;

  for (i in 1:p) {
    vector[2] prob;
    prob <- softmax(lp_parts[i]);
    z[i] <- bernoulli_rng(prob[1]);
    theta[i] <- (z[i] == 0) * psi + (z[i] == 1) * phi[i];
  }
}"

In [97]:
k <- c(21, 17, 21, 18, 22, 31, 31, 34, 34, 35, 35, 36, 39, 36, 35)
p <- length(k)  # number of people
n <- 40  # number of questions

In [98]:
data <- list(p=p, k=k, n=n) # to be passed on to Stan

In [99]:
myinits <- list(
  list(phi=runif(p, .5, 1), mu=.75, lambda=1, predphi=.75))

In [100]:
# parameters to be monitored: 
parameters <- c("predphi", "theta", "z", "mu", "sigma")

In [101]:
# The following command calls Stan with specific options.
# For a detailed description type "?rstan".
samples <- stan(model_code=model,   
                data=data, 
                init=myinits,  # If not specified, gives random inits
                pars=parameters,
                iter=20000, 
                chains=1, 
                thin=1,
                # warmup = 100,  # Stands for burn-in; Default = iter/2
                # seed = 123  # Setting seed; Default is random seed
)
# Now the values for the monitored parameters are in the "samples" object, 
# ready for inspection.


TRANSLATING MODEL 'model' FROM Stan CODE TO C++ CODE NOW.
COMPILING THE C++ CODE FOR MODEL 'model' NOW.
In file included from file3791a3e1763.cpp:8:
In file included from /Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/model/model_header.hpp:17:
In file included from /Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/agrad/rev.hpp:5:
/Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/agrad/rev/chainable.hpp:87:17: warning: 'static' function 'set_zero_all_adjoints' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration]
    static void set_zero_all_adjoints() {
                ^
In file included from file3791a3e1763.cpp:8:
In file included from /Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/model/model_header.hpp:21:
/Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/io/dump.hpp:26:14: warning: function 'product' is not needed and will not be emitted [-Wunneeded-internal-declaration]
      size_t product(std::vector<size_t> dims) {
             ^
2 warnings generated.

SAMPLING FOR MODEL 'model' NOW (CHAIN 1).

Iteration:     1 / 20000 [  0%]  (Warmup)
Iteration:  2000 / 20000 [ 10%]  (Warmup)
Iteration:  4000 / 20000 [ 20%]  (Warmup)
Iteration:  6000 / 20000 [ 30%]  (Warmup)
Iteration:  8000 / 20000 [ 40%]  (Warmup)
Iteration: 10000 / 20000 [ 50%]  (Warmup)
Iteration: 10001 / 20000 [ 50%]  (Sampling)
Iteration: 12000 / 20000 [ 60%]  (Sampling)
Iteration: 14000 / 20000 [ 70%]  (Sampling)
Iteration: 16000 / 20000 [ 80%]  (Sampling)
Iteration: 18000 / 20000 [ 90%]  (Sampling)
Iteration: 20000 / 20000 [100%]  (Sampling)
#  Elapsed Time: 5.86379 seconds (Warm-up)
#                4.72673 seconds (Sampling)
#                10.5905 seconds (Total)


In [102]:
print(samples, digits=3)


Inference for Stan model: model.
1 chains, each with iter=20000; warmup=10000; thin=1; 
post-warmup draws per chain=10000, total post-warmup draws=10000.

             mean se_mean    sd    2.5%     25%     50%     75%   97.5% n_eff
predphi     0.860   0.001 0.068   0.709   0.823   0.863   0.903   0.981  4950
theta[1]    0.501   0.000 0.011   0.500   0.500   0.500   0.500   0.500  9898
theta[2]    0.500   0.000 0.005   0.500   0.500   0.500   0.500   0.500 10000
theta[3]    0.501   0.000 0.014   0.500   0.500   0.500   0.500   0.500  8757
theta[4]    0.500   0.000 0.005   0.500   0.500   0.500   0.500   0.500  9832
theta[5]    0.502   0.000 0.020   0.500   0.500   0.500   0.500   0.500  8859
theta[6]    0.817   0.001 0.050   0.706   0.793   0.824   0.851   0.894  3876
theta[7]    0.816   0.001 0.051   0.709   0.791   0.822   0.849   0.893  4632
theta[8]    0.852   0.001 0.040   0.765   0.827   0.855   0.881   0.923  4407
theta[9]    0.852   0.001 0.040   0.763   0.828   0.855   0.880   0.924  5344
theta[10]   0.865   0.001 0.039   0.782   0.841   0.867   0.892   0.935  5342
theta[11]   0.864   0.001 0.039   0.781   0.839   0.867   0.892   0.934  4919
theta[12]   0.877   0.001 0.039   0.795   0.852   0.879   0.904   0.949  3024
theta[13]   0.923   0.001 0.041   0.839   0.895   0.927   0.956   0.989  1167
theta[14]   0.877   0.001 0.040   0.794   0.852   0.880   0.905   0.950  3292
theta[15]   0.864   0.001 0.039   0.780   0.839   0.866   0.891   0.935  5052
z[1]        0.008   0.001 0.087   0.000   0.000   0.000   0.000   0.000  9095
z[2]        0.004   0.001 0.062   0.000   0.000   0.000   0.000   0.000  7290
z[3]        0.010   0.001 0.098   0.000   0.000   0.000   0.000   0.000  7957
z[4]        0.004   0.001 0.060   0.000   0.000   0.000   0.000   0.000  9539
z[5]        0.016   0.001 0.124   0.000   0.000   0.000   0.000   0.000  7601
z[6]        0.996   0.001 0.066   1.000   1.000   1.000   1.000   1.000 10000
z[7]        0.995   0.001 0.072   1.000   1.000   1.000   1.000   1.000  8907
z[8]        1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000 10000
z[9]        1.000   0.000 0.010   1.000   1.000   1.000   1.000   1.000 10000
z[10]       1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000 10000
z[11]       1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000 10000
z[12]       1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000 10000
z[13]       1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000 10000
z[14]       1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000 10000
z[15]       1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000 10000
mu          0.871   0.001 0.036   0.811   0.849   0.867   0.887   0.966  1499
sigma       0.065   0.001 0.040   0.026   0.039   0.054   0.079   0.170   994
lp__      -40.907   0.279 6.629 -53.916 -45.749 -40.580 -35.935 -29.197   566
           Rhat
predphi   1.000
theta[1]  1.000
theta[2]  1.000
theta[3]  1.000
theta[4]  1.000
theta[5]  1.000
theta[6]  1.000
theta[7]  1.000
theta[8]  1.001
theta[9]  1.000
theta[10] 1.000
theta[11] 1.000
theta[12] 1.000
theta[13] 1.000
theta[14] 1.000
theta[15] 1.000
z[1]      1.000
z[2]      1.000
z[3]      1.000
z[4]      1.000
z[5]      1.000
z[6]      1.000
z[7]      1.000
z[8]        NaN
z[9]      1.000
z[10]       NaN
z[11]       NaN
z[12]       NaN
z[13]       NaN
z[14]       NaN
z[15]       NaN
mu        1.000
sigma     1.000
lp__      1.000

Samples were drawn using NUTS(diag_e) at Sun Jul 19 01:21:06 2015.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

In [103]:
plot(samples)


6.3 Twenty questions

JAGS code


In [59]:
# Twenty Questions
s <- "
model{
  # Correctness Of Each Answer Is Bernoulli Trial
  for (i in 1:np){
    for (j in 1:nq){
      k[i,j] ~ dbern(theta[i,j])
    }
  }
  # Probability Correct Is Product Of Question By Person Rates
  for (i in 1:np){
    for (j in 1:nq){
      theta[i,j] <- p[i]*q[j]
    }
  }
  # Priors For People and Questions
  for (i in 1:np){
    p[i] ~ dbeta(1,1)
  }
  for (j in 1:nq){
    q[j] ~ dbeta(1,1)
  }
}
"

In [65]:
# Twenty Questions (NA)
s_NA <- "
model{
  # Correctness Of Each Answer Is Bernoulli Trial
  for (i in 1:np){
    for (j in 1:nq){
      k[i,j] ~ dbern(theta[i,j])
    }
  }
  # Probability Correct Is Product Of Question By Person Rates
  for (i in 1:np){
    for (j in 1:nq){
      theta[i,j] <- p[i]*q[j]
    }
  }
  # Priors For People and Questions
  for (i in 1:np){
    p[i] ~ dbeta(1,1)
  }
  for (j in 1:nq){
    q[j] ~ dbeta(1,1)
  }
  # Copy The Three Missing Cells So They Can Be Monitored From R
  NA.array[1] <- k[1,13]
  NA.array[2] <- k[8,5]
  NA.array[3] <- k[10,18]
}
"

In [60]:
dset <- 1

if (dset==1)
{
  k <- c(1,1,1,1,0,0,1,1,0,1,0,0,1,0,0,1,0,1,0,0,
        0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,1,0,0,0,1,1,0,0,0,0,1,0,0,0,0,0,0,0,
        0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,
        1,0,1,1,0,1,1,1,0,1,0,0,1,0,0,0,0,1,0,0,
        1,1,0,1,0,0,0,1,0,1,0,1,1,0,0,1,0,1,0,0,
        0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,1,1,0,0,0,0,1,0,1,0,0,1,0,0,0,0,1,0,1,
        1,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0)
}

if (dset==2)
{
  k <- c(1,1,1,1,0,0,1,1,0,1,0,0,NA,0,0,1,0,1,0,0,
        0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,1,0,0,0,1,1,0,0,0,0,1,0,0,0,0,0,0,0,
        0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,
        1,0,1,1,0,1,1,1,0,1,0,0,1,0,0,0,0,1,0,0,
        1,1,0,1,0,0,0,1,0,1,0,1,1,0,0,1,0,1,0,0,
        0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,
        0,0,0,0,NA,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,1,1,0,0,0,0,1,0,1,0,0,1,0,0,0,0,1,0,1,
        1,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,NA,0,0)
}

In [61]:
k <- matrix(k, nrow=10, byrow=T)

In [62]:
np <- nrow(k)
nq <- ncol(k)

In [63]:
data <- list("np","nq","k") # to be passed on to JAGS

In [64]:
myinits <- list(
  list(q = runif(nq), p = runif(np)))

In [67]:
if (dset==1)
{
# parameters to be monitored:
  parameters <- c("p", "q")

  # The following command calls JAGS with specific options.
  # For a detailed description see the R2jags documentation.
  samples <- jags(data, inits=myinits, parameters,
                  textConnection(s), n.chains=1, n.iter=10000, 
                  n.burnin=1, n.thin=1, DIC=T)
  # Now the values for the monitored parameters are in the "samples" object, 
  # ready for inspection.
}

if (dset==2)
{
# parameters to be monitored:
  parameters <- c("p", "q", "NA.array")

  # The following command calls JAGS with specific options.
  # For a detailed description see the R2jags documentation.
  samples <- jags(data, inits=myinits, parameters,
                  textConnection(s_NA), n.chains=1, n.iter=10000, 
                  n.burnin=1, n.thin=1, DIC=T)
  # Now the values for the monitored parameters are in the "samples" object, 
  # ready for inspection.
}


Compiling model graph
   Resolving undeclared variables
   Allocating nodes
   Graph Size: 433

Initializing model


In [68]:
samples


Out[68]:
Inference for Bugs model at "6", fit using jags,
 1 chains, each with 10000 iterations (first 1 discarded)
 n.sims = 9999 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%
p[1]       0.890   0.095   0.647   0.842   0.917   0.963   0.997
p[2]       0.275   0.139   0.063   0.171   0.256   0.358   0.600
p[3]       0.478   0.170   0.182   0.353   0.467   0.593   0.831
p[4]       0.357   0.150   0.110   0.245   0.340   0.455   0.688
p[5]       0.843   0.120   0.552   0.772   0.868   0.939   0.995
p[6]       0.823   0.127   0.527   0.744   0.847   0.924   0.992
p[7]       0.177   0.113   0.023   0.091   0.156   0.242   0.448
p[8]       0.090   0.086   0.002   0.026   0.065   0.128   0.318
p[9]       0.722   0.159   0.392   0.611   0.734   0.847   0.979
p[10]      0.479   0.170   0.181   0.354   0.471   0.594   0.836
q[1]       0.731   0.177   0.345   0.614   0.753   0.876   0.988
q[2]       0.698   0.182   0.313   0.569   0.714   0.845   0.978
q[3]       0.768   0.157   0.419   0.666   0.794   0.894   0.989
q[4]       0.646   0.204   0.239   0.497   0.658   0.809   0.978
q[5]       0.152   0.134   0.004   0.049   0.115   0.216   0.497
q[6]       0.310   0.183   0.045   0.168   0.280   0.423   0.733
q[7]       0.747   0.162   0.393   0.635   0.766   0.876   0.985
q[8]       0.819   0.144   0.466   0.738   0.853   0.934   0.994
q[9]       0.284   0.169   0.041   0.153   0.256   0.389   0.674
q[10]      0.849   0.127   0.528   0.780   0.881   0.949   0.995
q[11]      0.155   0.138   0.005   0.051   0.117   0.216   0.517
q[12]      0.422   0.191   0.105   0.274   0.405   0.551   0.830
q[13]      0.858   0.123   0.539   0.795   0.891   0.955   0.996
q[14]      0.157   0.141   0.005   0.050   0.116   0.226   0.524
q[15]      0.150   0.135   0.004   0.047   0.111   0.214   0.492
q[16]      0.484   0.209   0.127   0.323   0.472   0.633   0.912
q[17]      0.157   0.141   0.004   0.048   0.116   0.225   0.530
q[18]      0.762   0.172   0.376   0.650   0.793   0.901   0.991
q[19]      0.156   0.138   0.005   0.050   0.118   0.225   0.511
q[20]      0.304   0.178   0.044   0.164   0.273   0.415   0.716
deviance 144.453   7.994 130.286 138.824 143.904 149.478 161.413

DIC info (using the rule, pD = var(deviance)/2)
pD = 32.0 and DIC = 176.4
DIC is an estimate of expected predictive error (lower deviance is better).

In [69]:
plot(samples)

# When you use WinBUGS and you want to sample the missing data, you cannot 
# just return "k" to R, as you can with Matlab. Most likely, the reason is 
# that "k" is a matrix, and, say, k[1,13] is a single number. WinBUGS will 
# sample k[1,13] when it contains "NA", but R will not accept a sequence of 
# samples for the cell k[1,13]. This means the following:
# 1. Using R2WinBUGS, you can tell WinBUGS to monitor and return "k". 
# When you set "debug=T", you can inspect the results, but, when you return
# to R, WinBUGS is unable to pass on its results.
# 2. One clumsy way around this problem is illustrated in the code
# TwentyQuestions_NA.txt. 
# Note that JAGS does not seem to have this problem and you can just
# examine "k".


Stan code


In [71]:
#### Notes to Stan model #######################################################
## 1) There are two models in this script. The first one cannot incorporate
##    missing values, as is required for exercise 6.3.2. The second model can.
##    In fact, you can use the second model on both datasets - with or
##    without NAs.
## 2) Models with missing values can be tricky. It's important to know that
##    Stan treats variables as either known or unknown, so you have to
##    separate the two parts first and handle each appropriately. The chapter
##    on missing data in the Stan manual is useful.
################################################################################

In [72]:
model1 <- "
// Twenty Questions
data { 
  int<lower=0> np;  // rows
  int<lower=0> nq;  // columns
  int k[np,nq];
}
parameters {
  real<lower=0,upper=1> p[np];
  real<lower=0,upper=1> q[nq];
} 
transformed parameters {
  real<lower=0,upper=1> theta[np, nq];

  // Probability Correct Is Product Of Question By Person Rates
  for (i in 1:np)
    for (j in 1:nq)
      theta[i,j] <- p[i] * q[j];
}
model {
  // Priors For People and Questions
  p ~ beta(1, 1);
  q ~ beta(1, 1);
    
  // Correctness Of Each Answer Is Bernoulli Trial
  for (i in 1:np)
    for (j in 1:nq)
      k[i, j] ~ bernoulli(theta[i,j]);
}"

In [73]:
model2 <- "
// Twenty Questions
data { 
  int<lower=0> np;  // rows
  int<lower=0> nq;  // columns
  int k[np,nq];
  int k_na[np,nq];  // locating NAs in k
  int n_na;         // number of NAs
}
parameters {
  real<lower=0,upper=1> p[np];
  real<lower=0,upper=1> q[nq];
} 
transformed parameters {
  real<lower=0,upper=1> theta[np,nq];

  // Probability Correct Is Product Of Question By Person Rates
  for (i in 1:np)
    for (j in 1:nq)
      theta[i,j] <- p[i] * q[j];
}
model {
  // Priors For People and Questions
  p ~ beta(1, 1);
  q ~ beta(1, 1);
    
  // Correctness Of Each Answer Is Bernoulli Trial
  for (i in 1:np)
    for (j in 1:nq)
      if (k_na[i,j] == 0)     // If k[i,j] is not missing
        k[i,j] ~ bernoulli(theta[i,j]);
}
generated quantities {
  int na_array[n_na];
  int index;
  
  index <- 1;
  for (i in 1:np) {
    for (j in 1:nq) {   
      if (k_na[i,j] == 1) {   // If k[i,j] is missing
        na_array[index] <- bernoulli_rng(theta[i,j]);
        index <- index + 1;
      }
    }
  }
}"

In [75]:
dset <- 1  # Choose dataset/model

if (dset == 1) {
  k <- c(1,1,1,1,0,0,1,1,0,1,0,0,1,0,0,1,0,1,0,0,
        0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,1,0,0,0,1,1,0,0,0,0,1,0,0,0,0,0,0,0,
        0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,
        1,0,1,1,0,1,1,1,0,1,0,0,1,0,0,0,0,1,0,0,
        1,1,0,1,0,0,0,1,0,1,0,1,1,0,0,1,0,1,0,0,
        0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,
        0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,1,1,0,0,0,0,1,0,1,0,0,1,0,0,0,0,1,0,1,
        1,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,0,0,0)
}

if (dset == 2) {
  k <- c(1,1,1,1,0,0,1,1,0,1,0,0,NA,0,0,1,0,1,0,0,
        0,1,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,0,1,0,0,0,1,1,0,0,0,0,1,0,0,0,0,0,0,0,
        0,0,0,0,0,0,1,0,1,1,0,0,0,0,0,0,0,0,0,0,
        1,0,1,1,0,1,1,1,0,1,0,0,1,0,0,0,0,1,0,0,
        1,1,0,1,0,0,0,1,0,1,0,1,1,0,0,1,0,1,0,0,
        0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,
        0,0,0,0,NA,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,
        0,1,1,0,0,0,0,1,0,1,0,0,1,0,0,0,0,1,0,1,
        1,0,0,0,0,0,1,0,0,1,0,0,1,0,0,0,0,NA,0,0)
}

In [76]:
k <- matrix(k, nrow=10, byrow=T)
np <- nrow(k)
nq <- ncol(k)

In [77]:
if (dset == 1) {
  model <- model1
  
  data <- list(k=k, np=np, nq=nq)  # to be passed on to Stan
  parameters <- c("p", "q")  # parameters to be monitored
  
} else if (dset == 2) {
  model <- model2
  k_na <- is.na(k) + 0  # Matrix locating missing values: 1 = missing
  n_na <- sum(k_na)  # number of missing values
  k[is.na(k)] <- 99  # arbitrary numeric placeholder, since Stan does not accept NAs
  
  # to be passed on to Stan:
  data <- list(k=k, np=np, nq=nq, k_na=k_na, n_na=n_na)  
  parameters <- c("p", "q", "na_array")  # parameters to be monitored   
}

In [78]:
myinits <- list(
  list(q=runif(nq), p=runif(np)))

# The following command calls Stan with specific options.
# For a detailed description type "?rstan".
samples <- stan(model_code=model,   
                data=data, 
                init=myinits,  # If not specified, gives random inits
                pars=parameters,
                iter=10000, 
                chains=1, 
                thin=1,
                # warmup = 100,  # Stands for burn-in; Default = iter/2
                # seed = 123  # Setting seed; Default is random seed
)
# Now the values for the monitored parameters are in the "samples" object, 
# ready for inspection.


TRANSLATING MODEL 'model' FROM Stan CODE TO C++ CODE NOW.
COMPILING THE C++ CODE FOR MODEL 'model' NOW.
In file included from file37946b52b6.cpp:8:
In file included from /Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/model/model_header.hpp:17:
In file included from /Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/agrad/rev.hpp:5:
/Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/agrad/rev/chainable.hpp:87:17: warning: 'static' function 'set_zero_all_adjoints' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration]
    static void set_zero_all_adjoints() {
                ^
In file included from file37946b52b6.cpp:8:
In file included from /Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/model/model_header.hpp:21:
/Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/io/dump.hpp:26:14: warning: function 'product' is not needed and will not be emitted [-Wunneeded-internal-declaration]
      size_t product(std::vector<size_t> dims) {
             ^
2 warnings generated.

SAMPLING FOR MODEL 'model' NOW (CHAIN 1).

Iteration:    1 / 10000 [  0%]  (Warmup)
Iteration: 1000 / 10000 [ 10%]  (Warmup)
Iteration: 2000 / 10000 [ 20%]  (Warmup)
Iteration: 3000 / 10000 [ 30%]  (Warmup)
Iteration: 4000 / 10000 [ 40%]  (Warmup)
Iteration: 5000 / 10000 [ 50%]  (Warmup)
Iteration: 5001 / 10000 [ 50%]  (Sampling)
Iteration: 6000 / 10000 [ 60%]  (Sampling)
Iteration: 7000 / 10000 [ 70%]  (Sampling)
Iteration: 8000 / 10000 [ 80%]  (Sampling)
Iteration: 9000 / 10000 [ 90%]  (Sampling)
Iteration: 10000 / 10000 [100%]  (Sampling)
#  Elapsed Time: 1.7506 seconds (Warm-up)
#                1.8743 seconds (Sampling)
#                3.62491 seconds (Total)


In [79]:
print(samples, digits=3)


Inference for Stan model: model.
1 chains, each with iter=10000; warmup=5000; thin=1; 
post-warmup draws per chain=5000, total post-warmup draws=5000.

          mean se_mean    sd     2.5%      25%      50%      75%    97.5% n_eff
p[1]     0.890   0.001 0.094    0.655    0.841    0.915    0.963    0.996  5000
p[2]     0.275   0.002 0.141    0.064    0.169    0.252    0.361    0.605  4115
p[3]     0.478   0.003 0.171    0.183    0.349    0.468    0.596    0.836  3445
p[4]     0.353   0.002 0.150    0.109    0.243    0.336    0.447    0.683  3931
p[5]     0.847   0.002 0.118    0.567    0.776    0.872    0.941    0.995  5000
p[6]     0.821   0.002 0.124    0.540    0.745    0.842    0.921    0.990  5000
p[7]     0.173   0.002 0.110    0.024    0.088    0.154    0.230    0.452  4705
p[8]     0.093   0.001 0.086    0.003    0.030    0.068    0.130    0.320  5000
p[9]     0.721   0.002 0.158    0.398    0.608    0.732    0.844    0.979  5000
p[10]    0.478   0.003 0.174    0.179    0.351    0.466    0.591    0.856  3123
q[1]     0.726   0.003 0.179    0.335    0.604    0.751    0.872    0.987  5000
q[2]     0.693   0.003 0.180    0.318    0.568    0.711    0.837    0.980  5000
q[3]     0.772   0.002 0.158    0.407    0.671    0.798    0.898    0.992  5000
q[4]     0.649   0.003 0.209    0.235    0.494    0.662    0.822    0.982  3676
q[5]     0.151   0.002 0.137    0.004    0.047    0.115    0.219    0.505  5000
q[6]     0.315   0.003 0.188    0.042    0.169    0.284    0.434    0.747  4395
q[7]     0.749   0.002 0.158    0.402    0.646    0.765    0.877    0.984  5000
q[8]     0.820   0.002 0.145    0.471    0.735    0.851    0.937    0.996  5000
q[9]     0.288   0.003 0.171    0.037    0.156    0.260    0.391    0.688  4049
q[10]    0.854   0.002 0.122    0.559    0.788    0.885    0.949    0.996  5000
q[11]    0.153   0.002 0.139    0.003    0.047    0.114    0.218    0.512  5000
q[12]    0.426   0.003 0.192    0.103    0.282    0.410    0.555    0.835  4156
q[13]    0.859   0.002 0.120    0.549    0.795    0.890    0.951    0.996  5000
q[14]    0.155   0.002 0.142    0.004    0.048    0.115    0.221    0.537  5000
q[15]    0.154   0.002 0.138    0.004    0.049    0.114    0.219    0.514  5000
q[16]    0.484   0.003 0.207    0.134    0.323    0.475    0.628    0.912  3850
q[17]    0.151   0.002 0.136    0.004    0.047    0.112    0.217    0.502  5000
q[18]    0.753   0.003 0.179    0.345    0.633    0.787    0.902    0.989  5000
q[19]    0.153   0.002 0.139    0.004    0.047    0.114    0.217    0.510  4143
q[20]    0.299   0.003 0.180    0.041    0.163    0.270    0.406    0.713  4131
lp__  -137.317   0.110 4.494 -147.398 -140.050 -136.912 -134.106 -129.794  1669
       Rhat
p[1]  1.000
p[2]  1.000
p[3]  1.001
p[4]  1.001
p[5]  1.000
p[6]  1.000
p[7]  1.000
p[8]  1.000
p[9]  1.001
p[10] 1.000
q[1]  1.000
q[2]  1.000
q[3]  1.000
q[4]  1.000
q[5]  1.000
q[6]  1.000
q[7]  1.000
q[8]  1.000
q[9]  1.000
q[10] 1.000
q[11] 1.000
q[12] 1.000
q[13] 1.000
q[14] 1.000
q[15] 1.000
q[16] 1.001
q[17] 1.000
q[18] 1.000
q[19] 1.000
q[20] 1.000
lp__  1.001

Samples were drawn using NUTS(diag_e) at Sun Jul 19 01:08:43 2015.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

In [80]:
plot(samples)


Exercise 6.3.2

Now suppose that three of the answers were not recorded, for whatever reason. Our new data set, with missing data, now takes the form shown in Table 6.2. Bayesian inference will automatically make predictions about these missing values (i.e., “fill in the blanks”) by using the same probabilistic model that generated the observed data. Missing data are entered as nan (“not a number”) in Matlab, and NA (“not available”) in R or WinBUGS. Including the variable k as one to monitor when sampling will then provide posterior values for the missing values. That is, it provides information about the relative likelihood of the missing values being each of the possible alternatives, using the statistical model and the available data. Look through the Matlab or R code to see how all of this is implemented in the second data set. Run the code, and interpret the posterior distributions for the three missing values. Are they reasonable inferences?
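
A minimal sketch of inspecting those posteriors, assuming the cells above were re-run with dset <- 2 (so that the JAGS fit monitors NA.array and the Stan fit monitors na_array). Each missing value is drawn as a Bernoulli success, so its posterior mean is the inferred probability that the missing answer was correct:

# JAGS fit (dset == 2): posterior means of the three missing cells
round(samples$BUGSoutput$mean$NA.array, 3)

# Stan fit (dset == 2): same quantity from the generated quantities
round(colMeans(extract(samples)$na_array), 3)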

6.4 The two-country quiz

JAGS code


In [112]:
# The Two Country Quiz
s <- "
model{
  # Probability of Answering Correctly
  alpha ~ dunif(0,1)    # Match
  beta ~ dunif(0,alpha) # Mismatch   
  # Group Membership For People and Questions
  for (i in 1:nx){
    x[i] ~ dbern(0.5)
    x1[i] <- x[i]+1
  }
  for (j in 1:nz){
    z[j] ~ dbern(0.5)
    z1[j] <- z[j]+1
  }   
  # Probability Correct For Each Person-Question Combination By Groups
  for (i in 1:nx){
    for (j in 1:nz){
      theta[i,j,1,1] <- alpha
      theta[i,j,1,2] <- beta
      theta[i,j,2,1] <- beta
      theta[i,j,2,2] <- alpha
    }
  }   
  # Data Are Bernoulli By Rate
  for (i in 1:nx){
    for (j in 1:nz){
      k[i,j] ~ dbern(theta[i,j,x1[i],z1[j]])
    }
  }   
}
"

In [113]:
s_NA1 <- ""  # model string for dset 2 (not provided in this notebook)

In [114]:
s_NA2 <- ""  # model string for dset 3 (not provided in this notebook)

In [115]:
dset <- 1

if (dset==1)
{
k <- c(1,0,0,1,1,0,0,1,
      1,0,0,1,1,0,0,1,
      0,1,1,0,0,1,0,0,
      0,1,1,0,0,1,1,0,
      1,0,0,1,1,0,0,1,
      0,0,0,1,1,0,0,1,
      0,1,0,0,0,1,1,0,
      0,1,1,1,0,1,1,0)
k <- matrix(k, nrow=8, byrow=T)
}

if (dset==2)
{
k <- c(1,0,0,1,1,0,0,1,
      1,0,0,1,1,0,0,1,
      0,1,1,0,0,1,0,0,
      0,1,1,0,0,1,1,0,
      1,0,0,1,1,0,0,1,
      0,0,0,1,1,0,0,1,
      0,1,0,0,0,1,1,0,
      0,1,1,1,0,1,1,0,
      1,0,0,1,NA,NA,NA,NA,
      0,NA,NA,NA,NA,NA,NA,NA,
      NA,NA,NA,NA,NA,NA,NA,NA)
k <- matrix(k, nrow=11, byrow=T)
}

if (dset==3)
{
k <- c(1,0,0,1,1,0,0,1,
      1,0,0,1,1,0,0,1,
      1,0,0,1,1,0,0,1,
      1,0,0,1,1,0,0,1,
      1,0,0,1,1,0,0,1,
      1,0,0,1,1,0,0,1,
      1,0,0,1,1,0,0,1,
      1,0,0,1,1,0,0,1,
      1,0,0,1,1,0,0,1,
      1,0,0,1,1,0,0,1,
      1,0,0,1,1,0,0,1,
      1,0,0,1,1,0,0,1,
      0,1,1,0,0,1,0,0,
      0,1,1,0,0,1,1,0,
      1,0,0,1,1,0,0,1,
      0,0,0,1,1,0,0,1,
      0,1,0,0,0,1,1,0,
      0,1,1,1,0,1,1,0,
      1,0,0,1,NA,NA,NA,NA,
      0,NA,NA,NA,NA,NA,NA,NA,
      NA,NA,NA,NA,NA,NA,NA,NA)
k <- matrix(k, nrow=21, byrow=T)
}

In [116]:
nx <- nrow(k)
nz <- ncol(k)

In [117]:
data <- list("nx","nz","k") # to be passed on to JAGS

In [118]:
inits <- list(
  list(z = round(runif(nz)), x = round(runif(nx)), alpha=0.5, beta=0.5))

In [119]:
if (dset==1)
{
# parameters to be monitored:
  parameters <- c("z", "x", "alpha", "beta")

  # The following command calls JAGS with specific options.
  # For a detailed description see the R2jags documentation.
  samples <- jags(data, inits, parameters,
                  textConnection(s), n.chains=1, n.iter=2000, 
                  n.burnin=1000, n.thin=1, DIC=T)
  # Now the values for the monitored parameters are in the "samples" object, 
  # ready for inspection.
}

if (dset==2)
{
# parameters to be monitored:
  parameters <- c("z", "x", "alpha", "beta", "NA.LP1", "NA.LP2", "NA.LP3")

  # The following command calls JAGS with specific options.
  # For a detailed description see the R2jags documentation.
  # model.file ="TwoCountryQuiz_NA1.txt"  
  samples <- jags(data, inits, parameters,
                  textConnection(s_NA1), n.chains=1, 
                  n.iter=2000, n.burnin=1000, n.thin=1, DIC=T)
  # Now the values for the monitored parameters are in the "samples" object, 
  # ready for inspection.
}

if (dset==3)
{
# parameters to be monitored:
  parameters <- c("z", "x", "alpha", "beta", "NA.LP1", "NA.LP2", "NA.LP3")

  # The following command calls JAGS with specific options.
  # For a detailed description see the R2jags documentation.
  # model.file ="TwoCountryQuiz_NA2.txt"  
  samples <- jags(data, inits, parameters,
                  textConnection(s_NA2), n.chains=1, 
                  n.iter=2000, n.burnin=1000, n.thin=1, DIC=T)
  # Now the values for the monitored parameters are in the "samples" object, 
  # ready for inspection.
}


Compiling model graph
   Resolving undeclared variables
   Allocating nodes
   Graph Size: 167

Initializing model


In [120]:
samples


Out[120]:
Inference for Bugs model at "4", fit using jags,
 1 chains, each with 2000 iterations (first 1000 discarded)
 n.sims = 1000 iterations saved
         mu.vect sd.vect   2.5%    25%    50%    75%  97.5%
alpha      0.881   0.053  0.758  0.851  0.888  0.920  0.961
beta       0.059   0.038  0.008  0.029  0.051  0.080  0.151
x[1]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
x[2]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
x[3]       0.000   0.000  0.000  0.000  0.000  0.000  0.000
x[4]       0.000   0.000  0.000  0.000  0.000  0.000  0.000
x[5]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
x[6]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
x[7]       0.000   0.000  0.000  0.000  0.000  0.000  0.000
x[8]       0.000   0.000  0.000  0.000  0.000  0.000  0.000
z[1]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[2]       0.000   0.000  0.000  0.000  0.000  0.000  0.000
z[3]       0.000   0.000  0.000  0.000  0.000  0.000  0.000
z[4]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[5]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
z[6]       0.000   0.000  0.000  0.000  0.000  0.000  0.000
z[7]       0.000   0.000  0.000  0.000  0.000  0.000  0.000
z[8]       1.000   0.000  1.000  1.000  1.000  1.000  1.000
deviance  30.841   1.879 28.880 29.400 30.287 31.695 35.812

DIC info (using the rule, pD = var(deviance)/2)
pD = 1.8 and DIC = 32.6
DIC is an estimate of expected predictive error (lower deviance is better).

In [121]:
plot(samples)


Stan code


In [122]:
# Unfortunately, this model is not implemented! Search "double mixture model" 
# on the Stan mailing list to find out why.
# If you have any suggestions about the model, let us know!

Exercise 6.4.3

Now suppose that three extra people enter the room late, and begin to take the quiz. One of them (Late Person 1) has answered the first four questions, the next (Late Person 2) has only answered the first question, and the final new person (Late Person 3) is still sharpening their pencil, and has not started the quiz. This situation can be represented as an updated data set, now with missing data, as in Table 6.4. Interpret the inferences the model makes about the nationality of the late people, and whether or not they will get the unfinished questions correct.

Exercise 6.4.4

Finally, suppose that you are now given the correctness scores for a set of 10 new people, whose data were not previously available, but who form part of the same group of people we are studying. The updated data set is shown in Table 6.5. Interpret the inferences the model makes about the nationality of the new people. Revisit the inferences about the late people, and whether or not they will get the unfinished questions correct. Does the inference drawn by the model for the third late person match your intuition? There is a problem here. How could it be fixed?

6.5 Assessment of malingering

JAGS code


In [123]:
# Malingering
s <- "
model{
  # Each Person Belongs to One of Two Latent Groups
  for (i in 1:p){
    z[i] ~ dbern(0.5)
    z1[i] <- z[i]+1
  }
  # Bona Fide Group has Unknown Success Rate Above Chance
  psi[1] ~ dunif(0.5,1)
  # Malingering Group has Unknown Success Rate Below Bona Fide
  psi[2] ~ dunif(0,psi[1])
  # Data are Binomial with Group Rate for Each Person
  for (i in 1:p){
    theta[i] <- psi[z1[i]]
    k[i] ~ dbin(theta[i],n)
  }
}
"

In [124]:
k <- c(45,45,44,45,44,45,45,45,45,45,30,20,6,44,44,27,25,17,14,27,35,30)
p <- length(k) # number of people
n <- 45        # number of questions

In [125]:
data <- list("p", "k", "n") # to be passed on to JAGS

In [126]:
myinits <- list(
  list(psi = c(0.7,0.5), z = round(runif(p)))) # Initial group assignment

In [127]:
# parameters to be monitored:
parameters <- c("psi","z")

In [128]:
# The following command calls JAGS with specific options.
# For a detailed description see the R2jags documentation.
samples <- jags(data, inits=myinits, parameters,
                textConnection(s), n.chains=1, n.iter=2000, 
                n.burnin=1000, n.thin=1, DIC=T)


Compiling model graph
   Resolving undeclared variables
   Allocating nodes
   Graph Size: 95

Initializing model


In [129]:
samples


Out[129]:
Inference for Bugs model at "6", fit using jags,
 1 chains, each with 2000 iterations (first 1000 discarded)
 n.sims = 1000 iterations saved
         mu.vect sd.vect    2.5%     25%     50%     75%   97.5%
psi[1]     0.991   0.004   0.980   0.988   0.991   0.994   0.997
psi[2]     0.513   0.023   0.467   0.496   0.513   0.529   0.558
z[1]       0.000   0.000   0.000   0.000   0.000   0.000   0.000
z[2]       0.000   0.000   0.000   0.000   0.000   0.000   0.000
z[3]       0.000   0.000   0.000   0.000   0.000   0.000   0.000
z[4]       0.000   0.000   0.000   0.000   0.000   0.000   0.000
z[5]       0.000   0.000   0.000   0.000   0.000   0.000   0.000
z[6]       0.000   0.000   0.000   0.000   0.000   0.000   0.000
z[7]       0.000   0.000   0.000   0.000   0.000   0.000   0.000
z[8]       0.000   0.000   0.000   0.000   0.000   0.000   0.000
z[9]       0.000   0.000   0.000   0.000   0.000   0.000   0.000
z[10]      0.000   0.000   0.000   0.000   0.000   0.000   0.000
z[11]      1.000   0.000   1.000   1.000   1.000   1.000   1.000
z[12]      1.000   0.000   1.000   1.000   1.000   1.000   1.000
z[13]      1.000   0.000   1.000   1.000   1.000   1.000   1.000
z[14]      0.000   0.000   0.000   0.000   0.000   0.000   0.000
z[15]      0.000   0.000   0.000   0.000   0.000   0.000   0.000
z[16]      1.000   0.000   1.000   1.000   1.000   1.000   1.000
z[17]      1.000   0.000   1.000   1.000   1.000   1.000   1.000
z[18]      1.000   0.000   1.000   1.000   1.000   1.000   1.000
z[19]      1.000   0.000   1.000   1.000   1.000   1.000   1.000
z[20]      1.000   0.000   1.000   1.000   1.000   1.000   1.000
z[21]      1.000   0.000   1.000   1.000   1.000   1.000   1.000
z[22]      1.000   0.000   1.000   1.000   1.000   1.000   1.000
deviance 125.618   2.100 123.514 124.096 124.887 126.503 131.355

DIC info (using the rule, pD = var(deviance)/2)
pD = 2.2 and DIC = 127.8
DIC is an estimate of expected predictive error (lower deviance is better).

In [130]:
plot(samples)


Stan code


In [132]:
model <- "
// Malingering
data { 
  int<lower=1> n;
  int<lower=1> p;
  int<lower=0,upper=n> k[p];
}
parameters {
  vector<lower=0,upper=1>[2] psi;
} 
transformed parameters {
  vector[2] lp_parts[p];

  for (i in 1:p) {
    lp_parts[i, 1] <- log(.5) + binomial_log(k[i], n, psi[1]);
    lp_parts[i, 2] <- log(.5) + binomial_log(k[i], n, psi[2]);
  } 
}
model {
  // Bona Fide Group has Unknown Success Rate Above Chance
  psi[1] ~ uniform(.5, 1);
  // Malingering Group has Unknown Success Rate Below Bona Fide
  psi[2] ~ uniform(0, psi[1]);

  for (i in 1:p)
    increment_log_prob(log_sum_exp(lp_parts[i]));    
}
generated quantities {
  int<lower=0,upper=1> z[p];
  
  for (i in 1:p) {
    vector[2] prob;
    prob <- softmax(lp_parts[i]);
    z[i] <- bernoulli_rng(prob[2]);
  }
}"

k <- c(45, 45, 44, 45, 44, 45, 45, 45, 45, 45, 30,
       20, 6, 44, 44, 27, 25, 17, 14, 27, 35, 30)
p <- length(k)  # number of people
n <- 45         # number of questions

data <- list(p=p, k=k, n=n)  # To be passed on to Stan

myinits <- list(
    list(psi=c(.7, .5)))

parameters <- c("psi","z")  # Parameters to be monitored

# The following command calls Stan with specific options.
# For a detailed description type "?rstan".
samples <- stan(model_code=model,   
                data=data, 
                init=myinits,  # If not specified, gives random inits
                pars=parameters,
                iter=2000, 
                chains=1, 
                thin=1,
                # warmup = 100,  # Stands for burn-in; Default = iter/2
                # seed = 123  # Setting seed; Default is random seed
)
# Now the values for the monitored parameters are in the "samples" object, 
# ready for inspection.

print(samples, digits=3)


TRANSLATING MODEL 'model' FROM Stan CODE TO C++ CODE NOW.
COMPILING THE C++ CODE FOR MODEL 'model' NOW.
In file included from file3797de71602.cpp:8:
In file included from /Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/model/model_header.hpp:17:
In file included from /Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/agrad/rev.hpp:5:
/Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/agrad/rev/chainable.hpp:87:17: warning: 'static' function 'set_zero_all_adjoints' declared in header file should be declared 'static inline' [-Wunneeded-internal-declaration]
    static void set_zero_all_adjoints() {
                ^
In file included from file3797de71602.cpp:8:
In file included from /Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/model/model_header.hpp:21:
/Library/Frameworks/R.framework/Versions/3.2/Resources/library/rstan/include//stansrc/stan/io/dump.hpp:26:14: warning: function 'product' is not needed and will not be emitted [-Wunneeded-internal-declaration]
      size_t product(std::vector<size_t> dims) {
             ^
2 warnings generated.

SAMPLING FOR MODEL 'model' NOW (CHAIN 1).

Iteration:    1 / 2000 [  0%]  (Warmup)
Iteration:  200 / 2000 [ 10%]  (Warmup)
Iteration:  400 / 2000 [ 20%]  (Warmup)
Iteration:  600 / 2000 [ 30%]  (Warmup)
Iteration:  800 / 2000 [ 40%]  (Warmup)
Iteration: 1000 / 2000 [ 50%]  (Warmup)
Iteration: 1001 / 2000 [ 50%]  (Sampling)
Iteration: 1200 / 2000 [ 60%]  (Sampling)
Iteration: 1400 / 2000 [ 70%]  (Sampling)
Iteration: 1600 / 2000 [ 80%]  (Sampling)
Iteration: 1800 / 2000 [ 90%]  (Sampling)
Iteration: 2000 / 2000 [100%]  (Sampling)
#  Elapsed Time: 0.172655 seconds (Warm-up)
#                0.178184 seconds (Sampling)
#                0.350839 seconds (Total)

Inference for Stan model: model.
1 chains, each with iter=2000; warmup=1000; thin=1; 
post-warmup draws per chain=1000, total post-warmup draws=1000.

          mean se_mean    sd    2.5%     25%     50%     75%   97.5% n_eff
psi[1]   0.991   0.000 0.004   0.981   0.989   0.991   0.994   0.997   660
psi[2]   0.514   0.001 0.023   0.472   0.498   0.514   0.530   0.561   625
z[1]     0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000  1000
z[2]     0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000  1000
z[3]     0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000  1000
z[4]     0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000  1000
z[5]     0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000  1000
z[6]     0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000  1000
z[7]     0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000  1000
z[8]     0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000  1000
z[9]     0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000  1000
z[10]    0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000  1000
z[11]    1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000  1000
z[12]    1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000  1000
z[13]    1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000  1000
z[14]    0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000  1000
z[15]    0.000   0.000 0.000   0.000   0.000   0.000   0.000   0.000  1000
z[16]    1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000  1000
z[17]    1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000  1000
z[18]    1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000  1000
z[19]    1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000  1000
z[20]    1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000  1000
z[21]    1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000  1000
z[22]    1.000   0.000 0.000   1.000   1.000   1.000   1.000   1.000  1000
lp__   -84.114   0.058 0.982 -86.766 -84.481 -83.799 -83.428 -83.180   290
        Rhat
psi[1] 0.999
psi[2] 1.000
z[1]     NaN
z[2]     NaN
z[3]     NaN
z[4]     NaN
z[5]     NaN
z[6]     NaN
z[7]     NaN
z[8]     NaN
z[9]     NaN
z[10]    NaN
z[11]    NaN
z[12]    NaN
z[13]    NaN
z[14]    NaN
z[15]    NaN
z[16]    NaN
z[17]    NaN
z[18]    NaN
z[19]    NaN
z[20]    NaN
z[21]    NaN
z[22]    NaN
lp__   1.001

Samples were drawn using NUTS(diag_e) at Sun Jul 19 01:42:26 2015.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

In [133]:
plot(samples)


6.6 Individual differences in malingering

JAGS code


In [134]:
# Malingering, with Individual Differences
s <- "
model{
  # Each Person Belongs to One of Two Latent Groups
  for (i in 1:p){
    z[i] ~ dbern(phi) # phi is the Base Rate
    z1[i] <- z[i]+1
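    # z1 = z+1 turns the 0/1 group indicator into a 1/2 column index,
    # since JAGS arrays are indexed from 1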
  }
  # Relatively Uninformative Prior on Base Rate
  phi ~ dbeta(5,5) 
  # Data are Binomial with Rate Given by 
  # Each Person's Group Assignment
  for (i in 1:p){
    k[i] ~ dbin(theta[i,z1[i]],n)
    theta[i,1] ~ dbeta(alpha[1],beta[1])
    theta[i,2] ~ dbeta(alpha[2],beta[2])
  }
  # Transformation to Group Mean and Precision
  alpha[1] <- mubon * lambdabon
  beta[1] <- lambdabon * (1-mubon)
  # Additivity on Logit Scale
  logit(mumal) <- logit(mubon) - mudiff
  alpha[2] <- mumal * lambdamal
  beta[2]  <- lambdamal * (1-mumal)
  # Priors
  mubon ~ dbeta(1,1)
  mudiff ~ dnorm(0,0.5)T(0,) # Constrained to be Positive
  lambdabon ~ dunif(40,800)
  lambdamal ~ dunif(4,100)
}
"

In [135]:
k <- c(45,45,44,45,44,45,45,45,45,45,30,20,6,44,44,27,25,17,14,27,35,30)
p <- length(k) # number of people
n <- 45        # number of questions

data <- list("p", "k", "n") # to be passed on to JAGS
myinits <- list(
  list(z = round(runif(p)), mudiff=0.2),
  list(z = round(runif(p)), mudiff=0.3),
  list(z = round(runif(p)), mudiff=0.4)) 

# parameters to be monitored:
parameters <- c("theta","z","mubon","lambdabon",
               "mumal","lambdamal","mudiff","phi")

# The following command calls JAGS with specific options.
# For a detailed description see the R2jags documentation.
samples <- jags(data, inits=myinits, parameters,
                textConnection(s), n.chains=3, n.iter=10000, 
                n.burnin=1000, n.thin=1, DIC=T)


Compiling model graph
   Resolving undeclared variables
   Allocating nodes
   Graph Size: 156

Initializing model


In [138]:
samples


Out[138]:
Inference for Bugs model at "4", fit using jags,
 3 chains, each with 10000 iterations (first 1000 discarded)
 n.sims = 27000 iterations saved
            mu.vect sd.vect   2.5%     25%     50%     75%   97.5%  Rhat n.eff
lambdabon   430.459 219.873 58.330 242.783 436.679 622.219 782.243 1.004   690
lambdamal    10.007   5.198  4.290   6.403   8.686  12.130  23.269 1.001 12000
mubon         0.986   0.005  0.974   0.983   0.987   0.990   0.994 1.005   510
mudiff        4.175   0.458  3.308   3.871   4.156   4.468   5.107 1.005   440
mumal         0.539   0.059  0.429   0.500   0.537   0.577   0.662 1.002  2400
phi           0.471   0.087  0.302   0.410   0.470   0.531   0.643 1.001 12000
theta[1,1]    0.988   0.007  0.971   0.984   0.989   0.993   0.999 1.002  1300
theta[2,1]    0.988   0.007  0.971   0.984   0.989   0.993   0.999 1.002  1700
theta[3,1]    0.985   0.008  0.966   0.981   0.987   0.991   0.997 1.003  1100
theta[4,1]    0.988   0.007  0.971   0.984   0.989   0.993   0.999 1.002  2400
theta[5,1]    0.985   0.008  0.965   0.981   0.987   0.991   0.997 1.002  1600
theta[6,1]    0.988   0.007  0.971   0.984   0.989   0.993   0.999 1.002  2700
theta[7,1]    0.988   0.007  0.971   0.984   0.989   0.993   0.999 1.002  1800
theta[8,1]    0.988   0.007  0.971   0.984   0.989   0.993   0.999 1.002  1500
theta[9,1]    0.988   0.007  0.970   0.984   0.989   0.993   0.999 1.002  1700
theta[10,1]   0.988   0.007  0.971   0.984   0.989   0.993   0.999 1.002  2500
theta[11,1]   0.986   0.009  0.964   0.982   0.988   0.992   0.998 1.003  1500
theta[12,1]   0.986   0.009  0.965   0.982   0.988   0.992   0.998 1.003  1300
theta[13,1]   0.986   0.009  0.964   0.982   0.988   0.992   0.998 1.002  2400
theta[14,1]   0.985   0.008  0.965   0.981   0.987   0.991   0.997 1.003   990
theta[15,1]   0.985   0.008  0.965   0.981   0.987   0.991   0.997 1.002  1600
theta[16,1]   0.986   0.009  0.964   0.982   0.988   0.992   0.998 1.005  1000
theta[17,1]   0.986   0.009  0.965   0.982   0.988   0.992   0.998 1.002  2900
theta[18,1]   0.986   0.009  0.964   0.982   0.988   0.992   0.998 1.002  1500
theta[19,1]   0.986   0.009  0.965   0.982   0.988   0.992   0.998 1.002  1500
theta[20,1]   0.986   0.009  0.965   0.982   0.988   0.992   0.998 1.003  1300
theta[21,1]   0.986   0.010  0.964   0.982   0.988   0.992   0.998 1.009  1100
theta[22,1]   0.986   0.009  0.964   0.982   0.988   0.992   0.998 1.002  2500
theta[1,2]    0.539   0.174  0.200   0.419   0.540   0.662   0.874 1.001  8900
theta[2,2]    0.540   0.173  0.202   0.423   0.540   0.660   0.880 1.001 16000
theta[3,2]    0.541   0.176  0.201   0.420   0.542   0.661   0.891 1.001 19000
theta[4,2]    0.541   0.172  0.203   0.422   0.541   0.662   0.874 1.001 13000
theta[5,2]    0.542   0.175  0.204   0.420   0.541   0.664   0.890 1.001  5600
theta[6,2]    0.541   0.171  0.207   0.423   0.541   0.661   0.873 1.001  8600
theta[7,2]    0.541   0.172  0.204   0.422   0.540   0.661   0.876 1.001  8900
theta[8,2]    0.541   0.172  0.201   0.423   0.541   0.660   0.875 1.001 27000
theta[9,2]    0.539   0.172  0.206   0.419   0.540   0.660   0.875 1.001 18000
theta[10,2]   0.539   0.173  0.200   0.419   0.539   0.659   0.876 1.001 19000
theta[11,2]   0.644   0.066  0.511   0.600   0.646   0.690   0.766 1.001 27000
theta[12,2]   0.460   0.067  0.330   0.415   0.461   0.506   0.592 1.001 14000
theta[13,2]   0.205   0.060  0.099   0.161   0.200   0.244   0.333 1.001 27000
theta[14,2]   0.543   0.175  0.203   0.421   0.541   0.664   0.893 1.001  3800
theta[15,2]   0.544   0.176  0.201   0.422   0.543   0.665   0.896 1.001 27000
theta[16,2]   0.589   0.067  0.458   0.543   0.590   0.635   0.718 1.001 27000
theta[17,2]   0.552   0.068  0.418   0.506   0.552   0.598   0.682 1.001 27000
theta[18,2]   0.405   0.067  0.276   0.359   0.404   0.451   0.537 1.001 16000
theta[19,2]   0.351   0.066  0.228   0.306   0.350   0.395   0.484 1.001 27000
theta[20,2]   0.589   0.067  0.455   0.544   0.590   0.635   0.717 1.001 13000
theta[21,2]   0.735   0.063  0.605   0.693   0.737   0.779   0.849 1.002 11000
theta[22,2]   0.643   0.066  0.512   0.599   0.645   0.689   0.767 1.002  3300
z[1]          0.002   0.049  0.000   0.000   0.000   0.000   0.000 1.016 12000
z[2]          0.003   0.053  0.000   0.000   0.000   0.000   0.000 1.025  6700
z[3]          0.009   0.094  0.000   0.000   0.000   0.000   0.000 1.009  6300
z[4]          0.002   0.044  0.000   0.000   0.000   0.000   0.000 1.043  5700
z[5]          0.009   0.093  0.000   0.000   0.000   0.000   0.000 1.002 27000
z[6]          0.002   0.043  0.000   0.000   0.000   0.000   0.000 1.006 27000
z[7]          0.003   0.052  0.000   0.000   0.000   0.000   0.000 1.003 27000
z[8]          0.002   0.047  0.000   0.000   0.000   0.000   0.000 1.029  7600
z[9]          0.003   0.054  0.000   0.000   0.000   0.000   0.000 1.008 20000
z[10]         0.002   0.047  0.000   0.000   0.000   0.000   0.000 1.072  2800
z[11]         1.000   0.000  1.000   1.000   1.000   1.000   1.000 1.000     1
z[12]         1.000   0.000  1.000   1.000   1.000   1.000   1.000 1.000     1
z[13]         1.000   0.000  1.000   1.000   1.000   1.000   1.000 1.000     1
z[14]         0.009   0.094  0.000   0.000   0.000   0.000   0.000 1.002 27000
z[15]         0.011   0.102  0.000   0.000   0.000   0.000   0.000 1.006  7500
z[16]         1.000   0.000  1.000   1.000   1.000   1.000   1.000 1.000     1
z[17]         1.000   0.000  1.000   1.000   1.000   1.000   1.000 1.000     1
z[18]         1.000   0.000  1.000   1.000   1.000   1.000   1.000 1.000     1
z[19]         1.000   0.000  1.000   1.000   1.000   1.000   1.000 1.000     1
z[20]         1.000   0.000  1.000   1.000   1.000   1.000   1.000 1.000     1
z[21]         0.999   0.034  1.000   1.000   1.000   1.000   1.000 1.078  4800
z[22]         1.000   0.000  1.000   1.000   1.000   1.000   1.000 1.000     1
deviance     70.704   5.848 61.239  66.528  70.021  74.151  84.136 1.001  9500

For each parameter, n.eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor (at convergence, Rhat=1).

DIC info (using the rule, pD = var(deviance)/2)
pD = 17.1 and DIC = 87.8
DIC is an estimate of expected predictive error (lower deviance is better).

In [139]:
plot(samples)


Stan code


In [140]:
model <- "
// Malingering, with Individual Differences
data { 
  int<lower=1> n;
  int<lower=1> p;
  int<lower=0,upper=n> k[p];
}
parameters {
  real<lower=0,upper=1> phi;
  real<lower=0,upper=1> mubon;
  real<lower=0> mudiff;
  real<lower=40,upper=800> lambdabon;
  real<lower=4,upper=100> lambdamal;
  matrix<lower=0,upper=1>[p,2] theta;
} 
transformed parameters {
  vector[2] lp_parts[p];
  vector<lower=0>[2] alpha;
  vector<lower=0>[2] beta;
  real<lower=0,upper=1> mumal;
    
  // Additivity on Logit Scale
  mumal <- inv_logit(logit(mubon) - mudiff);
  
  // Transformation to Group Mean and Precision
  alpha[1] <- mubon * lambdabon;
  beta[1] <- lambdabon * (1 - mubon);
  alpha[2] <- mumal * lambdamal;
  beta[2] <- lambdamal * (1 - mumal);
    
  // Data are Binomial with Rate Given by 
  // Each Person’s Group Assignment
  for (i in 1:p) {
    lp_parts[i,1] <- log1m(phi) + binomial_log(k[i], n, theta[i,1]);
    lp_parts[i,2] <- log(phi) + binomial_log(k[i], n, theta[i,2]);
  } 
}
model {
  // Priors
  mubon ~ beta(1, 1);  // uniform prior, implicit given the bounds; can be removed
  mudiff ~ normal(0, 1 / sqrt(.5))T[0,];  // JAGS precision .5 -> sd; constrained to be positive
  // Relatively Uninformative Prior on Base Rate
  phi ~ beta(5, 5);
  
  for (i in 1:p)
    theta[i] ~ beta(alpha, beta);
    
  for (i in 1:p)
    increment_log_prob(log_sum_exp(lp_parts[i]));   
}
generated quantities {
  int<lower=0,upper=1> z[p];
    
  for (i in 1:p) {
    vector[2] prob;
    prob <- softmax(lp_parts[i]);
    // Each Person Belongs to One of Two Latent Groups
    z[i] <- bernoulli_rng(prob[2]);
  }
}"

k <- c(45, 45, 44, 45, 44, 45, 45, 45, 45, 45, 30,
       20, 6, 44, 44, 27, 25, 17, 14, 27, 35, 30)
p <- length(k) # number of people
n <- 45        # number of questions

data <- list(p=p, k=k, n=n) # To be passed on to Stan

myinits <- list(
    list(phi=.5, mubon=.5, mudiff=1, lambdabon=400, lambdamal=50,
         theta=matrix(rep(.5, p * 2), p, 2)))

# Parameters to be monitored
parameters <- c("mubon", "lambdabon", "mumal", "lambdamal", "mudiff", 
                "phi", "theta", "z")  

# The following command calls Stan with specific options.
# For a detailed description type "?rstan".
samples <- stan(model_code=model,   
                data=data, 
                init=myinits,  # If not specified, gives random inits
                pars=parameters,
                iter=3000, 
                chains=1, 
                thin=1,
                # warmup = 100,  # Stands for burn-in; Default = iter/2
                # seed = 123  # Setting seed; Default is random seed
)
# Now the values for the monitored parameters are in the "samples" object, 
# ready for inspection.

print(samples, digits=3)


TRANSLATING MODEL 'model' FROM Stan CODE TO C++ CODE NOW.
COMPILING THE C++ CODE FOR MODEL 'model' NOW.

SAMPLING FOR MODEL 'model' NOW (CHAIN 1).

Iteration:    1 / 3000 [  0%]  (Warmup)
Iteration:  300 / 3000 [ 10%]  (Warmup)
Iteration:  600 / 3000 [ 20%]  (Warmup)
Iteration:  900 / 3000 [ 30%]  (Warmup)
Iteration: 1200 / 3000 [ 40%]  (Warmup)
Iteration: 1500 / 3000 [ 50%]  (Warmup)
Iteration: 1501 / 3000 [ 50%]  (Sampling)
Iteration: 1800 / 3000 [ 60%]  (Sampling)
Iteration: 2100 / 3000 [ 70%]  (Sampling)
Iteration: 2400 / 3000 [ 80%]  (Sampling)
Iteration: 2700 / 3000 [ 90%]  (Sampling)
Iteration: 3000 / 3000 [100%]  (Sampling)
#  Elapsed Time: 4.60732 seconds (Warm-up)
#                4.15804 seconds (Sampling)
#                8.76535 seconds (Total)

Inference for Stan model: model.
1 chains, each with iter=3000; warmup=1500; thin=1; 
post-warmup draws per chain=1500, total post-warmup draws=1500.

               mean se_mean      sd     2.5%      25%     50%     75%   97.5%
mubon         0.986   0.000   0.005    0.975    0.983   0.987   0.990   0.994
lambdabon   416.076  18.127 218.429   57.424  225.829 411.435 602.424 780.577
mumal         0.535   0.002   0.058    0.422    0.497   0.535   0.573   0.653
lambdamal     9.974   0.245   5.014    4.347    6.454   8.703  12.084  23.478
mudiff        4.181   0.032   0.443    3.406    3.866   4.168   4.476   5.112
phi           0.470   0.002   0.089    0.293    0.410   0.471   0.531   0.645
theta[1,1]    0.988   0.000   0.008    0.970    0.984   0.989   0.994   0.999
theta[1,2]    0.532   0.006   0.175    0.206    0.410   0.529   0.657   0.869
theta[2,1]    0.988   0.000   0.008    0.969    0.983   0.989   0.993   0.999
theta[2,2]    0.541   0.004   0.174    0.196    0.420   0.547   0.662   0.880
theta[3,1]    0.985   0.000   0.008    0.966    0.981   0.986   0.991   0.996
theta[3,2]    0.539   0.005   0.177    0.195    0.418   0.542   0.665   0.883
theta[4,1]    0.988   0.000   0.007    0.971    0.984   0.990   0.994   0.999
theta[4,2]    0.534   0.005   0.173    0.189    0.416   0.532   0.658   0.847
theta[5,1]    0.985   0.000   0.009    0.964    0.981   0.987   0.991   0.997
theta[5,2]    0.539   0.005   0.176    0.200    0.420   0.541   0.662   0.890
theta[6,1]    0.988   0.000   0.008    0.971    0.984   0.989   0.994   0.999
theta[6,2]    0.535   0.006   0.174    0.195    0.414   0.547   0.656   0.880
theta[7,1]    0.988   0.000   0.007    0.973    0.984   0.989   0.993   0.999
theta[7,2]    0.534   0.005   0.173    0.193    0.416   0.532   0.652   0.871
theta[8,1]    0.988   0.000   0.008    0.969    0.984   0.989   0.993   0.999
theta[8,2]    0.532   0.004   0.166    0.196    0.419   0.531   0.653   0.844
theta[9,1]    0.988   0.000   0.007    0.970    0.984   0.989   0.994   0.998
theta[9,2]    0.537   0.004   0.160    0.208    0.430   0.544   0.644   0.841
theta[10,1]   0.988   0.000   0.007    0.970    0.984   0.989   0.994   0.999
theta[10,2]   0.533   0.005   0.176    0.181    0.409   0.530   0.652   0.880
theta[11,1]   0.986   0.000   0.009    0.966    0.982   0.988   0.992   0.998
theta[11,2]   0.642   0.002   0.062    0.515    0.601   0.642   0.685   0.754
theta[12,1]   0.986   0.000   0.010    0.965    0.981   0.987   0.992   0.998
theta[12,2]   0.460   0.002   0.067    0.332    0.415   0.458   0.505   0.593
theta[13,1]   0.986   0.000   0.009    0.965    0.982   0.988   0.993   0.998
theta[13,2]   0.204   0.002   0.060    0.101    0.161   0.199   0.243   0.328
theta[14,1]   0.985   0.000   0.008    0.965    0.981   0.987   0.991   0.996
theta[14,2]   0.539   0.004   0.169    0.209    0.423   0.538   0.661   0.862
theta[15,1]   0.985   0.000   0.008    0.965    0.981   0.986   0.991   0.997
theta[15,2]   0.530   0.005   0.171    0.186    0.413   0.530   0.646   0.863
theta[16,1]   0.986   0.000   0.010    0.962    0.982   0.988   0.992   0.998
theta[16,2]   0.590   0.002   0.070    0.452    0.541   0.591   0.639   0.724
theta[17,1]   0.986   0.000   0.009    0.964    0.981   0.987   0.993   0.998
theta[17,2]   0.550   0.002   0.069    0.415    0.504   0.552   0.598   0.682
theta[18,1]   0.986   0.000   0.009    0.965    0.982   0.987   0.992   0.998
theta[18,2]   0.408   0.002   0.066    0.285    0.361   0.405   0.454   0.540
theta[19,1]   0.986   0.000   0.009    0.965    0.982   0.987   0.992   0.999
theta[19,2]   0.351   0.002   0.068    0.222    0.303   0.348   0.396   0.487
theta[20,1]   0.986   0.000   0.008    0.966    0.981   0.987   0.992   0.998
theta[20,2]   0.589   0.002   0.066    0.458    0.545   0.591   0.637   0.708
theta[21,1]   0.986   0.000   0.011    0.964    0.982   0.988   0.992   0.999
theta[21,2]   0.736   0.002   0.065    0.600    0.693   0.740   0.783   0.850
theta[22,1]   0.986   0.000   0.010    0.964    0.982   0.988   0.992   0.998
theta[22,2]   0.643   0.002   0.066    0.510    0.598   0.644   0.692   0.765
z[1]          0.002   0.001   0.045    0.000    0.000   0.000   0.000   0.000
z[2]          0.003   0.002   0.058    0.000    0.000   0.000   0.000   0.000
z[3]          0.009   0.004   0.096    0.000    0.000   0.000   0.000   0.000
z[4]          0.001   0.001   0.037    0.000    0.000   0.000   0.000   0.000
z[5]          0.011   0.003   0.103    0.000    0.000   0.000   0.000   0.000
z[6]          0.002   0.001   0.045    0.000    0.000   0.000   0.000   0.000
z[7]          0.001   0.001   0.037    0.000    0.000   0.000   0.000   0.000
z[8]          0.001   0.001   0.037    0.000    0.000   0.000   0.000   0.000
z[9]          0.000   0.000   0.000    0.000    0.000   0.000   0.000   0.000
z[10]         0.002   0.001   0.045    0.000    0.000   0.000   0.000   0.000
z[11]         1.000   0.000   0.000    1.000    1.000   1.000   1.000   1.000
z[12]         1.000   0.000   0.000    1.000    1.000   1.000   1.000   1.000
z[13]         1.000   0.000   0.000    1.000    1.000   1.000   1.000   1.000
z[14]         0.007   0.003   0.085    0.000    0.000   0.000   0.000   0.000
z[15]         0.009   0.002   0.093    0.000    0.000   0.000   0.000   0.000
z[16]         1.000   0.000   0.000    1.000    1.000   1.000   1.000   1.000
z[17]         1.000   0.000   0.000    1.000    1.000   1.000   1.000   1.000
z[18]         1.000   0.000   0.000    1.000    1.000   1.000   1.000   1.000
z[19]         1.000   0.000   0.000    1.000    1.000   1.000   1.000   1.000
z[20]         1.000   0.000   0.000    1.000    1.000   1.000   1.000   1.000
z[21]         0.997   0.002   0.052    1.000    1.000   1.000   1.000   1.000
z[22]         1.000   0.000   0.000    1.000    1.000   1.000   1.000   1.000
lp__        -99.752   1.316  11.426 -126.228 -106.199 -98.201 -91.737 -81.630
            n_eff  Rhat
mubon         184 1.021
lambdabon     145 1.000
mumal         586 1.001
lambdamal     418 1.002
mudiff        188 1.018
phi          1500 1.001
theta[1,1]    299 1.010
theta[1,2]    949 0.999
theta[2,1]    342 1.012
theta[2,2]   1500 1.001
theta[3,1]    594 1.007
theta[3,2]   1500 1.000
theta[4,1]    254 1.010
theta[4,2]   1087 1.001
theta[5,1]    454 1.015
theta[5,2]   1103 0.999
theta[6,1]    268 1.008
theta[6,2]    861 1.000
theta[7,1]    256 1.006
theta[7,2]   1077 1.000
theta[8,1]    269 1.007
theta[8,2]   1500 0.999
theta[9,1]    345 1.010
theta[9,2]   1500 0.999
theta[10,1]   411 1.004
theta[10,2]  1500 0.999
theta[11,1]   464 1.010
theta[11,2]   965 1.000
theta[12,1]   697 1.007
theta[12,2]  1500 0.999
theta[13,1]   558 1.015
theta[13,2]   858 1.000
theta[14,1]   509 1.010
theta[14,2]  1500 1.001
theta[15,1]   608 1.009
theta[15,2]   998 1.003
theta[16,1]   540 1.014
theta[16,2]   930 1.000
theta[17,1]   457 1.007
theta[17,2]  1500 1.000
theta[18,1]   610 1.006
theta[18,2]  1500 1.001
theta[19,1]   581 1.002
theta[19,2]  1240 0.999
theta[20,1]   377 1.010
theta[20,2]  1500 1.000
theta[21,1]   513 1.006
theta[21,2]  1030 1.000
theta[22,1]   636 1.004
theta[22,2]  1088 1.000
z[1]          902 1.000
z[2]         1076 1.003
z[3]          676 1.006
z[4]         1500 0.999
z[5]          974 0.999
z[6]         1500 1.000
z[7]         1500 0.999
z[8]          751 1.001
z[9]         1500   NaN
z[10]         901 1.000
z[11]        1500   NaN
z[12]        1500   NaN
z[13]        1500   NaN
z[14]        1111 0.999
z[15]        1500 0.999
z[16]        1500   NaN
z[17]        1500   NaN
z[18]        1500   NaN
z[19]        1500   NaN
z[20]        1500   NaN
z[21]         602 1.000
z[22]        1500   NaN
lp__           75 1.006

Samples were drawn using NUTS(diag_e) at Sun Jul 19 01:48:45 2015.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

In [141]:
plot(samples)


6.7 Alzheimer’s recall test cheating

JAGS code


In [157]:
# Cheating Latent Mixture Model
s <- "
model{
  # Each Person Belongs to One of Two Latent Groups
  for (i in 1:p){
     z[i] ~ dbern(phi) # phi is the Base Rate
     z1[i] <- z[i]+1
  }
  # Relatively Uninformative Prior on Base Rate
  phi ~ dbeta(5,5) 
  # Data are Binomial with Rate Given by 
  # Each Person's Group Assignment
  for (i in 1:p){
    k[i] ~ dbin(theta[i,z1[i]],n)
    thetatmp[i,1] ~ dbeta(alpha[1],beta[1])
    theta[i,1] <- max(.01,min(.99,thetatmp[i,1]))
    thetatmp[i,2] ~ dbeta(alpha[2],beta[2])
    theta[i,2] <- max(.01,min(.99,thetatmp[i,2]))
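    # theta is clipped to [.01,.99]: beta draws numerically equal to 0 or 1
    # would give the sampler zero or infinite densities (compare the
    # "Slicer stuck at value with infinite density" error further below)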
  }
  # Transformation to Group Mean and Precision
  alpha[1] <- mubon * lambdabon
  beta[1] <- lambdabon * (1-mubon)
  # Additivity on Logit Scale
  logit(muche) <- logit(mubon) + mudiff # Note the +
  alpha[2] <- muche * lambdache
  beta[2] <- lambdache * (1-muche)
  # Priors
  mubon ~ dbeta(1,1)
  mudiff ~ dnorm(0,0.5)T(0,) # Constrained to be Positive
  lambdabon ~ dunif(5,50)
  lambdache ~ dunif(5,50)
  # Correct Count
  for (i in 1:p){
    pct[i] <- equals(z[i],truth[i])
  }
  pc <- sum(pct[1:p])
}
"

In [158]:
cheat.dat <- read.table("cheat.csv",header=F,sep=",")

In [159]:
cheatt.dat <- read.table("cheatt.csv",header=F,sep="")

In [160]:
truth <- cheatt.dat$V1  # truth = 1 if cheater
truth


Out[160]:
  [1] 1 0 1 0 0 0 1 1 0 0 1 0 1 1 0 1 0 0 1 1 0 0 1 0 1 0 1 0 1 0
 [31] 1 0 1 0 0 1 1 0 0 1 1 1 0 0 0 1 0 1 1 0 0 1 0 1 1 0 0 1 1 0
 [61] 1 0 1 0 0 1 0 0 1 0 1 0 1 0 1 0 1 0 1 0 0 1 0 1 0 1 0 1 0 1
 [91] 1 0 1 0 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1 0 1

In [161]:
k <- apply(cheat.dat,1,sum) # total correct per participant
p <- length(k) # number of people
n <- 40        # total trials

data <- list("p", "k", "n", "truth") # to be passed on to JAGS
myinits <- list(
  list(z = round(runif(p)), mudiff=0.1, phi=0.5, mubon=0.5, lambdabon=30, lambdache=25),
  list(z = round(runif(p)), mudiff=0.15, phi=0.5, mubon=0.5, lambdabon=25, lambdache=30)
  ) 

#myinits <- list(
#  list(z = round(runif(p)), mudiff=0.1, phi=0.5, mubon=0.5, lambdabon=10, lambdache=10),
#  list(z = round(runif(p)), mudiff=0.2, phi=0.5, mubon=0.5, lambdabon=10, lambdache=10)
#  ) 

# parameters to be monitored:
parameters <- c("theta","z","mubon","lambdabon",
               "muche","lambdache","mudiff","phi","alpha","beta","pc")

set.seed(3) # some chains result in an "undefined real result" error -- see the box and exercise in the book

# The following command calls JAGS with specific options.
# For a detailed description see the R2jags documentation.
samples = jags(data, inits=myinits, parameters,
               textConnection(s), n.chains=2, n.iter=6000, 
               n.burnin=3000, n.thin=1, DIC=T)


Compiling model graph
   Resolving undeclared variables
   Allocating nodes
   Graph Size: 1441

Initializing model


Deleting model

Error: Error in node lambdache
Slicer stuck at value with infinite density



In [156]:
samples$summary  # NULL here: an R2jags fit keeps its summary table in samples$BUGSoutput$summary


Out[156]:
NULL

In [162]:
pc <- samples$BUGSoutput$sims.list$pc / p  # to get proportion correct
mean(pc)


Out[162]:
0.737137005649718

In [164]:
# plot 6.9
# make the two-panel plot:
# windows(width=8,height=6) # this command works only under Windows!
layout(matrix(c(1,2),2,1))
layout.show(2)
par(cex.main = 1.5, mar = c(5, 6, 4, 5) + 0.1, mgp = c(3.5, 1, 0), cex.lab = 1.5,
    font.lab = 2, cex.axis = 1.3, bty = "n", las=1)
bins <- c(-1:n)+.5
bonafide <- hist(k[truth==0], breaks=bins, plot=F)$counts
cheat    <- hist(k[truth==1], breaks=bins, plot=F)$counts

counts <- rbind(bonafide, cheat)
barplot(counts, main=" ", xlab=" ", col=c("grey","white"),
  legend.text = c("Bona Fide","Cheater"), args.legend = list(x="topleft"),
  beside=TRUE, axes=F)

# bottom panel:
par(cex.main = 1.5, mar = c(5, 6, 4, 5) + 0.1, mgp = c(3.5, 1, 0), cex.lab = 1.5,
    font.lab = 2, cex.axis = 1.3, bty = "n", las=1)
pc.line <- array()
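# pc.line[i]: accuracy of a simple cutoff rule that classifies everyone
# with k >= i-1 recalled items as a cheater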
for (i in 1:41)
{
  pc.line[i] <- mean((k>=(i-1))==truth)
}



In [165]:
#dev.new() # so the plot below does not overwrite the plot above

plot(c(0:40), pc.line, type="l", lwd=2, xlim=c(0,40), ylim=c(0.4,1), 
     xlab="Number of Items Recalled Correctly", 
     ylab=" ", axes=F)
axis(1, at=c(0,seq(from=5,by=5,to=40)))
axis(2, at=c(.5,.75,1))
par(las=0)
mtext("Prop. Correct",side=2, line=2.5,cex=1.5)
# Now add the distribution:
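# pc holds the posterior samples of the model's classification accuracy;
# the polygon draws their density sideways (density on the x-axis,
# accuracy on the y-axis) for comparison against the cutoff line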
pc.dens <- density(pc)
polygon(c(0,pc.dens$y,0,0), c(pc.dens$x[1]-.01,pc.dens$x,pc.dens$x[1]+.01,pc.dens$x[1]-.01), col="green")



In [166]:
# plot 6.10
par(cex.main = 1.5, mar = c(5, 6, 4, 5) + 0.1, mgp = c(3.5, 1, 0), cex.lab = 1.5,
    font.lab = 2, cex.axis = 1.3, bty = "n", las=1)
plot(k,samples$BUGSoutput$mean$z,ylim=c(0,1),xlim=c(0,n), xlab= "Number of Items Recalled Correctly", 
     ylab="Cheater Classification", lwd=2, pch=4) 
# in the code, z=0 means bona fide and z=1 means cheater, so the posterior
# mean of z gives the probability of being assigned to the cheating group


Stan code


In [167]:
model <- "
// Cheating Latent Mixture Model
data { 
  int<lower=1> n;
  int<lower=1> p;
  int<lower=1,upper=n> k[p];
  int<lower=0,upper=1> truth[p];
}
parameters {
  real<lower=0,upper=1> phi;
  real<lower=0,upper=1> mubon;
  real<lower=0> mudiff;
  real<lower=5,upper=50> lambdabon;
  real<lower=5,upper=50> lambdache;
  vector<lower=0,upper=1>[p] theta;
} 
transformed parameters {
  vector[2] lp_parts[p];
  vector<lower=0>[2] alpha;
  vector<lower=0>[2] beta;
  real<lower=0,upper=1> muche;
    
  // Additivity on Logit Scale
  muche <- inv_logit(logit(mubon) + mudiff);
  
  // Transformation to Group Mean and Precision
  alpha[1] <- mubon * lambdabon;
  beta[1] <- lambdabon * (1 - mubon);
  alpha[2] <- muche * lambdache;
  beta[2]  <- lambdache * (1 - muche);
    
  // Data are Binomial with Rate Given by 
  // Each Person’s Group Assignment
  for (i in 1:p) {
    lp_parts[i,1] <- log1m(phi) + beta_log(theta[i], alpha[1], beta[1]);
    lp_parts[i,2] <- log(phi) + beta_log(theta[i], alpha[2], beta[2]);
  } 
}
model {
  // Priors
  mubon ~ beta(1, 1);  // uniform prior, implicit given the bounds; can be removed
  mudiff ~ normal(0, 1 / sqrt(.5))T[0,];  // JAGS precision .5 -> sd; constrained to be positive
  // Relatively Uninformative Prior on Base Rate
  phi ~ beta(5, 5);
    
  for (i in 1:p)  
    increment_log_prob(log_sum_exp(lp_parts[i]));    
    
  k ~ binomial(n, theta);
}
generated quantities {
  int<lower=0,upper=1> z[p];
  real pc;
  vector[p] pct;
  
  for (i in 1:p) {
    vector[2] prob;
    prob <- softmax(lp_parts[i]);
    // Each Person Belongs to One of Two Latent Groups
    z[i] <- bernoulli_rng(prob[2]);
    // Correct Count
    pct[i] <- if_else(z[i] == truth[i], 1, 0);
  }
  pc <- sum(pct);
}"

cheat.dat  <- read.table("cheat.csv", header=F, sep=",")
cheatt.dat <- read.table("cheatt.csv", header=F, sep="")
truth <- cheatt.dat$V1  # truth = 1 if cheater
k <- apply(cheat.dat, 1, sum)  # total correct per participant
p <- length(k)  # number of people
n <- 40         # total trials

data <- list(p=p, k=k, n=n, truth=truth) # To be passed on to Stan

myinits <- list(
  list(mudiff=.1, phi=.5, mubon=.5, lambdabon=30, lambdache=25, 
       theta=rep(.5, p)),
  list(mudiff=.15, phi=.5, mubon=.5, lambdabon=25, lambdache=30,
       theta=rep(.5, p))) 

# Parameters to be monitored:
parameters <- c("theta", "z", "mubon", "lambdabon", "muche", "lambdache", 
                "mudiff", "phi", "alpha", "beta", "pc") 

# The following command calls Stan with specific options.
# For a detailed description type "?rstan".
samples <- stan(model_code=model,   
                data=data, 
                init=myinits,  # If not specified, gives random inits
                pars=parameters,
                iter=1000, 
                chains=2, 
                thin=1,
                warmup = 200,  # Stands for burn-in; Default = iter/2
                # seed = 123  # Setting seed; Default is random seed
)
# Now the values for the monitored parameters are in the "samples" object, 
# ready for inspection.


TRANSLATING MODEL 'model' FROM Stan CODE TO C++ CODE NOW.
COMPILING THE C++ CODE FOR MODEL 'model' NOW.

SAMPLING FOR MODEL 'model' NOW (CHAIN 1).

Iteration:   1 / 1000 [  0%]  (Warmup)
Iteration: 100 / 1000 [ 10%]  (Warmup)
Iteration: 200 / 1000 [ 20%]  (Warmup)
Iteration: 201 / 1000 [ 20%]  (Sampling)
Iteration: 300 / 1000 [ 30%]  (Sampling)
Iteration: 400 / 1000 [ 40%]  (Sampling)
Iteration: 500 / 1000 [ 50%]  (Sampling)
Iteration: 600 / 1000 [ 60%]  (Sampling)
Iteration: 700 / 1000 [ 70%]  (Sampling)
Iteration: 800 / 1000 [ 80%]  (Sampling)
Iteration: 900 / 1000 [ 90%]  (Sampling)
Iteration: 1000 / 1000 [100%]  (Sampling)
#  Elapsed Time: 4.11211 seconds (Warm-up)
#                9.31429 seconds (Sampling)
#                13.4264 seconds (Total)


SAMPLING FOR MODEL 'model' NOW (CHAIN 2).

Iteration:   1 / 1000 [  0%]  (Warmup)
Iteration: 100 / 1000 [ 10%]  (Warmup)
Iteration: 200 / 1000 [ 20%]  (Warmup)
Iteration: 201 / 1000 [ 20%]  (Sampling)
Iteration: 300 / 1000 [ 30%]  (Sampling)
Iteration: 400 / 1000 [ 40%]  (Sampling)
Iteration: 500 / 1000 [ 50%]  (Sampling)
Iteration: 600 / 1000 [ 60%]  (Sampling)
Iteration: 700 / 1000 [ 70%]  (Sampling)
Iteration: 800 / 1000 [ 80%]  (Sampling)
Iteration: 900 / 1000 [ 90%]  (Sampling)
Iteration: 1000 / 1000 [100%]  (Sampling)
#  Elapsed Time: 3.70028 seconds (Warm-up)
#                6.25269 seconds (Sampling)
#                9.95297 seconds (Total)


In [168]:
print(samples)


Inference for Stan model: model.
2 chains, each with iter=1000; warmup=200; thin=1; 
post-warmup draws per chain=800, total post-warmup draws=1600.

               mean se_mean    sd     2.5%      25%      50%      75%    97.5%
theta[1]       0.67    0.00  0.07     0.53     0.63     0.67     0.72     0.80
theta[2]       0.61    0.00  0.07     0.48     0.57     0.62     0.66     0.74
theta[3]       0.96    0.00  0.03     0.89     0.95     0.97     0.98     1.00
theta[4]       0.84    0.00  0.06     0.71     0.80     0.84     0.88     0.93
theta[5]       0.71    0.00  0.06     0.58     0.67     0.71     0.75     0.83
theta[6]       0.61    0.00  0.07     0.48     0.57     0.62     0.66     0.75
theta[7]       0.89    0.00  0.05     0.78     0.86     0.89     0.92     0.96
theta[8]       0.79    0.00  0.06     0.67     0.75     0.79     0.83     0.89
theta[9]       0.73    0.00  0.06     0.60     0.69     0.73     0.77     0.84
theta[10]      0.77    0.00  0.06     0.64     0.72     0.77     0.81     0.88
theta[11]      0.99    0.00  0.02     0.93     0.98     0.99     1.00     1.00
theta[12]      0.61    0.00  0.06     0.49     0.57     0.61     0.66     0.73
theta[13]      0.99    0.00  0.02     0.93     0.98     0.99     1.00     1.00
theta[14]      0.96    0.00  0.03     0.89     0.95     0.97     0.98     1.00
theta[15]      0.65    0.00  0.07     0.51     0.61     0.65     0.70     0.77
theta[16]      0.99    0.00  0.02     0.94     0.98     0.99     1.00     1.00
theta[17]      0.83    0.00  0.06     0.71     0.80     0.84     0.88     0.93
theta[18]      0.73    0.00  0.06     0.61     0.69     0.73     0.77     0.83
theta[19]      0.94    0.00  0.04     0.84     0.92     0.95     0.97     0.99
theta[20]      0.91    0.00  0.05     0.80     0.89     0.92     0.95     0.98
theta[21]      0.81    0.00  0.06     0.68     0.77     0.82     0.86     0.92
theta[22]      0.75    0.00  0.06     0.62     0.71     0.75     0.79     0.86
theta[23]      0.91    0.00  0.04     0.81     0.89     0.92     0.95     0.98
theta[24]      0.77    0.00  0.06     0.64     0.73     0.77     0.81     0.87
theta[25]      0.99    0.00  0.02     0.94     0.98     0.99     1.00     1.00
theta[26]      0.84    0.00  0.06     0.71     0.80     0.84     0.87     0.93
theta[27]      0.94    0.00  0.04     0.84     0.91     0.94     0.97     0.99
theta[28]      0.61    0.00  0.06     0.48     0.57     0.62     0.66     0.73
theta[29]      0.96    0.00  0.03     0.89     0.95     0.97     0.98     1.00
theta[30]      0.54    0.00  0.07     0.41     0.49     0.54     0.59     0.66
theta[31]      0.86    0.00  0.05     0.75     0.83     0.86     0.90     0.94
theta[32]      0.67    0.00  0.06     0.53     0.63     0.67     0.71     0.79
theta[33]      0.94    0.00  0.04     0.84     0.92     0.95     0.97     0.99
theta[34]      0.52    0.00  0.07     0.38     0.47     0.52     0.57     0.67
theta[35]      0.77    0.00  0.06     0.64     0.73     0.77     0.81     0.88
theta[36]      0.91    0.00  0.05     0.80     0.89     0.92     0.95     0.98
theta[37]      0.99    0.00  0.02     0.94     0.98     0.99     1.00     1.00
theta[38]      0.83    0.00  0.06     0.70     0.79     0.84     0.88     0.94
theta[39]      0.81    0.00  0.06     0.69     0.77     0.81     0.85     0.91
theta[40]      0.99    0.00  0.02     0.94     0.98     0.99     1.00     1.00
theta[41]      0.89    0.00  0.05     0.78     0.85     0.89     0.92     0.96
theta[42]      0.86    0.00  0.05     0.75     0.83     0.87     0.90     0.95
theta[43]      0.58    0.00  0.07     0.45     0.53     0.58     0.62     0.70
theta[44]      0.91    0.00  0.05     0.81     0.89     0.92     0.94     0.98
theta[45]      0.61    0.00  0.07     0.48     0.57     0.62     0.66     0.74
theta[46]      0.99    0.00  0.02     0.94     0.98     0.99     1.00     1.00
theta[47]      0.96    0.00  0.03     0.89     0.95     0.97     0.99     1.00
theta[48]      0.91    0.00  0.04     0.81     0.89     0.92     0.94     0.98
theta[49]      0.89    0.00  0.05     0.77     0.85     0.89     0.92     0.96
theta[50]      0.84    0.00  0.05     0.72     0.80     0.84     0.88     0.93
theta[51]      0.77    0.00  0.06     0.65     0.73     0.77     0.81     0.88
theta[52]      0.79    0.00  0.06     0.66     0.75     0.79     0.83     0.90
theta[53]      0.83    0.00  0.06     0.72     0.80     0.84     0.88     0.93
theta[54]      0.75    0.00  0.06     0.63     0.71     0.75     0.79     0.86
theta[55]      0.99    0.00  0.02     0.93     0.98     0.99     1.00     1.00
theta[56]      0.63    0.00  0.06     0.50     0.59     0.64     0.68     0.76
theta[57]      0.81    0.00  0.06     0.69     0.77     0.81     0.85     0.91
theta[58]      0.99    0.00  0.02     0.93     0.98     0.99     1.00     1.00
theta[59]      0.99    0.00  0.02     0.93     0.98     0.99     1.00     1.00
theta[60]      0.61    0.00  0.07     0.47     0.57     0.62     0.66     0.74
theta[61]      0.96    0.00  0.03     0.89     0.95     0.97     0.98     1.00
theta[62]      0.60    0.00  0.07     0.46     0.55     0.60     0.64     0.72
theta[63]      0.62    0.00  0.06     0.49     0.58     0.62     0.66     0.73
theta[64]      0.79    0.00  0.06     0.66     0.75     0.80     0.84     0.90
theta[65]      0.91    0.00  0.05     0.79     0.88     0.92     0.95     0.98
theta[66]      0.99    0.00  0.02     0.93     0.98     0.99     1.00     1.00
theta[67]      0.65    0.00  0.07     0.52     0.61     0.65     0.70     0.78
theta[68]      0.83    0.00  0.06     0.71     0.80     0.84     0.87     0.93
theta[69]      0.99    0.00  0.02     0.94     0.98     0.99     1.00     1.00
theta[70]      0.69    0.00  0.06     0.56     0.65     0.69     0.73     0.81
theta[71]      0.86    0.00  0.05     0.74     0.83     0.87     0.90     0.95
theta[72]      0.79    0.00  0.06     0.66     0.75     0.79     0.83     0.89
theta[73]      0.99    0.00  0.02     0.94     0.98     0.99     1.00     1.00
theta[74]      0.88    0.00  0.05     0.77     0.85     0.89     0.92     0.96
theta[75]      0.99    0.00  0.02     0.93     0.98     0.99     1.00     1.00
theta[76]      0.73    0.00  0.07     0.59     0.68     0.73     0.78     0.85
theta[77]      0.75    0.00  0.06     0.62     0.70     0.75     0.79     0.86
theta[78]      0.86    0.00  0.05     0.74     0.83     0.86     0.90     0.95
theta[79]      0.96    0.00  0.03     0.89     0.95     0.97     0.99     1.00
theta[80]      0.71    0.00  0.06     0.59     0.67     0.71     0.75     0.82
theta[81]      0.69    0.00  0.07     0.55     0.64     0.69     0.73     0.81
theta[82]      0.99    0.00  0.02     0.93     0.98     0.99     1.00     1.00
theta[83]      0.79    0.00  0.06     0.66     0.75     0.79     0.83     0.89
theta[84]      0.99    0.00  0.02     0.94     0.98     0.99     1.00     1.00
theta[85]      0.65    0.00  0.07     0.52     0.61     0.65     0.70     0.77
theta[86]      0.75    0.00  0.06     0.62     0.71     0.75     0.79     0.86
theta[87]      0.67    0.00  0.07     0.54     0.62     0.67     0.71     0.79
theta[88]      0.91    0.00  0.05     0.80     0.88     0.92     0.94     0.98
theta[89]      0.50    0.00  0.07     0.36     0.45     0.50     0.55     0.64
theta[90]      0.94    0.00  0.04     0.84     0.92     0.95     0.97     0.99
theta[91]      0.74    0.00  0.06     0.62     0.70     0.75     0.79     0.86
theta[92]      0.81    0.00  0.06     0.69     0.78     0.82     0.85     0.91
theta[93]      0.88    0.00  0.05     0.77     0.85     0.89     0.92     0.97
theta[94]      0.81    0.00  0.06     0.68     0.77     0.82     0.86     0.92
theta[95]      0.69    0.00  0.06     0.56     0.65     0.69     0.73     0.81
theta[96]      0.94    0.00  0.04     0.85     0.92     0.94     0.97     0.99
theta[97]      0.84    0.00  0.05     0.72     0.80     0.84     0.88     0.93
theta[98]      0.99    0.00  0.02     0.94     0.98     0.99     1.00     1.00
theta[99]      0.56    0.00  0.06     0.42     0.52     0.56     0.60     0.68
theta[100]     0.99    0.00  0.02     0.94     0.98     0.99     1.00     1.00
theta[101]     0.65    0.00  0.06     0.52     0.61     0.65     0.70     0.77
theta[102]     0.94    0.00  0.04     0.85     0.92     0.95     0.97     0.99
theta[103]     0.56    0.00  0.07     0.43     0.52     0.56     0.60     0.69
theta[104]     0.91    0.00  0.04     0.81     0.89     0.92     0.94     0.98
theta[105]     0.79    0.00  0.06     0.66     0.75     0.79     0.84     0.90
theta[106]     0.99    0.00  0.02     0.94     0.98     0.99     1.00     1.00
theta[107]     0.58    0.00  0.07     0.44     0.53     0.58     0.62     0.70
theta[108]     0.99    0.00  0.02     0.94     0.98     0.99     1.00     1.00
theta[109]     0.83    0.00  0.06     0.70     0.80     0.84     0.88     0.93
theta[110]     0.99    0.00  0.02     0.94     0.98     0.99     1.00     1.00
theta[111]     0.77    0.00  0.06     0.64     0.73     0.77     0.81     0.88
theta[112]     0.91    0.00  0.05     0.80     0.88     0.92     0.95     0.98
theta[113]     0.69    0.00  0.06     0.56     0.65     0.69     0.73     0.81
theta[114]     0.84    0.00  0.06     0.71     0.80     0.84     0.88     0.94
theta[115]     0.88    0.00  0.05     0.77     0.85     0.89     0.92     0.96
theta[116]     0.88    0.00  0.05     0.77     0.85     0.89     0.92     0.96
theta[117]     0.58    0.00  0.07     0.44     0.53     0.58     0.63     0.71
theta[118]     0.67    0.00  0.06     0.55     0.63     0.67     0.72     0.79
z[1]           0.09    0.01  0.28     0.00     0.00     0.00     0.00     1.00
z[2]           0.08    0.01  0.27     0.00     0.00     0.00     0.00     1.00
z[3]           0.91    0.01  0.29     0.00     1.00     1.00     1.00     1.00
z[4]           0.37    0.02  0.48     0.00     0.00     0.00     1.00     1.00
z[5]           0.13    0.01  0.33     0.00     0.00     0.00     0.00     1.00
z[6]           0.08    0.01  0.27     0.00     0.00     0.00     0.00     1.00
z[7]           0.55    0.02  0.50     0.00     0.00     1.00     1.00     1.00
z[8]           0.24    0.02  0.43     0.00     0.00     0.00     0.00     1.00
z[9]           0.15    0.01  0.36     0.00     0.00     0.00     0.00     1.00
z[10]          0.20    0.01  0.40     0.00     0.00     0.00     0.00     1.00
z[11]          0.98    0.00  0.16     0.98     1.00     1.00     1.00     1.00
z[12]          0.06    0.01  0.24     0.00     0.00     0.00     0.00     1.00
z[13]          0.97    0.00  0.17     0.00     1.00     1.00     1.00     1.00
z[14]          0.87    0.01  0.34     0.00     1.00     1.00     1.00     1.00
z[15]          0.08    0.01  0.28     0.00     0.00     0.00     0.00     1.00
z[16]          0.97    0.00  0.17     0.00     1.00     1.00     1.00     1.00
z[17]          0.37    0.02  0.48     0.00     0.00     0.00     1.00     1.00
z[18]          0.13    0.01  0.34     0.00     0.00     0.00     0.00     1.00
z[19]          0.78    0.02  0.41     0.00     1.00     1.00     1.00     1.00
z[20]          0.65    0.02  0.48     0.00     0.00     1.00     1.00     1.00
z[21]          0.33    0.02  0.47     0.00     0.00     0.00     1.00     1.00
z[22]          0.17    0.01  0.37     0.00     0.00     0.00     0.00     1.00
z[23]          0.66    0.02  0.47     0.00     0.00     1.00     1.00     1.00
z[24]          0.18    0.01  0.38     0.00     0.00     0.00     0.00     1.00
z[25]          0.97    0.00  0.17     0.00     1.00     1.00     1.00     1.00
z[26]          0.37    0.02  0.48     0.00     0.00     0.00     1.00     1.00
z[27]          0.78    0.01  0.42     0.00     1.00     1.00     1.00     1.00
z[28]          0.06    0.01  0.24     0.00     0.00     0.00     0.00     1.00
z[29]          0.89    0.01  0.32     0.00     1.00     1.00     1.00     1.00
z[30]          0.08    0.01  0.28     0.00     0.00     0.00     0.00     1.00
z[31]          0.46    0.02  0.50     0.00     0.00     0.00     1.00     1.00
z[32]          0.09    0.01  0.28     0.00     0.00     0.00     0.00     1.00
z[33]          0.78    0.02  0.41     0.00     1.00     1.00     1.00     1.00
z[34]          0.08    0.01  0.28     0.00     0.00     0.00     0.00     1.00
z[35]          0.20    0.01  0.40     0.00     0.00     0.00     0.00     1.00
z[36]          0.66    0.02  0.47     0.00     0.00     1.00     1.00     1.00
z[37]          0.98    0.00  0.15     1.00     1.00     1.00     1.00     1.00
z[38]          0.38    0.02  0.49     0.00     0.00     0.00     1.00     1.00
z[39]          0.29    0.02  0.45     0.00     0.00     0.00     1.00     1.00
z[40]          0.98    0.00  0.13     1.00     1.00     1.00     1.00     1.00
z[41]          0.56    0.02  0.50     0.00     0.00     1.00     1.00     1.00
z[42]          0.46    0.02  0.50     0.00     0.00     0.00     1.00     1.00
z[43]          0.08    0.01  0.27     0.00     0.00     0.00     0.00     1.00
z[44]          0.66    0.02  0.47     0.00     0.00     1.00     1.00     1.00
z[45]          0.08    0.01  0.27     0.00     0.00     0.00     0.00     1.00
z[46]          0.98    0.00  0.14     1.00     1.00     1.00     1.00     1.00
z[47]          0.88    0.01  0.32     0.00     1.00     1.00     1.00     1.00
z[48]          0.65    0.02  0.48     0.00     0.00     1.00     1.00     1.00
z[49]          0.56    0.02  0.50     0.00     0.00     1.00     1.00     1.00
z[50]          0.39    0.02  0.49     0.00     0.00     0.00     1.00     1.00
z[51]          0.20    0.01  0.40     0.00     0.00     0.00     0.00     1.00
z[52]          0.24    0.02  0.43     0.00     0.00     0.00     0.00     1.00
z[53]          0.37    0.02  0.48     0.00     0.00     0.00     1.00     1.00
z[54]          0.18    0.01  0.38     0.00     0.00     0.00     0.00     1.00
z[55]          0.97    0.00  0.17     0.00     1.00     1.00     1.00     1.00
z[56]          0.07    0.01  0.26     0.00     0.00     0.00     0.00     1.00
z[57]          0.30    0.02  0.46     0.00     0.00     0.00     1.00     1.00
z[58]          0.98    0.00  0.15     1.00     1.00     1.00     1.00     1.00
z[59]          0.97    0.00  0.16     0.00     1.00     1.00     1.00     1.00
z[60]          0.08    0.01  0.27     0.00     0.00     0.00     0.00     1.00
z[61]          0.89    0.01  0.31     0.00     1.00     1.00     1.00     1.00
z[62]          0.08    0.01  0.27     0.00     0.00     0.00     0.00     1.00
z[63]          0.09    0.01  0.29     0.00     0.00     0.00     0.00     1.00
z[64]          0.26    0.02  0.44     0.00     0.00     0.00     1.00     1.00
z[65]          0.67    0.02  0.47     0.00     0.00     1.00     1.00     1.00
z[66]          0.98    0.00  0.16     0.98     1.00     1.00     1.00     1.00
z[67]          0.10    0.01  0.29     0.00     0.00     0.00     0.00     1.00
z[68]          0.37    0.02  0.48     0.00     0.00     0.00     1.00     1.00
z[69]          0.97    0.00  0.17     0.00     1.00     1.00     1.00     1.00
z[70]          0.12    0.01  0.32     0.00     0.00     0.00     0.00     1.00
z[71]          0.46    0.02  0.50     0.00     0.00     0.00     1.00     1.00
z[72]          0.26    0.02  0.44     0.00     0.00     0.00     1.00     1.00
z[73]          0.97    0.00  0.16     0.00     1.00     1.00     1.00     1.00
z[74]          0.54    0.02  0.50     0.00     0.00     1.00     1.00     1.00
z[75]          0.97    0.00  0.18     0.00     1.00     1.00     1.00     1.00
z[76]          0.15    0.01  0.35     0.00     0.00     0.00     0.00     1.00
z[77]          0.17    0.01  0.38     0.00     0.00     0.00     0.00     1.00
z[78]          0.45    0.02  0.50     0.00     0.00     0.00     1.00     1.00
z[79]          0.89    0.01  0.32     0.00     1.00     1.00     1.00     1.00
z[80]          0.13    0.01  0.34     0.00     0.00     0.00     0.00     1.00
z[81]          0.11    0.01  0.31     0.00     0.00     0.00     0.00     1.00
z[82]          0.97    0.00  0.17     0.00     1.00     1.00     1.00     1.00
z[83]          0.27    0.02  0.44     0.00     0.00     0.00     1.00     1.00
z[84]          0.97    0.00  0.17     0.00     1.00     1.00     1.00     1.00
z[85]          0.09    0.01  0.29     0.00     0.00     0.00     0.00     1.00
z[86]          0.17    0.01  0.38     0.00     0.00     0.00     0.00     1.00
z[87]          0.09    0.01  0.29     0.00     0.00     0.00     0.00     1.00
z[88]          0.65    0.02  0.48     0.00     0.00     1.00     1.00     1.00
z[89]          0.09    0.01  0.28     0.00     0.00     0.00     0.00     1.00
z[90]          0.77    0.02  0.42     0.00     1.00     1.00     1.00     1.00
z[91]          0.16    0.01  0.37     0.00     0.00     0.00     0.00     1.00
z[92]          0.31    0.02  0.46     0.00     0.00     0.00     1.00     1.00
z[93]          0.55    0.02  0.50     0.00     0.00     1.00     1.00     1.00
z[94]          0.30    0.02  0.46     0.00     0.00     0.00     1.00     1.00
z[95]          0.09    0.01  0.28     0.00     0.00     0.00     0.00     1.00
z[96]          0.77    0.02  0.42     0.00     1.00     1.00     1.00     1.00
z[97]          0.38    0.02  0.49     0.00     0.00     0.00     1.00     1.00
z[98]          0.98    0.00  0.16     0.98     1.00     1.00     1.00     1.00
z[99]          0.07    0.01  0.26     0.00     0.00     0.00     0.00     1.00
z[100]         0.97    0.00  0.17     0.00     1.00     1.00     1.00     1.00
z[101]         0.09    0.01  0.28     0.00     0.00     0.00     0.00     1.00
z[102]         0.79    0.02  0.41     0.00     1.00     1.00     1.00     1.00
z[103]         0.08    0.01  0.28     0.00     0.00     0.00     0.00     1.00
z[104]         0.67    0.02  0.47     0.00     0.00     1.00     1.00     1.00
z[105]         0.27    0.02  0.44     0.00     0.00     0.00     1.00     1.00
z[106]         0.98    0.00  0.15     1.00     1.00     1.00     1.00     1.00
z[107]         0.07    0.01  0.26     0.00     0.00     0.00     0.00     1.00
z[108]         0.98    0.00  0.15     1.00     1.00     1.00     1.00     1.00
z[109]         0.38    0.02  0.48     0.00     0.00     0.00     1.00     1.00
z[110]         0.97    0.00  0.16     0.00     1.00     1.00     1.00     1.00
z[111]         0.21    0.01  0.41     0.00     0.00     0.00     0.00     1.00
z[112]         0.65    0.02  0.48     0.00     0.00     1.00     1.00     1.00
z[113]         0.11    0.01  0.31     0.00     0.00     0.00     0.00     1.00
z[114]         0.38    0.02  0.49     0.00     0.00     0.00     1.00     1.00
z[115]         0.55    0.02  0.50     0.00     0.00     1.00     1.00     1.00
z[116]         0.54    0.02  0.50     0.00     0.00     1.00     1.00     1.00
z[117]         0.06    0.01  0.24     0.00     0.00     0.00     0.00     1.00
z[118]         0.08    0.01  0.27     0.00     0.00     0.00     0.00     1.00
mubon          0.72    0.00  0.04     0.64     0.69     0.72     0.75     0.78
lambdabon     16.99    0.60  9.23     7.30    10.77    14.03    19.77    43.66
muche          0.93    0.00  0.03     0.86     0.91     0.93     0.96     0.98
lambdache     12.62    0.45  9.16     5.14     6.54     9.07    14.74    42.79
mudiff         1.82    0.03  0.51     1.00     1.47     1.72     2.10     3.03
phi            0.46    0.01  0.14     0.22     0.35     0.45     0.56     0.75
alpha[1]      11.97    0.38  5.92     5.58     7.93    10.21    14.02    28.62
alpha[2]      11.96    0.44  9.07     4.62     5.97     8.44    14.09    41.66
beta[1]        5.02    0.23  3.40     1.70     2.78     3.81     5.86    15.03
beta[2]        0.66    0.02  0.26     0.27     0.48     0.63     0.79     1.28
pc            87.36    0.22  5.37    76.00    84.00    88.00    91.00    97.00
lp__       -2100.17    0.71 10.68 -2121.87 -2107.19 -2099.87 -2092.66 -2080.56
           n_eff Rhat
theta[1]    1600 1.00
theta[2]    1600 1.00
theta[3]    1600 1.00
theta[4]    1600 1.00
theta[5]    1600 1.00
theta[6]    1309 1.00
theta[7]    1600 1.00
theta[8]    1600 1.00
theta[9]    1535 1.00
theta[10]   1600 1.00
theta[11]   1600 1.00
theta[12]   1600 1.00
theta[13]   1600 1.00
theta[14]   1600 1.00
theta[15]   1600 1.00
theta[16]   1600 1.00
theta[17]   1600 1.00
theta[18]   1600 1.00
theta[19]   1600 1.00
theta[20]   1600 1.00
theta[21]   1600 1.00
theta[22]   1600 1.00
theta[23]   1600 1.00
theta[24]   1600 1.00
theta[25]   1600 1.00
theta[26]   1600 1.00
theta[27]   1600 1.00
theta[28]   1600 1.00
theta[29]   1600 1.00
theta[30]   1600 1.00
theta[31]   1600 1.00
theta[32]   1600 1.00
theta[33]   1600 1.00
theta[34]   1600 1.00
theta[35]   1600 1.00
theta[36]   1600 1.00
theta[37]   1327 1.00
theta[38]   1600 1.00
theta[39]   1600 1.00
theta[40]   1600 1.00
theta[41]   1600 1.00
theta[42]   1600 1.00
theta[43]   1178 1.00
theta[44]   1600 1.00
theta[45]   1600 1.00
theta[46]   1600 1.00
theta[47]   1600 1.00
theta[48]   1600 1.00
theta[49]   1600 1.00
theta[50]   1600 1.00
theta[51]   1600 1.00
theta[52]   1600 1.00
theta[53]   1600 1.00
theta[54]   1600 1.00
theta[55]   1600 1.00
theta[56]   1600 1.00
theta[57]   1382 1.00
theta[58]   1600 1.00
theta[59]   1600 1.00
theta[60]   1600 1.00
theta[61]   1600 1.00
theta[62]   1600 1.00
theta[63]   1600 1.00
theta[64]   1600 1.00
theta[65]   1600 1.00
theta[66]   1600 1.00
theta[67]   1600 1.00
theta[68]   1600 1.00
theta[69]   1600 1.00
theta[70]   1600 1.00
theta[71]   1600 1.00
theta[72]   1600 1.00
theta[73]   1600 1.00
theta[74]   1600 1.00
theta[75]   1600 1.00
theta[76]   1600 1.00
theta[77]   1600 1.00
theta[78]   1600 1.00
theta[79]   1600 1.00
theta[80]   1600 1.00
theta[81]   1600 1.00
theta[82]   1600 1.00
theta[83]   1600 1.00
theta[84]   1600 1.00
theta[85]   1600 1.00
theta[86]   1600 1.00
theta[87]   1600 1.00
theta[88]   1600 1.00
theta[89]   1600 1.00
theta[90]   1600 1.00
theta[91]   1600 1.00
theta[92]   1600 1.00
theta[93]   1600 1.00
theta[94]   1600 1.00
theta[95]   1600 1.00
theta[96]   1600 1.00
theta[97]   1600 1.00
theta[98]   1600 1.00
theta[99]   1600 1.00
theta[100]  1600 1.00
theta[101]  1600 1.00
theta[102]  1600 1.00
theta[103]  1600 1.00
theta[104]  1600 1.00
theta[105]  1600 1.00
theta[106]  1600 1.00
theta[107]  1600 1.00
theta[108]  1600 1.00
theta[109]  1600 1.00
theta[110]  1600 1.00
theta[111]  1600 1.00
theta[112]  1600 1.00
theta[113]  1600 1.00
theta[114]  1600 1.00
theta[115]  1600 1.00
theta[116]  1600 1.00
theta[117]  1600 1.00
theta[118]  1600 1.00
z[1]        1185 1.00
z[2]        1366 1.00
z[3]         886 1.00
z[4]         566 1.01
z[5]        1247 1.00
z[6]        1329 1.00
z[7]         561 1.00
z[8]         529 1.01
z[9]         912 1.00
z[10]        829 1.00
z[11]       1524 1.00
z[12]       1136 1.00
z[13]       1484 1.00
z[14]        858 1.00
z[15]       1174 1.00
z[16]       1600 1.00
z[17]        569 1.00
z[18]       1208 1.00
z[19]        737 1.00
z[20]        543 1.00
z[21]        699 1.00
z[22]        881 1.00
z[23]        596 1.00
z[24]        924 1.00
z[25]       1600 1.00
z[26]        570 1.00
z[27]        810 1.00
z[28]       1190 1.00
z[29]        936 1.00
z[30]       1139 1.00
z[31]        640 1.00
z[32]       1600 1.00
z[33]        638 1.00
z[34]       1175 1.01
z[35]        856 1.00
z[36]        522 1.00
z[37]       1347 1.00
z[38]        588 1.00
z[39]        679 1.00
z[40]       1600 1.00
z[41]        656 1.00
z[42]        485 1.00
z[43]       1169 1.00
z[44]        548 1.00
z[45]       1288 1.00
z[46]       1502 1.00
z[47]       1123 1.00
z[48]        749 1.00
z[49]        541 1.00
z[50]        590 1.00
z[51]        907 1.00
z[52]        703 1.00
z[53]        691 1.00
z[54]       1001 1.00
z[55]       1600 1.00
z[56]       1424 1.00
z[57]        551 1.00
z[58]       1592 1.00
z[59]       1600 1.00
z[60]       1214 1.00
z[61]       1085 1.00
z[62]        940 1.00
z[63]        957 1.00
z[64]        800 1.00
z[65]        491 1.00
z[66]       1600 1.00
z[67]       1285 1.00
z[68]        671 1.00
z[69]       1496 1.00
z[70]       1222 1.00
z[71]        448 1.00
z[72]        782 1.00
z[73]       1600 1.00
z[74]        546 1.00
z[75]       1534 1.00
z[76]        909 1.00
z[77]        925 1.00
z[78]        622 1.00
z[79]        947 1.00
z[80]        871 1.00
z[81]       1290 1.00
z[82]       1419 1.00
z[83]        654 1.00
z[84]       1600 1.00
z[85]       1112 1.00
z[86]        662 1.00
z[87]        955 1.00
z[88]        630 1.00
z[89]        961 1.00
z[90]        764 1.00
z[91]       1030 1.00
z[92]        583 1.00
z[93]        522 1.00
z[94]        776 1.00
z[95]       1325 1.00
z[96]        743 1.00
z[97]        631 1.00
z[98]       1320 1.00
z[99]       1144 1.00
z[100]      1451 1.00
z[101]      1450 1.00
z[102]       726 1.00
z[103]      1016 1.00
z[104]       560 1.00
z[105]       655 1.00
z[106]      1502 1.00
z[107]      1408 1.00
z[108]      1277 1.00
z[109]       578 1.00
z[110]      1535 1.00
z[111]       889 1.00
z[112]       681 1.00
z[113]      1357 1.00
z[114]       668 1.00
z[115]       468 1.00
z[116]       533 1.00
z[117]      1190 1.00
z[118]      1114 1.00
mubon        272 1.00
lambdabon    237 1.01
muche        276 1.00
lambdache    418 1.00
mudiff       314 1.00
phi          265 1.00
alpha[1]     249 1.01
alpha[2]     419 1.00
beta[1]      227 1.01
beta[2]      234 1.00
pc           610 1.00
lp__         226 1.01

Samples were drawn using NUTS(diag_e) at Sun Jul 19 02:09:22 2015.
For each parameter, n_eff is a crude measure of effective sample size,
and Rhat is the potential scale reduction factor on split chains (at 
convergence, Rhat=1).

In [169]:
plot(samples)



In [170]:
pc <- extract(samples)$pc / p  # to get proportion correct
mean(pc)


Out[170]:
0.740323093220339

In [178]:
# plot 6.9
# make the two-panel plot:
# windows(width=8,height=6) # this command works only under Windows!
layout(matrix(c(1,2),2,1))
layout.show(2)
par(cex.main = 1.5, mar = c(5, 6, 4, 5) + 0.1, mgp = c(3.5, 1, 0), cex.lab = 1.5,
    font.lab = 2, cex.axis = 1.3, bty = "n", las=1)
bins <- c(-1:n)+.5
bonafide <- hist(k[truth==0], breaks=bins, plot=F)$counts
cheat    <- hist(k[truth==1], breaks=bins, plot=F)$counts

counts <- rbind(bonafide, cheat)
barplot(counts, main=" ", xlab=" ", col=c("grey","white"),
  legend.text = c("Bona Fide","Cheater"), args.legend = list(x="topleft"),
  beside=TRUE, axes=F)

# bottom panel:
par(cex.main = 1.5, mar = c(5, 6, 4, 5) + 0.1, mgp = c(3.5, 1, 0), cex.lab = 1.5,
    font.lab = 2, cex.axis = 1.3, bty = "n", las=1)
pc.line <- array()
for (i in 1:41) {
  pc.line[i] <- mean((k>=(i-1))==truth)
}



In [172]:
#dev.new() # so the plot below does not overwrite the plot above

plot(c(0:40), pc.line, type="l", lwd=2, xlim=c(0,40), ylim=c(0.4,1), 
     xlab="Number of Items Recalled Correctly", 
     ylab=" ", axes=F)
axis(1, at=c(0,seq(from=5,by=5,to=40)))
axis(2, at=c(.5,.75,1))
par(las=0)
mtext("Prop. Correct",side=2, line=2.5,cex=1.5)
# Now add the distribution:
pc.dens <- density(pc)
polygon(c(0,pc.dens$y,0,0), c(pc.dens$x[1]-.01,pc.dens$x,pc.dens$x[1]+.01,
                              pc.dens$x[1]-.01), col="green")



In [173]:
# plot 6.10
#windows()
par(cex.main = 1.5, mar = c(5, 6, 4, 5) + 0.1, mgp = c(3.5, 1, 0), cex.lab = 1.5,
    font.lab = 2, cex.axis = 1.3, bty = "n", las=1)
plot(k, summary(samples)$summary[paste("z[", 1:p, "]", sep=""), 1], ylim=c(0,1),
     xlim=c(0,n), lwd=2, pch=4, xlab= "Number of Items Recalled Correctly", 
     ylab="Cheater Classification") 
# in the code, z=0 means bona fide and z=1 means cheater, so the posterior
# mean of z gives the probability of being assigned to the cheating group


References