Logit and Logistic of array values

This notebook illustrates the level of control and flexibility available in Julia functions. The task is to evaluate the logistic function $(-\infty, \infty)\rightarrow(0,1)$

\begin{equation} x \rightarrow \frac{1}{1 + e^{-x}} \end{equation}

and its inverse, the logit or "log-odds" function $(0,1)\rightarrow(-\infty, \infty)$

\begin{equation} p \rightarrow \log\left(\frac{p}{1-p}\right) \end{equation}

on an array of numeric values.

The first priority is to evaluate these functions accurately and robustly. This usually means watching for edge cases (e.g. very large positive or negative $x$ or values of $p$ that are close to zero or to one).

The second priority is evaluate them quickly and flexibly. When fitting a logistic regression model to very large data sets these functions may be evaluated hundreds of times on arrays with millions of elements.

In "vectorized" languages, such as R or Matlab/Octave and, to some extent, Python, the obvious choice is to work on vectors. In fact, the language often hides the fact that vectorization is occuring.

logit and logistic in R

The RCall for Julia starts an embedded R process and provides for two-way communication with it. In a Jupyter notebook like this a Julia string prepended with R is evaluated in the R process. String delimiters are " or """. In the second case the string can span multiple lines and can contain " characters.



In [1]:

    
using RCall



In [2]:

    
R" logit <- function(p) log(p / (1 - p)) ";

Create a vector of 100,000,000 random values between 0 and 1 on which to evaluate logit. This is done in Julia after setting the random number seed, to allow for reproducibility.



In [2]:

    
srand(1234321)
pvals = rand(100_000_000);

Copy the vector to the R process under the same name.



In [4]:

    
@rput pvals;



In [5]:

    
R""" print(system.time(xvals <- logit(pvals))) """;









    



   user  system elapsed 
  5.412   0.164   5.576

The first few values of pvals and xvals are



In [6]:

    
R" print(str(list(pvals=pvals, xvals=xvals))) ";









    



List of 2
 $ pvals: num [1:100000000] 0.0944 0.9366 0.2583 0.9309 0.5553 ...
 $ xvals: num [1:100000000] -2.261 2.693 -1.055 2.601 0.222 ...
NULL

Similarly, a vectorized logistic function can be defined as



In [7]:

    
R"""
logistic <- function(x) 1 / (1 + exp(-x))
print(system.time(pvalsnew <- logistic(xvals)))
""";









    



   user  system elapsed 
  3.472   0.156   3.631



In [8]:

    
R" all(pvals == pvalsnew) "; # check for 'round trip' identity









    Out[8]:





RCall.RObject{RCall.LglSxp}
[1] FALSE

The problem with the "round trip" check is that floating point arithmetic is not exact. Numbers are represented in a finite precision. pvalsnew is close to pvals but not exactly equal.



In [9]:

    
R" print(str(list(pvals=pvals, xvals=xvals, pvn=pvalsnew))) ";









    



List of 3
 $ pvals: num [1:100000000] 0.0944 0.9366 0.2583 0.9309 0.5553 ...
 $ xvals: num [1:100000000] -2.261 2.693 -1.055 2.601 0.222 ...
 $ pvn  : num [1:100000000] 0.0944 0.9366 0.2583 0.9309 0.5553 ...
NULL

R has an all.equal function that compares floating point values using a tolerance on the differences.



In [10]:

    
R" print(all.equal(pvals, pvalsnew)) ";









    



[1] TRUE

The problem with vectorization

Vectorized languages are wonderful environment when you begin programming because all the messy loop-related baggage is eliminated at the expense of some overhead. The evaluation of logistic(xvals) is done in 5 stages

Allocate a vector, t1, of 100,000,000 doubles and loop over x writing -x into t1.
Allocate a vector, t2, of 100,000,000 doubles and loop over t1 writing exp(t1) into t2.
Allocate a vector, t3, of 100,000,000 doubles and loop over t2 writing 1 + t2 into t3.
Allocate a vector, result, of 100,000,000 doubles and loop over t3 writing 1 / t3 into result.
Return result

Because R allows for missing data in any vector the scalar arithmetic is more complicated than just looping over the vector. Every scalar addition in, e.g. 1 + t2 has a check on both addends to see if they are NA. Furthermore, the "recycling rule" that cycles over the 1 operand while looping over the indices of t2 is further logic implemented inside the loop.

Notice that there are 3 temporary vectors allocated and the result must also be allocated. This storage must later be "garbage collected".

Functional programming in Julia

The operations could be performed in exactly the same way in Julia. Currently some Julia arithmetic and math functions are vectorized and some aren't. In future releases vectorization will need to be explicitly stated by appending a . to a function name or prepending a . to an operator.



In [ ]:

    
vlogistic(x::Vector) = 1 ./ (1 .+ exp.(-x));
vlogit(p::Vector) = log.(p ./ (1 .- p))
xvals = vlogit(pvals)
show(xvals[1:5])

Check for approximate equality



In [ ]:

    
pvals ≈ vlogistic(xvals)

The timings show that the Julia code is faster than the R functions but it still allocates a considerable amount of storage and uses time in garbage collection (gc).



In [ ]:

    
@time vlogit(pvals);



In [ ]:

    
@time vlogistic(xvals);

However, there is no need to allocate the intermediate values when operating on only one value.



In [3]:

    
logit(p) = log(p / (one(p) - p));
logistic(x) = inv(one(x) + exp(-x));
sxvals = logit.(pvals);
show(sxvals[1:5])









    



[-2.2608,2.69298,-1.05468,2.60097,0.222041]





    



WARNING: Method definition logit(Any) in module Main at In[1]:1 overwritten at In[3]:1.
WARNING: Method definition logistic(Any) in module Main at In[1]:2 overwritten at In[3]:2.



In [4]:

    
@time logit.(pvals);









    



  2.740797 seconds (159 allocations: 762.948 MB, 2.27% gc time)

This type of evaluation is in the "functional programming" style where simple functions are mapped over arrays. Julia allows for results to be pre-allocated as, e.g.



In [6]:

    
@time map!(logit, sxvals, pvals);









    



  2.631148 seconds (3.43 k allocations: 140.514 KB)

In this case there isn't much of a savings in time but there is a saving in the amount of storage allocated. This becomes important when, e.g. fitting a generalized linear model or a generalized linear mixed model.



In [8]:

    
pvalsnew = similar(sxvals); @time map!(logistic, pvalsnew, sxvals);









    



  2.116498 seconds (4.67 k allocations: 197.003 KB)

helps with the amount of allocation but doesn't actually run substantially faster.

There is a way to make the evaluation of the log and exp functions slightly faster, which is to use the @fastmath macro in the definitions of the scalar functions.



In [9]:

    
@fastmath flogit(p) = log(p / (one(p) + p))
function flogistic(x)
    expmx = @fastmath(exp(-x))
    inv(one(expmx) + expmx)
end









    Out[9]:





flogistic (generic function with 1 method)



In [11]:

    
@time map!(flogit, sxvals, pvals);









    



  2.382810 seconds (5.07 k allocations: 220.258 KB)



In [12]:

    
@time map!(flogistic, pvalsnew, sxvals);









    



  1.978709 seconds (4.93 k allocations: 210.208 KB)

Multiple threads

Some multithreading capability is available in v0.5.0 of Julia. Later versions with enhance these capabilities. Before starting this notebook I set the environment variable JULIA_NUM_THREADS=4 as this is running on a 4-core processor.

It is easiest to use multiple threads on a simple loop. Define a function that overwrites the values in one array with the logit or logistic of the values in another array. By convention, a ! is appended to the name of such a mutating function, which modifies one or more of its arguments.



In [13]:

    
function logit!(dest, src)
    length(dest) == length(src) || throw(DimensionMismatch())
    Threads.@threads for i in eachindex(dest, src)
        @inbounds dest[i] = flogit(src[i])
    end
    dest
end









    Out[13]:





logit! (generic function with 1 method)



In [14]:

    
@time logit!(sxvals, pvals);









    



  2.281610 seconds (13.87 k allocations: 597.863 KB)

The scaling is very good with 4 threads being



In [ ]:

    
1.66/0.571

times as fast as a single thread.