Why types matter

Julia's type system is at the core of Julia's personality as a language. Types let the compiler make strong assumptions and produce fast code, and they give programmers a set of abstractions for domain modeling and code reuse.

Types vs Classes

Simply:

  • OO: classes are data + behavior (nouns, nouns, nouns)
  • Julia: types are data; behavior defined separately (cf. Haskell, other functional languages) (verbs are key)
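The split between data and behavior can be sketched as follows. This is a minimal illustration, assuming Julia 1.x syntax (the `struct` keyword); `Point` and `norm2` are hypothetical names that do not come from the original notebook.

```julia
# Data: a plain struct with fields and no methods attached to it.
# `Point` and `norm2` are hypothetical names used only for this illustration.
struct Point
    x::Float64
    y::Float64
end

# Behavior: an ordinary function defined outside the type.
norm2(p::Point) = sqrt(p.x^2 + p.y^2)

# Existing "verbs" are extended the same way, by adding a method for the new type.
Base.:+(p::Point, q::Point) = Point(p.x + q.x, p.y + q.y)

p = Point(3.0, 4.0)
norm2(p)                  # 5.0
p + Point(1.0, 1.0)       # Point(4.0, 5.0)
```

Nothing about `Point` has to change when a new function is written for it, which is the sense in which verbs are key.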

Let's start with something basic:


In [1]:
typeof(1), typeof(1.), typeof(1.f0), typeof('a'), typeof("foo")


Out[1]:
(Int64,Float64,Float32,Char,ASCIIString)

In [2]:
A = rand(5, 5)


Out[2]:
5x5 Array{Float64,2}:
 0.86406    0.0989109  0.395494  0.815833  0.0964828
 0.654381   0.380578   0.568803  0.161554  0.0091364
 0.0654029  0.663925   0.280064  0.210903  0.316705 
 0.2731     0.459945   0.151887  0.307617  0.303337 
 0.362892   0.932725   0.642601  0.501254  0.504868 

In [3]:
typeof(A), eltype(A)


Out[3]:
(Array{Float64,2},Float64)

Clearly, Julia puts types front and center. Contrast this with Python, where it's possible, but not trivial or syntactically nice, to get the base classes of an object's class (obj.__class__.__bases__) or to test whether one class is a subclass of another (issubclass).

This is because Python is built on duck typing, and the focus is on behaviors that just work. This is a key philosophical point: you can be a very good Python programmer and worry very little about inheritance and types. Just define classes, add methods, and move on.

In Julia, everything is organized around types:


In [4]:
T = typeof(1.)  # a type is a value and can be bound to a variable


Out[4]:
Float64

In [5]:
typeof(T)


Out[5]:
DataType

In [6]:
typeof(DataType)  # DataType is its own type


Out[6]:
DataType

In [7]:
super(T)  # --> supertype in v0.5


Out[7]:
AbstractFloat

In [8]:
super(AbstractFloat)


Out[8]:
Real

In [9]:
super(Real)


Out[9]:
Number

In [10]:
super(Number)


Out[10]:
Any

In [11]:
super(Any)  # Any is its own supertype


Out[11]:
Any

In [12]:
super(DataType)


Out[12]:
Type{T}

In [13]:
super(super(DataType))  # Any really is the top of the hierarchy


Out[13]:
Any
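The repeated super calls above can be wrapped in a loop. This is a small sketch, assuming Julia 0.5 or later, where the function is spelled `supertype`; `supertype_chain` is a hypothetical helper name.

```julia
# Collect the chain of supertypes from T up to Any.
# `supertype_chain` is a hypothetical helper; `supertype` is the v0.5+ name for `super`.
function supertype_chain(T::Type)
    chain = Type[T]
    while T !== Any          # Any is its own supertype, so stop there
        T = supertype(T)
        push!(chain, T)
    end
    return chain
end

supertype_chain(Float64)  # Float64, AbstractFloat, Real, Number, Any
```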

In [14]:
subtypes(Number)


Out[14]:
2-element Array{Any,1}:
 Complex{T<:Real}
 Real            

In [15]:
subtypes(Real)


Out[15]:
4-element Array{Any,1}:
 AbstractFloat       
 Integer             
 Irrational{sym}     
 Rational{T<:Integer}
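Going down the tree works the same way. The sketch below assumes Julia 1.x, where `subtypes` must be loaded from the InteractiveUtils standard library outside the REPL and `isconcretetype` replaces `isleaftype`; `concrete_subtypes` is a hypothetical helper name.

```julia
using InteractiveUtils  # provides `subtypes` outside the REPL in Julia 1.x

# Recursively gather the concrete types below an abstract type.
# `concrete_subtypes` is a hypothetical helper name.
function concrete_subtypes(T::Type)
    isconcretetype(T) && return Any[T]
    found = Any[]
    for S in subtypes(T)
        append!(found, concrete_subtypes(S))
    end
    return found
end

concrete_subtypes(AbstractFloat)  # includes Float16, Float32, Float64, BigFloat
```

The exact list depends on the Julia version and on which packages are loaded, since packages can add their own subtypes.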

We can use the <: operator to test for subtyping, too:


In [16]:
Float64 <: Real, Int64 <: AbstractFloat


Out[16]:
(true,false)

And we can use the isa function for testing instances:


In [17]:
isa(1, Float64), isa(1., Float64), isa(1, Number)


Out[17]:
(false,true,true)
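The two tests are closely related. As a rough sketch, for ordinary values such as numbers, `isa(x, T)` gives the same answer as checking the value's type with `<:`.

```julia
x = 1.5

# For ordinary values, isa(x, T) agrees with typeof(x) <: T.
isa(x, Real) == (typeof(x) <: Real)        # true (both sides are true)
isa(x, Integer) == (typeof(x) <: Integer)  # true (both sides are false)

# <: is reflexive: every type is a subtype of itself.
Float64 <: Float64                         # true
```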

When it makes sense, we can use convert or simply the type name to convert:


In [18]:
convert(Int64, 1.), convert(Float32, 3.75), convert(Rational{Int64}, 3.75)


Out[18]:
(1,3.75f0,15//4)
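The "when it makes sense" caveat matters: a conversion that cannot be represented exactly raises an error rather than silently rounding. The sketch below assumes Julia 1.x behavior and shows the type-name form alongside `convert`.

```julia
# Using the type name as a constructor performs the same conversions here.
Int64(1.0)       # 1
Float32(3.75)    # 3.75f0

# A conversion that cannot be represented exactly throws instead of rounding.
result = try
    convert(Int64, 1.5)
catch err
    err
end
result isa InexactError   # true

# Rounding has to be requested explicitly.
round(Int64, 1.5)         # 2
```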

Concrete and abstract types

In Julia, only leaf nodes in the type tree (types with no subtypes) are concrete and can be instantiated. That is, every value has a concrete type, and no concrete type can have subtypes. This seems limiting, and it is, but it greatly simplifies type inference, which helps performance, and it pushes us toward composition over inheritance.


In [19]:
isleaftype(Int64), isleaftype(AbstractFloat)


Out[19]:
(true,false)
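What "composition over inheritance" looks like in practice can be sketched as below. The example assumes Julia 1.x syntax (`abstract type`, `struct`, `isconcretetype`, and `isabstracttype`); `Shape`, `Circle`, `LabeledCircle`, and `area` are hypothetical names.

```julia
# `Shape`, `Circle`, `LabeledCircle`, and `area` are hypothetical names.
abstract type Shape end          # abstract: can be subtyped, cannot be instantiated

struct Circle <: Shape           # concrete: can be instantiated, cannot be subtyped
    radius::Float64
end

# Composition instead of inheritance: wrap a Circle rather than extend it.
struct LabeledCircle <: Shape
    circle::Circle
    label::String
end

area(c::Circle) = pi * c.radius^2
area(lc::LabeledCircle) = area(lc.circle)   # forward to the wrapped value

isabstracttype(Shape), isconcretetype(Circle)   # (true, true)
area(LabeledCircle(Circle(1.0), "unit"))         # 3.141592653589793
```

Because `Circle` is concrete it cannot be a supertype, so `LabeledCircle` holds a `Circle` as a field and shares only the abstract parent `Shape`.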

Parametric types

Types can take parameters, which often allows for more generic code. For instance:


In [20]:
A = rand(5, 5)
isa(A, Array{Float64}), isa(A, Array)


Out[20]:
(true,true)

In [21]:
B = reshape(1:25, 5, 5)
isa(B, Array{Int64}), isa(B, Array)


Out[21]:
(true,true)

That is, we'd like to write code that works for A and B. In defining functions, we can do this in a couple of ways:


In [22]:
function lastelem(A::Array)  # new syntax: restrict this definition to A's that are Arrays
    return A[end]
end


Out[22]:
lastelem (generic function with 1 method)

In [23]:
lastelem(A), lastelem(B)


Out[23]:
(0.7600555360482268,25)

We can also restrict the element types we will allow:


In [24]:
function diagsum{T<:Number}(A::Array{T})
    return sum(diag(A))  
end


Out[24]:
diagsum (generic function with 1 method)

In [25]:
diagsum(A)


Out[25]:
3.2605824430345383

In [26]:
C = reshape([c for c in "abcdefghijklmnopqrstuvwxy"], 5, 5)


Out[26]:
5x5 Array{Char,2}:
 'a'  'f'  'k'  'p'  'u'
 'b'  'g'  'l'  'q'  'v'
 'c'  'h'  'm'  'r'  'w'
 'd'  'i'  'n'  's'  'x'
 'e'  'j'  'o'  't'  'y'

In [27]:
diagsum(C)


LoadError: MethodError: `diagsum` has no method matching diagsum(::Array{Char,2})
while loading In[27], in expression starting on line 1
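The `diagsum{T<:Number}(...)` form above is the syntax of the Julia version used in this notebook. A hedged sketch of the same definition for Julia 1.x follows, where the type parameter moves to a `where` clause and `diag` has to be loaded from the LinearAlgebra standard library; the matrix `M` is made up for the illustration.

```julia
using LinearAlgebra  # `diag` lives here in Julia 1.x

# Same restriction as above, written with a `where` clause.
function diagsum(A::Array{T}) where {T<:Number}
    return sum(diag(A))
end

M = [1 2; 3 4]                            # hypothetical example input
diagsum(M)                                # 5

# The method table can be queried without triggering a MethodError.
applicable(diagsum, M)                    # true
applicable(diagsum, ['a' 'b'; 'c' 'd'])   # false
```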

The best example of a parametric type is Array{T, N}, where the first parameter is the element type and the second is the number of dimensions. Later, we'll look at how to add parameters when we create our own types.
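Both parameters can be read off in a method signature. The following is a small sketch, assuming Julia 1.x `where` syntax; `describe` is a hypothetical name.

```julia
# Pull both parameters of Array{T,N} out of the argument's type.
# `describe` is a hypothetical name.
describe(::Array{T,N}) where {T,N} = (T, N)

describe(zeros(3))           # (Float64, 1)
describe(zeros(Bool, 2, 2))  # (Bool, 2)

# Vector and Matrix are aliases that fix N.
Vector{Float64} === Array{Float64,1}   # true
Matrix{Float64} === Array{Float64,2}   # true
```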

Invariance, covariance, and contravariance

Technical: the following facts are very important:


In [28]:
typeof(A)


Out[28]:
Array{Float64,2}

In [29]:
eltype(A) <: Real


Out[29]:
true

In [30]:
typeof(A) <: Array{Real}


Out[30]:
false

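It is natural to think that Array{Float64,2} <: Array{Real} should hold because Float64 <: Real, but it doesn't: Julia's parametric types are invariant in their parameters. The Julia manual's section on parametric types explains why.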
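In practice, invariance means an exact parameter match is required unless the signature says otherwise. The sketch below assumes Julia 0.6 or later, where `Array{<:Real}` is available as shorthand for "an array whose element type is some subtype of Real"; `total` and `mixed` are hypothetical names.

```julia
# Invariance: the parameter must match exactly.
Array{Float64,2} <: Array{Real}      # false
Vector{Float64} <: Vector{Real}      # false

# A bounded parameter expresses "any array whose element type is a Real".
Array{Float64,2} <: Array{<:Real}    # true

# The same idea in a method signature; `total` is a hypothetical name.
total(v::Vector{<:Real}) = sum(v)
total([1.5, 2.5])                    # 4.0

# Vector{Real} is a distinct concrete type that can hold mixed Reals.
mixed = Real[1, 2.5]
typeof(mixed) === Vector{Real}       # true
```

Writing `v::Vector{Real}` in the signature instead would reject `[1.5, 2.5]`, whose type is Vector{Float64}.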

There is, however, an important exception:


In [31]:
tt = (1.5, 2)
typeof(tt)


Out[31]:
Tuple{Float64,Int64}

In [32]:
typeof(tt) <: Tuple{Real, Real}


Out[32]:
true

Short answer: we must have this for function argument checking to work correctly, since the arguments of a function call are checked against a method signature as a tuple. That is, tuples must be covariant for f(x::Real, y::Real) to accept an integer and a floating-point number.
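The link between tuple covariance and method matching can be checked directly. This is a sketch assuming Julia 1.x, where `hasmethod` takes a function and a tuple type; `f` mirrors the signature mentioned above.

```julia
# `f` mirrors the signature mentioned above.
f(x::Real, y::Real) = x + y

# The call f(1.5, 2) is matched by checking the tuple of argument types.
Tuple{Float64,Int64} <: Tuple{Real,Real}    # true: tuples are covariant
hasmethod(f, Tuple{Float64,Int64})          # true
f(1.5, 2)                                   # 3.5

# Contrast with an invariant container type.
Vector{Float64} <: Vector{Real}             # false
```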

So why all this fuss about types? Types and the type system are what allow multiple dispatch to work, and together multiple dispatch and types are the key organizing principles of Julia code.