This quick introduction assumes that you have basic knowledge of some scripting language and provides an example of the Julia syntax. So before we explain anything, let's just treat it like a scripting language, take a head-first dive into Julia, and see what happens.
You'll notice that, given the right syntax, almost everything will "just work". There will be some peculiarities, and these are the facts which we will study in much more depth. Usually, these oddities/differences from other scripting languages are "the source of Julia's power".
Time to start using your noggin. Scattered throughout this document are problems for you to solve using Julia. Many of the details needed for solving these problems have been covered, but some have not. You may need to use some external resources:
https://docs.julialang.org/en/stable/
https://gitter.im/JuliaLang/julia
Solve as many or as few problems as you can during these times. Please work at your own pace, or with others if that's how you're comfortable!
The main source of information is the Julia Documentation. Julia also provides lots of built-in documentation and ways to find out what's going on. There are too many tools for "hunting down what's going on / available" to explain in full detail here, so instead this will just touch on what's important. For example, prefixing a name with ? gets you to the documentation for that type, function, etc.
In [1]:
?copy
Out[1]:
To find out what methods are available, we can use the methods function. For example, let's see how + is defined:
In [2]:
methods(+)
Out[2]:
We can inspect a type by finding its fields with fieldnames
In [3]:
fieldnames(UnitRange)
Out[3]:
and find out which method was used with the @which macro:
In [4]:
@which copy([1,2,3])
Out[4]:
Notice that this gives you a link to the source code where the function is defined.
Lastly, we can find out what type a variable is with the typeof function:
In [5]:
a = [1;2;3]
typeof(a)
Out[5]:
In [6]:
a = Vector{Float64}(undef,5) # Create a length 5 Vector (dimension 1 array) of Float64's with undefined values
a = [1;2;3;4;5] # Create the column vector [1 2 3 4 5]
a = [1 2 3 4] # Create the row vector [1 2 3 4]
a[3] = 2 # Change the third element of a (using linear indexing) to 2
b = Matrix{Float64}(undef,4,2) # Define a Matrix of Float64's of size (4,2) with undefined values
c = Array{Float64}(undef, 4,5,6,7) # Define a (4,5,6,7) array of Float64's with undefined values
mat = [1 2 3 4
       3 4 5 6
       4 4 4 6
       3 3 3 3] # Define the matrix inline
mat[1,2] = 4 # Set element (1,2) (row 1, column 2) to 4
mat
Out[6]:
Note that, in the console (called the REPL), you can use ; to suppress the output. In a script this is done automatically. Note that the "value" of an array is its pointer to the memory location. This means that arrays which are set equal affect the same values:
In [7]:
a = [1;3;4]
b = a
b[1] = 10
a
Out[7]:
To set an array equal to the values of another array, use copy:
In [8]:
a = [1;4;5]
b = copy(a)
b[1] = 10
a
Out[8]:
We can also make an array of a similar size and shape via the function similar, or make an array of zeros/ones with zeros or ones respectively:
In [9]:
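c = similar(a)
d = zeros(eltype(a), size(a)) # zeros takes an element type and dimensions
e = ones(eltype(a), size(a)) # ones takes an element type and dimensions
println(c); println(d); println(e)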
Note that arrays can be indexed by arrays (or ranges):
In [10]:
a[1:2]
Out[10]:
Arrays can be of any type, specified by the type parameter. One interesting consequence is that arrays can hold arrays:
In [11]:
a = Vector{Vector{Float64}}(undef,3)
a[1] = [1;2;3]
a[2] = [1;2]
a[3] = [3;4;5]
a
Out[11]:
In [12]:
b = a
b[1] = [1;4;5]
a
Out[12]:
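As before, b = a makes both names refer to the same array, so changing b[1] also changes a. Moreover, copy only copies the outer array, so the inner arrays would still be shared. To fix this, there is a recursive copy function, deepcopy: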
In [13]:
b = deepcopy(a)
b[1] = [1;2;3]
a
Out[13]:
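The difference between plain assignment, copy, and deepcopy can be sketched as follows. This is only an illustration: the variable names (nested, alias, shallow, deep) are made up for this example, and === is used to check whether two names refer to the same object.

```julia
# Hypothetical nested array used only for this illustration
nested = [[1.0, 2.0], [3.0]]

alias   = nested            # same array object, no copying at all
shallow = copy(nested)      # new outer array, inner arrays still shared
deep    = deepcopy(nested)  # new outer array and new inner arrays

shallow[1][1] = 10.0        # mutates an inner array shared with `nested`
deep[2][1] = -1.0           # only touches the independent deep copy

println(alias === nested)         # identity check on the outer array
println(shallow[1] === nested[1]) # is the inner array shared?
println(deep[1] === nested[1])    # is the inner array shared?
println(nested)
```

Here the change made through shallow shows up in nested, while the change made through deep does not.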
For high performance, Julia provides mutating functions. These functions change the input values that are passed in, instead of returning a new value. By convention, mutating functions tend to be defined with a ! at the end and tend to mutate their first argument. An example of a mutating function is copyto!, which copies the values of the second array over to the first array.
In [14]:
a = [1;6;8]
b = similar(a) # make an array just like a but with undefined values
copyto!(b,a) # b changes
Out[14]:
The purpose of mutating functions is that they allow one to reduce the number of memory allocations, which is crucial for achieving high performance.
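To make the convention concrete, here is a minimal sketch of a user-defined mutating function. The name double! is hypothetical (it is not part of Julia); it writes its result into a preallocated first argument rather than creating a new array.

```julia
# `double!` is a hypothetical name; the trailing ! follows the naming convention
function double!(out, x)
    for i in eachindex(out, x)
        out[i] = 2 * x[i]   # write into the existing buffer
    end
    return out
end

x = [1, 6, 8]
out = similar(x)   # allocate the output buffer once
double!(out, x)    # fills `out` in place, no new array is created
println(out)
```

In a loop that calls double! many times, the same out buffer can be reused on every iteration, which is where the saving in allocations comes from.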
In [15]:
for i=1:5 # for i goes from 1 to 5
    println(i)
end
t = 0
while t<5
    println(t)
    t+=1 # t = t + 1
end
school = :UCI
if school==:UCI
    println("ZotZotZot")
else
    println("Not even worth discussing.")
end
One interesting feature about Julia control flow is that we can write multiple loops in one line:
In [16]:
for i=1:2,j=2:4
    println(i*j)
end
In [17]:
f(x,y) = 2x+y # Create an inline function
Out[17]:
In [18]:
f(1,2) # Call the function
Out[18]:
In [19]:
function f(x)
    x+2
end # Long form definition
Out[19]:
By default, Julia functions return the last value computed within them.
In [20]:
f(2)
Out[20]:
A key feature of Julia is multiple dispatch. Notice here that there is "one function", f, with two methods. Methods are the actionable parts of a function. Here, there is one method defined as (::Any,::Any) and another as (::Any), meaning that if you give f two values then it will call the first method, and if you give it one value then it will call the second method.
Multiple dispatch works on types. To define a dispatch on a type, use the ::Type signifier:
In [21]:
f(x::Int,y::Int) = 3x+2y
Out[21]:
Julia will dispatch onto the strictest acceptable type signature.
In [22]:
f(2,3) # 3x+2y
Out[22]:
In [23]:
f(2.0,3) # 2x+y since 2.0 is not an Int
Out[23]:
Types in signatures can be parametric. For example, we can define a method for "two values are passed in, both Numbers and having the same type". Note that <: means "a subtype of".
In [24]:
f(x::T,y::T) where {T<:Number} = 4x+10y
In [25]:
f(2,3) # 3x+2y since (::Int,::Int) is stricter
Out[25]:
In [26]:
f(2.0,3.0) # 4x+10y
Out[26]:
Note that type parameterizations can have as many type parameters as needed, and the parameters do not need to declare a supertype. For example, we can say that there is an x which must be a Number, while y and z must match types:
In [27]:
f(x::T,y::T2,z::T2) where {T<:Number,T2} = 5x + 5y + 5z
Out[27]:
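The dispatch rules above can be summarized in one small sketch. The function describe is hypothetical and exists only to report which method was selected; it mirrors the three kinds of signatures defined for f.

```julia
# `describe` is a hypothetical function used only to show which method is chosen
describe(x, y) = "generic"
describe(x::Int, y::Int) = "two Ints"
describe(x::T, y::T) where {T<:Number} = "two Numbers of the same type"

println(describe(2, 3))       # both arguments are Int
println(describe(2.0, 3.0))   # same type, but not Int
println(describe(2.0, 3))     # different types
println(length(methods(describe)))
```

Each call picks the most specific method whose signature accepts the arguments, falling back to the untyped method when nothing stricter applies.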
We will go into more depth on multiple dispatch later since this is the core design feature of Julia. The key feature is that Julia functions specialize on the types of their arguments. This means that f is compiled separately for each method, and for each combination of concrete argument types a method is called with. The first call with a new combination of argument types triggers compilation.
In [28]:
f(x,y,z,w) = x+y+z+w
@time f(1,1,1,1)
@time f(1,1,1,1)
@time f(1,1,1,1)
@time f(1,1,1,1.0)
@time f(1,1,1,1.0)
Out[28]:
Note that functions can also feature optional keyword arguments, which are listed after a ; in the signature:
In [29]:
function test_function(x,y;z=0) # z is an optional keyword argument
    if z==0
        return x+y,x*y # Return a tuple
    else
        return x*y*z,x+y+z # Return a different tuple
        # whitespace is optional
    end # End if statement
end # End function definition
Out[29]:
Here, if z is not specified, then it's 0.
In [30]:
x,y = test_function(1,2)
Out[30]:
In [31]:
x,y = test_function(1,2;z=3)
Out[31]:
Notice that we also featured multiple return values.
In [32]:
println(x); println(y)
The return type for multiple return values is a Tuple. The syntax for a tuple is (x,y,z,...) or inside of functions you can use the shorthand x,y,z,... as shown.
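As a small sketch of how keyword arguments and tuple return values fit together, consider the following. The function sum_and_product and its keyword scale are hypothetical names chosen for this example.

```julia
# `sum_and_product` is a hypothetical helper for this illustration
function sum_and_product(x, y; scale=1)  # `scale` is a keyword argument
    return scale * (x + y), scale * (x * y)
end

result = sum_and_product(2, 3)           # keep the whole tuple
s, p = sum_and_product(2, 3; scale=10)   # destructure into two variables

println(result isa Tuple)
println(result[1])
println(s, " ", p)
```

The returned tuple can be kept whole and indexed, or unpacked directly on the left-hand side of the assignment.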
Note that functions in Julia are "first-class". This means that functions are just values with a type of their own. Therefore functions can make functions, you can store functions in variables, pass them as arguments, etc. For example:
In [33]:
function function_playtime(x) # x is the only argument
    y = 2+x
    function test()
        2y # y is defined in the enclosing scope, so it's available here
    end
    z = test() * test()
    return z,test
end # End function definition
z,test = function_playtime(2)
Out[33]:
In [34]:
test()
Out[34]:
Notice that test() does not get passed in y but knows what y is. This is due to the function scoping rules: an inner function can know the variables defined in the same scope as the function. This rule is recursive, leading us to the conclusion that the top level scope is global. Yes, that means
In [35]:
a = 2
Out[35]:
defines a global variable. We will go into more detail on this.
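The scoping rule for inner functions can also be used to keep state between calls. The following is a minimal sketch with hypothetical names (make_counter, increment, n): the inner function both reads and updates a variable from the enclosing function's scope.

```julia
# `make_counter` is a hypothetical name used only for this sketch
function make_counter()
    n = 0
    function increment()
        n += 1       # updates `n` from the enclosing function scope
        return n
    end
    return increment
end

counter = make_counter()
counter()
counter()
println(counter())

other = make_counter()   # a fresh inner function with its own `n`
println(other())
```

Each call to make_counter creates a new n, so the two returned functions count independently.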
Lastly we show the anonymous function syntax. This allows you to define a function inline.
In [36]:
g = (x,y) -> 2x+y
Out[36]:
Unlike named functions, g is simply a function in a variable and can be overwritten at any time:
In [37]:
g = (x) -> 2x
Out[37]:
An anonymous function defined this way has only one method. However, as of v0.5, anonymous functions are compiled and thus do not have any performance disadvantages compared to named functions.
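Anonymous functions are most often used as arguments to other functions. A short sketch using the built-in map and filter, plus a hypothetical helper apply_twice that takes a function as its first argument:

```julia
xs = [1, 2, 3, 4]

doubled = map(x -> 2x, xs)              # apply an anonymous function to each element
evens   = filter(x -> x % 2 == 0, xs)   # keep the elements for which it returns true

apply_twice(f, x) = f(f(x))             # hypothetical helper taking a function argument

println(doubled)
println(evens)
println(apply_twice(x -> x + 3, 1))
```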
A type is what in many other languages is an "object". If that is a foreign concept, think of a type as a thing which has named components. A type is the idea for what the thing is, while an instantiation of the type is a specific one. For example, you can think of a car as having a make and a model. So that means a Toyota RAV4 is an instantiation of the car type.
In Julia, we would define the car type as follows:
In [38]:
mutable struct Car
    make
    model
end
We could then make the instance of a car as follows:
In [39]:
mycar = Car("Toyota","Rav4")
Out[39]:
Here I introduced the string syntax for Julia which uses "..." (like most other languages, I'm glaring at you MATLAB). I can grab the "fields" of my type using the . syntax:
In [40]:
mycar.make
Out[40]:
To "enhance Julia's performance", one usually likes to make the typing stricter. For example, we can define a WorkshopParticipant (notice the convention for type names is capitalized CamelCase) as having a name and a field. The name will be a String and the field will be a Symbol (written as :Symbol, which we will go into in much more detail later).
In [41]:
mutable struct WorkshopParticipant
    name::String
    field::Symbol
end
tony = WorkshopParticipant("Tony",:physics)
Out[41]:
As with functions, types can be set "parametrically". For example, we can have a StaffMember have a name and a field, but also an age. We can allow this age to be any Number type as follows:
In [42]:
mutable struct StaffMember{T<:Number}
    name::String
    field::Symbol
    age::T
end
ter = StaffMember("Terry",:football,17)
Out[42]:
The rules for parametric typing are the same as for functions. Note that most of Julia's types, like Float64 and Int, are natively defined in Julia in this manner. This means that there's no limit for user defined types, only your imagination. Indeed, many of Julia's features first start out in a prototyping package before they are ever moved into Base (the Julia library that ships as the Base module in every installation).
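To see what the type parameter does in practice, here is a minimal sketch. The type Employee is hypothetical and simply mirrors the parametric pattern above; the parameter T is inferred from the value passed for age.

```julia
# `Employee` is a hypothetical type mirroring the parametric struct pattern
struct Employee{T<:Number}
    name::String
    age::T
end

a = Employee("Ann", 30)      # T is inferred as Int
b = Employee("Bob", 41.5)    # T is inferred as Float64

println(typeof(a) == Employee{Int})
println(typeof(b) == Employee{Float64})
println(a isa Employee)      # both are instances of the unparameterized Employee
```

Each distinct T gives a distinct concrete type, which is what lets the compiler know exactly how the age field is stored.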
Lastly, there exist abstract types. These types cannot be instantiated but are used to build the type hierarchy. You've already seen one abstract type, Number. We can define one for Person using the abstract type keyword:
In [43]:
abstract type Person
end
Then we can set types as a subtype of Person:
In [44]:
mutable struct Student <: Person
    name
    grade
end
You can define type hierarchies on abstract types. See the beautiful explanation at: http://docs.julialang.org/en/release-0.5/manual/types/#abstract-types
In [45]:
abstract type AbstractStudent <: Person
end
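Abstract types become useful once functions dispatch on them. The sketch below uses hypothetical names throughout (Shape, Circle, Square, area, describe_shape): one method is written against the abstract type and works for every subtype that provides area.

```julia
# All names here (Shape, Circle, Square, area, describe_shape) are hypothetical
abstract type Shape end

struct Circle <: Shape
    radius::Float64
end

struct Square <: Shape
    side::Float64
end

area(c::Circle) = pi * c.radius^2
area(s::Square) = s.side^2

# Defined once on the abstract type, works for every subtype
describe_shape(s::Shape) = "shape with area $(area(s))"

println(Circle <: Shape)
println(describe_shape(Square(2.0)))
```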
Another "version" of type is the immutable type, declared with struct alone (without the mutable keyword). When a type is immutable, its fields cannot be changed. However, Julia can often stack allocate immutable types, whereas mutable types are generally heap allocated. If this is unfamiliar terminology, then think of this as meaning that immutable types are able to be stored closer to the CPU and have less cost for memory access (this is a detail not present in many scripting languages). Many things like Julia's built-in Number types are defined as immutable in order to give good performance.
In [46]:
struct Field
    name
    school
end
ds = Field(:DataScience,[:PhysicalScience;:ComputerScience])
Out[46]:
In [47]:
ds.name = :ComputationalStatistics # Error: the fields of an immutable struct cannot be reassigned
However, the following is allowed:
In [48]:
push!(ds.school,:BiologicalScience)
ds.school
Out[48]:
(Hint: recall that an array is not the values itself, but a pointer to the memory of the values)
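The distinction between reassigning a field and mutating the object a field points to can be checked directly. This is a sketch under the assumption of a hypothetical immutable struct Course; the try/catch is only there so the example runs to completion.

```julia
# `Course` is a hypothetical immutable struct with an array-valued field
struct Course
    name::Symbol
    tags::Vector{Symbol}
end

course = Course(:Julia, [:programming])

# Reassigning a field of an immutable struct throws an error
reassign_failed = try
    course.name = :Python
    false
catch
    true
end

push!(course.tags, :science)   # allowed: the array itself is mutable
println(reassign_failed)
println(course.tags)
```

The field tags always refers to the same array, but the contents of that array are free to change.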
One important detail in Julia is that everything has a type (and every piece of code is an expression of type Expr, more on this later). Thus functions have types too. Not only is everything compiled down to native code, but all of the "native parts" are always accessible. For example, we can, if we so choose, get a function pointer:
In [49]:
foo(x) = 2x
@cfunction(foo, Int, (Int,))
Julia provides many basic types. Indeed, you will come to know Julia as a system of multiple dispatch on types, meaning that the interaction of types with functions is core to the design.
While MATLAB or Python has easy functions for building arrays, Julia tends to side-step the actual "array" part with specially made types. One such example is the range. To define a range, use the start:stepsize:end syntax. For example:
In [50]:
a = 1:5
println(a)
b = 1:2:10
println(b)
We can use them like any array. For example:
In [51]:
println(a[2]); println(b[3])
But what is b?
In [52]:
println(typeof(b))
b isn't an array, it's a StepRange. A StepRange has the ability to act like an array using its fields:
In [53]:
fieldnames(StepRange)
Out[53]:
Note that at any time we can get the array from these kinds of types via the collect function:
In [54]:
c = collect(a)
Out[54]:
The reason why lazy iterator types are preferred is that they do not do the computations until it's absolutely necessary, and they take up much less space. We can check this with @time:
In [55]:
@time a = 1:100000
@time a = 1:100
@time b = collect(1:100000);
Notice that the amount of time the range takes is much shorter. This is mostly because there is a lot less memory allocation needed: for 1:100000 only a UnitRange is built, and all that holds is the start and stop values (a StepRange additionally stores the step). However, b has to hold 100000 numbers, leading to the huge difference.
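The same point can be checked without timing anything. The sketch below compares a range with its collected array using first, step, last, and sizeof; the variable names (r, v, big_range, big_array) are made up for this example.

```julia
r = 1:2:10                 # a StepRange, described by start, step and stop
v = collect(r)             # materialize the values into a Vector

println(first(r), " ", step(r), " ", last(r))
println(v)
println(length(r) == length(v))

big_range = 1:100000
big_array = collect(big_range)
# The range stores only its endpoints, the array stores every element
println(sizeof(big_range) < sizeof(big_array))
```

Note that last(r) is the last value actually reached by the step, which need not equal the stop value written in the range literal.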
In [56]:
d = Dict(:test=>2,"silly"=>:suit)
println(d[:test])
println(d["silly"])
In [57]:
tup = (2.,3) # Don't have to match types
x,y = (3.0,"hi") # Can separate a tuple to multiple variables
Out[57]:
Metaprogramming is a huge feature of Julia. The key idea is that every statement in Julia is of the type Expr. Julia operates by building an Abstract Syntax Tree (AST) from the expressions. You've already been exposed to this a little bit: a Symbol (like :PhysicalSciences) is not a string because it is part of the AST, and thus is part of the parsing/expression structure. One interesting thing is that symbol comparisons are O(1) while string comparisons, like always, are O(n). Macros (the weird functions with an @) are also part of this: they are functions on expressions.
Thus you can think of metaprogramming as "code which takes in code and outputs code". One basic example is the @time macro:
In [58]:
macro my_time(ex)
    return quote
        local t0 = time()
        local val = $ex
        local t1 = time()
        println("elapsed time: ", t1-t0, " seconds")
        val
    end
end
Out[58]:
This takes in an expression ex, gets the time before and after evaluation, and prints the elapsed time between (the real @time macro also calculates the allocations as seen earlier). Note that $ex "interpolates" the expression into the quoted code that the macro returns. Going into detail on metaprogramming is a large step from standard scripting and will be a later session.
Why macros? One reason is that they let you define any syntax you want. Since a macro operates on the expressions themselves, as long as you know how to turn the expression into working code, you can "choose any syntax" to be your syntax. A case study will be shown later. Another reason is that macros are expanded at "parse time", so they are only called once (before the function compilation).
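To make "code as data" a little more tangible, here is a small sketch that builds an expression, inspects it, and then defines a tiny macro. The macro name @twice and the variable hits are hypothetical; esc is used so that the interpolated expression refers to the caller's variables.

```julia
ex = :(1 + 2 * 3)          # quote code into an expression instead of running it

println(typeof(ex))        # the expression is an ordinary Julia value
println(ex.head)           # the kind of expression
println(ex.args[1])        # the function being called
println(eval(ex))          # evaluate the expression

# `@twice` is a hypothetical macro that repeats the expression it is given
macro twice(ex)
    return quote
        $(esc(ex))
        $(esc(ex))
    end
end

hits = Ref(0)              # a mutable container so the effect is easy to observe
@twice hits[] += 1
println(hits[])
```

The macro receives hits[] += 1 as an expression and returns new code containing it twice, so the increment runs two times.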