Presented to the Polyglot Programming DC Meetup, August 7th, 2014.
GitHub repo: https://github.com/HarlanH/JuliaPolygotPresentation
Huge parts of this presentation are cribbed from:
And see also:
Stefan Karpinski, Jeff Bezanson, Viral Shah, Alan Edelman (and the Father of Floating Point):
Pros: fast matrix algebra, REPL, easy to start using
Cons: commercial software, slow or impractical for many non-numeric tasks, syntax issues
Pros: multiple dispatch, macros
Cons: obscure syntax, hard to learn
Pros: dynamic, great ecosystem, easy to start using, elegant OO design
Cons: have to extend in C for performance, version and package issues
Pros: blazing fast, best-of-class algorithms
Cons: hard to do anything but number crunching
Pros: syntax matters, dynamic, great ecosystem
Cons: very slow, especially for numerical work
Pros: domain-specific but not domain-limited, huge package ecosystem, great interactivity
Cons: forces vectorization for speed, hard to contribute to core, quirky
Pros: blazing fast JIT compilers, forces asynchronous design thinking
Cons: everything else
Pros: simple syntax, very fast
Cons: no REPL, error-prone memory handling, static typing means hard to start
Common pattern: an outer scripting language wraps an inner systems language (JCL + Asm, Matlab + Fortran, R/Python + C, JavaScript + C++)
Object oriented programming: Easy to add new types!
Web-centric languages! Yay!
Julia was designed from the beginning to have certain features that are necessary for high performance:
- type stability
- pervasive type inference
- execution semantics that closely match the hardware capabilities (pass-by-sharing, machine arithmetic)
- inlining
- macros
Additionally, certain oft-requested features were not included which make high performance much more difficult – or impossible – to achieve. These include:
- pass-by-copy
- first class local namespaces (aka eval in function scope)
- allowing overloading of the getfield function (a.b access)
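To make "type stability" from the list above concrete, here is a minimal sketch. It assumes a current Julia 1.x release, and the function names unstable_clamp and stable_clamp are hypothetical, chosen only for illustration.

```julia
# Type-unstable: returns the Int literal 0 for negative inputs and a Float64
# otherwise, so the return type depends on the runtime value of x.
unstable_clamp(x::Float64) = x < 0 ? 0 : x

# Type-stable: zero(x) has the same type as x, so the result is always Float64.
stable_clamp(x::Float64) = x < 0 ? zero(x) : x

typeof(unstable_clamp(-1.0))  # Int
typeof(unstable_clamp(2.0))   # Float64
typeof(stable_clamp(-1.0))    # Float64
typeof(stable_clamp(2.0))     # Float64
```

In the REPL, @code_warntype unstable_clamp(-1.0) should flag the mixed return type, which is the kind of case the compiler cannot specialize as tightly.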
In [1]:
function collatz(n) # unproven: always terminates
    k = 0
    while n > 1
        n = isodd(n) ? 3n+1 : n>>1
        k += 1
    end
    return k
end
Out[1]:
In [2]:
collatz(89)
Out[2]:
In [3]:
for i = 2:2:20
    α = collatz(i)
    println("$i = $α")
end
In [4]:
@time for i = 1:1e6 collatz(18) end
In [5]:
@code_native collatz(123)
In [6]:
@code_llvm collatz(123)
From Stefan Karpinski, The Design Impact of Multiple Dispatch, and Jiahao Chen, Julia Compiler and Community.
In [7]:
f(a::Any, b) = "fallback"
f(a::Number, b::Number) = "a and b are both numbers"
f(a::Number, b) = "a is a number"
f(a, b::Number) = "b is a number"
f(a::Integer, b::Integer) = "a and b are both integers"
Out[7]:
In [8]:
f(1.5, 2)
Out[8]:
In [9]:
print(typeof(1.5), ", ", typeof(2))
In [10]:
f(1, "bar")
Out[10]:
In [11]:
f(1, 2)
Out[11]:
In [12]:
f("foo", [1,2])
Out[12]:
In [13]:
f{T<:Number}(a::T, b::T) = "a and b are both $(T)s"
Out[13]:
In [14]:
methods(f)
Out[14]:
In [15]:
f(big(1.5), big(2.5))
Out[15]:
In [16]:
f("foo", "bar") #<== still doesn't apply to non-numbers
Out[16]:
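The f{T<:Number}(a::T, b::T) spelling used in In [13] is the 2014-era syntax; Julia 1.x writes the same parametric method with a where clause. The sketch below assumes Julia 1.x and uses a hypothetical function g with a reduced method table (only three of the methods shown for f), so it illustrates the dispatch rule rather than reproducing the cells above.

```julia
# Hypothetical function `g`, mirroring part of the method table for `f`.
g(a, b) = "fallback"
g(a::Number, b::Number) = "a and b are both numbers"

# Julia 1.x spelling of the parametric method: both arguments must share
# one concrete type T, and T must be a subtype of Number.
g(a::T, b::T) where {T<:Number} = "a and b are both $(T)s"

g(1.5, 2)              # "a and b are both numbers" (Float64 and Int differ)
g(1.5, 2.5)            # "a and b are both Float64s"
g(big(1.5), big(2.5))  # "a and b are both BigFloats"
g("foo", "bar")        # "fallback": T<:Number excludes strings
```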
In [17]:
immutable Interval{T<:Real} <: Number
    lo::T
    hi::T
end
(a::Real)..(b::Real) = Interval(a,b)
Base.show(io::IO, iv::Interval) = print(io, "($(iv.lo))..($(iv.hi))")
Out[17]:
In [18]:
(1..2) + 3 # tries but fails to find a way to reconcile two Numbers
In [19]:
1..2
Out[19]:
In [20]:
typeof(ans)
Out[20]:
In [21]:
sizeof(1..2) # two 64-bit/8-byte ints
Out[21]:
In [22]:
(1//2)..(2//3)
Out[22]:
In [23]:
a::Interval + b::Interval = (a.lo + b.lo)..(a.hi + b.hi)
a::Interval - b::Interval = (a.lo - b.hi)..(a.hi - b.lo)
Out[23]:
In [24]:
(2..3) + (-1..1)
Out[24]:
In [25]:
(2..3) + (1.0..3.14159) # autoconverts
Out[25]:
In [26]:
@code_native (2..3) + (-1..1)
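For readers on a current release, the Interval cells above can be sketched in Julia 1.x syntax as follows. Two things are assumptions about the newer language rather than part of the talk: immutable is spelled struct, and extending + and - is written with the Base-qualified operator name.

```julia
# Julia 1.x spelling of the Interval type (`immutable` became `struct`).
struct Interval{T<:Real} <: Number
    lo::T
    hi::T
end

# Infix constructor: `..` parses as a binary operator and is free to define.
..(a::Real, b::Real) = Interval(a, b)

Base.show(io::IO, iv::Interval) = print(io, "($(iv.lo))..($(iv.hi))")

# Extend the existing generic functions + and - with Interval methods.
Base.:+(a::Interval, b::Interval) = (a.lo + b.lo)..(a.hi + b.hi)
Base.:-(a::Interval, b::Interval) = (a.lo - b.hi)..(a.hi - b.lo)

s = (2..3) + (-1..1)   # lo = 1, hi = 4
d = (2..3) - (-1..1)   # lo = 1, hi = 4

# An Interval of two machine integers is stored inline, with no boxing.
sizeof(1..2) == 2 * sizeof(Int)
```

Because the new methods are added to the generic functions already in Base, any existing code that calls + or - on its arguments can work with Interval values without modification.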
This means you're free to extend everything
Since generic functions are open:
We're forced to think much harder about the meaning of operations
Results, if done well, are abstractions, defined generically, that extend easily
round(number, digits, base)
DataFrames
VennEuler
(shown elsewhere)
In [27]:
methods(round) # click through...
Out[27]:
In [28]:
round(123.321)
Out[28]:
In [29]:
round(123.321, 2)
Out[29]:
In [30]:
round(123.321, -1)
Out[30]:
In [31]:
round(123.321, 1, 2)
Out[31]:
In [32]:
for i = 1:10 println(round(123.321, i, 2)) end
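The positional round(number, digits, base) form shown above is from the 2014-era release. In Julia 1.x, digits and base are keyword arguments, so the same calls would be written roughly as in this sketch (assuming a 1.x release):

```julia
round(123.321)                    # 123.0
round(123.321, digits=2)          # ≈ 123.32
round(123.321, digits=-1)         # 120.0, rounds to the nearest ten
round(123.321, digits=1, base=2)  # 123.5, one binary digit after the point

# Increasing binary precision, as in the loop above.
for i = 1:10
    println(round(123.321, digits=i, base=2))
end
```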
In [33]:
Pkg.installed()
Out[33]:
In [34]:
using RDatasets
iris = dataset("datasets", "iris")
Out[34]:
In [35]:
typeof(iris)
Out[35]:
In [36]:
by(iris, :Species, df -> DataFrame(mean_length = mean(df[:PetalLength])))
Out[36]:
- using pulls dependencies too, in this case including DataFrames
- show(DataFrame) outputs Markdown-compatible tables
- by is part of the split-apply-combine idiom for DFs
- :Species is a symbol, à la Lisp, used frequently instead of Enums
- df -> ... is an anonymous function
- DataFrame constructor gets the new column name by tricky use of named arguments
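The split-apply-combine idiom, symbols, and anonymous functions mentioned above can be sketched without any packages. This is not how DataFrames implements by; it is a plain-Julia illustration, assuming Julia 1.x, with made-up toy rows (not values from the iris dataset) and hypothetical names rows, groups, and summarise.

```julia
using Statistics  # standard library; provides mean

# Made-up toy rows; the species labels are Symbols, like :Species above.
rows = [
    (species = :setosa,    petal_length = 1.0),
    (species = :setosa,    petal_length = 2.0),
    (species = :virginica, petal_length = 5.0),
    (species = :virginica, petal_length = 6.0),
    (species = :virginica, petal_length = 7.0),
]

# Split: collect the petal lengths under each species key.
groups = Dict{Symbol,Vector{Float64}}()
for r in rows
    push!(get!(groups, r.species, Float64[]), r.petal_length)
end

# Apply and combine: an anonymous function summarises each group,
# and the results are gathered into one Dict keyed by species.
summarise = v -> mean(v)
mean_length = Dict(k => summarise(v) for (k, v) in groups)

mean_length[:setosa]     # 1.5
mean_length[:virginica]  # 6.0
```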
In [39]:
using GLM
lm1 = fit(LinearModel, SepalLength ~ SepalWidth + PetalLength, iris)
Out[39]:
In [40]:
a ~ b + c
Out[40]:
- fit is a generic function, specialized here on a model type, a Formula, and data
- lm1 is interesting
- ~ is syntactic sugar that calls a macro that captures the expression in a Formula
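To show what "a macro that captures the expression" means, here is a toy sketch assuming Julia 1.x. ToyFormula and @toyformula are hypothetical names invented for this illustration; they are not the Formula type or the macro that DataFrames and GLM actually use.

```julia
# Hypothetical container for a captured formula.
struct ToyFormula
    lhs::Symbol
    rhs::Any
end

# The macro receives the unevaluated expression `lhs ~ rhs` and stores its
# two sides as data instead of trying to compute anything with them.
macro toyformula(ex)
    (ex isa Expr && ex.head == :call && length(ex.args) == 3 && ex.args[1] == :~) ||
        error("expected an expression of the form lhs ~ rhs")
    lhs, rhs = ex.args[2], ex.args[3]
    return :(ToyFormula($(QuoteNode(lhs)), $(QuoteNode(rhs))))
end

fm = @toyformula(SepalLength ~ SepalWidth + PetalLength)

fm.lhs  # :SepalLength
fm.rhs  # :(SepalWidth + PetalLength)
```

The column names never have to exist as variables, because the macro sees them as symbols before evaluation; a model-fitting function can then look them up in the data it is given.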