Visualization is a key topic in scientific computing. Currently there are several options available for producing different types of graphics in Julia.
For those coming from Python, one of the most accessible (and full-featured) options is PyPlot, a Julia wrapper around the pyplot submodule of Python's matplotlib library.
To install the PyPlot package, if it is not already installed, run:
In [1]:
Pkg.add("PyPlot")
If the requested package is already installed, Pkg reports this with the message INFO: Nothing to be done.
It also warns you if a newer version of the package may be available.
Pkg is Julia's package manager. Each package is a git repository, cloned into its own subdirectory of the .julia directory in your home directory.
The list of packages currently available is at http://pkg.julialang.org/; it is growing rapidly.
Each time it is invoked, Pkg checks whether the METADATA.jl repository, which holds the information about available packages, has been updated.
Pkg.update() will update to the latest versions of installed packages. (Currently this is an all-or-nothing operation, updating all installed packages.)
Using packages
In [2]:
using PyPlot
using is analogous to from [package] import * in Python; it makes all the names exported by the package available without qualification.
We could instead use import PyPlot, in which case the names must be qualified with the module name, as in PyPlot.plot.
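As a minimal sketch of the difference, the following uses a small inline module, the hypothetical Shapes, so that no package needs to be installed. It assumes Julia 1.x, where a module defined in the current session is referred to with a leading dot; the function names are illustrative only.

```julia
# A small module defined inline (hypothetical; not part of any package).
module Shapes
export area                      # only exported names are brought in by `using`

area(r) = pi * r^2               # exported
circumference(r) = 2pi * r       # not exported
end

using .Shapes                    # leading dot: a module defined in the current session (Julia 1.x)

a = area(1.0)                    # exported name, usable without qualification
c = Shapes.circumference(1.0)    # unexported name, needs the module prefix
println(a, " ", c)
```

With import .Shapes in place of using .Shapes, both calls would have to be written with the Shapes. prefix.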
In [4]:
x = rand(10)
y = rand(10)
p = plot(x, y, "ro-")
Out[4]:
In [5]:
p
Out[5]:
An alternative is to use import instead:
In [3]:
import PyPlot
In [6]:
PyPlot.figure(figsize=(4,2))
PyPlot.plot(rand(10), rand(10))
Out[6]:
In [ ]:
PyPlot.save
In [3]:
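L = 1000
diffs = Float64[]                # spacings between consecutive eigenvalues
for i in 1:10
    M = Symmetric(randn(L, L))   # random real symmetric matrix
    lamb = eigvals(M)            # real eigenvalues, in ascending order
    append!(diffs, diff(lamb))   # accumulate the spacings in place
end
h = hist(diffs, 300)             # h[1]: bin edges, h[2]: counts
plot(h[1][1:end-1], h[2])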
Out[3]:
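The hist function used above is only available in older Julia releases. As a rough sketch of the same computation, assuming Julia 1.x and nothing beyond the standard library, the spacings can be binned by hand. The smaller matrix size, the seed, and the bin count are arbitrary choices made to keep the example quick; they are not taken from the cell above.

```julia
using LinearAlgebra              # Symmetric, eigvals
using Random

Random.seed!(1)                  # arbitrary seed, only for reproducibility
L = 200                          # smaller than above so the sketch runs quickly
spacings = Float64[]
for i in 1:5
    M = Symmetric(randn(L, L))   # random real symmetric matrix
    lamb = eigvals(M)            # real eigenvalues, in ascending order
    append!(spacings, diff(lamb))
end

# Bin the spacings into equal-width bins by hand.
nbins = 30
lo, hi = extrema(spacings)
width = (hi - lo) / nbins
counts = zeros(Int, nbins)
for s in spacings
    k = min(nbins, floor(Int, (s - lo) / width) + 1)   # clamp the maximum into the last bin
    counts[k] += 1
end
println(counts)
```

The vector counts can then be passed to any plotting function, for example against the left bin edges lo .+ width .* (0:nbins-1).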
Gadfly is a native Julia plotting package based on the Grammar of Graphics, which gives it a flavour similar to R's ggplot2 package.
In [1]:
using Gadfly
In [2]:
xs = 1:10
ys = rand(10)
plot(x=xs, y=ys)
Out[2]:
The plot can be panned and zoomed directly in the browser.
Different geometries are available:
In [3]:
plot(x=xs, y=ys, Geom.line)
Out[3]:
In [7]:
plot(x=rand(10), y=rand(10), Geom.point, Geom.line(preserve_order=true))
Out[7]:
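By default, Geom.line joins the points in order of increasing $x$. Passing preserve_order=true to Geom.line, as in the cell above, joins them in the order in which they are given instead. The same argument can be used in each layer of a layered plot: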
In [8]:
p = plot(layer(x=rand(10), y=rand(10), color=[1], Geom.point, Geom.line(preserve_order=true)),
         layer(x=rand(10), y=rand(10), Geom.point, Geom.line(preserve_order=true)),
         Guide.xlabel("first"), Guide.ylabel("second"))
Out[8]:
We can write to a PDF:
In [9]:
draw(PDF("stuff.pdf", 10cm, 5cm), p)
In [11]:
;open stuff.pdf
The DataFrames package provides pandas-like functionality.
In [9]:
Pkg.add("DataFrames")
Gadfly has excellent integration with DataFrames. The RDatasets package supplies example data sets from R to try it on:
In [10]:
Pkg.add("RDatasets")
In [12]:
using Gadfly
using RDatasets
In [13]:
irises = dataset("datasets", "iris")
head(irises)
Out[13]:
In [14]:
typeof(irises)
Out[14]:
In [15]:
plot(irises, x="SepalLength", y="SepalWidth", Geom.point)
Out[15]:
In [17]:
cars = dataset("car", "SLID")  # the SLID data set from R's car package
head(cars)
Out[17]:
In [18]:
plot(cars, x="Wages", color="Language", Geom.histogram)
Out[18]:
A module encapsulates code in its own namespace. The syntax is as follows:
module NAME
import Base.show, Base.getindex # imports to extend Base functions
export A, B # exports
...
end
We can then do
using NAME
to use the functionality.
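To make the template concrete, here is a minimal sketch of a module that extends two Base functions for its own type. The module name Polys and the type Poly are hypothetical, and the syntax assumes Julia 1.x (struct, and a leading dot when loading a module defined in the current session).

```julia
module Polys
import Base.show, Base.getindex    # imported so that new methods can be added to them
export Poly                        # the only name made available by `using`

struct Poly
    coeffs::Vector{Float64}        # coeffs[k + 1] is the coefficient of x^k
end

# p[k] returns the coefficient of x^k
getindex(p::Poly, k::Integer) = p.coeffs[k + 1]

# custom text representation, used by print, println and string
show(io::IO, p::Poly) = print(io, "Poly(", join(p.coeffs, ", "), ")")
end

using .Polys

p = Poly([1.0, 2.0, 3.0])
println(p)       # prints Poly(1.0, 2.0, 3.0)
println(p[1])    # prints 2.0
```

Without the import line, the definitions inside the module would create new, unrelated functions named show and getindex instead of adding methods to the ones in Base, so indexing and printing would not pick them up.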
In [19]:
import PyPlot
In [21]:
plt = PyPlot
Out[21]:
In [20]:
help("import")
In [22]:
plt.plot
Out[22]:
In [25]:
macroexpand(:(@time begin sin(10); @time cos(10) end ))
Out[25]:
In [ ]:
i = 0
@time @until