Getting Started with GraphLab Create - Tutorial

Code available via turi-code tutorials

In [1]:
import graphlab as gl

In [2]:
gl.canvas.set_target('ipynb') # use IPython Notebook output for GraphLab Canvas

In [3]:
vertices = gl.SFrame.read_csv('')
edges = gl.SFrame.read_csv('')

[INFO] graphlab.cython.cy_server: GraphLab Create v2.1 started. Logging: /tmp/graphlab_server_1472322388.log
This non-commercial license of GraphLab Create for academic use is assigned to and will expire on August 27, 2017.
Downloading to /var/tmp/graphlab-dainabouquin/3885/36c94d2f-e9d0-49b4-9488-c03f03acfdfa.csv
Finished parsing file
Parsing completed. Parsed 10 lines in 0.047944 secs.
Inferred types from first 100 line(s) of file as 
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
Finished parsing file
Parsing completed. Parsed 10 lines in 0.017223 secs.
Downloading to /var/tmp/graphlab-dainabouquin/3885/4c04b0d6-a376-44a1-9a57-af6f10d6d7ca.csv
Finished parsing file
Parsing completed. Parsed 20 lines in 0.016904 secs.
Inferred types from first 100 line(s) of file as 
If parsing fails due to incorrect types, you can correct
the inferred type list above and pass it to read_csv in
the column_type_hints argument
Finished parsing file
Parsing completed. Parsed 20 lines in 0.017081 secs.

In [4]:
# SFrame has a number of methods to explore and transform your data;
# for example, show() opens a GraphLab Canvas summary of the vertices SFrame
vertices.show()

In [5]:
# This shows the summary of the edges SFrame
edges.show()

In [6]:
# Create an empty graph object
g = gl.SGraph()

In [7]:
# Add vertices to the graph, using the 'name' column as the vertex ID.
# add_vertices returns a new SGraph rather than modifying g in place, so reassign the result.
g = g.add_vertices(vertices=vertices, vid_field='name')

In [8]:
# Add edges the same way; add_edges also returns a new SGraph
g = g.add_edges(edges=edges, src_field='src', dst_field='dst')

In [9]:
# Show all the vertices
g.get_vertices()

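| __id           | gender | license_to_kill | villian |
| Inga Bergstorm | F      | 0               | 0       |
| Moneypenny     | F      | 1               | 0       |
| Henry Gupta    | M      | 0               | 1       |
| Wai Lin        | F      | 1               | 0       |
| Q              | M      | 1               | 0       |
| James Bond     | M      | 1               | 0       |
| M              | M      | 1               | 0       |
| Paris Carver   | F      | 0               | 1       |
| Elliot Carver  | M      | 0               | 1       |
| Gotz Otto      | M      | 0               | 1       |
[10 rows x 4 columns]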

In [10]:
# Show all the edges
g.get_edges()

| __src_id       | __dst_id       | relation   |
| Moneypenny     | M              | managed_by |
| Inga Bergstorm | James Bond     | friend     |
| Moneypenny     | Q              | colleague  |
| Henry Gupta    | Elliot Carver  | killed_by  |
| Q              | Moneypenny     | colleague  |
| M              | Moneypenny     | worksfor   |
| James Bond     | Inga Bergstorm | friend     |
| James Bond     | M              | managed_by |
| Q              | M              | managed_by |
| Wai Lin        | James Bond     | friend     |
[20 rows x 3 columns]
Note: Only the head of the SFrame is printed.
You can use print_rows(num_rows=m, num_columns=n) to print more rows and columns.

In [11]:
# Get all the "friend" edges
g.get_edges(fields={'relation': 'friend'})

| __src_id       | __dst_id       | relation |
| Inga Bergstorm | James Bond     | friend   |
| James Bond     | Inga Bergstorm | friend   |
| Wai Lin        | James Bond     | friend   |
| James Bond     | Wai Lin        | friend   |
[4 rows x 3 columns]
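
The field filter above can be pictured as a scan over edge records. Here is a minimal plain-Python sketch of that idea; it is not GraphLab's implementation. The `edges` list holds only the 11 edge rows that are visible in this notebook's output (the full SFrame has 20), and `filter_edges` is a hypothetical helper name.

```python
# Edge rows copied from the outputs printed in this notebook (11 of the 20 edges).
edges = [
    {"src": "Moneypenny", "dst": "M", "relation": "managed_by"},
    {"src": "Inga Bergstorm", "dst": "James Bond", "relation": "friend"},
    {"src": "Moneypenny", "dst": "Q", "relation": "colleague"},
    {"src": "Henry Gupta", "dst": "Elliot Carver", "relation": "killed_by"},
    {"src": "Q", "dst": "Moneypenny", "relation": "colleague"},
    {"src": "M", "dst": "Moneypenny", "relation": "worksfor"},
    {"src": "James Bond", "dst": "Inga Bergstorm", "relation": "friend"},
    {"src": "James Bond", "dst": "M", "relation": "managed_by"},
    {"src": "Q", "dst": "M", "relation": "managed_by"},
    {"src": "Wai Lin", "dst": "James Bond", "relation": "friend"},
    {"src": "James Bond", "dst": "Wai Lin", "relation": "friend"},
]


def filter_edges(edge_rows, **fields):
    """Return the rows whose columns equal every requested field value."""
    return [
        row for row in edge_rows
        if all(row.get(name) == value for name, value in fields.items())
    ]


friend_edges = filter_edges(edges, relation="friend")
for row in friend_edges:
    print(row["src"], "->", row["dst"])
```

Passing several keyword arguments, for example `filter_edges(edges, src="Q", relation="colleague")`, narrows the match to rows that satisfy all of them.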

In [12]:
# Apply the PageRank algorithm to our graph
pr = gl.pagerank.create(g)

Counting out degree
Done counting out degree
| Iteration | L1 change in pagerank |
| 1         | 6.65833               |
| 2         | 4.65611               |
| 3         | 3.46298               |
| 4         | 2.55686               |
| 5         | 1.95422               |
| 6         | 1.42139               |
| 7         | 1.10464               |
| 8         | 0.806704              |
| 9         | 0.631771              |
| 10        | 0.465388              |
| 11        | 0.364898              |
| 12        | 0.271257              |
| 13        | 0.212255              |
| 14        | 0.159062              |
| 15        | 0.124071              |
| 16        | 0.0935911             |
| 17        | 0.0727674             |
| 18        | 0.0551714             |
| 19        | 0.0427744             |
| 20        | 0.0325555             |
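
The log above reports the L1 change in PageRank after each pass over the graph. A minimal plain-Python sketch of that kind of loop follows. It is not GraphLab's implementation: the update rule (every vertex starts at 1.0, so the scores sum to roughly the vertex count), the default values chosen for `reset_probability`, `threshold`, and `max_iterations`, and the three-vertex toy graph are all assumptions made for illustration.

```python
def pagerank(vertices, edges, reset_probability=0.15, threshold=1e-2,
             max_iterations=20):
    """Iterate an unnormalized PageRank update; return (ranks, L1 changes)."""
    # Count outgoing edges per vertex so each vertex splits its rank evenly.
    out_degree = {v: 0 for v in vertices}
    for src, _dst in edges:
        out_degree[src] += 1

    rank = {v: 1.0 for v in vertices}  # assumed starting value
    changes = []
    for _ in range(max_iterations):
        # Sum the rank flowing into each vertex along its incoming edges.
        incoming = {v: 0.0 for v in vertices}
        for src, dst in edges:
            incoming[dst] += rank[src] / out_degree[src]
        new_rank = {
            v: reset_probability + (1.0 - reset_probability) * incoming[v]
            for v in vertices
        }
        # L1 change: total absolute difference between successive rank vectors.
        l1_change = sum(abs(new_rank[v] - rank[v]) for v in vertices)
        changes.append(l1_change)
        rank = new_rank
        if l1_change < threshold:
            break
    return rank, changes


# Hypothetical toy graph, unrelated to the Bond dataset.
demo_vertices = ["a", "b", "c"]
demo_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "a")]
demo_rank, demo_changes = pagerank(demo_vertices, demo_edges)
for i, change in enumerate(demo_changes, start=1):
    print(i, round(change, 6))
```

In a loop like this there are two ways to stop: the L1 change drops below the threshold, or the iteration cap is reached. If the cap were 20 and the threshold 1e-2, a final L1 change of 0.0325555 like the one logged above would mean the run ended at the cap rather than by meeting the threshold.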

In [14]:
# As expected, James Bond ranks as the most important person, and the villains rank lowest
pr.get('pagerank').topk(column_name='pagerank')

| __id           | pagerank       | delta             |
| James Bond     | 2.52743578524  | 0.0132914517076   |
| M              | 1.87718696576  | 0.00666194771763  |
| Moneypenny     | 1.18363921275  | 0.00143637385736  |
| Q              | 1.18363921275  | 0.00143637385736  |
| Wai Lin        | 0.869872717136 | 0.00477951418076  |
| Inga Bergstorm | 0.869872717136 | 0.00477951418076  |
| Elliot Carver  | 0.634064732205 | 0.000113553313724 |
| Henry Gupta    | 0.284762885673 | 1.89255522873e-05 |
| Paris Carver   | 0.284762885673 | 1.89255522873e-05 |
| Gotz Otto      | 0.284762885673 | 1.89255522873e-05 |
[10 rows x 3 columns]
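
One way to check the observation about villains is to compare average scores between the two groups. The sketch below is plain Python and uses only values printed in this notebook: the `pagerank` column above and the `villian` column of the vertex table. The names `rows` and `mean_pagerank` are hypothetical, introduced just for this example.

```python
# (pagerank, villian) pairs copied from the PageRank and vertex tables above.
# "villian" keeps the dataset's own column spelling.
rows = {
    "James Bond": (2.52743578524, 0),
    "M": (1.87718696576, 0),
    "Moneypenny": (1.18363921275, 0),
    "Q": (1.18363921275, 0),
    "Wai Lin": (0.869872717136, 0),
    "Inga Bergstorm": (0.869872717136, 0),
    "Elliot Carver": (0.634064732205, 1),
    "Henry Gupta": (0.284762885673, 1),
    "Paris Carver": (0.284762885673, 1),
    "Gotz Otto": (0.284762885673, 1),
}


def mean_pagerank(villian_flag):
    """Average PageRank over the vertices whose villian column equals the flag."""
    values = [score for score, flag in rows.values() if flag == villian_flag]
    return sum(values) / len(values)


villain_mean = mean_pagerank(1)
other_mean = mean_pagerank(0)
print("villains: %.3f, others: %.3f" % (villain_mean, other_mean))
```

With only ten vertices this is a descriptive comparison, not a statistical test.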
