Displaying MusicBrainz data in timelines

One feature I would like to see added to the MusicBrainz web interface is the ability to display timelines, e.g. showing on a band page when members joined or left, or ordering concerts, album releases, and so on.

Before suggesting this feature to the MusicBrainz developers (or programming it myself), I want to try displaying this kind of timeline in a Jupyter notebook to see what it would look like.

Several JavaScript libraries can do the job; I decided to test the timesheet-advanced.js and vis.js libraries.

Setup



In [1]:

    
%load_ext watermark
%watermark --python -r









    



CPython 3.7.0b5
IPython 6.4.0
Git repo: git@bitbucket.org:loujine/musicbrainz-dataviz.git



In [2]:

    
%watermark --date --updated









    



last updated: 2018-06-10

The setup required to repeat these operations is explained in the introduction notebook. In case graphs do not appear in this page you can refer to the static version.

The next commands may or may not be needed depending on your setup (for instance, whether you use my docker setup):



In [3]:

    
import os
from pprint import pprint
import pandas
import sqlalchemy

# your postgres server IP
IP = 'localhost'

def sql(query, **kwargs):
    """helper function for SQL queries using the %(...) syntax
    Parameters defined globally are replaced implicitly"""
    params = globals().copy()
    params.update(kwargs)

    # define DB connection parameters if needed
    PGHOST = os.environ.get('PGHOST', IP)
    PGDATABASE = os.environ.get('PGDATABASE', 'musicbrainz')
    PGUSER = os.environ.get('PGUSER', 'musicbrainz')
    PGPASSWORD = os.environ.get('PGPASSWORD', 'musicbrainz')
    engine = sqlalchemy.create_engine(
       'postgresql+psycopg2://%(PGUSER)s:%(PGPASSWORD)s@%(PGHOST)s/%(PGDATABASE)s' % locals(),
        isolation_level='READ UNCOMMITTED')
    return pandas.read_sql(query, engine, params=params)

# helper functions to generate an HTML link to an entity MusicBrainz URL
def _mb_link(type, mbid):
    return '<a href="https://musicbrainz.org/%(type)s/%(mbid)s">%(mbid)s</a>' % locals()

mb_artist_link = lambda mbid: _mb_link('artist', mbid)
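
The link helper above fills its template from locals(). As a minimal sketch (not the code used in this notebook), the same link can be built with explicit arguments and HTML escaping. The names mb_link and MB_ROOT are hypothetical and only serve the illustration.

```python
from html import escape

MB_ROOT = 'https://musicbrainz.org'  # base URL taken from the helper above


def mb_link(entity_type, mbid):
    """Return an HTML anchor pointing at a MusicBrainz entity page.

    Hypothetical variant of _mb_link: arguments are passed explicitly
    instead of being read from locals(), and values are HTML-escaped.
    """
    url = '%s/%s/%s' % (MB_ROOT, entity_type, mbid)
    return '<a href="%s">%s</a>' % (escape(url, quote=True), escape(str(mbid)))


print(mb_link('artist', '0d4ab0f9-bbda-4ab1-ae2c-f772ffcfbea9'))
```

Passing the values explicitly makes the function easier to reuse for other entity types (release, recording, ...) without relying on variable names matching the template.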

Extraction of band data from the database

Now we can extract the information for the band I want. The SQL query will look for:

band name
artists linked to this band through the "member of" relationship
instrument/vocal role of this relationship

Let's start with some band you probably already know:



In [4]:

    
band_name = 'The Beatles'

The SQL query is a bit complicated because it joins many different tables; I won't go into the details. We store the result in a data structure called a pandas DataFrame (df).



In [5]:

    
df = sql("""
SELECT b.name AS band,
       m.name AS member,
       m.gid AS mbid,
       lat.name AS role,
       to_date(to_char(l.begin_date_year, '9999') || '0101', 'YYYYMMDD') AS start,
       to_date(to_char(l.end_date_year, '9999') || '0101', 'YYYYMMDD') AS end
FROM artist              AS b
JOIN l_artist_artist     AS laa ON laa.entity1 = b.id
JOIN artist              AS m   ON laa.entity0 = m.id
JOIN link                AS l   ON l.id = laa.link
JOIN link_attribute      AS la  ON la.link = l.id
JOIN link_attribute_type AS lat ON la.attribute_type = lat.id
JOIN link_type           AS lt  ON l.link_type = lt.id
WHERE lt.name = 'member of band'
  AND b.name = %(band_name)s
  AND lat.name != 'original';
""")
df









Out[5]:

   band         member            mbid                                  role         start       end
0  The Beatles  Pete Best         0d4ab0f9-bbda-4ab1-ae2c-f772ffcfbea9  drums        1960-01-01  1962-01-01
1  The Beatles  Paul McCartney    ba550d0e-adac-4864-b88b-407cab5e76af  lead vocals  1957-01-01  1970-01-01
2  The Beatles  Paul McCartney    ba550d0e-adac-4864-b88b-407cab5e76af  bass guitar  1957-01-01  1970-01-01
3  The Beatles  Ringo Starr       300c4c73-33ac-4255-9d57-4e32627f5e13  drums        1962-01-01  1970-01-01
4  The Beatles  Stuart Sutcliffe  49a51491-650e-44b3-8085-2f07ac2986dd  bass guitar  1960-01-01  1962-01-01
5  The Beatles  John Lennon       4d5447d7-c61c-4120-ba1b-d7f471d385b9  lead vocals  None        1970-01-01
6  The Beatles  John Lennon       4d5447d7-c61c-4120-ba1b-d7f471d385b9  guitar       None        1970-01-01
7  The Beatles  George Harrison   42a8f507-8412-4611-854f-926571049fa0  lead vocals  1958-01-01  1970-01-01
8  The Beatles  George Harrison   42a8f507-8412-4611-854f-926571049fa0  guitar       1958-01-01  1970-01-01

The data is here; we just need to set a start date for Lennon's roles, since it is missing from the database.



In [6]:

    
import datetime
df['start'] = df['start'].fillna(datetime.date(1957, 1, 1))
df['mbid'] = df['mbid'].astype(str)  # otherwise pandas keeps the UUID data type, which will cause problems later
df









Out[6]:

   band         member            mbid                                  role         start       end
0  The Beatles  Pete Best         0d4ab0f9-bbda-4ab1-ae2c-f772ffcfbea9  drums        1960-01-01  1962-01-01
1  The Beatles  Paul McCartney    ba550d0e-adac-4864-b88b-407cab5e76af  lead vocals  1957-01-01  1970-01-01
2  The Beatles  Paul McCartney    ba550d0e-adac-4864-b88b-407cab5e76af  bass guitar  1957-01-01  1970-01-01
3  The Beatles  Ringo Starr       300c4c73-33ac-4255-9d57-4e32627f5e13  drums        1962-01-01  1970-01-01
4  The Beatles  Stuart Sutcliffe  49a51491-650e-44b3-8085-2f07ac2986dd  bass guitar  1960-01-01  1962-01-01
5  The Beatles  John Lennon       4d5447d7-c61c-4120-ba1b-d7f471d385b9  lead vocals  1957-01-01  1970-01-01
6  The Beatles  John Lennon       4d5447d7-c61c-4120-ba1b-d7f471d385b9  guitar       1957-01-01  1970-01-01
7  The Beatles  George Harrison   42a8f507-8412-4611-854f-926571049fa0  lead vocals  1958-01-01  1970-01-01
8  The Beatles  George Harrison   42a8f507-8412-4611-854f-926571049fa0  guitar       1958-01-01  1970-01-01
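
Filling every missing start date with 1957 works here because only Lennon's rows are incomplete. For a band with several incomplete members, a more targeted fill may be safer. The following is a minimal sketch on toy data (a subset of the rows above); the variable names toy and missing_lennon are only illustrative.

```python
import datetime

import pandas

# Toy data with the same shape as df; the rows are a subset of the table above.
toy = pandas.DataFrame({
    'member': ['Pete Best', 'John Lennon', 'John Lennon'],
    'role': ['drums', 'lead vocals', 'guitar'],
    'start': [datetime.date(1960, 1, 1), None, None],
})

# Only touch rows that are both missing a start date and belong to Lennon.
missing_lennon = toy['start'].isna() & (toy['member'] == 'John Lennon')
toy.loc[missing_lennon, 'start'] = datetime.date(1957, 1, 1)

print(toy)
```

Rows for other members keep their original value, including any that would still be missing and need their own date.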

Display a timeline with timesheet-advanced

The timesheet-advanced package requires the input data for the timeline to be inserted slightly differently from what we have in our dataframe df. Let us first copy our data in a new variable ts and simplify the dates to years.



In [7]:

    
ts = df.copy()
ts['start'] = ts['start'].apply(lambda date: date.year).astype(str)
ts['end'] = ts['end'].apply(lambda date: date.year).astype(str)

We need a 'label' field (we'll use the band member's name plus the instrument) and a 'type' field, which is a color. We choose one color per role (vocals, guitar, drums...).



In [8]:

    
ts['label'] = df['member'] + ' (' + df['role'] + ')'
ts









Out[8]:

   band         member            mbid                                  role         start  end   label
0  The Beatles  Pete Best         0d4ab0f9-bbda-4ab1-ae2c-f772ffcfbea9  drums        1960   1962  Pete Best (drums)
1  The Beatles  Paul McCartney    ba550d0e-adac-4864-b88b-407cab5e76af  lead vocals  1957   1970  Paul McCartney (lead vocals)
2  The Beatles  Paul McCartney    ba550d0e-adac-4864-b88b-407cab5e76af  bass guitar  1957   1970  Paul McCartney (bass guitar)
3  The Beatles  Ringo Starr       300c4c73-33ac-4255-9d57-4e32627f5e13  drums        1962   1970  Ringo Starr (drums)
4  The Beatles  Stuart Sutcliffe  49a51491-650e-44b3-8085-2f07ac2986dd  bass guitar  1960   1962  Stuart Sutcliffe (bass guitar)
5  The Beatles  John Lennon       4d5447d7-c61c-4120-ba1b-d7f471d385b9  lead vocals  1957   1970  John Lennon (lead vocals)
6  The Beatles  John Lennon       4d5447d7-c61c-4120-ba1b-d7f471d385b9  guitar       1957   1970  John Lennon (guitar)
7  The Beatles  George Harrison   42a8f507-8412-4611-854f-926571049fa0  lead vocals  1958   1970  George Harrison (lead vocals)
8  The Beatles  George Harrison   42a8f507-8412-4611-854f-926571049fa0  guitar       1958   1970  George Harrison (guitar)



In [9]:

    
colors = dict(zip(sorted(set(ts['role'])), ['red', 'blue', 'yellow', 'green']))
print('Correspondence between colors and roles: {}'.format(colors))
ts['type'] = ts['role'].apply(lambda role: colors[role])
ts

Correspondence between colors and roles: {'bass guitar': 'red', 'drums': 'blue', 'guitar': 'yellow', 'lead vocals': 'green'}






Out[9]:

   band         member            mbid                                  role         start  end   label                           type
0  The Beatles  Pete Best         0d4ab0f9-bbda-4ab1-ae2c-f772ffcfbea9  drums        1960   1962  Pete Best (drums)               blue
1  The Beatles  Paul McCartney    ba550d0e-adac-4864-b88b-407cab5e76af  lead vocals  1957   1970  Paul McCartney (lead vocals)    green
2  The Beatles  Paul McCartney    ba550d0e-adac-4864-b88b-407cab5e76af  bass guitar  1957   1970  Paul McCartney (bass guitar)    red
3  The Beatles  Ringo Starr       300c4c73-33ac-4255-9d57-4e32627f5e13  drums        1962   1970  Ringo Starr (drums)             blue
4  The Beatles  Stuart Sutcliffe  49a51491-650e-44b3-8085-2f07ac2986dd  bass guitar  1960   1962  Stuart Sutcliffe (bass guitar)  red
5  The Beatles  John Lennon       4d5447d7-c61c-4120-ba1b-d7f471d385b9  lead vocals  1957   1970  John Lennon (lead vocals)       green
6  The Beatles  John Lennon       4d5447d7-c61c-4120-ba1b-d7f471d385b9  guitar       1957   1970  John Lennon (guitar)            yellow
7  The Beatles  George Harrison   42a8f507-8412-4611-854f-926571049fa0  lead vocals  1958   1970  George Harrison (lead vocals)   green
8  The Beatles  George Harrison   42a8f507-8412-4611-854f-926571049fa0  guitar       1958   1970  George Harrison (guitar)        yellow
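
The color assignment above pairs the sorted roles with exactly four colors. With plain zip, a fifth role would silently get no color and the later colors[role] lookup would raise a KeyError. As a hedged sketch, the palette can be cycled so every role gets a color (at the cost of reusing colors). The helper name color_map and the extra 'keyboard' role are hypothetical.

```python
from itertools import cycle

PALETTE = ['red', 'blue', 'yellow', 'green']  # same four colors as above


def color_map(roles, palette=PALETTE):
    """Map each distinct role to a color, reusing colors if roles outnumber them."""
    # zip stops at the end of the sorted roles; cycle never runs out of colors.
    return dict(zip(sorted(set(roles)), cycle(palette)))


# 'keyboard' is a made-up fifth role, used only to show the wrap-around.
roles = ['drums', 'lead vocals', 'bass guitar', 'guitar', 'keyboard']
print(color_map(roles))
```

With the four Beatles roles the result is identical to the mapping printed above; the cycling only matters for bands with more roles than colors.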

We can also add a 'link' column containing URLs to the MusicBrainz website:



In [10]:

    
ts['link'] = 'https://musicbrainz.org/artist/' + ts['mbid']
ts.drop('mbid', axis=1, inplace=True)
ts









Out[10]:

   band         member            role         start  end   label                           type    link
0  The Beatles  Pete Best         drums        1960   1962  Pete Best (drums)               blue    https://musicbrainz.org/artist/0d4ab0f9-bbda-4...
1  The Beatles  Paul McCartney    lead vocals  1957   1970  Paul McCartney (lead vocals)    green   https://musicbrainz.org/artist/ba550d0e-adac-4...
2  The Beatles  Paul McCartney    bass guitar  1957   1970  Paul McCartney (bass guitar)    red     https://musicbrainz.org/artist/ba550d0e-adac-4...
3  The Beatles  Ringo Starr       drums        1962   1970  Ringo Starr (drums)             blue    https://musicbrainz.org/artist/300c4c73-33ac-4...
4  The Beatles  Stuart Sutcliffe  bass guitar  1960   1962  Stuart Sutcliffe (bass guitar)  red     https://musicbrainz.org/artist/49a51491-650e-4...
5  The Beatles  John Lennon       lead vocals  1957   1970  John Lennon (lead vocals)       green   https://musicbrainz.org/artist/4d5447d7-c61c-4...
6  The Beatles  John Lennon       guitar       1957   1970  John Lennon (guitar)            yellow  https://musicbrainz.org/artist/4d5447d7-c61c-4...
7  The Beatles  George Harrison   lead vocals  1958   1970  George Harrison (lead vocals)   green   https://musicbrainz.org/artist/42a8f507-8412-4...
8  The Beatles  George Harrison   guitar       1958   1970  George Harrison (guitar)        yellow  https://musicbrainz.org/artist/42a8f507-8412-4...

The last preparation step is to transform this Python data structure into a JavaScript one that the timesheet library can read. We're going to use the fact that a Python list and a JavaScript array are written almost identically (we could also use the JSON format to transform our data into something JavaScript-compatible).



In [11]:

    
bubbles = [ts.loc[i].to_dict() for i in range(len(ts))]
print('First bubble:')
pprint(bubbles[0])









    



First bubble:
{'band': 'The Beatles',
 'end': '1962',
 'label': 'Pete Best (drums)',
 'link': 'https://musicbrainz.org/artist/0d4ab0f9-bbda-4ab1-ae2c-f772ffcfbea9',
 'member': 'Pete Best',
 'role': 'drums',
 'start': '1960',
 'type': 'blue'}
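
The JSON alternative mentioned above can be sketched as follows. This is only an illustration, not what the notebook does next: the record is copied from the "First bubble" output, and the second record with a missing 'end' value is invented to show where the Python representation and JSON differ.

```python
import json

# First record copied from the "First bubble" output above.
# With the dataframe at hand, ts.to_dict('records') would build the full list.
bubbles = [{
    'band': 'The Beatles',
    'end': '1962',
    'label': 'Pete Best (drums)',
    'link': 'https://musicbrainz.org/artist/0d4ab0f9-bbda-4ab1-ae2c-f772ffcfbea9',
    'member': 'Pete Best',
    'role': 'drums',
    'start': '1960',
    'type': 'blue',
}]

# json.dumps always emits valid JSON, which is also a valid JavaScript literal.
bubbles_js = json.dumps(bubbles)
print(bubbles_js)

# Invented record with a missing value: Python prints None, JSON prints null.
incomplete = [{'label': 'ongoing membership', 'end': None}]
print(str(incomplete))         # contains None, which JavaScript does not know
print(json.dumps(incomplete))  # contains null, which JavaScript understands
```

For the Beatles data both approaches give usable JavaScript, since every value is a plain string; JSON becomes the safer choice as soon as missing values or booleans appear.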

Perfect, bubbles contains our data. Time to do some JavaScript. A Jupyter notebook can add HTML to an output cell by calling element.append from a %%javascript cell.

To display the timeline inside this notebook we need to load the JS/CSS source of the timesheet-advanced package...



In [12]:

    
from IPython.display import HTML
HTML("""
<link rel="stylesheet" type="text/css" href="https://cdn.rawgit.com/ntucakovic/timesheet-advanced.js/ea3ee1ad/dist/timesheet.min.css" />
<script type="text/javascript" src="https://cdn.rawgit.com/ntucakovic/timesheet-advanced.js/ea3ee1ad/dist/timesheet-advanced.min.js"></script>
""")









    Out[12]:

... and to create an output container for our timeline. This cell will be filled when the code in the next cell (new Timesheet(...)) is executed.



In [13]:

    
%%javascript
// this must be executed before the "from IPython.display import Javascript" block
element.append('<div id="timesheet-container" style="width: 100%;height: 100%;"></div>');

Last step: we call the Timesheet javascript command using the CSS/JS libraries loaded above, our input data (bubbles), the cell where we want our graph, and the timeline limit (min and max date). Executing the next cell will fill the output cell just above this block automatically.



In [14]:

    
from IPython.display import Javascript

Javascript("""
var bubbles = %s;
new Timesheet(bubbles, {
    container: 'timesheet-container',
    type: 'parallel',
    timesheetYearMin: %s,
    timesheetYearMax: %s,
    theme: 'light'
});
""" % (bubbles, ts['start'].min(), ts['end'].max()))









    Out[14]:

We have our timeline now! As you can see, the same color is used for the same role consistently. The items on the timeline are clickable links bringing you to the artist page on MusicBrainz. If you can't see the timeline above you can find a static version on github.io
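
The JavaScript source passed to the Javascript display object above is assembled with Python string formatting. As a hedged sketch, the same call can be wrapped in a small function that serializes both the data and the options with json.dumps and computes the year range numerically. The function name timesheet_js is hypothetical; the option names are the ones used in the cell above.

```python
import json


def timesheet_js(bubbles, container_id='timesheet-container'):
    """Build the JavaScript source for a Timesheet call (hypothetical helper)."""
    # Compare years as integers rather than as strings.
    year_min = min(int(b['start']) for b in bubbles)
    year_max = max(int(b['end']) for b in bubbles)
    options = {
        'container': container_id,
        'type': 'parallel',
        'timesheetYearMin': year_min,
        'timesheetYearMax': year_max,
        'theme': 'light',
    }
    # A JSON object with quoted keys is also a valid JavaScript object literal.
    return 'new Timesheet(%s, %s);' % (json.dumps(bubbles), json.dumps(options))


sample = [
    {'start': '1960', 'end': '1962', 'label': 'Pete Best (drums)', 'type': 'blue'},
    {'start': '1957', 'end': '1970', 'label': 'John Lennon (guitar)', 'type': 'yellow'},
]
js_source = timesheet_js(sample)
print(js_source)
```

In the notebook the returned string would be wrapped in IPython.display.Javascript, as in the cell above.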

Display a timeline with vis.js

We can try to display the same data with another JavaScript library, vis.js. Again we will need to prepare the data.



In [15]:

    
v = df.copy()
v['start'] = v['start'].apply(lambda date: date.isoformat())
v['end'] = v['end'].apply(lambda date: date.isoformat())
v.drop('mbid', axis=1, inplace=True)
v['type'] = v['role'].apply(lambda role: colors[role])
v['label'] = v['member'] + ' (' + v['role'] + ')'
v









Out[15]:

   band         member            role         start       end         type    label
0  The Beatles  Pete Best         drums        1960-01-01  1962-01-01  blue    Pete Best (drums)
1  The Beatles  Paul McCartney    lead vocals  1957-01-01  1970-01-01  green   Paul McCartney (lead vocals)
2  The Beatles  Paul McCartney    bass guitar  1957-01-01  1970-01-01  red     Paul McCartney (bass guitar)
3  The Beatles  Ringo Starr       drums        1962-01-01  1970-01-01  blue    Ringo Starr (drums)
4  The Beatles  Stuart Sutcliffe  bass guitar  1960-01-01  1962-01-01  red     Stuart Sutcliffe (bass guitar)
5  The Beatles  John Lennon       lead vocals  1957-01-01  1970-01-01  green   John Lennon (lead vocals)
6  The Beatles  John Lennon       guitar       1957-01-01  1970-01-01  yellow  John Lennon (guitar)
7  The Beatles  George Harrison   lead vocals  1958-01-01  1970-01-01  green   George Harrison (lead vocals)
8  The Beatles  George Harrison   guitar       1958-01-01  1970-01-01  yellow  George Harrison (guitar)

This time we are not going to inject the data into a JavaScript string executed by the notebook; instead, we are going to attach the data as JSON to the webpage itself (window) so that vis.js can find it.



In [16]:

    
# Transform into JSON
# (iterate over v, the dataframe prepared above with ISO-formatted dates)
data = [{'start': line.start,
         'end': line.end,
         'content': line.label,
         'className': line.type
        } for _, line in v.iterrows()]

# Send to Javascript
import json
from IPython.display import Javascript
Javascript("""window.bandData={};""".format(json.dumps(data, indent=4)))









    Out[16]:

We need to load the default CSS (from cdnjs.cloudflare.com) and add our custom CSS on top:



In [17]:

    
%%html
<link rel="stylesheet" type="text/css" href="https://cdnjs.cloudflare.com/ajax/libs/vis/4.20.1/vis-timeline-graph2d.min.css" />



In [18]:

    
%%html
<style type="text/css">
    /* custom styles for individual items, load this after vis.css/vis-timeline-graph2d.min.css */
    .vis-item.red {
      background-color: red;
    }
    .vis-item.blue {
      background-color: blue;
    }
    .vis-item.yellow {
      background-color: yellow;
    }
    .vis-item.green {
      background-color: greenyellow;
    }
    .vis-item.vis-selected {
      background-color: white;
      border-color: black;
      color: black;
      box-shadow: 0 0 10px gray;
    }
</style>
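
The per-color rules above are written by hand. As a minimal sketch, they could also be generated in Python from a mapping of class names to CSS colors, which keeps the stylesheet in sync if the palette changes. The function name vis_item_css is hypothetical; the .vis-item.<class> selector is the one used in the cell above.

```python
def vis_item_css(class_colors):
    """Build a <style> block with one .vis-item rule per class name."""
    rules = []
    for class_name, css_color in sorted(class_colors.items()):
        rules.append('.vis-item.%s {\n  background-color: %s;\n}'
                     % (class_name, css_color))
    return '<style type="text/css">\n%s\n</style>' % '\n'.join(rules)


# Same class names and colors as the hand-written stylesheet above.
css = vis_item_css({
    'red': 'red',
    'blue': 'blue',
    'yellow': 'yellow',
    'green': 'greenyellow',
})
print(css)
```

In the notebook the resulting string could be displayed with IPython.display.HTML, like the stylesheet link loaded earlier.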

In order to load the JS library itself, we can use the require mechanism inside the notebook:



In [19]:

    
%%javascript
element.append('<div id="vis-container" style="width: 100%;height: 100%;"></div>');

requirejs.config({
    paths: {
        vis: '//cdnjs.cloudflare.com/ajax/libs/vis/4.20.1/vis'
    }
});

require(['vis'], function(vis){
  var data = new vis.DataSet(window.bandData);
  var options = {
    editable: false
  };
  // create the timeline
  var container = document.getElementById('vis-container');
  var timeline = new vis.Timeline(container, data, options);
})

And we have our timeline. Note that this time we can zoom in and out (with the mouse wheel) thanks to the vis.js library. You can also change the custom CSS above and see the timeline update automatically.

Conclusion

In this notebook we used two different JS libraries to display the same data, extracted from the MusicBrainz DB. I hope I did not make things look too complicated and I convinced some of you to try and play with the MusicBrainz database :)
