In [2]:
import githistoryvis as ghv
Githistoryvis exposes the class git_history.
foo = ghv.git_history(PATH)
sets the attribute foo.path, which points to the git repository in PATH.
def_states and def_states_explain are also defined at initialization.
They translate the states in the dataframe into numbers for visualization and define the legend.
You can overwrite them at your own risk.
# numeric values used as color codes in the datamatrix
def_states = {
u'A': 120,
u'C': 25,
u'B': 51,
u'D': 240,
u'M': 180,
u'R': 102,
u'U': 204,
u'T': 76,
u'X': 153,
u'S': 255, # custom value, Static
u'N': None, # custom value, Non existent
}
# this is only a human-readable format
def_states_explain = {
u'A': u'added',
u'C': u'copied',
u'D': u'deleted',
u'M': u'modified',
u'R': u'renamed',
u'T': u'type changed',
u'U': u'unmerged',
u'X': u'unknown',
u'B': u'pairing broken',
u'S': u'Static',
u'N': u'Non existent'
}
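As a rough illustration of how a mapping like def_states could turn a grid of state letters into numbers for plotting, here is a minimal sketch. The toy DataFrame, its file names and commit labels, and the use of Series.map are assumptions made for illustration; the conversion inside githistoryvis may differ.

```python
import pandas as pd

# Subset of the def_states mapping shown above.
def_states = {'A': 120, 'M': 180, 'D': 240, 'S': 255, 'N': None}

# Hypothetical toy grid: rows are files, columns are commits.
states = pd.DataFrame(
    {'c1': ['A', 'N'], 'c2': ['M', 'A'], 'c3': ['S', 'D']},
    index=['a.txt', 'b.txt'],
)

# Map every state letter to its number, column by column.
# 'N' maps to None, which pandas stores as a missing value.
numeric = states.apply(lambda col: col.map(def_states))
print(numeric)
```

Because 'N' has no number, those cells end up missing and can simply be left unplotted.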
foo.get_history()
extracts the git log. Its optional arguments are:
prettyformat, default '%h'
optional; accepts any of the git pretty formats, see http://git-scm.com/docs/pretty-formats. For example, get the commit subject with '%s' and write your own parser for self.decodelog().
The default, '%h', is the short SHA-1 of the commit.
In [3]:
import os
path = os.getcwd() # put here the desired git repo path
gt = ghv.git_history(path)
gt.get_history()
# new compact version
gt = ghv.git_history(path, get_history=True)
gitcommitlist, default False
optional; if present, it should be a string with the output of:
git -C PATH --no-pager log --reverse --name-status --oneline --pretty="format:COMMIT%x09%h"
For example, run this command on a remote machine, store the output in a file, read the content:
with open('gitoutput', 'r') as file:
    data = file.read()
and pass the result to the get_history method:
gt.get_history(gitcommitlist=data)
In [5]:
with open('gitoutput', 'r') as file:
    data = file.read()
gt.get_history(gitcommitlist=data)
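To make the expected input more concrete, here is a minimal sketch of parsing this kind of log output into commits and per-file states. parse_name_status_log is a hypothetical helper, not part of githistoryvis, and the sample text and hashes are made up; the library's own decodelog may work differently.

```python
def parse_name_status_log(text):
    """Return a list of (commit, [(state, path), ...]) in log order."""
    commits = []
    for line in text.splitlines():
        fields = line.split('\t')
        if fields[0] == 'COMMIT' and len(fields) > 1:
            # Header line produced by the "COMMIT%x09%h" format.
            commits.append((fields[1], []))
        elif len(fields) >= 2 and fields[0] and commits:
            # Name-status line. Keep only the state letter, since
            # renames look like 'R100'; the last field is the final path.
            commits[-1][1].append((fields[0][0], fields[-1]))
    return commits


# Made-up sample in the tab-separated shape described above.
sample = (
    "COMMIT\taaa1111\n"
    "A\tREADME.md\n"
    "\n"
    "COMMIT\tbbb2222\n"
    "M\tREADME.md\n"
    "R100\told.txt\tnew.txt\n"
)

parsed = parse_name_status_log(sample)
for commit, changes in parsed:
    print(commit, changes)
```

Blank lines between commits are skipped, so the sketch does not depend on exactly how git separates the entries.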
We define a pandas DataFrame that contains all the files (rows) and their status at each commit (columns).
This grid represents the status of each file at each step, or commit.
The initial state of every file is N (Non existent); the states are updated while reading the git_history.all_commits object sequentially.
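The sequential update described above can be sketched roughly as follows. The input structure, the commit labels, and the carry-forward rule (an untouched file becomes 'S', a deleted file goes back to 'N') are assumptions inferred from the state names; definedatamatrix may implement this differently.

```python
import pandas as pd

# Hypothetical parsed history: (commit, [(state, path), ...]) in log order.
commits = [
    ('aaa1111', [('A', 'a.txt')]),
    ('bbb2222', [('A', 'b.txt'), ('M', 'a.txt')]),
    ('ccc3333', [('D', 'a.txt')]),
    ('ddd4444', []),
]

files = sorted({path for _, changes in commits for _, path in changes})
columns = [commit for commit, _ in commits]

# Every file starts as 'N' (Non existent) at every commit.
grid = pd.DataFrame('N', index=files, columns=columns)

previous = None
for commit, changes in commits:
    if previous is not None:
        # Carry the previous column forward: missing or deleted files
        # stay 'N', every other file becomes 'S' (Static).
        grid[commit] = grid[previous].map(
            lambda s: 'N' if s in ('N', 'D') else 'S'
        )
    # Overwrite with the states recorded in this commit.
    for state, path in changes:
        grid.loc[path, commit] = state
    previous = commit

print(grid)
```

Each column depends only on the previous one and on the changes in its own commit, which is why the log has to be read in chronological order (hence --reverse in the git command).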
In [4]:
gt.definedatamatrix()
gt.datamatrix
Out[4]:
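The data from the pandas DataFrame can be visualized with this simple example routine.
The arguments, as used in the calls below, are the pyplot module, the DataFrame to plot, size, figsize and, optionally, outpath.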
In [21]:
import matplotlib
from matplotlib import pyplot as plt
%matplotlib inline
In [22]:
gt.plot_history_df(plt,gt.datamatrix,size= 300, figsize = [12,10.5])
gt.plot_history_df(plt,gt.datamatrix,size= 300, figsize = [12,10.5],outpath=path+os.sep+'images/complete_visual_history.png')
In [24]:
# filtering the history on:
# a commit range
plot_df_commit_range = gt.datamatrix.loc[:, 'a4cb9a1':'1222c5e']  # .ix was removed in pandas 1.0
gt.plot_history_df(plt,plot_df_commit_range,size= 300, figsize= [3,10])
gt.plot_history_df(plt,plot_df_commit_range,size= 300, figsize= [3,10], outpath=path+os.sep+'images/commit_range.png')
In [25]:
# filtering the history on:
# a file range: all files not ending with txt
plot_df_file_range = gt.datamatrix[~gt.datamatrix.index.str.contains('txt$')]
gt.plot_history_df(plt,plot_df_file_range,size= 300, figsize= [11.5,8.5])
gt.plot_history_df(plt,plot_df_file_range,size= 300, figsize= [11.5,8.5], outpath=path+os.sep+'images/file_range.png')
In [26]:
# filtering the history on:
# a commit range AND a file range: all files not ending with txt
plot_df_commit_file_range = gt.datamatrix.loc[~gt.datamatrix.index.str.contains('txt$'), 'a4cb9a1':'1222c5e']
gt.plot_history_df(plt,plot_df_commit_file_range,size= 300,figsize= [3.5,8.5])
gt.plot_history_df(plt,plot_df_commit_file_range,size= 300,figsize= [3.5,8.5],outpath=path+os.sep+'images/commit_file_range.png')
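The row and column filters used in these cells can be tried on a small stand-in grid without a repository. The DataFrame below, with its file names and commit labels, is made up for illustration; the sketch uses .loc, the label-based indexer in current pandas.

```python
import pandas as pd

# Hypothetical stand-in for gt.datamatrix: files as rows, commits as columns.
grid = pd.DataFrame(
    [['A', 'M', 'S'],
     ['N', 'A', 'S'],
     ['N', 'N', 'A']],
    index=['a.py', 'notes.txt', 'b.py'],
    columns=['aaa1111', 'bbb2222', 'ccc3333'],
)

# Boolean mask over the rows: True for files NOT ending in 'txt'.
not_txt = ~grid.index.str.contains('txt$')

# Label slices with .loc include both end points.
subset = grid.loc[not_txt, 'bbb2222':'ccc3333']
print(subset)
```

Selecting rows and columns in a single .loc call avoids chained indexing, where the mask and the sliced frame could otherwise drift out of alignment.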
In [27]:
# filtering the history on:
# a state: all files that exist (state is not 'N') at the last commit
plot_df_state_filter = gt.datamatrix[gt.datamatrix[gt.datamatrix.columns[-1]] != 'N']
gt.plot_history_df(plt,plot_df_state_filter,size= 300,figsize= [11,6])
gt.plot_history_df(plt,plot_df_state_filter,size= 300,figsize= [11,6],outpath=path+os.sep+'images/state_filter.png')