This notebook will familiarize you with the different ways to access a GitRepo (or MultiGitRepo) object and how to use its data.
get_repo("https://github.com/sbenthall/bigbang.git", in_type="remote")
get_repo("~/urap/bigbang/archives/sample_git_repos/bigbang", in_type="local")
get_repo("bigbang", in_type="name")
get_multi_repo(repo_names=["bigbang", "django"])
get_multi_repo(repos=[{list of existing GitRepo objects}])
get_org_multirepo("glass-bead-labs")
As of now, repos are cloned into archives/sample_git_repos/{repo_name}. Their caches are stored at archives/sample_git_repos/{repo_name}_backup.csv.
Caches are the dumped .csv files of a GitRepo object's commit_data attribute, which is a pandas DataFrame of all commit information. We can initialize a GitRepo object by feeding the cache's DataFrame into the GitRepo init function. However, the init function needs to do some processing before it can use the cache as its commit data. It needs to convert the "Touched File" attribute of the cache DataFrame from the unicode string "[file1, file2, file3]" to an actual list ["file1", "file2", "file3"]. It also needs to convert the time index of the cache from strings to datetimes.
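The two conversions described above can be sketched in plain Python. This is only an illustration, not BigBang's actual implementation: the helper names `parse_touched_files` and `parse_commit_time` are hypothetical, and the timestamp format is assumed from the sample output shown in this notebook.

```python
from datetime import datetime


def parse_touched_files(raw):
    """Turn the cached string "[file1, file2, file3]" into a list of names.

    Hypothetical helper. The cache stores the names without quotes, so
    ast.literal_eval cannot be used; we strip the brackets and split on commas.
    (Assumption: file names themselves contain no commas.)
    """
    inner = raw.strip()
    if inner.startswith("[") and inner.endswith("]"):
        inner = inner[1:-1]
    # An empty list is cached as "[]" and must stay empty.
    return [name.strip() for name in inner.split(",") if name.strip()]


def parse_commit_time(raw, fmt="%Y-%m-%d %H:%M:%S"):
    """Convert a cached time string into a datetime (the format is assumed)."""
    return datetime.strptime(raw, fmt)


files = parse_touched_files("[bigbang/git_repo.py, bigbang/repo_loader.py]")
when = parse_commit_time("2015-04-10 18:18:13")
print(files, when.isoformat())
```

With a pandas DataFrame, the same functions could be applied to the "Touched File" column via `Series.apply`, and `pandas.to_datetime` could convert the index.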
We can load a single repo in three ways: with a GitHub URL, a local path to a repo, or the name of a repo. All of these return a GitRepo object.
A remote call to get_repo extracts the repo's name from its git URL. Thus, https://github.com/sbenthall/bigbang.git yields bigbang as its name. It checks whether the repo already exists. If it doesn't, it sends a shell command to clone the remote repository to archives/sample_git_repos/{repo_name}. It then returns get_repo({name}, in_type="name"). Before returning, however, it caches the GitRepo object at archives/sample_git_repos/{repo_name}_backup.csv to make loading faster the next time.
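The name extraction and path rules for a remote call might look like the sketch below. The function names are hypothetical and chosen for illustration; only the directory layout and the `_backup.csv` suffix come from the text above.

```python
import posixpath
from urllib.parse import urlparse

REPO_DIR = "archives/sample_git_repos"  # location stated in this notebook


def repo_name_from_url(url):
    """Hypothetical helper: last URL path segment, without a '.git' suffix."""
    last = posixpath.basename(urlparse(url).path.rstrip("/"))
    return last[:-4] if last.endswith(".git") else last


def clone_target(url):
    """Directory the remote repo would be cloned into."""
    return posixpath.join(REPO_DIR, repo_name_from_url(url))


def cache_path(name):
    """CSV file that holds the cached commit data for a repo name."""
    return posixpath.join(REPO_DIR, name + "_backup.csv")


url = "https://github.com/sbenthall/bigbang.git"
print(repo_name_from_url(url), clone_target(url), cache_path("bigbang"))
```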
A local call is the simplest. It first extracts the repo name from the filepath. Thus, ~/urap/bigbang/archives/sample_git_repos/bigbang yields bigbang. It checks whether a git repo exists at the given path. If one does, it initializes a GitRepo object, which only needs a name and a filepath to a git repo. Note that this option does not check or create a cache.
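A rough sketch of the two checks a local call performs is shown below. Both helper names are hypothetical, and testing for a top-level `.git` entry is an assumed way of deciding whether a git repo exists at the path; the library may check differently.

```python
import os


def repo_name_from_path(path):
    """Hypothetical helper: the repo name is the last directory in the path."""
    expanded = os.path.expanduser(path)
    return os.path.basename(os.path.normpath(expanded))


def looks_like_git_repo(path):
    """Assumed check: a working copy has a '.git' entry at its top level."""
    return os.path.exists(os.path.join(os.path.expanduser(path), ".git"))


print(repo_name_from_path("~/urap/bigbang/archives/sample_git_repos/bigbang"))
```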
Loading by name is the preferred and easiest way to load a git repository. It works under the assumptions above about where a git repo and its cache should be stored. It first checks whether a cache exists. If one does, it loads a GitRepo object using that cache.
If a cache is not found, the function constructs a filepath from the name, using the above rule about where repos are located. It then hands the work off to get_repo(filepath, in_type="local"). Before returning the result, it caches it.
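The cache-first behaviour of a load by name can be sketched as follows. This is an assumption-laden outline rather than the library's code: `load_by_name` and its three callable parameters are hypothetical stand-ins for the real cache reader, local loader, and cache writer.

```python
import os

REPO_DIR = "archives/sample_git_repos"  # location stated in this notebook


def load_by_name(name, load_cache, load_local, save_cache, repo_dir=REPO_DIR):
    """Return a repo from its cache if present, else load it and cache it.

    load_cache(path), load_local(path) and save_cache(repo, path) are
    hypothetical callables supplied by the caller.
    """
    cache = os.path.join(repo_dir, name + "_backup.csv")
    if os.path.exists(cache):
        return load_cache(cache)
    # No cache yet: build the repo path from the name and load it locally.
    repo = load_local(os.path.join(repo_dir, name))
    save_cache(repo, cache)
    return repo
```

Passing the loaders in as arguments keeps the sketch independent of how GitRepo objects are actually constructed.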
In [1]:
from bigbang import repo_loader  # The file that handles most loading

repo = repo_loader.get_repo("https://github.com/sbenthall/bigbang.git", in_type="remote")
# repo = repo_loader.get_repo("../", in_type="local")  # I commented this out because it may take too long
repo = repo_loader.get_repo("bigbang", in_type="name")
repo.commit_data
Out[1]:
| | Unnamed: 0 | Commit Message | Committer Email | Committer Name | HEXSHA | Parent Commit | Time | Touched File | Person-ID |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2015-04-13 22:49:33 | Merge pull request #195 from jesscxu/master\n\... | sbenthall@gmail.com | Sebastian Benthall | e6f985d15ff4736a08e2112b6c7ff0c0d0836a75 | [02d30c7ba4b02e899c4f098531812ca390983c0b, 5b5... | 2015-04-13 22:49:33 | [examples/viz/git/glass.json, examples/viz/git... | 1 |
| 1 | 2015-04-13 22:44:21 | Adding d3 visualization of GitDiff.ipynb graph\n | jcxu@berkeley.edu | Jessica Xu | 5b54cc96d652a07b12b5c31d4f5ad5269e1aec37 | [02d30c7ba4b02e899c4f098531812ca390983c0b] | 2015-04-13 22:44:21 | [examples/viz/git/glass.json, examples/viz/git... | 2 |
| 2 | 2015-04-10 21:59:33 | Merge pull request #194 from vsporeddy/master\... | sbenthall@gmail.com | Sebastian Benthall | 02d30c7ba4b02e899c4f098531812ca390983c0b | [3723718c356155a8c2c2104e813d61263a1f23c7, 2ec... | 2015-04-10 21:59:33 | [examples/File Dependency Network.ipynb] | 1 |
| 3 | 2015-04-10 18:19:22 | Changed to directed graph | vs.poreddy@gmail.com | Venkata Poreddy | 2ec31ee60878a08e5738dfa40245740e79dde97c | [f5316bf07da3d4d51ac3bc1875b24d10693daa02] | 2015-04-10 18:19:22 | [examples/File Dependency Network.ipynb] | 3 |
| 4 | 2015-04-10 18:18:13 | Merge pull request #3 from sbenthall/master\n\... | vs.poreddy@gmail.com | Venkata Poreddy | f5316bf07da3d4d51ac3bc1875b24d10693daa02 | [9aacab2a8eb5e7eabcb227caea5a82d99e5f8835, 372... | 2015-04-10 18:18:13 | [bigbang/git_repo.py, bigbang/repo_loader.py] | 3 |
| 5 | 2015-04-10 17:54:34 | Merge pull request #192 from Aryan-Barbarian/m... | sbenthall@gmail.com | Sebastian Benthall | 3723718c356155a8c2c2104e813d61263a1f23c7 | [a22c55ea0887bdff8f62e50d2abdca02f6fdbce6, ed6... | 2015-04-10 17:54:34 | [bigbang/git_repo.py, bigbang/repo_loader.py] | 1 |
| 6 | 2015-04-10 17:53:13 | Merge pull request #193 from vsporeddy/master\... | sbenthall@gmail.com | Sebastian Benthall | a22c55ea0887bdff8f62e50d2abdca02f6fdbce6 | [2b1f678c8ad75458b6a6b7484bed0ca72baee298, 9aa... | 2015-04-10 17:53:13 | [bigbang/get_dependencies.py, examples/File De... | 1 |
| 7 | 2015-04-10 17:30:29 | Fixed an issue where git repos with hyphens in... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | ed60740e26981e216542a258c0c5aa0afa50af95 | [8dac7fc397738b057d7fbdcd2bea1552e6f88339] | 2015-04-10 17:30:29 | [bigbang/repo_loader.py] | 4 |
| 8 | 2015-04-10 16:55:36 | Update File Dependency Network.ipynb | vs.poreddy@gmail.com | Venkata Poreddy | 9aacab2a8eb5e7eabcb227caea5a82d99e5f8835 | [465c3a275bc341e2dab9d43c0363c2a7fff59b15] | 2015-04-10 16:55:36 | [examples/File Dependency Network.ipynb] | 3 |
| 9 | 2015-04-10 16:54:44 | Create get_dependencies.py | vs.poreddy@gmail.com | Venkata Poreddy | 465c3a275bc341e2dab9d43c0363c2a7fff59b15 | [95e074b3e32017adf92e74a8fb19e471bf95f1ee] | 2015-04-10 16:54:44 | [bigbang/get_dependencies.py] | 3 |
| 10 | 2015-04-10 16:53:57 | Update requirements.txt | vs.poreddy@gmail.com | Venkata Poreddy | 95e074b3e32017adf92e74a8fb19e471bf95f1ee | [68a5743f1cfe1241cb2608739418850b0b285360] | 2015-04-10 16:53:57 | [requirements.txt] | 3 |
| 11 | 2015-04-10 16:53:31 | Create File Dependency Network.ipynb | vs.poreddy@gmail.com | Venkata Poreddy | 68a5743f1cfe1241cb2608739418850b0b285360 | [be536710f94ec072e04431e7cd043ad24f5f1afb] | 2015-04-10 16:53:31 | [examples/File Dependency Network.ipynb] | 3 |
| 12 | 2015-04-10 16:18:26 | Merge pull request #2 from sbenthall/master\n\... | vs.poreddy@gmail.com | Venkata Poreddy | be536710f94ec072e04431e7cd043ad24f5f1afb | [3287f61619d148ccb7deb77c4821812d1dc9cff0, 2b1... | 2015-04-10 16:18:26 | [.gitignore, README.md, bigbang/archive.py, bi... | 3 |
| 13 | 2015-04-10 11:06:56 | Warning people how long git diffs will take\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 8dac7fc397738b057d7fbdcd2bea1552e6f88339 | [0db0b375fcb90522f6a8700d87820e8fd91e5343] | 2015-04-10 11:06:56 | [bigbang/git_repo.py] | 4 |
| 14 | 2015-04-10 10:56:57 | Fixed another bug with repo loading logic\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 0db0b375fcb90522f6a8700d87820e8fd91e5343 | [a121a04579461d4a520fbe4113f0cd0b3a052911] | 2015-04-10 10:56:57 | [bigbang/git_repo.py, bigbang/repo_loader.py] | 4 |
| 15 | 2015-04-10 10:35:54 | Fixed repo loading bug. The answer fetched was... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | a121a04579461d4a520fbe4113f0cd0b3a052911 | [2b1f678c8ad75458b6a6b7484bed0ca72baee298] | 2015-04-10 10:35:54 | [bigbang/git_repo.py, bigbang/repo_loader.py] | 4 |
| 16 | 2015-04-06 23:30:06 | Merge pull request #190 from dwins/setting_wit... | sbenthall@gmail.com | Sebastian Benthall | 2b1f678c8ad75458b6a6b7484bed0ca72baee298 | [48dfc9b5472471b5a8768f56566c6246c63aa3fe, c03... | 2015-04-06 23:30:06 | [bigbang/archive.py] | 1 |
| 17 | 2015-04-06 23:21:00 | Merge branch 'raj4-master'\n | sbenthall@gmail.com | sb | 48dfc9b5472471b5a8768f56566c6246c63aa3fe | [ff0a46b3afac4995517d7dc0ad1281f457e818b4, bc5... | 2015-04-06 23:21:00 | [examples/Collaboration Robustness.ipynb] | 1 |
| 18 | 2015-04-06 23:20:37 | Merge branch 'master' of https://github.com/ra... | sbenthall@gmail.com | sb | bc5ccc1fe3034f939ef2f74789a949d2f3604694 | [ff0a46b3afac4995517d7dc0ad1281f457e818b4, 039... | 2015-04-06 23:20:37 | [examples/Collaboration Robustness.ipynb] | 1 |
| 19 | 2015-04-06 23:13:58 | Merge branch 'cool9210-master'\n | sbenthall@gmail.com | sb | ff0a46b3afac4995517d7dc0ad1281f457e818b4 | [6856dc4b4b7ce515c34c180f5ff72dd1b2676b1e, 505... | 2015-04-06 23:13:58 | [bigbang/twopeople.py] | 1 |
| 20 | 2015-04-06 23:13:27 | Merge branch 'master' of https://github.com/co... | sbenthall@gmail.com | sb | 505689d8494bab11e69f0687364dbba2a461b532 | [6856dc4b4b7ce515c34c180f5ff72dd1b2676b1e, 3fa... | 2015-04-06 23:13:27 | [bigbang/twopeople.py] | 1 |
| 21 | 2015-04-03 21:41:36 | Avoid SettingWithCopyWarning\n\nfixes #162\n | cdwinslow@gmail.com | David Winslow | c03e3d20fae49a6d2f0458a4132af557b7ec355b | [6856dc4b4b7ce515c34c180f5ff72dd1b2676b1e] | 2015-04-03 21:41:36 | [bigbang/archive.py] | 5 |
| 22 | 2015-04-02 23:45:44 | committing twopeople\n | kdkim@berkeley.edu | Ki Deuk Kim | 3fa34b21dc5e7d6c7a7154fcda9473f4b0f18f93 | [e57bd1d4a81466b73027808d1f55fb9b4c671072] | 2015-04-02 23:45:44 | [bigbang/twopeople.py] | 6 |
| 23 | 2015-04-02 23:26:23 | updated robustness notebook\n | r.agrawal@berkeley.edu | Raj Agrawal | 039df37b77929fe52b183dfbf436254b95a4742d | [a69e75b9e36afaf1a1b7af1f51ef00e9c3468095] | 2015-04-02 23:26:23 | [bigbang/twopeople.py, examples/Collaboration ... | 7 |
| 24 | 2015-04-01 04:14:15 | Merge branch 'dwins-email_character_sets'\n | sbenthall@gmail.com | sb | 6856dc4b4b7ce515c34c180f5ff72dd1b2676b1e | [05d773f13331693d796a75daac2529b2efb8ccff, 561... | 2015-04-01 04:14:15 | [bigbang/mailman.py] | 1 |
| 25 | 2015-03-31 20:34:42 | Consistently represent email data as Unicode\n | cdwinslow@gmail.com | David Winslow | 56140670a9f627e226d449c17d29544be6f5598d | [05d773f13331693d796a75daac2529b2efb8ccff] | 2015-03-31 20:34:42 | [bigbang/mailman.py] | 5 |
| 26 | 2015-03-31 04:50:46 | changing type attribute to be keyed to string ... | sbenthall@gmail.com | sb | 05d773f13331693d796a75daac2529b2efb8ccff | [3e1c1f07f1b0d4a55751405b65004bd2b469945f] | 2015-03-31 04:50:46 | [examples/Git Diffs.ipynb] | 1 |
| 27 | 2015-03-30 01:08:56 | Merge pull request #182 from Aryan-Barbarian/g... | sbenthall@gmail.com | Sebastian Benthall | 3e1c1f07f1b0d4a55751405b65004bd2b469945f | [11905640d44377fb0c007cd340ab780e408f2d10, a71... | 2015-03-30 01:08:56 | [.gitignore, README.md, bigbang/git_repo.py, b... | 1 |
| 28 | 2015-03-24 04:43:47 | Added the option to override the cache and for... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | a713fad3a49cbb803cac33b01cfa3283fe20840f | [225b0ee0c3b4db0cda06155eacc1b7d945572306] | 2015-03-24 04:43:47 | [bigbang/git_repo.py, bigbang/repo_loader.py, ... | 4 |
| 29 | 2015-03-24 04:17:58 | Fixed bugs relating to caching the data.\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 225b0ee0c3b4db0cda06155eacc1b7d945572306 | [d51c62ea197eedbe3ff7ff63ebb2c1a9a497b21f] | 2015-03-24 04:17:58 | [bigbang/git_repo.py, bigbang/repo_loader.py, ... | 4 |
| 30 | 2015-03-24 03:55:41 | Repo Loader wasn't importing pandas\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | d51c62ea197eedbe3ff7ff63ebb2c1a9a497b21f | [c5919b8d0fc2482b172923e58e51dad54ff209f9] | 2015-03-24 03:55:41 | [bigbang/repo_loader.py] | 4 |
| 31 | 2015-03-24 03:54:51 | Repo Loader tries to cache now?\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | c5919b8d0fc2482b172923e58e51dad54ff209f9 | [fa5688b0711d68ec0ffa436d7f31c73907c81e35] | 2015-03-24 03:54:51 | [bigbang/repo_loader.py] | 4 |
| 32 | 2015-03-24 03:40:36 | Git Repo takes flags for initialization now. N... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | fa5688b0711d68ec0ffa436d7f31c73907c81e35 | [c886ee31fbd48f17afc1b3158983591a17389dfd] | 2015-03-24 03:40:36 | [bigbang/git_repo.py] | 4 |
| 33 | 2015-03-22 19:51:47 | Fixed issues in the ipython notebooks regardin... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | c886ee31fbd48f17afc1b3158983591a17389dfd | [d5187fadf9a8529bfc57ac9bade890cd7167a20b] | 2015-03-22 19:51:47 | [examples/Committer Dominance.ipynb, examples/... | 4 |
| 34 | 2015-03-22 19:32:33 | Moved git files into the main bigbang library.... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | d5187fadf9a8529bfc57ac9bade890cd7167a20b | [89de558656441f4f4e2ec16cc96d757c073d4772] | 2015-03-22 19:32:33 | [bigbang/git_repo.py, bigbang/repo_loader.py, ... | 4 |
| 35 | 2015-03-17 21:26:03 | Fixing the readme\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 89de558656441f4f4e2ec16cc96d757c073d4772 | [befc9ba1742ca9cd8eb2dfc03be3289ab1d1a99d] | 2015-03-17 21:26:03 | [README.md] | 4 |
| 36 | 2015-03-17 21:14:42 | One more tweak to the README\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | befc9ba1742ca9cd8eb2dfc03be3289ab1d1a99d | [d0f9f1f7e62d9471b8aba0e52831bd93f7fb6501] | 2015-03-17 21:14:42 | [README.md] | 4 |
| 37 | 2015-03-17 21:10:21 | Updated README\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | d0f9f1f7e62d9471b8aba0e52831bd93f7fb6501 | [b7c4d709b0a07972c90b336a0f7a667981416b7a] | 2015-03-17 21:10:21 | [README.md] | 4 |
| 38 | 2015-03-17 20:41:16 | The repo loader can now correctly fetch files.\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | b7c4d709b0a07972c90b336a0f7a667981416b7a | [974c7a2e1765365dd40705e6ae7b41d9f984a118] | 2015-03-17 20:41:16 | [git_data/RepoLoader.py] | 4 |
| 39 | 2015-03-17 20:27:58 | Small bug with repo loader\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 974c7a2e1765365dd40705e6ae7b41d9f984a118 | [598cf71c6697e4e346894bb58dfbeb30bda3c4aa] | 2015-03-17 20:27:58 | [git_data/RepoLoader.py] | 4 |
| 40 | 2015-03-17 20:26:12 | RepoLoader generates the sample git directory ... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 598cf71c6697e4e346894bb58dfbeb30bda3c4aa | [a0f02f7f9a401c79815df5f5f52ca483dd6c007b] | 2015-03-17 20:26:12 | [git_data/RepoLoader.py] | 4 |
| 41 | 2015-03-17 20:25:37 | Moved a lot of git repo loading functionality ... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | a0f02f7f9a401c79815df5f5f52ca483dd6c007b | [8c102702f168ba86a8bb81802fe61db70361dfb0] | 2015-03-17 20:25:37 | [git_data/RepoLoader.py] | 4 |
| 42 | 2015-03-17 20:06:49 | Very rough first draft of repo loader\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 8c102702f168ba86a8bb81802fe61db70361dfb0 | [296dd9a35d2aa006b8f8e9c32852b073e961b3bd] | 2015-03-17 20:06:49 | [bin/collect_git.py, git_data/RepoLoader.py] | 4 |
| 43 | 2015-03-17 19:14:08 | collect git now imports from Repository Loader\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 296dd9a35d2aa006b8f8e9c32852b073e961b3bd | [0ff39bd05a7b4b792459b991a0f726422c7d2ef0] | 2015-03-17 19:14:08 | [bin/collect_git.py, git_data/GitRepo.py, git_... | 4 |
| 44 | 2015-03-17 18:53:17 | Slight cleanup in collect git script\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 0ff39bd05a7b4b792459b991a0f726422c7d2ef0 | [4f5104300b17035460a9f5e7819f8999da72e75b] | 2015-03-17 18:53:17 | [bin/collect_git.py] | 4 |
| 45 | 2015-03-17 18:31:44 | Merge remote-tracking branch 'upstream/master'... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 4f5104300b17035460a9f5e7819f8999da72e75b | [f54194242ea036274d788039e77b2619020434dd, 119... | 2015-03-17 18:31:44 | [bigbang/mailman.py, bigbang/twopeople.py, req... | 4 |
| 46 | 2015-03-17 00:03:27 | Merge branch 'raj4-master'\n | sbenthall@gmail.com | sb | 11905640d44377fb0c007cd340ab780e408f2d10 | [00f13d97385763b699b52b562fc204d80149098b, 9f6... | 2015-03-17 00:03:27 | [bigbang/twopeople.py] | 1 |
| 47 | 2015-03-17 00:03:11 | Merge branch 'master' of https://github.com/ra... | sbenthall@gmail.com | sb | 9f6c74e01dbbdd14468befa8cde1de82d08d7935 | [00f13d97385763b699b52b562fc204d80149098b, a69... | 2015-03-17 00:03:11 | [bigbang/twopeople.py] | 1 |
| 48 | 2015-03-16 23:56:02 | functions to create df\n | r.agrawal@berkeley.edu | Raj Agrawal | a69e75b9e36afaf1a1b7af1f51ef00e9c3468095 | [847720442d7cab223a6c83f0bd9db37ca28bdfbd] | 2015-03-16 23:56:02 | [bigbang/twopeople.py] | 7 |
| 49 | 2015-03-14 20:18:01 | fixing variable reference in data collection e... | sbenthall@gmail.com | sb | 00f13d97385763b699b52b562fc204d80149098b | [701212ecb79f1b400c2e293d98ff582c750532d0] | 2015-03-14 20:18:01 | [bigbang/mailman.py] | 1 |
| 50 | 2015-03-12 21:51:25 | adding jsonschema as a pip requirement\n | sbenthall@gmail.com | sb | 701212ecb79f1b400c2e293d98ff582c750532d0 | [aef98ed18e82a52ca4dfc593769f99f4618f8edb] | 2015-03-12 21:51:25 | [requirements.txt] | 1 |
| 51 | 2015-03-10 20:33:56 | git will now ignore the git_locals.json file, ... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | f54194242ea036274d788039e77b2619020434dd | [aef98ed18e82a52ca4dfc593769f99f4618f8edb] | 2015-03-10 20:33:56 | [.gitignore] | 4 |
| 52 | 2015-03-10 00:06:20 | Merge branch 'cool9210-master'\n | sbenthall@gmail.com | sb | aef98ed18e82a52ca4dfc593769f99f4618f8edb | [a87af8aed3e0e2fb964579b8a7144361d4c19d2f, e57... | 2015-03-10 00:06:20 | [examples/Collaboration Robustness.ipynb] | 1 |
| 53 | 2015-03-10 00:01:08 | Merge branch 'master' of https://github.com/co... | kdkim@berkeley.edu | Ki Deuk Kim | e57bd1d4a81466b73027808d1f55fb9b4c671072 | [0547569578a496cf80d153ca9cf2d20849c1736c, 4ba... | 2015-03-10 00:01:08 | [] | 6 |
| 54 | 2015-03-09 23:52:56 | This change is adding duration, reciprocity, a... | kdkim@berkeley.edu | Ki Deuk Kim | 0547569578a496cf80d153ca9cf2d20849c1736c | [a87af8aed3e0e2fb964579b8a7144361d4c19d2f] | 2015-03-09 23:52:56 | [examples/Collaboration Robustness.ipynb] | 6 |
| 55 | 2015-03-09 23:40:02 | Merge branch 'raj4-master'\n | sbenthall@gmail.com | sb | a87af8aed3e0e2fb964579b8a7144361d4c19d2f | [8c450a41c5446db94c0cff7151a8ef2297c43a07, 847... | 2015-03-09 23:40:02 | [bigbang/twopeople.py] | 1 |
| 56 | 2015-03-09 23:31:02 | first commit\n | r.agrawal@berkeley.edu | Raj Agrawal | 847720442d7cab223a6c83f0bd9db37ca28bdfbd | [8c450a41c5446db94c0cff7151a8ef2297c43a07] | 2015-03-09 23:31:02 | [bigbang/twopeople.py] | 7 |
| 57 | 2015-03-09 23:20:54 | Create twopeople.py | kdkim@berkeley.edu | Ki Deuk Kim | 4ba2d1df3cb06eec91795ff22489b5533690dcfa | [8c450a41c5446db94c0cff7151a8ef2297c43a07] | 2015-03-09 23:20:54 | [bigbang/twopeople.py] | 6 |
| 58 | 2015-03-05 22:47:56 | Merge branch 'vsporeddy'\n | sbenthall@gmail.com | sb | 8c450a41c5446db94c0cff7151a8ef2297c43a07 | [0b47f504de03817db97e0d3556c98f7c252bc0f9, fef... | 2015-03-05 22:47:56 | [examples/Git Diffs.ipynb] | 1 |
| 59 | 2015-03-04 05:48:45 | Update Git Diffs.ipynb\n\nAdded node colors an... | vs.poreddy@gmail.com | Venkata Poreddy | fefb82dbc2b827cafb47edea9678f43f2a411681 | [0b47f504de03817db97e0d3556c98f7c252bc0f9] | 2015-03-04 05:48:45 | [examples/Git Diffs.ipynb] | 3 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |

372 rows × 9 columns
The following are the ways we can get MultiGitRepo objects. MultiGitRepo objects are GitRepos that were created from a list of GitRepos. Basically, a MultiGitRepo's commit_data contains the commit_data from all of its GitRepos. The only difference is that each entry has an extra attribute, Repo Name, that tells us which repo that commit originally came from.
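The merge described above can be illustrated with plain Python records. This is a simplified sketch, not the MultiGitRepo implementation: `merge_commit_data` is a hypothetical function, and commits are represented as dictionaries instead of DataFrame rows.

```python
def merge_commit_data(repos):
    """Combine commit records from several repos, tagging each with its repo.

    repos maps a repo name to a list of commit records (dicts).
    """
    merged = []
    for repo_name, commits in repos.items():
        for commit in commits:
            row = dict(commit)  # copy so the original records stay untouched
            row["Repo Name"] = repo_name
            merged.append(row)
    return merged


combined = merge_commit_data({
    "bigbang": [{"Commit Message": "Update requirements.txt"}],
    "bead.glass": [{"Commit Message": "placeholder message"}],
})
print(len(combined), combined[0]["Repo Name"])
```

With pandas, a comparable result could be obtained by adding a "Repo Name" column to each DataFrame and joining them with `pandas.concat`.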
get_multi_repo
This is rather simple. We can call the get_multi_repo method with either a list of repo names, such as ["bigbang", "django", "scipy"], or a list of actual GitRepo objects. This returns the merged MultiGitRepo. Please note that this will not work unless a local clone or cache exists for every repo name (e.g. if you ask for ["bigbang", "django", "scipy"], you must already have a local copy of each in your sample_git_repos directory).
get_org_multirepo
This is more useful to us. We can use this method to get a MultiGitRepo that contains the information from every repo in a GitHub organization. This requires that we input the organization's name exactly as it appears on GitHub (edX, glass-bead-labs, codeforamerica, etc.).
It will look for examples/{org_name}_urls.txt, which should be a file that contains all of the git URLs of the projects that belong to that organization. If this file doesn't yet exist, it will make a call to the GitHub API. This requires a stable internet connection, and it may randomly stall on requests that do not time out.
The function will then use the list of git URLs and the get_repo method to get each repo. It will use this list of repos to create a MultiGitRepo object, using get_multi_repo.
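The file-or-API lookup for an organization's URLs could be outlined as below. Several details are assumptions made for illustration: the function name `org_repo_urls`, the one-URL-per-line file layout, writing the file after a fetch, and the `fetch_urls` callable that stands in for the GitHub API request.

```python
import os


def org_repo_urls(org_name, fetch_urls, examples_dir="examples"):
    """Return the git URLs for an organization, using the URL file if present.

    fetch_urls(org_name) is a hypothetical callable standing in for the
    GitHub API call; it is only used when the file does not exist yet.
    """
    path = os.path.join(examples_dir, org_name + "_urls.txt")
    if not os.path.exists(path):
        urls = fetch_urls(org_name)
        # Assumed layout: one git URL per line.
        with open(path, "w") as handle:
            handle.write("\n".join(urls) + "\n")
    with open(path) as handle:
        return [line.strip() for line in handle if line.strip()]
```

Because the text notes that requests can stall, any real `fetch_urls` would probably want an explicit timeout, for example the `timeout` argument of `urllib.request.urlopen`.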
Note that the examples below will not work without an internet connection and may take some time to process. The first call may also fail if you do not have all of the repositories.
In [2]:
# Using GitHub API
multirepo = repo_loader.get_org_multirepo("glass-bead-labs")

# List of repo names
multirepo = repo_loader.get_multi_repo(repo_names=["bigbang", "bead.glass"])

# List of actual repos
repo1 = repo_loader.get_repo("bigbang", in_type="name")
repo2 = repo_loader.get_repo("bead.glass", in_type="name")
multirepo = repo_loader.get_multi_repo(repos=[repo1, repo2])
multirepo.commit_data
Out[2]:
| | Unnamed: 0 | Commit Message | Committer Email | Committer Name | HEXSHA | Parent Commit | Time | Touched File | Person-ID | Repo Name |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 2015-04-13 22:49:33 | Merge pull request #195 from jesscxu/master\n\... | sbenthall@gmail.com | Sebastian Benthall | e6f985d15ff4736a08e2112b6c7ff0c0d0836a75 | [02d30c7ba4b02e899c4f098531812ca390983c0b, 5b5... | 2015-04-13 22:49:33 | [examples/viz/git/glass.json, examples/viz/git... | 1 | bigbang |
| 1 | 2015-04-13 22:44:21 | Adding d3 visualization of GitDiff.ipynb graph\n | jcxu@berkeley.edu | Jessica Xu | 5b54cc96d652a07b12b5c31d4f5ad5269e1aec37 | [02d30c7ba4b02e899c4f098531812ca390983c0b] | 2015-04-13 22:44:21 | [examples/viz/git/glass.json, examples/viz/git... | 2 | bigbang |
| 2 | 2015-04-10 21:59:33 | Merge pull request #194 from vsporeddy/master\... | sbenthall@gmail.com | Sebastian Benthall | 02d30c7ba4b02e899c4f098531812ca390983c0b | [3723718c356155a8c2c2104e813d61263a1f23c7, 2ec... | 2015-04-10 21:59:33 | [examples/File Dependency Network.ipynb] | 1 | bigbang |
| 3 | 2015-04-10 18:19:22 | Changed to directed graph | vs.poreddy@gmail.com | Venkata Poreddy | 2ec31ee60878a08e5738dfa40245740e79dde97c | [f5316bf07da3d4d51ac3bc1875b24d10693daa02] | 2015-04-10 18:19:22 | [examples/File Dependency Network.ipynb] | 3 | bigbang |
| 4 | 2015-04-10 18:18:13 | Merge pull request #3 from sbenthall/master\n\... | vs.poreddy@gmail.com | Venkata Poreddy | f5316bf07da3d4d51ac3bc1875b24d10693daa02 | [9aacab2a8eb5e7eabcb227caea5a82d99e5f8835, 372... | 2015-04-10 18:18:13 | [bigbang/git_repo.py, bigbang/repo_loader.py] | 3 | bigbang |
| 5 | 2015-04-10 17:54:34 | Merge pull request #192 from Aryan-Barbarian/m... | sbenthall@gmail.com | Sebastian Benthall | 3723718c356155a8c2c2104e813d61263a1f23c7 | [a22c55ea0887bdff8f62e50d2abdca02f6fdbce6, ed6... | 2015-04-10 17:54:34 | [bigbang/git_repo.py, bigbang/repo_loader.py] | 1 | bigbang |
| 6 | 2015-04-10 17:53:13 | Merge pull request #193 from vsporeddy/master\... | sbenthall@gmail.com | Sebastian Benthall | a22c55ea0887bdff8f62e50d2abdca02f6fdbce6 | [2b1f678c8ad75458b6a6b7484bed0ca72baee298, 9aa... | 2015-04-10 17:53:13 | [bigbang/get_dependencies.py, examples/File De... | 1 | bigbang |
| 7 | 2015-04-10 17:30:29 | Fixed an issue where git repos with hyphens in... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | ed60740e26981e216542a258c0c5aa0afa50af95 | [8dac7fc397738b057d7fbdcd2bea1552e6f88339] | 2015-04-10 17:30:29 | [bigbang/repo_loader.py] | 4 | bigbang |
| 8 | 2015-04-10 16:55:36 | Update File Dependency Network.ipynb | vs.poreddy@gmail.com | Venkata Poreddy | 9aacab2a8eb5e7eabcb227caea5a82d99e5f8835 | [465c3a275bc341e2dab9d43c0363c2a7fff59b15] | 2015-04-10 16:55:36 | [examples/File Dependency Network.ipynb] | 3 | bigbang |
| 9 | 2015-04-10 16:54:44 | Create get_dependencies.py | vs.poreddy@gmail.com | Venkata Poreddy | 465c3a275bc341e2dab9d43c0363c2a7fff59b15 | [95e074b3e32017adf92e74a8fb19e471bf95f1ee] | 2015-04-10 16:54:44 | [bigbang/get_dependencies.py] | 3 | bigbang |
| 10 | 2015-04-10 16:53:57 | Update requirements.txt | vs.poreddy@gmail.com | Venkata Poreddy | 95e074b3e32017adf92e74a8fb19e471bf95f1ee | [68a5743f1cfe1241cb2608739418850b0b285360] | 2015-04-10 16:53:57 | [requirements.txt] | 3 | bigbang |
| 11 | 2015-04-10 16:53:31 | Create File Dependency Network.ipynb | vs.poreddy@gmail.com | Venkata Poreddy | 68a5743f1cfe1241cb2608739418850b0b285360 | [be536710f94ec072e04431e7cd043ad24f5f1afb] | 2015-04-10 16:53:31 | [examples/File Dependency Network.ipynb] | 3 | bigbang |
| 12 | 2015-04-10 16:18:26 | Merge pull request #2 from sbenthall/master\n\... | vs.poreddy@gmail.com | Venkata Poreddy | be536710f94ec072e04431e7cd043ad24f5f1afb | [3287f61619d148ccb7deb77c4821812d1dc9cff0, 2b1... | 2015-04-10 16:18:26 | [.gitignore, README.md, bigbang/archive.py, bi... | 3 | bigbang |
| 13 | 2015-04-10 11:06:56 | Warning people how long git diffs will take\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 8dac7fc397738b057d7fbdcd2bea1552e6f88339 | [0db0b375fcb90522f6a8700d87820e8fd91e5343] | 2015-04-10 11:06:56 | [bigbang/git_repo.py] | 4 | bigbang |
| 14 | 2015-04-10 10:56:57 | Fixed another bug with repo loading logic\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 0db0b375fcb90522f6a8700d87820e8fd91e5343 | [a121a04579461d4a520fbe4113f0cd0b3a052911] | 2015-04-10 10:56:57 | [bigbang/git_repo.py, bigbang/repo_loader.py] | 4 | bigbang |
| 15 | 2015-04-10 10:35:54 | Fixed repo loading bug. The answer fetched was... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | a121a04579461d4a520fbe4113f0cd0b3a052911 | [2b1f678c8ad75458b6a6b7484bed0ca72baee298] | 2015-04-10 10:35:54 | [bigbang/git_repo.py, bigbang/repo_loader.py] | 4 | bigbang |
| 16 | 2015-04-06 23:30:06 | Merge pull request #190 from dwins/setting_wit... | sbenthall@gmail.com | Sebastian Benthall | 2b1f678c8ad75458b6a6b7484bed0ca72baee298 | [48dfc9b5472471b5a8768f56566c6246c63aa3fe, c03... | 2015-04-06 23:30:06 | [bigbang/archive.py] | 1 | bigbang |
| 17 | 2015-04-06 23:21:00 | Merge branch 'raj4-master'\n | sbenthall@gmail.com | sb | 48dfc9b5472471b5a8768f56566c6246c63aa3fe | [ff0a46b3afac4995517d7dc0ad1281f457e818b4, bc5... | 2015-04-06 23:21:00 | [examples/Collaboration Robustness.ipynb] | 1 | bigbang |
| 18 | 2015-04-06 23:20:37 | Merge branch 'master' of https://github.com/ra... | sbenthall@gmail.com | sb | bc5ccc1fe3034f939ef2f74789a949d2f3604694 | [ff0a46b3afac4995517d7dc0ad1281f457e818b4, 039... | 2015-04-06 23:20:37 | [examples/Collaboration Robustness.ipynb] | 1 | bigbang |
| 19 | 2015-04-06 23:13:58 | Merge branch 'cool9210-master'\n | sbenthall@gmail.com | sb | ff0a46b3afac4995517d7dc0ad1281f457e818b4 | [6856dc4b4b7ce515c34c180f5ff72dd1b2676b1e, 505... | 2015-04-06 23:13:58 | [bigbang/twopeople.py] | 1 | bigbang |
| 20 | 2015-04-06 23:13:27 | Merge branch 'master' of https://github.com/co... | sbenthall@gmail.com | sb | 505689d8494bab11e69f0687364dbba2a461b532 | [6856dc4b4b7ce515c34c180f5ff72dd1b2676b1e, 3fa... | 2015-04-06 23:13:27 | [bigbang/twopeople.py] | 1 | bigbang |
| 21 | 2015-04-03 21:41:36 | Avoid SettingWithCopyWarning\n\nfixes #162\n | cdwinslow@gmail.com | David Winslow | c03e3d20fae49a6d2f0458a4132af557b7ec355b | [6856dc4b4b7ce515c34c180f5ff72dd1b2676b1e] | 2015-04-03 21:41:36 | [bigbang/archive.py] | 5 | bigbang |
| 22 | 2015-04-02 23:45:44 | committing twopeople\n | kdkim@berkeley.edu | Ki Deuk Kim | 3fa34b21dc5e7d6c7a7154fcda9473f4b0f18f93 | [e57bd1d4a81466b73027808d1f55fb9b4c671072] | 2015-04-02 23:45:44 | [bigbang/twopeople.py] | 6 | bigbang |
| 23 | 2015-04-02 23:26:23 | updated robustness notebook\n | r.agrawal@berkeley.edu | Raj Agrawal | 039df37b77929fe52b183dfbf436254b95a4742d | [a69e75b9e36afaf1a1b7af1f51ef00e9c3468095] | 2015-04-02 23:26:23 | [bigbang/twopeople.py, examples/Collaboration ... | 7 | bigbang |
| 24 | 2015-04-01 04:14:15 | Merge branch 'dwins-email_character_sets'\n | sbenthall@gmail.com | sb | 6856dc4b4b7ce515c34c180f5ff72dd1b2676b1e | [05d773f13331693d796a75daac2529b2efb8ccff, 561... | 2015-04-01 04:14:15 | [bigbang/mailman.py] | 1 | bigbang |
| 25 | 2015-03-31 20:34:42 | Consistently represent email data as Unicode\n | cdwinslow@gmail.com | David Winslow | 56140670a9f627e226d449c17d29544be6f5598d | [05d773f13331693d796a75daac2529b2efb8ccff] | 2015-03-31 20:34:42 | [bigbang/mailman.py] | 5 | bigbang |
| 26 | 2015-03-31 04:50:46 | changing type attribute to be keyed to string ... | sbenthall@gmail.com | sb | 05d773f13331693d796a75daac2529b2efb8ccff | [3e1c1f07f1b0d4a55751405b65004bd2b469945f] | 2015-03-31 04:50:46 | [examples/Git Diffs.ipynb] | 1 | bigbang |
| 27 | 2015-03-30 01:08:56 | Merge pull request #182 from Aryan-Barbarian/g... | sbenthall@gmail.com | Sebastian Benthall | 3e1c1f07f1b0d4a55751405b65004bd2b469945f | [11905640d44377fb0c007cd340ab780e408f2d10, a71... | 2015-03-30 01:08:56 | [.gitignore, README.md, bigbang/git_repo.py, b... | 1 | bigbang |
| 28 | 2015-03-24 04:43:47 | Added the option to override the cache and for... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | a713fad3a49cbb803cac33b01cfa3283fe20840f | [225b0ee0c3b4db0cda06155eacc1b7d945572306] | 2015-03-24 04:43:47 | [bigbang/git_repo.py, bigbang/repo_loader.py, ... | 4 | bigbang |
| 29 | 2015-03-24 04:17:58 | Fixed bugs relating to caching the data.\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 225b0ee0c3b4db0cda06155eacc1b7d945572306 | [d51c62ea197eedbe3ff7ff63ebb2c1a9a497b21f] | 2015-03-24 04:17:58 | [bigbang/git_repo.py, bigbang/repo_loader.py, ... | 4 | bigbang |
| 30 | 2015-03-24 03:55:41 | Repo Loader wasn't importing pandas\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | d51c62ea197eedbe3ff7ff63ebb2c1a9a497b21f | [c5919b8d0fc2482b172923e58e51dad54ff209f9] | 2015-03-24 03:55:41 | [bigbang/repo_loader.py] | 4 | bigbang |
| 31 | 2015-03-24 03:54:51 | Repo Loader tries to cache now?\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | c5919b8d0fc2482b172923e58e51dad54ff209f9 | [fa5688b0711d68ec0ffa436d7f31c73907c81e35] | 2015-03-24 03:54:51 | [bigbang/repo_loader.py] | 4 | bigbang |
| 32 | 2015-03-24 03:40:36 | Git Repo takes flags for initialization now. N... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | fa5688b0711d68ec0ffa436d7f31c73907c81e35 | [c886ee31fbd48f17afc1b3158983591a17389dfd] | 2015-03-24 03:40:36 | [bigbang/git_repo.py] | 4 | bigbang |
| 33 | 2015-03-22 19:51:47 | Fixed issues in the ipython notebooks regardin... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | c886ee31fbd48f17afc1b3158983591a17389dfd | [d5187fadf9a8529bfc57ac9bade890cd7167a20b] | 2015-03-22 19:51:47 | [examples/Committer Dominance.ipynb, examples/... | 4 | bigbang |
| 34 | 2015-03-22 19:32:33 | Moved git files into the main bigbang library.... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | d5187fadf9a8529bfc57ac9bade890cd7167a20b | [89de558656441f4f4e2ec16cc96d757c073d4772] | 2015-03-22 19:32:33 | [bigbang/git_repo.py, bigbang/repo_loader.py, ... | 4 | bigbang |
| 35 | 2015-03-17 21:26:03 | Fixing the readme\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 89de558656441f4f4e2ec16cc96d757c073d4772 | [befc9ba1742ca9cd8eb2dfc03be3289ab1d1a99d] | 2015-03-17 21:26:03 | [README.md] | 4 | bigbang |
| 36 | 2015-03-17 21:14:42 | One more tweak to the README\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | befc9ba1742ca9cd8eb2dfc03be3289ab1d1a99d | [d0f9f1f7e62d9471b8aba0e52831bd93f7fb6501] | 2015-03-17 21:14:42 | [README.md] | 4 | bigbang |
| 37 | 2015-03-17 21:10:21 | Updated README\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | d0f9f1f7e62d9471b8aba0e52831bd93f7fb6501 | [b7c4d709b0a07972c90b336a0f7a667981416b7a] | 2015-03-17 21:10:21 | [README.md] | 4 | bigbang |
| 38 | 2015-03-17 20:41:16 | The repo loader can now correctly fetch files.\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | b7c4d709b0a07972c90b336a0f7a667981416b7a | [974c7a2e1765365dd40705e6ae7b41d9f984a118] | 2015-03-17 20:41:16 | [git_data/RepoLoader.py] | 4 | bigbang |
| 39 | 2015-03-17 20:27:58 | Small bug with repo loader\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 974c7a2e1765365dd40705e6ae7b41d9f984a118 | [598cf71c6697e4e346894bb58dfbeb30bda3c4aa] | 2015-03-17 20:27:58 | [git_data/RepoLoader.py] | 4 | bigbang |
| 40 | 2015-03-17 20:26:12 | RepoLoader generates the sample git directory ... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 598cf71c6697e4e346894bb58dfbeb30bda3c4aa | [a0f02f7f9a401c79815df5f5f52ca483dd6c007b] | 2015-03-17 20:26:12 | [git_data/RepoLoader.py] | 4 | bigbang |
| 41 | 2015-03-17 20:25:37 | Moved a lot of git repo loading functionality ... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | a0f02f7f9a401c79815df5f5f52ca483dd6c007b | [8c102702f168ba86a8bb81802fe61db70361dfb0] | 2015-03-17 20:25:37 | [git_data/RepoLoader.py] | 4 | bigbang |
| 42 | 2015-03-17 20:06:49 | Very rough first draft of repo loader\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 8c102702f168ba86a8bb81802fe61db70361dfb0 | [296dd9a35d2aa006b8f8e9c32852b073e961b3bd] | 2015-03-17 20:06:49 | [bin/collect_git.py, git_data/RepoLoader.py] | 4 | bigbang |
| 43 | 2015-03-17 19:14:08 | collect git now imports from Repository Loader\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 296dd9a35d2aa006b8f8e9c32852b073e961b3bd | [0ff39bd05a7b4b792459b991a0f726422c7d2ef0] | 2015-03-17 19:14:08 | [bin/collect_git.py, git_data/GitRepo.py, git_... | 4 | bigbang |
| 44 | 2015-03-17 18:53:17 | Slight cleanup in collect git script\n | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 0ff39bd05a7b4b792459b991a0f726422c7d2ef0 | [4f5104300b17035460a9f5e7819f8999da72e75b] | 2015-03-17 18:53:17 | [bin/collect_git.py] | 4 | bigbang |
| 45 | 2015-03-17 18:31:44 | Merge remote-tracking branch 'upstream/master'... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | 4f5104300b17035460a9f5e7819f8999da72e75b | [f54194242ea036274d788039e77b2619020434dd, 119... | 2015-03-17 18:31:44 | [bigbang/mailman.py, bigbang/twopeople.py, req... | 4 | bigbang |
| 46 | 2015-03-17 00:03:27 | Merge branch 'raj4-master'\n | sbenthall@gmail.com | sb | 11905640d44377fb0c007cd340ab780e408f2d10 | [00f13d97385763b699b52b562fc204d80149098b, 9f6... | 2015-03-17 00:03:27 | [bigbang/twopeople.py] | 1 | bigbang |
| 47 | 2015-03-17 00:03:11 | Merge branch 'master' of https://github.com/ra... | sbenthall@gmail.com | sb | 9f6c74e01dbbdd14468befa8cde1de82d08d7935 | [00f13d97385763b699b52b562fc204d80149098b, a69... | 2015-03-17 00:03:11 | [bigbang/twopeople.py] | 1 | bigbang |
| 48 | 2015-03-16 23:56:02 | functions to create df\n | r.agrawal@berkeley.edu | Raj Agrawal | a69e75b9e36afaf1a1b7af1f51ef00e9c3468095 | [847720442d7cab223a6c83f0bd9db37ca28bdfbd] | 2015-03-16 23:56:02 | [bigbang/twopeople.py] | 7 | bigbang |
| 49 | 2015-03-14 20:18:01 | fixing variable reference in data collection e... | sbenthall@gmail.com | sb | 00f13d97385763b699b52b562fc204d80149098b | [701212ecb79f1b400c2e293d98ff582c750532d0] | 2015-03-14 20:18:01 | [bigbang/mailman.py] | 1 | bigbang |
| 50 | 2015-03-12 21:51:25 | adding jsonschema as a pip requirement\n | sbenthall@gmail.com | sb | 701212ecb79f1b400c2e293d98ff582c750532d0 | [aef98ed18e82a52ca4dfc593769f99f4618f8edb] | 2015-03-12 21:51:25 | [requirements.txt] | 1 | bigbang |
| 51 | 2015-03-10 20:33:56 | git will now ignore the git_locals.json file, ... | aryan.falahatpisheh@berkeley.edu | Aryan Falahatpisheh | f54194242ea036274d788039e77b2619020434dd | [aef98ed18e82a52ca4dfc593769f99f4618f8edb] | 2015-03-10 20:33:56 | [.gitignore] | 4 | bigbang |
| 52 | 2015-03-10 00:06:20 | Merge branch 'cool9210-master'\n | sbenthall@gmail.com | sb | aef98ed18e82a52ca4dfc593769f99f4618f8edb | [a87af8aed3e0e2fb964579b8a7144361d4c19d2f, e57... | 2015-03-10 00:06:20 | [examples/Collaboration Robustness.ipynb] | 1 | bigbang |
| 53 | 2015-03-10 00:01:08 | Merge branch 'master' of https://github.com/co... | kdkim@berkeley.edu | Ki Deuk Kim | e57bd1d4a81466b73027808d1f55fb9b4c671072 | [0547569578a496cf80d153ca9cf2d20849c1736c, 4ba... | 2015-03-10 00:01:08 | [] | 6 | bigbang |
| 54 | 2015-03-09 23:52:56 | This change is adding duration, reciprocity, a... | kdkim@berkeley.edu | Ki Deuk Kim | 0547569578a496cf80d153ca9cf2d20849c1736c | [a87af8aed3e0e2fb964579b8a7144361d4c19d2f] | 2015-03-09 23:52:56 | [examples/Collaboration Robustness.ipynb] | 6 | bigbang |
| 55 | 2015-03-09 23:40:02 | Merge branch 'raj4-master'\n | sbenthall@gmail.com | sb | a87af8aed3e0e2fb964579b8a7144361d4c19d2f | [8c450a41c5446db94c0cff7151a8ef2297c43a07, 847... | 2015-03-09 23:40:02 | [bigbang/twopeople.py] | 1 | bigbang |
| 56 | 2015-03-09 23:31:02 | first commit\n | r.agrawal@berkeley.edu | Raj Agrawal | 847720442d7cab223a6c83f0bd9db37ca28bdfbd | [8c450a41c5446db94c0cff7151a8ef2297c43a07] | 2015-03-09 23:31:02 | [bigbang/twopeople.py] | 7 | bigbang |
| 57 | 2015-03-09 23:20:54 | Create twopeople.py | kdkim@berkeley.edu | Ki Deuk Kim | 4ba2d1df3cb06eec91795ff22489b5533690dcfa | [8c450a41c5446db94c0cff7151a8ef2297c43a07] | 2015-03-09 23:20:54 | [bigbang/twopeople.py] | 6 | bigbang |
| 58 | 2015-03-05 22:47:56 | Merge branch 'vsporeddy'\n | sbenthall@gmail.com | sb | 8c450a41c5446db94c0cff7151a8ef2297c43a07 | [0b47f504de03817db97e0d3556c98f7c252bc0f9, fef... | 2015-03-05 22:47:56 | [examples/Git Diffs.ipynb] | 1 | bigbang |
| 59 | 2015-03-04 05:48:45 | Update Git Diffs.ipynb\n\nAdded node colors an... | vs.poreddy@gmail.com | Venkata Poreddy | fefb82dbc2b827cafb47edea9678f43f2a411681 | [0b47f504de03817db97e0d3556c98f7c252bc0f9] | 2015-03-04 05:48:45 | [examples/Git Diffs.ipynb] | 3 | bigbang |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |

412 rows × 10 columns
Content source: npdoty/bigbang