The website http://ghtorrent.org/ maintains a mirror of GitHub's public data in a large MongoDB database.
We access and query this database with Blaze.
In [1]:
import blaze
from blaze import Table, into
blaze.__version__
Out[1]:
'0.6.5'
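We reach the database through an SSH tunnel to the server, which works because we previously sent the GHTorrent maintainers an SSH key. The command below forwards local port 27017 (the default MongoDB port) to the same port on the server.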
ssh -L 27017:dutihr.st.ewi.tudelft.nl:27017 ghtorrent@dutihr.st.ewi.tudelft.nl
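Before opening a Table it can help to confirm that the tunnel is actually forwarding the local port. The following is a minimal sketch, not part of the original notebook: the helper name port_is_open is hypothetical, and it only checks that something accepts TCP connections on the given port, not that the service behind it is MongoDB.

```python
import socket


def port_is_open(host, port, timeout=2.0):
    """Return True if a TCP connection to (host, port) succeeds."""
    try:
        # create_connection raises OSError when the connection is
        # refused or the timeout expires.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# 27017 is the local end of the tunnel opened by the ssh command above.
print(port_is_open("localhost", 27017))
```

If this prints False, the ssh command is probably not running (or is forwarding a different port), and the Table calls that follow would fail to connect.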
In [2]:
users = Table('mongodb://ghtorrentro:ghtorrentro@localhost/github::users')
users
Out[2]:
avatar_url
bio
blog
company
created_at
email
followers
following
gravatar_id
hireable
html_url
id
location
login
name
public_gists
public_repos
type
url
0
https://secure.gravatar.com/avatar/a7e55f31bb4...
None
None
None
2012-05-04T13:59:54Z
None
0
0
a7e55f31bb45321f30211e901cd89ffa
None
https://github.com/Michaelwussler
1706010
None
Michaelwussler
None
0
3
User
https://api.github.com/users/Michaelwussler
1
https://secure.gravatar.com/avatar/eb8139078bc...
None
None
None
2012-05-03T18:47:13Z
None
0
0
eb8139078bc623dee103ed3917c080dc
None
https://github.com/praiser
1703505
None
praiser
None
0
3
User
https://api.github.com/users/praiser
2
https://secure.gravatar.com/avatar/13c7b665e0c...
None
2010-04-07T12:15:00Z
vad.viktor@gmail.com
2
3
13c7b665e0cbd94e0155387c35957d13
False
https://github.com/vadviktor
238703
Budapest
vadviktor
Vad Viktor
0
10
User
https://api.github.com/users/vadviktor
3
https://secure.gravatar.com/avatar/b7937805411...
None
Appcelerator
2012-04-02T16:13:58Z
yjin@appcelerator.com
0
0
b7937805411d278ceb839175e251e2a0
False
https://github.com/ypjin
1598831
Beijing
ypjin
Yuping
0
5
User
https://api.github.com/users/ypjin
4
https://secure.gravatar.com/avatar/89e109fca84...
http://blogs.perl.org/users/steven_haryanto
-
2010-02-26T01:28:09Z
stevenharyanto@gmail.com
39
307
89e109fca8474e5636c9feef7a8422ea
False
https://github.com/sharyanto
211084
Jakarta, Indonesia
sharyanto
Steven Haryanto
5
195
User
https://api.github.com/users/sharyanto
5
https://secure.gravatar.com/avatar/7490b4e3e9c...
Perl, C, C++, JavaScript, PHP, Haskell, Ruby, ...
http://c9s.me
2009-02-01T15:20:08Z
cornelius.howl@gmail.com
330
599
7490b4e3e9cb85a1f7dc0c8ea01a86e5
True
https://github.com/c9s
50894
Taipei, Taiwan
c9s
Yo-An Lin
281
206
User
https://api.github.com/users/c9s
6
https://secure.gravatar.com/avatar/dc078ac4dbd...
None
azhari.harahap.us
CapungRiders
2010-10-31T05:53:40Z
azhari@harahap.us
26
11
dc078ac4dbdc06d3e3c0ec0b6801b53d
False
https://github.com/back2arie
461397
Indonesia
back2arie
Azhari Harahap
1
15
User
https://api.github.com/users/back2arie
7
https://secure.gravatar.com/avatar/fb844ffed6c...
Git Ninja and language-agnostic problem solver...
http://dukeleto.pl
Leto Labs LLC
2008-10-22T03:02:15Z
jonathan@leto.net
175
635
fb844ffed6c5a2e69638627e3b721308
True
https://github.com/leto
30298
Portland, OR
leto
Jonathan "Duke" Leto
276
112
User
https://api.github.com/users/leto
8
https://secure.gravatar.com/avatar/3843ec7861e...
http://alanhaggai.org/
Thought Ripples
2009-01-13T16:25:15Z
haggai@cpan.org
46
365
3843ec7861e271e803ea076035d683dd
False
https://github.com/alanhaggai
46288
IN
alanhaggai
Alan Haggai Alavi
4
54
User
https://api.github.com/users/alanhaggai
9
https://secure.gravatar.com/avatar/f611628c558...
None
arisdottle.net
Team Rooster Pirates
2009-05-12T19:29:09Z
amiri@roosterpirates.com
16
87
f611628c5588f7a0a72c65ec1f94dfb8
False
https://github.com/amiri
83806
Los Angeles, CA
amiri
Amiri Barksdale
16
18
User
https://api.github.com/users/amiri
10
https://secure.gravatar.com/avatar/c57483c5cfe...
None
http://www.geekfarm.org/wu/muse/WebHome.html
None
2009-02-08T03:28:54Z
git-c@geekfarm.org
16
87
c57483c5cfe159b98a6e33ee7e9eec38
False
https://github.com/wu
52700
None
wu
Alex White
0
15
User
https://api.github.com/users/wu
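The string passed to Table in In [2] packs the connection details and the collection name into a single resource URI. As a rough illustration of how its parts read, here is a small sketch. It is not Blaze's own parsing code: split_resource_uri is a hypothetical helper, and treating `::` as the separator between the database URI and the collection name is an assumption based on the URIs used in this notebook.

```python
from urllib.parse import urlsplit


def split_resource_uri(uri):
    """Split 'scheme://user:password@host/database::collection' into parts."""
    # Assumption: everything after the last '::' names the collection.
    db_uri, sep, collection = uri.rpartition("::")
    if not sep:
        raise ValueError("expected a '::collection' suffix in %r" % uri)
    parts = urlsplit(db_uri)
    return {
        "scheme": parts.scheme,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "database": parts.path.lstrip("/"),
        "collection": collection,
    }


info = split_resource_uri(
    "mongodb://ghtorrentro:ghtorrentro@localhost/github::users"
)
print(info["database"], info["collection"])  # github users
```

Read this way, the later cells differ from In [2] only in the collection name (repos, issues); the host is localhost because the connection goes through the SSH tunnel.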
In [3]:
users.company
Out[3]:
    company
0   None
1   None
2
3   Appcelerator
4   -
5
6   CapungRiders
7   Leto Labs LLC
8   Thought Ripples
9   Team Rooster Pirates
10  None
In [4]:
users[users.followers > 100][['login', 'followers', 'following', 'blog']]
Out[4]:
    login          followers  following  blog
0   c9s            330        599        http://c9s.me
1   leto           175        635        http://dukeleto.pl
2   bingos         125        277        http://use.perl.org/~bingos/journal/
3   chovy          1056       39044      http://anthony.ettinger.name
4   chapmanb       120        30         http://bcbio.wordpress.com
5   equus12        109        4801       None
6   carljm         177        34         http://www.oddbird.net
7   andrewsmedina  171        295        http://www.andrewsmedina.com
8   jbalogh        172        47         http://jbalogh.me
9   ametaireau     116        57         http://www.notmyidea.org
10  robhudson      239        99         http://rob.cogit8.org/
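To make the meaning of the expression in In [4] concrete, here is a plain-Python sketch of the same selection and projection applied to a few of the user records shown in Out[2]. It only illustrates what the query means. How Blaze translates the expression for MongoDB is not shown in this notebook, and the names sample_users and select_popular are hypothetical.

```python
# A few records copied from the users table in Out[2] (other fields omitted).
sample_users = [
    {"login": "Michaelwussler", "followers": 0, "following": 0, "blog": None},
    {"login": "c9s", "followers": 330, "following": 599,
     "blog": "http://c9s.me"},
    {"login": "leto", "followers": 175, "following": 635,
     "blog": "http://dukeleto.pl"},
    {"login": "praiser", "followers": 0, "following": 0, "blog": None},
]

COLUMNS = ["login", "followers", "following", "blog"]


def select_popular(records, min_followers=100, columns=COLUMNS):
    """Keep records with followers > min_followers, then keep only `columns`."""
    return [
        tuple(rec[col] for col in columns)
        for rec in records
        if rec["followers"] > min_followers
    ]


for row in select_popular(sample_users):
    print(row)
```

Only c9s and leto pass the followers > 100 filter here, matching the first two rows of Out[4].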
In [5]:
repos = Table('mongodb://ghtorrentro:ghtorrentro@localhost/github::repos')
repos
Out[5]:
clone_url
created_at
description
fork
forks
full_name
git_url
has_downloads
has_issues
has_wiki
homepage
html_url
id
language
master_branch
mirror_url
name
open_issues
organization
owner
parent
private
pushed_at
size
source
ssh_url
svn_url
updated_at
url
watchers
0
https://github.com/Michaelwussler/gittest.git
2012-07-12T10:41:03Z
False
1
Michaelwussler/gittest
git://github.com/Michaelwussler/gittest.git
True
True
True
None
https://github.com/Michaelwussler/gittest
5002137
Java
master
None
gittest
0
None
{u'url': u'https://api.github.com/users/Michae...
None
False
2012-07-12T11:40:07Z
164
None
git@github.com:Michaelwussler/gittest.git
https://github.com/Michaelwussler/gittest
2012-07-12T11:40:07Z
https://api.github.com/repos/Michaelwussler/gi...
1
1
https://github.com/sharyanto/perl-Task-BeLike-...
2011-03-16T15:06:38Z
Install modules currently used in SHARYANTO's ...
False
1
sharyanto/perl-Task-BeLike-SHARYANTO-Devel
git://github.com/sharyanto/perl-Task-BeLike-SH...
True
True
True
http://search.cpan.org/dist/Task-BeLike-SHARYA...
https://github.com/sharyanto/perl-Task-BeLike-...
1487560
Perl
master
None
perl-Task-BeLike-SHARYANTO-Devel
0
None
{u'url': u'https://api.github.com/users/sharya...
None
False
2012-07-12T11:35:03Z
608
None
git@github.com:sharyanto/perl-Task-BeLike-SHAR...
https://github.com/sharyanto/perl-Task-BeLike-...
2012-07-12T11:35:03Z
https://api.github.com/repos/sharyanto/perl-Ta...
1
2
https://github.com/Toolpark/irma.git
2012-03-20T11:31:16Z
False
1
Toolpark/irma
git://github.com/Toolpark/irma.git
True
True
True
https://github.com/Toolpark/irma
3774477
JavaScript
master
None
irma
0
{u'url': u'https://api.github.com/users/Toolpa...
{u'url': u'https://api.github.com/users/Toolpa...
None
False
2012-07-12T11:43:31Z
964
None
git@github.com:Toolpark/irma.git
https://github.com/Toolpark/irma
2012-07-12T11:43:31Z
https://api.github.com/repos/Toolpark/irma
2
3
https://github.com/hirakchatterjee/try_git.git
2012-07-12T11:19:45Z
None
False
1
hirakchatterjee/try_git
git://github.com/hirakchatterjee/try_git.git
True
True
True
None
https://github.com/hirakchatterjee/try_git
5002444
None
master
None
try_git
0
None
{u'url': u'https://api.github.com/users/hirakc...
None
False
2012-07-12T11:31:50Z
92
None
git@github.com:hirakchatterjee/try_git.git
https://github.com/hirakchatterjee/try_git
2012-07-12T11:31:50Z
https://api.github.com/repos/hirakchatterjee/t...
1
4
https://github.com/anirbansaha/inmobi_general_...
2012-07-10T05:37:49Z
inmobi_general_cookbooks
False
1
anirbansaha/inmobi_general_cookbooks
git://github.com/anirbansaha/inmobi_general_co...
True
True
True
None
https://github.com/anirbansaha/inmobi_general_...
4969515
Ruby
master
None
inmobi_general_cookbooks
0
None
{u'url': u'https://api.github.com/users/anirba...
None
False
2012-07-12T11:31:44Z
448
None
git@github.com:anirbansaha/inmobi_general_cook...
https://github.com/anirbansaha/inmobi_general_...
2012-07-12T11:31:44Z
https://api.github.com/repos/anirbansaha/inmob...
1
5
https://github.com/mmacedo/myapp.git
2012-07-05T21:09:14Z
Just test
False
1
mmacedo/myapp
git://github.com/mmacedo/myapp.git
True
False
False
None
https://github.com/mmacedo/myapp
4915307
Ruby
master
None
myapp
0
None
{u'url': u'https://api.github.com/users/mmaced...
None
False
2012-07-12T11:35:33Z
356
None
git@github.com:mmacedo/myapp.git
https://github.com/mmacedo/myapp
2012-07-12T11:35:33Z
https://api.github.com/repos/mmacedo/myapp
1
6
https://github.com/rotschopf/SSE.git
2012-05-18T11:38:07Z
False
1
rotschopf/SSE
git://github.com/rotschopf/SSE.git
True
False
False
None
https://github.com/rotschopf/SSE
4368710
VHDL
master
None
SSE
0
None
{u'url': u'https://api.github.com/users/rotsch...
None
False
2012-07-12T11:30:39Z
944
None
git@github.com:rotschopf/SSE.git
https://github.com/rotschopf/SSE
2012-07-12T11:30:39Z
https://api.github.com/repos/rotschopf/SSE
1
7
https://github.com/pokermania/engine.ns.io-cli...
2012-07-05T15:59:51Z
True
0
pokermania/engine.ns.io-client
git://github.com/pokermania/engine.ns.io-clien...
True
False
True
https://github.com/pokermania/engine.ns.io-client
4910102
CoffeeScript
master
None
engine.ns.io-client
0
{u'url': u'https://api.github.com/users/pokerm...
{u'url': u'https://api.github.com/users/pokerm...
{u'has_wiki': True, u'mirror_url': None, u'upd...
False
2012-07-12T11:31:40Z
112
{u'has_wiki': True, u'mirror_url': None, u'upd...
git@github.com:pokermania/engine.ns.io-client.git
https://github.com/pokermania/engine.ns.io-client
2012-07-12T11:31:41Z
https://api.github.com/repos/pokermania/engine...
1
8
https://github.com/trifork/dgws.git
2012-04-12T11:04:29Z
False
3
trifork/dgws
git://github.com/trifork/dgws.git
True
True
True
https://github.com/trifork/dgws
4003806
Java
develop
None
dgws
0
{u'url': u'https://api.github.com/users/trifor...
{u'url': u'https://api.github.com/users/trifor...
None
False
2012-07-12T11:40:57Z
168
None
git@github.com:trifork/dgws.git
https://github.com/trifork/dgws
2012-07-12T11:40:57Z
https://api.github.com/repos/trifork/dgws
4
9
https://github.com/fzoli/MillServer.git
2012-06-27T07:01:42Z
False
1
fzoli/MillServer
git://github.com/fzoli/MillServer.git
True
True
True
None
https://github.com/fzoli/MillServer
4805282
Java
master
None
MillServer
0
None
{u'url': u'https://api.github.com/users/fzoli'...
None
False
2012-07-12T11:31:32Z
75760
None
git@github.com:fzoli/MillServer.git
https://github.com/fzoli/MillServer
2012-07-12T11:31:32Z
https://api.github.com/repos/fzoli/MillServer
1
10
https://github.com/gkno/gkno.github.com.git
2012-02-23T21:46:20Z
False
2
gkno/gkno.github.com
git://github.com/gkno/gkno.github.com.git
True
True
True
gkno.github.com
https://github.com/gkno/gkno.github.com
3530198
None
master
None
gkno.github.com
1
{u'url': u'https://api.github.com/users/gkno',...
{u'url': u'https://api.github.com/users/gkno',...
None
False
2012-07-12T11:31:33Z
160
None
git@github.com:gkno/gkno.github.com.git
https://github.com/gkno/gkno.github.com
2012-07-12T11:31:33Z
https://api.github.com/repos/gkno/gkno.github.com
2
In [6]:
issues = Table('mongodb://ghtorrentro:ghtorrentro@localhost/github::issues')
issues
Out[6]:
assignee
body
closed_at
comments
comments_url
created_at
events_url
html_url
id
labels
labels_url
milestone
number
owner
pull_request
repo
state
title
updated_at
url
user
0
None
TweetLine is a Sublime Text 2 Plugin to post c...
None
0
https://api.github.com/repos/wbond/package_con...
2012-11-20T15:51:49Z
https://api.github.com/repos/wbond/package_con...
https://github.com/wbond/package_control_chann...
8509346
[]
https://api.github.com/repos/wbond/package_con...
None
809
wbond
{u'diff_url': u'https://github.com/wbond/packa...
package_control_channel
open
Add SublimeTweetLine
2012-11-20T15:52:20Z
https://api.github.com/repos/wbond/package_con...
{u'following_url': u'https://api.github.com/us...
1
None
Submitting a new package named AutoIndent whic...
None
0
https://api.github.com/repos/wbond/package_con...
2012-11-20T08:16:05Z
https://api.github.com/repos/wbond/package_con...
https://github.com/wbond/package_control_chann...
8496155
[]
https://api.github.com/repos/wbond/package_con...
None
808
wbond
{u'diff_url': u'https://github.com/wbond/packa...
package_control_channel
open
Added AutoIndent
2012-11-20T08:16:05Z
https://api.github.com/repos/wbond/package_con...
{u'following_url': u'https://api.github.com/us...
2
None
Adding support for my library of Sublime Text ...
None
8
https://api.github.com/repos/wbond/package_con...
2012-11-19T22:47:17Z
https://api.github.com/repos/wbond/package_con...
https://github.com/wbond/package_control_chann...
8485997
[]
https://api.github.com/repos/wbond/package_con...
None
806
wbond
{u'diff_url': u'https://github.com/wbond/packa...
package_control_channel
open
Adding Dayle Rees Color Schemes
2012-11-20T06:17:02Z
https://api.github.com/repos/wbond/package_con...
{u'following_url': u'https://api.github.com/us...
3
None
Added SuperAnt
2012-10-02T02:34:26Z
0
None
2012-09-28T19:32:40Z
None
https://github.com/wbond/package_control_chann...
7226975
[]
None
None
657
wbond
{u'diff_url': u'https://github.com/wbond/packa...
package_control_channel
closed
SuperANT
2012-10-02T02:34:26Z
https://api.github.com/repos/wbond/package_con...
{u'url': u'https://api.github.com/users/aphex'...
4
None
See readme for info!
None
0
https://api.github.com/repos/wbond/package_con...
2012-11-19T19:27:28Z
https://api.github.com/repos/wbond/package_con...
https://github.com/wbond/package_control_chann...
8479860
[]
https://api.github.com/repos/wbond/package_con...
None
805
wbond
{u'diff_url': u'https://github.com/wbond/packa...
package_control_channel
open
Adding Expand Selection by Paragraph Plugin
2012-11-19T19:27:28Z
https://api.github.com/repos/wbond/package_con...
{u'following_url': u'https://api.github.com/us...
5
None
Added JavaScript snippets from: https://github...
None
0
https://api.github.com/repos/wbond/package_con...
2012-11-19T19:23:11Z
https://api.github.com/repos/wbond/package_con...
https://github.com/wbond/package_control_chann...
8479724
[]
https://api.github.com/repos/wbond/package_con...
None
804
wbond
{u'diff_url': u'https://github.com/wbond/packa...
package_control_channel
open
Added JavaScript Snippets
2012-11-19T19:23:11Z
https://api.github.com/repos/wbond/package_con...
{u'following_url': u'https://api.github.com/us...
6
None
See [repository on GitHub](https://github.com/...
None
0
https://api.github.com/repos/wbond/package_con...
2012-11-19T18:49:52Z
https://api.github.com/repos/wbond/package_con...
https://github.com/wbond/package_control_chann...
8478712
[]
https://api.github.com/repos/wbond/package_con...
None
803
wbond
{u'diff_url': u'https://github.com/wbond/packa...
package_control_channel
open
Added ParentalControl Package
2012-11-19T18:49:52Z
https://api.github.com/repos/wbond/package_con...
{u'following_url': u'https://api.github.com/us...
7
None
IMESupport is a plugin to fix an issue of Subl...
None
0
https://api.github.com/repos/wbond/package_con...
2012-11-19T15:50:14Z
https://api.github.com/repos/wbond/package_con...
https://github.com/wbond/package_control_chann...
8472990
[]
https://api.github.com/repos/wbond/package_con...
None
802
wbond
{u'diff_url': u'https://github.com/wbond/packa...
package_control_channel
open
Add IMESupport plugin
2012-11-19T15:50:14Z
https://api.github.com/repos/wbond/package_con...
{u'following_url': u'https://api.github.com/us...
8
None
ThemeSelector is a Sublime Text 2 Plugin to se...
None
0
https://api.github.com/repos/wbond/package_con...
2012-11-19T14:20:34Z
https://api.github.com/repos/wbond/package_con...
https://github.com/wbond/package_control_chann...
8470181
[]
https://api.github.com/repos/wbond/package_con...
None
801
wbond
{u'diff_url': u'https://github.com/wbond/packa...
package_control_channel
open
add ThemeSelector
2012-11-19T14:20:34Z
https://api.github.com/repos/wbond/package_con...
{u'following_url': u'https://api.github.com/us...
9
None
None
0
https://api.github.com/repos/wbond/package_con...
2012-11-19T12:31:38Z
https://api.github.com/repos/wbond/package_con...
https://github.com/wbond/package_control_chann...
8467611
[]
https://api.github.com/repos/wbond/package_con...
None
800
wbond
{u'diff_url': u'https://github.com/wbond/packa...
package_control_channel
open
Added Gauche Syntax
2012-11-19T12:31:38Z
https://api.github.com/repos/wbond/package_con...
{u'following_url': u'https://api.github.com/us...
10
None
None
0
https://api.github.com/repos/wbond/package_con...
2012-11-19T06:40:47Z
https://api.github.com/repos/wbond/package_con...
https://github.com/wbond/package_control_chann...
8460421
[]
https://api.github.com/repos/wbond/package_con...
None
799
wbond
{u'diff_url': u'https://github.com/wbond/packa...
package_control_channel
open
Adding LettuceFarmer plugin
2012-11-19T06:40:47Z
https://api.github.com/repos/wbond/package_con...
{u'following_url': u'https://api.github.com/us...
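The query in In [7] wraps each comparison in parentheses before combining them with `&`. This matters because in Python `&` binds more tightly than `==`. A small sketch with plain integers (the variable names are illustrative only) shows the difference:

```python
owner_id, repo_id = 1, 2

# Intended meaning: both comparisons hold.
with_parens = (owner_id == 1) & (repo_id == 2)

# Without parentheses Python evaluates `1 & repo_id` first, so this is the
# chained comparison `owner_id == (1 & repo_id) == 2`.
without_parens = owner_id == 1 & repo_id == 2

print(with_parens, without_parens)  # True False
```

The same precedence rule applies to column expressions such as issues.owner == 'ContinuumIO', so each condition needs its own parentheses.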
In [7]:
into(list,
     issues[(issues.owner == 'ContinuumIO')
            & (issues.repo == 'blaze')
            & (issues.state == 'open')][['title', 'created_at']])
Out[7]:
[(u"Blaze docs can't be read on iPhone", u'2013-06-24T21:33:56Z'),
(u"Tiny fix about options parsing in the 'chunked dot' bench.",
u'2013-06-04T22:23:51Z'),
(u'Declaring dependencies', u'2013-04-19T05:59:02Z'),
(u'add a basic mailmap file', u'2013-04-12T18:57:20Z'),
(u'After following install instructions "blaze" module won\'t import',
u'2013-03-22T22:13:22Z'),
(u'Mailing list link on http://blaze.pydata.org/ points to GitHub not Google Groups',
u'2013-03-22T22:05:51Z'),
(u"Quickstart first example doesn't work as described",
u'2013-03-21T01:24:53Z'),
(u'Disagreement in size of "float" between blaze and numpy',
u'2013-03-16T17:37:33Z'),
(u'blaze.zeros() slowness', u'2013-03-13T16:06:19Z'),
(u'Opening CTable fails', u'2013-03-03T22:22:14Z'),
(u'fromiter silently catches exceptions thrown by generators, generates bad matrices',
u'2013-03-01T17:36:34Z'),
(u'Add complex32 support', u'2013-02-28T18:34:24Z'),
(u'persistence of tables seems to not be working', u'2013-02-27T18:03:07Z'),
(u'warnings when building extensions (at least on mac os x)',
u'2013-02-27T16:47:24Z'),
(u'Example in quick start docs does not work', u'2013-02-21T11:06:06Z'),
(u'Parsing datashapes with "type Name = ..." in them returns None',
u'2013-02-19T19:44:40Z'),
(u'Vlen implementation issues on Windows', u'2013-02-14T11:59:17Z'),
(u"Can't create Tables using RecordDecl per the examples in the docs",
u'2013-02-01T04:17:56Z'),
(u'many import errors', u'2013-07-18T02:55:58Z'),
(u" BLZ `format` '' is not supported.", u'2013-07-31T15:21:21Z'),
(u'Start on expressoin graph', u'2013-08-23T15:01:07Z'),
(u"Can't create large multidimensional array in BLZ",
u'2013-09-09T08:38:34Z'),
(u'Blaze Kernels', u'2013-09-06T13:58:04Z'),
(u'Iterators in BLZ should be in their own class', u'2013-09-25T09:35:43Z'),
(u'Missing dynd-python dependency requirement', u'2013-10-03T21:00:37Z'),
(u'outdated website examples?', u'2013-11-04T02:50:19Z'),
(u'Update install doc about the dynd dependency. Closed #79',
u'2013-10-29T20:52:45Z'),
(u'blz storage r/w mode is wrong', u'2013-11-04T02:54:42Z'),
(u'The printing code should support general datashapes',
u'2013-11-28T15:35:01Z'),
(u'Added data descriptors for CSV and JSON files. Storage and Array also support them.',
u'2013-12-10T12:33:08Z'),
(u'a basic catalog', u'2013-12-10T08:06:03Z'),
(u'Documentation', u'2013-12-09T16:21:42Z'),
(u'Open command should be load', u'2013-12-04T20:36:58Z'),
(u"Array's from iterators don't determing type correctly",
u'2013-12-03T20:38:29Z'),
(u'Cannot create a blaze array with a Record datashape',
u'2013-12-02T14:03:39Z'),
(u'[WIP] Blaze distributed capabilities', u'2013-12-12T16:43:22Z'),
(u'Shuffle files around', u'2013-12-12T05:44:43Z'),
(u'Do not require uri for local file', u'2013-12-11T22:46:15Z'),
(u'Second round of shuffling', u'2013-12-12T19:25:59Z'),
(u'[WIP] Csv dd cleanup refactor', u'2013-12-18T23:27:39Z'),
(u'Update server code to use catalog', u'2013-12-18T08:02:41Z'),
(u'Data Descriptor cleanup', u'2013-12-17T18:06:25Z'),
(u'Dshape refactor', u'2013-12-19T23:58:34Z'),
(u"use relative imports for tests and blz_ext's use of bparams",
u'2013-12-21T22:24:48Z'),
(u'Skipping cffi test on travis', u'2014-01-01T22:11:52Z'),
(u'Convert remote array to a data descriptor', u'2014-01-08T01:05:46Z'),
(u'[WIP] Remove blz', u'2014-01-13T14:43:52Z'),
(u'Consider renaming blaze.drop function', u'2014-01-14T06:19:37Z'),
(u'blaze.array from iterator type deduction, closes issue #86',
u'2014-01-14T02:14:14Z'),
(u'Clean up blaze.array methods/attributes', u'2014-01-14T06:27:11Z'),
(u'Server compute context', u'2014-01-14T22:06:38Z'),
(u'[WIP] Doc tweaks', u'2014-01-14T06:32:32Z'),
(u'Add a caching mechanism to the blaze catalog', u'2014-01-16T23:02:26Z'),
(u'Reform execution pipeline a bit (work towards better integration of new backends',
u'2014-01-20T16:09:14Z'),
(u'[WIP] HDF5 DataDescriptor', u'2014-01-20T12:19:01Z'),
(u'Catalog module requires yaml', u'2014-01-21T10:06:59Z'),
(u'design doc for numpy-like API', u'2014-01-21T08:06:18Z'),
(u'Uniformexecution', u'2014-01-21T20:26:50Z'),
(u'DataDescriptor for HDF5', u'2014-01-22T17:23:53Z'),
(u'Diagnose and fix problem evaluating nested ufunc calls',
u'2014-01-29T20:44:58Z'),
(u'[WIP] Sql', u'2014-01-29T17:35:37Z'),
(u'Adding internal import details', u'2014-01-30T20:01:48Z'),
(u'Add drop to catalog', u'2014-01-31T17:06:36Z'),
(u'[WIP] HDF5 DataDescriptor docs and examples', u'2014-02-03T16:30:06Z'),
(u'Work on allowing creation of stand-alone blaze functions',
u'2014-02-03T14:48:04Z'),
(u'Backend generalization and gentle start on sql backend',
u'2014-02-03T14:47:08Z'),
(u'Syntax for choosing multiple fields', u'2014-02-05T20:21:37Z'),
(u'Work on AIR debug printing in blaze REPL', u'2014-02-05T17:08:04Z'),
(u'`blaze.open()` should try to recognize file extensions in case `format` param is not passed',
u'2014-02-04T16:52:22Z'),
(u'Pyinterp', u'2014-02-12T11:39:15Z'),
(u'Strategy', u'2014-02-11T18:51:20Z'),
(u'[WIP] Work in sql(ite) data descriptor', u'2014-02-11T15:04:35Z'),
(u'[WIP] Initial design for making hdf5 files acting as native catalog dirs',
u'2014-02-06T17:18:48Z'),
(u'Assignation error between blaze array and numpy array',
u'2014-02-17T14:56:44Z'),
(u'Update README.md to fix a broken link', u'2014-02-14T22:09:30Z'),
(u'hdf5 sample broken', u'2014-02-12T16:54:09Z'),
(u'Cannot get string out of blaze array', u'2014-02-18T18:42:51Z'),
(u'Iteration over blaze arrays returns data descriptors',
u'2014-02-18T18:42:02Z'),
(u'Manual CSV delimiter specification needed', u'2014-02-18T17:18:39Z'),
(u'Dependency on pyparsing', u'2014-02-19T18:19:54Z'),
(u'Adding testing driven code documentation', u'2014-02-19T17:55:22Z'),
(u'Blaze sql record field selection is not lazy', u'2014-02-26T17:47:07Z'),
(u'SQL tutorial and column selection', u'2014-02-26T11:01:45Z'),
(u'Sqldocs', u'2014-02-24T17:21:48Z'),
(u'[WIP] Removing some "import blaze" statements', u'2014-02-20T17:24:04Z'),
(u'Graphs do not fold constants', u'2014-03-03T22:47:26Z'),
(u'Simple graph', u'2014-03-03T22:27:37Z'),
(u'A proposal for a simple SQL cache for Blaze.', u'2014-03-03T18:11:58Z'),
(u'[WIP] Updates for new datashape grammar', u'2014-02-27T03:49:07Z'),
(u"blaze server can't handle names with .", u'2014-03-03T22:53:01Z'),
(u'[WIP] Constant folding', u'2014-03-03T22:48:09Z'),
(u'blaze catalog requires hdf5 file have extension .h5',
u'2014-03-03T22:53:54Z'),
(u'Add hdf5 catalog to server sample', u'2014-03-03T23:17:53Z'),
(u"Samples and doctest aren't tested with unittests",
u'2014-03-04T00:13:39Z'),
(u"Samples and doctest aren't tested with unittests",
u'2014-03-04T00:26:47Z'),
(u'[WIP] Fix sql printing and selection', u'2014-03-04T00:24:18Z'),
(u'[WIP] Use new datashape overloader, general dispatch cleanup',
u'2014-03-14T01:07:11Z'),
(u'Tweak for overloader PR on datashape', u'2014-03-13T21:26:50Z'),
(u'[WIP] Design Doc Update', u'2014-03-07T08:46:24Z'),
(u'Mode in storage is respected in constructors now. Fixes #83.',
u'2014-03-14T21:03:25Z'),
(u'Indexed assignment does not work', u'2014-03-18T17:37:58Z'),
(u'[WIP] Element wise, chunked evaluator, suited for OOC operations',
u'2014-03-17T15:22:22Z'),
(u'Better error message on getting buffers out of deferred arrays',
u'2014-03-19T12:57:18Z'),
(u'Remove scidb (for now), make default overloading explicit',
u'2014-03-19T00:12:57Z'),
(u'Continuing proposal for a SQL cache for Blaze.', u'2014-03-19T19:59:13Z'),
(u'Build dynd in travis', u'2014-03-20T01:39:59Z'),
(u'Adding dynd install from source', u'2014-03-19T23:19:30Z'),
(u'Blaze SQL Example Fails', u'2014-03-20T14:38:12Z'),
(u'Update requirements use only pip on travis', u'2014-03-20T20:42:43Z'),
(u'SQL catalogue parsing', u'2014-03-20T16:04:32Z'),
(u'[WIP] Add ReductionBlazeFunc and instances using it',
u'2014-03-20T23:50:13Z'),
(u'Assignments of operations in ranges does not work',
u'2014-03-21T10:40:34Z'),
(u'A propsoal for handling SQL queries', u'2014-03-21T04:54:48Z'),
(u'[WIP] A first proposal for a Table object', u'2014-03-22T08:48:24Z'),
(u'[WIP] A first proposal for a Table object', u'2014-03-21T16:27:37Z'),
(u'Finish reduction support', u'2014-03-25T21:15:57Z'),
(u'Adding support for the HDF5 format in Storage class',
u'2014-03-26T16:04:22Z'),
(u'[WIP] datetime design doc', u'2014-03-26T08:13:01Z'),
(u'A design document to convert the DataDescriptor class as first-class citizen',
u'2014-03-27T16:25:28Z'),
(u'Link datashape doc to datashape repo', u'2014-03-28T17:38:40Z'),
(u'[WIP] High level parallel expression graph', u'2014-03-28T14:36:27Z'),
(u'[WIP] datetime implementation', u'2014-03-28T22:48:39Z'),
(u'[WIP] A blaze.where() function for filters for HDF5 and BLZ',
u'2014-04-03T15:10:20Z'),
(u'Array.__iter__ yields either scalars or arrays', u'2014-04-07T20:47:13Z'),
(u'Efficient bulk append for DataDescriptors', u'2014-04-07T21:15:42Z'),
(u'CSV_DDesc tweaks', u'2014-04-07T22:53:06Z'),
(u'iterchunks(blen=None) never set to a default', u'2014-04-07T23:31:29Z'),
(u'CSV_DDesc does not respect its own dialect', u'2014-04-07T23:35:51Z'),
(u'Need datasets for comprehensive test suite', u'2014-04-08T14:06:51Z'),
(u'JSON data descriptor reads everything into memory',
u'2014-04-08T15:55:29Z'),
(u'Structured array printing is verbose', u'2014-04-08T15:59:59Z'),
(u'Python_DataDescriptor', u'2014-04-08T18:29:47Z'),
(u'Replace use of `ddesc_as_py` for testing with `list`',
u'2014-04-08T18:38:06Z'),
(u'Getting element from array yield element not array',
u'2014-04-08T18:50:57Z'),
(u'Add Array methods to match numpy interface', u'2014-04-08T19:00:20Z'),
(u'Blaze.JSON_DDesc not compatible with Pandas.DataFrame.to_json',
u'2014-04-08T22:45:17Z'),
(u'[WIP] - Design - Bulk transfer between Data Descriptors',
u'2014-04-08T22:35:52Z'),
(u'[WIP] Reduction tweaks', u'2014-04-08T19:54:50Z'),
(u'Add validate to public blaze API', u'2014-04-09T16:24:07Z'),
(u'[WIP] - Playing with data descriptors', u'2014-04-09T15:32:24Z'),
(u'Replace Capability class with dictionary', u'2014-04-09T17:54:32Z'),
(u'Intelligent caching', u'2014-04-09T21:55:58Z'),
(u'Changes to DyND interrupt development workflow', u'2014-04-09T22:03:46Z'),
(u'File system meta DataDescriptor', u'2014-04-10T14:24:37Z'),
(u'[WIP] Rolling reduce design doc', u'2014-04-10T21:54:50Z'),
(u'Shorten data descriptor file names', u'2014-04-11T14:42:12Z'),
(u'[WIP] Adding support for the netCDF3/netCDF4 format',
u'2014-04-11T12:21:48Z'),
(u'Depend on SQLAlchemy for SQL code generation', u'2014-04-11T15:01:13Z'),
(u'Validate and into', u'2014-04-16T22:50:36Z'),
(u'New Data layer', u'2014-04-16T01:45:28Z'),
(u'Add an optional, no dependencies configuration to travis',
u'2014-04-15T10:01:36Z'),
(u'Dispatched validate and coerce operations', u'2014-04-15T01:07:33Z'),
(u'Table', u'2014-04-25T01:34:21Z'),
(u'[WIP] allow JSON data descriptor to iterate over series of JSON files',
u'2014-04-21T15:01:36Z'),
(u'blaze/data/{dynd,json}.py hide modules when importing in blaze/data/',
u'2014-05-01T22:44:08Z'),
(u'Encode dates/datetimes in JSON data descriptor', u'2014-05-01T21:31:16Z'),
(u'Table Reductions', u'2014-05-07T17:18:49Z'),
(u'[WIP] Initial version of HDFS support via context manager',
u'2014-05-05T01:57:46Z'),
(u'Python join', u'2014-05-15T20:18:18Z'),
(u'Various fixes to Table expressions', u'2014-05-15T17:10:42Z'),
(u'[WIP] Compute layer operations on pyspark RDDs', u'2014-05-15T15:56:26Z'),
(u'Depend on PyToolz', u'2014-05-14T20:29:57Z'),
(u'Support Datetime in HDF5', u'2014-05-12T20:27:30Z'),
(u'Support variable length strings in HDF5', u'2014-05-12T20:23:05Z'),
(u'Datashape discovery', u'2014-05-08T19:57:26Z'),
(u'Add simple static check on expr', u'2014-05-20T16:16:12Z'),
(u'Various fixes, often in SQL', u'2014-05-19T23:07:28Z'),
(u'Dangling file descriptors in `blaze.data.{csv,json}`',
u'2014-05-20T20:36:15Z'),
(u'Apply and Map generic functions onto TableExprs', u'2014-05-22T02:15:00Z'),
(u'Scalar Expressions', u'2014-05-22T01:19:16Z'),
(u'[WIP] Pyspark', u'2014-05-21T22:54:20Z'),
(u'Add nunique operation', u'2014-05-21T18:30:40Z'),
(u'Implicit Joins', u'2014-05-22T21:01:04Z'),
(u'Booleans', u'2014-05-22T18:44:38Z'),
(u'Use Blaze to benchmark various backends', u'2014-05-22T16:47:13Z'),
(u'DyND OOC Backend', u'2014-05-22T16:41:31Z'),
(u'Trivial demonstration development environment ', u'2014-05-22T16:32:04Z'),
(u'Add timezone support to the datetime type', u'2014-05-23T23:45:06Z'),
(u'DyND compute frontend', u'2014-05-23T23:43:30Z'),
(u'Missing data support in DyND', u'2014-05-23T21:57:10Z'),
(u'Jaccard similarity demo', u'2014-05-23T18:28:46Z'),
(u'Label', u'2014-05-23T18:15:48Z'),
(u'Serialization issues with `compute`', u'2014-05-23T14:43:47Z'),
(u'Merge Reorg', u'2014-05-26T17:57:37Z'),
(u'Arbitrary functions', u'2014-05-26T17:45:35Z'),
(u'Blaze Table Object', u'2014-05-26T21:31:19Z'),
(u'Add new quickstart ', u'2014-05-26T21:32:29Z'),
(u'Development blaze on Binstar', u'2014-05-26T22:09:34Z'),
(u'Update Catalog Server', u'2014-05-26T22:25:14Z'),
(u'[WIP] Distinct', u'2014-05-27T15:23:17Z'),
(u'Create `Distinct` term', u'2014-05-27T14:33:45Z'),
(u'Clean up import *', u'2014-05-28T15:53:01Z'),
(u'Jaccard2', u'2014-05-28T19:45:11Z'),
(u'Datashape Discovery', u'2014-05-27T01:01:19Z'),
(u'Impala Backend', u'2014-05-30T16:28:12Z'),
(u'Spark stand-alone mode', u'2014-05-29T15:41:54Z'),
(u'Add compute(Expr, DataDescriptor) implementation',
u'2014-05-30T17:15:17Z'),
(u'merge twitter dataset1 with WDC data', u'2014-05-30T21:08:35Z'),
(u'Python multicolumn groupby', u'2014-05-30T22:41:24Z'),
(u'Spark compute', u'2014-06-06T15:53:38Z'),
(u'Spark', u'2014-06-06T14:54:43Z'),
(u'Scalar Expressions', u'2014-06-06T19:53:30Z'),
(u'Test unicode string support in `blaze.data`', u'2014-06-09T21:45:33Z'),
(u'Delete old Vagrant code, favor conda', u'2014-06-09T22:02:42Z'),
(u'Put `spark` on binstar', u'2014-06-09T23:00:52Z'),
(u'Stress test datashape discovery', u'2014-06-09T22:37:23Z'),
(u'Tune Python Streaming backend', u'2014-06-09T22:36:35Z'),
(u'Fill out Spark implementation', u'2014-06-08T17:28:28Z'),
(u'Jaccard fix', u'2014-06-11T00:33:46Z'),
(u'Vagrant del', u'2014-06-10T22:31:44Z'),
(u'Update documentation for reorg branch', u'2014-06-11T17:42:01Z'),
(u"Multi-input compute doesn't play well with consumable data sources ",
u'2014-06-11T17:55:38Z'),
(u'SciPy 2014 Paper', u'2014-06-11T20:36:46Z'),
(u'[WIP] Reorg Docs', u'2014-06-11T21:23:34Z'),
(u'Blaze server', u'2014-06-16T23:01:29Z'),
(u'Add `into` operation to api', u'2014-06-16T17:33:45Z'),
(u'Interactive Table object', u'2014-06-18T19:35:02Z'),
(u'data: CSV supports sep as alias for delimiter', u'2014-06-18T22:15:10Z'),
(u'projection of filter TableExpr fails on Spark RDDs',
u'2014-06-19T02:18:11Z'),
(u'Delete old stuff', u'2014-06-20T14:05:45Z'),
(u'Structured description of data descriptors', u'2014-06-20T21:44:42Z'),
(u'Various Small fixes', u'2014-06-20T19:24:32Z'),
(u'Fixup quickstart', u'2014-06-23T14:55:26Z'),
(u'Merge reorg', u'2014-06-23T14:46:09Z'),
(u'Imports', u'2014-06-23T19:06:52Z'),
(u'Efficient CSV -> SQL migration', u'2014-06-24T21:40:50Z'),
(u'Multi column join', u'2014-06-24T15:53:10Z'),
(u"SQL extend doesn't preserve schema", u'2014-06-26T15:01:46Z'),
(u'Better csv unicode support with `unicodecsv`', u'2014-06-26T17:07:24Z'),
(u'Small fixes', u'2014-06-26T18:37:07Z'),
(u'[WIP] - HDF5 variable length strings', u'2014-06-26T18:13:36Z'),
(u'Scalar coercion - Server selection', u'2014-06-26T00:37:43Z'),
(u'compute on HDF5 with PyTables', u'2014-06-28T17:45:46Z'),
(u'Sample operation', u'2014-06-30T21:25:16Z'),
(u'unicodecsv is slow', u'2014-07-01T15:27:26Z'),
(u'Coerce works on Spark RDDs', u'2014-07-03T01:05:34Z'),
(u'Extend Projection operation to data descriptors', u'2014-07-03T15:32:52Z'),
(u'expr: Join automatically selects all shared columns',
u'2014-07-03T18:48:18Z'),
(u'Skip gzip csv tests on windows py2.x', u'2014-07-03T18:13:34Z'),
(u'Expression Optimization', u'2014-07-03T22:06:57Z'),
(u'INTO feature for CSV to DB', u'2014-07-09T16:41:55Z'),
(u'Fix repr when Table is backed by mutable data', u'2014-07-11T18:49:15Z'),
(u"setup.py doesn't include unicodecsv", u'2014-07-12T15:01:55Z'),
(u'Added unicde to requirements and docs closes #378',
u'2014-07-12T15:27:28Z'),
(u'Various fixes', u'2014-07-07T14:21:23Z'),
(u'Integer column names not working', u'2014-07-12T17:39:06Z'),
(u"Conda install doesn't install toolz dependency", u'2014-07-12T23:29:55Z'),
(u"spark tests aren't skipped when spark isn't installed",
u'2014-07-13T01:49:52Z'),
(u"Don't run spark tests if pyspark isn't available",
u'2014-07-13T11:43:39Z'),
(u'Assist Spark users in parsing CSV files', u'2014-07-02T21:23:25Z'),
(u'Refactor recursion out of compute ', u'2014-07-02T14:41:59Z'),
(u'Multiprocessing meta-backend', u'2014-07-01T16:46:31Z'),
(u'Data descriptor constructors should specify missing values',
u'2014-07-14T13:43:07Z'),
(u'CSV header handling', u'2014-07-14T13:47:13Z'),
(u'`data.py[...]` should avoid returning an iterator when data is small',
u'2014-07-14T13:48:45Z'),
(u'Improve error message for DataDescriptor.__len__',
u'2014-07-14T13:50:05Z'),
(u'Put docs on readthedocs', u'2014-07-14T13:50:53Z'),
(u'Handle missing data in SQL data descriptor', u'2014-07-14T13:55:50Z'),
(u'`rpy2` integration', u'2014-07-14T14:15:42Z'),
(u'server expression security improvements', u'2014-07-14T19:00:13Z'),
(u'support for Map of Columnwise', u'2014-07-14T20:47:52Z'),
(u'Travis conda', u'2014-07-14T15:54:27Z'),
(u'Reduction dshape and csv missing values', u'2014-07-16T15:45:56Z'),
(u'Outer join', u'2014-07-16T21:28:54Z'),
(u'Access columns as attributes, rather than with strings',
u'2014-07-17T00:41:16Z'),
(u'Add `into` implementations for TableExprs', u'2014-07-17T01:21:50Z'),
(u'Add to `into`', u'2014-07-17T01:58:00Z'),
(u'Implement __getattr__', u'2014-07-17T03:39:37Z'),
(u'Consider using setuptools to install instead of distutils',
u'2014-07-17T13:53:32Z'),
(u'How to handle missing values in HDF5?', u'2014-07-17T15:56:05Z'),
(u'Dependency list is incomplete and contradictory', u'2014-07-17T18:33:38Z'),
(u'CSV keyword arguments documentation', u'2014-07-17T19:11:41Z'),
(u'compute: projection of data descriptor uses `.py`',
u'2014-07-17T19:43:51Z'),
(u'CSV: errors and encoding arguments', u'2014-07-17T20:58:54Z'),
(u'SQL databases match nullability to datashape.Option',
u'2014-07-17T22:21:04Z'),
(u'Broken links in \'blaze.pydata.org"', u'2014-07-17T22:29:34Z'),
(u'Fixed two links and added google analytics tracking',
u'2014-07-17T23:20:12Z'),
(u'Implement a scalar expression parser', u'2014-07-17T23:03:24Z'),
(u'Add to the CSV docstring', u'2014-07-17T22:54:45Z'),
(u'Cleanup Scalar a bit', u'2014-07-18T12:48:30Z'),
(u'By of merged columns has stopped working.', u'2014-07-18T18:27:18Z'),
(u'Import of blaze.expr.scalar.* breaks merge', u'2014-07-18T18:31:31Z'),
(u'Dev install instructions', u'2014-07-19T16:20:47Z'),
(u'10 minutes to Blaze', u'2014-07-19T22:41:00Z'),
(u'python: by maps call to compute onto child', u'2014-07-20T16:22:49Z'),
(u'Add funders to webpage', u'2014-07-21T22:02:30Z'),
(u'Selection for Date columns in SQL backend produces odd expression',
u'2014-07-22T17:58:56Z'),
(u'Support some NoSQL Database', u'2014-07-23T00:45:17Z'),
(u'flatMapValue PySpark equivalent in Blaze', u'2014-07-24T15:35:39Z'),
(u'Individual columns should be able to repr if not passed in CSV',
u'2014-07-27T17:43:40Z'),
(u'Raise when Table has a different schema than the underlying data',
u'2014-07-27T18:33:43Z'),
(u'Add google analytics to docs', u'2014-07-22T13:25:47Z'),
(u'Fix double return', u'2014-07-28T18:40:40Z'),
(u'Relax constraint that `By` must use reductions', u'2014-07-29T22:20:14Z'),
(u'BColz', u'2014-07-31T03:17:53Z'),
(u"`count` operation doesn't consider missing values",
u'2014-07-31T17:13:00Z'),
(u"expr: selection doesn't fail on non-rowwise child",
u'2014-08-05T14:34:38Z'),
(u'Visualize the capabilities of each backend', u'2014-08-05T20:28:24Z'),
(u'bcolz, blz, and chunks', u'2014-08-05T19:56:38Z'),
(u'Doc refresh', u'2014-08-05T22:36:08Z'),
(u'[WIP] Bcolz copy', u'2014-08-05T16:13:18Z'),
(u'MongoDB Backend', u'2014-08-01T22:32:16Z'),
(u'PyTables computational backend', u'2014-07-30T19:35:04Z'),
(u'Build Blaze on jenkins, upload to binstar blaze-dev account',
u'2014-08-07T16:16:02Z'),
(u'Make blaze.test() return True or False', u'2014-08-08T20:33:29Z'),
(u'documentation link in the README is broken', u'2014-08-09T00:02:22Z'),
(u'Consistent column naming scheme', u'2014-08-12T15:18:24Z'),
(u'Spark by should use reduceby or foldby', u'2014-08-12T18:44:53Z'),
(u'Comprehensive test suite for `into`', u'2014-08-12T18:58:19Z'),
(u'Parallel chunking or streaming backend', u'2014-08-12T19:04:16Z'),
(u'Into test', u'2014-08-12T21:12:49Z'),
(u'pandas: enforce expression column names on `by`', u'2014-08-12T15:32:07Z'),
(u'Add BColz and chunking backend', u'2014-08-12T02:52:50Z'),
(u'Graceful handling of empty results', u'2014-08-13T15:40:37Z'),
(u'into: test foo <- CSV', u'2014-08-13T15:18:20Z'),
(u'SQL Table Overwrite', u'2014-08-14T04:01:10Z'),
(u"'by' of pandas DataFrame doesn't work as expected",
u'2014-08-14T13:46:25Z'),
(u'add into(DataFrame, pytables Table)', u'2014-08-11T21:21:07Z'),
(u'[WIP] Feature/csv to sql natively', u'2014-08-10T02:36:58Z'),
(u'Maintain length in table expressions', u'2014-08-15T01:44:20Z'),
(u'`from blaze import *` results in override of built-ins ',
u'2014-08-15T14:05:14Z'),
(u'dispatch on mathematical functions', u'2014-08-15T15:17:46Z'),
(u'Open world assumption and 3VL in Blaze', u'2014-08-15T17:14:21Z'),
(u'Compute on scalar expressions', u'2014-08-15T19:39:11Z'),
(u'Overload `__len__` to work on Table Expressions and on Table interactive objects',
u'2014-08-15T18:35:50Z'),
(u'Which packages should be required for blaze, which should be optional?',
u'2014-08-15T22:27:59Z'),
(u'Look towards dplyr for ideas to expand expression input',
u'2014-08-16T13:43:04Z'),
(u'ETL on bad CSV data - what should we do?', u'2014-08-18T16:05:15Z'),
(u'[WIP] - Summary', u'2014-08-18T17:30:20Z'),
(u'Compute on scalar expressions', u'2014-08-18T17:23:05Z'),
(u'comprehensive compute tests', u'2014-08-16T17:57:20Z'),
(u"[WIP] csv_into (don't merge)", u'2014-08-18T20:25:04Z'),
(u'Release Blogpost', u'2014-08-19T15:29:15Z'),
(u'General function expressions', u'2014-08-19T18:51:04Z'),
(u"Don't use eval when evaluating RealMath subclasses",
u'2014-08-19T20:11:47Z'),
(u'drop and create_index dispatched functions', u'2014-08-20T14:56:29Z'),
(u'Does not list pymongo as a requirement', u'2014-08-20T16:37:23Z'),
(u'Small fixes 3', u'2014-08-20T18:32:03Z'),
(u'WIP: compute on HDF5 with PyTables', u'2014-08-19T21:48:48Z'),
(u'Add persistent storage systems to into comprehensive test',
u'2014-08-20T18:37:17Z'),
(u'SQLAlchemy string types - encoding and fixed lengths',
u'2014-08-20T21:27:43Z'),
(u'[WIP] - `dplyr` interface`', u'2014-08-19T15:45:09Z'),
(u'Bug: columns attribute of TableSymbol is None when creating a schema with discover(tables.Table)',
u'2014-08-21T14:34:33Z'),
(u'WIP: Add create_index / drop_index functionality',
u'2014-08-21T20:32:18Z'),
(u'into implementation for SQL using CSV loading', u'2014-08-22T17:15:25Z'),
(u'WIP: RethinkDB for blaze', u'2014-08-23T04:47:39Z'),
(u'Added example rpy2 conversion', u'2014-08-23T01:32:23Z'),
(u'update readme and docs with api changes', u'2014-08-22T18:16:26Z'),
(u'Refactor Chunks', u'2014-08-22T17:27:13Z'),
(u'Continue to test and improve `into`', u'2014-08-22T16:21:23Z'),
(u'[WIP] into(pytables Table, csv) with option to ignore errors in CSV files',
u'2014-08-22T04:35:34Z'),
(u'GZipped CSV <- SQL with new migration system', u'2014-08-23T20:49:44Z'),
(u'Remove old core directory', u'2014-08-23T16:34:17Z'),
(u'Lightweight descriptor for various file formats like Excel, SPSS',
u'2014-08-24T00:25:41Z'),
(u'Moar CSV fixes!', u'2014-09-19T23:07:44Z'),
(u'Parse datetimes in CSV.reader', u'2014-09-17T23:26:43Z'),
(u'Move datetime logic from into to csv.reader', u'2014-09-17T18:40:53Z'),
(u'[WIP] Test table coverage', u'2014-09-17T13:43:57Z'),
(u'Chunked into', u'2014-09-17T12:43:31Z'),
(u'API for moving between type systems', u'2014-09-17T00:58:07Z'),
(u'A Blaze equivalent for: SELECT * WHERE t.column IN list_values',
u'2014-09-16T20:46:45Z'),
(u"Error in into(DataFrame, '/*.%s.gz' % dataset) that used to work",
u'2014-09-16T06:51:06Z'),
(u'How should we handle pulling strings out of HDF5?',
u'2014-09-13T21:40:59Z'),
(u'Ideas to clean codebase', u'2014-09-12T17:27:04Z'),
(u'Adding examples, datasets for examples, and .coveragerc ignores.',
u'2014-09-11T18:57:29Z'),
(u'SparkSQL HiveQL', u'2014-09-10T23:18:29Z'),
(u'SparkSQL map', u'2014-09-10T23:17:20Z'),
(u'Google BigQuery', u'2014-09-09T19:23:15Z'),
(u'Google Spreadsheet Table', u'2014-09-09T18:29:54Z'),
(u'[WIP] Handle sqlite INTO call on Windows', u'2014-09-08T21:59:39Z'),
(u'Consider using this tox setup for testing HDFS related work',
u'2014-09-08T17:03:29Z'),
(u'NetCDF4 Backend', u'2014-09-08T15:44:04Z'),
(u"Investigate use of SQLAlchemy's ORM system for sql generation",
u'2014-09-07T19:42:53Z'),
(u'[RFC] - Arrays', u'2014-09-07T01:23:53Z'),
(u'xfail on sqlite3 command not available on windows',
u'2014-09-07T00:00:21Z'),
(u'Prevent coveralls from commenting', u'2014-09-06T20:24:14Z'),
(u'CSV Headers with Spark', u'2014-09-04T20:36:02Z'),
(u'#362 Sample operation initial work', u'2014-09-04T19:07:41Z'),
(u'General Performance Guideline: Backend comparison',
u'2014-09-04T18:45:26Z'),
(u'Increase testing coverage', u'2014-09-04T17:20:26Z'),
(u'into(Spark/HDFS, SQL DBs)', u'2014-09-03T23:06:30Z'),
(u'Creating a `test_compute_exhaustive.py`', u'2014-09-03T18:57:15Z'),
(u'Add developer docs on how to build a new Expression type',
u'2014-09-03T17:41:11Z'),
(u'Add more usage examples to the docs', u'2014-09-02T22:19:54Z'),
(u'String matching operation', u'2014-09-02T21:34:51Z'),
(u"str(Table.count()) doesn't show count", u'2014-09-02T21:03:51Z'),
(u"SQL <- CSV doesn't work properly with sqlite", u'2014-09-02T16:39:50Z'),
(u'rollapply/rolling/window operation', u'2014-09-02T15:07:15Z'),
(u'Create frontend to match LINQ syntax', u'2014-09-01T19:49:36Z'),
(u'Support frequent releases', u'2014-09-01T16:26:29Z'),
(u'Display expr information', u'2014-08-30T19:04:29Z'),
(u'Compute pool with timeouts for Server', u'2014-08-30T17:44:40Z'),
(u'PyCon Submission', u'2014-08-28T20:57:25Z'),
(u'Datetime support (and more robust support in general) in PyTables',
u'2014-08-28T14:20:27Z'),
(u'Use COPY function from Psycopg', u'2014-08-28T03:01:19Z'),
(u'Submit paper for PyHPC 2014', u'2014-08-26T16:40:43Z'),
(u'Discussion of how to support a large number of backends',
u'2014-08-26T15:04:21Z'),
(u'Improve internal documentation/scripts to update documentation',
u'2014-08-25T16:38:13Z'),
(u'Change scalar_symbol into expr', u'2014-09-26T03:58:53Z'),
(u'[WIP] SciDB backend', u'2014-09-26T01:37:30Z'),
(u'Pytables column head', u'2014-09-26T01:32:40Z'),
(u'string operations', u'2014-09-25T23:49:50Z'),
(u'WIP: Implement ColumnWise for MongoDB', u'2014-09-25T23:11:35Z'),
(u'HBase', u'2014-09-25T22:32:52Z'),
(u'Update server design doc', u'2014-09-25T21:25:46Z'),
(u'Allow ignoring particular exceptions when using glob resources',
u'2014-09-25T20:15:22Z'),
(u"Blaze channel blaze install doesn't include dependencies",
u'2014-09-24T13:47:07Z'),
(u'Rename Like, Regex to TextLike, TextRegex', u'2014-09-24T12:13:43Z'),
(u'Insert projections opportunistically into expressions',
u'2014-09-24T01:29:40Z'),
(u'WIP: Fix mysql into', u'2014-09-24T01:14:05Z'),
(u'Attribute expressions', u'2014-09-22T17:21:13Z'),
(u'Support datetime attributes ', u'2014-09-22T13:07:39Z'),
(u'Support hive, presto through pyhive project', u'2014-09-21T19:42:37Z'),
(u'Rename `*_index` with `index_*`', u'2014-09-21T19:32:28Z'),
(u'API: somethoughts / ideas', u'2014-10-01T19:33:49Z'),
(u'Break isnull type operations out into a separate expression',
u'2014-09-30T22:01:53Z'),
(u'API: PyTables/Pandas/HDF5', u'2014-09-30T17:01:41Z'),
(u'Required kwargs for certain dispatched functions.',
u'2014-09-30T16:36:31Z'),
(u'BUG,DOC: "Examples" link 404', u'2014-09-30T15:02:59Z'),
(u'Misleading error message when building table', u'2014-09-29T21:20:04Z'),
(u'[WIP] Test table api', u'2014-10-02T14:21:57Z'),
(u'into SQL <- CSV sends header as data', u'2014-09-29T13:47:30Z'),
(u'[WIP] Refactor Expr', u'2014-09-27T02:26:13Z'),
(u'Accept dot-delimited schemaname.tablename ', u'2014-09-27T02:22:32Z'),
(u'API: print/show_backends', u'2014-09-26T15:31:01Z'),
(u'Use dir and getattr to dispatch methods based on datashape',
u'2014-09-26T12:18:41Z'),
(u'SQL <- CSV loading errors pop up inappropriately',
u'2014-10-03T20:15:41Z'),
(u'Docs: Include MongoDB examples/docstrings in the website',
u'2014-10-03T20:30:11Z'),
(u'How should we handle user facing warnings?', u'2014-10-03T22:31:52Z'),
(u'Problem converting expression Column to nd.array',
u'2014-10-04T19:12:52Z'),
(u'Datetime access in SQL databases', u'2014-10-06T01:52:39Z'),
(u'More datetime access expressions', u'2014-10-06T01:53:17Z'),
(u'Nested behavior in MongoDB', u'2014-10-06T01:56:14Z'),
(u'Nested behavior in Python, Spark', u'2014-10-06T01:57:01Z'),
(u'Resource for MongoDB connection string', u'2014-10-06T16:32:57Z'),
(u'Handle Gzip complexity in csvopen', u'2014-10-07T15:30:31Z'),
(u'Various fixes 4', u'2014-10-06T15:47:15Z'),
(u'CI: use appveyor to build for windows', u'2014-10-07T16:32:49Z')]
Replace list with DataFrame, np.ndarray, or a filename in your favorite format to store the results in a different system.
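For comparison, here is a minimal sketch of moving such a list of tuples into CSV text by hand, using only the standard library. It does not use Blaze. The function name rows_to_csv and the column names title and created_at are assumptions made for illustration; the two rows are copied from the output above.

```python
import csv
import io

# Two (title, created_at) rows copied from the output above.
rows = [
    ('Spark compute', '2014-06-06T15:53:38Z'),
    ('Spark', '2014-06-06T14:54:43Z'),
]


def rows_to_csv(rows, header):
    """Serialise an iterable of tuples to CSV text with a header row.

    Hypothetical helper for illustration; not part of Blaze.
    """
    buf = io.StringIO(newline='')   # newline='' leaves line endings to the csv module
    writer = csv.writer(buf)
    writer.writerow(header)         # header names are assumed, see the text above
    writer.writerows(rows)
    return buf.getvalue()


csv_text = rows_to_csv(rows, ('title', 'created_at'))
print(csv_text)
```

Each extra target (a DataFrame, an array, a database table) would need its own conversion code like this, which is the repetition a single conversion call is meant to remove.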
In [8]:
from blaze import compute, TableExpr, dispatch
from blaze.compute.mongo import MongoQuery

@dispatch(TableExpr, MongoQuery, dict)
def post_compute(expr, q, d):
    # This step normally sends the query to the server.
    # We override it to return the generated query instead.
    return q.query
In [9]:
compute(users[users.followers > 100][['login', 'followers', 'following', 'blog']].head(10))
Out[9]:
({'$match': {'followers': {'$gt': 100}}},
{'$project': {'blog': 1, 'followers': 1, 'following': 1, 'login': 1}},
{'$limit': 10})
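To read the three stages above, it may help to see roughly the same logic in plain Python over a list of dicts. This is only a sketch: the records are made up (not GHTorrent data), match_project_limit is a hypothetical helper, and it assumes every record carries an integer followers field.

```python
import itertools

# Hypothetical sample records standing in for documents in the users collection.
sample_users = [
    {'login': 'alice', 'followers': 250, 'following': 10,
     'blog': None, 'location': 'Berlin'},
    {'login': 'bob', 'followers': 3, 'following': 1,
     'blog': None, 'location': 'Paris'},
    {'login': 'carol', 'followers': 101, 'following': 7,
     'blog': 'http://example.com', 'location': 'Berlin'},
]


def match_project_limit(docs, min_followers, fields, n):
    """Plain-Python analogue of $match -> $project -> $limit."""
    # $match with $gt: keep records whose followers exceed the threshold.
    matched = (d for d in docs if d['followers'] > min_followers)
    # $project: keep only the requested fields.
    projected = ({k: d[k] for k in fields} for d in matched)
    # $limit: stop after the first n records.
    return list(itertools.islice(projected, n))


result = match_project_limit(
    sample_users, 100, ['login', 'followers', 'following', 'blog'], 10)
print(result)
```

The generators are lazy, so nothing past the first n matching records is processed, much as MongoDB can stop early once $limit is satisfied.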
In [10]:
compute(users.location.count_values().head(10))
Out[10]:
({'$project': {'location': 1}},
{'$group': {'_id': {'location': '$location'}, 'count': {'$sum': 1}}},
{'$project': {'count': '$count', 'location': '$_id.location'}},
{'$sort': {'count': -1}},
{'$limit': 10})
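The $group, $sort, and $limit stages above amount to a frequency count followed by a top-n selection. A minimal plain-Python sketch of that idea, using made-up location values and a hypothetical helper named count_values_local:

```python
from collections import Counter

# Hypothetical location values; None marks a user who left the field empty.
locations = ['Berlin', 'Paris', 'Berlin', None, 'Tokyo', 'Berlin', 'Paris']


def count_values_local(values, n):
    """Plain-Python analogue of $group with {'$sum': 1}, $sort by count
    descending, then $limit n. Returns (value, count) pairs."""
    counts = Counter(values)        # one counter per distinct value, None included
    return counts.most_common(n)    # highest counts first, truncated to n


top = count_values_local(locations, 2)
print(top)
```

The difference in practice is where the work happens: the pipeline runs inside MongoDB, so only the ten result rows cross the ssh tunnel rather than the whole users collection.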
Content source: xlhtc007/blaze