These example notebooks introduce probabilistic programming. We first look at applications: how generative models can be applied to populations in BayesDB. We then progress to creating new generative models in Venture that better fit our intuitions about those populations. BayesDB and Venture are developed by the MIT Probabilistic Computing Group.
We invite you to watch a presentation (slides) on the subject of this tutorial by Vikash Mansinghka.
Before we get started...
Signing up with your name and email helps build a community of support and helps improve your user experience. When you sign up, we collect information including the commands you tried, how long they took, what errors they resulted in, any additional data that you import, etc. If you provide your email, we will invite you to a low-traffic announcements list. Please include the name and email you use below in any reports of bugs or surprises. Send those reports to bayesdb@mit.edu or via GitHub.
If security is a primary concern, then you should do a security audit (and share the results with us) before using the software. As this is alpha software, results may not be reliable. DO NOT USE THIS SOFTWARE FOR HIPAA-COVERED, PERSONALLY IDENTIFIABLE, OR SIMILARLY SENSITIVE DATA!
Please fill in your name and email, then use shift-return (or the play button above) to run the cell.
In [4]:
name = ""
email = ""
with open('bayesdb-session-capture-opt.txt', 'w') as optfile:
    optfile.write('%s <%s>\n' % (name, email))
    # To opt out, use optfile.write('False') instead.
    # Even if you opt out of sending details, you still allow us to count how often users opt out.
    # You can opt in or opt out on a per-population basis using the session_capture_name option to Population.
    # You must choose to either opt in or opt out.
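The opt-in and opt-out choices described in the cell above can be sketched as a pair of small functions. This is only an illustration: `write_session_capture_opt` and `read_session_capture_opt` are hypothetical helpers that are not part of BayesDB, and the `directory` parameter is an assumption added so the file can be written somewhere other than the working directory. Only the file name and the two file contents come from the cell above.

```python
import os
import tempfile

# File name taken from the cell above.
OPT_FILENAME = 'bayesdb-session-capture-opt.txt'


def write_session_capture_opt(name, email, opt_in=True, directory='.'):
    """Hypothetical helper: record an opt-in or opt-out choice.

    Returns the path of the file that was written.
    """
    path = os.path.join(directory, OPT_FILENAME)
    with open(path, 'w') as optfile:
        if opt_in:
            # Opt in: store "name <email>", as in the original cell.
            optfile.write('%s <%s>\n' % (name, email))
        else:
            # Opt out: store the literal string 'False'.
            optfile.write('False')
    return path


def read_session_capture_opt(directory='.'):
    """Hypothetical helper: return the recorded choice as a string."""
    with open(os.path.join(directory, OPT_FILENAME)) as optfile:
        return optfile.read()


# Demonstration in a throwaway directory, so no real choice is overwritten.
demo_dir = tempfile.mkdtemp()
write_session_capture_opt('Your Name', 'you@example.com', directory=demo_dir)
opted_in = read_session_capture_opt(demo_dir)
write_session_capture_opt('', '', opt_in=False, directory=demo_dir)
opted_out = read_session_capture_opt(demo_dir)
```

Reading the file back is a quick way to confirm which choice was recorded before you continue with the tutorial.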
For those unfamiliar with the software, languages, or concepts we will use in this tutorial, we recommend:
You do not need extensive knowledge of any of these to read our examples, so feel free to skip ahead. If you are unfamiliar with one of these technologies, though, some initial learning will help you experiment confidently and complete the suggested exercises.
BayesDB allows you to query your data as other SQL database systems do. It also allows you to query the implications of your data. We explore these capabilities using information about satellites orbiting our planet.
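To make "query your data as other SQL database systems do" concrete, here is a minimal sketch of an ordinary SQL query using Python's standard `sqlite3` module. It does not use BayesDB or its query language. The table name, column names, and rows are invented placeholders for illustration and are not taken from the satellites dataset.

```python
import sqlite3

# An in-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(':memory:')

# Hypothetical schema: these column names are assumptions for illustration.
conn.execute(
    'CREATE TABLE satellites (name TEXT, purpose TEXT, period_minutes REAL)')

# Invented placeholder rows, not real satellite records.
rows = [
    ('SAT-A', 'Communications', 1436.0),
    ('SAT-B', 'Earth Observation', 98.0),
    ('SAT-C', 'Communications', 1436.1),
]
conn.executemany('INSERT INTO satellites VALUES (?, ?, ?)', rows)

# A plain SQL question about the data itself: how many satellites per purpose?
cursor = conn.execute(
    'SELECT purpose, COUNT(*) FROM satellites '
    'GROUP BY purpose ORDER BY purpose')
counts = cursor.fetchall()
print(counts)

conn.close()
```

A query like this only summarizes rows that are already in the table. Questions about the implications of the data, such as which values are probable for a missing entry, need a model of the population in addition to the stored rows.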
TODO: The same in smaller chunks, with those chunks expanded, promised here.
Because a default BayesDB model is unlikely to model your data plausibly, and because we do not yet have the tools to be confident that any model has captured the relationships in a population well, BayesDB is not ready for use for higher levels of analysis.
As you work with your data, do not attempt to use BayesDB for:
For somewhat temporary technical reasons, BayesDB is not ready to handle very large populations, except by sub-sampling them (violating the caveat against inferential analysis!).
While the group focuses on better model types and inference strategies, we still intend to grow past some of these limitations. If these interest you, please work with us towards those goals.
With those caveats, we explore a "new" dataset using BayesDB:
TODO: the same in smaller chunks, with those chunks expanded, is promised here.
To work with your own data, please contact the group to have a conversation about the population you want to explore, about appropriate types of analysis, and to learn how to unlock analysis. We lock this feature because users have frequently misunderstood the limitations of our software, drawing unwarranted inferences. The concepts are easy to misuse, the software is in an early alpha version, and working with our team will help keep egg off your face, or worse.
Venture is a prototype general-purpose probabilistic computing platform. In Venture, one can create novel probabilistic models and inference strategies that allow efficient learning for those models. Venture is programmed primarily in VentureScript, but also supports applications written in other probabilistic or traditional programming languages. In this tutorial we will explore a mix of the VentureScript language and the Python API to Venture.
TODO: Tutorial examples promised here.
In [5]:
import os
os.getcwd()
Copyright (c) 2010-2016, MIT Probabilistic Computing Project
Licensed under Apache 2.0 (edit cell for details).