Intro to Python Package / Environment Management with Anaconda

Nelson Liu

Oct. 17, 2016

What is Anaconda (conda) anyway?

  • A distribution of Python that comes with two handy tools for environment and package management, conda-env and conda, respectively.
  • Makes it easy to create, reproduce, and export your development requirements
  • Easily use multiple Python versions on the same machine!
  • Handles native dependencies (C/C++/Cython) very well by installing prebuilt binaries
    • this is important, since numpy, scipy, scikit-learn etc. all have parts written in C for speed!
  • Optimized for scientific Python
  • Open source, extremely actively developed, (mostly) compatible with pip

virtualenv, pip, and conda

  • virtualenv is an environment manager for Python
  • pip is a package manager for Python
  • conda does both!
  • table comparing virtualenv, pip, and conda
  • you can still install packages with pip inside a conda environment, but it's preferable to use the conda package when one exists
    • conda gives you the flexibility to use pip when you need it!

Why Bother?

Many systems nowadays come equipped with Python, so why should I use something else?

The Philosophy Behind Environments

  • Each conda environment is separated from your default system install
    • no risk of messing up dependencies or system python, e.g. with sudo pip install
  • Separate dependencies for every project, avoiding dependency conflicts
  • easy to reproduce and recreate environments
    • this is super important! It ensures that your results and analyses are reproducible by other people.
  • experiment quickly with different package and python versions
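
One way to make an environment reproducible is to describe it in an environment.yml file, which conda can write with conda env export and read back with conda env create -f environment.yml. The sketch below only illustrates what a minimal file of that kind looks like. The helper name render_environment_yml, the environment name "demo", and the package list are placeholders made up for this example, not part of conda.

```python
def render_environment_yml(name, dependencies):
    """Return the text of a minimal conda environment.yml file."""
    lines = ["name: " + name, "dependencies:"]
    # Each dependency becomes one YAML list item, e.g. "  - numpy".
    lines.extend("  - " + dep for dep in dependencies)
    return "\n".join(lines) + "\n"


# Placeholder environment name and package list, for illustration only.
print(render_environment_yml("demo", ["python=3.5", "numpy", "scipy"]), end="")
```

Checking a file like this into a project lets someone else rebuild the same environment instead of guessing which packages were installed.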

Miniconda vs Anaconda

Which should I choose?

  • Miniconda (my personal preference) is ideal for people who want a minimal installation and want to manage their own set of packages.
  • Anaconda is much larger and takes longer to download and set up.
  • We'll use Miniconda for the purposes of this tutorial, since it's fast to set up.

Miniconda 2 vs Miniconda 3

  • Miniconda 2.x uses Python 2.x by default while Miniconda 3.x uses Python 3.x by default
  • However, you can still create Python 2 or 3 environments in either!
  • Pick whichever one you want to be your "default" Python

Downloading and Installing Conda

  • Get Miniconda here

    • Make sure to pick the right version for your machine and OS, and whatever Python version you want as default!
  • Windows: Just run the .exe. If unsure about any settings, the defaults are fine.

  • Mac and Linux: Open your terminal, and navigate to where the file was downloaded (cd ~/Downloads should do by default). Run the file by running bash [filename], e.g. bash Miniconda3-latest-MacOSX-x86_64.sh. Follow the instructions that appear in order to install.

Did it work?

  • After you've finished, restart your terminal if applicable.
  • Run conda info in the command prompt or terminal to verify that it installed correctly.
  • Important: Run conda update conda to get the latest version of conda!
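
If conda info fails with a "command not found" error, the usual cause is that conda is not on your PATH yet. The following is a minimal sketch of that check, assuming Python 3.5 or newer. The function names find_conda and conda_version are made up for this example; the only conda command used is conda --version.

```python
import shutil
import subprocess


def find_conda(search_path=None):
    """Return the full path of the conda executable, or None if not found."""
    # search_path=None means "use the PATH environment variable".
    return shutil.which("conda", path=search_path)


def conda_version(conda_path):
    """Return the text printed by `conda --version`."""
    result = subprocess.run(
        [conda_path, "--version"],
        stdout=subprocess.PIPE,
        # Some conda versions print the version to stderr, so capture both.
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    return result.stdout.strip()


conda_path = find_conda()
if conda_path is None:
    print("conda was not found on PATH; try restarting the terminal")
else:
    print(conda_path, "->", conda_version(conda_path))
```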

Setting up your environment

  • Windows: use either the Command Prompt, Cygwin (if installed), or the Anaconda Prompt
  • OSX / Linux: use your favorite terminal
  • When the terminal opens, it starts in the root conda environment!

Commands for working with conda environments

Command                                   Operation
conda create -n <env_name>                create a new conda environment
source activate <env_name> (OSX/Linux)    activate a conda environment
activate <env_name> (Windows)
source deactivate (OSX/Linux)             deactivate the current environment
deactivate (Windows)
conda env remove -n <env_name>            remove the conda environment called <env_name>

Let's create a new environment for MWL

conda create -n mwl

Activate this environment, and view what's inside

source activate mwl
conda list
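
To double-check from inside Python which environment you are in, you can look at the interpreter's install prefix and at the CONDA_DEFAULT_ENV variable that conda's activate script exports. This is a rough sketch; the function name describe_interpreter is made up for this example. Note that a freshly created, empty environment has no Python of its own yet, so the prefix may point at a different Python install until you install packages into the environment.

```python
import os
import sys


def describe_interpreter(environ=None):
    """Summarize which Python is running and which conda env is active."""
    environ = os.environ if environ is None else environ
    return {
        # Name of the active conda environment, as exported by conda's
        # activate script; None when the variable is not set.
        "conda_env": environ.get("CONDA_DEFAULT_ENV"),
        # Installation prefix of the interpreter running this script.
        "prefix": sys.prefix,
        "python_version": "{}.{}.{}".format(*sys.version_info[:3]),
    }


for key, value in sorted(describe_interpreter().items()):
    print(key, "=", value)
```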

Commands for working with conda packages

Command                       Function
conda env list                List all conda environments
conda list                    List all packages installed in the active environment
conda install package_name    Install a package called package_name
conda remove package_name     Remove a package called package_name
conda update package_name     Update a package called package_name to the latest version
conda update conda            Use conda to update conda / Anaconda

Install scikit-learn and its dependencies

conda install numpy scipy cython scikit-learn

Let's get matplotlib while we're at it so we can visualize data later

conda install matplotlib
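
A quick way to confirm that the installs worked is to ask Python whether it can find each package. The sketch below assumes a Python 3 environment; the helper name missing_modules is made up for this example. Run it with the environment activated.

```python
import importlib.util


def missing_modules(module_names):
    """Return the names from module_names that cannot be found for import."""
    return [name for name in module_names
            if importlib.util.find_spec(name) is None]


# Import names for the packages installed above; note that scikit-learn
# is imported as "sklearn".
wanted = ["numpy", "scipy", "sklearn", "matplotlib"]
missing = missing_modules(wanted)
print("missing:", missing if missing else "nothing")
```

If anything shows up as missing, check that the right environment is active before reinstalling.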

When you're done with this environment

  • Run source deactivate (OSX and Linux) or deactivate (Windows) to go back to the root environment.
  • From here, you can activate the env again by running source activate <env_name> (OSX and Linux) or activate <env_name> (Windows)