Nobody Expects the Python Packaging Authority

Nick Coghlan - BDFL Delegate for packaging related PEPs
https://bitbucket.org/ncoghlan/misc/src/default/talks/

15 years of Python packaging:

  • 1998: distutils
  • 2004: easy_install, setuptools
  • 2007: virtualenv
  • 2008: pip

Where do we want to be?

  • Need to interoperate with other packaging systems.
  • Want to make it easy for people to get started.
  • Tools need to be reasonably fast, reliable, and secure.

Interoperation

setuptools vs. distribute

setuptools development wasn't very open, so distribute was forked from it to make development more open. The two projects merged back together in 2013. Use setuptools 0.8+.

pip vs. easy_install

pip has better defaults than easy_install. The latest version of pip can do binary installs (wheels). If you can build your binaries as wheels, pip 1.4+ can do everything you need.
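
The wheel workflow mentioned above can be sketched as two pip invocations: build wheels once, then install from the local wheel directory. This is a minimal sketch, not something shown in the talk; the helper name `wheel_workflow`, the `example-package` requirement, and the `wheelhouse` directory are hypothetical. The flags are those of current pip releases; pip 1.4 itself also needed the `--use-wheel` flag on install, which later releases made the default.

```python
import sys


def wheel_workflow(requirement, wheel_dir):
    """Return two pip command lines: build wheels, then install from them."""
    pip = [sys.executable, "-m", "pip"]
    # Step 1: build the requirement and its dependencies into wheel files.
    build = pip + ["wheel", "--wheel-dir", wheel_dir, requirement]
    # Step 2: install from the local wheel directory only, never the index.
    install = pip + ["install", "--no-index", "--find-links", wheel_dir, requirement]
    return build, install


# Print the commands rather than running them (hypothetical package name).
for command in wheel_workflow("example-package", "wheelhouse"):
    print(" ".join(command))
```

Passing each list to `subprocess.check_call` would actually run the step; the build step only has to happen once per machine.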

The Python Packaging Authority

Where beginners can go for instruction.

The Science Exception

Scientific software tends to have tougher requirements than pip can handle, such as complex build stacks - better to have separate tools.

hashdist, Anaconda - treated like other packaging systems

Make it easy to get started

Include pip in the standard library. The goal is for pip to come with every Python 3.4 install, and it should be able to update itself.
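
In Python 3.4 this plan shipped as the standard library module `ensurepip`. The following is a minimal sketch of how a script might check for pip and bootstrap it only when it is missing; the helper names `pip_is_available` and `ensure_pip` are hypothetical, and some Linux distributions package `ensurepip` separately, so the import may fail there.

```python
import importlib.util


def pip_is_available():
    """Report whether the running interpreter can import pip."""
    return importlib.util.find_spec("pip") is not None


def ensure_pip():
    """Bootstrap the bundled pip only when it is missing.

    Returns True if a bootstrap was performed, False if pip was already there.
    """
    if pip_is_available():
        return False
    import ensurepip  # standard library since Python 3.4
    ensurepip.bootstrap()
    return True
```

Once pip is present it can update itself with `python -m pip install --upgrade pip`.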

Fast, reliable, reasonably secure

What prevents fast distribution?

  • The mirroring system is not great yet. To discover all the dependencies of a package, you have to actually download it and scan it, because PyPI doesn't know them. There is now a new CDN donated by Fastly, so things should speed up.
  • Scanning of external links was slow; packages on PyPI can no longer depend on external sources.
  • Enabling binary distribution (wheels) with pip has sped things up. Now build results can be cached locally; new virtualenvs can be built much quicker since the binaries are already built.
  • New metadata standards (PEP 426, PEP 440) in PyPI; RESTful JSON will be available so tools can make decisions without downloading entire archives.
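
To illustrate the last point, here is a minimal sketch of reading dependency information from a JSON metadata document instead of downloading an archive. It assumes the present-day PyPI endpoint `https://pypi.org/pypi/<name>/json` and its `info.requires_dist` field, which postdate these notes; the function names and the sample document are hypothetical.

```python
import json
from urllib.request import urlopen


def fetch_metadata(name, index="https://pypi.org/pypi"):
    """Download the JSON metadata document for a project (needs network)."""
    with urlopen("{}/{}/json".format(index, name)) as response:
        return json.loads(response.read().decode("utf-8"))


def declared_dependencies(metadata):
    """Return the requirement strings listed in a metadata document."""
    # "requires_dist" is null when a project declares no dependencies.
    return list(metadata.get("info", {}).get("requires_dist") or [])


# Hypothetical sample document, so the sketch runs without network access.
sample = {"info": {"name": "example-package", "requires_dist": ["requests>=2.0"]}}
print(declared_dependencies(sample))
```

A resolver built this way only needs a few kilobytes of JSON per candidate release rather than the whole source archive.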

What prevents reliable distribution?

  • Hosting moved to Oregon State University servers
  • The new CDN and the removal of external hosting improve both speed and reliability

What about the PyPI mirrors?

  • mirror discovery was fundamentally insecure
  • mirrors aren't going away
  • PyPI is a free service with no SLA, so user beware
  • may be worthwhile to have a local mirror if it's business critical

What prevents (reasonably) secure distribution?

  • tools were using HTTP with no package signing
  • PyPI security improvements
    • new SSL cert
    • HTTPS by default
    • docs to new domain
    • python.org to HTTPS
  • PyPI mirrors can't really be trusted. Trust a local mirror first, then the Fastly CDN, then other mirrors
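
The trust ordering above could be expressed in a tool roughly as follows. This is only a sketch under stated assumptions: the tier names, the `rank_sources` helper, and every URL are hypothetical, and rejecting plain-HTTP sources outright is one possible policy, not something the notes prescribe.

```python
from urllib.parse import urlparse

# Hypothetical trust tiers from the notes; a lower number is more trusted.
TRUST_ORDER = {"local": 0, "cdn": 1, "mirror": 2}


def rank_sources(sources):
    """Drop non-HTTPS sources and sort the rest from most to least trusted.

    `sources` is a list of (kind, url) pairs, where kind is a TRUST_ORDER key.
    """
    secure = [(kind, url) for kind, url in sources
              if urlparse(url).scheme == "https"]
    ranked = sorted(secure, key=lambda item: TRUST_ORDER[item[0]])
    return [url for _kind, url in ranked]


# Hypothetical index URLs, for illustration only.
candidates = [
    ("mirror", "https://mirror.example.org/simple/"),
    ("cdn", "https://cdn.example.org/simple/"),
    ("local", "https://pypi.internal.example/simple/"),
    ("mirror", "http://insecure.example.org/simple/"),
]
for url in rank_sources(candidates):
    print(url)
```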