Where Does the `sys.path` Start?

Constructing Python's `import` path

Importing modules in Python seems simple enough on the surface: import mymodule looks across the sys.path until it finds your module. But where does the sys.path itself come from?

Sure, there is a $PYTHONPATH variable which "augments the default search path for module files", but what is the default search path, how is it "augmented", how does easy_install or pip fit into this, and where does my package manager install modules?


Prefix Discovery

Before any script is executed, Python discovers its sys.prefix and sys.exec_prefix. This is done by a search_for_prefix function in Modules/getpath.c; essentially it uses $PYTHONHOME if it is set, otherwise it will:

  1. start at the python executable;
  2. walk up the parent directories looking for lib/pythonX.Y/os.py, and save this directory to sys.prefix;
  3. walk up the parent directories looking for lib/pythonX.Y/lib-dynload, and save this directory to sys.exec_prefix.

(Consider watching Carl Meyer's PyCon 2011 talk, Reverse-engineering Ian Bicking's brain, for a discussion of how this works, and how virtualenv takes it over.)
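The walk described above can be sketched in a few lines of Python. This is a simplified illustration, not a port of getpath.c: the find_prefix helper and the fake install tree (with a hard-coded python3.9 directory) are hypothetical, and the real search has extra fallbacks that are ignored here.

```python
import os
import tempfile


def find_prefix(executable, landmark):
    """Walk up from the executable until `landmark` exists below a directory."""
    directory = os.path.dirname(os.path.realpath(executable))
    while True:
        if os.path.exists(os.path.join(directory, landmark)):
            return directory
        parent = os.path.dirname(directory)
        if parent == directory:  # reached the filesystem root
            return None
        directory = parent


# Build a throwaway tree that mimics <root>/bin/python and <root>/lib/pythonX.Y/.
root = os.path.realpath(tempfile.mkdtemp())
libdir = os.path.join(root, "lib", "python3.9")
os.makedirs(os.path.join(root, "bin"))
os.makedirs(os.path.join(libdir, "lib-dynload"))
open(os.path.join(libdir, "os.py"), "w").close()
exe = os.path.join(root, "bin", "python")
open(exe, "w").close()

# The two landmarks mirror steps 2 and 3 of the list above.
prefix = find_prefix(exe, os.path.join("lib", "python3.9", "os.py"))
exec_prefix = find_prefix(exe, os.path.join("lib", "python3.9", "lib-dynload"))
missing = find_prefix(exe, os.path.join("no-such-dir", "landmark"))

print(prefix == root, exec_prefix == root, missing)
```

On a real interpreter the results of the actual search are available as sys.prefix and sys.exec_prefix.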

The sys.path is then initialized with the contents of $PYTHONPATH and the standard library (as contained within the discovered prefixes).
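One way to look at this early path, before site.py has touched it, is to start a child interpreter with the -S flag, which skips the import of site. The sketch below assumes a standard CPython available as sys.executable; the temporary directory is only there to make the $PYTHONPATH entry easy to spot.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical directory used only as a recognisable $PYTHONPATH entry.
extra = os.path.realpath(tempfile.mkdtemp())
env = dict(os.environ, PYTHONPATH=extra)

# -S skips "import site"; the child prints its sys.path, one entry per line.
output = subprocess.check_output(
    [sys.executable, "-S", "-c", "import sys; print('\\n'.join(sys.path))"],
    env=env,
    universal_newlines=True,
)
early_path = output.splitlines()

for entry in early_path:
    print(repr(entry))
print("PYTHONPATH entry present:", extra in early_path)
```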

site.py

Next, the interpreter imports site. This module is responsible for finding and setting up so-called "site-packages". It uses an addsitedir function which not only adds the given directory to the path, but also scans it for *.pth files.

Each line in a *.pth file is treated as a path (relative to the directory containing the file) and appended to sys.path if it exists, while lines that start with import are executed instead. This functionality was added largely to satisfy easy_install's requirements (see below).
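A small experiment with site.addsitedir shows both behaviours. The directory, the demo.pth file and the PTH_DEMO environment variable are all made up for this example; note that it modifies sys.path of the running interpreter.

```python
import os
import site
import sys
import tempfile

# Hypothetical throwaway directory standing in for a site-packages directory.
site_dir = tempfile.mkdtemp()
extra_dir = os.path.join(site_dir, "extra")
os.mkdir(extra_dir)

with open(os.path.join(site_dir, "demo.pth"), "w") as fh:
    fh.write("extra\n")            # path line, relative to site_dir
    fh.write("does-not-exist\n")   # skipped, since no such directory exists
    fh.write("import os; os.environ['PTH_DEMO'] = 'ran'\n")  # executed

# Adds site_dir itself, then processes every *.pth file found inside it.
site.addsitedir(site_dir)

missing_dir = os.path.join(site_dir, "does-not-exist")
print("site_dir on path:   ", site_dir in sys.path)
print("extra_dir on path:  ", extra_dir in sys.path)
print("missing_dir on path:", missing_dir in sys.path)
print("import line result: ", os.environ.get("PTH_DEMO"))
```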

Our path now contains (in order):

  1. $PYTHONPATH
  2. sys.prefix-ed stdlib
  3. sys.exec_prefix-ed stdlib
  4. site-packages
  5. *.pth in site-packages

Finally, the site module imports sitecustomize, which you can hook to do whatever you want.
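As a rough illustration of the hook, the sketch below writes a tiny sitecustomize.py into a temporary directory, puts that directory on $PYTHONPATH, and checks from a child interpreter that the module was imported at startup. The sys.demo_hook_ran attribute is an invented marker, and the example assumes a standard CPython available as sys.executable.

```python
import os
import subprocess
import sys
import tempfile

# Hypothetical directory holding our own sitecustomize.py.
hook_dir = tempfile.mkdtemp()
with open(os.path.join(hook_dir, "sitecustomize.py"), "w") as fh:
    fh.write("import sys\n")
    fh.write("sys.demo_hook_ran = True  # invented marker attribute\n")

# $PYTHONPATH entries are already on sys.path when site imports sitecustomize.
env = dict(os.environ, PYTHONPATH=hook_dir)
result = subprocess.check_output(
    [sys.executable, "-c",
     "import sys; print(getattr(sys, 'demo_hook_ran', False))"],
    env=env,
    universal_newlines=True,
).strip()

print("hook ran in child interpreter:", result)
```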

homebrew

Homebrew provides its own sitecustomize.py, which it uses to clean the site-packages within the python tree out of the sys.path, and add one within the homebrew prefix.

Let's say homebrew is installed at /brew. Python's prefix is then /brew/opt/python. If you install packages into /brew/opt/python/lib/pythonX.Y/site-packages then they will be destroyed when you perform a minor upgrade. Homebrew's sitecustomize.py therefore strips the site-packages entries under /brew/opt/ out of sys.path, and replaces them with /brew/lib/pythonX.Y/site-packages.

The big drawback to this is that we lose the sitecustomize hook.
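The path rewriting described above might look roughly like the following. This is not Homebrew's actual sitecustomize.py: the relocate_site_packages function is hypothetical, it works on a plain list instead of sys.path, and it assumes that only the site-packages entries under /brew/opt/ are removed while the standard library entries stay.

```python
import posixpath


def relocate_site_packages(path, brew_prefix, version):
    """Return a copy of `path` with site-packages moved out of <brew>/opt/."""
    opt_dir = posixpath.join(brew_prefix, "opt") + "/"
    shared = posixpath.join(
        brew_prefix, "lib", "python" + version, "site-packages"
    )
    # Drop site-packages entries that live inside the versioned python tree.
    kept = [
        entry for entry in path
        if not (entry.startswith(opt_dir) and "site-packages" in entry)
    ]
    # Add the shared directory that survives minor upgrades.
    if shared not in kept:
        kept.append(shared)
    return kept


example = [
    "/brew/opt/python/lib/python2.7",
    "/brew/opt/python/lib/python2.7/lib-dynload",
    "/brew/opt/python/lib/python2.7/site-packages",
]
result = relocate_site_packages(example, "/brew", "2.7")
for entry in result:
    print(entry)
```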

pip

Pip installs packages into site-packages as "flat" (i.e. directly importable) packages (along with *.egg-info directories) such that it does not need to use any *.pth files.

easy_install

easy_install creates an easy_install.pth (which is required because it chooses to install things as eggs). It also (ab)uses the *.pth import magic to capture everything which is appended to sys.path and insert it at the front:

import sys; sys.__plen = len(sys.path)
./Jinja2-2.7.2-py2.7.egg
./MarkupSafe-0.23-py2.7-macosx-10.8-x86_64.egg
./docutils-0.11-py2.7.egg
./Pygments-1.6-py2.7.egg
import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)

This moves the new packages to position sys.__egginsert (or 0) in the path. If multiple *.pth files use this scheme, they appear to play nicely together, since each one inserts its entries directly after those of the previous file.
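To make the two import lines easier to follow, here is the same bookkeeping replayed on an ordinary list. The apply_easy_install_pth helper, the state dictionary (standing in for the attributes stored on sys) and the shortened paths are all invented for the illustration.

```python
def apply_easy_install_pth(path, eggs, state):
    """Replay what an easy_install-style *.pth file does to a path list."""
    plen = len(path)             # first import line: remember the length
    path.extend(eggs)            # site.py appends each path line
    new = path[plen:]            # second import line: grab the new entries,
    del path[plen:]              # remove them from the end,
    p = state.get("egginsert", 0)
    path[p:p] = new              # and splice them in at the insert point
    state["egginsert"] = p + len(new)


path = ["/pythonpath-entry", "/stdlib", "/site-packages"]
state = {}

# Two hypothetical *.pth files processed one after the other.
apply_easy_install_pth(
    path, ["/site-packages/Jinja2.egg", "/site-packages/MarkupSafe.egg"], state
)
apply_easy_install_pth(path, ["/site-packages/other.egg"], state)

for entry in path:
    print(entry)
```

The eggs from both files end up at the front, in the order in which they were processed, ahead of the original entries.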

This places easy_install-ed packages before the $PYTHONPATH, so the final path looks like:

  1. easy_install-ed packages, via easy_install.pth in site-packages
  2. $PYTHONPATH
  3. sys.prefix-ed stdlib
  4. sys.exec_prefix-ed stdlib
  5. site-packages (including pip-installed packages)
  6. *.pth in site-packages, via pip (or others)

easy_install also creates its own site.py for whatever reason. It seems to do more reordering of the sys.path, but I can't immediately divine what it is.
