Where Does the `sys.path` Start?
Constructing Python's `import` path
Importing modules in Python seems simple enough on the surface: import mymodule looks across the sys.path until it finds your module. But where does the sys.path itself come from?
Sure, there is a $PYTHONPATH variable which "augments the default search path for module files", but what is the default search path, how is it "augmented", how does easy_install or pip fit into this, and where does my package manager install modules?
Prefix Discovery
Before any script is executed, Python discovers its sys.prefix and sys.exec_prefix. This is done by a search_for_prefix function in Modules/getpath.c; essentially it uses $PYTHONHOME if it is set, otherwise it will:
- start at the python executable;
- walk up the parent directories looking for lib/pythonX.Y/os.py, and save this directory to sys.prefix;
- walk up the parent directories looking for lib/pythonX.Y/lib-dynload, and save this directory to sys.exec_prefix.
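The walk described above can be sketched in a few lines of Python. This is only an illustration of the idea, not a port of getpath.c: the helper name search_for_landmark is made up, and the "install tree" is a throwaway temp directory using the literal pythonX.Y placeholder.

```python
import os
import tempfile


def search_for_landmark(start, landmark):
    """Walk up from `start`; return the first ancestor containing `landmark`, else None."""
    current = os.path.abspath(start)
    while True:
        if os.path.exists(os.path.join(current, landmark)):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root without a match
            return None
        current = parent


# Build a fake install tree: <root>/bin/python and <root>/lib/pythonX.Y/...
root = os.path.realpath(tempfile.mkdtemp())
os.makedirs(os.path.join(root, "bin"))
os.makedirs(os.path.join(root, "lib", "pythonX.Y", "lib-dynload"))
open(os.path.join(root, "lib", "pythonX.Y", "os.py"), "w").close()

# Hypothetical executable location; the search starts in its directory.
fake_executable = os.path.join(root, "bin", "python")
start_dir = os.path.dirname(fake_executable)

prefix = search_for_landmark(start_dir, os.path.join("lib", "pythonX.Y", "os.py"))
exec_prefix = search_for_landmark(start_dir, os.path.join("lib", "pythonX.Y", "lib-dynload"))

print("prefix:     ", prefix)
print("exec_prefix:", exec_prefix)
```

Both searches land on the same root here, which is the common case; the two prefixes only differ when platform-specific files are installed in a separate tree.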
(Consider watching Carl Meyer's PyCon 2011 talk, Reverse-engineering Ian Bicking's brain, for a discussion of how this works, and how virtualenv takes it over.)
The sys.path is then initialized with the contents of $PYTHONPATH and the standard library (as contained within the discovered prefixes).
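One way to look at this initial path is to start a fresh interpreter with the -S flag, which skips the automatic import of site, and compare it against a normal start. The helper name below is made up for illustration, and the exact entries will differ from machine to machine.

```python
import json
import subprocess
import sys


def path_of_fresh_interpreter(*flags):
    """Start a new interpreter with the given flags and return its sys.path."""
    code = "import json, sys; print(json.dumps(sys.path))"
    result = subprocess.run(
        [sys.executable, *flags, "-c", code],
        capture_output=True,
        text=True,
        check=True,
    )
    # Use the last line in case something printed during startup.
    return json.loads(result.stdout.strip().splitlines()[-1])


initial_path = path_of_fresh_interpreter("-S")  # site is not imported
full_path = path_of_fresh_interpreter()         # normal startup
added_later = [entry for entry in full_path if entry not in initial_path]

print("prefix:     ", sys.prefix)
print("exec_prefix:", sys.exec_prefix)
print("initial path:", initial_path)
print("added after the initial path:", added_later)
```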
site.py
Next, the interpreter imports site. This module is responsible for finding and setting up so-called "site-packages". It uses an addsitedir function which not only adds the given directory to the path, but also scans it for *.pth files.
Any lines which are found in a *.pth file are appended to sys.path, while those that start with import are executed. This functionality was added largely to satisfy easy_install's requirements (see below).
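A small experiment makes the *.pth behaviour concrete. The sketch below calls site.addsitedir on a temporary directory; the file name demo.pth, the directory names, and the sys.pth_demo_ran attribute are invented for the demonstration. Note that, as far as I can tell from site.py, a path line is only appended if the directory actually exists.

```python
import os
import site
import sys
import tempfile

# A throwaway "site" directory with one real subdirectory in it.
site_dir = os.path.realpath(tempfile.mkdtemp())
extra_dir = os.path.join(site_dir, "vendored")
os.makedirs(extra_dir)

with open(os.path.join(site_dir, "demo.pth"), "w") as fh:
    fh.write("# comment lines are skipped\n")
    fh.write("vendored\n")                               # exists: appended
    fh.write("does-not-exist\n")                         # missing: ignored
    fh.write("import sys; sys.pth_demo_ran = True\n")    # executed

site.addsitedir(site_dir)

print(site_dir in sys.path)                      # the directory itself
print(extra_dir in sys.path)                     # the path line from demo.pth
print(getattr(sys, "pth_demo_ran", False))       # set by the import line
```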
Our path now contains (in order):
- $PYTHONPATH
- sys.prefix-ed stdlib
- sys.exec_prefix-ed stdlib
- site-packages
- *.pth in site-packages
Finally, the site module imports sitecustomize, which you can hook to do whatever you want.
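To see the hook fire, it should be enough to put a sitecustomize.py somewhere on the path of a fresh interpreter. The sketch below does that with a temporary directory on $PYTHONPATH; the sys.sitecustomize_demo attribute is made up, and this assumes nothing else in your environment disables site.

```python
import os
import subprocess
import sys
import tempfile

# Write a tiny sitecustomize.py into a throwaway directory.
hook_dir = tempfile.mkdtemp()
with open(os.path.join(hook_dir, "sitecustomize.py"), "w") as fh:
    fh.write("import sys\n")
    fh.write("sys.sitecustomize_demo = 'hook ran'\n")

# Start a fresh interpreter with that directory on $PYTHONPATH.
env = dict(os.environ, PYTHONPATH=hook_dir)
probe = "import sys; print(getattr(sys, 'sitecustomize_demo', 'hook did not run'))"
result = subprocess.run(
    [sys.executable, "-c", probe],
    env=env,
    capture_output=True,
    text=True,
    check=True,
)
hook_output = result.stdout.strip().splitlines()[-1]
print(hook_output)
```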
homebrew
Homebrew provides its own sitecustomize.py, which it uses to clean the site-packages within the python tree out of the sys.path, and add one within the homebrew prefix.
Let's say homebrew is installed at /brew. Python's prefix is then /brew/opt/python. If you install packages into /brew/opt/python/lib/pythonX.Y/site-packages then they will be destroyed when you perform a minor upgrade. Their sitecustomize.py strips all /brew/opt/ out of sys.path, and replaces it with /brew/lib/pythonX.Y/site-packages.
The big drawback to this is that we lose the sitecustomize hook.
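The path rewrite described above might look roughly like the following. To be clear, this is not Homebrew's actual sitecustomize.py; the function name and its matching rule (site-packages entries under the /brew/opt/ tree) are my own guess at the general shape, operating on a plain list rather than the real sys.path.

```python
def relocate_site_packages(path, versioned_prefix, shared_site_packages):
    """Return a copy of `path` with site-packages entries under
    `versioned_prefix` swapped for `shared_site_packages`."""
    result = []
    for entry in path:
        if entry.startswith(versioned_prefix) and entry.endswith("site-packages"):
            entry = shared_site_packages
        if entry not in result:  # avoid listing the shared directory twice
            result.append(entry)
    return result


example_path = [
    "/brew/opt/python/lib/pythonX.Y",
    "/brew/opt/python/lib/pythonX.Y/lib-dynload",
    "/brew/opt/python/lib/pythonX.Y/site-packages",
]
rewritten = relocate_site_packages(
    example_path,
    "/brew/opt/",
    "/brew/lib/pythonX.Y/site-packages",
)
for entry in rewritten:
    print(entry)
```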
pip
Pip installs packages into site-packages as "flat" (i.e. directly importable) packages (along with *.egg-info directories) such that it does not need to use any *.pth files.
easy_install
easy_install creates an easy_install.pth (which is required because it chooses to install things as eggs). It also (ab)uses the *.pth import magic to capture everything which is appended to sys.path and insert it at the front:
    import sys; sys.__plen = len(sys.path)
    ./Jinja2-2.7.2-py2.7.egg
    ./MarkupSafe-0.23-py2.7-macosx-10.8-x86_64.egg
    ./docutils-0.11-py2.7.egg
    ./Pygments-1.6-py2.7.egg
    import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)
This moves the new packages to sys.__egginsert (or 0) in the path. If there are multiple *.pth files which use this scheme, they look like they will play nicely together.
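To convince myself of that, here is the same bookkeeping replayed on an ordinary list instead of the real sys.path. The function name, the state dict standing in for the sys.__plen and sys.__egginsert attributes, and the paths are all made up for the simulation.

```python
def apply_egg_pth(path, egg_lines, state):
    """Replay one easy_install-style *.pth file against `path` (in place)."""
    plen = len(path)                   # first import line: remember the length
    path.extend(egg_lines)             # site.py appends each path line
    new = path[plen:]                  # second import line: grab what was added,
    del path[plen:]                    # remove it from the end,
    p = state.get("egginsert", 0)      # and re-insert it at the running offset
    path[p:p] = new
    state["egginsert"] = p + len(new)


path = ["/pythonpath", "/stdlib", "/site-packages"]
state = {}

apply_egg_pth(path, ["/site-packages/a.egg", "/site-packages/b.egg"], state)
apply_egg_pth(path, ["/site-packages/c.egg"], state)

for entry in path:
    print(entry)
```

The second file's entries land directly after the first file's, so the eggs stay at the front in the order the *.pth files were processed.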
This places easy_install-ed packages before the $PYTHONPATH, so now it finally looks like:
- easy_install-ed packages, via easy_install.pth in site-packages
- $PYTHONPATH
- sys.prefix-ed stdlib
- sys.exec_prefix-ed stdlib
- site-packages (including pip-installed packages)
- *.pth in site-packages, via pip (or others)
It also creates its own site.py for whatever reason. It seems to do more reordering of the sys.path, but I can't immediately divine what it is.
