Where Does the `sys.path` Start?
Constructing Python's `import` path
Importing modules in Python seems simple enough on the surface: `import mymodule` looks across the `sys.path` until it finds your module. But where does the `sys.path` itself come from?
Sure, there is a `$PYTHONPATH` variable which "augments the default search path for module files", but what is the default search path, how is it "augmented", how does `easy_install` or `pip` fit into this, and where does my package manager install modules?
Prefix Discovery
Before any script is executed, Python discovers its `sys.prefix` and `sys.exec_prefix`. This is done by a `search_for_prefix` function in `Modules/getpath.c`; essentially it uses `$PYTHONHOME` if it is set, otherwise it will:
- start at the python executable;
- walk up the parent directories looking for `lib/pythonX.Y/os.py`, and save this directory to `sys.prefix`;
- walk up the parent directories looking for `lib/pythonX.Y/lib-dynload`, and save this directory to `sys.exec_prefix`.
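The walk described above can be approximated in a few lines of Python. This is only a sketch of the idea, not a port of `getpath.c`: the `find_prefix` helper is a name I made up, it assumes the POSIX `lib/pythonX.Y` layout, and it ignores `$PYTHONHOME`, virtualenvs and every other special case the C code handles.

```python
import os
import sys


def find_prefix(start_dir, landmark):
    """Walk up from start_dir until a directory containing landmark is found."""
    current = os.path.abspath(start_dir)
    while True:
        if os.path.exists(os.path.join(current, landmark)):
            return current
        parent = os.path.dirname(current)
        if parent == current:  # reached the filesystem root without a match
            return None
        current = parent


# "pythonX.Y" for the running interpreter, e.g. "python2.7".
version_dir = "python%d.%d" % sys.version_info[:2]
exe_dir = os.path.dirname(os.path.realpath(sys.executable))

# Compare the guesses against what the interpreter actually decided.
print("prefix guess:     ", find_prefix(exe_dir, os.path.join("lib", version_dir, "os.py")))
print("exec_prefix guess:", find_prefix(exe_dir, os.path.join("lib", version_dir, "lib-dynload")))
print("sys.prefix:       ", sys.prefix)
print("sys.exec_prefix:  ", sys.exec_prefix)
```

The guesses may legitimately differ from the real values (inside a virtualenv, for example), which is a good reminder that the real logic has more branches than this.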
(Consider watching Carl Meyer's PyCon 2011 talk, Reverse-engineering Ian Bicking's brain, for a discussion of how this works, and how virtualenv takes it over.)
The `sys.path` is then initialized with the contents of `$PYTHONPATH` and the standard library (as contained within the discovered prefixes).
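One way to see this initial path is to start a child interpreter with `-S`, which skips the `site` import. A rough sketch follows; `path_without_site` is a helper name of my own, and it assumes `sys.executable` points at a usable interpreter.

```python
import json
import subprocess
import sys


def path_without_site():
    """Return sys.path as seen by a child interpreter started with -S."""
    out = subprocess.check_output(
        [sys.executable, "-S", "-c", "import json, sys; print(json.dumps(sys.path))"]
    )
    return json.loads(out.decode())


bare = path_without_site()
# Whatever is in our own sys.path but not in the bare one was added later.
added_later = [p for p in sys.path if p not in bare]
print("without site:", bare)
print("added later: ", added_later)
```

The difference between the two lists is roughly what the rest of this post is about, although it will also pick up anything the current process added to `sys.path` by hand.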
site.py
Next, the interpreter imports `site`. This module is responsible for finding and setting up so-called "site-packages". It uses an `addsitedir` function which not only adds the given directory to the path, but also scans it for `*.pth` files.
Each line found in a `*.pth` file is treated as a path and appended to `sys.path`, while lines that start with `import` are executed. This functionality was added largely to satisfy `easy_install`'s requirements (see below).
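Both behaviours are easy to try with `site.addsitedir` on a scratch directory. In the sketch below the directory, the `demo.pth` file name and the `sys.pth_demo_ran` attribute are all made up for the demonstration; note that `site` only appends path lines that point at something which exists, hence the `extra` directory.

```python
import os
import site
import sys
import tempfile

# A throwaway "site-packages" directory with one real subdirectory in it.
site_dir = tempfile.mkdtemp()
extra_dir = os.path.join(site_dir, "extra")
os.mkdir(extra_dir)

with open(os.path.join(site_dir, "demo.pth"), "w") as f:
    f.write("extra\n")                                # path line, relative to site_dir
    f.write("import sys; sys.pth_demo_ran = True\n")  # import line, gets executed

site.addsitedir(site_dir)

print(site_dir in sys.path)                 # the directory itself was added
print(extra_dir in sys.path)                # and so was the path from demo.pth
print(getattr(sys, "pth_demo_ran", False))  # set by the import line
```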
Our path now contains (in order):
- `$PYTHONPATH`
- `sys.prefix`-ed stdlib
- `sys.exec_prefix`-ed stdlib
- site-packages
- `*.pth` in site-packages
Finally, the `site` module imports `sitecustomize`, which you can hook to do whatever you want.
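To watch the hook fire without touching a real installation, you can drop a `sitecustomize.py` into a scratch directory and put that directory on `$PYTHONPATH` for a child interpreter. This is a sketch only: the `sys.demo_hook_ran` attribute is invented for the demonstration, and it assumes nothing else in the environment disables the `site` import.

```python
import os
import subprocess
import sys
import tempfile

# A scratch directory holding a sitecustomize.py that leaves a marker behind.
hook_dir = tempfile.mkdtemp()
with open(os.path.join(hook_dir, "sitecustomize.py"), "w") as f:
    f.write("import sys\nsys.demo_hook_ran = True\n")

# Put the scratch directory on the child's $PYTHONPATH so site can find the hook.
env = dict(os.environ, PYTHONPATH=hook_dir)
output = subprocess.check_output(
    [sys.executable, "-c", "import sys; print(getattr(sys, 'demo_hook_ran', False))"],
    env=env,
)
print(output.decode().strip())
```

Because `$PYTHONPATH` sits at the front of the path, this copy shadows any `sitecustomize` that the distribution ships, which is exactly the kind of collision the next section runs into.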
homebrew
Homebrew provides its own `sitecustomize.py`, which it uses to clean the site-packages within the python tree out of the `sys.path`, and add one within the homebrew prefix.
Let's say homebrew is installed at `/brew`. Python's prefix is then `/brew/opt/python`. If you install packages into `/brew/opt/python/lib/pythonX.Y/site-packages` then they will be destroyed when you perform a minor upgrade. Their `sitecustomize.py` strips all `/brew/opt/` out of `sys.path`, and replaces it with `/brew/lib/pythonX.Y/site-packages`.
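The effect is roughly the following list rewrite. To be clear, this is not Homebrew's code: `relocate_site_packages` is a hypothetical helper, and I am assuming that only the site-packages entries under `/brew/opt/` are swapped, so that the standard library entries survive.

```python
def relocate_site_packages(path, opt_prefix, replacement):
    """Swap site-packages entries under opt_prefix for replacement, keeping order."""
    result = []
    for entry in path:
        if entry.startswith(opt_prefix) and "site-packages" in entry:
            entry = replacement
        if entry not in result:  # avoid listing the same directory twice
            result.append(entry)
    return result


before = [
    "/brew/opt/python/lib/pythonX.Y",
    "/brew/opt/python/lib/pythonX.Y/site-packages",
]
after = relocate_site_packages(before, "/brew/opt/", "/brew/lib/pythonX.Y/site-packages")
print(after)
```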
The big drawback to this is that we lose the sitecustomize hook.
pip
Pip installs packages into site-packages as "flat" (i.e. directly importable) packages (along with `*.egg-info` directories) such that it does not need to use any `*.pth` files.
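If you are curious which `*.pth` files are actually lying around on your system, something like this will list them. The `pth_files` helper is my own naming, and it leans on `site.getsitepackages()`, which I am assuming is available (some older virtualenv-provided `site.py` files lack it, hence the fallback).

```python
import glob
import os
import site


def pth_files():
    """Map each known site-packages directory to the *.pth files it contains."""
    # Fall back to an empty list where site.getsitepackages() is missing.
    getter = getattr(site, "getsitepackages", lambda: [])
    return {d: sorted(glob.glob(os.path.join(d, "*.pth"))) for d in getter()}


for directory, files in pth_files().items():
    print(directory)
    for name in files:
        print("   ", os.path.basename(name))
```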
easy_install
easy_install creates an `easy_install.pth` (which is required because it chooses to install things as eggs). It also (ab)uses the `*.pth` import magic to capture everything which is appended to `sys.path` and insert it at the front:
    import sys; sys.__plen = len(sys.path)
    ./Jinja2-2.7.2-py2.7.egg
    ./MarkupSafe-0.23-py2.7-macosx-10.8-x86_64.egg
    ./docutils-0.11-py2.7.egg
    ./Pygments-1.6-py2.7.egg
    import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)
This moves the new packages to `sys.__egginsert` (or 0) in the path. If there are multiple `*.pth` files which use this scheme, they seem like they will play nicely.
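To convince myself of that, here is the same bookkeeping replayed on a plain list. It is a simulation, not what `site` runs: `apply_easy_install_pth` is a made-up helper, a dict stands in for the `sys.__plen` and `sys.__egginsert` attributes, and the strings in `path` are placeholders rather than real directories.

```python
def apply_easy_install_pth(path, eggs, state):
    """Replay the steps of an easy_install-style .pth file on a plain list."""
    plen = len(path)               # first import line: remember the current length
    path.extend(eggs)              # each egg line is appended to the path
    new = path[plen:]              # second import line: grab what was appended,
    del path[plen:]                # remove it from the end,
    p = state.get("egginsert", 0)  # find the insertion point (0 the first time),
    path[p:p] = new                # splice it in there,
    state["egginsert"] = p + len(new)  # and advance the insertion point
    return path


path = ["$PYTHONPATH", "stdlib", "site-packages"]
state = {}
apply_easy_install_pth(path, ["./Jinja2-2.7.2-py2.7.egg", "./Pygments-1.6-py2.7.egg"], state)
apply_easy_install_pth(path, ["./docutils-0.11-py2.7.egg"], state)
print(path)
```

The second file's egg lands directly after the first file's eggs and still ahead of everything else, so in this model at least, multiple files do cooperate.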
This places `easy_install`-ed packages before the `$PYTHONPATH`, so now it finally looks like:
- `easy_install`-ed packages, via `easy_install.pth` in site-packages
- `$PYTHONPATH`
- `sys.prefix`-ed stdlib
- `sys.exec_prefix`-ed stdlib
- site-packages (including pip-installed packages)
- `*.pth` in site-packages, via `pip` (or others)
easy_install also creates its own `site.py` for whatever reason. It seems to do more reordering of the `sys.path`, but I can't immediately divine what it is.