David Mertz, Ph.D.
Fat fingered juggler, Gnosis Software, Inc.
May, 2006
An interesting side project of the Python Enterprise Application Kit
(PEAK) is the setuptools framework. Using setuptools replaces the
standard library distutils and adds versioned package and dependency
management to Python. Perl users will be familiar with CPAN, and Ruby
users with Gems: the tool ez_setup that bootstraps setuptools, and the
expanded easy_install that comes with it, act in conjunction with
"Cheeseshop" (the Python Package Index, also called "PyPI") to achieve
the same thing. Moreover, setuptools lets you package your libraries
in a single-file archive called an "egg", which is a lot like a Java
jar, just for Python.
The setuptools module does a really good job of "getting out of the
way". For example, if you download a package that was built using
setuptools rather than distutils, installation should work just as
you are used to: the usual dance of python setup.py install. In
order to accomplish this, a package bundled using setuptools includes
the small bootstrap module ez_setup.py in the archive. The only
caveat here is that ez_setup.py tries to download and install the
necessary setuptools package in the background, which depends, of
course, on having a networked machine. If setuptools is already
installed on the local machine, this background step is not necessary;
but if it needs to be installed manually, that loses much of the
transparency. Still, most systems nowadays have an internet
connection, and taking a few special steps for non-networked machines
is not unduly burdensome.
The real benefit of setuptools is not in doing roughly what distutils
does, even though it does enhance the capabilities of distutils and
simplify what goes into a setup.py script. The greatest gain is
setuptools' enhancement of package management capabilities. In a
rather transparent way, you can find, download and install
dependencies; you can switch between multiple versions of a package,
all of which are installed on the same system; you can declare
requirements for specific versions of packages; and you can update to
the latest versions of packages you use with a simple command. The
most impressive part of all this is perhaps the fact that you can even
utilize packages whose developers have done nothing whatsoever to
consider setuptools compatibility. Let's take a look.
The utility ez_setup.py is a simple script that bootstraps the rest
of setuptools. Slightly confusingly, the easy_install script that
comes with the full setuptools package does the same thing as
ez_setup.py. The easy_install script assumes setuptools is already
installed, however, so it skips the behind-the-scenes installation.
Both versions accept the same arguments and switches.
The first step in the process is simply downloading the small script
ez_setup.py, e.g.:
% wget -q http://peak.telecommunity.com/dist/ez_setup.py
From there, you may run the script without any arguments to install
the rest of setuptools (if you do not do this as a separate step, it
will still get done the first time you install some other package).
You should see something similar to:
% python ez_setup.py
Downloading http://cheeseshop.python.org/packages/2.4/s/setuptools/setuptools-0.6b1-py2.4.egg#md5=b79a8a403e4502fbb85ee3f1941735cb
Processing setuptools-0.6b1-py2.4.egg
creating /sw/lib/python2.4/site-packages/setuptools-0.6b1-py2.4.egg
Extracting setuptools-0.6b1-py2.4.egg to /sw/lib/python2.4/site-packages
Removing setuptools 0.6a11 from easy-install.pth file
Adding setuptools 0.6b1 to easy-install.pth file
Installing easy_install script to /sw/bin
Installing easy_install-2.4 script to /sw/bin
Installed /sw/lib/python2.4/site-packages/setuptools-0.6b1-py2.4.egg
Processing dependencies for setuptools
All done. That's all you need to do to make sure setuptools is
installed on your system.
For many Python packages, all you need to do to install them is pass
their name as a parameter to ez_setup.py or easy_install. Now that we
have bootstrapped setuptools, we might as well use the internally
simpler easy_install (though it makes little difference which you
choose to use in practice).
For example, let us say you want to install the package SQLObject. This can be as simple as the following. Notice in the messages below that SQLObject turned out to depend on a package FormEncode; luckily, it is all taken care of for us:
% easy_install SQLObject
Searching for SQLObject
Reading http://www.python.org/pypi/SQLObject/
Reading http://sqlobject.org
Best match: SQLObject 0.7.0
Downloading http://cheeseshop.python.org/packages/2.4/S/SQLObject/SQLObject-0.7.0-py2.4.egg#md5=71830b26083afc6ea7c53b99478e1b6a
Processing SQLObject-0.7.0-py2.4.egg
creating /sw/lib/python2.4/site-packages/SQLObject-0.7.0-py2.4.egg
Extracting SQLObject-0.7.0-py2.4.egg to /sw/lib/python2.4/site-packages
Adding SQLObject 0.7.0 to easy-install.pth file
Installing sqlobject-admin script to /sw/bin
Installed /sw/lib/python2.4/site-packages/SQLObject-0.7.0-py2.4.egg
Processing dependencies for SQLObject
Searching for FormEncode>=0.2.2
Reading http://www.python.org/pypi/FormEncode/
Reading http://formencode.org
Best match: FormEncode 0.5.1
Downloading http://cheeseshop.python.org/packages/2.4/F/FormEncode/FormEncode-0.5.1-py2.4.egg#md5=f8a19cbe95d0ed1b9d1759b033b7760d
Processing FormEncode-0.5.1-py2.4.egg
creating /sw/lib/python2.4/site-packages/FormEncode-0.5.1-py2.4.egg
Extracting FormEncode-0.5.1-py2.4.egg to /sw/lib/python2.4/site-packages
Adding FormEncode 0.5.1 to easy-install.pth file
Installed /sw/lib/python2.4/site-packages/FormEncode-0.5.1-py2.4.egg
As you can see from the messages, easy_install looks for metadata
information about the package at the www.python.org/pypi site, then
finds the location for the actual download (in this case the egg
archive lives right at cheeseshop.python.org; more on eggs soon).
You can do more than just install the latest version of a package, as
is the default. If you like, you can give easy_install specific
version requirements. Let us try to install a post-beta version of
SQLObject:
% easy_install 'SQLObject>=1.0'
Searching for SQLObject>=1.0
Reading http://www.python.org/pypi/SQLObject/
Reading http://sqlobject.org
No local packages or download links found for SQLObject>=1.0
error: Could not find suitable distribution for Requirement.parse('SQLObject>=1.0')
At least at the time this article was written, SQLObject existed only up to version 0.7.0, so there is nothing to install.
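The matching that fails above can be illustrated with a minimal sketch in modern Python 3. This is not how setuptools implements it; the function names parse_version, parse_requirement and best_match are invented for illustration, and only plain dotted numeric versions (no "0.6b1"-style tags) and a single comparison are handled.

```python
import re


def parse_version(text):
    """Turn '0.7.0' into (0, 7, 0) so that versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Simplified subset of the comparisons a requirement string may use.
OPERATORS = {
    ">=": lambda have, want: have >= want,
    "<=": lambda have, want: have <= want,
    "==": lambda have, want: have == want,
    ">": lambda have, want: have > want,
    "<": lambda have, want: have < want,
}

REQUIREMENT = re.compile(
    r"\s*([A-Za-z0-9_.-]+?)\s*(>=|<=|==|>|<)\s*([0-9]+(?:\.[0-9]+)*)\s*")


def parse_requirement(spec):
    """Split 'SQLObject>=1.0' into ('SQLObject', '>=', (1, 0))."""
    match = REQUIREMENT.fullmatch(spec)
    if match is None:
        raise ValueError("unsupported requirement: %r" % spec)
    name, operator, version = match.groups()
    return name, operator, parse_version(version)


def best_match(spec, available):
    """Return the highest available version satisfying spec, or None."""
    _, operator, wanted = parse_requirement(spec)
    candidates = [version for version in available
                  if OPERATORS[operator](parse_version(version), wanted)]
    return max(candidates, key=parse_version) if candidates else None


# Hypothetical list of published versions, mirroring the session above.
print(best_match("SQLObject>=1.0", ["0.6.1", "0.7.0"]))
print(best_match("SQLObject>=0.6", ["0.6.1", "0.7.0"]))
```

With only 0.6.1 and 0.7.0 on offer, the first call finds nothing, which corresponds to the "Could not find suitable distribution" error; the second selects the highest version that satisfies the comparison.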
The package SQLObject is already "setuptools aware"; but what if we
want to install a package whose author has not given thought to
setuptools? For example, before this article, I never used setuptools
with my "Gnosis Utilities". Still, let us try installing the package,
knowing only the HTTP (or FTP, SVN, CVS) location where it lives
(setuptools knows all these protocols). My download website has
archives of the various Gnosis Utilities versions, named in a usual
versioning fashion:
% easy_install -f http://gnosis.cx/download/Gnosis_Utils.More/ Gnosis_Utils
Searching for Gnosis-Utils
Reading http://gnosis.cx/download/Gnosis_Utils.More/
Best match: Gnosis-Utils 1.2.1
Downloading http://gnosis.cx/download/Gnosis_Utils.More/Gnosis_Utils-1.2.1.zip
Processing Gnosis_Utils-1.2.1.zip
Running Gnosis_Utils-1.2.1/setup.py -q bdist_egg --dist-dir /tmp/easy_install-CCrXEs/Gnosis_Utils-1.2.1/egg-dist-tmp-Sh4DW1
zip_safe flag not set; analyzing archive contents...
gnosis.__init__: module references __file__
gnosis.magic.__init__: module references __file__
gnosis.xml.objectify.doc.__init__: module references __file__
gnosis.xml.pickle.doc.__init__: module references __file__
gnosis.xml.pickle.test.test_zdump: module references __file__
Adding Gnosis-Utils 1.2.1 to easy-install.pth file
Installed /sw/lib/python2.4/site-packages/Gnosis_Utils-1.2.1-py2.4.egg
Processing dependencies for Gnosis-Utils
Happily, easy_install figured everything out for us. It looked in
the given download directory, identified the highest available version
number, unpacked the archive, and repackaged it as an "egg" that was
installed. Importing gnosis now works fine in a script. But suppose
I now need to test a script against a specific earlier version of
Gnosis Utilities? Easy enough:
% easy_install -f http://gnosis.cx/download/Gnosis_Utils.More/ "Gnosis_Utils==1.2.0"
Searching for Gnosis-Utils==1.2.0
Reading http://gnosis.cx/download/Gnosis_Utils.More/
Best match: Gnosis-Utils 1.2.0
Downloading http://gnosis.cx/download/Gnosis_Utils.More/Gnosis_Utils-1.2.0.zip
[...]
Removing Gnosis-Utils 1.2.1 from easy-install.pth file
Adding Gnosis-Utils 1.2.0 to easy-install.pth file
Installed /sw/lib/python2.4/site-packages/Gnosis_Utils-1.2.0-py2.4.egg
Processing dependencies for Gnosis-Utils==1.2.0
There are actually two versions of Gnosis Utilities installed now, with 1.2.0 the active version. Switching the active version back to 1.2.1 is also easy:
% easy_install "Gnosis_Utils==1.2.1"
Searching for Gnosis-Utils==1.2.1
Best match: Gnosis-Utils 1.2.1
Processing Gnosis_Utils-1.2.1-py2.4.egg
Removing Gnosis-Utils 1.2.0 from easy-install.pth file
Adding Gnosis-Utils 1.2.1 to easy-install.pth file
Using /sw/lib/python2.4/site-packages/Gnosis_Utils-1.2.1-py2.4.egg
Processing dependencies for Gnosis-Utils==1.2.1
Of course, this only makes one version active at a time. But an individual script can choose the version it wants to use by putting two lines at its top, e.g.:
from pkg_resources import require
require("Gnosis_Utils==1.2.0")
With this stated requirement, setuptools will add the specific
version (or the latest available, if a greater-than comparison is
specified) to sys.path, so that subsequent import statements find it.
We would like to let users install Gnosis Utilities without even
knowing its download directory. This almost works simply because
Gnosis Utilities has an information listing at the Python Cheeseshop.
Unfortunately, not having considered setuptools, I had created a
slight "impedance mismatch" in my entry for Gnosis Utilities at
<http://www.python.org/pypi/Gnosis%20Utilities/1.2.1>. Specifically,
the archives are named on a pattern like Gnosis_Utils-N.N.N.tar.gz
(also archived as .zip and .tar.bz2, and the last few versions as
win32.exe installers, all of which setuptools is equally happy with).
But the project name on Cheeseshop is spelled slightly differently, as
"Gnosis Utilities". Oh well, a quick administrative version change at
Cheeseshop created <http://www.python.org/pypi/Gnosis_Utils/1.2.1-a>
as a post-release version. Nothing was changed in the distribution
archives themselves, just a little bit of metadata at Cheeseshop. With
the slight tweak, we might use an even simpler install (note that for
testing purposes I ran an intervening easy_install -m to remove the
installed package):
% easy_install Gnosis_Utils
Searching for Gnosis-Utils
Reading http://www.python.org/pypi/Gnosis_Utils/
Reading http://www.gnosis.cx/download/Gnosis_Utils.ANNOUNCE
Reading http://gnosis.cx/download/Gnosis_Utils.More/
Best match: Gnosis-Utils 1.2.1
Downloading [...]
I omit the completion of the process, since it is identical to what we
have seen. The only change is that easy_install looks on Cheeseshop
(i.e. www.python.org/pypi/) for metadata about a package matching the
name specified, and uses that to look for an actual download location.
In this case, the listed .ANNOUNCE file does not contain anything
helpful, but easy_install is happy to keep looking at the other listed
URL as well, which proves to be a download directory.
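The discovery step just described (read a page, collect its links, and pick the highest-versioned archive for a project) can be sketched in a few lines of modern Python 3. This is only an illustration under simplifying assumptions, not the code easy_install runs: the names LinkCollector and find_best_archive and the sample index page are made up, only plain dotted numeric versions are understood, and the project name must match the archive name exactly (the real tool is more forgiving, as the "Gnosis-Utils" spelling in the messages shows).

```python
import re
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag on a page."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            for key, value in attrs:
                if key == "href" and value:
                    self.links.append(value)


# Matches names such as Gnosis_Utils-1.2.1.zip or Gnosis_Utils-1.2.1.tar.gz.
ARCHIVE = re.compile(
    r"^(?P<name>.+?)-(?P<version>\d+(?:\.\d+)*)\.(?:zip|tar\.gz|tar\.bz2)$")


def find_best_archive(html, project):
    """Return the link to the highest-versioned archive of project, or None."""
    collector = LinkCollector()
    collector.feed(html)
    best = None
    for href in collector.links:
        filename = href.rsplit("/", 1)[-1]
        match = ARCHIVE.match(filename)
        if match is None or match.group("name") != project:
            continue  # not an archive, or an archive of something else
        version = tuple(int(p) for p in match.group("version").split("."))
        if best is None or version > best[0]:
            best = (version, href)
    return best[1] if best else None


# A made-up directory listing standing in for a real download page.
SAMPLE_INDEX = """
<html><body>
<a href="Gnosis_Utils-1.1.1.tar.gz">Gnosis_Utils-1.1.1.tar.gz</a>
<a href="Gnosis_Utils-1.2.0.zip">Gnosis_Utils-1.2.0.zip</a>
<a href="Gnosis_Utils-1.2.1.zip">Gnosis_Utils-1.2.1.zip</a>
<a href="README">README</a>
</body></html>
"""

print(find_best_archive(SAMPLE_INDEX, "Gnosis_Utils"))
```

A page with no recognizable archive links, like the .ANNOUNCE file above, simply yields None here, and a caller would move on to the next listed URL.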
An egg is a bundle that contains all the package data. In the ideal
case, an egg is a zip-compressed file with all the necessary package
files. But in some cases, setuptools decides (or is told by switches)
that a package should not be zip-compressed. In those cases, an egg
is simply an uncompressed subdirectory instead, but with the same
contents. The single file version is handy for transporting, and
saves a little bit of disk-space, but an egg directory is functionally
and organizationally identical. Java users who have worked with jars
will find eggs very familiar.
You may use an egg simply by pointing PYTHONPATH or sys.path at it,
then importing as you normally would, thanks to the import hook
changes in recent versions of Python (you need 2.3.5+ or 2.4). If you
wish to take this approach, you do not need to bother with setuptools
or ez_setup.py at all. For example, I put an egg for the PyYAML
package in a working directory that I used for this article. I can
utilize the package as easily as:
% export PYTHONPATH=~/work/dW/PyYAML-3.01-py2.4.egg
% python -c 'import yaml; print yaml.dump({"foo":"bar",1:[2,3]})'
1: [2, 3]
foo: bar
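The same mechanism can be tried without downloading anything. The sketch below, in modern Python 3, builds a throwaway zip archive containing a single module and imports from it by putting the archive itself on sys.path. The file and module names (demo_pkg-0.1.egg, demo_mod) are invented for the demonstration, and the archive is a bare zip rather than a real egg with metadata.

```python
import importlib
import os
import sys
import tempfile
import zipfile


def build_demo_archive(directory):
    """Create a tiny zip archive holding one module, standing in for an egg."""
    path = os.path.join(directory, "demo_pkg-0.1.egg")
    with zipfile.ZipFile(path, "w") as archive:
        archive.writestr(
            "demo_mod.py", "GREETING = 'hello from inside the archive'\n")
    return path


workdir = tempfile.mkdtemp()
egg_path = build_demo_archive(workdir)

# Putting the archive itself on sys.path is all the import system needs;
# this is what setting PYTHONPATH to an egg does from the shell.
sys.path.insert(0, egg_path)
importlib.invalidate_caches()

import demo_mod

print(demo_mod.GREETING)
print(demo_mod.__file__)  # a path that points inside the archive
```

The import system decides by inspecting the file's contents, not its extension, so the .egg suffix is purely a naming convention here.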
However, this sort of manipulation of the PYTHONPATH (or of sys.path
within a script or Python shell session) is a bit fragile. Discovery
of eggs is probably best handled within some newish magic .pth files.
Any .pth files found in site-packages/ or on the PYTHONPATH are parsed
for additional imports to perform, in a very similar manner to the way
directories in those locations that might contain packages are
examined. If you handle package management with setuptools, a file
called easy-install.pth is modified when packages are installed,
upgraded, removed, etc. But you may call your .pth files whatever you
like (as long as they have the .pth extension). For example, here is
my easy-install.pth:
% cat /sw/lib/python2.4/site-packages/easy-install.pth
import sys; sys.__plen = len(sys.path)
setuptools-0.6b1-py2.4.egg
SQLObject-0.7.0-py2.4.egg
FormEncode-0.5.1-py2.4.egg
Gnosis_Utils-1.2.1-py2.4.egg
import sys; new=sys.path[sys.__plen:]; del sys.path[sys.__plen:]; p=getattr(sys,'__egginsert',0); sys.path[p:p]=new; sys.__egginsert = p+len(new)
The format is a bit peculiar: it is almost, but not quite, a Python
script. Suffice it to say that you may add additional listed eggs in
there; or better yet, easy_install will do it for you when it runs.
You may also create as many other .pth files as you like under
site-packages/ and each may simply list the eggs to make available.
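To see a hand-written .pth file at work without touching a real site-packages/, here is a minimal sketch in modern Python 3. It uses the standard library function site.addsitedir(), which treats a directory as a site directory and processes the .pth files in it. The directory and module names (demo_lib, pth_demo_mod, demo.pth) are invented, and a plain directory stands in for an egg.

```python
import os
import site
import sys
import tempfile

# A temporary directory plays the role of site-packages/.
site_dir = tempfile.mkdtemp()

# A subdirectory holding one module stands in for an installed egg.
lib_dir = os.path.join(site_dir, "demo_lib")
os.mkdir(lib_dir)
with open(os.path.join(lib_dir, "pth_demo_mod.py"), "w") as handle:
    handle.write("VALUE = 42\n")

# The .pth file lists one path per line, relative to the directory
# that contains the .pth file itself.
with open(os.path.join(site_dir, "demo.pth"), "w") as handle:
    handle.write("demo_lib\n")

# Process the directory the way the interpreter processes
# site-packages/ at startup: each existing path listed in a .pth file
# is appended to sys.path.
site.addsitedir(site_dir)

import pth_demo_mod

print(pth_demo_mod.VALUE)
```

Lines in a .pth file that begin with "import" are executed rather than treated as paths, which is how the first and last lines of easy-install.pth manage to rearrange sys.path.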
The magic I showed off above of installing a setuptools-naive package
actually only sort-of worked in the example I showed you. That is,
the package Gnosis_Utils got installed, but not quite completely. All
the general functionality works, but a variety of supporting files
were omitted when the egg was automatically generated: mostly
documentation files with a .txt extension and test files with .xml
extensions (but also some miscellaneous README, .rnc, .rng, .xsl, and
whatnot scattered around the subpackages). As it happens, all of
these supporting files are "nice-to-have" more than they are strictly
required. Still, we would like to include all the supporting files.
The setup.py script for Gnosis_Utils is quite complex, actually. As
well as listing basic metadata, in 467 lines of code it performs a
whole bunch of testing for Python version capabilities and bugs; works
around glitches in old versions of distutils; falls back to skipping
installation of non-supported parts (e.g. if pyexpat is not included
in your Python distribution); handles OS line-ending convention
conversion; creates multiple archive/installer types; and rebuilds the
MANIFEST file in response to these tests. Credit for all this work
goes mostly to the package co-maintainer, Frank McIngvale; and it lets
Gnosis_Utils successfully install as far back as Python 1.5.1, if
necessary (with reduced capabilities in earlier versions). The quick
moral here is that what I am about to show you does not do as much as
the complex distutils script: it simply assumes a "normal" looking and
recent version of Python is installed. That said, it is still
impressive just how easy setuptools can make an installation script.
As a first try, let us create a setup.py script borrowing from the
setuptools manual, and try creating an egg using it:
% cat setup.py
from setuptools import setup, find_packages
setup(
    name = "Gnosis_Utils",
    version = "1.2.2",
    packages = find_packages(),
)
% python setup.py -q bdist_egg
zip_safe flag not set; analyzing archive contents...
gnosis.__init__: module references __file__
gnosis.doc.__init__: module references __file__
gnosis.magic.__init__: module references __file__
gnosis.xml.objectify.doc.__init__: module references __file__
gnosis.xml.pickle.doc.__init__: module references __file__
gnosis.xml.pickle.test.test_zdump: module references __file__
This little effort works; or it sort-of works. We really do create an
egg with these few lines, but the egg has the same shortcoming as the
version easy_install created: it lacks the support files that are not
named .py. So let us try only slightly harder:
from setuptools import setup, find_packages
setup(
    name = "Gnosis_Utils",
    version = "1.2.2",
    package_data = {'':['*.*']},
    packages = find_packages(),
)
It turns out that is all we need to do. Of course, in practice you will often want to fine tune this a bit. For example, a more realistic script might list:
package_data = {'doc':['*.txt'], 'xml':['*.xml', 'relax/*.rnc']}
This would include the .txt files under the doc/ subpackage, all the
.xml files under the xml/ subpackage, and all the .rnc files under the
xml/relax/ subpackage.
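To make the pattern matching concrete, here is a minimal sketch in modern Python 3 of how such a mapping can be expanded into file lists. It is not the setuptools implementation: the helper names expand_package_data and make_demo_tree and the file names in the demo tree are invented, and the special '' key (which applies its patterns to every package) is deliberately left out. The point it illustrates is that each pattern is a glob interpreted relative to the directory of the package named by the key.

```python
import glob
import os
import tempfile


def expand_package_data(source_root, package_data):
    """Map each package name to the data files its patterns match."""
    matched = {}
    for package, patterns in package_data.items():
        # A dotted package name such as "xml.relax" maps to nested directories.
        package_dir = os.path.join(source_root, *package.split("."))
        files = set()
        for pattern in patterns:
            for path in glob.glob(os.path.join(package_dir, pattern)):
                if os.path.isfile(path):
                    files.add(os.path.relpath(path, package_dir))
        matched[package] = sorted(files)
    return matched


def make_demo_tree():
    """Build a throwaway source tree with made-up file names."""
    root = tempfile.mkdtemp()
    for relative in ("doc/__init__.py", "doc/notes.txt",
                     "xml/__init__.py", "xml/sample.xml",
                     "xml/relax/schema.rnc"):
        path = os.path.join(root, *relative.split("/"))
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "w") as handle:
            handle.write("")
    return root


demo_root = make_demo_tree()
demo_result = expand_package_data(
    demo_root, {"doc": ["*.txt"], "xml": ["*.xml", "relax/*.rnc"]})
for package, files in sorted(demo_result.items()):
    print(package, files)
```

Note that the .py files are not matched by any of these patterns; they are picked up separately through the packages argument.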
I really only scratched the surface of the customization you can
perform with setuptools-aware distributions. For example, once you
have a distribution (either the preferred egg format, or another
archive type), you may automatically upload the archive and metadata
to Cheeseshop with a single command. Obviously, a complete setup.py
script should contain the same detailed metadata that your old
distutils scripts contained. I skipped that just for ease of
presentation, but the argument names are compatible with distutils.
It takes a little while to become fully comfortable with setuptools'
large set of capabilities, but it really makes both maintaining your
own packages and installing outside packages much easier than the
distutils baseline. And if all you care about is installing packages,
pretty much everything you need to know is contained in this
introduction; the complexity only comes with describing your own
packages, and that complexity is still less than required to "grok"
distutils.
The latest version of setuptools can be found on Cheeseshop, at:
http://cheeseshop.python.org/pypi/setuptools/
The home page for PEAK itself is the place to start for an introduction to the library as a whole.
http://peak.telecommunity.com/
The full manual for setuptools is at:
http://peak.telecommunity.com/DevCenter/setuptools
David Mertz has many versions of each of his thoughts, and overall lacks any unity of ego. David may be reached at [email protected]; his life pored over at http://gnosis.cx/publish/. Check out David's book Text Processing in Python (http://gnosis.cx/TPiP/).