pydoc
And distutils
David Mertz, Ph.D.
Auteur Provocateur, Gnosis Software, Inc.
June, 2001
Several modules and tools introduced in recent versions of Python have improved Python not so much as a language, but as a tool. The modules discussed in this column make the job of Python developers substantially easier by improving the documentation and distribution of Python modules and packages.
KEYWORDS: documentation installation distribution pydoc doctest distutils Python modules packages
Python is a freely available, very-high-level, interpreted language developed by Guido van Rossum. It combines a clear syntax with powerful (but optional) object-oriented semantics. Python is available for almost every computer platform you might find yourself working on, and has strong portability between platforms.
One year ago, if you were to ask an honest Python evangelist if Python was missing anything important that Perl, for example, had, she would have to fess up an affirmative answer. It wasn't that Python lacked a breadth of module and package support--both Python native and extension modules. It certainly wasn't the clarity of expression or clean object orientation in which Python positively excels. What Python was missing was what Perl developers describe as "social factors." But even here, the missing social factors were not an absence of an active, intelligent and supportive Python community--Python has abounded in that. What the Python of one year ago sorely lacked was a sufficient programmatic infrastructure for the sharing of Python code. Code sharing was ad hoc, decentralized, and just plain too much work in the general case.
The first step in improving the social infrastructure of Python was probably Tim Middleton's creation of the Vaults of Parnassus (see Resources). For the first time, Python developers had a single place to turn for (nearly) all contributed third-party modules, packages and tools. The Vaults still have their quirks that make them possibly less advanced (but nicer looking) than the Comprehensive Perl Archive Network. The Vaults merely point to actual resources rather than mirroring them; the Vaults are manually maintained by Middleton, and updates are sometimes slow to happen; and vex.net who generously hosts the Vaults has had intermittent outages. But overall, the service provided by the Vaults of Parnassus has been an invaluable resource in building the architectural prerequisites of a strong Python community.
With such a common site available, all the Python community needed was consistent and solid ways of installing all these available modules, packages and tools; and equally clear ways of figuring out just what they did. To the rescue came several new standard modules in the standard Python distribution..
pydoc
Ka-Ping Yee has created a quite remarkable module by the name
pydoc
(for the "competition": pydoc
does everything that
perldoc
does, but better and prettier :-)). As of Python
2.1, pydoc
(and its supporting inspect
) is part of the
standard library. However, for users of Python 1.5.2, 1.6 or
2.0, downloading and installing pydoc
is extremely easy--do
so right away (see Resources).
By way of background for any Python beginners reading this, Python has long had some semi-formal documentation standards. These standards have not attempted to contrain developers unduly, but rather simply to seem like the "one obvious way to do it." Fortunately, Python developers, as a rule, have always been far better documenters than typical developers in other languages.
The main element of good Python documentation is the use of
so-called "docstrings." While a docstring is really just a
variable called _doc_
, there is a ubiquitous shortcut for
creating them: just put a bare (triple-)quoted string at the
very beginning of a module, function def
, class definition,
or method def
. As well, there are several more-or-less
standard module-level "magic" variable names that are often
used. Despite the informality of the documentation rules,
almost all 3rd party and standard modules use the same
patterns. Let's look at a simplistic example that has most of
the elements used:
#!/usr/bin/python """Show off features of [pydoc] module This is a silly module to demonstrate docstrings """ __author__ = 'David Mertz' __version__= '1.0' __nonsense__ = 'jabberwocky' class MyClass: """Demonstrate class docstrings""" def __init__(self, spam=1, eggs=2): """Set default attribute values only Keyword arguments: spam -- a processed meat product eggs -- a fine breakfast for lumberjacks """ self.spam = spam self.eggs = eggs
The pydoc
module takes advantage of Python documentation
conventions, as well as having some savvy about Python imports,
inheritance, and the like. Moreover, pydoc
has the absolute
genius of allowing itself to be used in multiple modes of
operation. More on this shortly; for a few moments I'll look
at the manpage
style usage at an OS command-line.
Let's say you had the above module mymod
installed on your
system, but weren't sure what it was for (not much in the
example). You might read the source, but even easier might be:
% pydoc.py mymod Python Library Documentation: module mymod NAME mymod - Show off features of [pydoc] module FILE /articles/scratch/cp18/mymod.py DESCRIPTION This is a silly module to demonstrate docstrings CLASSES MyClass class MyClass | Demonstrate class docstrings | | __init__(self, spam=1, eggs=2) | Set default attribute values only | | Keyword arguments: | spam -- a processed meat product | eggs -- a fine breakfast for lumberjacks DATA __author__ = 'David Mertz' __file__ = './mymod.pyc' __name__ = 'mymod' __nonsense__ = 'jabberwocky' __version__ = '1.0' VERSION 1.0 AUTHOR David Mertz
Depending on your specific platform and setup, the above sample will probably be presented in a text viewer that allows scrolling, searching, and so on, and with some highlighting of key words. For something this simple, what you get is only slightly better than just reading the source, but consider something as as simple as:
% cat mymod2.py from mymod import MyClass class MyClass2(MyClass): """Child class""" def foo(self): pass % pydoc.py mymod2.MyClass2 Python Library Documentation: class MyClass2 in mymod2 class MyClass2(mymod.MyClass) | Child class | | __init__(self, spam=1, eggs=2) from mymod.MyClass | | foo(self)
In this quick report we can tell not only that MyClass2
has
the methods __init__()
and foo()
(and the arguments
thereto), but also which methods are implemented locally, and
which other methods come from ancestors (and where those
ancestors live).
Another nice manpage
like feature is the -k
option for
searching modules for keywords. For example:
% pydoc.py -k uuencode uu - Implementation of the UUencode and UUdecode functions. % pydoc.py uu Python Library Documentation: module uu NAME uu - Implementation of the UUencode and UUdecode functions. [...]
Besides its command-line usage, pydoc
has four other "modes"
that can present the same generated documentation.
Shell mode Inside the Python interactive shell, you may importpydoc
'shelp()
function, and get assistance on any object without leaving the interactive session. You can also just typehelp
by itself to get into an interactive "help interpreter." For example:
#------- Interactive shell with help enhancements -------# >>> from pydoc import help >>> import uu >>> help(uu.test) Help on function test in module uu:
test() uuencode/uudecode main program
>>> help
Welcome to Python 2.0! This is the online help utility.
...introductory message about help shell...
help>
Webserver mode Just use the-p
option, andpydoc
will launch itself as a simple webserver on LOCALHOST. You can use any web browser to browse all the modules installed on the current system. The homepage for this server is a list of modules, grouped by directory (and with attractive color blocks for browsers supporting that). Moreover, every module whose documentation you view becomes generously littered with links to any modules, functions and methods it imports.
HTML generator mode The-w
option can generate an HTML documentation page for anythingpydoc
can document. Pretty much, these pages are the same thing you might browse in webserver mode, but the pages are static and available for archiving, transmission, etc.
TK browser mode The-g
option will create a "graphical help browser," much along the lines ofxman
ortkman
.
distutils
As of Python 1.6, a package called distutils
has become part of
the standard Python library. There are two aspects to what
distutils
does. On the one side, distutils
hopes to make
installation of new modules, packages, and tools uniform and
easy for end-users. On the other side, distutils
also hopes
to make the creation of these easy-to-install distributions
easy on their developers. Let's look at both aspects briefly.
In the very simplest case, a developer will have chosen to
create an installer for your specific platform. If this is the
case, you really don't need to know that anything called
distutils
exists at all. Currently, distutils
is capable
of creating RPMs for those Linux systems that support that
format, and Windows EXE self-installers for Win32 systems.
While these are big players, other platforms exist also and/or
the developer might have had access to your platform (or the
time or interest in making an installer).
Fortunately, short of the simplest case, the next best case is
hardly any more difficult. Assuming you get a distutils
aware source distribution, you can count on a number of things
(if nothing goes wrong, of course). The distribution archive
should be in a standard archive format--usually either .zip
or
.tgz
/ .tar.gz
(in odd cases you might find .tbz
or
tar.Z
, and hopefully .sit
support will be added for MacOS
soon). Most of the time, Windows users use zip's and
Linux/Unix users use tarballs. But it is not hard to unpack
most formats on most platforms. Once you have unpacked the
archive, you'll get a collection of files in a directory named
in the same fashion as the archive was. For example:
E:\archive\devel>unzip -q Distutils-1_0_2.zip E:\archive\devel>cd Distutils-1.0.2 E:\archive\devel\Distutils-1.0.2>ls The volume label in drive E is ARCHIVE. The Volume Serial Number is E825:C814. Directory of E:\archive\devel\Distutils-1.0.2 6-14-01 0:38a <DIR> 0 . 6-14-01 0:38a <DIR> 0 .. 5-03-01 6:30p 15355 0 CHANGES.txt 5-03-01 6:32p <DIR> 0 distutils 5-03-01 6:32p <DIR> 0 doc 5-03-01 6:32p <DIR> 0 examples 10-02-00 11:47p 373 0 MANIFEST.in 5-03-01 6:32p <DIR> 0 misc 5-03-01 6:32p 496 0 PKG-INFO 4-20-01 2:30p 14407 0 README.txt 6-29-00 11:45p 1615 0 setup.cfg 5-03-01 6:17p 1120 0 setup.py 4-20-01 2:29p 9116 0 TODO 4-11-00 9:40p 836 0 USAGE.txt
Most module distributions will have fewer files and directories
than this one. The only thing you really need is the file
setup.py
, which contains instructions for the install.
Realistically though, one hopes that there are some other
files in the directory so that setup.py
has something to
install. From here, all you need to do is:
E:\archive\devel\Distutils-1.0.2> python setup.py install
At least that should be all you need to do. If something
goes wrong, read the README.txt
or README
file which is
likely to be included. And after that, check out Greg Ward's
"Installing Python Modules" (see Resources).
But what is going on "under the hood?" As you can guess from
the name, setup.py
is really just a plain Python script, so
it can do anything when it runs. But in almost all cases,
setup.py
will have a pretty stereotypical form. It might
look something like:
#!/usr/bin/env python """Setup script for the sample #1 module distribution: single top-level pure Python module, named explicitly in 'py_modules'.""" from distutils.core import setup setup (# Distribution meta-data name = "sample", version = "1.0", description = "Distutils sample distribution #1", # Description of modules and packages in the distribution py_modules = ['sample'], )
The real work here is performed by the imported distutils
,
specifically by the setup()
function. Basically, setup()
takes a bunch of named arguments, including a list of things to
install (besides py_modules
, there might be packages
or
ext_modules
or some other things).
The magic of distutils
is that creating a module
distribution uses the very same setup.py
file as installing
it does. Once you--the module developer--have created a
setup.py
script (and maybe 'setup.cfg or other adjuncts) that
specifies what needs to get installed, all you need to do to
create the distribution is (one or more of the following):
% python setup.py sdist % python setup.py bdist_wininst % python setup.py bdist_rpm
Depending on which specific distribution you specify, you will create either a standard archive (tarball or zip file, depending on platform) or a full installer (as discussed above).
We are not quite there yet, but Python is getting to be not just one of the easiest to use programming languages, but also one of the easiest to use programming communities. Some of these new tools still have one or two kinks to iron out still, but in a general way, everything one needs to make Python "transparent" to users has been put in place.
The Vaults of Parnassus:
http://www.vex.net/parnassus/
If you are using a version of Python earlier than 2.1, you can
still obtain pydoc
from:
http://web.lfw.org/python/pydoc.py
You will also need to pick up the supporting inspect
module:
http://web.lfw.org/python/inspect.py
Python documentation conventions are discussed in Guido van Rossum's Python Style Guide:
http://www.python.org/doc/essays/styleguide.html
Some enhancements to Python documentation conventions have recently been proposed in Python Enhancment Proposals (PEPs) 256, 257 and 258. These may or may not become part of future Python versions, but it might be interesting to look at the ideas at:
http://python.sf.net/peps/
Another new module that helps in assuring the quality and
correctness of documentation is doctest
. The basic purpose
of this module is to allow automatic evaluation and
verification of the interactive session behavior that is often
pictured within docstrings. Read the module documentation at:
http://python.org/doc/current/lib/module-doctest.html
The latest distutils
can be found at the below link. For
most user of Python 1.6+, it is easiest to stick with the
version of distutils
that came with Python. But users of
Python 1.5.x would do well to download this important package
from:
http://www.python.org/sigs/distutils-sig/download.html
Greg Ward's Installing Python Modules is a good introduction
to the end-user of distutils
. If you download a current
Python documentation set, you get a version of this document.
But the current version of the document should probably be at:
http://www.python.org/sigs/distutils-sig/doc/inst/inst.html
Greg Ward has also written Distributing Python Modules which
discusses distutils
from the point-of-view a module
developer. It can also be found in the current Python
documentation; and also at:
http://www.python.org/sigs/distutils-sig/doc/dist/dist.html
David believes that each developer should contribute according to her ability and receive support according to her need. David may be reached at [email protected]; his life pored over at http://gnosis.cx/publish/. Suggestions and recommendations on this, past, or future, columns are welcomed.