Linux Zone Feature: The Twisted Framework

Part Four, Secure Clients and Servers


David Mertz, Ph.D.
Selector, Gnosis Software, Inc.
August, 2003

In this final installment of his series on Twisted, David looks at specialized protocols and servers contained in the Twisted package, with a focus on secure connections.

Introduction

One thing the servers and clients in my prior installments had in common is that they operated completely in the clear, cryptographically speaking. Sometimes, however, you want to keep your connection free from prying eyes (or from tampering/spoofing).

While protocols for determining permissions on server resources are interesting, for this installment I want to look at protocols involving actual wire-level encryption. For general background, readers might want to investigate web-oriented mechanisms like Basic Authentication, which is described in RFC-2617 and implemented in Apache and other web servers. The Twisted package twisted.cred is a general, but complicated, framework for providing authentication services in general-purpose Twisted servers (not limited to web ones).

There are two widespread APIs for wire-level encryption over the Internet: SSL and SSH. The former, SSL (Secure Sockets Layer), is widely implemented in web browsers and web servers; in principle, however, there is no reason SSL is specifically tied to the HTTP protocol. SSL combines a public-key infrastructure, complete with a "web-of-trust" based on Certificate Authorities, with creation of a session key for standard symmetrical encryption during the life of a particular connection.
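To make the Certificate Authority side of this concrete, here is a minimal sketch using the ssl module from the standard library of a current Python 3 interpreter. It is not Twisted's SSL framework (which this article does not exercise), and it opens no network connection; it only shows the verification settings a client would start from.

```python
import ssl

# Build a client-side context with the standard library's defaults.
context = ssl.create_default_context()

# With these defaults, a server must present a certificate that chains
# to a trusted Certificate Authority and that matches the host name the
# client asked for. The symmetric session key is negotiated later, during
# the handshake that happens when a socket is wrapped with this context.
requires_certificate = (context.verify_mode == ssl.CERT_REQUIRED)
checks_host_name = context.check_hostname
```

Both flags are expected to be true for a default client context; loosening either one gives up the protection against spoofing mentioned in the introduction.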

Twisted does come with an SSL framework; however, as with most things in Twisted, exactly how it might work is poorly documented--I tried downloading two likely support packages to try to get the Twisted v.1.0.6 script test_ssl.py to work (see Resources). I am sure that with some version of the right 3rd party libraries (and some Twisted version)--and perhaps after corrections to erroneous examples--it is possible to use SSL with Twisted, but I have not done so for this article.

The other widespread API for wire-level encryption is SSH (Secure Shell), well known from the tool of the same name (in lowercase: ssh). Many of the underlying cryptographic algorithms are shared between SSL and SSH, but SSH is focused on creating encrypted shell connections (rather than using snooper-friendly programs/protocols like telnet and rsh). Twisted lets you write your own custom SSH clients and servers, which is quite nice. While you certainly can write a basic interactive remote shell, like that provided by the client and server ssh and sshd, you can also create more specialized tools to use these secure connections for higher-level purposes.

An SSH Weblog Client

In keeping with the running example of this series, I created a tool to examine hits to my webserver log file, but to do so over an encrypted SSH channel. This purpose is realistic, actually--perhaps I do not want to publicly reveal the hits I get to someone monitoring my packet stream.

Before I could get far in my efforts, I needed to figure out what the line import Crypto in the twisted.conch package was trying to find. The name is obviously a hint, but I was also somewhat familiar with the Python cryptography library maintained by Andrew Kuchling (see Resources). A bit of googling, a download, and an install later, Twisted's test_conch.py would run without complaint. So on to the project of creating a custom SSH client.

I based my client on the example provided in the Twisted file doc/examples/sshsimpleclient.py. I have simplified it somewhat (as well as customizing it); you might want to look at what else is in the distributed example. As with most Twisted components, twisted.conch consists of several layers, each of which can be customized. I guess the name "conch" is a play on the word "shell" in Secure Shell.

The transport level is a customization of SSHClientTransport. We may define several methods, but need to at least define .verifyHostKey() and .connectionSecure(). In our implementation, we trust every host key, and simply give control back to the asynchronous reactor core by returning a defer.succeed object. Of course, if you wanted to verify a host against a known key, you could do that in .verifyHostKey().
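The known-key check mentioned above might be sketched as follows. This is a minimal illustration in present-day Python 3 using only the standard library, not part of the article's client; the KNOWN_FINGERPRINTS table, the placeholder host name example.invalid, and the helper host_key_is_known() are all hypothetical names introduced here.

```python
# Hypothetical table of trusted hosts and their expected key fingerprints.
# The fingerprint is the one printed in the sample session later in this
# article; the host name is only a placeholder.
KNOWN_FINGERPRINTS = {
    'example.invalid': '56:54:76:b6:92:68:85:bb:61:d0:f0:0e:3d:91:ce:34',
}

def host_key_is_known(host, fingerprint, known=KNOWN_FINGERPRINTS):
    """Return True only if `host` is listed and its fingerprint matches."""
    expected = known.get(host)
    return expected is not None and expected == fingerprint.lower()
```

Inside .verifyHostKey(), a true result could lead to the same return defer.succeed(1) used in the listing, while a false result would presumably return a failed Deferred instead, so that an unknown or changed host is not trusted.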

Creating the channel is where the other layers come in. A child of SSHUserAuthClient performs the actual login authentication; if successful, it establishes a connection (for which I define a child of SSHConnection). This connection, in turn, creates a channel--a child of SSHChannel. It is the channel, which I named simply Channel, that does the actual custom work. Specifically, the channel does things like send and receive data and commands. Let us look at my specific client:

ssh-weblog.py

#!/usr/bin/env python
"""Monitor a remote weblog over SSH

  USAGE: ssh-weblog.py user@host logfile
"""
from twisted.conch.ssh import transport, userauth, connection, channel
from twisted.conch.ssh.common import NS
from twisted.internet import defer, protocol, reactor
from twisted.python import log
from getpass import getpass
import struct, sys, os
import webloglib as wll
#
USER,HOST,CMD = None,None,None
#
class Transport(transport.SSHClientTransport):
    def verifyHostKey(self, hostKey, fingerprint):
        print 'host key fingerprint: %s' % fingerprint
        return defer.succeed(1)

    def connectionSecure(self):
        self.requestService(UserAuth(USER, Connection()))
#
class UserAuth(userauth.SSHUserAuthClient):
    def getPassword(self):
        return defer.succeed(getpass("password: "))
    def getPublicKey(self):
        return  # Empty implementation: always use password auth
#
class Connection(connection.SSHConnection):
    def serviceStarted(self):
        self.openChannel(Channel(2**16, 2**15, self))
#
class Channel(channel.SSHChannel):
    name = 'session'    # must use this exact string
    def openFailed(self, reason):
        print '"%s" failed: %s' % (CMD,reason)
    def channelOpen(self, data):
        self.welcome = data   # Might display/process welcome screen
        d = self.conn.sendRequest(self,'exec',NS(CMD),wantReply=1)
    def dataReceived(self, data):
        recs = data.strip().split('\n')
        for rec in recs:
            hit = [field.strip('"') for field in wll.log_fields(rec)]
            resource = hit[wll.request].split()[1]
            referrer = hit[wll.referrer]
            if resource=='/kill-weblog-monitor':
                print "Bye bye..."
                self.closed()
                return
            elif hit[wll.status]=='200' and hit[wll.referrer]!='-':
                print referrer, ' -->', resource
    def closed(self):
        self.loseConnection()
        reactor.stop()
#
if __name__=='__main__':
    if len(sys.argv) < 3:
        sys.stderr.write(__doc__)
        sys.exit()
    USER, HOST = sys.argv[1].split('@')
    CMD = 'tail -f -n 1 '+sys.argv[2]
    protocol.ClientCreator(reactor, Transport).connectTCP(HOST, 22)
    reactor.run()

The overall structure of the client is like most of the Twisted applications we have seen. It creates a protocol, and monitors events in an asynchronous loop (i.e. reactor.run()).

The interesting part comes in the methods of Channel(). As soon as the channel is opened, we execute a custom command--in this case, a tail -f on the weblog file whose name is specified on the command line. Naturally, the host, which is still a completely generic sshd server rather than anything Twisted specific, starts sending some data back. The method dataReceived() parses the data as it comes in (incrementally as tail produces more). For this specific client, we decide when to terminate based on the actual content of the weblog being parsed--which amounts to having a web-based way to kill the monitoring application. While that specific configuration is probably unusual, the example demonstrates the general concept of severing the connection when some condition is met (it could be any condition). A session looks like:

Sample session of weblog monitor

$ ./ssh-weblog.py [email protected] access-log
host key fingerprint: 56:54:76:b6:92:68:85:bb:61:d0:f0:0e:3d:91:ce:34
password:
http://gnosis.cx/publish/  --> /publish/whatsnew.html
http://gnosis.cx/publish/whatsnew.html  --> /home/hugo.gif
Bye bye...

This is pretty much the same as all the other weblog monitors this series created. I ended the above session by pointing a browser at <http://gnosis.cx/kill-weblog-monitor> from another window (otherwise, it would watch indefinitely).
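One caveat about .dataReceived() in the listing: data on a stream arrives in chunks whose boundaries need not coincide with line boundaries, so a log record could in principle be split across two calls. A minimal sketch of buffering partial lines follows. It is written for a current Python 3 interpreter, works on plain strings for illustration, and the LineBuffer class is a hypothetical helper, not something Twisted or webloglib provides.

```python
class LineBuffer(object):
    """Collect chunks of text and hand back only the complete lines."""

    def __init__(self):
        self._pending = ''        # trailing text with no newline yet

    def feed(self, data):
        """Add a chunk; return the list of lines it completed."""
        self._pending += data
        lines = self._pending.split('\n')
        # The last element is whatever followed the final newline
        # (possibly empty); keep it until the rest of the line arrives.
        self._pending = lines.pop()
        return lines
```

A channel could keep one LineBuffer per connection and loop over self.buffer.feed(data) in .dataReceived() instead of splitting each chunk on its own.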

Modifying The SSH Client

It is a simple matter to create other SSH clients that achieve other purposes. For example, I copied ssh-weblog.py to the name scp.py, and made just a few changes to the code. The __main__ body parses options slightly differently, and the docstring was adjusted; beyond that, I simply modified the .dataReceived() method to read:

scp.py (modified Channel method)

def dataReceived(self, data):
    open(DST,'wb').write(data)
    self.closed()

(the variable CMD was set to "cat "+sys.argv[2]).

Voilà! I have implemented the tool scp that accompanies many SSH clients.
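The one-line .dataReceived() above writes a single chunk and then closes the channel, which suits small files. Since the client opens its channel with a maximum packet size of 2**15 bytes, a larger file would arrive over several calls. A minimal sketch of collecting every chunk is shown below, in current Python 3 and standard library only; the FileSink class and its method names are hypothetical stand-ins for the corresponding Channel methods, and it assumes the channel's close hook runs once the remote cat finishes.

```python
import os
import tempfile

class FileSink(object):
    """Append each received chunk to one output file, close it at the end."""

    def __init__(self, path):
        self._out = open(path, 'wb')

    def data_received(self, data):
        self._out.write(data)     # may be called many times per file

    def closed(self):
        self._out.close()         # flush to disk once the channel closes

# Usage: two chunks arrive separately, then the channel closes.
path = os.path.join(tempfile.mkdtemp(), 'copy.bin')
sink = FileSink(path)
sink.data_received(b'first chunk, ')
sink.data_received(b'second chunk')
sink.closed()
```

Opening the file once and closing it in the close hook avoids truncating the destination each time a new chunk shows up.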

These examples are both "run and collect" tools. That is, they are not interactive during the session. But you could easily create another tool that made additional calls to self.conn.sendRequest() within Channel methods. In fact, if the client were some kind of GUI client, you might hook its data collection forms in as callbacks within the reactor. That is, perhaps when certain forms are completed, new remote commands could be issued, and the results again collected for processing or presentation.

An SSH Weblog Server

An SSH server uses much of the same structure as the client. As before, I simplify and customize doc/examples/sshsimpleserver.py for my example. One twist is that a server is best created using an SSHFactory child that has been configured with appropriate keys and classes.

In our SSH weblog server, we configure a password and username for an authorized user. In the example, they are hardcoded, but you could obviously store them otherwise; perhaps configure a list of authorized weblog monitors. Let us look at the example:

ssh-weblog-server.py

#!/usr/bin/env python2.3
from twisted.cred import authorizer
from twisted.conch import identity, error
from twisted.conch.ssh import userauth, connection, channel, keys
from twisted.conch.ssh.factory import SSHFactory
from twisted.internet import reactor, protocol, defer
import time
#
class Identity(identity.ConchIdentity):
    def validatePublicKey(self, data):
        return defer.succeed('')
    def verifyPlainPassword(self, password):
        if password=='password' and self.name == 'user':
            return defer.succeed('')
        return defer.fail(error.ConchError('bad password'))
#
class Authorizer(authorizer.Authorizer):
    def getIdentityRequest(self, name):
        return defer.succeed(Identity(name, self))
#
class Connection(connection.SSHConnection):
    def gotGlobalRequest(self, *args):
        return 0
    def getChannel(self, channelType, windowSize, maxPacket, data):
        if channelType == 'session':
            return Channel(remoteWindow=windowSize,
                      remoteMaxPacket=maxPacket, conn=self)
        return 0
#
class Channel(channel.SSHChannel):
    def channelOpen(self, data):
        weblog = open('../access.log')
        weblog.readlines()
        while 1:
            time.sleep(5)
            for rec in weblog.readlines():
                self.write(rec)
    def request_pty_req(self, data):
        return 1    # ignore, but this gets sent for shell requests
    def request_shell(self, data):
        self.client = protocol.Protocol()
        self.client.makeConnection(self)
        self.dataReceived = self.client.dataReceived
        return 1
    def loseConnection(self):
        self.client.connectionLost()
        channel.SSHChannel.loseConnection(self)
#
from os.path import expanduser  # open() does not expand '~' on its own
class Factory(SSHFactory):
    publicKeys = {'ssh-rsa':keys.getPublicKeyString(
                    data=open(expanduser('~/.ssh/id_rsa.pub')).read())}
    privateKeys ={'ssh-rsa':keys.getPrivateKeyObject(
                    data=open(expanduser('~/.ssh/id_rsa')).read())}
    services = {'ssh-userauth': userauth.SSHUserAuthServer,
                'ssh-connection': Connection}
    authorizer = Authorizer()
#
reactor.listenTCP(8022, Factory())
reactor.run()

For brevity, the parsing and formatting of the weblog records is omitted, but the idea of using an open channel to write new records as they become available is almost the same as with the client approach. Of course, in this case, any generic SSH client can connect to the specialized server:

Sample session of weblog monitor

$ ssh gnosis.python-hosting.com -p 8022 -l user
[email protected]'s password:
141.154.146.89 - - [26/Aug/2003:02:47:40 -0500]
"GET /voting-project/August.2003/0010.html HTTP/1.1" 200 8986
"http://gnosis.python-hosting.com/voting-project/August.2003/0009.html"
"Mozilla/5.0 (Macintosh; U; PPC Mac OS X; en-us) AppleWebKit/85
(KHTML, like Gecko) Safari/85"
[...]
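Note that the while loop in .channelOpen() never hands control back to the reactor, so a server meant to handle more than one client at a time would probably want to poll the log instead. The sketch below shows only the polling part, in current Python 3 with the standard library; LogFollower is a hypothetical helper, and the temporary file stands in for ../access.log. In a Twisted server, something like reactor.callLater(5, ...) could be used to call poll() periodically and write whatever it returns to the channel.

```python
import os
import tempfile

class LogFollower(object):
    """Remember a position in a log file and hand back only newer lines."""

    def __init__(self, path):
        self._log = open(path)
        self._log.seek(0, os.SEEK_END)   # skip records that already exist

    def poll(self):
        """Return the lines appended since the previous call (maybe none)."""
        return self._log.readlines()

# Demonstration with a temporary stand-in for the real access log.
path = os.path.join(tempfile.mkdtemp(), 'access.log')
with open(path, 'w') as log:
    log.write('old record\n')

follower = LogFollower(path)
first = follower.poll()           # nothing has been appended yet

with open(path, 'a') as log:
    log.write('new record\n')
second = follower.poll()          # only the appended record
```

Because poll() returns immediately, each call does a small amount of work and then lets the event loop get on with other connections.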

Much as with the client approach, an enhanced version might become more interactive; the .dataReceived() method of the channel could be customized to do something useful with data sent from the (generic) client.
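The hardcoded 'user' and 'password' pair in Identity could likewise give way to a table of authorized weblog monitors. The following is a minimal sketch under stated assumptions: it is current Python 3 using only hashlib, hmac, and os from the standard library, the AUTHORIZED table and the function names are hypothetical, and the iteration count is an arbitrary illustrative value.

```python
import hashlib
import hmac
import os

def hash_password(password, salt):
    """Derive a salted hash so the table never holds plain passwords."""
    return hashlib.pbkdf2_hmac('sha256', password.encode('utf-8'),
                               salt, 100000)

def make_entry(password):
    """Create the (salt, hash) pair stored for one user."""
    salt = os.urandom(16)
    return salt, hash_password(password, salt)

# Hypothetical table of authorized weblog monitors.
AUTHORIZED = {'user': make_entry('password')}

def check_login(name, password, table=AUTHORIZED):
    """Return True only for a listed user with the matching password."""
    entry = table.get(name)
    if entry is None:
        return False
    salt, expected = entry
    # compare_digest avoids leaking how many leading bytes matched.
    return hmac.compare_digest(hash_password(password, salt), expected)
```

A method like .verifyPlainPassword() could then call check_login(self.name, password) and return defer.succeed('') or defer.fail(...) exactly as the listing already does.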

Social Dynamics

The biggest reservation I have about recommending the Twisted framework is, unfortunately, the "wild west" feel among its developer group. The software itself is quite powerful. But even more than in most open source projects, there is insufficient API consistency between releases, the documentation remains rough, and a thick skin is the main prerequisite for seeking help on its mailing list; you can get helpful responses, but only after wading through the acerbic ones.

As this installment demonstrated--especially in my attempts to fill in pieces missing from the examples and documentation--Twisted could really stand to have a helpful community behind it. Hopefully, with time, both the documentation and mailing list will improve in quality; the facilities hiding in the various corners of the Twisted framework are quite impressive.

Resources

Twisted Matrix comes with quite a bit of documentation, and many examples. Browse around its homepage to glean a greater sense of how Twisted Matrix works, and what has been implemented with it:

http://twistedmatrix.com

The Python Cryptography Toolkit, maintained by Andrew Kuchling, can be downloaded at the following URL. This toolkit includes numerous well-investigated public-key, private-key, and cryptographic hash functions, as well as some miscellaneous other protocols:

http://www.amk.ca/python/code/crypto.html

The SourceForge project "Python OpenSSL Wrappers" (POW) looks like a useful tool for SSL programming in Python. However, it does not appear (from my trial-and-error) to be what Twisted is looking for in its SSL subsystem:

http://sourceforge.net/projects/pow

Most likely, for Twisted, the SSL wrapper you want is pyOpenSSL. At least after I installed that, I got past an import exception in Twisted's test_ssl.py (but only so far as what appears to be an error in the test script):

http://sourceforge.net/projects/pyopenssl/

Some background on HTTP authentication techniques can be found in RFC-2617:

http://www.ietf.org/rfc/rfc2617.txt

An introduction to the SSL protocol can be found at:

http://developer.netscape.com/tech/security/ssl/howitworks.html

A simple version of a weblog server was presented in the developerWorks tip, Use Simple API for XML as a long-running event processor:

http://www-106.ibm.com/developerworks/xml/library/x-tipasysax.html

About The Author

David Mertz believes that it is turtles all the way down. David may be reached at [email protected]; his life pored over at http://gnosis.cx/publish/. And buy his book: Text Processing in Python (http://tinyurl.com/jskh).