This page contains some of the software I have written. The
software is licensed under the GNU GPL, which allows
you to freely download, modify and redistribute the programs and their source
code. Should you decide to distribute a modified version of these
programs, you must make the full source code, including your
modifications, available to all recipients.
Each project has its own page. To reach it, click the heading next
to the project's description.
- This is a small editor for the boxfile format used by the
open source OCR engine
- This is a collection of Unix command-line tools that resemble
coreutils but operate on XML files rather than plain text
files. The initial ideas are explained in this
- This is a Bayesian text and email classifier, suitable for spam
filtering or other classification tasks. It is small and fast. See
this tutorial if you want to do spam
- This program builds a web graph and calculates a generalization
of PageRank using Markov chain methods. The project was created to
operate on a dataset of just under one million real web pages offered
to the public by Google.
- subpixelgs is a hack which lets programs such as gv, Ghostview or
GSView transparently display PostScript and PDF files using subpixel rendering. It was
written before subpixel rendering became common on free Unix systems.
- This is the Java source code for the Markov Chain Monte Carlo
demonstration applets which you can find elsewhere on this site. The code is provided
AS IS and is not under active development.
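
For readers unfamiliar with the idea behind the web graph project above, here is a
minimal sketch of ordinary PageRank, computed by power iteration on the Markov
chain of a random surfer. It is not the generalization that the program
calculates, and it is not taken from the program's source. The function name
`pagerank`, the damping value 0.85, the convergence tolerance and the toy graph
are all assumptions made for illustration.

```python
def pagerank(links, damping=0.85, tol=1e-10, max_iter=1000):
    """Return a dict mapping each page to its PageRank score.

    links: dict mapping a page to the list of pages it links to.
    All names and default values here are illustrative assumptions.
    """
    # Collect every page that appears as a link source or a link target.
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    if n == 0:
        return {}

    # Start from the uniform distribution over pages.
    rank = {p: 1.0 / n for p in pages}

    for _ in range(max_iter):
        # Rank held by pages without outgoing links is spread uniformly.
        dangling = sum(rank[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_rank = {p: base for p in pages}

        # Each page passes an equal share of its rank along every outgoing link.
        for page in pages:
            targets = links.get(page, [])
            for target in targets:
                new_rank[target] += damping * rank[page] / len(targets)

        # Stop once the distribution has effectively stopped changing.
        change = sum(abs(new_rank[p] - rank[p]) for p in pages)
        rank = new_rank
        if change < tol:
            break
    return rank


# Hypothetical four-page graph: "d" links out but nothing links to it.
toy_graph = {"a": ["b", "c"], "b": ["c"], "c": ["a"], "d": ["c"]}
scores = pagerank(toy_graph)
```

In this toy graph the scores sum to one, and page "c", which every other page
links to, ends up with the largest share. A dataset of a million pages would need
sparse data structures rather than the dictionaries used here.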