Katahdin

Katahdin is a programming language where the syntax and semantics are mutable at runtime. It was the 2007 master's project of Chris Seaton at the University of Bristol Department of Computer Science. My supervisor was Dr. Henk Muller.

Katahdin employs the theory of parsing expression grammars and packrat parsing. Unlike other contemporary work, Katahdin applies these techniques at runtime to allow the grammar to be modified by a running program. New constructs such as expressions and statements can be defined, or a new language can be implemented from scratch. It is built as an interpreter on the Mono implementation of the .NET framework.

Introduction to Katahdin

In most programming languages you can define new functions and types. In the same way, Katahdin allows you to define new expressions, statements and other language constructs.

For example, most programming languages have a modulo, or remainder operator. If your language doesn't have one, creating one would involve learning about the internals of the implementation, modifying the grammar and adding semantic actions throughout the code base. You would then have to recompile and install your new implementation. Anyone else wanting to use it would have to be given a set of patches for your changes against the official implementation. In Katahdin, the complete syntax and semantics of the modulo operator can be defined from scratch in just a few lines:

class ModExpression : Expression {
    pattern {
        option leftRecursive;
        a:Expression "%" b:Expression
    }
   
    method Get() {
        a = this.a.Get...();
        b = this.a.Get...();
        return a - (b * (a / b));
    }
}

In Katahdin this is a runtime operation, so immediately after defining ModExpression the modulo operator becomes part of the language in line with all the other operators. You can define a new construct on one line and use it on the next.

There are no special constructs in Katahdin that cannot be modified, including white-space and basic tokens. If all of the constructs in a language are defined as above, Katahdin can be turned into an interpreter for that language. In this way, Katahdin can be used as a generic interpreter for any language, or one language can be used from within another by composing the two or more definitions. My implementation includes proof-of-concept language definitions for FORTRAN and Python.

There are real world scenarios where one would want to use one language from within another. People use SQL from within other languages all the time by writing the SQL in strings. In Katahdin, the definition of SQL can be composed with the main language and used as if it were all part of one language. There are also good examples of why one would want to define new constructs in a running program. For example, Java 1.5 included a new for-each-statement. Programmers wanting to use the statement had to wait for Sun's implementation to be drafted, implemented, tested and distributed. In Katahdin programmers can create new statements on their own as they need them, without modifying the implementation. If you want a new for-each-statement in your language you can add one in a minute.

Literature

A Programming Language Where the Syntax and Semantics Are Mutable at Runtime

My master's thesis describing the background, theory, implementation and current status of Katahdin. Also published as a technical report at the University of Bristol.

Slides describing Katahdin

Slides used to present my thesis.

Katahdin: Mutating a Programming Language's Syntax and Semantics at Runtime

An unpublished paper on Katahdin, written at an earlier stage in development.

Katahdin Consulting Business Plan

A hypothetical business plan for employing Katahdin, written for an associated course unit.

All of this literature is copyright © Chris Seaton, 2007.

Source Code

The Katahdin source code has been put into the public domain. It can be used for any purpose, in combination with any license.

Requires:

Download:

katahdin-0.2.tar.gz (108 KB)

Current Status

I am currently training to be an officer with the British Army so I have no opportunity to continue development. The source code is available for download above and is in the public domain so there are no licensing issues. If you are interested, please take Katahdin and study it, put it to good use, or adopt it for development.

I can still be contacted at chris@chrisseaton.com, although I will be unable to check my e-mail for long periods of time.

This page was published June 2007.

Frequently Asked Questions

Can't you already do this in Lisp?

Write me a set of Lisp macros that allow you to run an unmodified off-the shelf FORTRAN or Python program!

References to Katahdin