Introduction
Katahdin is a programming language where the syntax and semantics are mutable at runtime. It was the 2007 master's project of Chris Seaton at the University of Bristol Department of Computer Science. My supervisor was Dr. Henk Muller.
Katahdin employs the theory of parsing expression grammars and packrat parsing. Unlike other contemporary work, Katahdin applies these techniques at runtime to allow the grammar to be modified by a running program. New constructs such as expressions and statements can be defined, or a new language can be implemented from scratch. It is built as an interpreter on the Mono implementation of the .NET framework.
Introduction to Katahdin
In most programming languages you can define new functions and types. In the same way, Katahdin allows you to define new expressions, statements and other language constructs.
For example, most programming languages have a modulo, or remainder operator. If your language doesn't have one, creating one would involve learning about the internals of the implementation, modifying the grammar and adding semantic actions throughout the code base. You would then have to recompile and install your new implementation. Anyone else wanting to use it would have to be given a set of patches for your changes against the official implementation. In Katahdin, the complete syntax and semantics of the modulo operator can be defined from scratch in just a few lines:
class ModExpression : Expression {
pattern {
option leftRecursive;
a:Expression "%" b:Expression
}
method Get() {
a = this.a.Get...();
b = this.a.Get...();
return a - (b * (a / b));
}
}
In Katahdin this is a runtime operation, so immediately after
defining ModExpression the modulo operator becomes
part of the language in line with all the other operators. You can
define a new construct on one line and use it on the next.
There are no special constructs in Katahdin that cannot be modified, including white-space and basic tokens. If all of the constructs in a language are defined as above, Katahdin can be turned into an interpreter for that language. In this way, Katahdin can be used as a generic interpreter for any language, or one language can be used from within another by composing the two or more definitions. My implementation includes proof-of-concept language definitions for FORTRAN and Python.
There are real world scenarios where one would want to use one language from within another. People use SQL from within other languages all the time by writing the SQL in strings. In Katahdin, the definition of SQL can be composed with the main language and used as if it were all part of one language. There are also good examples of why one would want to define new constructs in a running program. For example, Java 1.5 included a new for-each-statement. Programmers wanting to use the statement had to wait for Sun's implementation to be drafted, implemented, tested and distributed. In Katahdin programmers can create new statements on their own as they need them, without modifying the implementation. If you want a new for-each-statement in your language you can add one in a minute.
Literature
A Programming Language Where the Syntax and Semantics Are Mutable at Runtime
My master's thesis describing the background, theory, implementation and current status of Katahdin. Also published as a technical report at the University of Bristol.
Slides describing Katahdin
Slides used to present my thesis.
Katahdin: Mutating a Programming Language's Syntax and Semantics at Runtime
An unpublished paper on Katahdin, written at an earlier stage in development.
Katahdin Consulting Business Plan
A hypothetical business plan for employing Katahdin, written for an associated course unit.
All of this literature is copyright © Chris Seaton, 2007.
Source Code
The Katahdin source code has been put into the public domain. It can be used for any purpose, in combination with any license.
Requires:
- Mono ≥ 1.2.3.1
- Gtk+ ≥ 2.6.10 (only if you want to build the debugger)
- Gtk# ≥ 2.4.3 (only if you want to build the debugger)
Download:
Current Status
After graduating I commissioned into the Royal Army Medical Corps and so I have no opportunity to continue development. The source code is available for download above and is in the public domain so there are no licensing issues. If you are interested, please take Katahdin and study it, put it to good use, or adopt it for development.
I can still be contacted at chris@chrisseaton.com, although I will be unable to check my e-mail for long periods of time.
This page was published June 2007.
Frequently Asked Questions
Can't you already do this in Lisp?
Write me a set of Lisp macros that allow you to run an unmodified off-the shelf FORTRAN or Python program!
References to Katahdin
- Bryan Ford's Packrat Parsing and Parsing Expression Grammars Page
- Lambda the Ultimate - Katahdin: Modifying your programming language as it runs
- reddit - Katahdin is a language where the syntax is mutable at runtime
- dzone - Katahdin: A Language with Runtime-mutable Syntax
- Digg - Katahdin: Modify your programming language as it runs
- Yukihiro Matsumoto's (designer of the Rugy language) blog post referring to Katahdin