Somewhere between a meta-language, a programming language, and a rubbish lister https://catseye.tc/node/Tamsin

Tamsin

Tamsin is an oddball little language that can't decide if it's a meta-language, a programming language, or a rubbish lister.

Its primary goal is to allow the rapid development of parsers, static analyzers, interpreters, and compilers, and to allow them to be expressed compactly. Golf your grammar! (Or write it like a decent human being, if you must.)

The current released version of Tamsin is 0.5-2017.0502. As the 0.x version number indicates, it is a work in progress, with the usual caveat that things may change rapidly (and that version 0.6 might look completely different). See HISTORY for a list of major changes.

Code Examples

Make a story more exciting in 1 line of code:

main = ("." & '!' | "?" & '?!' | any)/''.
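For readers new to the notation, here is a rough Python equivalent of what this one-liner appears to do. It is only a sketch: it assumes the rule handles one character per iteration and that the `/''` fold concatenates each result onto an initially empty string. The function name `excite` is made up for illustration and is not part of Tamsin.

```python
def excite(text):
    """Replace each '.' with '!' and each '?' with '?!'; copy everything else."""
    pieces = []
    for ch in text:
        if ch == ".":
            pieces.append("!")
        elif ch == "?":
            pieces.append("?!")
        else:
            pieces.append(ch)  # plays the role of the `any` alternative
    # Joining onto an empty string plays the role of the /'' fold.
    return "".join(pieces)


print(excite("It rained. Did it stop?"))
```

The comparison gives a feel for the compactness on offer: the Tamsin version states the three alternatives and the fold, and nothing else.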

Parse an algebraic expression for syntactic correctness in 4 lines of code:

main = (expr0 & eof & 'ok').
expr0 = expr1 & {"+" & expr1}.
expr1 = term & {"*" & term}.
term = "x" | "y" | "z" | "(" & expr0 & ")".
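To show how closely the four productions map onto a conventional recursive-descent recognizer, here is a hand-written Python 3 sketch of the same grammar. The names `check`, `peek`, `expect`, and `ParseError` are hypothetical helpers, and the sketch signals failure by raising an exception, which is an assumption of this example rather than a description of how Tamsin reports failure.

```python
class ParseError(Exception):
    """Raised when the input does not match the grammar."""


def check(text):
    pos = 0  # index of the next unread character

    def peek():
        return text[pos] if pos < len(text) else None

    def expect(ch):
        nonlocal pos
        if peek() != ch:
            raise ParseError("expected %r at position %d" % (ch, pos))
        pos += 1

    def expr0():  # expr0 = expr1 & {"+" & expr1}.
        expr1()
        while peek() == "+":
            expect("+")
            expr1()

    def expr1():  # expr1 = term & {"*" & term}.
        term()
        while peek() == "*":
            expect("*")
            term()

    def term():  # term = "x" | "y" | "z" | "(" & expr0 & ")".
        nonlocal pos
        if peek() in ("x", "y", "z"):
            pos += 1
        else:
            expect("(")
            expr0()
            expect(")")

    expr0()
    if pos != len(text):  # corresponds to `eof`
        raise ParseError("unexpected input at position %d" % pos)
    return "ok"


print(check("x+y*(z+x)"))
```

Each Python function corresponds to one production, which is the property the Tamsin version gets for free.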

Translate an algebraic expression to RPN (Reverse Polish Notation) in 7 lines of code:

main = expr0 → E & walk(E).
expr0 = expr1 → E1 & {"+" & expr1 → E2 & E1 ← add(E1,E2)} & E1.
expr1 = term → E1 & {"*" & term → E2 & E1 ← mul(E1,E2)} & E1.
term = "x" | "y" | "z" | "(" & expr0 → E & ")" & E.
walk(add(L,R)) = walk(L) → LS & walk(R) → RS & return LS+RS+' +'.
walk(mul(L,R)) = walk(L) → LS & walk(R) → RS & return LS+RS+' *'.
walk(X) = return ' '+X.
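The same two-phase idea (build a tree while parsing, then walk it) can be sketched in Python 3 as follows. Tuples such as `("add", left, right)` stand in for the `add(...)` and `mul(...)` terms; `to_rpn` and its inner helpers are hypothetical names chosen for this example. Like the Tamsin program, the sketch does not check for end of input.

```python
def to_rpn(text):
    pos = 0  # index of the next unread character

    def peek():
        return text[pos] if pos < len(text) else None

    def take(ch):
        nonlocal pos
        if peek() != ch:
            raise ValueError("expected %r at position %d" % (ch, pos))
        pos += 1

    def expr0():
        node = expr1()
        while peek() == "+":
            take("+")
            node = ("add", node, expr1())  # left-associative, like E1 <- add(E1,E2)
        return node

    def expr1():
        node = term()
        while peek() == "*":
            take("*")
            node = ("mul", node, term())
        return node

    def term():
        ch = peek()
        if ch in ("x", "y", "z"):
            take(ch)
            return ch
        take("(")
        node = expr0()
        take(")")
        return node

    def walk(node):
        # Operands first, then the operator: that is all RPN requires.
        if isinstance(node, tuple):
            op, left, right = node
            symbol = " +" if op == "add" else " *"
            return walk(left) + walk(right) + symbol
        return " " + node

    return walk(expr0())


print(to_rpn("x+y*z"))
```

As in the Tamsin version, every token is emitted with a leading space, so the result begins with a space.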

Parse a CSV file (handling quoted commas and quotes correctly) and write out the 2nd-last field of each record — in 11 lines of code:

main = line → L & L ← lines(nil, L) &
       {"\n" & line → M & L ← lines(L, M)} & extract(L) & ''.
line = field → F & {"," & field → G & F ← fields(G, F)} & F.
field = strings | bare.
strings = string → T & {string → S & T ← T + '"' + S} & T.
string = "\"" & (!"\"" & any)/'' → T & "\"" & T.
bare = (!(","|"\n") & any)/''.
extract(lines(Ls, L)) = extract(Ls) & extract_field(L).
extract(L) = L.
extract_field(fields(L, fields(T, X))) = print T.
extract_field(X) = X.
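For comparison, the task itself can be sketched in Python 3 with the standard `csv` module, which already handles quoted commas and doubled quotes. This is not a translation of the Tamsin program: the function name `second_last_fields` is made up, and the way short records are treated (skipped when they have fewer than two fields) is an assumption of this sketch that may differ from the program above.

```python
import csv
import io


def second_last_fields(csv_text):
    """Return the second-to-last field of every record that has one."""
    reader = csv.reader(io.StringIO(csv_text, newline=""))
    results = []
    for record in reader:
        if len(record) >= 2:  # assumption: shorter records are skipped
            results.append(record[-2])
    return results


sample = 'a,"b,c",d\n"say ""hi""",x,y\n'
for field in second_last_fields(sample):
    print(field)
```

The Python version is shorter only because the parsing lives in a library; the Tamsin version spells out the whole CSV grammar in its 11 lines.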

Evaluate an (admittedly trivial) S-expression based language in 15 lines of code:

main = sexp → S using scanner & reverse(S, nil) → SR & eval(SR).
scanner = ({" "} & ("(" | ")" | $:alnum/'')) using $:utf8.
sexp = $:alnum | list.
list = "(" & sexp/nil/pair → L & ")" & L.
head(pair(A, B)) = A.
tail(pair(A, B)) = B.
cons(A, B) = return pair(A, B).
eval(pair(head, pair(X, nil))) = eval(X) → R & head(R).
eval(pair(tail, pair(X, nil))) = eval(X) → R & tail(R).
eval(pair(cons, pair(A, pair(B, nil)))) =
   eval(A) → AE & eval(B) → BE & return pair(AE, BE).
eval(X) = X.
reverse(pair(H, T), A) = reverse(H, nil) → HR & reverse(T, pair(HR, A)).
reverse(nil, A) = A.
reverse(X, A) = X.
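The behavior of this little language can be approximated in Python 3 as shown below. The sketch is hedged in several ways: it uses Python lists for source expressions and tuples for pairs, so it needs no equivalent of the `reverse` step; the names `tokenize`, `parse`, `evaluate`, and `as_pair` are hypothetical; and error handling for malformed input is deliberately minimal.

```python
import re


def tokenize(text):
    # A token is a parenthesis or a run of alphanumeric characters.
    return re.findall(r"[()]|[A-Za-z0-9]+", text)


def parse(tokens):
    """Consume one S-expression from the front of the token list."""
    token = tokens.pop(0)
    if token == "(":
        items = []
        while tokens[0] != ")":
            items.append(parse(tokens))
        tokens.pop(0)  # discard the closing ")"
        return items
    return token


def as_pair(value):
    if not isinstance(value, tuple):
        raise ValueError("expected a pair, got %r" % (value,))
    return value


def evaluate(expr):
    if isinstance(expr, list) and expr:
        op = expr[0]
        if op == "head" and len(expr) == 2:
            return as_pair(evaluate(expr[1]))[0]
        if op == "tail" and len(expr) == 2:
            return as_pair(evaluate(expr[1]))[1]
        if op == "cons" and len(expr) == 3:
            return (evaluate(expr[1]), evaluate(expr[2]))
    return expr  # anything else evaluates to itself, like eval(X) = X


print(evaluate(parse(tokenize("(head (cons a b))"))))
```

As in the Tamsin program, only `head`, `tail`, and `cons` are given meaning; everything else is passed through unchanged.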

Interpret a small subset of Tamsin in 30 lines of code (not counting the included batteries).

Compile Tamsin to C in 563 lines of code (again, not counting the included batteries).

For more information

If the above has piqued your curiosity, you may want to read the specification, which contains many more small examples written to demonstrate (and test) the syntax and behavior of Tamsin.

Note that this is the current development version of the specification, and it may differ from the examples in this document.

Quick Start

The Tamsin reference repository is hosted on Codeberg.

This repository contains the reference implementation of Tamsin, called tamsin, written in Python 2.7. It can both interpret a Tamsin program and compile a program written in Tamsin to C.

The distribution also contains a Tamsin-to-C compiler written in Tamsin. It passes all the tests, and can compile itself.

While the interpreter is fine for prototyping, informal benchmarking found the compiled C programs to be about 30x faster. However, although the compiler passes all the tests, it is still largely unproven (e.g. its UTF-8 support is not RFC 3629-compliant), so it should be considered a proof of concept.

To start using tamsin:

  • Clone the repository — git clone https://codeberg.org/catseye/Tamsin
  • Either:
    • Put the repo's bin directory on your $PATH, or
    • Make a symbolic link to bin/tamsin somewhere already on your $PATH.
  • Errr... that's it.

Then you can run tamsin like so:

  • tamsin eg/csv_parse.tamsin < eg/names.csv

To use the compiler, you'll need GNU make and gcc installed. Type

  • make

to build the runtime library. You can then compile to C and compile the C to an executable and run the executable all in one step, like so:

  • tamsin loadngo eg/csv_extract.tamsin < eg/names.csv

Design Goals

  • Allow parsers, static analyzers, interpreters, and compilers to be quickly prototyped. (And in the future, processor simulators, VMs, and similar things too.)
  • Allow writing these things very compactly.
  • Allow writing anything using only recursive-descent parsing techniques (insofar as this is possible).
  • Allow writing parsers that look very similar to the grammar of the language being parsed, so that the structure of the language can be clearly seen.
  • Provide means to solve practical problems.
  • Keep the language simple — the grammar should fit on a page, ideally.
  • Recognize that the preceding two goals are in tension.
  • Have a relatively simple reference implementation (currently less than 5 KLoC, including everything — debugging support and the C runtime used by the compiler and the Tamsin modules and implementations.)

License

BSD-style license; see the file LICENSE.