Synbil: what?

I think I've stumbled across a way to build languages that is more portable by default than the traditional methods. Its other potential benefits, such as fully embedding a language as a library, have me quite excited about speeding up famously slow languages.

To explain what it is that I found, I have to explain a common pattern that I'd run into when developing compilers.

So here's the scenario:

I'm on system A, with language X, and want to make language Y available on A.

I write the bootstrap compiler in (or transpile it to) X, now Y is on A. I like to get things working before I get them working well, so the backend of this compiler usually is Y itself, rather than whatever assembly is on A. All well and good, but this is where the headache begins...

Now nine times out of ten, if I want a compiler for Y, I want a self-hosted compiler. But then I end up writing two compilers for the benefit of one, and enduring the usual bootstrapping headaches along the way. Why bootstrap at all? This led to a provocative question.

What if we shrank the compiler by building syntax in a metalanguage?

This would certainly fix the problem at hand. Instead of "building a compiler" for language X, make a library in metalanguage M that implements the precedence and parsing, make a library that implements macros for the syntax to transform it into Y, and you're done!

No more compiler-bootstrap and compiler in X and Y, just two versions in M, one 100% in language X, and one mostly in language Y, with the rest being the implementation of Y, written in X.
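To make the "syntax as a library" idea a bit more concrete, here is a minimal sketch in Python rather than in any real metalanguage. Everything in it is an assumption for illustration: the names define_infix, INFIX and translate are made up, and the "target language" is just strings in a call-like form. The point is only that precedence and the transforming macro live in a table the library fills in, not in the parser itself.

```python
import re

# Hypothetical "syntax library": each operator is registered with a binding
# power (its precedence) and a macro that rewrites its operands into target code.
INFIX = {}  # symbol -> (binding_power, macro)

def define_infix(symbol, power, macro):
    """Register one piece of syntax; the parser itself does not change."""
    INFIX[symbol] = (power, macro)

def tokenize(text):
    # Numbers, identifiers, or any single non-space character.
    return re.findall(r"\d+|\w+|\S", text)

def parse(tokens, min_power=0):
    """Precedence climbing driven entirely by the INFIX table."""
    left = tokens.pop(0)  # an atom: a number or an identifier
    while tokens and tokens[0] in INFIX and INFIX[tokens[0]][0] >= min_power:
        power, macro = INFIX[tokens.pop(0)]
        right = parse(tokens, power + 1)  # +1 makes the operator left-associative
        left = macro(left, right)
    return left

def translate(source):
    return parse(tokenize(source))

# The "language" is nothing more than these library calls.
define_infix("+", 10, lambda a, b: f"add({a}, {b})")
define_infix("*", 20, lambda a, b: f"mul({a}, {b})")

print(translate("1 + 2 * 3"))  # add(1, mul(2, 3))
```

Adding another operator later is one more define_infix call, which is the property we're after: the syntax grows without anyone touching the parser.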

Thinking about this led to more provocative questions, mostly centered around the idea of making a language not just portable, but downright pocket-able.

Why not build syntax incrementally?

Why not write syntax in our code?

How small can the metalanguage be?

and maybe even,

Can we implement types in userspace?

We'll talk about all of these.

All of these ran into immediate problems, which I'll go through now, one by one, as if we were designing it together, before we talk about specific design.

Problem 1: Now we have to deal with a metalanguage

I know, not ideal. We'll settle by making it as small as possible. We'll decide now that only primitives for syntax-building are going to be included, whatever those look like.

Problem 2: Writing syntax uses syntax

On the one hand, it'd be nice to implement a language as a library. But how does one go about implementing the syntax? We need code that runs, and to write explicit parser code in userspace would put us right at square one.

Instead, we imagine a metalanguage where the compiler and programmer are in constant communication, with the programmer explicitly building some sort of parser throughout the metaprogram. I've thought of a few ways to do this:

  1. Every time symbols are added, build an entire BNF tree. Metacompiler generates parser from BNF.
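Here is a rough sketch of option 1 in Python, under some loud assumptions. The Grammar class and its methods are hypothetical, tokens are whitespace-separated words, and the grammar must not be left-recursive. It also interprets the BNF directly instead of generating a parser from it, which is a simplification of what the metacompiler would do. What it does show is the incremental part: each add call extends the grammar, and the next match uses the whole grammar as it stands.

```python
class Grammar:
    """Hypothetical grammar store: rule name -> list of alternatives.

    Each alternative is a list of parts; a part is a rule name if one is
    defined, otherwise a literal token.
    """

    def __init__(self):
        self.rules = {}

    def add(self, name, *alternatives):
        """Extend a rule with new alternatives (space-separated parts)."""
        self.rules.setdefault(name, []).extend(a.split() for a in alternatives)

    def match(self, name, tokens, pos=0):
        """Yield every position where rule `name` can end, starting at pos."""
        for alt in self.rules[name]:
            yield from self._seq(alt, tokens, pos)

    def _seq(self, parts, tokens, pos):
        if not parts:
            yield pos
            return
        head, rest = parts[0], parts[1:]
        if head in self.rules:  # nonterminal: try every way it can match
            for mid in self.match(head, tokens, pos):
                yield from self._seq(rest, tokens, mid)
        elif pos < len(tokens) and tokens[pos] == head:  # literal token
            yield from self._seq(rest, tokens, pos + 1)

    def accepts(self, start, text):
        tokens = text.split()
        return any(end == len(tokens) for end in self.match(start, tokens))


g = Grammar()
g.add("atom", "x", "y")
g.add("expr", "atom + expr", "atom")
before = g.accepts("expr", "x * y")  # '*' is not syntax yet

g.add("expr", "atom * expr")         # the program adds a symbol
after = g.accepts("expr", "x * y")   # the same input now parses
```

The backtracking matcher is only there to keep the sketch short; a real metacompiler would presumably want something faster and would have to decide what to do about ambiguity.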

Problem 3: Imports require a clean namespace

Modular programming, imports, and code reuse all rest on the assumption that one's namespace is kept clean, to prevent clashes between value, module, or type names.

But we're not importing variable names here; it's much more inconvenient than that! We're importing syntax names, which means symbol clashes need to be resolved before the import, and a symbol may contain several identifiers (think about the symbol if{_}else{_}). Let's work through some possible solutions here: