I wrote a programming language. Here’s how you can, too.
There are two major types of languages: compiled and interpreted.
If you are writing an interpreted language, it makes a lot of sense to write it in a compiled one (like C, C++ or Swift).
A programming language is generally structured as a pipeline. That is, it has several stages.
Each stage has data formatted in a specific, well-defined way. It also has functions to transform data from each stage to the next.
The first step in most programming languages is lexing, or tokenizing. ‘Lex’ is short for lexical analysis, a very fancy term for splitting a bunch of text into tokens.
The word ‘tokenizer’ makes a lot more sense, but ‘lexer’ is so much fun to say that I use it anyway.
A token is a small unit of a language.
A token might be a variable or function name (AKA an identifier), an operator or a number.
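To make the idea of a token concrete, here is a minimal lexer sketch in Python. It is not Pinecone's lexer; the token kinds (`NUMBER`, `IDENT`, `OP`), the `Token` type and the `lex` function are all names made up for this illustration, and the rules only cover a tiny expression language.

```python
import re
from typing import List, NamedTuple


class Token(NamedTuple):
    kind: str  # "NUMBER", "IDENT" or "OP" (hypothetical kinds for this sketch)
    text: str


# Hypothetical token rules for a tiny expression language.
TOKEN_RE = re.compile(
    r"(?P<SKIP>\s+)"
    r"|(?P<NUMBER>\d+(?:\.\d+)?)"
    r"|(?P<IDENT>[A-Za-z_]\w*)"
    r"|(?P<OP>[-+*/=()])"
)


def lex(source: str) -> List[Token]:
    """Split source text into a flat, ordered list of tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        if match.lastgroup != "SKIP":  # whitespace separates tokens but is not one
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

Calling `lex("x = 3 + 42")` gives five tokens: an identifier, an operator, a number, another operator and another number. Note that the output is still a flat list; nothing here knows that `+` binds two operands.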
The parser turns a list of tokens into a tree of nodes. A tree used for storing this type of data is known as an Abstract Syntax Tree, or AST.
At least in Pinecone, the AST does not have any info about types or which identifiers are which. It is simply structured tokens.
The parser adds structure to the ordered list of tokens the lexer produces. To avoid ambiguity, the parser must take into account parentheses and the order of operations.
Simply parsing operators isn’t terribly difficult, but as more language constructs get added, parsing can become very complex.
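As a rough illustration of parsing operators with parentheses and order of operations, here is a small sketch using precedence climbing, one common hand-written technique. It is an assumption for illustration, not necessarily how Pinecone parses: the `PRECEDENCE` table, the `parse` function and the choice to represent AST nodes as nested tuples `(operator, left, right)` are all hypothetical. Tokens are plain strings to keep the example self-contained.

```python
from typing import List

# Hypothetical precedence table: higher numbers bind tighter.
PRECEDENCE = {"+": 1, "-": 1, "*": 2, "/": 2}


def parse(tokens: List[str]):
    """Turn a flat token list into a nested (op, left, right) tuple tree."""
    pos = 0

    def parse_atom():
        nonlocal pos
        if pos >= len(tokens):
            raise SyntaxError("unexpected end of input")
        token = tokens[pos]
        pos += 1
        if token == "(":
            node = parse_expr(1)  # a parenthesized group restarts at lowest precedence
            if pos >= len(tokens) or tokens[pos] != ")":
                raise SyntaxError("expected ')'")
            pos += 1
            return node
        return token  # a number or identifier is a leaf

    def parse_expr(min_prec):
        nonlocal pos
        left = parse_atom()
        # Keep absorbing operators that bind at least as tightly as min_prec.
        while pos < len(tokens) and PRECEDENCE.get(tokens[pos], 0) >= min_prec:
            op = tokens[pos]
            pos += 1
            # "+ 1" makes operators of equal precedence left-associative.
            right = parse_expr(PRECEDENCE[op] + 1)
            left = (op, left, right)
        return left

    tree = parse_expr(1)
    if pos != len(tokens):
        raise SyntaxError(f"unexpected token {tokens[pos]!r}")
    return tree
```

With this sketch, `parse(["1", "+", "2", "*", "3"])` nests the multiplication under the addition, while wrapping `1 + 2` in parentheses flips that. Four operators fit in a few dozen lines; things like function calls, blocks and unary operators are where the complexity starts to pile up.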
The predominant parsing library is Bison. Bison works a lot like Flex, the lexer generator: you write a file in a custom format that stores the grammar information, then Bison uses that to generate a C program that will do your parsing. I chose not to use Bison.
Put simply, the action tree is the AST with context. That context is info such as what type a function returns, or that two places in which a variable is used are in fact using the same variable.
Because it needs to figure out and remember all this context, the code that generates the action tree needs lots of namespace lookup tables and other thingamabobs.
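Here is a minimal sketch of what "adding context" could look like, assuming the AST is a nested tuple `(operator, left, right)` with strings as leaves. Everything in it is hypothetical and much simpler than a real implementation: the `Variable`, `Literal`, `VariableUse` and `BinaryOp` classes, the `build_action` function, the type names `Int` and `Float`, and the single flat dictionary standing in for the namespace lookup tables.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Variable:
    name: str
    type_name: str


@dataclass
class Literal:
    text: str
    type_name: str


@dataclass
class VariableUse:
    variable: Variable  # every use of "x" points at the same Variable object

    @property
    def type_name(self) -> str:
        return self.variable.type_name


@dataclass
class BinaryOp:
    op: str
    left: object
    right: object
    type_name: str  # the type this operation returns


def build_action(ast, namespace: Dict[str, Variable]):
    """Resolve identifiers and types in an AST, producing an action tree."""
    if isinstance(ast, tuple):
        op, left_ast, right_ast = ast
        left = build_action(left_ast, namespace)
        right = build_action(right_ast, namespace)
        if left.type_name != right.type_name:
            raise TypeError(f"cannot apply {op!r} to {left.type_name} and {right.type_name}")
        return BinaryOp(op, left, right, left.type_name)
    if ast[0].isdigit():  # leaf that starts with a digit: a numeric literal
        return Literal(ast, "Float" if "." in ast else "Int")
    if ast not in namespace:  # leaf that is an identifier: look it up
        raise NameError(f"unknown identifier {ast!r}")
    return VariableUse(namespace[ast])
```

Given a namespace that declares `x` as an `Int`, building `("+", "x", ("*", "x", "2"))` yields a tree whose root knows it returns `Int` and whose two uses of `x` refer to the very same `Variable` object. The AST alone could not tell you either of those things.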
Once we have the action tree, running the code is easy. Each action node has a function ‘execute’ which takes some input, does whatever the action should do (including possibly calling sub-actions) and returns the action’s output. This is the interpreter in action.
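A tree-walking interpreter of this shape can be sketched in a few lines. The class names (`Action`, `Constant`, `ReadVariable`, `BinaryOp`) and the idea of passing variable values in as a dictionary called `env` are assumptions made for this example, not Pinecone's actual design.

```python
import operator
from typing import Callable, Dict


class Action:
    """Base class: every node in the action tree can be executed."""

    def execute(self, env: Dict[str, float]) -> float:
        raise NotImplementedError


class Constant(Action):
    def __init__(self, value: float):
        self.value = value

    def execute(self, env):
        return self.value


class ReadVariable(Action):
    def __init__(self, name: str):
        self.name = name

    def execute(self, env):
        return env[self.name]  # look the variable's current value up in env


class BinaryOp(Action):
    def __init__(self, func: Callable[[float, float], float], left: Action, right: Action):
        self.func = func
        self.left = left
        self.right = right

    def execute(self, env):
        # Execute both sub-actions first, then combine their outputs.
        return self.func(self.left.execute(env), self.right.execute(env))


# Action tree for the expression 1 + x * 3
tree = BinaryOp(
    operator.add,
    Constant(1),
    BinaryOp(operator.mul, ReadVariable("x"), Constant(3)),
)
result = tree.execute({"x": 4})  # 1 + 4 * 3 == 13
```

Running the program is just one call to `execute` on the root; each node recursively asks its children for their values. That simplicity is the appeal of this approach, at the cost of a function call per node every time the code runs.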