Crude v2 specification

Version 2021-10-21

Crude was designed and implemented by bergsans as a vehicle for learning about parsers and interpreters. It is an interpreted imperative language in the style of JavaScript.

Tokens

The following are reserved words in the language: change clear concat convert define false for if length let print return set sleep slice true.

Any maximal sequence of ASCII letters (uppercase or lowercase) that is not a reserved word is classified as an identifier.

The following operators are recognized: + - * / % ^ = == != < <= > >= ! && ||.

The following delimiters are recognized: ( ) [ ] { }.

The following separators are recognized: , ; and the end-of-file token.

An integer is any sequence of one or more digits.

Strings are enclosed in double quotes ("); within them, any Unicode character may appear except for a double quote. The backslash character (\) is not treated specially (in particular, it doesn't impose a different interpretation of the next character); it only represents itself literally.

Whitespace is allowed (and ignored) between tokens. The recognized whitespace characters are space (0x20), line feed (0x0A), carriage return (0x0D), and horizontal tab (0x09). Other forms of (ASCII or Unicode) whitespace are not allowed between tokens. Strings can contain whitespace (or any character except for `"`); other token types, by construction, cannot.

Any character which doesn't match any of the above token categories, or whitespace, is disallowed, and causes a tokenization error to be issued; in this case, the program is not processed further.
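The tokenization rules above can be sketched as a small regex-driven scanner. This is a minimal illustration in Python, not the reference implementation: the token kind names, the (kind, text) tuple shape, the use of SyntaxError for tokenization errors, and the restriction of digits to ASCII 0-9 are all assumptions made for the example.

```python
import re

RESERVED = {
    "change", "clear", "concat", "convert", "define", "false", "for", "if",
    "length", "let", "print", "return", "set", "sleep", "slice", "true",
}

# Two-character operators are listed before their one-character prefixes
# so that "<=" is not split into "<" and "=".
TOKEN_RE = re.compile(r"""
    (?P<ws>[ \n\r\t]+)
  | (?P<word>[A-Za-z]+)
  | (?P<integer>[0-9]+)
  | (?P<string>"[^"]*")
  | (?P<operator>==|!=|<=|>=|&&|\|\||[-+*/%^=<>!])
  | (?P<delimiter>[()\[\]{}])
  | (?P<separator>[,;])
""", re.VERBOSE)


def tokenize(source):
    """Return a list of (kind, text) tuples, ending with an end-of-file token."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = TOKEN_RE.match(source, pos)
        if match is None:
            # No token category and no whitespace rule applies: stop processing.
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        kind, text = match.lastgroup, match.group()
        pos = match.end()
        if kind == "ws":
            continue  # whitespace between tokens is ignored
        if kind == "word":
            # A maximal run of letters is an identifier unless it is reserved.
            kind = "reserved" if text in RESERVED else "identifier"
        tokens.append((kind, text))
    tokens.append(("eof", ""))
    return tokens
```

Because identifiers are maximal runs of letters, a word such as letx comes out as a single identifier rather than the reserved word let followed by x. An unterminated string fails at its opening quote, since no rule matches a lone double quote.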

Program structure

A program is a sequence of zero or more statements. Statements come in seven kinds, described below.

Blocks and environments

A block is not a kind of statement, but is used by some of the statement types. In a program, a block is a sequence of a { token, zero or more statements, and a } token.

An environment is a data structure used by the runtime to store bindings of variables to values. Program execution starts in a dedicated environment called "the global environment" (specified below), but as different statements run, more environments are created. Unless otherwise stated, statements are executed in the current environment, which starts out being the global environment.

When lookup happens for a variable binding (in order to read or write it), first the current environment is searched, then its parent, and so on up to the global environment. If lookup fails to find a binding in any of the environments, a runtime error is signaled.

Whenever a block executes, it creates a child environment of the current environment, and evaluates its statements in this new environment, in order.
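The lookup rule can be illustrated with a chain of environments, each holding its own bindings and a reference to its parent. The following Python sketch is one possible representation; the class name Environment, its method names, and the use of NameError for the runtime error are assumptions made for the example.

```python
class Environment:
    """A set of variable bindings plus a link to the enclosing environment."""

    def __init__(self, parent=None):
        self.bindings = {}
        self.parent = parent  # None marks the global environment

    def child(self):
        # Executing a block would start by creating one of these.
        return Environment(parent=self)

    def bind(self, name, value):
        self.bindings[name] = value

    def find(self, name):
        """Return the nearest environment that binds name, searching outward."""
        env = self
        while env is not None:
            if name in env.bindings:
                return env
            env = env.parent
        raise NameError(f"unbound variable: {name}")

    def get(self, name):
        return self.find(name).bindings[name]
```

A binding in an inner environment shadows an outer binding of the same name, because the search stops at the first environment that contains it.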

if statement

Grammar

if ( Expression ) Block

(Expression and Block are defined below.)

At runtime, an if statement first evaluates Expression and then, if the result is truthy, executes Block. The result of the if is the result of the last statement of Block if it executed, otherwise undefined.

for statement

Grammar

for ( Identifier , LowerBound , UpperBound ) Block

LowerBound and UpperBound are both Expressions.

At runtime, a for statement first creates a child environment and binds Identifier to LowerBound. Then it executes the following loop:

The result of a for loop is undefined.

let statement

Grammar

let Identifier = Expression ;

At runtime, a let statement evaluates Expression, then creates a child environment and binds Identifier to the result of the evaluation. The child environment with the new binding is used for executing the rest of the statements in the block. The result of a let statement is undefined.

set statement

Grammar

set Identifier = Expression ;

At runtime, a set statement looks up Identifier; call the environment where it is found E. The statement then evaluates Expression, and then binds Identifier in E to the result. The result of a set statement is the result of the evaluation.
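The difference between let and set is easiest to see side by side: let adds a binding in a fresh child environment, while set overwrites the binding in whichever environment already holds it. The Python sketch below assumes a hypothetical Env record and models the Expression as a zero-argument callable, so that the order of lookup and evaluation described above is preserved; none of these names come from the Crude implementation.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Env:
    bindings: dict = field(default_factory=dict)
    parent: Optional["Env"] = None


def exec_let(env, name, evaluate):
    """Evaluate first, then bind in a new child; the caller continues in the child."""
    value = evaluate()
    return Env({name: value}, parent=env)


def exec_set(env, name, evaluate):
    """Find the environment E that binds name, then evaluate and rebind in E."""
    e = env
    while e is not None and name not in e.bindings:
        e = e.parent
    if e is None:
        # Lookup fails before the expression is ever evaluated.
        raise NameError(f"unbound variable: {name}")
    value = evaluate()
    e.bindings[name] = value
    return value  # the result of set is the result of the evaluation
```

With this model, a let of an existing name leaves the outer binding untouched, whereas a set from a nested environment changes the outer binding for every environment that can see it.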

define statement

Grammar

define Identifier ( Parameter , ... ) Block

There can be zero or more Parameters; each Parameter is an Identifier.

At runtime, a define statement constructs a closure, a 3-tuple with the Parameters, the Block and the current environment. This closure is bound to Identifier in the current environment. The result of a define statement is undefined.
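As a rough illustration of the 3-tuple, the following Python sketch builds a closure and binds it in the defining environment. The field names, the plain dict standing in for an environment, the opaque body value, and the use of None for the undefined result are assumptions made for the example.

```python
from collections import namedtuple

# The three components named above: Parameters, Block, defining environment.
Closure = namedtuple("Closure", ["params", "body", "env"])


def exec_define(env, name, params, body):
    """Bind name to a new closure in env; env is a plain dict in this sketch."""
    env[name] = Closure(tuple(params), body, env)
    return None  # stand-in for the undefined result of a define statement
```

Since the closure captures the same environment in which its name is bound, the function's own name is reachable from the captured environment, which is what would allow a function to call itself.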

return statement

Grammar

return Expression ;

At runtime, a return statement in a function evaluates Expression, then immediately hands over execution to the function's caller, without executing any more statements in the function, returning the result of the evaluation as the result of the entire function call. A return statement outside of any function similarly aborts the entire program, letting the result of the evaluation be the result of the entire program. A return statement has no result.
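One common way to get this immediate hand-over in a tree-walking interpreter is to unwind with an exception that carries the returned value. The Python sketch below shows that technique under stated assumptions: statements are modeled as zero-argument callables, ReturnSignal and the helper names are hypothetical, and the value of a function that finishes without a return (None here) is a guess, since this section does not specify it. It is not necessarily how Crude itself is implemented.

```python
class ReturnSignal(Exception):
    """Carries the value of a return statement up to the enclosing call."""

    def __init__(self, value):
        super().__init__(value)
        self.value = value


def exec_return(value):
    # Unwinds through any nested blocks until a function call catches it.
    raise ReturnSignal(value)


def call_function(statements):
    """Run statements in order until one of them signals a return."""
    try:
        for statement in statements:
            statement()
    except ReturnSignal as signal:
        return signal.value
    return None  # assumption: no return statement was executed
```

The same signal, caught around the top-level statement loop instead of around a function call, would give the described behavior of a return outside any function ending the whole program.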

Expression statement

Grammar

Expression ;

At runtime, an expression statement evaluates Expression. The result of the statement is the result of the evaluation.