The Metaprogramming Dilemma

Ginger Bill
Designing this language has been difficult but fun. Two of the original goals of this language were simplicity and metaprogramming; however, taken together these could be an oxymoron. But before I explain why, I first need to explain what I mean by "metaprogramming".

Metaprogramming is the "art" of writing programs that treat other programs as their data. This means that a program could generate, read, analyse, and transform code, or even itself, to achieve a certain solution. The approaches to metaprogramming can be split into a few distinct categories:

  • Introspection (and Reflection for OOP languages)
  • Compile Time Execution (CTE)
  • Template Programming
  • Macros (Textual and Syntactic)
  • Parametric Polymorphism ("Generics")

Many languages have metaprogramming functionalities: C has textual macros; C++ has textual macros, a functional templating language, and rudimentary introspection; Nim has all of the above; Go has external textual "templates/macros". Each approach has its advantages and disadvantages but all can be used together to achieve different solutions and results.

Introspection is already part of the language and is functionality I think is necessary in a "modern" language. It is needed to have something like a "type-safe printf" (at runtime) and the ability to serialize data with ease. However, introspection does require extra memory to store the type information. (n.b. Reflection is only appropriate for object-oriented programming languages, which this language is not.)
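As a rough illustration of what a runtime "type-safe printf" relies on, here is a minimal sketch in C. It is not how Odin implements introspection; the names `TypeInfo`, `Any`, `any_int` and `format_any` are hypothetical. In a language with built-in introspection the compiler would fill in the type information itself, whereas here the caller has to build it by hand. The `type_table` array stands in for the extra memory that type information costs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

typedef enum { TYPE_INT, TYPE_F64, TYPE_STRING } TypeKind;

/* Runtime type information. This table is the extra memory that
   introspection needs. (Hypothetical layout, for illustration only.) */
typedef struct {
    TypeKind kind;
    const char *name;
    size_t size;
} TypeInfo;

/* Indexed by TypeKind, so the order must match the enum. */
static const TypeInfo type_table[] = {
    {TYPE_INT, "int", sizeof(int)},
    {TYPE_F64, "f64", sizeof(double)},
    {TYPE_STRING, "string", sizeof(const char *)},
};

/* A value paired with a pointer to its type information. */
typedef struct {
    const TypeInfo *type;
    const void *data;
} Any;

Any any_int(const int *v) { return (Any){&type_table[TYPE_INT], v}; }
Any any_f64(const double *v) { return (Any){&type_table[TYPE_F64], v}; }
Any any_string(const char *const *v) { return (Any){&type_table[TYPE_STRING], v}; }

/* Formats one value by inspecting its type information rather than
   trusting a format string. Returns what snprintf returns, or -1 for
   an unknown kind. */
int format_any(char *buf, size_t cap, Any value) {
    assert(value.type != NULL && value.data != NULL);
    switch (value.type->kind) {
    case TYPE_INT:
        return snprintf(buf, cap, "%d", *(const int *)value.data);
    case TYPE_F64:
        return snprintf(buf, cap, "%g", *(const double *)value.data);
    case TYPE_STRING:
        return snprintf(buf, cap, "%s", *(const char *const *)value.data);
    }
    return -1;
}
```

The same `type` pointer could drive a serializer: walk the value according to its `TypeInfo` instead of writing one routine per type.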

Compile Time Execution (CTE) is an idea that I've been pondering for a while and it's already part of Jon Blow's language, Jai. It would be a stage of the compiler which runs any Odin code the user requests before the creation of the executable. The data modified and generated by this stage would be used as the initialization data for the compiled code. I have come to the conclusion that this CTE stage would need to be run through an interpreter to achieve the results needed. However, there are a few problems with this CTE stage. The main problem is that pointers will point to invalid memory addresses. This is because the memory space of the interpreter is completely different from the memory space of the executable (compiled code). Numerous types are stored with pointers internally and these values would be invalid at runtime. Due to this problem (and a few others), this powerful feature becomes extremely delicate.
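The pointer problem can be sketched in plain C by treating two buffers as the two memory spaces. This is only an illustration under assumptions: `Arena`, `PtrString`, `OffsetString` and the helper functions are hypothetical names, and storing offsets instead of addresses is one common workaround, not something proposed above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

enum { ARENA_SIZE = 64 };

/* Stand-in for one memory space (the interpreter's or the executable's). */
typedef struct {
    char bytes[ARENA_SIZE];
} Arena;

/* A string header that refers to its bytes by absolute address. */
typedef struct {
    const char *data;
    size_t len;
} PtrString;

/* The same header, referring to its bytes by offset from the arena base. */
typedef struct {
    size_t offset;
    size_t len;
} OffsetString;

/* "Compile time": store text at the start of the arena, keep its address. */
PtrString make_ptr_string(Arena *arena, const char *text) {
    size_t len = strlen(text);
    assert(len < ARENA_SIZE);
    memcpy(arena->bytes, text, len + 1);
    return (PtrString){arena->bytes, len};
}

/* "Compile time": store text at the start of the arena, keep its offset. */
OffsetString make_offset_string(Arena *arena, const char *text) {
    size_t len = strlen(text);
    assert(len < ARENA_SIZE);
    memcpy(arena->bytes, text, len + 1);
    return (OffsetString){0, len};
}

/* "Run time": resolve an offset against whichever arena is live now. */
const char *resolve(const Arena *arena, OffsetString s) {
    assert(s.offset + s.len < ARENA_SIZE);
    return arena->bytes + s.offset;
}
```

If the interpreter's arena is copied byte for byte into a second arena that plays the role of the executable's initialized data, a `PtrString` built earlier still holds the old arena's address, while an `OffsetString` resolves correctly against the new base.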

Templates can be thought of as a subset of macros. Templates are usually a simple substitution mechanism that operates on the Abstract Syntax Tree (AST). Macros, on the other hand, could be a simple text substitution system (akin to C's preprocessor) or even a compile-time AST modification and generation tool (similar to Lisp or Nim). Templates could even be an entire language built into the language (like C++'s templates). This makes them both extremely powerful but also "magic".

Parametric polymorphism, commonly referred to as "Generics" (which is a very "generic" name too), is the ability to duplicate certain "snippets" of code that have a similar structure but different types/names/etc. A basic example is a generic sorting function which accepts an array of a certain type and a sorting function for that specific type. In a language such as C, this would have to be achieved either through code duplication (copy & paste or macros), or through void pointers to remove the type information. The former method can become cumbersome and prone to mistakes, while the latter removes a lot of type safety and prevents compiler optimizations. In a language that does have "generics", this "problem" can be solved. However, the problem at hand is very rarely a generic problem, and "genericizing" the problem doesn't actually solve the original problem. "Generics" can be emulated through the use of templates and macros, which means that they may not need to be a built-in feature of a language.
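The two C approaches mentioned above can be sketched side by side. This is a minimal illustration, not a recommendation: `sort_ints_erased`, `DEFINE_INSERTION_SORT`, `sort_int` and `sort_double` are hypothetical names, and the macro version assumes the element type supports `>` rather than taking a comparison function.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Approach 1: void pointers. qsort knows nothing about the element type,
   so the comparator has to cast, and the compiler cannot check that the
   cast matches the array that was actually passed in. */
static int compare_int(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

void sort_ints_erased(int *items, size_t count) {
    qsort(items, count, sizeof items[0], compare_int);
}

/* Approach 2: code duplication via a macro. Each use stamps out a fully
   typed insertion sort, at the cost of one copy of the code per type. */
#define DEFINE_INSERTION_SORT(T, NAME)                  \
    void NAME(T *items, size_t count) {                 \
        assert(items != NULL || count == 0);            \
        for (size_t i = 1; i < count; i++) {            \
            T key = items[i];                           \
            size_t j = i;                               \
            while (j > 0 && items[j - 1] > key) {       \
                items[j] = items[j - 1];                \
                j--;                                    \
            }                                           \
            items[j] = key;                             \
        }                                               \
    }

DEFINE_INSERTION_SORT(int, sort_int)
DEFINE_INSERTION_SORT(double, sort_double)
```

Passing a `double` array to `sort_int` is rejected by the compiler, whereas handing the wrong comparator to `qsort` compiles without complaint. That is the type-safety trade-off described above.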

This now brings me to the dilemma. How far do I want to go with metaprogramming in this language? How far can I go whilst keeping Odin a simple language? Or is the very definition of metaprogramming not simple? Or should metaprogramming be left to an external (standardized) tool (like `go generate`)?

- Bill

The main difference I see between those styles of meta programming is when the transformation from meta-annotated source to actual expanded source happens:
  • introspection: at runtime
  • CTE: during the semantic pass
  • Templates & parametric polymorphism: during parsing and/or the semantic pass
  • macros: before or during lexing

I'm of the opinion that textual metaprogramming should probably remain in outside tools. Let the programmer decide which language to use when printing out the source of the program they want to compile instead of forcing them to use a specific (and possibly kludgy) text processor.
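To make "printing out the source" concrete, here is a tiny sketch of such an outside tool, written in C only because it has to be written in something. The function name `generate_max_function` and the generated `max_<type>` naming are assumptions made for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Writes the C source of a typed max function for `type_name` into `out`.
   Returns what snprintf returns: the length of the generated text, which
   is >= cap if the buffer was too small. */
int generate_max_function(char *out, size_t cap, const char *type_name) {
    assert(out != NULL && type_name != NULL);
    return snprintf(out, cap,
                    "%s max_%s(%s a, %s b) {\n"
                    "    return a > b ? a : b;\n"
                    "}\n",
                    type_name, type_name, type_name, type_name);
}
```

A build step would call this once per type, write the results to a file, and hand that file to the compiler like any other source. The compiler never needs to know a generator was involved.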

Syntactical metaprogramming (AST manipulation) could be inserted as extra passes (possibly loaded as DLLs) in the compiler, though that would require creating a good API for them. Not to mention that code which manipulates an abstract syntax tree can be pretty hard to translate back into the intention of the manipulation.

I haven't done any metaprogramming, so I am ignorant in this regard.

For generics, what is the problem to be solved? Couldn't it be solved in tooling, such as the editor? Or as part of the language?

For example in pseudo code:

fun createEntity(<T> Tx,int intx, uint uinty)
return 0;

An editor would expand this to every known type for Tx. You could make each expansion referenceable in code as createEntity_Int etc. if you don't want overloading. In fact, you might set a convention of:

fun createEntity<T>(<T> Tx,int intx, uint uinty)
return 0;

so you get essentially unique typed functions, and don't allow calling createEntity directly.

The dilemma is whether I should add these varieties of metaprogramming. The actual implementation is a detail I'm not really that fussed about at the moment. How much metaprogramming do I need in this language while still keeping the language simple?

Maybe, but to me that's a heavy reliance on external tools to solve your problem rather than just the language itself.