Deciding Upon Odin's General Declaration Syntax

The "base language" is nearly done and I will soon begin work on the "metaprogramming part" of the language. As a result, I would like to "finalize" the general declaration syntax for this language.

For this syntax, I had a few main "goals"; however, I've come to realize that having all of them is most likely impossible. The goals are:

(1) A grammar that is fast to parse (i.e. LALR(1), LL(1) or at least very small lookahead)
(2) Obvious declaration kind (no need to determine kind during semantic stage)
(3) Consistent syntax between kinds of declarations
(4) Consistent syntax between procedures, lambdas, and procedure types
(5) Minimal amount of "keywords" and "operators"
(6) Not fucking LISP

Until very recently, I had been using a Jai-like syntax for Odin. That syntax does have problems, such as inconsistencies in declarations and its restrictiveness compared to other syntax families. So I've been experimenting with a Pascal/Oberon/Go/Rust/Swift/etc. like syntax which uses "prefixes" for declarations, which I'll call Prefix-like.

n.b. Most "modern" languages seem to use the Prefix-like style but that does not necessarily mean it's good.

Jai-like syntax fails on (3), while Prefix-like fails on (4), if both adhere to (1) and (2):

// Declaration ordering

// Variable 
// Variable Inferred Type
// Multiple Variables
// Constant 
// Constant Inferred Type
// Type 
// Procedure 

// Jai-like
foo: int = 123;
foo := 123;
a, b, c: int;
bar: int : 123;
bar :: 123;
blah :: type int; // `type` could be removed if it is a record type which has a keyword prefix
doop :: proc() {}

// Prefix-like
var foo int = 123;
var foo = 123;
var a, b, c int;
const bar int = 123;
const bar = 123;
type blah int;
proc doop() {}


As shown above, Jai-like syntax is highly irregular when it comes to declarations, whilst Prefix-like is very consistent, with the prefix keyword describing the declaration kind. Prefix-like also makes it easy to add different kinds of declarations, e.g. `let`, `import`, `macro`. Jai-like could do the same but it would not be consistent.
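
A minimal sketch, in Go, of why the prefix keyword helps with goals (1) and (2): the declaration kind can be picked from the first token alone. The `declKind` function and its keyword table are hypothetical and only mirror the Prefix-like listing above; this is not Odin's actual parser.

```go
package main

import "fmt"

// declKinds maps a leading keyword to a declaration kind.
// The table is an assumption based on the Prefix-like examples above.
var declKinds = map[string]string{
	"var":    "variable",
	"const":  "constant",
	"type":   "type",
	"proc":   "procedure",
	"import": "import",
}

// declKind decides the kind of a declaration from its first token only,
// i.e. with a single token of lookahead.
func declKind(firstToken string) (string, bool) {
	kind, ok := declKinds[firstToken]
	return kind, ok
}

func main() {
	for _, tok := range []string{"var", "const", "type", "proc", "foo"} {
		kind, ok := declKind(tok)
		fmt.Printf("%-6s -> %q (declaration keyword: %v)\n", tok, kind, ok)
	}
}
```

Adding a new kind such as `let` or `macro` would, in this sketch, be one more entry in the table.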

Another advantage of Prefix-like syntax is the ability to do grouped declarations (Pascal/Go):

var x int;
var {
	a int;
	b f32;
	c string;
}
import "sys/windows.odin";
import {
	"atomic.odin";
	"fmt.odin";
	"hash.odin";
	"math.odin";
}


and from Go, the ability to have "naturally" occurring enumerations:

// Go code
type Food int
const (
	FOOD_APPLE Food = iota // 0
	FOOD_BANANA            // 1
	FOOD_CABBAGE           // 2
	FOOD_DRAGONFRUIT       // 3
)


This is one of my favourite features that arises from the Prefix-like syntax.

n.b. For more information about how Go's constant system and enumerations work, I recommend these webpages:
* https://blog.golang.org/constants
* https://splice.com/blog/iota-elegant-constants-golang/

Enumerations are still possible in Jai-like syntax but only through the addition of an `enum` type. This `enum` type can have the added bonus that you can namespace the values to that type (Fruit.APPLE), have extra type information, and string values for debugging purposes.
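
As a rough illustration of the "string values for debugging" point, Go can approximate it on top of the `iota` constants by attaching a `String` method to the type. This reuses the `Food` example from above; the `foodNames` table is an assumption added for illustration, and the values are still not namespaced to the type the way a real `enum` would allow.

```go
package main

import "fmt"

type Food int

const (
	FOOD_APPLE       Food = iota // 0
	FOOD_BANANA                  // 1
	FOOD_CABBAGE                 // 2
	FOOD_DRAGONFRUIT             // 3
)

// foodNames is a hand-written lookup table (hypothetical helper).
var foodNames = [...]string{"APPLE", "BANANA", "CABBAGE", "DRAGONFRUIT"}

// String makes Food values print as names instead of bare integers.
func (f Food) String() string {
	if f < 0 || int(f) >= len(foodNames) {
		return fmt.Sprintf("Food(%d)", int(f))
	}
	return foodNames[f]
}

func main() {
	fmt.Println(FOOD_BANANA, int(FOOD_BANANA)) // prints "BANANA 1"
	fmt.Println(Food(9))                       // prints "Food(9)"
}
```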


One of the nice advantages of Jai-like is that (4) can be adhered to:

// Jai-like
foo :: proc(x: int) {}  // procedure declaration
foo := proc(x: int) {}; // named-lambda declaration
proc(x: int)            // procedure type

// Prefix-like
proc foo(x int) {}        // procedure declaration
var foo = proc(x int) {}; // named-lambda declaration
proc(x int)               // procedure type
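
For comparison, Go is a Prefix-like language and shows the same split in practice: the procedure declaration puts the name between the keyword and the parameters, while the literal and the type do not. The names `double`, `triple` and `IntOp` below are made up for this sketch.

```go
package main

import "fmt"

// Procedure declaration: the name sits between the keyword and the parameters.
func double(x int) int { return x * 2 }

// Procedure type: the keyword is followed directly by the parameters.
type IntOp func(x int) int

// Named lambda: a variable bound to a procedure literal, which has the
// same shape as the type plus a body.
var triple = func(x int) int { return x * 3 }

func main() {
	ops := []IntOp{double, triple}
	for _, op := range ops {
		fmt.Println(op(7))
	}
}
```

Both forms are usable as an `IntOp`, so the difference is purely syntactic, which is the point of goal (4).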


Another "advantage" that I personally find pleasing to the eye about Jai-like is that the "entity's name" is first.

---

There are more things to discuss between these two syntax "families"; however, I don't want this post to be enormous :P

I would like to know your opinions about the subject and if there are any other good declaration syntaxes out there that may be a better fit for this language or even ways to improve Jai-like. In the meantime, I will stick with the recently introduced prefix-style until I finally decide upon the best syntax for this language -- I don't want to bike-shed too much ;)

Regards,
Bill

Edited by Ginger Bill on
gingerBill
As shown above, Jai-like syntax is highly irregular when it comes to declarations ...

I see no irregularities in the Jai-like declarations. Every declaration begins with one or more variable names and a colon. These are followed by a type, which is optional, if it is followed by either another colon or an equality sign and after that an expression.
What might seem irregular are 'type' and 'proc()'. But these are in my mind just part of expressions. 'type' apparently sometimes has to precede type expressions for parsing reasons and 'proc()' begins procedure literals.

Am I wrong?
Jai's declaration syntax is inconsistent for numerous reasons. Type and procedure declarations require a keyword whilst variable and constant declarations only require some punctuation. If type declarations didn't require a keyword and I didn't adhere to (2), then Jai's syntax could become a little more consistent.

name :: value; // Is this a constant or type declaration?
foo :: proc(); // Is this a type declaration or an incomplete procedure "prototype" (i.e. #foreign)
bar : X : [2]int; // The `X` in this case makes no sense if this is a type declaration


Having the ability to know what kind of declaration it is at the AST level is very useful and makes the semantic checking stage nicer. Another problem is that procedure declarations and procedure types begin exactly the same way, with `proc`. This is to make parsing easier and faster as it doesn't need an arbitrary look-ahead (see how Swift handles this).
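
To make this concrete, here is a rough Go sketch of a parser-level classifier for Jai-like declarations. It assumes the declaration has already been split into its separator (`:=` or `::`) and the first token of the right-hand side; `classify` and its result strings are hypothetical and not taken from Odin's or Jai's implementation.

```go
package main

import "fmt"

// classify guesses the declaration kind of `name <sep> <value...>` using
// only the separator and the first token of the value.
func classify(sep, first string) string {
	switch {
	case sep == ":=":
		return "variable"
	case sep == "::" && first == "type":
		return "type"
	case sep == "::" && first == "proc":
		// Without looking further (body? #foreign?) this stays ambiguous.
		return "procedure or procedure type"
	case sep == "::":
		return "constant"
	}
	return "unknown"
}

func main() {
	fmt.Println(classify(":=", "123"))  // foo := 123;
	fmt.Println(classify("::", "123"))  // bar :: 123;
	fmt.Println(classify("::", "type")) // blah :: type int;
	fmt.Println(classify("::", "proc")) // foo :: proc();
}
```

The kind is only known once a keyword shows up on the right-hand side, and the `proc` case still cannot be settled from that token alone.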

A "solution" to this problem is to add even more punctuation. However, I do not like this "solution": it is cryptic for the reader, who has to learn what each symbol means, whilst a keyword is easy to read because it is a word.
name := expr;      // variable
name :: const_val; // constant
name :! type_expr; // type
name :@ proc_decl; // procedure


But this is still "inconsistent" as for variables and constants, there is an optional type allowed after the first colon but for a type declaration, this doesn't make any sense.

I hope this clears up the problem I have with Jai-like syntax. In general, I want a clear and mostly consistent syntax and I don't think a Jai-like syntax is the answer.

Edited by Ginger Bill on
I don't see how a comparison with Jai's syntax is meaningful, when Jon has made it clear it hasn't even been really worked on yet.
The main reason was that this was the original syntax of Odin. I'm not actually discussing Jai's syntax but the general gist of it.

It is useful to discuss this as I want a decent idea what the syntax should be for this language and exploring different syntax families will help with this. I have only discussed Jai-like and prefix-like to begin with as these do not need significant whitespace.

Most "modern" languages seem to go for the prefix-like style, including:
Rust, Go, Swift, Nim, Lua, Ada, Pascal, Oberon, Clojure, and more. Pretty much any language deriving from C, LISP, or ML is usually a "prefix-like language".
As mentioned, the syntax is currently placeholder. But I don't use the 'proc' keyword (I do not like it, nor do I like 'var' for declaring variables) ... the only thing that uses a keyword right now is when you want to put a type in the expression slot; apart from that it's completely consistent.

there is an optional type allowed after the first colon but for a type declaration, this doesn't make any sense.

That's not true, because types are of type 'Type'. (Well I dunno how it is in Odin, but that is how it works for me).

Edited by Jonathan Blow on
Thank you Jon for replying.

I know Jai's syntax is a placeholder for now and that's fine. I'm just exploring different syntax ideas [for Odin] and the main two that I like are prefix-like and jai-like.

When parsing, the `proc` keyword removes the need for an arbitrary look-ahead which would be necessary in this case; it removes some ambiguity in parsing. I don't like `var` that much either but at the moment, I'm just experimenting with the syntax. It didn't take long to change as my current system is quite flexible to accommodate change of syntax.
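
A small Go sketch of the look-ahead in question. It assumes a keyword-less syntax where a procedure is recognised by the token after the matching `)` being `{` or `->`; that rule, the token lists and the `looksLikeProc` helper are all assumptions made for illustration.

```go
package main

import "fmt"

// looksLikeProc scans tokens that start with "(" and reports whether they
// begin a procedure literal/type, plus how many tokens were inspected.
func looksLikeProc(toks []string) (isProc bool, inspected int) {
	if len(toks) == 0 || toks[0] != "(" {
		return false, 0
	}
	depth := 0
	for i, t := range toks {
		switch t {
		case "(":
			depth++
		case ")":
			depth--
		}
		if depth == 0 {
			// Found the matching ")": the next token decides.
			if i+1 < len(toks) {
				next := toks[i+1]
				return next == "{" || next == "->", i + 2
			}
			return false, i + 1
		}
	}
	return false, len(toks) // unbalanced input
}

func main() {
	proc := []string{"(", "x", ":", "int", ",", "y", ":", "int", ")", "{", "}"}
	expr := []string{"(", "a", "+", "b", ")", "*", "c"}

	isProc, n := looksLikeProc(proc)
	fmt.Println(isProc, n) // prints "true 10"
	isProc, n = looksLikeProc(expr)
	fmt.Println(isProc, n) // prints "false 6"
}
```

The number of inspected tokens grows with the parameter list, whereas a leading `proc` settles the question after one token.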

In Odin, types technically have no type but I could make types be of type `type` (the keyword) which would solve that problem slightly. The inconsistencies I have are only at the parsing level rather than the semantic level. If I don't care about knowing the exact kind of declaration at parsing, all but one of the inconsistencies disappear. That inconsistency regards procedures.

Thing :: proc(); // Is this a procedure type or procedure prototype?


In Odin and Jai, the #foreign tag is used to specify if a procedure signature without a body is an external procedure or a type but this isn't that clear for me to read. I know you do something like the below to solve this problem but this is an inconsistency.

Thing :: #type ();
// Whilst this below would be fine
Foo :: int; // Type declaration resolved in semantic stage



The other thing that is lost without a "mandatory" keyword prefix for type declarations is clarity -- knowing it's a type declaration without having to look up that value.

---

If I use a Jai-like syntax, I do lose that lovely enumeration thing with constants (that I nicked from Go). However, I could easily add this within an enum type (which I had previously) and make "unnamed" enum declarations add their values to the current scope (like C).

I'll need to think this through more in the new year. Have a good new year yourself.


There is actually a funny subtlety here. With this part that you quoted:

Thing :: #type ();
// Whilst this below would be fine
Foo :: int; // Type declaration resolved in semantic stage


It's true that both of these things work, but it's *not* an inconsistency in exactly the way you are saying, I don't think.

The "Foo :: int" is not some kind of special type declaration. It is just a declaration like any other one, that says "Foo is the same thing as whatever int is". If int happened to be bound in the global namespace to the string "Hello, Sailor", then that is what Foo would be. But instead it is bound to something of type Type, so that is what Foo becomes. So this is actually very general.

The only reason we need the #type in the first case is as a parser hint ... because a lot of syntax is shared between types and expressions (as you obviously know) I put a helper in that case. But it's not quite right to say that type declarations are inconsistent, because there really isn't any such thing as a type declaration. It is just a declaration like any other declaration, and in both cases in the above-quoted box, the thing on the right is an rvalue. It's just that in cases that start with ( or [ I need the parser hint right now.

That's not to argue that this is the right way to do it or anything, just explaining how it is currently.

Edited by Jonathan Blow on
Maybe a clearer way to say it is ... #type isn't about anything to do with type declarations, it is just about helping you use a type as an rvalue when it wouldn't otherwise parse.

So for example, if you had a procedure that takes a Type:

foo :: (t: Type) { ... }

You could call it like this:

foo(#type [100] Vector3);

and that would just work (though it may be in that case that if I made the parser just try to understand the [100] Vector3 it probably would ... but I am not getting too deep into that stuff until later when the syntax gets finalized).


The type-as-an-rvalue thing can seem weird to C/C++ programmers, where typedef is a special weird thing. It's much more along the lines of the way functional languages treat types.
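
For readers who want something runnable, a loose analogy in Go: `reflect.Type` values are ordinary values that can be bound to names and passed to procedures. This is only an analogy and an assumption on the editor's part, since Go does this at run time through the `reflect` package rather than as part of the declaration system described here. `describe`, `arrayType` and `Vector3` are hypothetical names.

```go
package main

import (
	"fmt"
	"reflect"
)

type Vector3 struct{ X, Y, Z float32 }

// arrayType holds a type as a plain value, loosely like `#type [100] Vector3`.
var arrayType = reflect.TypeOf([100]Vector3{})

// describe takes a type as an ordinary argument, loosely like `foo :: (t: Type)`.
func describe(t reflect.Type) string {
	return fmt.Sprintf("%s (%d bytes)", t, t.Size())
}

func main() {
	// Foo is bound to whatever the type of an int value is.
	Foo := reflect.TypeOf(0)
	fmt.Println(describe(Foo))
	fmt.Println(describe(arrayType))
}
```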

Edited by Jonathan Blow on
I think I completely understand your system now. You don't differentiate between declaration kinds: variable, constant, type name, procedure, etc.; rather you just have variable (mutable) `:=` and constant (immutable) `::` declarations. This does make things much more consistent but I'm not sure whether I like it or not, as it is a little less clear and pushes more work onto the semantic checking stage.
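
A sketch in Go of what "pushing more work onto the semantic checking stage" could look like under that model: the parser records only whether a declaration is mutable, and the entity kind is derived later from the value that was bound. The `TypeValue` and `ProcValue` types and the `entityKind` function are hypothetical and do not describe either compiler.

```go
package main

import "fmt"

// TypeValue and ProcValue stand in for evaluated right-hand sides.
type TypeValue struct{ Name string }
type ProcValue struct{ Signature string }

// entityKind derives the kind of an entity after evaluation.
// mutable is true for `:=` declarations and false for `::` declarations.
func entityKind(mutable bool, value interface{}) string {
	if mutable {
		return "variable"
	}
	switch value.(type) {
	case TypeValue:
		return "type name"
	case ProcValue:
		return "procedure"
	default:
		return "constant"
	}
}

func main() {
	fmt.Println(entityKind(true, 123))                        // foo := 123;
	fmt.Println(entityKind(false, 123))                       // bar :: 123;
	fmt.Println(entityKind(false, TypeValue{Name: "int"}))    // Foo :: int;
	fmt.Println(entityKind(false, ProcValue{Signature: "()"})) // doop :: proc() {}
}
```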

It is completely different to my current system but I think I can tweak it if I so desire.

Having first-class data types is a little weird for me, mainly as I'm a C and C++ programmer. It does seem extremely useful, but I don't think it makes much sense in my language (Odin).


---

The vast majority of this project is research rather than actual programming. I have been looking at numerous languages including: Jai, Go, Pascal, Rust, Swift, Ada, FORTRAN, and many more. These little discussions are extremely helpful as they give me insight into how this language should be. I want to get this language right as I want it to be the language that I use.

Thank you for the replies and all of these discussions, Jon.

Yes, that's exactly how it works right now.

As always though, I dunno what that means in terms of how it will work in the final version.