About "template" like stuff (the typical parametric polymorhpism)

ratchetfreak

#10585

January 28, 2017

One of the biggest problems I have with C++ templates is duck-typing and that the requirement of the type parameters is not encoded except in the code of the template. I have yet to get decent error messages when I mess up the implicit contracts.

Mostly because they miss something like concepts (or what I think concepts should be) where you can name a group of requirements and check that a template definition stays within the requirements (which makes sure the template can actually be instantiated) and that a type parameters fits within the requirements and the compiler can just say "type Foo doesn't follow concept Bar because ..." instead of some semi-related error because a function doesn't return the correct type resulting in another function call being unable to be resolved or something like that.

For example with an iterable thing you could create a concept Iterable with 1 requirement: if you have a variable a of a type that follows the Iterable concept you can call iterator(a) and it returns a value of a type that follows the Iterator concept. The Iterator concept then has 3 functions it should require: finished(it) -> bool, advance(it)->void, element(it)-> Element.

Though from this point the type parameter could be expanded to include the specific functions required in case the user doesn't want to follow your naming conventions.

Edited by ratchetfreak on January 29, 2017, 12:11am

Ginger Bill

#10593

January 29, 2017

I want to explore a different way of doing parametric polymorphism. Most languages do it the same and it does lead to "going down the rabbit hole", like what you have suggested.

I personally don't think something like concepts is worth the trouble as the complexity of its implementation and complexity of understanding it is much greater than its usefulness (in my opinion). This language is meant to be a simple one and not the behemoth that is C++ (or Rust for that matter).

Maybe, the best idea is have a limited set of "traits/concepts/interfaces". Maybe just one for iterators. However, the path I would like to explore is code generation. C++ templates (and the like) are a kind of poor man's code generator and because of its design, require a ton of complexity to enforce its rules.

I personally find that the only things that I need to be "generic" or "parametric" are 3 things:
* Dynamic Arrays
* Dynamic Hash Tables
* General sorting function

In C, I can emulate all of these with macros and x-macros and its good enough to solve 99% of the problems I have. I don't really need anything else like that.

When compile time execution (CTE) is implemented, it will be possible to emulate most of the functionality that those type restrictions bring and be written in normal code.

1	#run assert(is_type_iterable(type_info(Foo_Type)));

Maybe semantic macros + CTE is the answer that will solve most of mine (and others) problems whilst still being "simple enough" concepts.

- Bill

ratchetfreak

#10596

January 29, 2017

There is a fourth reason people use templated in C++: composition of behaviors.

For example a linked list with a custom allocator. If it needs to allocate a node it will call allocate with the allocator to get a little block of memory to work with.

User code may create a pool with freelist to cache the allocated nodes and a malloc fallback speed up the allocation of the nodes.

In C you would do that with function pointer callbacks and a context pointer. In C++ they use templates to reduce the function pointer calls into static calls so they can get inlined and further optimized.

Ginger Bill

#10601

January 29, 2017

With your custom allocator example, I really dislike tying a specific allocator to a type. It may allow for some optimizations but it's not particularly useful. The custom allocation system that I have in Odin is very similar to what I do in C. The `Allocator` is just a function pointer and a pointer to allocator data (if it's needed). This allows you to easily change the allocator for anything.

Calling a function pointer doesn't have any overload than calling a normal function but it does mean it cannot be optimized like the C++ template.

What I'm trying to say: I much prefer the C way of doing things compared to the C++ way. I would never use templates for that reason as it's just added complexity with no real benefit.

An other question to ask is, does the linked list need to "carry" that allocation information (i.e. have behaviour) or just be treated as data and have the behaviour elsewhere? (n.b. I know it was a basic example and probably not the best example either).

---

The more I think about it, the more I'm starting to dislike the typical parametric polymorphism. Maybe, we can find a "better" approach to the sorts of problems that it solves.

ratchetfreak

#10602

January 30, 2017

The allocator thing is only an example, it could really be any other dependency that's known/set at compile time. Very often the size needed of the dependency struct is non-zero and if you care about performance you really don't want to chase a pointer.

A much better way to express those things is using those C-style function pointers but force-#bake them in the functions that use them so the optimizer can do some work with them.

ratchetfreak

#10620

January 31, 2017

Let me list what I find important in parametric polymorphism:

1. definition-time checking of expected type-parameter interface. This is IMO java's biggest strength with its generics: if you create a generic class you have to specify which interfaces (if any) you expect the type parameter to implement and the compiler will check that you use only those methods.

2. With that you can then check at usage-time whether the type parameter actually conforms to the interface and provide a clear and concise error message.

Granted these things are easier to do when you can tie methods to types. However with something like a "trait bundle" you can do without methods (or even overloads).

A trait bundle would be a collection of Types, procs and constants. Procs can then take a trait bundle and use the data inside it. For example an Iterable trait declaration of a linked list:

Iterable :: trait{
    container : Type;
    iteratorTrait : trait Iterator;
    get_iterator : proc (list : container ) -> iteratorTrait.iterator_type ;
}

Iterator :: trait {
    iterator_type :Type;
    element : proc (it : iterator_type) -> ^el_type;
    el_type : Type : StripPointer(ReturnType(element));
    finished : proc (it : iterator_type) -> bool ;
    progress : proc (it : ^iterator_type);
}

IteratableLinkedList :: trait Iterable {
    container :: LinkedList;
    iteratorTrait :: IteratorLinkedList;
    get_iterator :: proc (list : container ) -> iteratorTrait.iterator_type { return list.head;}
}

IteratorLinkedList :: trait Iterator {
    iterator_type :: ^Node;
    el_type :: Data;
    finished :: proc (it : iterator_type) bool -> {return it == null;}
    progress :: proc (it : ^iterator_type) -> {it^ = it.next;}
    element :: proc (it : iterator_type) -> ^el_type {return it.data^;}
}

A trait bundle could have optionally a provoking member so that meta-programming can grab an existing trait bundle without having to know the name (for example a for loop that requires the Iterable bundle).

If a proc member is missing from the declaration you can default it to use the same name and then use normal overload resolution to get the proc and error if it can't find it.

For types you could add a meta programming expression to derive it from one of the other declarations.

Ginger Bill

#10625

January 31, 2017

I completely understand the appeal of traits and the likes, but I don't think it's the right fit for this particular language. I do not believe that the benefits of traits outweigh the complications it brings to the language.

I want this language to be a simple to read and write. Adding traits is a huge complexity that I don't think is worth it. I am not saying the idea of traits is a bad one, just that I don't think it's the right fit.

Side note: This is one of the reasons I've not added methods or built in concurrency features into the language yet, I don't want to go down that rabbit hole yet, if at all.

Maybe I'm just a die-hard C programmer and semi-stuck in my ways, but I would like to try completely different approaches compared to what the flavour-of-the-day is doing (e.g. Go, Rust, Swift, Clojure, Kotlin, Nim, whatever and amen). If it turns out that is the best approach, I'll properly consider it.

ratchetfreak

#10683

February 2, 2017

From a pure C perspective there are only 2 ways to get different behaviors out of the same bit of source code (the purpose of polymorphism): function pointers or macros. Macros should go burn in hell. Function pointer result in optimization barriers where information on one side of the call cannot be used on the other unless the optimizer can constant fold the function pointer down to the call site or deduce the limited set of functions that can be called at the call site (aka devirtualize the calls).

Relying on the compiler to do whole-program optimization (which gets deferred to link time with multiple translation units) like that when the programmer knows exactly what needs to happen and could express it cleanly is one of my pet peeves with "modern" programming.

With parametric polymorphism you add a compile-time parameter to the proc and change the behavior of the code based on that parameter. Tying non-trivial behavior to a single parameter isn't very easy unless you bundle them somehow. Classically that is done ad-hoc with overloads and class-methods. I don't like them because they force the user code to conform to the code style of who wrote the template and they may collide with your own reasoning about function names. Traits solve that very nicely by creating what is essentially a struct with proc pointers and constants.

There is the bigger issue that, because of the ad-hoc style, most implementations don't have a way of encoding the assumptions on the parameter that the polymorphic code makes, which leads to template definitions possibly not making sense and very bad error messages.

ratchetfreak

#10724

February 4, 2017

So I just stumbled onto https://www.youtube.com/watch?v=Tl7mi9QmLns ...

It basically shows that go uses a trait bundle (which they call type info) to fake their generic data structures.

Their map type info contains the struct sizes needed for pointer arithmetic and the function pointers for the key's hash and equality functions.

This is what I was aiming for but generally more dev friendly when using it to get pointer offsets and guaranteed constant folding of the values in the traits bundle.

Ginger Bill

#10731

February 4, 2017

That idea of faking generics is quite clever and would be good enough to solve most problems that people use generics for. I'm still not sure if I'm sold on it yet but it is growing on me.

The problem that Go was trying to solve was caused by allowing structs and arrays to be comparable if their contents are. Which is why you cannot just do a memcmp on them and thus not no simple hashing functions. So what they needed to solve, it's a very clever solution but caused by "problem" with the language itself.

I should note, I'm not saying no to this idea/proposal. The problem is more complicated than that. I'm just wondering if it's possible to have something that allows you to create this construct yourself without having it as a built in construct. Plain-old-generics clearly doesn't allow it without something like concepts and now the language is just complicated.

Are traits the best idea? Is a form of code generation better (semantic macros et al.)? Or something I've not thought of?

I'm asking these questions because the more and more I wonder about these sorts of problems, I wonder if they are really the problems that I want to solve in this language.

Thank you for this discussion.

ratchetfreak

#10733

February 4, 2017

I don't think full-on code generation is the right choice for the simple things like generic containers. For one because the code to generate the actual code will end up obscuring what is being generated. For another the editor can't really help with making sure what you generate is correct.

User code is very likely going to end up building a parser/template system on top of that to simplify creating them.

About semantic macros, the only thing I can really envision for them is something like C++ style templates. But I'm not sure that's what you mean with them.

Edit: I should add that for implementation of the traits bundle it doesn't look to be that hard. Some syntax for the declarations and in the proc that uses them it's basically a struct or namespace in just the proc body.

Edited by ratchetfreak on February 4, 2017, 4:35pm

graeme

#10740

February 5, 2017

fwiw i think c++ makes two mistakes--1) thinking types are hard to reason about, and treating them special; and 2) defining templates in terms of code generation despite only wanting to achieve generics

so I tend to agree with gb on this--generics (as found in c#, hindley-milner, etc) are one thing, and they're more or less arbitrarily powerful. so once you have them you don't really need a more flexible code generation system, but all your code generation is expressed through type relationships. at the risk of sounding profane, programmers have never wanted type systems for their own sake--they statically checked guarantees and expressiveness. types are excellent for that. but it shouldn't stop us looking elsewhere, or for type systems along different axes

i think programmers as a whole have a bias towards the declarative, and rarely acknowledge that declarative languages are the hardest to get right

ratchetfreak

#10741

February 5, 2017

graeme
fwiw i think c++ makes two mistakes--1) thinking types are hard to reason about, and treating them special; and 2) defining templates in terms of code generation despite only wanting to achieve generics

so I tend to agree with gb on this--generics (as found in c#, hindley-milner, etc) are one thing, and they're more or less arbitrarily powerful. so once you have them you don't really need a more flexible code generation system, but all your code generation is expressed through type relationships. at the risk of sounding profane, programmers have never wanted type systems for their own sake--they statically checked guarantees and expressiveness. types are excellent for that. but it shouldn't stop us looking elsewhere, or for type systems along different axes

i think programmers as a whole have a bias towards the declarative, and rarely acknowledge that declarative languages are the hardest to get right

You really only have to look at one thing to see that C++ templates missed their mark: std::true_type and std::false_type. 2 types for the sole reason to be treated as markers for true and false in template "logic".

It's kinda the reason I want to move away from types as the main parameter and instead make it a part of what the parameter contains. Because then you can embed other things into that parameter that may be valuable for the implementation and the code doesn't need to rely on external information.