11 posts
Generics in plain C?
Edited by Shastic on
1) You can use void*.

Which means you lose type safety.

2) You can use macros.

When debugging you can't step into functions that are macros. The compiler is pasting the same thing over and over into your code base, which can increase the size of your executable, and slow compile times, though I don't know by how much, but anything other than an instant compile does not feel good. If there is an error, the compiler spams messages up the screen, which is unpleasant. Complex macros are difficult to understand if you don't like thinking that way, especially if you haven't written them yourself.

3) You copy paste the same code over and over, with minor changes and cuts.

It avoids the problems of the first two, but it can bloat the code base like macros, but I'm not sure how bad it would be. Its not nice to do, because you have to do it all by hand.

***

I was thinking of automating the third option because I like things to be simple. The idea is a helper program that scans for commented markers in your project's source files, and uses them to copy paste generic functions with the correct type and name inserted into a single header file. That way I don't have to do any extra work to use them, other than type in a commented struct and hit compile. It would be like a stretchy buffer without macros.

Is this a bad idea? I haven't seen any code that does it, so I thought I'd ask.
Mārtiņš Možeiko
2265 posts / 2 projects
Generics in plain C?
Edited by Mārtiņš Možeiko on
how about void* and macros together?
  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 typedef struct { void* ptr; int cap, size; } array; void append(array* a, void* elem, size_t size) { if (a->cap == a->size) { a->cap *= 2; a->ptr = realloc(a->ptr, a->cap * size); } memcpy((char*)a->ptr + size*a->size++, elem, size); } #define append(arr, elem) append(&arr, &elem, sizeof(elem)) void* get(array* a, size_t index, size_t size) { return index < a->size ? (char*)a->ptr + index * size : NULL; } #define get(arr, type, index) (type)get(&arr, index, sizeof(type)) 

Jeff Knox
6 posts
Generics in plain C?
C11 has _Generic selections, which might work.

Blog post with more explanation.

Basically something like:
 1 _Generic( 'a', char: 1, int: 2, long: 3, default: 0) 

will return 2, because 'a' is an int.

So you can define a single macro to call the correct function:
 1 #define increment(var) _Generic(var, int: incrementi, float: incrementf, default: increment_)(var) 

It becomes more complex with multiple variable functions, but otherwise code bloat looks minimal from what little testing I've done on godbolt.
11 posts
Generics in plain C?
Edited by Shastic on
mmozeiko
how about void* and macros together?
  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 typedef struct { void* ptr; int cap, size; } array; void append(array* a, void* elem, size_t size) { if (a->cap == a->size) { a->cap *= 2; a->ptr = realloc(a->ptr, a->cap * size); } memcpy((char*)a->ptr + size*a->size++, elem, size); } #define append(arr, elem) append(&arr, &elem, sizeof(elem)) void* get(array* a, size_t index, size_t size) { return index < a->size ? (char*)a->ptr + index * size : NULL; } #define get(arr, type, index) (type)get(&arr, index, sizeof(type)) 

Thanks for that. Its very elegant. Its what I wanted without all the copy pasting.

I just realized for my idea to work like stretchy buffer, I'd need function overloading.

knox
C11 has _Generic selections, which might work.

Blog post with more explanation.

Basically something like:
 1 _Generic( 'a', char: 1, int: 2, long: 3, default: 0) 

will return 2, because 'a' is an int.

So you can define a single macro to call the correct function:
 1 #define increment(var) _Generic(var, int: incrementi, float: incrementf, default: increment_)(var) 

It becomes more complex with multiple variable functions, but otherwise code bloat looks minimal from what little testing I've done on godbolt.

Thanks. I'll have a think about that one.
Mārtiņš Možeiko
2265 posts / 2 projects
Generics in plain C?
And if you are OK with going clang compiler only, you can do pretty much same overloading as C++, but in C:
 1 2 3 4 5 #define overload __attribute__((overloadable)) void overload inc(float* x) { *x += 1.f } void overload inc(double* x) { *x += 1. } void overload inc(int* x) { *x += 1 } 

Then just call inc(&x) and it will just work.
Jeff Knox
6 posts
Generics in plain C?
mmozeiko
And if you are OK with going clang compiler only, you can do pretty much same overloading as C++, but in C...

An excellent point. This is the best option, imo. I recently switched to clang-only mode and couldn't be happier. Function overloading, builtin vector types with automatic operator overloading and they work with SIMD, Clang Blocks for lambdas/defer, lots of nice builtins, statement expressions, designated initializers*, fully compatible with MSVC... And all in C. It's amazing.

*designated initializers might be available in MSVC, but at one point I think there were issues with C99 compatibility.
11 posts
Generics in plain C?
I don't want to tie myself to one compiler or OS. I like the freedom of being able to leave if I am displeased.

I see Casey is saying on Twitter he found 3 bugs in Clang in 2 weeks, so he's gone back to the MS compiler.

I decided to go with Mārtiņš Možeiko's first idea, since speed of compilation is the most important thing to me, and so far its looking okay. Not as nice to look at as stretchy buffer but it will do.

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30  struct array arr; arr_init(int, arr, 10); int i; for (i = 0; i < MAX; i++) arr_append(int, arr, i); arr_print(int, arr, print_int); arr_reverse(int, arr); arr_print(int, arr, print_int); int *c; c = arr_get(int, arr, 0); int d = 100; arr_insert(int, arr, 1, d); int e = 100; arr_remove(int, arr, e, cmp_int); arr_pop(int, arr, 5, &c); arr_popl(int, arr, &c); arr_extend(int, arr, arr); struct array copy; arr_copy(int, arr, copy); arr_sort(int, arr, cmp_int); arr_print(int, arr, print_int); for (size_t i = 0; i < arr.count; i++) printf("arr_ptr: %d\n", *arr_ptr(int, arr, i)); 

I was surprised that using memcpy and memmove was easier to reason about than for loops. It feels more data oriented to shift entire segments of an array around, than to step though an array element by element picking parts of it out.

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 // Insert an element in the array at the given index. bool arr_insert(size_t size, struct array *arr, size_t index, void *elem) { if (index <= arr->count) { if (arr->count + 1 > arr->cap) if (!arr_resize_(size, arr, 1)) return false; // for (size_t i = arr->count; i > index; i--) // arr->ptr[i] = arr->ptr[i - 1]; // arr->ptr[index] = elem; // arr->count++; // Move every element after the insert index to the right // one position. void *src = arr_ptr_(size, arr, index); void *dst = arr_ptr_(size, arr, index + 1); memmove(dst, src, size * (arr->count - index)); memmove((char *)arr->ptr + size * index, elem, size); arr->count++; return true; } return false; } 

Casey hasn't convinced me on the stdlib though. The stdlib should Just Work TM on all platforms, otherwise it shouldn't be allowed to use the name. There should be an international committee that sues the owners of the operating systems if they don't fix the bugs, and still use the name, but there obviously isn't ever going to be one, as there is no money in it.

To me the stdlib should be as standard as the for loop. What? You expected the for loop to work??? Ha, ha! You naive fool! You must write your own for loop!

If the C stdlib is allowed to have bugs in it on any OS, the language is broken. Period.

Mārtiņš Možeiko
2265 posts / 2 projects
Generics in plain C?
Shastic
I see Casey is saying on Twitter he found 3 bugs in Clang in 2 weeks, so he's gone back to the MS compiler.

All compilers have bugs. I have found so many in MSVC I stopped counting them...
Difference from clang is that clang is updated more often - so if you're always getting latest clang version, you'll see more bugs. But if you stick with same version, then you usually won't find much new bugs (just existing "old" ones).

The stdlib should Just Work TM on all platforms, otherwise it shouldn't be allowed to use the name.

The big thing here is "work". Because there is one one definition on how it should work. Each platform has its own differences how memory allocation works, how filesystem works, etc... So if you're testing & relying on its behavior on one platform, then on another you might have surprising differences (overhead, timing, etc...)
11 posts
Generics in plain C?
Edited by Shastic on
I found a nice way to do generics in C. It does compile time type checking going into and returning from a function, and it lets you step into the code with the debugger. The downside is you need to hash include the file with the associated type above it, and it may cause some bloating.

The method is demonstrated here, where the programmer makes use of a simple JOIN macro to build function names:
https://github.com/glouw/ctl/tree/master/ctl

I found reading code like this a bit nasty, especially if a function is calling other functions:

 1 2 3 4 5 static inline T* JOIN(A, end)(A* self) { return JOIN(A, back)(self) + 1; } 

So I thought of putting a macro like this above each function to improve readability.

 1 2 3 4 5 6 7 #define arr_end_ JOIN(A, end) static inline T* arr_end_(A* self) { return arr_back_(self) + 1; } 

36 posts
Generics in plain C?
Edited by graeme on
I think martins approach has the best power-to-weight ratio but... since we're past that...

Apple recently replaced the webkit allocator on macos with one written in C. It gets generics by passing dispatch tables into functions marked always_inline. Essentially, if all the functions in the call tree between the declaration of the table and an indirect call dispatched from the table are always_inline, clang must know the function being called and is made to optimise with that information. And if the dispatched functions are themselves marked always_inline, the process continues. You choose between runtime dynamic dispatch and compile time generics by controlling the inlining. So, in compiler-speak, they implement dynamic dispatch but have fine control over monomorphization optimizations.

https://trac.webkit.org/changeset/279867/webkit

Sounds like you'd have to sweat over the codegen to do this effectively, and who knows what the compile speed hit is
11 posts
Generics in plain C?
Edited by Shastic on
graeme
I think martins approach has the best power-to-weight ratio but... since we're past that...

What I didn't like about the void* approach is there is no type checking going into the function, unless you do something like this, which is a bit of a nuisance:

 1 2 3 4 5 6 7 8 9 void *int_var(int *val) { return val; } bool arr_add(array *arr, void* val) { ... } arr_add(arr, int_var(val)); 

On second thoughts, maybe it wouldn't be so bad...

What surprised me when I was looking around the net for a way to do generics in C, is how many methods people have found.
11 posts
Generics in plain C?
Edited by Shastic on
I just wanted to correct my last post, in case someone reads this in the future.

I found this annoying when debugging, because you have to step into two functions to step into one, so I abandoned the idea.

 1 arr_add(arr, int_var(val)); 

I returned to the one I linked to above with the JOIN macros, and realized void * can be used as a type along with all the other types. So you can have type checking, and nice debugging, but with bloat, but if the bloat becomes a problem, you can convert it to use the void* array with a simple search and replace, that only touches the names of the functions. So when code is new and you are changing it a lot, use the typed versions, and when the code settles down, you can easily convert it to the void* array.

Here's some code from my array_template.h

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 #define A JOIN(arr, T) struct A { T data; size_t capacity, count; size_t size_type; }; // Return an unchecked pointer to the element in // the array at the given index. #define arr_ptr JOIN(A, ptr) void * arr_ptr(struct A *arr, size_t index) { return (char *)arr->data + index * arr->size_type; } // Return a bounds checked pointer to the element in // the array at the given index. #define arr_at JOIN(A, at) T arr_at(struct A *arr, size_t index) { if (arr->count > 0 && index < arr->count) return arr_ptr(arr, index); else return 0; } // Allocate memory for a new array #define arr_alloc JOIN(A, alloc) bool arr_alloc(struct A *arr, size_t size_type, size_t capacity) { arr->capacity = 0; arr->count = 0; arr->size_type = 0; if (!(arr->data = memmgr_malloc_array(capacity, size_type))) return false; arr->capacity = capacity; arr->size_type = size_type; return true; } // Insert 'data' into the array at the given index. #define arr_insert JOIN(A, insert) bool arr_insert(struct A *arr, const T data, size_t index) { if (!(index <= arr->count)) return false; if (arr->count + 1 > arr->capacity) if (!arr_resize(arr, 1)) return false; memmove(arr_ptr(arr, index + 1), arr_ptr(arr, index), (arr->count - index) * arr->size_type); memmove(arr_ptr(arr, index), data, arr->size_type); arr->count++; return true; } ... etc 

Creating the arrays.

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 typedef void* voidp; #define T voidp #include "array_template.h" typedef char *charp; #define T charp #include "array_template.h" typedef struct user_data *udtp; #define T udtp #include "array_template.h" typedef int *intp; #define T intp #include "array_template.h" 

Using it in source code:

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 // Using the array of type string struct arr_charp arr; char *value; arr_charp_alloc(&arr, sizeof(value), 3); arr_charp_push_back(&arr, "VALUE_0"); arr_charp_push_back(&arr, "VALUE_1"); value = arr_charp_at(&arr, 0); // Convert the string array to use the void* array to reduce code bloat struct arr_voidp arr; char *value; arr_voidp_alloc(&arr, sizeof(value), 3); arr_voidp_push_back(&arr, "VALUE_0"); arr_voidp_push_back(&arr, "VALUE_1"); value = arr_voidp_at(&arr, 0); 

Mārtiņš Možeiko
2265 posts / 2 projects
Generics in plain C?
Debuggers allow specifying filter to skip stepping into functions you don't want. For example, VS 2012+ can do this: https://blog.wholetomato.com/2020...anted-functions-in-visual-studio/
11 posts
Generics in plain C?
Edited by Shastic on
Nice.

The last remaining irritant is having to manually cast from void* to the correct type in the debugger, but I can get over it.

This is what I had for that idea before I moved to the other one. I got confused passing the string pointers in and out, and it seemed much easier when data was T. Perhaps it will become clear if I try it again.

  1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 struct array { void *data; size_t capacity, count; size_t size_type; }; // Return an unchecked pointer to the element in // the array at the given index. void * arr_ptr_(struct array *arr, size_t index) { return (char *)arr->data + index * arr->size_type; } #define arr_ptr(type, arr, index) (type*)arr_ptr_(arr, index) // Return a bounds checked pointer to the element in // the array at the given index. void * arr_at_(struct array *arr, size_t index) { if (arr->count > 0 && index < arr->count) return arr_ptr_(arr, index); else return 0; } #define arr_at(type, arr, index) (type*)arr_at_(arr, index) // Allocate memory for a new array bool arr_alloc_(size_t size_type, struct array *arr, size_t capacity) { arr->capacity = 0; arr->count = 0; if (!(arr->data = memmgr_malloc_array(capacity, size_type))) return false; arr->capacity = capacity; arr->size_type = size_type; return true; } #define arr_alloc(type, arr, capacity) arr_alloc_(sizeof(type), arr, capacity) // Insert 'data' in the array at the given index. bool arr_insert_(struct array *arr, void *data, size_t index) { if (!(index <= arr->count)) return false; if (arr->count + 1 > arr->capacity) if (!arr_resize(size_type, arr, 1)) return false; // Move every element from the insert index to the right // one position. memmove(arr_ptr_(arr, index + 1), arr_ptr_(arr, index), (arr->count - index) * arr->size_type); memmove(arr_ptr_(arr, index), &data, arr->size_type); arr->count++; return true; } #define arr_insert(type, arr, data, index) \ arr_insert_(arr, JOIN(type, var(data)), index) 

Perhaps that &data in the memove was the problem.