Has anyone else given Zig a serious try, only to get caught up in language debugging?

I really wanted to like Zig, mainly because it seemed simpler than all the fad `C-killer` languages popping up, and I thought it didn't have the 'big agenda' like that other unnamed language we hear about all over the low-level programmer interwebs. It also has compile time execution, which is the main reason I wanted to use it. I was planning on porting my take of the Handmade Hero engine to Zig, while taking advantage of the metaprogramming it offers.

Zig is presented as a simple language with a "Focus on debugging your application rather than debugging your programming language knowledge."

It started out as an implied C replacement, but has since grown to become very opinionated, in my opinion. :-) It does retain more of the simplicity of C, and its designer, Andrew Kelley, is really focused on maintaining that simplicity. But while Kelley himself is quite an amazing guy from the streams I've seen of him, it feels like Zig is being overly restrictive by trying to force the "One True Way" philosophy on its users, which ends up just causing lots of headache debugging the language.

Some upsides
---
  • compile time execution!
  • extremely simple and readable generics
  • completely interops with C without much friction. It can compile C, translate C to Zig, and import .h files.
  • maybe a C backend? I think it will eventually output C, but that code isn't finished yet unless I'm mistaken.
  • `defer` statement (runs on scope exit) and `errdefer` (only cleans up if there was an error before leaving scope)
  • really nice community, quick to offer help (I can't stress this enough, they are really great people)


Some downsides
---
  • There is no operator overloading, so vector math suffers in a big way, although its builtin SIMD features might alleviate this somewhat.
  • It doesn't allow function overloading, either. This isn't a deal breaker, but C'mon, it was one of the two decent features from C++ actually worth keeping.
  • It also has no default function arguments.

There may be work-arounds for those last two, but again, more language debugging.

One of the worst offenders, however, is the lack of `for` loops. It has a builtin foreach, but no ranges. So you can only loop over known-length arrays (either compile-time known or runtime). So we have to resort to `while` loops for iterating over an integer range. Here is a workaround for the missing `for` loop:
{var i: u32 = 0;
while (i < rects.len) : (i += 1)
{
    // do the things...
}}


Notice how the entire structure has to be wrapped in a block so the index doesn't leak into the scope. We are living in 2021, right? But having a `for` loop would mean 2 ways to loop over a range, so it is taboo from what I can tell.
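For comparison, here is a minimal C99 sketch of the scoping I'm asking for. The `rect` struct and both function names are made up for illustration; they are not from my engine. The first function uses a `for` loop whose index is scoped to the loop, the second mimics the shape of the Zig workaround above.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the `rects` array in the snippet above. */
typedef struct { int w, h; } rect;

/* C99 for loop: `i` exists only inside the loop. */
int sum_widths_for(const rect *rects, size_t count)
{
    int total = 0;
    for (size_t i = 0; i < count; i++) {
        total += rects[i].w;
    }
    /* `i` is not visible here. */
    return total;
}

/* Same loop written as a while loop; the extra block keeps `i` from leaking. */
int sum_widths_while(const rect *rects, size_t count)
{
    int total = 0;
    {
        size_t i = 0;
        while (i < count) {
            total += rects[i].w;
            i += 1;
        }
    }
    return total;
}
```

Both do the same work; the only difference is how much ceremony it takes to keep the index local.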

Here is a (half?) joke response to an actual serious proposal to deal with the lack of a basic `for` loop.
// Joke proposal
if (with(@as(usize, 0))) |*i| while (i.* <= 10) : (i.* += 1) {
    std.debug.print("{}\n", .{ i.* });
}
// The serious proposal:
with(var x: usize = 0) while(x < 10) : (x += 1) {...}

And the equivalent shorter brainf!ck code:
-[>+<-----]>---<++++++++++<++++++++++[>>.+<.<-]>>---------.-.


On Simplicity
---
On the overview page, we are led to believe that a port of C code to Zig is significantly simpler due to error handling. And perhaps it is somewhat simpler due to Zig having `defer` statements, which are nice. But I went to check out the original logic code after the error handling is done, and this is representative of what it's like to deal with pointers:
// C code:
*ptr = sample;
// Zig code:
@ptrCast(*f32, @alignCast(@alignOf(f32), sample_ptr)).* = sample;


And the main reason given for lots of rejected proposals is that Zig must be readable, with only one way to do things.

Last is an example of me trying to port over Casey's famous dead-ballz-simple-and-fast memory arena code that we all fell in love with upon first watching him code it without any language friction whatsoever.

It took me 9 hours of language debugging, issue tracking, web and/or soul searching to come up with this beauty. I'm only posting the core logic here because I've already lost the brevity contest in this post.
if (arena.used + size <= arena.base.len)
{
    // This first attempt works, but what a mess...
    //var memory = @bitCast([]align(@alignOf(T))u8, arena.base[arena.used..]);
    //result = @ptrCast(*T, memory[0..@sizeOf(T)]);

    // Another way to do it!  (please don't tell anyone)
    var address = @ptrToInt(arena.base.ptr + arena.used);
    result = @intToPtr(*T, address);
    arena.used += size;
}


Looking back on this snippet, I suppose making int casting and pointer casting explicit isn't that big of a deal, but the sheer amount of willpower required to fight the language was unexpected. And the way I got this to work is probably seriously frowned upon for being unsafe.


All in all, it's probably better than C++ due to its simplicity, but I wish Zig was the silver bullet.

Edited by Jeff Knox on Reason: Initial post
I haven't looked at Zig (at all) but your proposal for the loop seems weird to me. If you want a regular "c for" loop, ask for a "c for" loop, not a more "complex" syntax. But I don't think the authors of Zig come here anyway.
mrmixer
. . . but your proposal for the loop seems weird to me.


Sorry, I should have been more clear. This was an existing proposal that someone else made as a github issue. I just stumbled across it when I was trying to figure out how to do a normal `c for` loop.

Edited by Jeff Knox on Reason: typo
knox
And the way I got this to work is probably seriously frowned upon for being unsafe.


Not really, Zig is not Rust :^)

Nobody is going to yell at you for the code being unsafe, but I do believe there is some nuance there that you might not be fully appreciating.

Let me start by saying that I have experienced this myself a while ago while experimenting with porting antirez/rax to Zig as an exercise. In rax there is a lot of reinterpreting memory going on that doesn't translate nicely.
So I can confirm that you do lose some agility when writing Zig code when compared to C, in some cases.

Now, here's what I think you're missing.

var address = @ptrToInt(arena.base.ptr + arena.used);
result = @intToPtr(*T, address);
arena.used += size;


The problem with these lines is not the "unsafety", but the fact that they are simply wrong. What you're missing is precisely what the Zig type system was stressing you about: memory alignment. You can read more about memory alignment here, but in terms of practical consequences, that code will not work on ARM CPUs because they don't support unaligned memory access.

If you only program with x86 in mind, you might not care about this detail, since unaligned access is allowed there (although less performant), but Zig cares a lot about portability and so the language wants you to get these details right.

In your example code above you are tossing into the memory slab a set of types with different sizes, potentially causing some allocations to break alignment. Bailing out of the pointer type system prevents the compiler from catching the error, but that's just something that's there for your utility (just to stress the point that Zig is not Rust) and in fact the arena allocator implementation in the standard library does the same (i.e. uses ptrToInt), but gets alignment right by aligning forward the pointer to the first free byte.

https://github.com/ziglang/zig/bl.../heap/arena_allocator.zig#L73-L77
const cur_buf = cur_node.data[@sizeOf(BufNode)..];
const addr = @ptrToInt(cur_buf.ptr) + self.state.end_index;
const adjusted_addr = mem.alignForward(addr, ptr_align);
const adjusted_index = self.state.end_index + (adjusted_addr - addr);
const new_end_index = adjusted_index + n;
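
To make the align-forward step concrete, here is a rough C sketch of an arena push that pads up to the requested alignment before handing out memory. It is not the standard library code and not Casey's arena; `arena_t`, `align_forward` and `arena_push` are hypothetical names, and the sketch assumes the alignment is a power of two and that `used` never exceeds `size`.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arena, loosely shaped like the one in the original post. */
typedef struct {
    uint8_t *base;  /* start of the backing memory */
    size_t   size;  /* total bytes available */
    size_t   used;  /* bytes handed out so far, padding included */
} arena_t;

/* Round addr up to the next multiple of align (align must be a power of two). */
uintptr_t align_forward(uintptr_t addr, size_t align)
{
    assert(align != 0 && (align & (align - 1)) == 0);
    return (addr + (align - 1)) & ~(uintptr_t)(align - 1);
}

/* Returns an aligned pointer, or NULL when the padded request does not fit. */
void *arena_push(arena_t *arena, size_t size, size_t align)
{
    uintptr_t addr     = (uintptr_t)arena->base + arena->used;
    uintptr_t adjusted = align_forward(addr, align);
    size_t    padding  = (size_t)(adjusted - addr);

    if (padding > arena->size - arena->used) return NULL;
    if (size > arena->size - arena->used - padding) return NULL;

    arena->used += padding + size;
    return (void *)adjusted;
}
```

The cost is a few wasted padding bytes per allocation; in exchange, every pointer the arena returns is valid for the type it was requested for.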


The same thing applies with your earlier example:
// C code:
*ptr = sample;
// Zig code:
@ptrCast(*f32, @alignCast(@alignOf(f32), sample_ptr)).* = sample;


If the compiler is forcing you to ptrCast `sample_ptr`, then my guess is that it's a void pointer that carries no alignment information. Know also that @alignCast in safe build modes (debug and release safe) will panic if the alignment doesn't match expectations, instead of potentially incurring UB (or degraded performance).
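
As a rough analogy only, this is what a checked version of that store could look like if you wrote it by hand in C11. It is a sketch of the idea, not of how Zig implements @alignCast; `is_float_aligned` and `store_sample` are hypothetical helpers, and it reports the mismatch with a return code where Zig would panic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 if p is suitably aligned to be accessed as a float. */
int is_float_aligned(const void *p)
{
    return ((uintptr_t)p % _Alignof(float)) == 0;
}

/* Hypothetical checked store: 0 on success, -1 if the pointer is misaligned
 * (in which case nothing is written). */
int store_sample(void *sample_ptr, float sample)
{
    if (!is_float_aligned(sample_ptr)) return -1;
    *(float *)sample_ptr = sample;
    return 0;
}
```

In plain C nothing forces you to write the check, which is exactly the kind of detail that tends to get lost during a refactor.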

My final takeaway is that simplicity is not ease and in C it's definitely easier to reinterpret memory, but it's also equally easy to mess up some details, especially while refactoring. In this case Zig was trying to point out a real thing that you should care about and gave you a chance to rely on the type system to make that property easy to preserve while refactoring.

If you find this idea intriguing, check out sentinel-terminated pointers, they are great to prevent null-termination bugs in strings.


Edited by Loris Cro on
You're overstating your case by a lot with regard to unaligned accesses. Unaligned memory access has had little to no performance impact on x86 CPUs for about a decade, and ARM has had hardware support for unaligned accesses for at least that long, although the performance might still be significantly worse for all I know.

Nevermind the absurdity of the claim that the language is nobly trying to warn you about the dangers of unaligned access on legacy ARM systems by making pointer casts a bit more tedious. Zig requiring you to do casts to implement an arena allocator is not fundamentally different from C requiring you to do casts to implement an arena allocator, it's just that Zig is more syntactically annoying about it because being superficially annoying is Zig's primary personality trait as a language.
notnullnotvoid
You're overstating your case by a lot with regard to unaligned accesses. Unaligned memory access has had little to no performance impact on x86 CPUs for about a decade, and ARM has had hardware support for unaligned accesses for at least that long, although the performance might still be significantly worse for all I know.


I am definitely not up to speed with the latest developments of CPU architectures, but I do remember Redis having issues with the RaspberryPi until all alignment issues got fixed. Before v4 Redis did not support ARM (that was like 3 years ago, not in the far past).
https://redis.io/topics/ARM

notnullnotvoid
Nevermind the absurdity of the claim that the language is nobly trying to warn you about the dangers of unaligned access on legacy ARM systems by making pointer casts a bit more tedious.


I used ARM as an example, I'm sure there are other architectures out there with similar issues. And even then, if you only care about modern CPUs, stuff like SSE and AVX still has alignment requirements, what about those?

notnullnotvoid

it's just that Zig is more syntactically annoying about it because being superficially annoying is Zig's primary personality trait as a language.


Well you clearly have your mind made up already about Zig. I'll be honest: I think that lashing out like this detracts from your original argument about alignment. You clearly don't have to convince me of anything, but you might want to elaborate on how you got to that conclusion for the other people here, OP in primis.
Incorrectly aligned memory is not just a major performance factor. It's bad practice and can lead to random crashes if the hardware or library being called doesn't allow it. Memory alignment is often essential for certain algorithms to even function (like bit shifting 2D array look-ups with a power-of-two stride to avoid multiplication). DSP remote procedure calls, NEON and SSE have rules about memory alignment that must be followed, so I would be thankful if a language pointed out such mistakes in compile time using static analysis, instead of crashing randomly on another user's system. Even if a SIMD instruction can emulate unaligned access on the surface, your memory controller still has to load both aligned chunks of memory before merging them, which can be coded more efficiently if you do the aligned loads yourself and reuse data from registers to multiple vector extractions (bit shifts across SIMD lanes).
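
To illustrate the power-of-two stride look-up mentioned above, a small C sketch (the function names are made up for the example). When each row is padded to `1 << stride_log2` elements, the multiplication in the usual index computation becomes a shift, and the column can be OR-ed in because it never touches the row bits.

```c
#include <assert.h>
#include <stddef.h>

/* Index of (x, y) when the row stride is 1 << stride_log2 elements.
 * x must be smaller than the stride, so the shift and the OR cannot collide. */
size_t index_pow2(size_t x, size_t y, unsigned stride_log2)
{
    assert(x < ((size_t)1 << stride_log2));
    return (y << stride_log2) | x;
}

/* General form with an arbitrary stride, for comparison. */
size_t index_general(size_t x, size_t y, size_t stride)
{
    return y * stride + x;
}
```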

Only having for-each loops is something that would seriously prevent developers from improving their algorithm design skills, by being stuck with iterating over all members and never being able to conveniently use planar formats or neighboring elements by index. I would rather throw out useless for-each/iterator loop crutches, just to prevent new developers from getting stuck in the position-independent element mindset.
stuff like SSE and AVX still has alignment requirements, what about those?

SSE and AVX also have unaligned loads, with once again little to no performance penalty.

I would be thankful if a language pointed out such mistakes in compile time using static analysis

I agree, it would be great if there was a language that could use static analysis to identify alignment issues in situations where it matters. Too bad Zig does nothing even remotely like that!
Hey, thanks for responding, Loris!

kristoff
What you're missing is precisely what the Zig type system was stressing you about: memory alignment.


I can appreciate that. For my purposes I mainly develop on x86, and I have a TODO to get the alignment right for other platforms. Casey actually handles alignment in the arena allocator in one of his episodes, but I used a simplified version to demo here.

And I'll be sure to not lump Zig in with the Rust safety-squad in the future. ;-)
Hi, just want to add my 2 cents about "for" loops: they're handy and we're used to them because they're the same in every language, but if we want to be honest with ourselves, it doesn't really make sense to use them for arbitrary iterations, and Zig's solution of "while with continue expression" is perfectly fine to me. (And I wouldn't call that a "workaround").

Actually in C, I need from time to time a for loop that skips the initialization part (`for(; i < 10; ++i)`), it's not pretty, so Zig's "while with continue expression" hits right at home to me.
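
A small C sketch of the kind of situation I mean, with a made-up function: the index is set up by an earlier loop, so the second loop has nothing to initialize.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical example: counts the spaces from the first ':' in s onward.
 * Returns 0 when there is no ':' at all. */
size_t count_spaces_after_colon(const char *s, size_t len)
{
    size_t i = 0;

    /* First phase: advance i to the first ':' (or to len). */
    while (i < len && s[i] != ':') {
        i++;
    }

    /* Second phase: continue from wherever i stopped, hence the empty init. */
    size_t spaces = 0;
    for (; i < len; ++i) {
        if (s[i] == ' ') spaces++;
    }
    return spaces;
}
```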

Edited by Guntha on

Okay, now you've piqued my interest. XD

What exactly do you find dishonest about a range [0-10)? Is it because you claim we should always use the length of a known range, and not hard code the length? I'm actually curious.


As an aside, I don't really see an issue with for (; i < 10; ++i); it simply looks to me like the programmer wants to loop over 10 elements of something.

But it brings up the point: This sort of thing is highly subjective, which is why there should never be any language bike shed policies imposed, and it is the reason why all BDFL languages are the Wrong Direction for the programming community (in my estimation).

This includes Jai.

The reason C became popular was because it grew--in a compression oriented fashion--out of existing code from dozens (or more?) of good programmers. All the new languages trying to compete with C are picking the wrong battles, and trying to be a top-down design based (for all practical purposes) on the opinions of a single programmer (the BDFL).

If we are truly being honest with ourselves, we would stop with the syntax/paradigm/orientation/"safety"/fad bike shedding, and move toward the simplest language that allows ideal metaprogramming/code generation facilities so the programmer, and not the language designer, can decide what color their own ******* shed will be.

Again, in my estimation.


Replying to Guntha (#24505)