uint8_t* cast -> struct* + deref (strict-aliasing)

Sorry in advance if this was already discussed in the videos; I haven't seen them all yet.

I'm more of a C++ guy, so Handmade Hero is quite an interesting experience for me, seeing a more C-style approach. I was "analyzing"/"digging deeper" into the approach used in Handmade Hero for memory management, where a uint8_t* is cast to some struct* and then dereferenced. From what I understand, in general that's undefined behaviour.

http://en.cppreference.com/w/cpp/language/reinterpret_cast
If AliasedType does not satisfy these requirements, accessing the object through the new pointer or reference invokes undefined behavior. This is known as the strict aliasing rule and applies to both C++ and C programming languages.
http://cellperformance.beyond3d.c...liasing.html#cast_to_char_pointer
The converse is not true. Casting a char* to a pointer of any type other than a char* and dereferencing it is usually in violation of the strict aliasing rule.
As noted by Pinskia, it is not dereferencing a char* per se that is specifically recognized as a potential alias of any object, but any address referring to a char object. This includes an array of char objects, as in the following example, which will also break the strict aliasing assumption.
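For concreteness, a minimal sketch of the pattern I mean (the struct and all names are made up for illustration):

```cpp
#include <cstdint>

// Hypothetical struct standing in for some Handmade Hero-style state.
struct game_state
{
    int Score;
    float Time;
};

// Sub-allocate a game_state at the start of a raw byte block and use it
// in place. Strictly read, the cast below is exactly what the
// strict-aliasing rule is about, even though the memory is only ever
// accessed as game_state afterwards.
inline int UseArena(void)
{
    // Raw block, as if it came from VirtualAlloc/mmap; alignas keeps the
    // example free of alignment issues as well.
    alignas(game_state) static uint8_t Memory[64];

    game_state *State = (game_state *)Memory;
    State->Score = 42;
    State->Time = 1.5f;
    return State->Score;
}
```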

So do I understand correctly that this is a measured risk? The compiler optimisations from Mike Acton's article shouldn't apply here, given how compilers handle it in reality.

P.S. I found out from one article that some time ago MSVC wasn't doing strict-aliasing optimisations, so it's even less of a problem. But my question is more about understanding: is this kind of work with memory guaranteed to be allowed by the standard, or does it just happen to work because of how compilers behave?

Edited by Darkius on
Generally speaking, strict aliasing was always a bad idea and continues to be so :)

I am not sure what the spec currently does or does not say. In general I find the spec to be written primarily by people whose programming methodologies I find severely lacking, so I tend to ignore it as much as possible.

Definitely the default compilation options of all compilers I currently use (LLVM and MSVC) do not cause any problems for sub-allocating memory out of blocks. If that were to break in a future revision, that would be the time when I stop using that compiler, whatever it is :) My hope is that I will not actually have to use C or C++ anymore within the next five years or so, as I have no faith in their futures as languages, so as long as the compilers still work for sensible code for the time being, that's all that really concerns me.

If you're planning on programming in C/C++ for the long haul, you might have to care more about what does and doesn't get stated in the spec, since you never know when compilers might start demanding strict adherence.

- Casey
Ok, I see. Thanks for the clarification.
cmuratori
Generally speaking, strict aliasing was always a bad idea and continues to be so :)

If I had a vote on the standards committee, I would include a keyword which meant the opposite of restrict. It seems to me that 99.9% of the time, the strict aliasing rule is more or less correct, at least for incompatible struct/class types. (Pointers to basic types and void pointers are another matter.) The only time when it isn't is when the programmer is doing advanced memory hackery, such as implementing an allocator. If you're doing that, you already know which pointers need to be used with care.

But actually, there's a more general problem here, which is that the C and C++ standards are full of nasal demons, but are very light on guarantees. Wouldn't it be nice if as well as a bunch of "don't do that, because it would prevent an optimisation", there was more of "do that, because the optimisation WILL happen if you do".

At the very least, the standards committee should agree on a way to write memory allocators which has their stamp of approval, and is guaranteed to work in all subsequent standards.
The problem with that is that it's not easy to figure out the expected access pattern of a function from its signature.

For example, will a "void Foo(some_struct* a, some_struct* b, u32 number)" only access the elements directly referenced by the pointers (which allows it to be called as Foo(array, array+1, 64);), or does it see them as buffers of number elements, in which case it can't be called like that?

Also, you usually get the pointers from a function somewhere. Trying to decide at compile time whether the pointers you are passing violate restrict runs into the halting problem. It's much simpler to add an if at the start of the function that checks the access pattern derived from the function's source and then branches to an optimized version if there is no overlap.
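That runtime-dispatch idea could look something like this (a sketch; AddArrays and friends are invented names):

```cpp
#include <cstddef>

// Scalar fallback: correct for any pointers, overlapping or not.
static void AddArraysSafe(float *Dst, const float *Src, size_t Count)
{
    for(size_t I = 0; I < Count; ++I) Dst[I] += Src[I];
}

// "Optimized" path the compiler is free to vectorize, valid only when
// the two ranges do not overlap.
static void AddArraysNoAlias(float * __restrict Dst,
                             const float * __restrict Src, size_t Count)
{
    for(size_t I = 0; I < Count; ++I) Dst[I] += Src[I];
}

// The branch described above: check for overlap at runtime and pick
// the appropriate version.
void AddArrays(float *Dst, const float *Src, size_t Count)
{
    if((Dst + Count <= Src) || (Src + Count <= Dst))
    {
        AddArraysNoAlias(Dst, Src, Count);
    }
    else
    {
        AddArraysSafe(Dst, Src, Count);
    }
}
```

Amusingly, ordering comparisons between pointers into unrelated objects are themselves not fully specified by the standard, so even this dispatch relies on how compilers behave in practice.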

That information needs to be added to the function signature for the compiler to be able to do any automatic verification on it. Also, if you make restrict transitive, then a doubly linked list is invalid.
ratchetfreak
Problem with that is that it's not easy to figure out the expected access pattern of a function from its signature.

I probably should have been more clear about this.

Aliasing is, in general, a global property of a program. However, it's possible to have a conservative approximation which gives you reasonably strong guarantees. Like "const", you can make it a compiler-checked part of the type system. Unlike "const", if you pick a sensible default, you should never have to annotate types unless you're writing memory hackery.

(Incidentally, my postgrad work was implementing a strongly-typed kind of aliasing as part of the type system of a programming language.)
Actually I just think it's all extremely stupid, to be honest. I have really low tolerance for the whole "spec-based optimization" ridiculousness.

First of all, when you just write reasonable C code to begin with, you don't _need_ much in the way of aggressive compiler optimizations most of the time. And for the times when you actually care, it's easy to add restrict.
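For illustration, opting in only where it matters could look like this (a sketch; ScaleInto is an invented name, and __restrict is the common compiler spelling of the C99 keyword in C++ mode):

```cpp
// Hypothetical hot loop: tell the compiler the buffers don't alias,
// only here, where it was actually measured to matter. With the
// annotation the compiler may vectorize freely; everywhere else it
// just generates correct code.
void ScaleInto(float * __restrict Out, const float * __restrict In,
               float S, int Count)
{
    for(int I = 0; I < Count; ++I)
    {
        Out[I] = S * In[I];
    }
}
```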

Most of this stuff is just there because the way C++ committee people advocate coding (and their own STL/template nightmare ideas of standard libraries) _produce_ code that is so obtuse and verbose that you actually need things like assuming no aliasing everywhere in order to make it OK.

It's all totally absurd.

For any sane C code, the way more important thing is _correctness_ of the generated code, because bugs are way more important than performance these days on code that's not a C++ nightmare. So having all these hidden "gotchas" like aliasing optimizations or "same memory arena" pointer arithmetic constraints is actually way, way, WAY worse than just not doing the optimizations.

And as Pseudonym73 pointed out as well, I do think that a way more productive use of the committee's time would have been to focus on keywords that _add_ optimizations, so you can just put them where they need to be and not worry about the compiler doing something absurd in random parts of your code because some technical reading of a spec said it could.

DJB tweeted about this recently, actually, and I couldn't agree more:

- Casey

Edited by Casey Muratori on
While not a case of strict-aliasing, another potential problem with these types of casts is non-alignment of data.

When reading a char buffer from network or disk, it is often convenient to cast a pointer to the required type in order to "extract" some data from said buffer. But this will often result in mis-aligned data access if care is not taken. (alignment is one of the ways in which pragma pack modifies code gen, so that can help)
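One common way to sidestep both the aliasing and the alignment problem when parsing such buffers is to memcpy the field out instead of casting the pointer (a sketch; ReadU32 is an invented name):

```cpp
#include <cstdint>
#include <cstring>

// Read a uint32_t (in native byte order) from an arbitrary, possibly
// unaligned offset in a byte buffer. On x86 and modern ARM the memcpy
// compiles down to a plain load, but the code stays correct on
// hardware with strict alignment requirements.
inline uint32_t ReadU32(const uint8_t *At)
{
    uint32_t Result;
    memcpy(&Result, At, sizeof(Result));
    return Result;
}
```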

In HH, I suspect from memory of the implementation that such a situation may exist in the asset store, where arbitrary-length chunks are allocated from a large buffer and the header of a chunk is then cast to a struct. There is likely no guarantee that this struct would be properly aligned (though there may be a reasoned argument that it is; I'd have to look at the code :))

Fortunately, Intel hardware pays no penalty for non-aligned access. But on other systems, such as Arm, you will get zinged each time.
raxfale
Fortunately, Intel hardware pays no penalty for non-aligned access. But on other systems, such as Arm, you will get zinged each time.


That's not correct. On ARM it depends on the architecture. For anything less than ARMv6, and excepting ARMv6-M, it is true: the hardware will generate an unaligned-access exception. But in some cases the OS will handle that and perform the unaligned load or store for you. This is the case on older Android phones.

But on newer ARM architectures (which pretty much includes all iPhone models and any modern Android or Raspberry Pi), unaligned access is perfectly fine, same as on Intel hardware.
Good to know, one less thing to worry about :). Do you know if there is a performance hit? (I have seen segfaults on unaligned access in the past, and even the OS trap seemed not to handle certain floating-point instructions.)

Without putting words in your mouth, mmozeiko, do you then consider alignment requirements to be purely a legacy issue?

Are we heading for a future where all our structs should be tightly packed? It would seem to me that tight packing would give an advantage in cache utilisation... at the expense of a single read occasionally spanning two cache lines.
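For what it's worth, the trade-off is visible directly in sizeof. A hypothetical example:

```cpp
#include <cstdint>

// Natural layout: the compiler inserts padding so each member sits at
// its preferred alignment.
struct Natural
{
    uint8_t  Tag;    // 1 byte, then padding before Value
    uint32_t Value;  // 4 bytes, 4-aligned
};                   // sizeof == 8 on typical ABIs

#pragma pack(push, 1)
// Tightly packed: no padding, smaller and denser in cache, but Value
// may now sit at a misaligned address.
struct Packed
{
    uint8_t  Tag;
    uint32_t Value;
};                   // sizeof == 5
#pragma pack(pop)
```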

To take the thread even further off topic, how is one lonely programmer to make such a decision... I certainly don't know the details of the breadth of issues involved. I feel that if standards bodies have a place, then this sort of thing is it. For the moment, CS dogma is that alignment is the rule, so that is what I do. But things do change over time; I'll try to keep up :-)
Well, it depends on your target. If you are coding for microcontrollers (Cortex-M0 or something even smaller), then alignment matters. On newer ARM systems (like high-end Androids) it's pretty much irrelevant. Same as on Intel.

Not sure about structure packing. Do you really have so many structures that are 1/2/3 bytes over a 4-aligned size? Usually you put in ints and floats, so it doesn't happen that much.

I wouldn't blame the alignment issue on standards bodies or the programming language. It is specific to the hardware. If hardware is designed so that it doesn't allow unaligned access, there is some reason for that (cost, performance, ...). You must know your hardware. For ARM or Intel the official manuals are pretty clear on alignment requirements.
Hmm, I think I would argue that for only an extra line or two of code you could handle alignment in a portable manner, such that it adapts to the underlying hardware (i.e. std::align, alignof).
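A sketch of what that adaptive handling could look like, using std::align and alignof as mentioned (PushStruct is an invented name):

```cpp
#include <cstddef>
#include <cstdint>
#include <memory>

// Bump allocator that respects the alignment of the requested type.
// std::align adjusts At forward to the next address suitable for T
// (and shrinks SpaceLeft by the padding), so the same code adapts to
// whatever alignment the target hardware/ABI requires.
template <typename T>
T *PushStruct(void *&At, size_t &SpaceLeft)
{
    void *Result = std::align(alignof(T), sizeof(T), At, SpaceLeft);
    if(Result)
    {
        // Reserve the object itself and advance the cursor past it.
        At = (unsigned char *)Result + sizeof(T);
        SpaceLeft -= sizeof(T);
    }
    return (T *)Result; // nullptr if it didn't fit
}
```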
For unaligned access on x86(-64), it seems to depend on the specific microarchitecture. For some (Sandy Bridge and Nehalem) the performance difference is negligible or zero, except for certain SSE instructions, while older microarchitectures have significant performance problems. However, it seems that the alignment requirement may be lifted in the future. The problem, from my testing, seems to occur when you cross a cache line because the register is too big to "fit in".
Implementing an allocator or doing arenas will basically never be a problem with regard to strict aliasing. First, char can alias any type, which allows moving and clearing memory of any type. More importantly, the "object" at a memory location has the type of whatever was last written to it, no matter how many pointer casts happen in between. It is also completely fine to reuse the same memory address by rewriting it with another type.

The problem with strict aliasing only occurs when memory is written as one type of object and then read as an unrelated type. In that case the compiler may do unexpected things. The widely supported workaround is a union: write to one union member and read from another. This is explicitly allowed in C (since C99), though technically not in C++, but it is supported by all sane compilers. (I actually prefer code that violates strict aliasing to go through a union, since it clearly documents the intent of reinterpreting memory.)
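A sketch of both the union pun and the fully portable memcpy alternative, with invented names:

```cpp
#include <cstdint>
#include <cstring>

// Classic union pun: write one member, read another. Explicitly
// allowed in C99 and later; technically not in C++, but supported by
// all major compilers as an extension.
inline uint32_t FloatBitsUnion(float F)
{
    union { float F; uint32_t U; } Pun;
    Pun.F = F;
    return Pun.U;
}

// The version blessed by both language standards: memcpy between
// objects of different types. Compilers turn this into a plain move.
inline uint32_t FloatBitsMemcpy(float F)
{
    uint32_t U;
    memcpy(&U, &F, sizeof(U));
    return U;
}
```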