Arena Cache Friendliness

Hello there,

I've watched almost 40 episodes of Handmade Hero (no spoilers please) and got to thinking about how the Memory Arena is stored.

Currently, the pointer to the data that the arena manages can be stored far from the metadata (the size and used members). Consider code that pushes data to the arena, such as:
InitializeArena(myArena, ...);
for (unsigned int i = 0; i < someNumber; ++i)
{
    void* data = createSomeData();
    PushArray(myArena, data, ...);
}

Wouldn't it then be more cache friendly to store the size and used members as close to the data pointer as possible?

I don't know what logic is used to keep and evict data from the cache, so if the question doesn't make sense, do let me know :)
Kipt
Wouldn't it then be more cache friendly to store the size and used members as close to the data pointer as possible?

I don't think so: the members of the arena (base, size, used) are only used when allocating or releasing memory. They are not involved in using the data stored in that memory, so when you work on the data you won't access the arena members.

Your code example doesn't look right. When using the PushXXX functions, you are not moving memory around; you are requesting some space in the memory that an arena manages. Those functions return a pointer to that memory, which you can then use as you wish.

InitializeArena(myArena, ...);
DataType* dataArray = PushArray(myArena, someNumber, DataType);

for (unsigned int i = 0; i < someNumber; ++i)
{
    useSomeData(dataArray + i);
}
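
To make the "requesting some space" point concrete, here is a minimal sketch of what such an arena could look like. The struct layout and the PushSize_ helper are assumptions for illustration (the real Handmade Hero definitions may differ), and alignment is ignored to keep it short.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical arena layout using the member names from this thread. */
typedef struct memory_arena
{
    uint8_t* base;  /* start of the memory block the arena manages */
    size_t   size;  /* total capacity in bytes */
    size_t   used;  /* bytes handed out so far */
} memory_arena;

static void InitializeArena(memory_arena* arena, void* memory, size_t size)
{
    arena->base = (uint8_t*)memory;
    arena->size = size;
    arena->used = 0;
}

/* Reserve `bytes` bytes. Only the arena metadata is touched here;
   no data is copied or moved. Alignment is not handled in this sketch. */
static void* PushSize_(memory_arena* arena, size_t bytes)
{
    assert(bytes <= arena->size - arena->used);
    void* result = arena->base + arena->used;
    arena->used += bytes;
    return result;
}

/* Assumed macro shape: (arena pointer, element count, element type). */
#define PushArray(arena, count, type) \
    ((type*)PushSize_((arena), (count) * sizeof(type)))
```

The only reads and writes of base, size and used happen inside PushSize_. Once the caller holds the returned pointer, the arena struct is no longer involved.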
mrmixer
Your code example doesn't look right.

You're right, it should be more like:
InitializeArena(myArena, ...);
for (;;)
{
    SomeType* data = PushArray(myArena, 5, SomeType);
    initData(data);
    if (someCondition(data))
        break;
}

Although that example would be better off "pre-pushing" one bigger array instead of pushing in a loop, as long as you can determine the size beforehand.

mrmixer
the members of the arena (base, size, used) are only used when allocating or releasing some memory.

And when you need to access the data!
SomeType* base = (SomeType*)gameState->myArena.base; // Must access Arena struct data.
for (unsigned int i = 0; i < someNumber; ++i)
    modifyData(base + i);


Granted, it would only be one miss and probably not worth thinking about. And even with the pointer stored just before the data, it would still be a miss as soon as you read past a cache line's worth of data in the arena.
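
For reference, the "metadata stored just before the data" layout could be sketched roughly like this. All names here (inline_arena, InitializeInlineArena, InlinePush) are made up for illustration, and the memory block is assumed to be suitably aligned for the header.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical variant: the metadata lives at the start of the block it manages. */
typedef struct inline_arena
{
    size_t size;  /* bytes available after the header */
    size_t used;  /* bytes handed out so far */
} inline_arena;

/* Place the header at the front of `memory`; the data region starts right
   after it. Assumes `memory` is aligned well enough to hold an inline_arena. */
static inline_arena* InitializeInlineArena(void* memory, size_t totalSize)
{
    assert(totalSize >= sizeof(inline_arena));
    inline_arena* arena = (inline_arena*)memory;
    arena->size = totalSize - sizeof(inline_arena);
    arena->used = 0;
    return arena;
}

/* First byte of the data region, directly behind the header. */
static uint8_t* InlineArenaData(inline_arena* arena)
{
    return (uint8_t*)(arena + 1);
}

/* Reserve `bytes` bytes from the data region (no alignment handling). */
static void* InlinePush(inline_arena* arena, size_t bytes)
{
    assert(bytes <= arena->size - arena->used);
    void* result = InlineArenaData(arena) + arena->used;
    arena->used += bytes;
    return result;
}
```

With this layout the header can share a cache line only with the first few bytes of data, so anything beyond that first line is loaded separately either way.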
Kipt
And when you need to access the data!

You generally store the pointer you got back from PushXXX, so you don't use the arena base pointer anymore. Getting back to a specific data pointer from the base pointer in the arena system is harder than just storing the pointer (unless you store only one type in the arena, in which case the arena is just an array).

For the cache question: even if you access the data the way you wrote it, you probably wouldn't miss. When data + 0 is loaded, the cache line that comes in also holds data + 1, data + 2, and so on, as long as they are contiguous in memory and fit in the same cache line. But I think there is no "general way" to avoid cache misses, since the problem depends on the specific data and how it is used.
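
As a rough illustration of the contiguous case, the sketch below counts how many cache lines a run of elements spans. It assumes 64-byte cache lines (common on current x86-64 CPUs, but not universal) and a base address aligned to a cache line; the function names are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 64-byte cache lines. Check your target CPU. */
#define CACHE_LINE_SIZE 64

/* Index of the cache line containing byte `offset`, counted from a
   cache-line-aligned base address. */
static size_t CacheLineOfOffset(size_t offset)
{
    return offset / CACHE_LINE_SIZE;
}

/* Number of distinct cache lines touched when reading `count` contiguous
   elements of `elementSize` bytes, starting at a cache-line-aligned base. */
static size_t CacheLinesTouched(size_t count, size_t elementSize)
{
    if (count == 0 || elementSize == 0)
    {
        return 0;
    }
    size_t lastByte = count * elementSize - 1;
    return CacheLineOfOffset(lastByte) + 1;
}
```

Under those assumptions, 16 contiguous elements of 16 bytes each span 4 cache lines, so walking them in order touches 4 lines rather than 16, before even counting any hardware prefetching.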

Edited by Simon Anciaux. Reason: Added cache part