Register
Handmade Hero»Forums»Code»Dynamic arrays
Vjekoslav Krajačić
9 posts
Dynamic arrays
1 week, 2 days ago Edited by Vjekoslav Krajačić on May 15, 2020, 10:18 a.m.
Hi.
What's your solution when you need contiguous storage for data, but you don't know the size ahead of time. And I don't mean just while developing. It's dependent on the user input, so you have to ship it that way. My first attempt was to create some kind of sliced array, where each slice is a bigger chunk of memory. That way, data should still be cache friendly most of the time, except when it needs to jump to other slice. I want fast access and fast iteration time, so instead of linking the slices, I have another array of pointers to those slices. Each slice is same size, and all elements are same size. Here's my implementation using memory arena. What do you think? How would you improve it?

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
struct SlicedArray
{
	MemArena arena;

	uptr *slices[128]; /* this can be dynamic too */
	sizet sliceCount;
	sizet itemCount;

	sizet effectiveItemSize;
	sizet itemsPerSlice;
};

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
INLINE void initSlicedArray(SlicedArray *array, sizet sliceSize, sizet itemSize, u32 alignment)
{
	array->effectiveItemSize = ALIGN_POW_2(itemSize, alignment);
	/* ensure that at least one item fits */
	ASSERT(sliceSize >= (array->effectiveItemSize + sizeof(MemBlockFooter)));

	sizet effectiveSliceSize = sliceSize - sizeof(MemBlockFooter);
	array->itemsPerSlice = (effectiveSliceSize / array->effectiveItemSize);

	sizet arenaBlockSize = sliceSize;
	initArena(&array->arena, win32Allocate, win32Deallocate, arenaBlockSize);
	array->sliceCount = 0;
	array->itemCount = 0;
}

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
INLINE void *pushItem(SlicedArray *array)
{
	sizet prevSliceCount = array->sliceCount;
	void *result = pushSize(&array->arena, array->effectiveItemSize, noAlignNoClear());
	++array->itemCount;
	array->sliceCount = array->arena.blockCount;
	if (prevSliceCount != array->sliceCount)
	{
		array->slices[array->sliceCount - 1] = (uptr *)array->arena.base;
	}
	return result;
}

1
2
3
4
5
6
7
8
INLINE void *getItem(SlicedArray *array, sizet index)
{
	sizet sliceIndex = (index / array->itemsPerSlice);
	sizet itemIndex = (index % array->itemsPerSlice);
	void *slice = array->slices[sliceIndex];
	void *result = (u8 *)slice + (itemIndex * array->effectiveItemSize);
	return result;
}

 1
 2
 3
 4
 5
 6
 7
 8
 9
10
11
12
13
14
struct Item; /* some data type */

SlicedArray array;
initSlicedArray(&array, MEGABYTES(1), sizeof(Item), 4);

Item *item = pushItem(&array); /* adding */
Item *firstItem = getItem(&array, 0); /* fetching */

/* iteration */
sizet itemIndex;
for (itemIndex = 0; itemIndex < array.itemCount; ++itemIndex)
{
	Item *item = getItem(&array, itemIndex);
}

Simon Anciaux
786 posts
Dynamic arrays
1 week, 1 day ago
There are GingerBill's articles on memory allocation, you can also search for stb stretchy buffer.

One thing that is also possible is to reserve memory without committing it. You just reserve a huge chunk of address space (1GB for example) but only commit 1MB. When you reach the end of the first MB you commit a second one... That way the memory is contiguous, it's just a big array.

Vjekoslav Krajačić
9 posts
Dynamic arrays
6 days, 8 hours ago
I've read both of those resources before. I couldn't find solution to this problem on GingerBill's articles. Closest to this was pool based allocator, but it still uses preallocated memory buffer. Stretchy buffer is just syntactic sugar for using realloc. And if you want to just reserve the memory you still need to know how much of it, right?

To rephrase the question, is there anything better for this kind of situations than using realloc? It feels to me that, as the memory allocated using realloc becomes bigger you're in nastier situation because you start copying a lot of data if OS can't reallocate using old address. Realloc also has the problem that you cant use it for storage where multiple threads write. I just wanted to know is there some kind of best practice when you don't know how much memory you will need, but you want to keep the data contiguous?
Simon Anciaux
786 posts
Dynamic arrays
6 days, 2 hours ago
I don't think there is a best practice. It always depends on the problem you're trying to solve.

In my (small) experience if you don't know the amount of memory that will be used, you can generally guess or dictate an upper bound that will cover most cases, and then either fail or handle special case with a performance penalty.
Dumitru Frunza
24 posts

Apprentice
github.com/dfrunza/hoc

Dynamic arrays
8 hours, 13 minutes ago
I've researched this problem a while back, and the basic idea outlined in your solution is an approach being explored in the literature, and thus could qualify as "best practice".

The main focus of the variants to this solution are on the optimization of the time it takes to access an array item and the minimization of wasted space in the (last) slice.
Here's a paper that presents an implementation with a constant access time and O(sqrt(n)) storage consumption, in case you need to go that far:
Resizable Arrays in Optimal Time and Space

Concerning the alternative, which is to reserve 1GB of memory from the OS, and commit pages when the array needs to grow - the problem I have is this - the reserved memory becomes address space that is unaccessible to other parts of the program.

Programming is hard, let's go shopping!
Simon Anciaux
786 posts
Dynamic arrays
1 minute ago
There is 8 TB of process address space available on Windows 7 and 8; and 128 TB on Windows 10. So there is plenty for everybody. Or do you mean something else ?