Storing Strings Data

I'm new to Handmade Hero. I've been marathoning it for 4 months and I'm at around episode 130. It has changed how I think about programming, and it's really changing me for the better.

I have an xls file that contains descriptions of skills and power-ups. I chose xls because it's easy to track and easy for content authoring, and probably for localization too if I ever plan for it.
Then this xls file gets processed and turned into something the game can use, probably into csv and then into binary. I don't know the ideal solution for this, but I don't want to use XML or JSON because they are slow to process.

How would you store string data contiguously in a block of memory?

I see two problems that need to be tackled:
- Storing the description data in a contiguous array
- Accessing it quickly by index

If I use char* to store it, the drawback is that I need to know the size of the descriptions up front, and the memory block will have to be allocated with a variable size. Using std::string would be easier to index, but I would like to push myself to do it this way for learning purposes.

What are your suggestions for solving these problems?

Does this get discussed later on? In which episodes?

Thanks!
This is a minor database problem

What you can do is group the fixed-size data (like the actual stats) into a single array and have each element also store a pointer to its variable-length string contents.

Then it becomes a problem of how to allocate the string memory. That can simply be one large char buffer managed as a simple memory arena, which the pointers point into.
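
A minimal sketch of that idea in C, assuming a fixed-capacity buffer and 0-terminated strings. The names StringArena, Skill and push_string, and the damage and cooldown fields, are made up for illustration and are not from Handmade Hero:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical arena: one big buffer plus a count of bytes handed out. */
typedef struct {
    char  *base;  /* start of the buffer */
    size_t size;  /* total capacity in bytes */
    size_t used;  /* bytes handed out so far */
} StringArena;

/* Hypothetical fixed-size record; only the pointer refers to variable-length data. */
typedef struct {
    int         damage;
    int         cooldown;
    const char *description;  /* points into the arena */
} Skill;

/* Copies text, including its terminating 0, to the end of the used part
   of the arena and returns a pointer to the copy. */
static const char *push_string(StringArena *arena, const char *text)
{
    size_t bytes = strlen(text) + 1;
    assert(arena->used + bytes <= arena->size);  /* out of space is a bug here */
    char *dest = arena->base + arena->used;
    memcpy(dest, text, bytes);
    arena->used += bytes;
    return dest;
}
```

The records can then live in a plain Skill array that you index directly, while all the text sits back to back in the one buffer.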

If you want to get fancy, you can align the strings to cache line boundaries and group strings that are commonly read together.
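
For the alignment part, a rough sketch of rounding an offset up to the next cache line. The 64-byte line size is an assumption (common on x86-64, but not universal), and align_up is a made-up helper name:

```c
#include <assert.h>
#include <stddef.h>

#define CACHE_LINE_SIZE 64  /* assumption: check your target CPU */

/* Rounds offset up to the next multiple of alignment.
   alignment must be a power of two. */
static size_t align_up(size_t offset, size_t alignment)
{
    assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
    return (offset + alignment - 1) & ~(alignment - 1);
}
```

You would apply this to the write offset before placing each string or group of strings. The buffer itself also has to start on a cache line boundary for the resulting addresses to be aligned.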
In Handmade Hero you can watch the episodes about the asset file format, which start at week 30 (episode 147).
Casey tackles almost the same problem, but with assets instead of strings.


As @ratchetfreak said, you create an array of (size, pointer) pairs, where pointer gives the file offset of the string and size gives how many characters to read.
The idea is something like:
offset     type
--------------------------------
0:         file format signature
4:         StringsCount
8:         {CharCount, StringOffset} // string 0 entry
16:        {CharCount, StringOffset} // string 1 entry
24:        {CharCount, StringOffset} // string 2 entry
.          .  
.          .
.          .
??:        {CharCount, StringOffset} // string StringsCount-1 entry
//End of the array

//Start of the strings data
StringOffset0:          String0
StringOffset1:          String1
  .                       .
  .                       .
  .                       .
  .                       .
StringOffset(Count-1):  String(Count-1)
End Of File


You can index the array, which starts at offset 8 in this case, as an array of 8-byte structures.
Then you get a string by going to the offset given in its entry and reading as many characters as the entry specifies.
The string data comes after the array.

Even though the string data comes after the array in the file, when creating the file you lay out the string data in memory first, so that you know the offset of every string before you write the array.

The string offsets can be relative to the beginning of the string data instead of the beginning of the file, which might be easier for you to do.
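
Here is a rough sketch of that layout in C, building the file image in memory and then looking a string up by index. It makes several assumptions that are not in the layout above: all fields are 32-bit and written in the machine's native byte order, CharCount does not include a terminator, the strings are stored without a terminating 0, and offsets are measured from the beginning of the file. The signature value and all the names are made up. It also takes a small shortcut: since the table size is known from the string count, it computes each offset while writing instead of laying out the strings separately first.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define STRING_FILE_SIGNATURE 0x53525453u  /* hypothetical magic value */

typedef struct { uint32_t signature;  uint32_t string_count;  } StringFileHeader;
typedef struct { uint32_t char_count; uint32_t string_offset; } StringEntry;

/* Builds the whole file image in one malloc'd block. Caller frees it. */
static uint8_t *build_string_file(const char **strings, uint32_t count, size_t *out_size)
{
    size_t table_size = sizeof(StringFileHeader) + (size_t)count * sizeof(StringEntry);
    size_t total = table_size;
    for (uint32_t i = 0; i < count; ++i) total += strlen(strings[i]);

    uint8_t *file = (uint8_t *)malloc(total);
    if (!file) return NULL;

    StringFileHeader header = { STRING_FILE_SIGNATURE, count };
    memcpy(file, &header, sizeof(header));

    size_t cursor = table_size;  /* the first string starts right after the table */
    for (uint32_t i = 0; i < count; ++i) {
        StringEntry entry = { (uint32_t)strlen(strings[i]), (uint32_t)cursor };
        memcpy(file + sizeof(header) + i * sizeof(entry), &entry, sizeof(entry));
        memcpy(file + cursor, strings[i], entry.char_count);
        cursor += entry.char_count;
    }
    *out_size = total;
    return file;
}

/* Returns a pointer to string `index` inside the image and its length.
   The result is NOT 0-terminated; use the returned count. */
static const char *get_string(const uint8_t *file, uint32_t index, uint32_t *out_count)
{
    StringFileHeader header;
    memcpy(&header, file, sizeof(header));
    assert(header.signature == STRING_FILE_SIGNATURE);
    assert(index < header.string_count);

    StringEntry entry;
    memcpy(&entry, file + sizeof(header) + index * sizeof(entry), sizeof(entry));
    *out_count = entry.char_count;
    return (const char *)(file + entry.string_offset);
}
```

Because the strings are not 0-terminated, you would print one with something like printf("%.*s", (int)count, text). The lookup is a single entry read plus pointer arithmetic, so once the file is loaded into one block, access by index is constant time.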

Edited by Ameen Sayegh on
Thanks, guys, for your detailed responses.

I guess I need to see how Casey implemented it in those episodes first to get a clearer picture.