itzjac
35 posts
#20606 Better performance when working with strings
3 weeks, 2 days ago Edited by itzjac on Feb. 26, 2019, 1:43 p.m.

Hi!

In Episode 154 Casey mentioned the poor performance of the standard string library.
This reminded me of the first project I worked on, where we used CryEngine2.

CryEngine2 used the CString class, the standard string library on Windows. I believe the latest CryEngine still uses the same string library, but I am not 100% sure whether it was used only for MFC or also for the runtime game. You can check that yourself in the CryEngine source anyway.

When we were working with strings for our own project, I clearly remember we changed every CString to std::string. I can imagine this had a huge impact on compilation times, as Casey pointed out, but it also turned our code base into a very confusing nightmare.
All of the very experienced coders, the senior ones, agreed to convert everything to std::string back then (not so experienced after all). Casey's episode brought that old question back to my attention: why? Maybe it was better for portability, but neither our project nor the CryEngine2 Editor was ever ported to Linux or any other platform.

Is CString so superior to std::string that the CryEngine guys decided to go all in on Windows?

Regarding the usage of std::string: for anyone with experience using CString, would performance be better if we avoided std::string in our project (compilation times, maybe runtime execution too)?

Are you aware of any other string library that performs better for games, or should we just apply compression-oriented programming and implement what we need ourselves?

Cheers!

Guntha
26 posts
#20607 Better performance when working with strings
3 weeks, 2 days ago Edited by Guntha on Feb. 26, 2019, 2:50 p.m.

Hello itzjac,

I'm not sure there can be a generic string library that has "better performance" for any kind of game, and I can hardly think of a game where string processing performance is critical, or that even needs string processing at all. The only example I can think of is a language-agnostic online chat room, and even that is more a font rendering problem than a string manipulation problem.

I remember reading in the very good book Game Engine Architecture, 2nd Edition (there's a 3rd edition) by Jason Gregory that they try to avoid using a string class at all for their games, except a "Path" class for manipulating file-system paths easily and in a platform-agnostic way.
itzjac
35 posts
#20608 Better performance when working with strings
3 weeks, 2 days ago

Hi Guntha,

And I wouldn't disagree with your statement. Rather than asking for a generic library, I would like to get input on why to pick one library over another. Specifically, between CString and std::string I don't see a lot of benefit in going one way or the other, and I don't have any real numbers to confirm it.

On the other hand, with string manipulation being so critical and common, I guess it is better to go the "do it yourself" way than to choose a library, at least for the runtime part. What about the tools?
ratchetfreak
435 posts
#20609 Better performance when working with strings
3 weeks, 2 days ago

It's more likely that std::string has to cater to too many kinds of applications, so it makes concessions for some use cases that hurt performance for others.

One of those things is the short string optimization.
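
To make the short string optimization concrete, here is a rough sketch of the idea, not the layout any particular standard library uses. SmallString and kInlineCapacity are made-up names, and the 15-byte threshold is an assumption picked for illustration. Short strings live inside the object itself, so they cost no heap allocation, while every access pays a branch to find out where the characters are.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical small-string type (illustration only): strings of up to
// kInlineCapacity bytes are stored inside the object, longer ones on the heap.
class SmallString {
public:
    static constexpr std::size_t kInlineCapacity = 15;  // assumed threshold

    explicit SmallString(const char* text) : size_(std::strlen(text)) {
        char* dest = inline_;
        if (size_ > kInlineCapacity) {
            heap_ = static_cast<char*>(std::malloc(size_ + 1));
            assert(heap_ != nullptr);
            dest = heap_;
        }
        std::memcpy(dest, text, size_ + 1);  // copies the terminator too
    }
    ~SmallString() {
        if (is_heap()) std::free(heap_);
    }
    // Copying is left out to keep the sketch short.
    SmallString(const SmallString&) = delete;
    SmallString& operator=(const SmallString&) = delete;

    // The branch every accessor pays for: where do the characters live?
    bool is_heap() const { return size_ > kInlineCapacity; }
    const char* c_str() const { return is_heap() ? heap_ : inline_; }
    std::size_t size() const { return size_; }

private:
    std::size_t size_;
    union {
        char inline_[kInlineCapacity + 1];  // short strings: no allocation
        char* heap_;                        // long strings: one allocation
    };
};
```

Whether this is a win depends on the workload: it helps when most strings are short, and it is a cost when most are long or when the object itself should stay small.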

Another factor is how many fields the string object has and how much data is put into the backing buffer.

Also consider which APIs you need to interface with, and how they handle the strings you pass them or that they pass back to you.

Then there is being able to use realloc-like memory allocation when the string needs to grow. If you never build strings of unknown length, you don't need that.
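
As a sketch of what realloc-like growth could look like in a hand-rolled builder (StringBuilder, append, and release are hypothetical names, and the doubling policy is just an assumption): std::realloc may extend the block in place instead of always allocating a new block and copying, which the standard allocator interface does not offer since it only has allocate and deallocate.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical append-only builder whose buffer grows with std::realloc.
struct StringBuilder {
    char* data = nullptr;
    std::size_t size = 0;      // characters stored, excluding the terminator
    std::size_t capacity = 0;  // bytes allocated
};

// Appends `text`, doubling the capacity until it fits.
// Returns false (leaving the builder unchanged) if the allocation fails.
bool append(StringBuilder& sb, const char* text) {
    assert(text != nullptr);
    const std::size_t len = std::strlen(text);
    const std::size_t needed = sb.size + len + 1;
    if (needed > sb.capacity) {
        std::size_t new_capacity = sb.capacity ? sb.capacity * 2 : 16;
        while (new_capacity < needed) new_capacity *= 2;
        // realloc may grow the block in place; otherwise it moves the bytes.
        void* grown = std::realloc(sb.data, new_capacity);
        if (!grown) return false;
        sb.data = static_cast<char*>(grown);
        sb.capacity = new_capacity;
    }
    std::memcpy(sb.data + sb.size, text, len + 1);  // includes the terminator
    sb.size += len;
    return true;
}

// Frees the buffer and resets the builder to its empty state.
void release(StringBuilder& sb) {
    std::free(sb.data);
    sb = StringBuilder{};
}
```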

Another feature to consider is copy-on-write. This makes passing strings fairly cheap and O(1) (an atomic increment is kinda a cache-killing annoyance though).
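
A minimal sketch of copy-on-write, assuming a made-up CowString type with a reference-counted buffer (none of these names come from a real library). Copying only bumps an atomic counter, and the characters are duplicated on the first write to a shared buffer. It is not a thread-safe design, just an illustration of where the atomic traffic comes from.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical copy-on-write string (illustration only, not thread-safe
// for concurrent writers).
class CowString {
    struct Rep {
        std::atomic<int> refs{1};  // number of CowString objects sharing this
        std::size_t size;
        char* chars;
        Rep(const char* text, std::size_t n) : size(n), chars(new char[n + 1]) {
            std::memcpy(chars, text, n);
            chars[n] = '\0';
        }
        ~Rep() { delete[] chars; }
    };
    Rep* rep_;

    // Drops one reference and frees the buffer when the last owner is gone.
    static void release(Rep* r) {
        if (r->refs.fetch_sub(1, std::memory_order_acq_rel) == 1) delete r;
    }

public:
    explicit CowString(const char* text)
        : rep_(new Rep(text, std::strlen(text))) {}

    // O(1) copy: share the buffer, pay one atomic increment.
    CowString(const CowString& other) : rep_(other.rep_) {
        rep_->refs.fetch_add(1, std::memory_order_relaxed);
    }
    CowString& operator=(const CowString&) = delete;  // omitted for brevity
    ~CowString() { release(rep_); }

    const char* c_str() const { return rep_->chars; }
    bool shares_buffer_with(const CowString& o) const { return rep_ == o.rep_; }

    // Writes detach first: if anyone else shares the buffer, copy it.
    void set_char(std::size_t i, char c) {
        assert(i < rep_->size);
        if (rep_->refs.load(std::memory_order_acquire) > 1) {
            Rep* own = new Rep(rep_->chars, rep_->size);
            release(rep_);
            rep_ = own;
        }
        rep_->chars[i] = c;
    }
};
```

Note that every mutating call has to check the counter, which is part of why this trade-off does not suit every application.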

Related to that is substring sharing.
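
Substring sharing could look roughly like this. SharedSlice is a hypothetical type, and backing it with a std::shared_ptr to an immutable std::string is just one assumed way to do it. Taking a substring copies no characters; it only records an offset and a length into the shared buffer.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <utility>

// Hypothetical slice type: several slices refer to one shared, immutable buffer.
class SharedSlice {
public:
    explicit SharedSlice(std::string text)
        : buffer_(std::make_shared<const std::string>(std::move(text))),
          offset_(0),
          size_(buffer_->size()) {}

    // O(1): shares the buffer and only adjusts offset and length.
    SharedSlice substr(std::size_t pos, std::size_t count) const {
        assert(pos <= size_ && count <= size_ - pos);
        SharedSlice s(*this);
        s.offset_ += pos;
        s.size_ = count;
        return s;
    }

    // Not null-terminated: the slice may end in the middle of the buffer.
    const char* data() const { return buffer_->data() + offset_; }
    std::size_t size() const { return size_; }

    // Explicit copy for APIs that need an owning, terminated string.
    std::string to_string() const { return std::string(data(), size_); }

    // Number of slices currently keeping the buffer alive.
    long owners() const { return buffer_.use_count(); }

private:
    std::shared_ptr<const std::string> buffer_;
    std::size_t offset_;
    std::size_t size_;
};
```

The costs are that a slice is no longer null-terminated, so C APIs need a copy, and that a small slice keeps the whole original buffer alive.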

However, some of those features are impossible to implement given the fixed API that the std spec provides. The same kind of constraint is one of the reasons why no one who cares about performance will use std::unordered_set, which requires the implementation to use external chaining in order to conform to the buckets section of the API.
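
For reference, this is the bucket interface in question, in a small sketch (count_via_buckets is a made-up helper). The container has to expose each bucket as an iterable range of elements through bucket_count(), bucket(key), begin(n), and end(n), which is hard to offer from a flat open-addressing table.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_set>

// Walks every bucket through the standard bucket interface and counts the
// elements it finds. Each element must report the bucket it was found in.
std::size_t count_via_buckets(const std::unordered_set<std::string>& set) {
    std::size_t total = 0;
    for (std::size_t b = 0; b < set.bucket_count(); ++b) {
        // begin(b)/end(b) iterate only over the elements stored in bucket b.
        for (auto it = set.begin(b); it != set.end(b); ++it) {
            assert(set.bucket(*it) == b);
            ++total;
        }
    }
    return total;
}
```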