On day 178, someone asked Casey why he considered using atomics for the timers instead of thread local storage, and he responded that thread local storage has multiple problems and may hurt performance more, while atomics are "almost entirely free".
But in the recent Engine Simplification video, Casey said that his multithreaded coding style has changed a lot, then explained why using atomics here is very bad and said that he had considered using thread local storage instead.
And on day 665, while discussing local static variables, he again said that thread local storage is "very slow" and that implementing local static variables that way is terrible.
This all seems very confusing to me. Is thread local storage fast or slow? Should I use it? When should I use it? What about atomics? If they're all terrible then what are the alternatives?
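
To make the question concrete, here is a minimal sketch of the two approaches as I understand them. This is not Casey's code: every name here (SharedHitCount, LocalHitCount, MergedHitCount, RecordHitsAtomic, RecordHitsThreadLocal, RunOnThreads) is made up for illustration, and I'm assuming C++11 `std::atomic` and `thread_local` rather than whatever intrinsics or compiler extensions the actual codebase uses.

```cpp
#include <atomic>
#include <cstdint>
#include <thread>
#include <vector>

// Approach 1: one shared counter that every thread bumps atomically.
static std::atomic<uint64_t> SharedHitCount{0};

// Approach 2: each thread bumps its own private counter, then merges
// the result into a shared total once, when it is done.
static thread_local uint64_t LocalHitCount = 0;
static std::atomic<uint64_t> MergedHitCount{0};

static void RecordHitsAtomic(int Count)
{
    for (int Index = 0; Index < Count; ++Index)
    {
        // Every hit is an atomic read-modify-write on the same address,
        // so all threads contend for the same cache line.
        SharedHitCount.fetch_add(1, std::memory_order_relaxed);
    }
}

static void RecordHitsThreadLocal(int Count)
{
    for (int Index = 0; Index < Count; ++Index)
    {
        // Plain increment of this thread's own copy; no sharing here,
        // but each access goes through the thread local storage lookup.
        ++LocalHitCount;
    }

    // A single atomic add per thread to publish the result.
    MergedHitCount.fetch_add(LocalHitCount, std::memory_order_relaxed);
    LocalHitCount = 0;
}

// Runs Fn on ThreadCount threads and returns the resulting total.
static uint64_t RunOnThreads(void (*Fn)(int), int ThreadCount, int Count,
                             std::atomic<uint64_t> &Total)
{
    Total.store(0);

    std::vector<std::thread> Threads;
    for (int Index = 0; Index < ThreadCount; ++Index)
    {
        Threads.emplace_back(Fn, Count);
    }
    for (std::thread &Thread : Threads)
    {
        Thread.join();
    }

    return Total.load();
}
```

Both versions produce the same total; what I can't work out is which one costs more in practice, since that presumably depends on how contended the atomic is and on how the compiler and platform implement thread local access.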