5sw
It doesn't matter what you perceive the speed to be; you can only trust actual measured data.
I'd say it's the only thing that matters (for desktop software, which is what I normally write). But of course I agree, in theory. I measure ALL my code, and always have. I have even created a significant application solely for measuring the speed of my allocator and for testing various pieces of code. All written in assembly.
But since you insist: I now have a working game, with onscreen measurements in milliseconds of every key piece of code. So I will enable those macros and see if there actually is a difference. Just a second...
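(While that runs, for anyone curious: the onscreen timers are conceptually just this. A minimal C sketch with made-up names, not my actual macros, which are assembly.)

```c
#include <windows.h>

/* Sketch of a per-section millisecond timer built on
   QueryPerformanceCounter. Names are illustrative only. */

static LARGE_INTEGER g_freq;

typedef struct {
    LARGE_INTEGER start;
    double ms;              /* last measured duration in milliseconds */
} SectionTimer;

static void timer_init(void)
{
    QueryPerformanceFrequency(&g_freq);
}

static void timer_begin(SectionTimer *t)
{
    QueryPerformanceCounter(&t->start);
}

static void timer_end(SectionTimer *t)
{
    LARGE_INTEGER now;
    QueryPerformanceCounter(&now);
    t->ms = (double)(now.QuadPart - t->start.QuadPart)
            * 1000.0 / (double)g_freq.QuadPart;
}

/* Usage inside the frame loop (hypothetical section names):
       timer_begin(&physicsTimer);
       update_physics();
       timer_end(&physicsTimer);
   then print physicsTimer.ms on screen each frame with TextOut. */
```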
It means nothing, actually. There is absolutely no difference. The average ms is the same, the maximum and minimum ms are the same, and the percentages of frame time spent on physics, drawing, and sound are the same.

There are 800 "entities" on screen (10*10 pixels, drawn as two filled circles each). There are about 400 bullets in total coming from those entities, flying through the "air" (lines). There is a blinking red light when a bullet hits an "entity", plus lightning and sounds of fire, like it was the 4th of July or the end times. This is GDI, though. But GDI can do 280 fps on this machine, so you can still do a lot. I have lighting calculations for the background and for every single "TILE", 66*50 of them. The tiles are drawn with one FillRect and one FrameRect per tile; the background is just a single clear. All "entities" are constantly in flight, using both acceleration and deceleration from and to new positions. They signal when hit, they perform target scans, and they hit-test bullets against all other "entities" each frame, as well as against the TILES.
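To give an idea of what that drawing amounts to, here is a rough C sketch of one FillRect plus one FrameRect per tile under GDI. The tile pixel size and the brushes are placeholders, not the actual values from my game.

```c
#include <windows.h>

/* Rough sketch: draw a 66*50 tile grid with one FillRect and one
   FrameRect per tile. TILE_W/TILE_H and the brushes are assumed. */
#define TILES_X 66
#define TILES_Y 50
#define TILE_W  12
#define TILE_H  12

static void draw_tiles(HDC hdc, HBRUSH fill, HBRUSH frame)
{
    for (int y = 0; y < TILES_Y; y++) {
        for (int x = 0; x < TILES_X; x++) {
            RECT r = { x * TILE_W, y * TILE_H,
                       x * TILE_W + TILE_W, y * TILE_H + TILE_H };
            FillRect(hdc, &r, fill);    /* tile interior        */
            FrameRect(hdc, &r, frame);  /* one-pixel tile border */
        }
    }
}
```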
Aligning all routine labels, and the jump labels of every loop, to 16 bytes, and memory too, gives ZERO difference, except that the exe file gets a lot bigger. This alignment thing is just like the cycle-counting thing: it is a MYTH. Or, to be 100% fair, all I can say is that for my specific code, it appears to be a myth. And it has been a myth across something like 30-40 applications, so I am not just pulling this out of my ass. It very consistently does not matter.
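For anyone who wants to repeat the experiment without hand-placed ALIGN directives: GCC, for example, exposes the same knobs as compiler options, so you can build the same code with and without padding and compare measured frame times. The function and flags below are purely illustrative, not taken from my code.

```c
/* Build once with and once without the alignment options:
       gcc -O2 bench.c
       gcc -O2 -falign-functions=16 -falign-loops=16 -falign-jumps=16 bench.c
   then compare measured times of the same workload. */

#include <stdint.h>

/* GCC also allows forcing the alignment of a single routine's entry: */
__attribute__((aligned(16)))
uint64_t sum_squares(const uint32_t *v, int n)
{
    uint64_t s = 0;
    for (int i = 0; i < n; i++)     /* with -falign-loops=16 the loop top
                                       is padded to a 16-byte boundary */
        s += (uint64_t)v[i] * v[i];
    return s;
}
```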
It may mean something for testing small pieces of code, but you would only be fooling yourself. When you are using every resource for all kinds of data, in a real application, it does not matter anymore. At that point, what matters is efficient cache usage (the memory bottleneck). You don't have the luxury of doing perfect measurements, nor of writing perfect code. The most you can do is a reasonable, workable compromise between your various sub-systems, and you can tune and tune and tune, sometimes making big leaps that are then nullified, for reasons unknown and only guessed at, over the next week of added code. And then you do it again. And you know you only have time for a relatively few "tuning" tests, out of the incredible space of all possible tuning tests you could make to squeeze out more performance. In that situation, data alignment no longer matters. It's an academic "plaything" that has no real value. People who measure cycles of instructions are wasting their time, and that of others.
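To show what I mean by cache usage dwarfing this kind of micro-tuning, here is a minimal C illustration; the matrix size is arbitrary, and the point is only the access order.

```c
#include <stdio.h>
#include <stdlib.h>

/* Cache-friendly vs cache-hostile access order. The matrix is stored
   row-major, so walking it row by row touches memory sequentially,
   while walking it column by column jumps a full row stride on every
   access and keeps missing the cache. */
#define N 4096

static long long sum_rows(const int *m)   /* sequential: cache friendly */
{
    long long s = 0;
    for (int r = 0; r < N; r++)
        for (int c = 0; c < N; c++)
            s += m[r * N + c];
    return s;
}

static long long sum_cols(const int *m)   /* strided: cache hostile */
{
    long long s = 0;
    for (int c = 0; c < N; c++)
        for (int r = 0; r < N; r++)
            s += m[r * N + c];
    return s;
}

int main(void)
{
    int *m = calloc((size_t)N * N, sizeof *m);
    if (!m) return 1;
    /* time these two with the same ms timer as above; the row-wise
       version is typically several times faster at this size. */
    printf("%lld %lld\n", sum_rows(m), sum_cols(m));
    free(m);
    return 0;
}
```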
5sw
I think that's not what people mean in general when they talk about premature optimization. It's just optimization before the time is right for it. A former coworker of mine once spent three days optimizing a function pretty much at the beginning of a project. He got its runtime down from 10 ms to 1 ms, which was rather impressive. But in the finished product this function was called exactly once, during startup. I think that's pretty much the definition of premature optimization. Don't optimize anything until you know (because you measured it) that a certain part of the code needs to go faster.
Yes, I agree, but that's what I'm trying to say too. Only strategic optimization matters much, and the ones you can't find already done by others, you have to discover by starting somewhere. But you also need to test along the way, to learn something. You test each subsystem. Then you test them together, which is never the same thing. Then you add something, like lighting or shadows, and now the whole thing is skewed completely, turning you into a question mark. I mean, the first week I could do 3000 entities without any perceived slowdown. Now I can do at most 1000 in total. Each day I learn new ways to kill cycles, but it's not an exact science. Yes, in a sense it is exact, but my brain is not. There are too many variables to accurately measure all of them happening at once, and for many different situations. I now have an average of "only" 1.8 ms per frame, even in the worst case, but the single worst frame is still 26 ms. Those numbers are more or less confusing. What REALLY tells me something is right or wrong is how I _perceive_ playing the game: how responsive it feels to my keypresses when a lot is going on.
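One thing that makes those numbers a little less confusing is to keep every frame time and look at the whole distribution, not just the average and the worst frame. A small C sketch, assuming the frame times have already been collected in milliseconds; the array and names are hypothetical.

```c
#include <stdio.h>
#include <stdlib.h>

/* Summarize recorded frame times (in ms) so that occasional 26 ms
   spikes show up next to the 1.8 ms average. */

static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

static void summarize(double *ms, int n)
{
    double sum = 0.0;
    for (int i = 0; i < n; i++)
        sum += ms[i];

    qsort(ms, (size_t)n, sizeof ms[0], cmp_double);

    printf("avg %.2f ms, median %.2f ms, 99th %.2f ms, worst %.2f ms\n",
           sum / n,
           ms[n / 2],
           ms[(int)(0.99 * (n - 1))],
           ms[n - 1]);
}
```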