A Very Inefficient Way to Profile a Program

I was trying to write my own DirectSound implementation (well, I still am, but that's unimportant to the story), and I noticed that every time I randomly paused the program's execution to check the audio state, more often than not it was in the visual output portion of the program. So I got to thinking what an amusing thing it would be to profile a program by running it, randomly pausing it tens or hundreds of thousands of times during its execution, and simply recording where it happened to be during each pause.

With an arbitrarily large number of (truly) random pauses, and assuming the computer doesn't throttle the processor or anything like that, it seems to me like it could be quite accurate--albeit completely infeasible.

I guess there is no real point to this post; I just thought it was funny and that I might share a bit and see what everyone thinks, or maybe even hold a competition to see if anyone can think of an even worse way to do the job (I just changed the title from "The Most Inefficient" to "A Very Inefficient" in anticipation) :cheer:

Actually that is not at all a bad way to do the job. That is, in fact, the entire idea behind a whole branch of profiler design called "sampling profilers" or sometimes "statistical profilers"! For example, Intel's own "VTune" profiler is exactly this kind of profiler.
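
As a rough sketch of the idea (illustrative only--the worker function, sample count, and sampling interval below are all made up, and a real sampling profiler samples on a timer interrupt and resolves the raw addresses back to function names using debug symbols), the pause-and-record loop on Windows might look something like this:

    // Sketch of a sampling profiler: repeatedly suspend a thread,
    // record its instruction pointer, resume it, and tally the hits.
    #include <windows.h>
    #include <atomic>
    #include <cstdio>
    #include <map>

    static std::atomic<bool> gRunning{true};

    // Stand-in for the program being profiled: a hot loop and a cold loop.
    static DWORD WINAPI Worker(LPVOID)
    {
        volatile double x = 0.0;
        while (gRunning)
        {
            for (int i = 0; i < 100000; ++i) x += i * 0.5; // hot
            for (int i = 0; i < 1000; ++i)   x -= i * 0.5; // cold
        }
        return 0;
    }

    int main()
    {
        HANDLE worker = CreateThread(0, 0, Worker, 0, 0, 0);

        std::map<DWORD64, int> hits; // instruction pointer -> sample count
        for (int sample = 0; sample < 1000; ++sample)
        {
            Sleep(1); // crude, roughly periodic sampling interval

            SuspendThread(worker); // "pause the program"
            CONTEXT ctx = {};
            ctx.ContextFlags = CONTEXT_CONTROL;
            if (GetThreadContext(worker, &ctx))
            {
                ++hits[ctx.Rip]; // where execution was (x64; ctx.Eip on x86)
            }
            ResumeThread(worker);
        }

        gRunning = false;
        WaitForSingleObject(worker, INFINITE);
        CloseHandle(worker);

        // Dump raw addresses; a real profiler would map these to symbols.
        for (auto &h : hits)
            printf("0x%llx: %d samples\n", (unsigned long long)h.first, h.second);
        return 0;
    }

Addresses inside the hot loop should accumulate the overwhelming majority of the samples, which is exactly the "where does it happen to be when I pause it" observation from the original post.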

For more information:

http://en.wikipedia.org/wiki/Prof...gramming%29#Statistical_profilers

- Casey
Just getting back home now--but that's really cool, thanks for the read! One of the reasons I thought it would be too inefficient to use is that it seemed like the number of samples you'd have to take would increase exponentially as your program grew larger and larger, and the accuracy would be limited by the disparity between the longest calls and the shortest calls. Now for the embarrassing full disclosure: I kind of envisioned pausing the program in Visual Studio's debugger and writing down the function it was in--however, I think it's safe to assume that Intel and their competitors have found a better way of going about it than that.
Actually you are helped by the nature of the profiling process, which is to say that when you are profiling, you mostly care about finding the parts of the program that are taking the most time. So in general, it doesn't matter that much how large the program is - one second of CPU time is always one second of CPU time, so the number of samples you have to take to identify the hotspots is always going to be the same in some sense, you know?

Stated another way, the accuracy of n samples in a sampling profiler is related to the number of instructions the CPU executes per second, not to the complexity of the program. It doesn't matter how "large" the program is--it can still only execute that many instructions in a second, so your coverage of the code that is taking the time stays the same.
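
To put a rough number on that (standard sampling statistics, not anything specific to profilers): if a hotspot accounts for a fraction p of total execution time, each random sample lands in it with probability p, so the estimate you get from n samples has a standard error of about

    sqrt(p * (1 - p) / n)

which depends only on n and p, not on how many functions the program contains. With n = 10,000 samples, for example, a hotspot taking 20% of the time is pinned down to within roughly +/-0.4 percentage points.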

- Casey
Huh, that makes sense--I didn't think about it from the perspective of the processor, just the program. I guess it's like taking fixed-scale pictures of an object: it doesn't matter whether you need 1,000 of them to cover an entire mountain or a single one to capture a ladybug--the information conveyed by each pixel remains the same.

Also known as the poor man's profiler.
Speaking of the poor man's profiler, here's a neat trick for the Visual Studio debugger: if you add an entry called "@clk" to the Watch window, it will display a running count of the elapsed execution time (in microseconds), which will automatically update as you step through your code. Adding another entry called "@clk=0" will display a new line whose value is always zero, but clicking the spinning-arrow "refresh" button next to that value field will reset @clk to zero.

Giant caveat: if you're profiling code in a debug build, you should not be too concerned that things are slow; it's a debug build, after all! But the trick can still be useful for quickly determining the relative performance cost of several lines of code, without requiring any code changes or recompilation.
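
(For comparison, the code-change equivalent on Windows is a pair of QueryPerformanceCounter calls around the lines in question. Here's a minimal sketch; the WorkToMeasure function is just a made-up stand-in for whatever you'd otherwise bracket with @clk:)

    // Manual timing with QueryPerformanceCounter, the in-code
    // equivalent of the @clk watch-window trick.
    #include <windows.h>
    #include <cstdio>

    static void WorkToMeasure()
    {
        volatile double x = 0.0;
        for (int i = 0; i < 1000000; ++i) x += i * 0.5;
    }

    int main()
    {
        LARGE_INTEGER freq, start, end;
        QueryPerformanceFrequency(&freq); // counter ticks per second

        QueryPerformanceCounter(&start);
        WorkToMeasure();                  // the code being timed
        QueryPerformanceCounter(&end);

        double us = (double)(end.QuadPart - start.QuadPart)
                    * 1000000.0 / (double)freq.QuadPart;
        printf("elapsed: %.1f us\n", us);
        return 0;
    }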
Oh. My. God. That is so freaking cool! Thanks for sharing this trick; I'm sure I'm not the only one who will be keeping that one in the back pocket for the future. Caveat noted, though. :)

EDIT: I'm just thinking about the implications of this--do you know if anything in Visual Studio takes advantage of this capability, for instance to write out timing data to the .map file? I mean, if they have the variable built in, I'd have a hard time believing they haven't done anything more with it - I feel like Microsoft's problem is usually doing too much, not too little.
