uucidl
in this case the general tool here is type checking. When is it useful? What kinds of mistakes is it able to prevent? Are there other alternatives to deal with the same issue?
A long time ago, I wrote a program that was, basically, diff. Don't ask why; the point is, I did.
If you're familiar with diff (or, indeed, with many programs that edit text files), one interesting thing about it is that there are two interesting kinds of "index": line numbers, and positions between lines. The reason is that line edits actually happen in the space between lines, not at lines. If you insert a line in a text file, the insertion takes place between lines.
The typical numbering scheme (used by RCS) is that the first line in the file is numbered 1, the second line is numbered 2, and so on. But for positions, 0 is the position before line 1, 1 is the position between line 1 and line 2, and so on.
As you can imagine, it's extremely easy to get these two mixed up, and the scheme also introduces a bunch of off-by-one adjustments which are easy to miss or misinterpret.
Now this program wasn't in C++, but I basically did Casey's trick: wrapping the value in the equivalent of a struct. Something like this:
| // I didn't write code laid out like this.
struct line { explicit line(int value) : v(value) { } int v; };
struct pos { explicit pos(int value) : v(value) { } int v; };
line before(pos p) { return line(p.v); }
line after(pos p) { return line(p.v + 1); }
pos before(line l) { return pos(l.v - 1); }
pos after(line l) { return pos(l.v); }
|
That's not very much code, but it saved me hours of debugging.
Back when I worked in visual effects, I found that the same thing was true of points, vectors, and normals. Keeping the three concepts distinct at the type level meant that the compiler caught a lot of usage bugs.
All too often, the type system of the programming language is designed for the benefit of the compiler; you declare something as an integer so the compiler knows what register to store it in. I think this has it backwards. The type system should be designed primarily for the benefit of the programmer. The closest I've seen is Hindley-Milner type systems, which really do seem to be designed with the programmer in mind. Unfortunately, H-M languages tend not to let you "feel the bits" that you're working with like a lower-level language does. I like to think that there's a sweet spot still to be discovered.
As a final comment, I'd like to rant for a moment about Hungarian notation.
Fixing this kind of type error was the original thinking behind Hungarian notation. Charles Simonyi used to work on Excel at Microsoft, and he noticed that one common class of type error was programmers doing things like mixing an integer which was logically a "row" in a spreadsheet with one that was logically a "column". His idea was to prepend the variable name with the semantic type. But the way that Microsoft (and Windows programmers in general) seem to use it is to prepend the variable name with the physical type.
The compiler already knows that "LPCTSTR lpszPathName" is a pointer to a C string, and if you try to misuse it as one (e.g. by passing it to something that expects a pointer to some other type), the compiler will give you a warning or an error. What the compiler doesn't know is that the value should be handled as a file path (e.g. that it has a maximum length, that it has a structure with an optional drive letter and path components separated by backslashes on Windows, etc.) and shouldn't be passed to a function that wants a user name.
OK, so that's an artificial example; any sober programmer is unlikely to pass a variable called "FilePath" to a function called "CheckUserName()". But similarly, "FilePath" is unlikely to be anything other than a string, so the "lpsz" prefix requires extra typing and uses valuable screen real-estate for no gain.
Maybe this made some sort of sense in the 16-bit era where there was good reason to visually distinguish near and far pointers. It's the 21st century now.
(As an aside, I also note that IDEs or text editors which support auto-completion make the situation worse, since they almost always complete the tail of an identifier from its start, never the other way around. Even in the land of auto-complete, you need to type the type of what you want before you can type the name of what you want. How crazy is that?)
So if you're determined to use Hungarian notation (which should still be used with a very light touch, if at all), doesn't it make more sense to prefix with the semantic type rather than the physical type? So if you decide, say, that "fp" means "file path", you could use variable names like "fpSave" and "fpBackup" rather than "lpszSavePath" and "lpszBackupPath".
End rant.