Allen, you might want to look at
http://andrewrussell.net/2016/06/...in-river-city-ransom-underground/
for a more complete explanation!
But the short answer is that we don't actually know what's supposed to happen in 3D, so a Z-buffer doesn't really help us.
We are flattening real 3D objects into flat 2D planes, but those planes are only approximations, and the viewer clearly doesn't perceive each object as a flat plane.
So when it comes to figuring out what should be in front of or behind something, we cannot appeal to any "real" 3D structure of the scene, since we don't actually have one!
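To make that concrete, here is a minimal sketch of the problem. It is not the engine's actual code. The names Sprite, PickKey and BackToFront are made up for illustration, and "larger key means nearer the viewer" is just an assumption of the sketch. Since every pixel of a flat sprite shares one depth value, a Z-buffer could only reproduce a sort on that one value, and the value itself has to be picked somewhat arbitrarily from the range of depths the real object would have covered.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Hypothetical sprite record: a flat image standing in for a 3D object.
struct Sprite {
    std::string name;
    float sortKey; // single hand-picked "depth"; larger = nearer the viewer (assumption)
};

// Collapse the depth range an object would really occupy, [farEdge, nearEdge],
// into one key. The blend factor t is an arbitrary choice: nothing in the
// 2D art says which value is "correct".
float PickKey(float farEdge, float nearEdge, float t) {
    assert(farEdge <= nearEdge && t >= 0.0f && t <= 1.0f);
    return farEdge + t * (nearEdge - farEdge);
}

// Painter's-style ordering: draw the smallest key first, the largest last.
// With one depth per sprite, a Z-buffer would resolve overlaps the same way.
std::vector<std::string> BackToFront(std::vector<Sprite> sprites) {
    std::stable_sort(sprites.begin(), sprites.end(),
                     [](const Sprite &a, const Sprite &b) {
                         return a.sortKey < b.sortKey;
                     });
    std::vector<std::string> order;
    for (const Sprite &s : sprites) {
        order.push_back(s.name);
    }
    return order;
}
```

For example, take a bench whose real footprint would span depths 2 to 6, and a character standing at depth 4. If the bench's key comes from its far edge (t = 0), the character is drawn on top of it; if it comes from the near edge (t = 1), the bench is drawn on top of the character. Both answers come from the same scene, and neither is backed by real geometry.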
- Casey