I have a coding example that I think generalizes to a case I see a lot in games - a couple posts ago when I talked about the knee-jerk reaction to add state or cache calculations, this is the kind of thing I'm getting at. Even before I was interested in functional programming I knew that duplicating data was a ripe opportunity for bugs, but it was playing around with Haskell that made me realize just how far you can go to avoid pushing state.
I think one of the reasons we tend to push state and cache calculations is that it's often a little less work to write the code that way in the first place. Adding a new flag and setting it is easier than adding a new function. A more complicated example - I came across some code recently where the coder set up an STL map so he could quickly look up a few elements from a larger hierarchy of stuff. Probably less work than writing the recursive search; but now we have to maintain both the map and the original data. And then there's performance. Nobody wants somebody else to look at their code and say, "Hey, why'd you do it this slow way?" - particularly in game development, where coders who write fast code seem to be admired, even if that code doesn't work. "He shaved 5 ms off the renderer! Never mind that there's massive pop-in now!"
Back on Spidey 2, the designers were in charge of writing the scripts to handle fades, and because they're not programmers by trade, it got a bit funky, though I wouldn't necessarily have done better at the time, because it was one of those problems that seems simple but turns out to have some issues. Bugs kept cropping up relating to the fade curtain - it would fade to black and stay faded, or it would fade in too soon, or the like. And we'd go in and find the script that was doing things incorrectly and fix it, and then another bug would crop up a few days later.
The curtain was an example of how state can screw one up. You might initially set this up ("always use the simplest possible design") as two functions BeginFadingOut and BeginFadingIn which set a boolean isFadingOut. But then you get two chunks of code (not necessarily even on different threads) that both think they need to use the curtains and you get a mess - one'll start it fading in before the other one's ready.
Just one example - I don't exactly remember if this was one of the bugs or not, but it could have been - Spidey might die, and respawn, but respawn in an exterior when he died in an interior, which means streaming in some new terrain data. We might have had two scripts, one for respawning, and one for transitioning from an interior to an exterior. Both might have called the "close curtain" function, then waited for it to close, done their work, and then opened it up. But lo - the "respawn Spidey" script is opening the curtain before the "transition" script has finished streaming in the data, and we get an ugly, visible load.
One thing we tried, then, was incrementing and decrementing a counter. Better, and when a script failed it was because the script was clearly not using the system correctly, so it became easier to blame the scripts.
What I should have realized when we were doing these spot fixes of offending scripts was that the curtain mechanism itself was to blame. We were effectively pushing state...duplicating data. Whether the curtain was fading in or out depended on the state of the game. The curtain should have been pulling state. It probably should have looked something like this:
// The curtain pulls its state from the game instead of being told what to do.
bool ShouldIBeDown()
{
    return AreWeSettingUpACutscene() || IsSpideyRespawning() || AreWeTransitioning();
}
And then, what translucency to draw the curtain (or how high to draw it, if it's a more literal curtain) could have been determined simply by measuring how long ShouldIBeDown had been true. The functions that respawn Spidey or transition to different areas would wait on IsCurtainDown() before doing their work.
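Here's a rough sketch of what that might look like in C++. The GameState struct, the fadeSeconds parameter, and the Update/Opacity names are assumptions made up for illustration, not the Spidey 2 code; the three flags stand in for AreWeSettingUpACutscene(), IsSpideyRespawning() and AreWeTransitioning(). Instead of timestamping the moment ShouldIBeDown became true, this version accumulates time while it's true and clamps the result, so the same few lines also handle fading back in.

```cpp
#include <algorithm>
#include <cassert>

// Hypothetical stand-in for the game systems the curtain queries.
struct GameState {
    bool settingUpCutscene = false;
    bool spideyRespawning = false;
    bool transitioning = false;
};

class Curtain {
public:
    explicit Curtain(float fadeSeconds) : fadeSeconds_(fadeSeconds) {
        assert(fadeSeconds > 0.0f);
    }

    // Pull: the curtain asks the game whether it should be down.
    static bool ShouldIBeDown(const GameState& g) {
        return g.settingUpCutscene || g.spideyRespawning || g.transitioning;
    }

    // Called once per frame. Opacity moves toward 1 while ShouldIBeDown is
    // true and back toward 0 otherwise, clamped to [0, 1].
    void Update(const GameState& g, float dt) {
        const float step = dt / fadeSeconds_;
        const float next = opacity_ + (ShouldIBeDown(g) ? step : -step);
        opacity_ = std::max(0.0f, std::min(1.0f, next));
    }

    float Opacity() const { return opacity_; }

    // Scripts wait on this before doing work that must not be seen.
    bool IsCurtainDown() const { return opacity_ >= 1.0f; }

private:
    float fadeSeconds_;
    float opacity_ = 0.0f;
};
```

With this shape, the respawn-plus-transition case takes care of itself: if the respawn finishes while the transition is still streaming, ShouldIBeDown stays true and the curtain stays down. Nobody has to remember to open it.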
Now, you say, "That breaks encapsulation!" "That creates circular dependencies!" because the curtain is dependent on transitions and Spidey's respawn system and the cutscene system and they're all dependent on the curtain. Before, the curtain was a low level service they could all just layer on top of. I'd say - hey, I'd rather be conflated than buggy, and for this case it kind of makes sense for all these things to be conflated - but what about a more general situation, where some more encapsulation or layering would make sense?
One possibility that's C++ friendly is the curtain's ShouldIBeDown() function could be virtual. Another possibility, not so C++ friendly, would be to use some kind of lambdas - function pointers or functors or delegates or something like that. You could add a list of curtain-closing conditions this way.
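A minimal sketch of the second option, using std::function as the "delegate". The names LayeredCurtain, AddCondition and RespawnSystem are hypothetical, picked just for this example. The point is that the curtain stays a low-level service: it only knows it has a list of yes/no questions to ask, and the higher-level systems hand those questions in.

```cpp
#include <cassert>
#include <functional>
#include <utility>
#include <vector>

class LayeredCurtain {
public:
    using Condition = std::function<bool()>;

    // Higher-level systems register a predicate; the curtain never names them.
    void AddCondition(Condition c) {
        assert(c);  // calling an empty std::function would throw
        conditions_.push_back(std::move(c));
    }

    // Still pulling state: every call re-asks each registered system.
    bool ShouldIBeDown() const {
        for (const Condition& c : conditions_) {
            if (c()) return true;
        }
        return false;
    }

private:
    std::vector<Condition> conditions_;
};

// Hypothetical respawn system that depends on nothing in the curtain.
struct RespawnSystem {
    bool respawning = false;
    bool IsSpideyRespawning() const { return respawning; }
};
```

Setup code would then do something like curtain.AddCondition([&respawn] { return respawn.IsSpideyRespawning(); }). Since the lambda captures by reference, the respawn system has to outlive the curtain.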
Speaking of delegates, with Schizoid I was using delegates like candy, and one particularly bone-headed thing I did that I later stopped doing once I realized how bug-prone it was: changing delegates mid-stream. I had some enemies that would change their minds, and tried to implement that, at first, by changing their whatDoIWantToDo delegate. This is pushing state - the curtain problem all over again. Instead of having one function tell the entity to do one thing by pushing one delegate and having another function tell the entity to do the other thing by pushing another delegate, I needed to put that all inside a single function if( condition() ) doOneThing() else doTheOther(). If I had a rule of thumb for delegates or function pointers now, it would be - you can pass them in constructors but never change them once their object is live. So that's what I did, and I lived happily ever after.
I guess we all have experienced something similar to your curtain example at one point or another. I often encountered similar problems with GUI stuff, like mouse cursor hiding or gamepad input routing.
Reference counting might help a bit but it often adds more problems than it solves, because it can go wrong quite easily. I've already seen code like "while(!visible()) visible_ref++;" ... no, really.
The solution I tend to choose nowadays is to have only one piece of code that decides the final result, but instead of pulling state itself from everywhere in the game, it has a stack of triggers, each trigger being owned by the relevant part of the game (cutscene, respawn, transition). Each trigger has three states: "on", "off" and "don't care". At every update, I search the stack in sequence, looking for the first "on" or "off" value. Each part of the game can only change its own trigger, and in the end it's only a matter of deciding the right order.
I don't know if my explanation is clear, but in fact it's a lot less code than it sounds, and it is very convenient to debug.
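For illustration, here is one way the trigger stack could be sketched in C++. This is a reading of the description above, not the actual code: the TriggerStack name, the fixed slot indices, and the fallback value used when every trigger says "don't care" are all assumptions.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Trigger { DontCare, On, Off };

class TriggerStack {
public:
    // Each owner gets a fixed slot; slot 0 is searched first.
    // fallback is the result when every trigger is DontCare (an assumption).
    TriggerStack(std::size_t slots, bool fallback)
        : triggers_(slots, Trigger::DontCare), fallback_(fallback) {}

    // Each part of the game only ever writes to its own slot.
    void Set(std::size_t slot, Trigger value) {
        assert(slot < triggers_.size());
        triggers_[slot] = value;
    }

    // The first On or Off found in search order decides the result.
    bool Resolve() const {
        for (Trigger t : triggers_) {
            if (t == Trigger::On) return true;
            if (t == Trigger::Off) return false;
        }
        return fallback_;
    }

private:
    std::vector<Trigger> triggers_;
    bool fallback_;
};
```

A curtain built on this might give the cutscene slot 0, respawn slot 1 and transition slot 2, and draw itself down whenever Resolve() returns true.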
Posted by: Drealmer | August 07, 2009 at 08:00 AM
Duplication of data is actually desirable, now that we're moving to many-core architectures with disjoint local memories. Processes have no way to communicate, except by sending copies of data through fifos.
Alan Kay: "Actually I made up the term "object-oriented", and I can tell you I did not have C++ in mind."
I would make a "curtain" process that accepts multiple messages, like
1. lower curtain
2. raise curtain
3. lower curtain, and don't raise again until I say so
Your "transition" script would call message #2 and #3, because it knows it will visually glitch otherwise. The "respawn" script would call #1 and #2.
Implementation is not for the caller to know. Probably the "until I say so" part would involve something like a reference count?
The design focus should be on sussing out all the messages required to meet the expectations of callers.
Posted by: Bryan McNett | August 12, 2009 at 05:56 PM