I was recently at the Conference Formerly Known As XFest, and one of the sessions I saw had a panel of various heavyweight technical directors weighing in on various things. Some of the things they all seemed to more-or-less agree on: memory allocation bad, STL bad.
Now, I'm kind of out of the loop with our new multithreaded universe, since I haven't had access to any dev kits for the past several months (but I did just order a multi-core, multi-processor machine recently - it should arrive any day now - so I should be able to catch up soon), but these guys seemed to be opposed to memory allocation and STL on consoles in general - a bias that they've carried with them since PS1 days, it seemed like.
I'd just like to weigh in with my own two cents. I thought I detected a certain amount of coder machismo in their rants, some Not-Invented-Here syndrome.
Okay, so it was a Microsoft show, and the Microsoft allocator sucks -- in fact, there was a whole talk basically on working around the Microsoft allocator -- so I can understand why everyone was going "look out for fragmentation, etcetera!"
I think it was Tony Cox who was the one voice of dissent -- he pointed out that any time you're creating a game entity and deciding where to put it in memory, you're doing allocation, whether it's done by a heap manager or your own custom fixed-pool thing.
So here's my take. And it's "agile" - use new, delete, malloc, free. Once you find out they're fragmenting, or not thread-safe, deal with it. It's quite easy to fix allocation later once you find out it's not working out for you. My favorite allocator is Doug Lea's malloc - first introduced to me by Wade Brainerd when we found out that the built-in allocator on the Dreamcast wasn't doing it for us. Turned out dlmalloc didn't work that well on the DC either, but it was easily configurable to do what we wanted. (And I did run across a bug in dlmalloc a few years later, but Doug Lea was very responsive and fixed it in the latest build. How crusty am I? "I found a bug in Doug Lea's malloc!" Almost as good as finding a bug in Donald Knuth's code.) dlmalloc is the allocator that the PS2 SDK used by default - how happy it made me.
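To make the "fix it later" part concrete, here's roughly what dropping dlmalloc in behind new and delete looks like - just a sketch, not what we shipped, and it assumes malloc.c is built with USE_DL_PREFIX so its entry points are named dlmalloc/dlfree instead of replacing the system malloc outright:

#include <new>
#include <cstddef>

// dlmalloc's entry points when built with USE_DL_PREFIX.
extern "C" void* dlmalloc(std::size_t bytes);
extern "C" void  dlfree(void* mem);

void* operator new(std::size_t size) throw(std::bad_alloc)
{
    void* p = dlmalloc(size);
    if (!p) throw std::bad_alloc();
    return p;
}

void operator delete(void* p) throw()
{
    if (p) dlfree(p);
}

void* operator new[](std::size_t size) throw(std::bad_alloc)
{
    void* p = dlmalloc(size);
    if (!p) throw std::bad_alloc();
    return p;
}

void operator delete[](void* p) throw()
{
    if (p) dlfree(p);
}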
Seems like everyone stresses about fragmentation. Fragmentation is actually a *solved problem* - http://portal.acm.org/citation.cfm?id=286864 - but the default Microsoft allocator doesn't use the low-fragmentation algorithm. Doug Lea's does. So as long as you're not doing weird allocations - like allocating and reallocating a block that's an eighth of your entire heap - you're good.
I don't actually know if dlmalloc is thread safe. It probably isn't.
And, long story short, there have been many console games that shipped on time, sold millions of units, got high scores on game rankings, and used new and delete and malloc and free.
On the other hand, I do kind of like fixed pools of entities, only because then I can use indices into the array instead of pointers, and fwrite and fread the entire block to the hard-drive as a savegame without rebasing anything. That's how we used to do savegames back in the day. Doing everything with pointers was considered a premature optimization.
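A sketch of the idea (the Entity fields and sizes are made up; real savegame code would also want a version number and some error checking):

#include <cstdio>

struct Entity
{
    int   targetIndex;   // index into g_entities instead of an Entity*
    float pos[3];
    bool  alive;
};

static const int kMaxEntities = 1024;
static Entity g_entities[kMaxEntities];

// Because entities refer to each other by index, the whole pool can be
// written and read back verbatim - no pointer fixup or rebasing required.
bool SaveGame(const char* path)
{
    std::FILE* f = std::fopen(path, "wb");
    if (!f) return false;
    bool ok = std::fwrite(g_entities, sizeof(g_entities), 1, f) == 1;
    std::fclose(f);
    return ok;
}

bool LoadGame(const char* path)
{
    std::FILE* f = std::fopen(path, "rb");
    if (!f) return false;
    bool ok = std::fread(g_entities, sizeof(g_entities), 1, f) == 1;
    std::fclose(f);
    return ok;
}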
Eh, I'll leave my pro-STL thoughts for another day.
I'd be curious to hear your pro-STL thoughts. I hate it because, like anything that opacifies the underlying data structures, it's murder on the PS2 cache and lends itself to spurious copying. But I do respect that STL lets you preallocate, and so between that and a fixed-block allocator, anyone blaming fragmentation on STL just isn't planning ahead and giving the STL-centric developers around them the tools they need to focus on coding.
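Something like this sketch of the preallocation point (the type and the count are invented): reserve once at load time and the container never touches the heap mid-game.

#include <vector>

struct Particle { float pos[3]; float vel[3]; };

static std::vector<Particle> g_particles;

void InitParticles()
{
    g_particles.reserve(4096);   // one allocation, up front
}

void SpawnParticle(const Particle& p)
{
    // As long as size() stays under the reserved capacity, push_back never
    // reallocates, so there's nothing here for the heap to fragment.
    if (g_particles.size() < g_particles.capacity())
        g_particles.push_back(p);
}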
Posted by: Most | August 19, 2006 at 06:37 PM
I've shipped PS2 titles that extensively use STL. Like any set of data structures, you need to use them wisely.
Currently we're using a licensed engine that comes with a set of built-in containers. In practice, they are bug-prone, have *worse* allocator behavior than STL, bad cache coherency, and tend to lead to people writing bad algorithms.
Yet somehow, the other developers in the company accept these shoddy containers just fine, but you mention "STL" around them and they go into histrionics. I really think a lot of it is just religion at this point.
Posted by: vince | August 19, 2006 at 11:07 PM
That should have read "some other developers." I didn't want to imply that everyone has those views.
In addition, there are varying qualities of STL implementations out there. Some of them, particularly in the console space, seem much worse than others.
Posted by: vince | August 19, 2006 at 11:10 PM
I think "needs to be used wisely" is true of pretty much _any_ non-trivial technique. I'm of the opinion that anything that can't be justified as absolutely necessary is better off re-written as simpler (and hopefully cleaner). I've seen too many virtual-for-the-sake-of-it or massive-inheritance-trees.
I was stuck for a long while at a company that refused to use STL or dynamic allocation, and I can see why. We used Lua heavily, in amongst large, chunky dynamic allocations; in a fixed-memory environment that you're pushing to its limits, that's not acceptable. Now that I'm clear of there, I'm shifting to an approach where dynamic allocation is fine, but every allocation has to be grouped into a heap with other items of similar usage.
So level data goes to one place, transitory allocations go to another, and so on. The key thing is, there's no such thing as 'allocate into the main heap' - every allocation has to be put somewhere, and that forces developers to think about where things belong. And it means that if fragmentation does come up in one heap, we have a much reduced set of allocations to look at and identify the fragmentation.
Almost always, it's going to be someone putting a small, long-term allocation in between two big, short-term allocations, which fragments the heap in a major way. In that case, you can just log the allocations and figure out which one to shift into another (or its own) heap.
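For flavour, a sketch of what that interface can look like (the heap names and sizes are invented, and I'm using dlmalloc's mspaces purely as an example backing store - it assumes malloc.c is built with MSPACES defined):

#include <cstddef>

// Normally you'd include dlmalloc's header; the declarations are repeated
// here just to keep the sketch self-contained.
extern "C" {
typedef void* mspace;
mspace create_mspace_with_base(void* base, std::size_t capacity, int locked);
void*  mspace_malloc(mspace msp, std::size_t bytes);
void   mspace_free(mspace msp, void* mem);
}

enum HeapId { HEAP_LEVEL, HEAP_TRANSIENT, HEAP_ENTITY, HEAP_COUNT };

static char   g_levelMem[8 * 1024 * 1024];
static char   g_transientMem[2 * 1024 * 1024];
static char   g_entityMem[1 * 1024 * 1024];
static mspace g_heaps[HEAP_COUNT];

void InitHeaps()
{
    g_heaps[HEAP_LEVEL]     = create_mspace_with_base(g_levelMem,     sizeof(g_levelMem),     0);
    g_heaps[HEAP_TRANSIENT] = create_mspace_with_base(g_transientMem, sizeof(g_transientMem), 0);
    g_heaps[HEAP_ENTITY]    = create_mspace_with_base(g_entityMem,    sizeof(g_entityMem),    0);
}

// There is deliberately no Allocate(bytes) overload: every call site has to
// say which heap it belongs in, so fragmentation stays local to one heap.
void* Allocate(HeapId heap, std::size_t bytes) { return mspace_malloc(g_heaps[heap], bytes); }
void  Free(HeapId heap, void* p)               { mspace_free(g_heaps[heap], p); }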
Posted by: MrCranky | August 20, 2006 at 02:42 AM
heh, certainly not trying to start a flame war ... but what's your opinion on the managed code that Microsoft is introducing on the Xbox 360? I know they're starting relatively small with indie developers and XBLA in mind, but I can only imagine that they'll open it up for full-on commercial games at some point.
Do you think that the managed development will eventually outstrip this kind of development in whole or in part at some point?
I for one rather like the fact that I can for the most part avoid headaches such as this. Of course, that's not to say that you don't have to worry about memory at all, you still have to code with the garbage collector in mind ... but all in all, still a much easier process than having to worry about fragmentation and stuff ;-)
Posted by: anoneemouse | August 20, 2006 at 07:11 AM
and just to anticipate some of the replies, I know that traditional C++ isn't going anywhere. Considering that most platforms aside from Windows and the 360 won't support managed code, and that C++ truly is the only way to get the fastest performance out of your code, it's clear that it will be around for a long while.
The main thing I was getting at with my question is, do you see some potential uptake where a studio might consider using it to have a smaller team, faster development cycle, fewer bugs due to memory issues, etc. etc. The way I look at it is that no, you likely won't be able to write the latest and greatest HDR-enabled super graphics game. However, if you're forced to scale back on that front, maybe you can spend more time on the fun aspects of your game.
wow, I hope that doesn't sound like marketing fud ... I'm just a little indie guy who reads your blog and wanted to get your thoughts :-)
Posted by: anoneemouse | August 20, 2006 at 07:35 AM
Vince: The reason we're not complaining about flakey data structures with the new engine is that we now have bigger things to worry about. :)
Posted by: Most | August 21, 2006 at 08:05 AM
I worked on teams that used extensive dynamic allocation and didn't worry about memory. The team I'm on right now preplanned memory sections, used static allocation blocks (i.e. fixed block heaps) and is very aggressive about fitting within target memory and providing an overflow safety net just in case. I've seen the actual benefit of early, aggressive, and continuous monitoring of resource usage. By getting the memory issues out of the way early, we're now spending a lot more time on polish at the end of the game instead of dealing with memory issues.
Posted by: Hacky | August 21, 2006 at 08:38 AM
I used Borland's alternative to STL in DOS in 1994. I pretty much stuck with templates and the whole C++ thing until 1999 when I started working on under-powered embedded processors in set-top boxes (game consoles without the graphics). The memory allocation technique we used there was specifically designed for a low amortized time in the allocation function. But we built up and debugged our own template libraries for the situation.
The best stuff to read on allocation on multiprocessors is anything you can find on implementations of JVMs. Or read about lock-free queue implementations.
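To give a feel for what those papers cover, here's a rough sketch of a single-producer/single-consumer lock-free ring buffer (written with std::atomic; the class name and memory orderings are mine, and a production version would also pad the indices to avoid false sharing):

#include <atomic>
#include <cstddef>

template <typename T, std::size_t Capacity>
class SpscQueue
{
public:
    SpscQueue() : m_head(0), m_tail(0) {}

    // Called only from the producer thread.
    bool Push(const T& value)
    {
        std::size_t tail = m_tail.load(std::memory_order_relaxed);
        std::size_t next = (tail + 1) % Capacity;
        if (next == m_head.load(std::memory_order_acquire))
            return false;                       // queue full
        m_items[tail] = value;
        m_tail.store(next, std::memory_order_release);
        return true;
    }

    // Called only from the consumer thread.
    bool Pop(T& out)
    {
        std::size_t head = m_head.load(std::memory_order_relaxed);
        if (head == m_tail.load(std::memory_order_acquire))
            return false;                       // queue empty
        out = m_items[head];
        m_head.store((head + 1) % Capacity, std::memory_order_release);
        return true;
    }

private:
    T m_items[Capacity];
    std::atomic<std::size_t> m_head;  // consumer index
    std::atomic<std::size_t> m_tail;  // producer index
};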
Posted by: Mike Enright | August 21, 2006 at 09:51 PM
I had problems with the malloc on gcc before while working on Shrek 2. The process would hang for like 3 hours on free(). Inside the free routine was something like if count == 63356 then garbage_collect(). So we switched to ptmalloc and the same process ran in seconds.
It's almost like the whole Java vs. C memory management issue. Java has generational garbage collectors, and in C we still have to manage it ourselves.
Posted by: Hector Yee | August 22, 2006 at 01:51 PM
Sounds like you were smashing your heap data structures. That doesn't sound right even with the most pathological behavior.
Posted by: jvalenzu | August 22, 2006 at 02:33 PM
(No one cares, but) I agree, memory allocation and STL are best avoided. STL is a mixed bag of things, some useful, some not so much, in different situations. The template syntax for hooking a custom allocator into STL is for sure mind-boggling. Memory-wise, I'd say (fixed memory-mapped -> fixed -> stack -> fixed pool -> something more complex) would be my order of consideration. I'd really think twice before using anything more complex than the stack, since it's really nice for multithreaded apps - you don't have to worry about locking any globals, and the hardware conveniently saves the stack pointer for you. Btw Jamie, fixed pool is shorthand for "fixed-size pool"; those only work for fixed-sized objects. That doesn't work very well for entities unless you use a composition-based approach (vs. a traditional hierarchy). I wonder how soon mainstream programmers will begin to realize that inheritance is bad? Oh wait, that sounds like heresy :)
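In case "fixed-size pool" is too much shorthand, a rough sketch of what I mean - one block size, a free list threaded through the unused slots (names are made up, and real code would care about alignment for over-aligned types):

#include <cassert>
#include <cstddef>
#include <new>

template <typename T, std::size_t Capacity>
class FixedPool
{
public:
    FixedPool() : m_freeHead(0)
    {
        for (std::size_t i = 0; i < Capacity; ++i)
            m_slots[i].next = i + 1;            // chain every slot onto the free list
    }

    T* Alloc()
    {
        if (m_freeHead == Capacity) return 0;   // pool exhausted
        Slot& s = m_slots[m_freeHead];
        m_freeHead = s.next;
        return new (s.storage) T();             // placement-construct in the slot
    }

    void Free(T* p)
    {
        p->~T();
        Slot* s = reinterpret_cast<Slot*>(p);
        std::size_t index = static_cast<std::size_t>(s - m_slots);
        assert(index < Capacity);
        s->next = m_freeHead;                   // push the slot back on the free list
        m_freeHead = index;
    }

private:
    union Slot { std::size_t next; char storage[sizeof(T)]; };
    Slot        m_slots[Capacity];
    std::size_t m_freeHead;
};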
Posted by: alienspy77 | August 22, 2006 at 09:23 PM
Re: pathological. It was a hash data structure with variable sized data members. GCC's old allocator loved it when you allocated and released data in a consistent manner. But if you released data in a random manner (say in order of hash key which distributes the data pseudo randomly in your list or vector) it would perform poorly.
You can reproduce it by allocating blocks of random size in a for loop and release it in a for loop and time it. Then repeat the same experiment with the release order randomized.
Microsoft's malloc seems to handle it better than gcc 2.3.1.
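Roughly this kind of thing (block count and size range invented; run it once with the shuffle commented out to get the well-behaved baseline):

#include <algorithm>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <ctime>
#include <vector>

int main()
{
    const std::size_t kCount = 200000;
    std::vector<void*> blocks(kCount);
    for (std::size_t i = 0; i < kCount; ++i)
        blocks[i] = std::malloc(16 + std::rand() % 1024);   // random block sizes

    // Shuffle to mimic freeing in hash-key order instead of allocation order.
    std::random_shuffle(blocks.begin(), blocks.end());

    std::clock_t start = std::clock();
    for (std::size_t i = 0; i < kCount; ++i)
        std::free(blocks[i]);
    std::printf("free pass took %.2f s\n",
                double(std::clock() - start) / CLOCKS_PER_SEC);
    return 0;
}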
Posted by: Hector Yee | August 23, 2006 at 11:37 AM
"STL bad" is a gross oversimplification. Of course, the STL's heap-walking container classes are no good for videogames, but they are just one tiny part of the STL. Those who ignore the rest of the STL are doomed to reinvent it, poorly.
Here is just one tiny example: STL's nth_element will sort a range just enough such that only the nth element is guaranteed to be in the right place. On average this works in O(N), or linear time.
It is possible to find the 38th largest integer in an array of integers, in linear time, just by typing this:
int foo[100];
...
// With std::greater as the comparator, foo[37] ends up holding the 38th largest value.
std::nth_element( foo, foo+37, foo+100, std::greater<int>() );
nth_element is just one in a great variety of optimally efficient, useful code in the STL. Forbidding STL in a game project because its containers are slow is an unfortunate mistake.
Posted by: Bryan McNett | August 27, 2006 at 02:48 PM
Enjoyable discussion! Long ago I worked at Rogue Wave on their high-performance product line. There were some gaming customers who found two RW products useful:
* NewCode MTS -- an extremely high-performance malloc/free() replacement that RW exclusively licensed until RW was bought out by Quovadx and the rights were terminated. MTS is still available directly through NewCode... if you want the fastest malloc/free() implementation, without fragmentation or scalability concerns, then I encourage you to check them out: http://www.newcodeinc.com
* The RW STL -- getting faster with every release, and just donated earlier in the year to the Apache group (pretty flexible licensing terms). Check it out:
http://incubator.apache.org/stdcxx/
I don't work for RW any longer, but I have friends at both RW and Newcode -- just trying to pass on information when it may be useful.
Posted by: Brand | September 06, 2006 at 11:52 PM