« Notes on Search For The Emperor's Treasure and Battlelore | Main | Stupid C# Tricks: Dealing With Non-Generational Garbage Collection »

July 02, 2007


Sean Barrett

A reference, no.

There are two obvious reasons on the face of it.

First, parallel arrays are harder to manage; each time you add another field you have to add another array. If you want to pass 'the entity as a whole' to another function, you have to pull out all the items.

Second, parallel arrays that aren't just acting as raw storage suck. Want to swap two 'elements'? Whoops, you have to swap all the array pairs individually. Want to _sort_ the parallel arrays? Whoops, you're fucked.

Mat Noguchi

You aren't completely fucked if you want to sort parallel arrays, but it sure ain't easy, and in-place isn't much fun either.



Isn't all your "object" access abstracted anyway? Probably depends on the specific application. Parallel arrays could offer increased data locality. About the best you can say is that objects are a lot more idiomatic to modern languages, and you'll get more of the attendant benefits by using them.


@Sean Barrett:
if (GetDistance(enemy, player) < 5)
AdministerHit(player, enemy);

Using parallel arrays you would have
int X[totalThings];
int Y[totalThings];
int Health[totalThings];
int Strength[totalThings];

Easy enough to add a field. Refer to players or enemies by their array indices. Sort can be done easily enough. I'm not sure if the standard sort function could be made to work.

I don't think the data locality is increased in the typical code I showed above. A struct would be better.

With a struct you "know" what properties you have. With parallel arrays some cowboy coder is going to sneak a parallel array into another file for his nefarious purposes and that might be bad. I don't think I ever saw this stuff in a reference (thinking back to the mid 70's). Maybe a magazine article. I wouldn't be surprised if Knuth's MIX programs use parallel arrays sometimes.

Nathan McKenzie

I feel like I've seen many places mention before that, particularly if you're dealing with structs with lots of glommed together state, that parallel arrays can actually be much better for cache coherency. I'm sure this has no relevance to what your non-programmers are doing, of course.

But more than anything, I'd imagine that the fact that the IDEs can give you all sorts of autocompletion help with structs (and not so with parallel arrays) ought to be convincing enough.

BTW, Jamie, you've mentioned something before (but I can't remember what) about C# on the 360 not having... something related to new and delete. I can't remember what. But I remember you mentioning that you have to use lots of fixed size pools.

How does that relate to delegates? I've been using delegates a lot, and to be honest, for my ways of thinking, they seem like vastly better ways of extending code than tons of nearly empty single function classes with giant amounts of boilerplate code, but as far as I can tell, delegates do hit up the heap a bit. Can you use them on the 360? And what about anonymous functions and closures (which I'm also growing to love and embrace - of course, I guess that's only .NET 2.0)?

Danny Thorpe

If there is no performance concern, then it's a matter of programming style.

If performance is in question, then arrays vs structs depends on what and how you're using the data.

If the data you would put in a struct is typically going to be accessed about the same time as other stuff in the struct, then putting the data close together in physical memory will help performance with cache hits. If the same data were scattered all over physical memory in parallel arrays, then you'll get poor cache hit rates and even cache thrashing.

On the other hand, if a piece of data you're thinking about putting in the struct is accessed intensely across all the samples without needing anything else in the struct, then that piece of data is probably a better candidate for a parallel array of storage.

Parallel arrays give good data locality when you need to operate on a lot of the same kind of data at once. Structs give better data locality when you need to pick out multiple pieces of related data close together in time.

Data alignment and internal padding should also be considered carefully. Putting a 4 byte int next to an 8 byte float in a struct is likely to introduce padding to push the float over to a memory address that is a multiple of 8. If you turn padding off to save space, you'll probably incur wait cycles at runtime and double reads so that the CPU can piece together the 8 byte float that is straddling two 8 byte address slots or cache lines. (Intel tolerates misaligned reads, but some RISC CPUs just keel over with a hardware fault)

Patrick Hughes

Casual use of parallel arrays signifies to me that the coder involved is creative and self-taught yet has not actually reviewed much "real life" code, he's an amateur who wanted to have some fun and parallel arrays are an extremely logical solution to handling data if you're unfamiliar with language features ;) In old timery BASIC those arrays were the shiznit.

Being an amateur, what he lacks is the vocabulary to express his idea.

If you want to ween him from those arrays, then a practical example of how structs are used should give him the vocabulary to express code in the new way and under his own steam.

As for something to show him why structs are "better," my call is that once he has the vocabulary then he will be able to make that observation for himself.


Parallel arrays are generally hard to edit. You decide you want to edit the Ogre which happens to be entry number 127 so you find the hitpoint table, scroll down to the 127th line and edit the hitpoints. Then you go to the speed table and scroll down the 127 line to edit the speed, then you go to the shield table, scroll down to line 127 and edit the shields value. One mistake and you'll spend hours trying to figure out why things are not working anymore. It gets worse when some of those parallel arrays are arrays of function pointers.

Of course I suppose that's an implementation detail. You could purposely store things in parallel arrays for speed reasons but it seems unlikely that anyone would start with parallel arrays as their source data. I used tons of parallel arrays in the 8bit days but I wrote tools so I could enter the data as structs that at compile time would get turned into parallel arrays because it took way to long to edit them in parallel arrays and was way too error prone.

Emmanuel Deloget

From a design point of view, structs are obviously superior to parallel arrays. It's not even a problem of programming style, it's really a maintenance problem - like greggman said, editing arrays is more than error prone. A single error in an array size, or in an array cell, and you're going to waste your time debugging something that could have been a lot simpler.

Unless you are doing very weird things with the arrays (like looping through one of them constantly), it's going to be a waste of (programmer) time to use them. And even if you are looping through one of them constantly, the runtime gain that would have come from using (a struct of) arrays vs (an array of) structs is to be profiled - chances are that the hit you'll get with the AoS will be negligeable anyway.


parallel arrays in the form of "structures of arrays" show up in various places from time to time.

they can be powerful where you know the range of possible inputs to a type, but not the exact set of inputs.

when programming vertex streams you might know you have 0-2 color channels per vertex, but build a single shader ( or other piece of code ) that can handle all three cases.

as another example: the component objects ( aka. facets ) pattern is more SOA than AOS. you define a base object and a base component class. your base object contains a generic list of of components. you allow designers and programmers to add components (possibly) dynamically to that object.

the *real* answer is tho, that AOS is less typing :)

struct SOA {
int a[25], b[25], c[25];
SOA soa;

struct AOS {
int a, b, c;
AOS aos[25];

Tom Plunket

I can't be sure, but I think there is discussion about things along these lines in The Pragmatic Programmer by Andy Hunt and Dave Thomas.

The closest principle that I can relate though is of collaboration. Are these bits of information intended to be conceptually grouped with one another? If so, put them into a structure together so they can represent a "thing" (if for some reason "object" is a dirty word?). Consider too that when you reference a thing (let's say a GameObject), conceptually all of its parameters are A Thing, and if you want to pass That Thing to a function for some reason, it's nice if the parameters are all bundled together rather than having to pass all of the parameters explicitly. For the love of all that is holy I hope folks aren't starting to think that global data is ever ok! ;)

Iain McClatchie

In the early 00's, the Sun C compilers gained the ability to restructure an array of structs into a struct of arrays... and the ability to determine (sometimes) when this is higher performance.

I don't think there is any automated way to turn a struct of arrays into an array of structs when the latter is better.

So, if you want to give the compiler the ability to make the best choice (which will be a hybrid, of course), write an array of structs. And wait a decade for the compilers to learn how to really do it right.

The comments to this entry are closed.

Jamie's Bragging Rights

  • Spider-Man 2
    The best superhero games of all time Game Informer
    Top five games of all time Yahtzee Croshaw
    Top five superhero games of all time MSNBC
    Top 100 PS2 games of all time Official Playstation 2 Magazine
    1001 Games You Must Play Before You Die Nomination for Excellence in Gameplay Engineering Academy of Interactive Arts & Sciences
  • Schizoid
    Penny Arcade PAX 10 Award
    Nominated for XBLA Best Original Game
    Nominated for XBLA Best Co-Op Game