Continuing my n00b data science. With most of the A/B testing I've been doing on Dungeon Life so far, the only thing I've learned is that my ideas for making it more engaging or happy-making don't seem to have any effect, and sometimes even make the game worse. I experimented with making the UI buttons 'juicier' - more animated and responsive - and to my surprise that lowered retention. (Alpha of 10%.) I experimented with moving first-time players to the front of the queue to be heroes. I experimented with hiding extraneous UI buttons for new players. A whole lot of nothing.
But finally I've made some useful changes!
One was adding VO that I did my own damn self for the tutorial. Just a few lines of me speaking in bad-roleplayer voice. At first it seemed like there was no effect, but then I filtered for people playing the game in English and the effect was super-clear. Tutorial VO improved all of my metrics with practically twice the retention! Not bad for someone who was once told "Welp. You're not a voice actor" when doing temp audio for Spider-Man. :)
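The "filter by language" step is worth spelling out, because an effect that only exists in one segment can vanish in the overall numbers. Here is a minimal sketch in Python, assuming each player record carries a test group, a locale string, and a came-back flag. The field names (`group`, `locale`, `returned`) and the rows are made up for illustration and are not the game's real schema or data.

```python
from collections import defaultdict

def retention_by_group(players, locale_prefix=None):
    """Return {group: fraction of players who came back}.

    If locale_prefix is given, only players whose locale starts with it count.
    """
    counts = defaultdict(lambda: [0, 0])  # group -> [returned, total]
    for p in players:
        if locale_prefix and not p["locale"].startswith(locale_prefix):
            continue
        counts[p["group"]][1] += 1
        if p["returned"]:
            counts[p["group"]][0] += 1
    return {group: returned / total for group, (returned, total) in counts.items()}

# Made-up toy rows, purely to show the mechanics (not real Dungeon Life data).
players = [
    {"group": "vo", "locale": "en_us", "returned": True},
    {"group": "vo", "locale": "en_gb", "returned": True},
    {"group": "vo", "locale": "pt_br", "returned": False},
    {"group": "vo", "locale": "es_es", "returned": False},
    {"group": "control", "locale": "en_us", "returned": True},
    {"group": "control", "locale": "en_us", "returned": False},
    {"group": "control", "locale": "pt_br", "returned": True},
    {"group": "control", "locale": "ru_ru", "returned": False},
]

overall = retention_by_group(players)             # both groups look the same
english_only = retention_by_group(players, "en")  # the VO group pulls ahead
```

In the toy rows the two groups are tied overall and only separate once non-English players are filtered out, which is the shape of what happened with the VO test.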
I'd realized for a while that it takes too long to get to the final boss in a given 7-floor dungeon. My original conception of the game was that it would be endless until it got too hard and the hero would die. Later I decided that in case some players got OP I needed some way to cap the experience so somebody else could have a turn. Enter the superboss, a way to 'win' - but mostly a way to give other players a turn. I eventually concluded that 7 floors was still too long though. But how short to make it?
This required a little new tech. Up until now, I'd only been testing differences between players. Two players with different experiences, for example one with VO and one without, could play on the same server and it was fine. Now I needed different server test groups, so that everybody on a given server plays the same dungeon length. Setting that up was different enough, and the queries looked a bit different too. Frex:
-- This automatically filters out players who played before the test group came into play:
-- if the category didn't exist for their first session, it won't be in the join.
-- So it only provides the first impressions of people playing on that server and nothing else.
SELECT value, rating
FROM (
    (SELECT server_key, category, action, value, time
     FROM events
     WHERE category = 'ServerTestGroup' AND action = 'NumDungeonLevels') AS test_group
    JOIN
    -- DISTINCT ON keeps the first row per player_key after the ORDER BY,
    -- i.e. each player's earliest rating
    (SELECT DISTINCT ON (player_key) player_key, value AS rating, time, server_key
     FROM events
     WHERE category = 'PlayerRating'
     ORDER BY player_key, time) AS session_length
    ON session_length.server_key = test_group.server_key
)
ORDER BY value
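The query depends on every server logging a 'ServerTestGroup' / 'NumDungeonLevels' event. How the game actually picks a server's group isn't described here, so the following is only a sketch of one way to do it: hash the server key so that a given server always lands in the same group. The function names, the list of level counts, and the hashing scheme are all assumptions; only the category, action, and column names come from the query above.

```python
import hashlib

# Hypothetical set of dungeon lengths under test (2 floors up to the original 7).
DUNGEON_LEVEL_CHOICES = [2, 3, 4, 5, 6, 7]

def pick_num_dungeon_levels(server_key: str) -> int:
    """Deterministically map a server key to one of the dungeon lengths."""
    digest = hashlib.sha256(server_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(DUNGEON_LEVEL_CHOICES)
    return DUNGEON_LEVEL_CHOICES[index]

def make_test_group_event(server_key: str, timestamp: float) -> dict:
    """Build the event row a server would log once when it starts up."""
    return {
        "server_key": server_key,
        "category": "ServerTestGroup",
        "action": "NumDungeonLevels",
        "value": pick_num_dungeon_levels(server_key),
        "time": timestamp,
    }

event = make_test_group_event("server-abc", 0.0)
```

Logging exactly one such row per server matters for the join: if a server logged the event more than once, each player's first rating would show up once per logged row.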
At first, watching this test as it went along, I noticed what looked like a simple trend: the fewer the levels, the better the session lengths. As the creator of Sixty Second Shooter this didn't really surprise me - I've been a firm believer that the shorter you can make your sessions the more engaging your game will be...all things being equal. So I made some more changes so that people could play anything from a 2-floor dungeon (one regular floor, one boss floor) to the old 7-floor standby. Then, after some difficulties where the server was down while I was doing onsite interviews every day and too busy to check, I finally got a pile of data. There were thousands of test cases here:
This is all totally significant; the t-tests and chi-square tests all came in with p-values below 5%.
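For anyone wondering what a chi-square check on retention looks like, here is a minimal sketch using only the Python standard library. It assumes the simplest case: two test groups compared on a yes/no outcome (came back or didn't), so a 2x2 table with one degree of freedom. The function name and the counts in the usage line are made up; the actual tooling and numbers behind the results above aren't given here.

```python
import math

def chi_square_2x2(returned_a, total_a, returned_b, total_b):
    """Pearson chi-square test (no continuity correction) on a 2x2 table.

    Rows are the two test groups; columns are returned / did not return.
    Returns (chi2, p_value). Assumes both columns have a nonzero total.
    """
    observed = [
        [returned_a, total_a - returned_a],
        [returned_b, total_b - returned_b],
    ]
    row_totals = [total_a, total_b]
    col_totals = [observed[0][0] + observed[1][0], observed[0][1] + observed[1][1]]
    grand_total = total_a + total_b

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # With one degree of freedom, P(X > chi2) = erfc(sqrt(chi2 / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Made-up counts: 30 of 100 players returned in one group, 50 of 100 in the other.
chi2, p_value = chi_square_2x2(30, 100, 50, 100)
significant_at_5_percent = p_value < 0.05
```

With six dungeon lengths in play, a pairwise test like this would have to be run for each pair of groups (or replaced with a single test over the whole table), and running many comparisons raises the odds of a false positive.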
So *that's* weird - though there is a trend towards better retention and ratings as the number of levels gets smaller, session length is kind of all over the place. And there's this odd-even effect. With an even number of levels, retention is up but ratings are down, and vice versa with an odd number. I can only hazard a guess as to what that is - something to do with how long it takes monsters to build up their resources to do cooler things as far as they're concerned but less cool as far as the heroes are concerned?
So now I'm in a quandary. Do I make the game a quick two-level experience because that's the favorite, even though the session length is lower? Or do I go for the best session length even though that version is less well liked? I've been talking a good game about how player experience is the most important thing to me but now that the choice between engagement and joy is staring me in the face I want the engagement.
So I decide I'm going to put my money where my mouth is and make the two-floor game.
And after playing it for a while I start to feel despair.
I HATE it.
It's broken, for one thing. During the game the monsters gradually build up their Dungeon Points, and by the third level (sometimes the second, if they're good) they have enough to become mid-level bosses. With only two levels it's very hard to do that. A whole part of the game has been practically eliminated.
The balance feels terrible also. The first floor is a cakewalk, like first dungeon floors usually are. Then the second floor has a boss that a new player was never intended to be able to beat. It goes from too easy to sudden death instantly.
So WTF? Why do my players seem to prefer this mode?
Here is how I've rationalized it: the ratings I'm looking at are actually first impressions. They've been playing for in the neighborhood of 5-20 minutes. They're enjoying the one hit kills, whether it's from being a hero on the first floor of a dungeon or getting lucky and becoming the superboss on the second floor. They have yet to realize that the game lacks depth this way. If I'm going to maximize the joy this game provides, I need to do different queries; a player's peak rating and their average rating are more important than their first. But their peak rating and average rating can't be tied to a single server, so I am just going to ignore their ratings this time.
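For reference, a peak-and-average query could look roughly like the sketch below. It assumes the same `events` table and 'PlayerRating' category as the earlier query, and it assumes `value` holds a numeric rating. To keep it self-contained it runs against an in-memory SQLite database with made-up rows; the real database looks like PostgreSQL (it uses DISTINCT ON), but MAX, AVG, and GROUP BY are standard SQL. The helper names are hypothetical.

```python
import sqlite3

# One row per player: their best rating and their mean rating, across all servers.
PEAK_AND_AVERAGE_SQL = """
SELECT player_key,
       MAX(value) AS peak_rating,
       AVG(value) AS average_rating
FROM events
WHERE category = 'PlayerRating'
GROUP BY player_key
ORDER BY player_key
"""

def build_demo_db() -> sqlite3.Connection:
    """Create an in-memory events table with a few made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE events (player_key TEXT, server_key TEXT, "
        "category TEXT, value INTEGER, time INTEGER)"
    )
    rows = [
        ("p1", "s1", "PlayerRating", 3, 10),
        ("p1", "s2", "PlayerRating", 5, 20),  # same player, different server
        ("p1", "s1", "PlayerRating", 4, 30),
        ("p2", "s1", "PlayerRating", 2, 15),
        ("p2", "s3", "PlayerRating", 3, 25),
        ("", "s1", "ServerTestGroup", 4, 1),  # ignored by the WHERE clause
    ]
    conn.executemany("INSERT INTO events VALUES (?, ?, ?, ?, ?)", rows)
    return conn

def peak_and_average_ratings(conn: sqlite3.Connection):
    """Return a list of (player_key, peak_rating, average_rating) tuples."""
    return conn.execute(PEAK_AND_AVERAGE_SQL).fetchall()

results = peak_and_average_ratings(build_demo_db())
```

The toy rows show the problem from the paragraph above: player p1's ratings come from two different servers, so a per-player peak or average has no single server test group to be joined against.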
And I decided to go with the 4-level dungeon, which is my favorite length and a good compromise for session length and retention. Yes, it gives a worse first impression, but since players are more likely to come back in this scenario, maybe their opinion will improve.
Since I've made the changes permanent more people have been playing DL, but I'm afraid that has nothing to do with me. It's currently in the international featured sort, and getting a little more play than it usually does. When that's over, we'll see who sticks around.