Performance-Based Loot Systems

While browsing the US guild relations forum, I was struck by this post: PerLoot – a new Loot System

Not struck by the brilliance of the system, mind, but by the process by which a reasonable goal (rewarding people who perform better) fell apart in the implementation.  What’s worse is that the original poster didn’t seem to realize how much things had fallen apart.

In summary, the poster proposed a loot system whose rewards were based upon performance in raids.  The better you perform, the more loot you get.  They proposed to measure performance by the meters – your DPS divided by your GearScore times the cube root of the number of dispels or interrupts you perform.  The post made no allowance for how tanks would be handled, but did say that they would gauge Discipline priests differently “because they heal by prevention”.
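To make the formula concrete, here is a minimal sketch of one way to read it: (DPS / GearScore) multiplied by the cube root of the dispel and interrupt count.  The function name, the parameter names and the grouping of terms are my assumptions; the original post may have intended something different.

```python
def perloot_score(dps: float, gear_score: float, utility_casts: int) -> float:
    """One possible reading of the proposed formula (an assumption, not the poster's code).

    dps           -- damage per second taken from the meters
    gear_score    -- the player's GearScore, must be positive
    utility_casts -- number of dispels or interrupts performed
    """
    if gear_score <= 0:
        raise ValueError("gear_score must be positive")
    if utility_casts < 0:
        raise ValueError("utility_casts cannot be negative")
    # Cube root via a fractional exponent; fine for non-negative counts.
    return (dps / gear_score) * utility_casts ** (1 / 3)


# Hypothetical numbers, purely for illustration.
careful = perloot_score(dps=6000, gear_score=5000, utility_casts=8)
greedy = perloot_score(dps=6600, gear_score=5000, utility_casts=8)
```

Under this reading the score depends only on damage, gear and utility casts, so the hypothetical player who gains 10% DPS by not moving simply scores higher, and anyone with no dispels or interrupts scores zero regardless of damage.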

The premise that gave birth to this loot system is attractive: ultimately, loot distribution should reward those who perform well.   I’m sure most people who generally perform above the average of their raid feel they should be rewarded for doing so.  But the loot system as proposed fails on so many levels.

How Do You Fail Me?  Let Me Count the Ways

Those who have better gear will do more dps / hps / tanky-stuff, while those with lesser gear will do less.  Thus the rich will get richer while the poor stay poor.  When this is brought up, the original poster suggests computing a ratio of DPS/HPS to the player’s GearScore.  This is based on the assumption that a higher GearScore means that you will perform better while a lower GearScore means you will perform worse.

The same player with the same level of skill should perform better in gear with a GearScore of 5000 than in gear with a GearScore of 4000.  But good players swap out gear based on the stats that the fight calls for.  For my block set, I wear several pieces of gear with a lower item level because they make me unhittable with Holy Shield up.  How do you adjust what my expected performance should be on a fight that is made easier by wearing lower level gear?

Even if you could normalize expected performance based on gear, then compare that to actual performance, are you qualified to judge what every class and spec should be doing on every fight?  All casters benefit on fights where melee has to avoid whirlwind, while DoT casters will out-do nuke casters on high-mobility fights.  How would you adjust healing measurement for Anub’arak phase 3 where less is more?

What about hybrids who intentionally sacrifice some of their personal dps to buff everyone else?  Do you attribute part of the damage done by every caster and healing done by every healer to the Shaman who drops Totem of Wrath?  Do you only count absorption from Power Word: Shield, or also the 3% reduced damage from Renewed Hope?  But what if you had an extra prot paladin to put Sanctuary, which no longer stacks with the priest buff?  Who gets the extra 3% credit?  The Paladin because it’s passive for 30 minutes, or the Priest who has to bubble someone once every 20 seconds to keep the buff up?

PerLoot also makes no allowance for damage taken by non-tanks.  You can boost your numbers by not moving out of the fire or staying in for whirlwind, so long as your healers can keep you up.  If you aren’t penalized for doing so but are rewarded for doing more DPS, then PerLoot incentivizes DPS to make stupid decisions.

The system is also full of opportunities for collusion and corruption:

Priest to Rogue: I’ll heal you through the fire to boost my HPS

Rogue to Priest: Sweet, not moving out will boost my DPS

Very hard to detect and very easy to deny in small numbers per fight, but it can add up to a significant extra healing done over several hours of raiding.

And how exactly do you measure tank performance?  TPS?  What about offtanks?  What about events in which you have to hold multiple small adds – you’ll have enough threat on each to hold them, but measuring your aggregate threat is near-impossible to do in real time.  Even if you do, can you really compare the performance of an add tank to a main tank?  Surely the performance threshold for a tank is “they didn’t lose a mob and none of the DPS can say they were holding back”.  Yet all your plate tanks are going to want the same gear, so how do you decide who performed better on a one-shot boss kill?

Someone who has extensive experience playing all specs of all classes would have a tough time normalizing expectations for every encounter.  Anyone with less than extensive experience would be lost.  You’d be sitting with a World of Logs parse for twelve hours per boss fight trying to figure out who gets loot.  The poster’s suggestion that this could somehow be wrapped up in an addon to do the calculations for you reflects a shallow understanding of the problem.

Really, This Wasn’t Supposed To Be a Rant!

My goal when I started writing this post wasn’t really to rip the post apart – I just got into “debunk mode”, as is my wont.  But I do find myself asking why the poster didn’t ask themselves the questions I pose before thrusting the idea upon the guild relations community.

The reasonable initial premise of “better players get better loot” is implemented in a way that almost guarantees huge amounts of work and uneven loot distribution.  The poster just doesn’t appreciate the complexity of the subject, yet they feel that they are ready to create a policy.  I have to agree with one of the responses:

It is virtually impossible to account for all variables with a “magical” formula.

Your own words. Your system is also ‘virtually impossible’ to implement.

It’s OK to be inexperienced.  It’s not OK to not know that you’re inexperienced.  The same goes for guild leaders creating policy, DPS with their rotation, or tanks with situational awareness in a fight.

As everyone should know by now, Recount and GearScore are tools.  They give you a snapshot of an aspect of a player’s performance on one fight.  Because they cannot give you a comprehensive view of a player, they are unsuitable as data points to drive a loot system.

Leading vs Using Leadership Tools

If the real issue is that people don’t like seeing gear go to someone who stands in the fire and sucks up healer’s mana, then that needs to be determined and dealt with on a personal level.  There are addons to alert you to failures, but some are unavoidable – if you lose a tank on a dragon but another tank picks them up, many of the raid will “fail” by getting hit by breath or tail swipe when the dragon spins.  How do you know which failures were truly avoidable?

As readers know, I’m a fan of EP/GP because it quantifies effort and allows for bonuses and/or penalties to be given without corrupting the system.  But these aren’t automatic – you have to detect and choose to reward or punish someone.  If you don’t, then someone who stands in the fire and does low DPS will receive the same number of effort points as someone who performs at the top of their game.
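As a rough illustration of that point, here is a minimal sketch of EP/GP bookkeeping, assuming the common convention that loot priority is effort points divided by gear points.  The class, the base GP value and the point amounts are all hypothetical and not taken from any particular addon.

```python
from dataclasses import dataclass


@dataclass
class Raider:
    name: str
    ep: float = 0.0  # effort points earned
    gp: float = 1.0  # gear points spent; assumed base of 1 avoids dividing by zero

    def priority(self) -> float:
        # Assumed convention: higher EP/GP ratio means first pick on loot.
        return self.ep / self.gp


def award_attendance(raiders: list[Raider], amount: float) -> None:
    """The automatic part: everyone who showed up gets the same EP."""
    for raider in raiders:
        raider.ep += amount


def adjust_ep(raider: Raider, amount: float) -> None:
    """The manual part: a leader chooses to apply a bonus or a penalty."""
    raider.ep = max(0.0, raider.ep + amount)


top_performer = Raider("TopPerformer")
fire_stander = Raider("FireStander")

award_attendance([top_performer, fire_stander], 100)
tied = top_performer.priority() == fire_stander.priority()  # no difference yet

adjust_ep(fire_stander, -20)  # only happens if someone notices and acts
```

Until the manual adjustment is made, both raiders hold identical priority; the accounting alone cannot tell them apart.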

Performance should be parsed using the Mark 1 Eyeball because it requires experience and an analytical mind.  Tracking effort and attendance is an accounting task and should be given over to addons and web-based packages.  Depending on your loot system, both may factor into who gets loot.  But you can’t run a comprehensive loot system with just one or the other.

Ideally, the way that you prevent under-performers from getting loot is by not inviting them to the raid after you realize that they’re under-performing.

I have considered the option of Loot Council.  Loot Council is unique in that the decision as to who gets loot can change from fight to fight without prior notice to members.  It’s entirely a human decision, though often influenced by performance metrics and failure counts.  My issue (which I’m sure many people share) with loot council is that the system only works when there is a high level of trust between the council and the members.  It’s all too easy for council members to collude to help friends in the ranks, and this can be hard to detect until a pattern emerges (by which time the recipient of the goodwill may have received a number of upgrades).

If relationships between members of the council sour or as members of the council come and go, the nature of decisions can change.  I believe that members deserve a stable and mostly deterministic loot system, so I’ve never been comfortable signing up to a guild that used the council method.  I also believe that many of the problems we try to solve with technology in WoW are at their heart people problems.  Sometimes we have to remember that we’re dealing with people, not data points on a chart.

Wait, I’m Supposed To End My Rants With a Lesson, Right?

Ok, so we get it: loot based on meters is bad.  But what is a guild leader to take away from this?  Well, I have to assume that the original poster had a reason for coming up with this system and asking for feedback.  As some of the responses point out, those reasons may have been selfish, but nonetheless the issue of “how do I distribute loot better” is one any of us might face.

The lesson is to be critical of yourself.  Find a sounding board, and have them play the Devil’s Advocate.  Remember the maxim “if it sounds too good to be true, it usually is”.  Imagine yourself presenting the idea to your peers and think of the questions they might ask.  Do you have solid responses, or can you only dismiss their concerns as not being relevant to “your” system?  Ask yourself if you are too closely invested in the implementation as opposed to the goal.

Above all, ask yourself this question: if my idea is so good, why has nobody thought of it before?  I love asking this one at work when someone thinks that they’re being brilliant.  I’m not saying that nobody I work with comes up with brilliant ideas, or that no guild leader will come up with a brilliant new loot system.  I’m just saying that truly revolutionary ideas don’t happen every day.  If they did, there would be hundreds of viable loot systems instead of the generally accepted five or six that people use variants of.  Be a bit circumspect – is the reason people aren’t already doing it this way because the idea has never occurred to them, or because there are subtle pitfalls that you haven’t thought of?

Until Next Time