They That Sow the Wind, Shall Reap the Whirlwind

Published in Skeptical Agile · Sep 26, 2017 · 5 min read

The main problem with metrics is not that they don’t work. It is that they do work, with brutal efficiency. Therefore: be careful what you measure, because that is what you will get.

If you are a manager in any but the smallest of organizations, you are probably thinking about how to measure and increase performance in your teams. Maybe you need to evaluate people in order to give them bonuses? Or to advance them in the corporate hierarchy? So, let’s look at a couple of ways you can accomplish this.

Individual metrics

The most obvious and easy-to-implement solution is to measure individual performance based on hard data, a.k.a. numbers.

Such metrics may include the number of bugs reported, number of issues solved, lines of code written, or count of pull requests submitted, all on an individual level. The logic behind such measures is simple: whoever has higher numbers is more productive and therefore eligible for bigger incentive pay.

Unfortunately, individual metrics pose a number of dangers that you should seriously consider before even thinking about implementing such a scheme. Let me show you a couple of them.

Not everything can be measured.

Imagine a “team mother”: not really doing great coding, but making the team jell and creating a great atmosphere for everyone. Or a “team skeptic”: usually asking hard questions, but preventing stupid decisions and making others think about better ways to do things. How would you capture these in hard numbers?

Think about other things that you believe are valuable: teamwork, communication and collaboration between people. Can these be measured on an individual basis, when they by definition happen in between people? As William Bruce Cameron once said:

Not everything that can be counted counts. And not everything that counts can be counted.

Measures do work. With brutal consequences.

What makes individual metrics really deadly is tying them to incentives (bonuses, reward schemes etc.). As Alfie Kohn found out, the main thing about rewards is that they do work. More than that, he claims in his book Punished by Rewards:

Rewards usually improve performance only at extremely simple — indeed, mindless — tasks, and even then they improve only quantitative performance.

All metrics, when tied to incentives, send a sobering message: what is important and what I am going to be judged on. If it’s individual contribution, why would you expect people to work in teams? What motivation are you giving them to help others?

If you don’t believe me, look at some facts: Soviet factories, when given targets based on the number of nails, produced many tiny useless nails, and when given targets based on weight, produced a few giant nails. Numbers and weight both correlated well with useful output before they became central-plan targets. This is now known as Goodhart’s Law which, simply put, says:

When a measure becomes the target, it ceases to be a useful metric.

Team metrics

Now that you know that individual metrics, although tempting and bringing some benefits in the short term, are not the best solution, let’s examine other options. What I suggest you try is separating metrics from incentive schemes. Another important thing about all these metrics is that they should never be just the managers’ idea; they should come from either the customers or the teams themselves.

KPIs

As David Anderson wrote, KPIs should represent fitness criteria for your product — meaning they should be something by which your customers will evaluate whether they are going to use whatever you’ve built or not.

To get to KPIs / fitness criteria, you generally need to ask your customers and users. After that you can measure trends and also changes in the criteria themselves to see how your product is evolving.

Health metrics

Imagine the equivalent of a heart rate. There is no single ideal heart rate, but sudden changes or irregularities may indicate a potential problem. Examples of such metrics in software development could be:

lead / cycle time distribution and changes in it
number of completed stories over time (or velocity changes over time, if you are using velocity)
average age of both planned and unplanned product backlog items
product backlog size growth or reductions over time
number of created vs. resolved bugs

All of these, and especially sudden changes or unwanted trends in them, may indicate a problem that the team should discuss.
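To make a couple of these health metrics concrete, here is a minimal Python sketch. It assumes you can export start and finish dates for completed items, plus per-period counts of created and resolved bugs, from your tracker. The function names and data shapes are hypothetical, not part of any particular tool.

```python
from datetime import date
from statistics import median, quantiles


def cycle_times(items):
    """Days between start and finish for each completed work item.

    `items` is a list of (started, finished) date pairs (assumed shape).
    """
    return [(finished - started).days for started, finished in items]


def summarize(times):
    """Median and 85th percentile of cycle times (needs at least two values).

    A distribution summary says more than a single average, because a few
    very slow items can hide behind a healthy-looking mean.
    """
    cuts = quantiles(times, n=100, method="inclusive")  # 99 cut points
    return {"median": median(times), "p85": cuts[84]}


def bug_balance(created, resolved):
    """Running total of created minus resolved bugs, one entry per period.

    A steadily rising balance means bugs arrive faster than they are fixed.
    """
    balance, total = [], 0
    for c, r in zip(created, resolved):
        total += c - r
        balance.append(total)
    return balance


if __name__ == "__main__":
    items = [
        (date(2017, 9, 1), date(2017, 9, 4)),
        (date(2017, 9, 2), date(2017, 9, 10)),
        (date(2017, 9, 5), date(2017, 9, 7)),
    ]
    print(summarize(cycle_times(items)))
    print(bug_balance(created=[5, 3, 4], resolved=[2, 4, 1]))
```

As with a heart rate, the numbers themselves matter less than how they change: compare this period's summary with the previous one and bring any surprising shift to the team.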

Improvement metrics

Constant striving for better ways of working is one of the key agile principles. This is where improvement metrics come in: once the team decides to improve something, encourage them to define a metric for monitoring their progress, so that they know whether they are really improving and can reflect on it later.

One example I recently heard was to measure the average size of pull requests because the team became aware that huge pull requests are problematic (reviewing them is painful, they tend to break code in unexpected ways etc.). If all the team members agree to such an improvement, they can then track the trend of pull request size so that they know whether and how much they are improving, and they can discuss what they could do to improve even more or when to stop.
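Tracking that trend does not need special tooling. The sketch below is one possible approach, assuming the team can list changed-line counts for the pull requests merged each week. The input shape and function names are made up for illustration, and a least-squares slope is only one of several reasonable ways to express a trend.

```python
from statistics import mean


def weekly_average(pr_sizes_by_week):
    """Average pull request size (changed lines) for each week.

    Assumes every week has at least one pull request.
    """
    return [mean(sizes) for sizes in pr_sizes_by_week]


def trend(values):
    """Least-squares slope of the values over time (needs two or more values).

    A negative slope means pull requests are getting smaller week over week.
    """
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = mean(values)
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


if __name__ == "__main__":
    # Hypothetical data: changed lines per pull request, grouped by week.
    weeks = [[400, 600], [300, 500], [200, 400]]
    averages = weekly_average(weeks)
    print(averages, trend(averages))
```

The slope is a conversation starter for the team's retrospective, not a target. Once it becomes a target, Goodhart's Law applies to it just as it does to any other number.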

Editor’s note: How do you measure hurricanes?

When you think of the spiral nature of iteration-based development with frequent delivery, it’s easy to picture a swirling hurricane, growing, hitting land with devastating effect, and then dissipating.

This might be a model to consider when measuring an agile team:

We want to determine size, potential impacts, direction of movement, as well as velocity of large weather events. We view storms, and products, in relationship to their environment.
We want to avoid huge costly storms that go the wrong direction and blow themselves out.
We want to ensure the direction is optimal for what we plan to achieve.
We know that there is no truly “indefinite” project, so we ought to know in advance how we might dissipate the work, and dissolve the elements of the team to find other productive pursuits.

How do you measure your teams? As a collection of individuals competing against each other, with sabotage incentivized? Or do you measure, direct, and reward your team as a team, with shared victories and losses?

Let us know your thoughts in the comments below.

Fred the ed. :-)