“You can’t manage what you don’t measure.” This quote has been attributed to both Drucker and Deming, and I’m not sure of its genuine origin. It has been oft-repeated, and recently it has become just as fashionable to criticise it. The criticisms, though, usually fall into two different, but sometimes related, logical fallacies. The statement that you “can’t manage what you don’t measure” does not mean that if you do measure something you are therefore managing it [effectively] or have the means to do so. To say this is to change the meaning of the phrase from describing a necessary condition to describing a sufficient one. The other common fallacy is to claim that the statement is not true because once you start to measure things, those measurements become an objective in themselves, and thus the original goal of the process being managed is subverted. Once again, this is to treat the statement as if it claimed that measurement alone were sufficient for effective management.
It is absolutely the case that you cannot manage something effectively without some sort of measurement. Without it, how do you even know whether you are succeeding or failing, or how well you are doing, let alone answer any more sophisticated or subtle questions?
I think perhaps this is related to the question of responsibility versus accountability (see: Accountability, Responsibility, Dependability). If you are responsible for a process, then as long as you carry out your prescribed duties there is no a priori need for you to measure those duties. If you are accountable for the outcomes, however, you need - by definition - to be able to account for them: to demonstrate knowledge and understanding of the reasons for the outcome. It is very difficult to do this without some sort of quantitative measure, and to try to do so without one (relying instead upon anecdote or guesswork, or by dint of claimed experience or authority - ‘trust me’) is really not to provide any meaningful account whatsoever.
Providing data and measurements alongside an argument or proposal makes that position far more convincing. It is to adopt the scientific approach - supporting a hypothesis with meaningful data - and since the Enlightenment we have seen the power of that approach to improve outcomes and to advance knowledge and understanding.
Now it is very true that there are risks with measurements. The first, mentioned above, is that the measurement itself becomes an objective, causing all sorts of unintended consequences. This danger is exacerbated if the measurement becomes known as a Key Performance Indicator, or if it is tied to performance reviews, rewards or assessments. Language matters, and the right use of language (e.g. referring to a particular metric as one of many indicators of performance, rather than as a KPI) can help mitigate this risk, as can overall management behaviour. Better still, the use of multiple (sometimes intentionally antagonistic) metrics as part of a Balanced Scorecard can help reinforce the view that no single number is the goal, as sketched below. Focussing on trends rather than targets also helps to reduce this danger.
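As a purely illustrative sketch (the indicator names, figures and the three-sample trend window are my own assumptions, not drawn from any real scorecard), the following Python fragment reports a small set of intentionally antagonistic indicators as trends - a direction of travel - rather than as single numbers judged against targets:

```python
from statistics import mean

# Weekly samples for three hypothetical indicators that pull against each
# other: pushing deployment frequency up tends to push the change failure
# rate up, which in turn shows up as rework effort.
indicators = {
    "deploys_per_week":        [4, 5, 6, 8, 9],
    "change_failure_rate_pct": [3.0, 3.5, 3.2, 5.1, 6.4],
    "rework_hours":            [10, 9, 11, 16, 19],
}

def trend(series, window=3):
    """Percentage change of the recent average versus the earlier average,
    reported as a direction of travel rather than pass/fail against a target."""
    recent = mean(series[-window:])
    earlier = mean(series[:-window])
    return 100.0 * (recent - earlier) / earlier

for name, series in indicators.items():
    print(f"{name:26s} latest={series[-1]:>6} trend={trend(series):+6.1f}%")
```

Presented together, a “good” trend in one indicator alongside “bad” trends in its antagonists prompts a conversation about the underlying process rather than a celebration of a single number.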
A second significant risk with measurements is having an insufficient understanding of how the measurements were made, and the context within which they were made, rendering it impossible to properly infer useful conclusions from the data. Often a metric is given a name that seems self-descriptive and obvious, but the depth of detail beneath the name is not revealed, shared or known, making it all too easy to draw the wrong conclusion. A metric called “BugCount”, for example, sounds very straightforward: surely it’s the “number of bugs”? Well yes, but what is the definition of a bug in this context? Is it all reported bugs? Or only confirmed bugs? Is a bug limited to a logic error in code, or does it extend to any issue with the system, e.g. a transient network failure? Even if this is well understood, what does a particular value of the metric mean? How do you know whether a value, or even a trend, is anomalous, a sign of good things or of bad? The number of bugs is presumably correlated with the number of lines of code, so a bigger system will have more bugs than a smaller one. A system undergoing lots of change will presumably have more bugs than an older, more established and static system. Perhaps reported bugs rise with the number of users available to report them. Understanding all of this context is vital if you want to make meaningful use of this seemingly key metric.
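To make that concrete, here is a minimal sketch (all field names and figures are hypothetical, chosen only for illustration) of carrying the context alongside the headline “BugCount”, and deriving normalised views of it - two systems with the same raw count read very differently once size, churn and the bug definition are attached:

```python
from dataclasses import dataclass

@dataclass
class BugCountSample:
    raw_count: int          # the headline "BugCount"
    definition: str         # what counts as a bug in this context
    kloc: float             # system size, thousands of lines of code
    changes_in_period: int  # how much churn the system saw in the period
    active_users: int       # how many people were around to report bugs

    def per_kloc(self) -> float:
        """Bugs normalised by system size."""
        return self.raw_count / self.kloc

    def per_change(self) -> float:
        """Bugs normalised by the amount of change in the period."""
        return self.raw_count / max(self.changes_in_period, 1)

# Same headline number, very different stories once context is attached.
small_stable = BugCountSample(40, "confirmed defects in code",
                              kloc=50, changes_in_period=12, active_users=200)
large_churning = BugCountSample(40, "all reported issues",
                                kloc=900, changes_in_period=450, active_users=20000)

for s in (small_stable, large_churning):
    print(f"{s.definition:28s} raw={s.raw_count} "
          f"per_kloc={s.per_kloc():.2f} per_change={s.per_change():.3f}")
```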
Understanding these risks - and demonstrating that understanding in the interpretation and presentation of metrics - is the crucial step towards the ‘trust me’ sense of dependability: the confidence one seeks to earn, where your judgement is known to be something others can depend upon.