Talks: Actionable insights vs ranking: How to use and how NOT to use code quality metrics?

In this talk, we want to make two major points:

  • Metrics can facilitate better conversations about code quality. They help you focus on technical problems and improvements rather than on personal preferences and organizational issues.
  • Metrics can be misused very easily. Knowing their limitations is crucial.


For each metric, we'll discuss:

  • code examples in Python
  • how to calculate
  • interpretation (incl. some comparison across open source Python projects)
  • actions
  • limitations

Method Length

The simple.

  • Calculation: You can calculate it without specific tools.
  • Actions: As a first step, extract functions.
  • Limitations: It illustrates well some general limitations of code quality metrics.
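To make the calculation concrete, here is a minimal sketch that measures function length using only the standard library. The helper name function_lengths and the sample code are hypothetical, and "length" is assumed to mean the number of source lines a function spans, including blank lines, comments, and docstrings. Other tools may count differently, which already hints at one general limitation.

```python
import ast


def function_lengths(source: str) -> dict[str, int]:
    """Map each function name to the number of source lines it spans."""
    tree = ast.parse(source)
    lengths = {}
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # end_lineno is available from Python 3.8 onwards.
            lengths[node.name] = node.end_lineno - node.lineno + 1
    return lengths


# Hypothetical sample code, used only for illustration.
SAMPLE = '''
def short(x):
    return x + 1

def longer(items):
    total = 0
    for item in items:
        if item > 0:
            total += item
    return total
'''

lengths = function_lengths(SAMPLE)
print(lengths)
```

Because this definition counts every line in the span, adding a docstring makes a function "longer" without making it harder to read.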

Cyclomatic Complexity

The old.

  • Calculation: We show the formula, but don't explain it in detail. :-)
  • Actions: Extract functions. Remove redundant if conditions.
  • Limitations: It doesn't account for nested coding constructs, and it ignores some modern language patterns.
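For a single function, the classic formula reduces to "number of decision points plus one". The sketch below approximates that with the standard ast module. Which node types count as decision points is an assumption made here for illustration, and real tools differ in the details. The helper name cyclomatic_complexity and both sample functions are hypothetical.

```python
import ast

# Node types treated as decision points in this sketch (an assumption;
# real tools make slightly different choices).
_BRANCH_NODES = (
    ast.If, ast.IfExp, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler,
)


def cyclomatic_complexity(source: str) -> int:
    """Approximate cyclomatic complexity: decision points + 1."""
    complexity = 1
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, _BRANCH_NODES):
            complexity += 1
        elif isinstance(node, ast.BoolOp):
            # "a and b and c" has two operators, hence two extra paths.
            complexity += len(node.values) - 1
        elif isinstance(node, ast.comprehension):
            # One for the loop itself, one per "if" filter.
            complexity += 1 + len(node.ifs)
    return complexity


FLAT = '''
def flat(a, b, c):
    if a:
        print("a")
    if b:
        print("b")
    if c:
        print("c")
'''

NESTED = '''
def nested(a, b, c):
    if a:
        if b:
            if c:
                print("abc")
'''

flat_score = cyclomatic_complexity(FLAT)
nested_score = cyclomatic_complexity(NESTED)
print(flat_score, nested_score)
```

Both samples get the same score, although most readers would find the nested version harder to follow. That is the nesting limitation in a nutshell.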

Cognitive Complexity

The human.

  • Calculation and interpretation: see
  • Actions: Extract functions. Use shorthand structures. More Pythonic code is also more readable.
  • Limitations: It ignores both the length of a linear block and the complexity of the expressions used in it.
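The core idea, that nested control flow costs more than flat control flow, can be sketched in a few lines. This is a heavily simplified illustration and not the full published rule set: it assumes each if, loop, or except handler costs one point plus one point per level of nesting, and it ignores the special treatment of elif/else, boolean operator sequences, recursion, and nested functions. The name cognitive_score and the samples are hypothetical.

```python
import ast

# Constructs that both cost a point and increase the nesting level
# in this simplified sketch.
_NESTING_NODES = (ast.If, ast.For, ast.AsyncFor, ast.While, ast.ExceptHandler)


def cognitive_score(node: ast.AST, nesting: int = 0) -> int:
    """Simplified nesting-aware score: 1 + current nesting per construct."""
    score = 0
    for child in ast.iter_child_nodes(node):
        if isinstance(child, _NESTING_NODES):
            score += 1 + nesting
            score += cognitive_score(child, nesting + 1)
        else:
            score += cognitive_score(child, nesting)
    return score


FLAT = '''
def flat(a, b, c):
    if a:
        print("a")
    if b:
        print("b")
    if c:
        print("c")
'''

NESTED = '''
def nested(a, b, c):
    if a:
        if b:
            if c:
                print("abc")
'''

# Fifty straight-line statements with one long-ish expression each.
LINEAR = "x = (1 + 2) * 3 - 4 / 5\n" * 50

flat_score = cognitive_score(ast.parse(FLAT))
nested_score = cognitive_score(ast.parse(NESTED))
linear_score = cognitive_score(ast.parse(LINEAR))
print(flat_score, nested_score, linear_score)
```

The nested sample now scores higher than the flat one, while the fifty-line linear block scores zero regardless of its length or of how complicated its expressions are.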

Working Memory

Another aspect of human understanding.

  • Calculation: see
  • Interpretation: The 7 ± 2 rule of human working memory.
  • Actions: Extract functions, plus some more specific refactorings that this metric rewards.
  • Limitations: It ignores the structure of the code.
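As a rough illustration only: one crude proxy for how much a reader has to keep in mind is the number of distinct names a function uses. This proxy is an assumption made for this sketch and is not the definition of the metric discussed in the talk. The helper name distinct_names and the sample function are hypothetical.

```python
import ast


def distinct_names(source: str) -> dict[str, int]:
    """Count distinct parameter and variable names used in each function.

    A crude proxy (an assumption of this sketch): it also counts globals
    and built-ins that appear as plain names, and it ignores structure.
    """
    counts = {}
    for func in ast.walk(ast.parse(source)):
        if isinstance(func, (ast.FunctionDef, ast.AsyncFunctionDef)):
            names = set()
            for node in ast.walk(func):
                if isinstance(node, ast.arg):
                    names.add(node.arg)   # parameters
                elif isinstance(node, ast.Name):
                    names.add(node.id)    # names read or written
            counts[func.name] = len(names)
    return counts


# Hypothetical sample code, used only for illustration.
SAMPLE = '''
def price(net, tax_rate, discount, shipping):
    gross = net * (1 + tax_rate)
    reduced = gross - discount
    return reduced + shipping
'''

counts = distinct_names(SAMPLE)
print(counts)
```

With a proxy like this, extracting a helper that hides gross and reduced lowers the count for price, which matches the "extract functions" advice. Whether the names sit in flat or deeply nested code makes no difference to the number.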

Limitations And Pitfalls


  • They can be gamed.
  • They easily encourage one-sided thinking and behaviour.
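A small sketch of what gaming can look like, using method length as the target. The functions below are hypothetical. The second version passes a stricter length limit, but the split follows line counts rather than responsibilities, so the code is arguably harder to understand than before.

```python
def report(orders):
    """Original version: one function that does everything."""
    total = 0
    for order in orders:
        total += order["amount"]
    average = total / len(orders)
    lines = [f"Orders: {len(orders)}"]
    lines.append(f"Total: {total}")
    lines.append(f"Average: {average:.2f}")
    return "\n".join(lines)


# "Gamed" version: each piece is shorter, but the cut points are arbitrary
# and the names say nothing about what the pieces do.
def report_part1(orders):
    total = 0
    for order in orders:
        total += order["amount"]
    return total


def report_part2(orders, total):
    average = total / len(orders)
    lines = [f"Orders: {len(orders)}"]
    lines.append(f"Total: {total}")
    return lines, average


def report_gamed(orders):
    total = report_part1(orders)
    lines, average = report_part2(orders, total)
    lines.append(f"Average: {average:.2f}")
    return "\n".join(lines)


orders = [{"amount": 10}, {"amount": 20}]
print(report(orders) == report_gamed(orders))
```

The behaviour is identical and the metric looks better, yet nothing about the design has improved.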

Specific For Code Quality Metrics

  • They are great as warning signs, but poor as "proof of excellence".

Compound Metrics

Compound metrics give a more rounded picture than any single metric.

What Metrics Don't Capture

  • naming, consistent terminology, ubiquitous language (DDD)
  • project structure
  • correctness