Writing by Peter Hilton

Numbers that aren’t numbers

‘Numbers’ that you can’t represent numerically 2020-09-29 #data #DDD

A mailbox for house number 13a

Daniel von Appen

Software that lets you configure data input - from spreadsheets to forms - often gives you a way to specify whether each input expects a number, date, email address or other text fragment. These data types help keep the data clean, and also allow specific input user interfaces, such as date pickers. So far so good, but occasionally, you’ll choose the obvious data type instead of the correct one.

Non-numeric ‘numbers’

It turns out that some ‘numbers’ don’t work properly if you use a number data type, as in these examples.

Data name Example Problem with number data type
National insurance number QQ123456C Letters
International Bank Account Number NL20 INGB 0001 2345 67 Letters and spaces
Telephone number 088 585 1585 Significant leading zero
House number 13a Letter suffix
Aircraft tail number G-KELS No digits at all!

The list goes on, with myriad order numbers and customer numbers in business, many of which contain letters that you can’t enter on the numeric keypad your phone helpfully shows when you need to enter a number.

Store data with ‘number’ in the name as text

The table above suggests the following data modelling rule of thumb for data types in business software.

Use a text data type for any data whose name includes the word number.

This deserves an explanation. It turns out that the word number means more than one thing.

Programming language vs everyday language

I recently heard about a hospital patient information system that uniquely identified patients by their patient number. When this system’s replacement used the term patient ID (PID), hospital staff took no time to start referring to a patient’s PID number.

In the context of programming, number refers to a strictly numeric value. However, in everyday language, number sometimes means unique identifier. (And in mathematics, some numbers use symbolic names, such as π.)

While this may surprise some programmers, normal people (i.e. non-programmers) don’t habitually use the phrase unique identifier in conversation. Instead, if they make the mistake of using software development jargon in a user interface, people using the software will probably correct the name.

The data type naming problem

Non-numeric ‘numbers’ present a design challenge for software that requires non-programmers to select the appropriate data type. Relying on documentation, such as this article, to explain why they should use a text type seems unsatisfying.

Introducing a new text data type for the other kind of number might help avoid confusion, and might enable useful functionality such as enforcing a particular format or validation rules. So what should we call it?

We actually use serial number in some cases, but while it sounds right for a bicycle, for example, it doesn’t work as well for a patient. Unfortunately, the generic term for the bicycle or patient - entity - takes us back to where we started: programming jargon.

To conclude, maybe English has a word for these non-numeric numbers, but until I can figure it out, this problem will remain unsolved.

Share on TwitterShare on LinkedIn