Numbers that aren’t numbers
‘Numbers’ that you can’t represent numerically 2020-09-29 #data #DDD
Software that lets you configure data input - from spreadsheets to forms - often gives you a way to specify whether each input expects a number, date, email address or other text fragment. These data types help keep the data clean, and also allow specific input user interfaces, such as date pickers. So far so good, but occasionally, you’ll choose the obvious data type instead of the correct one.
Non-numeric ‘numbers’
It turns out that some ‘numbers’ don’t work properly if you use a number data type, as in these examples.
Data name | Example | Problem with number data type |
---|---|---|
National insurance number | QQ123456C | Letters |
International Bank Account Number | NL20 INGB 0001 2345 67 | Letters and spaces |
Telephone number | 088 585 1585 | Significant leading zero |
House number | 13a | Letter suffix |
Aircraft tail number | G-KELS | No digits at all! |
The list goes on, with myriad order numbers and customer numbers in business, many of which contain letters that you can’t enter on the numeric keypad your phone helpfully shows when you need to enter a number.
Store data with ‘number’ in the name as text
The table above suggests the following data modelling rule of thumb for data types in business software.
Use a text data type for any data whose name includes the word number.
This deserves an explanation. It turns out that the word number means more than one thing.
Programming language vs everyday language
I recently heard about a hospital patient information system that uniquely identified patients by their patient number. When this system’s replacement used the term patient ID (PID), hospital staff took no time to start referring to a patient’s PID number.
In the context of programming, number refers to a strictly numeric value. However, in everyday language, number sometimes means unique identifier. (And in mathematics, some numbers use symbolic names, such as π.)
While this may surprise some programmers, normal people (i.e. non-programmers) don’t habitually use the phrase unique identifier in conversation. Instead, if they make the mistake of using software development jargon in a user interface, people using the software will probably correct the name.
The data type naming problem
Non-numeric ‘numbers’ present a design challenge for software that requires non-programmers to select the appropriate data type. Relying on documentation, such as this article, to explain why they should use a text type seems unsatisfying.
Introducing a new text data type for the other kind of number might help avoid confusion, and might enable useful functionality such as enforcing a particular format or validation rules. So what should we call it?
- Unique number
- Identifying number
- Serial number
We actually use serial number in some cases, but while it sounds right for a bicycle, for example, it doesn’t work as well for a patient. Unfortunately, the generic term for the bicycle or patient - entity - takes us back to where we started: programming jargon.
To conclude, maybe English has a word for these non-numeric numbers, but until I can figure it out, this problem will remain unsolved.