Naming smells
Code smells that indicate bad names so you can avoid using them 2016-10-31 #naming #programming
Sometimes, when you’re reading code that looks nice and clean but which doesn’t quite make sense, the problem lies in the naming. Naming smells are code smells that come from bad names.
A code smell indicates where you can improve your code, and often points to some deeper problem. A particular code smell often has a corresponding refactoring that removes that particular smell, improving the code. Naming smells appear in many forms, but all have the same refactoring: Rename. So how hard could it be?
This article describes some common naming smells. Most come from names that don’t carry as much meaning as they might, or suffer from ambiguity. Beyond that, bad names lack precision or accuracy.
Meaningless names
foo
doesn’t mean anything at all - the coding equivalent of untitled
.
Nor does bar
.
In theory, foo
indicates a placeholder in an example where the name has no significance - a metasyntactic variable.
In practice, foo
typically indicates a deliberate refusal to name something, with the excuse that the code won’t last long.
Tens of millions of occurrences on GitHub suggest programmers frequently succumb to this temptation.
The solution to meaningless names like foo
lies in the realisation that people find concrete examples easier to understand than abstract examples.
Rename foo
to a concrete real-world concept, such as food
or foot
.
What is the worst ever variable name?
dataWhat is the second-worst name?
data2What is the third-worst name ever?
data_2
Abstract names
data
and object
lack precision.
No doubt they do name data and objects, but you already knew that without the vague name.
You might as well use thing
, which is equally abstract but at least has humour value in its uselessness.
Rename data
to a more descriptive name that identifies what kind of data it is.
Numeric suffixes
employee2
is a more subtle case of a name that is too abstract.
employee2
has more meaning than data
, say, but the numeric suffix probably separates it from something already called employee
.
When this happens, the reader knows is that there are two employees, but not how one differs from the other.
Rename both employee
and employee2
to names that communicate the difference between the two employees.
Perhaps you have a managementEmployee
and a recentlyHiredEmployee
.
If your programming language has types, both identifiers probably have the type Employee
, so you can remove the employee
prefix, leaving a manager
and a recentHire
.
Note that although code that includes employee
, employee2
and employee_2
makes you laugh, you’ll stop seeing the funny side when you see it in legacy production code that your team has just inherited.
You’ll probably also stop being able to read The Daily WTF at the same time.
Trust me on this.
Abbreviated names
acc
and pos
look like abbreviated forms of words, which causes problems when you realise that you have to choose between words like accumulator
and accuracy
, or between position
and point of sale
.
Abbreviations tend to suffer from ambiguity since multiple words often share the same abbreviation.
char
, mod
and auth
are more subtly ambiguous, because people may use them as standard abbreviations within specific contexts.
This stops working when you switch context: when character
becomes characteristic
and modulus
becomes model
.
Meanwhile, no-one ever knows whether auth
is authentication
or authorisation
(or both).
Refactor abbreviations to the complete word, which is easy as long as you know which word that is.
Short names
a
and other single-letter names have more potential explanations than abbreviation, which makes them worse.
a
might abbreviate a word, like answer
, but it might also be used as a meaningless placeholder, either because it comes first in the alphabet or because foo
was too long to type.
Single-letter names are meta-ambiguous because they’re ambiguous about which kind of ambiguity you’re dealing with. Refactor to a complete word, using the refactoring for either meaningless names or abbreviated names.
Traditionally, at least in certain programming languages, coders use names like i
, j
and k
for loop variables and would prefer to ignore these problems and carry on using single-letter names.
(If they’re particularly hardcore, they use ii
, jj
and kk
, which are easier to search for in a text editor with no programming language support.)
for (int i = 1; i < 42; ++i)
If you’re wondering whether it’s okay to make a for-loop index exception, you should probably switch to a more modern programming language that iterates over collection types without using a loop index. And before you whine that single-letter names are okay because Haskell, one letter is too short in functional programming as well.
/** Modify the value viewed through the lens, returning a `C` on the side. */
def modp[C](f: B1 => (B2, C), a: A1): (A2, C) = {
val (b, c) = f(get(a))
(set(a, b), c)
}
This method from the Lens
class in Scalaz defies comprehension.
This is what Scala code written with a Haskell accent looks like.
At least there’s a comment that tells you what modp
is supposed to mean.
Haskell presumably uses single-letter names because the mathematicians who invented the language use single-letters in written mathematics. What this Haskell-style code seems to ignore is that written mathematics includes paragraphs of text and precise symbolic definitions that explain what each name means.
Symbolic names
>=>
ends up being called the fish operator by the Scalaz programmers who don’t know what it’s called.
This ASCII-art fish follows mathematics’ tradition of starting with single-letter names, running out of letters, subsequently exhausting the Greek and Hebrew alphabets, and then inventing new symbols in the vain hope that they’ll catch on.
Meanwhile, we have already reached peak Haskell, as noted in a tweet by @aisamanra:
Vague words
InvoiceManager
suffers from a vague noun.
Who knows what a manager really does?
Alan Green wrote about the ‘-Manager’ problem, and suggests some more meaningful alternatives, such as bucket
, supervisor
, planner
and builder
.
getScore
, on the other hand, suffers from a vague verb.
Rename names with a get
prefix to use a more specific verb, such as calculateScore
or estimateScore
.
The get
prefix is largely a Java-programming anti-pattern that comes from the JavaBean Specification’s requirement that the accessor method for a score
property must be called getScore
.
Java developers habitually use the prefix for all kinds of methods, not just JavaBean properties.
A related anti-pattern results from the names isScore
and hasScore
for a Boolean score
property: Boolean variables often have names like isVictory
, with a redundant is-
prefix.
Vestigial Hungarian notation
isVictory
and dateCreated
have prefixes that indicate Boolean and date types, respectively.
This is a kind of vestigial Hungarian notation.
In naming, Hungarian Notation takes the title of the most spectacular example of something that seemed like a good idea at the time but isn’t anymore, now that we have languages with types.
These names may be useful in stringly-typed languages, but you don’t need them in languages like Java and C#, where you can define your own types. Rename to drop the prefix - victory
and created
, and use the appropriate type.
dateOfBirth
or birthDate
are exceptions: widely-used domain terms in fields like human resources.
Although renaming to born
would make sense from a coding point of view, the domain likely never uses the term.
Multiple words
appointment_list
typifies many names for collections of things that simply concatenate the collection type and the item type.
company_person
names a relationship by using the names of two related entities.
This approach can lead to names that form a pidgin language of simple words, like text_correction_by_editor
, as if identifiers were written in Basic English.
We tend to write code in English, which has a rich vocabulary to use for naming.
This code smell is the observation that ‘there’s a word for that!’
Rename appointment_list
to calendar
.
Rename company_person
to employee
, owner
, shareholder
or some more specific relationship name.
In the case of a text_correction_by_editor
, referring to the person who edits a manuscript, rename to just edit
.
Wrong names
order
or carrier
, depending on the context, can be the right name for the wrong thing, i.e. the wrong name.
In a supply chain context, for example, an order might be related to a shipment but isn’t the same thing.
Similarly, a carrier probably isn’t the same kind of organisation as a broker.
Some names are just wrong.
Spotting and fixing wrong names can prove difficult because you need some other clue that the intended concept is something else. Documentation benefits from a little repetition, which builds confidence in the intended meaning. If you only have one clock, you don’t know if it has the right time. If you have two clocks that differ, you don’t know which one is wrong. Only three clocks give you more confidence in the correct time than one, assuming that it’s unlikely for more than one to be wrong. Naming and documentation are like a pocket watch and a clock, in that respect.